Economics 612/613
Spring, 2001
David Mitch

Mid-term Assignment

For this assignment, you should use the data set in the file called "State", which is being sent to you as a separate attachment. This data set consists of observations for each of the 50 U.S. states plus the District of Columbia for the years 1980, 1985, 1990, and 1995. The variables in this data set are defined as follows:

pop = population of the state in the given year.
er = enrollment rate in K-12 as a proportion of the population aged 5 - 17 in the state in the given year.
hsc = percentage of the population aged 25 or higher who have completed a high school degree or a higher level of educational attainment, for 1996.
bsc = percentage of the population aged 25 or higher who have completed a bachelor's degree or a higher level of educational attainment, for 1996.
yhsc = percentage of the population aged 18 to 24 who have completed a high school degree, for 1996.
GSP = gross state product in the given year (the equivalent at the state level of Gross National Product).
ibt = indirect business taxes collected in the state in the given year (consider this a proxy for investment).
pti = property type income in the state in the given year (another possible proxy for investment).

1. Estimate the basic MRW model as in Tables 1 and 2 of the MRW article, using the state-level observations for 1995 as data rather than the country-level observations of MRW. In this and what follows, treat ibt/gsp (indirect business taxes relative to gross state product) as an indicator of state-level investment activity. Compare your estimates for the 50 U.S. states with those of MRW. Based on those estimates, discuss the extent to which the MRW model seems to apply to the 50 U.S. states plus D.C. More specifically, run the following two regressions:

a) ln(GSP/capita 1995) = a0 + a1 ln(ibt/gsp 1995) + a2 ln( ln(pop95) - ln(pop90) + .05).
b) ln(GSP/capita 1995) = a0 + a1 ln(ibt/gsp 1995) + a2 ln( ln(pop95) - ln(pop90) + .05) + a3 ln bsc96.

Plot the residuals for each regression against ln(GSP/capita 1995) and test for heteroscedasticity.

2. Redo regression b) in problem 1 above, using alternatively hsc96, yhsc96, and er95 as measures of human capital. Based on these alternative regressions, do an Extreme Bounds Analysis of the range of values for a3 in equation b).

3. Examine whether unconditional and conditional convergence have been present across the states of the U.S. between 1980 and 1995 by running:

a) the unconditional convergence regression:
ln(GSP/capita 1995) - ln(GSP/capita 1980) = a0 + b ln(GSP/capita 1980).

b) the conditional convergence regression:
ln(GSP/capita 1995) - ln(GSP/capita 1980) = a0 + b ln(GSP/capita 1980) + a1 ln(ibt/gsp 1995) + a2 ln( ln(pop95) - ln(pop90) + .05) + a3 ln bsc96.

Calculate the values of lambda implied by your results from regressions 3a and 3b respectively. Based on your calculated values of lambda, discuss the extent to which unconditional and conditional convergence appear to have been present for the states between 1980 and 1995.

4. Consider the following four regional classifications of the State data set (following census classifications as listed in Barro and Sala-i-Martin 1995, p. 371), using standard state abbreviations as provided in the State data set:

Northeast: CT, ME, MA, NH, NJ, NY, PA, RI, VT.
South: AL, AR, FL, GA, KY, LA, MS, NC, OK, SC, TN, TX, VA, WV, DE, MD, DC.
Midwest: IL, IN, IA, KS, MI, MN, MO, NE, ND, OH, SD, WI.
West: AZ, ID, MT, NV, NM, OR, UT, WA, WY, CO, CA.

Estimate equation 3b from question 3 for each of these four regions. Based on your results, discuss the extent to which each of these four regions constitutes a convergence club. Do any of these four regions exhibit a greater tendency to constitute a convergence club than the U.S. as a whole?
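Note on lambda for problems 3 and 4: one common way to back out the convergence rate is to assume, as in MRW, that the coefficient on ln(GSP/capita 1980) equals -(1 - exp(-lambda*T)), where T = 15 is the number of years between 1980 and 1995. The short Python sketch below is an illustration only and is not part of the required STATA work; it simply inverts that relationship. The function names and the example coefficient are hypothetical and are not estimates from the State data set.

```python
import math


def implied_lambda(coef_initial_income, years):
    """Convergence rate implied by the coefficient on ln(initial income).

    Assumes coefficient = -(1 - exp(-lambda * years)), so that
    lambda = -ln(1 + coefficient) / years.
    """
    if coef_initial_income <= -1.0:
        raise ValueError("coefficient must exceed -1 for lambda to be defined")
    return -math.log(1.0 + coef_initial_income) / years


def half_life(lam):
    """Years needed to close half the gap to the steady state (needs lam > 0)."""
    return math.log(2.0) / lam


# Hypothetical coefficient, NOT an estimate from the State data set.
example_coef = -0.25
lam = implied_lambda(example_coef, years=15)
print(f"lambda = {lam:.4f} per year, half-life = {half_life(lam):.1f} years")
```

Under this assumption a negative coefficient on initial income gives a positive lambda (convergence), a coefficient of zero gives lambda = 0, and a positive coefficient gives a negative lambda (divergence).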
5. Test the hypothesis that the coefficient a3 on bsc96 in equation 3b) in problem 3 above is the same across the four regions specified in problem 4.

6. Consider the state-level observations for the years 1980, 1985, 1990, and 1995 as a panel data set. Use ln(GSP/capita) in a given year as the dependent variable, and use ln(ibt/gsp) and ln(er) in the given year, along with ln(pop growth between the given year and 5 years earlier), as explanatory variables. [HINT: Before using the reshape command in STATA to create the panel data set, use the encode command in STATA to assign a numerical value to each state. Do this with the following command:

encode state, generate(scode)

which will create scode as a new variable consisting of a numerical value assigned to each state. Then use scode to identify each state when using the reshape command.]

a) Estimate a pooled OLS regression using these variables.

b) Estimate a fixed effects regression using these variables. Plot the estimated fixed effect for each state against GSP per capita in 1980. (HINT: do this in STATA by creating the 15 year lag of GSP per capita in 1995 and plotting the fixed effect against this lagged GSP per capita.) Discuss any implications of the relationship your plot suggests between GSP per capita and the state fixed effect.

c) Estimate a random effects model using these variables. Use the Hausman test to assess whether the random effects model or the fixed effects model is appropriate here.

7. Consider a model of GSP per capita that uses ln(ibt/gsp) as an instrument for GSP/capita, ln(population growth) over a 5 year interval as an instrument for school enrollment, and GSP per capita lagged 5 years as an instrument for both GSP per capita and school enrollment. Estimate the coefficient on the impact of ln(school enrollment) in a given year on ln(GSP per capita) for that year using:

a) indirect least squares.
b) two stage least squares.
c) three stage least squares.
Discuss the interpretation of any similarities or differences in your estimates using these three procedures.
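Note on comparing the procedures: in the special case of one endogenous regressor and exactly one instrument (a just-identified equation), indirect least squares and two stage least squares give the same coefficient. The model described above has more instruments than that, so the indirect least squares estimate need not be unique there, and the Python sketch below is only a simplified illustration on simulated data. All variable names, parameter values, and the helper function are made up for the example and have nothing to do with the State data set.

```python
import random


def ols_slope_intercept(x, y):
    """Slope and intercept from a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Simulated data (hypothetical): z is the instrument, x is endogenous
# because it shares the structural error u with y.
random.seed(0)
n = 200
z = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + 0.5 * ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

# Indirect least squares: ratio of the two reduced-form slopes on z.
pi_x, _ = ols_slope_intercept(z, x)
pi_y, _ = ols_slope_intercept(z, y)
beta_ils = pi_y / pi_x

# Two stage least squares: regress x on z, then y on the fitted values of x.
first_slope, first_intercept = ols_slope_intercept(z, x)
x_hat = [first_intercept + first_slope * zi for zi in z]
beta_2sls, _ = ols_slope_intercept(x_hat, y)

# Plain OLS of y on x, shown only for comparison.
beta_ols, _ = ols_slope_intercept(x, y)

print(f"ILS = {beta_ils:.6f}, 2SLS = {beta_2sls:.6f}, OLS = {beta_ols:.6f}")
```

The two instrumental estimates agree up to rounding error because, with a single instrument, both reduce to cov(z, y) / cov(z, x).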