# Example 3.6, Hourly Wage Equation
# Data set: wage1
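# The .Rdata file is expected to provide `data` (the observations) and
# `desc` (variable names in column 1, descriptions in column 2), both used below
load("wage1.Rdata")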
# Recap the simple regression model from Example 2.10
model <- lm(lwage ~ educ, data = data)
cat("This example uses the same wage data set as Example 2.10.\nSuppose the following model satisfies Assumptions MLR.1 through MLR.4, so that its OLS estimators are unbiased:",
"\nlwage = beta0 + beta1 * educ + beta2 * abil + u",
"\nwhere lwage is ", paste(desc[desc[,1]=="lwage",2]),
",\neduc is ", paste(desc[desc[,1]=="educ",2]), ",\nand abil stands for ability.",
"\nSince the data set does not contain data on ability, we can only estimate the simple regression model lwage = beta0 + beta1 * educ + u, as we did in Example 2.10.\n",
"The estimated regression line was\nlwagehat = ", round(coef(model)[1],digits=3), " + ",
round(coef(model)[2],digits=3), " * educ\n",
"n = ", nrow(data), ", R^2 = ", round(summary(model)$r.squared,digits=3), "\n",
sep="")
# Interpretation
cat("We expect educ and abil to be positively correlated, and higher abil to lead to higher lwage (beta2 > 0). We can therefore reasonably believe that the MEAN of the OLS estimates of beta1 across ALL random samples would be larger than the true value of beta1, i.e. the simple regression estimator has a positive (upward) bias.\n",
"However, this particular estimate, ", round(coef(model)[2],digits=3), ", comes from only one sample, so we cannot determine whether it is larger or smaller than the true value of beta1.\n",
sep="")
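# Illustration (not part of the original example): a minimal Monte Carlo sketch of
# the positive omitted-variable bias described above. Everything here is simulated.
# The parameter values, the sample size, and all names with the sim_ prefix are
# assumptions chosen for illustration; they are not estimates from wage1.
# Under these assumptions the theoretical bias of the simple regression slope is
# beta2 * Cov(educ, abil) / Var(educ) = 0.10 * 2 / 5 = 0.04.
set.seed(123)
sim_beta1 <- 0.08   # assumed true return to education
sim_beta2 <- 0.10   # assumed positive effect of ability on lwage
sim_n     <- 500    # observations per simulated sample
sim_reps  <- 1000   # number of simulated random samples
sim_slopes <- replicate(sim_reps, {
  sim_abil  <- rnorm(sim_n)
  sim_educ  <- 12 + 2 * sim_abil + rnorm(sim_n)  # educ positively correlated with abil
  sim_lwage <- 0.5 + sim_beta1 * sim_educ + sim_beta2 * sim_abil + rnorm(sim_n, sd = 0.4)
  coef(lm(sim_lwage ~ sim_educ))[2]              # slope when abil is omitted
})
cat("\nSimulated mean of the simple regression slope across ", sim_reps, " samples: ",
round(mean(sim_slopes),digits=3), " (assumed true beta1 = ", sim_beta1, ")\n",
"Share of simulated samples with a slope below the assumed true beta1: ",
round(mean(sim_slopes < sim_beta1),digits=3), "\n",
sep="")
# The average slope exceeds the assumed beta1, while any single simulated slope
# can still fall on either side of its own mean.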