Test 1 Answers

CALIFORNIA STATE UNIVERSITY, SACRAMENTO
Department of Economics

Economics 141/241
Prof. Yang

Answers to Test #1

Compare and contrast between the following concepts:

Estimator and estimate: An estimator is a rule or formula that tells us how to estimate a population quantity, such as the population mean. An estimate is the numerical value the estimator takes in a particular sample.

Correlation and causation: Correlation measures the degree of linear association between two variables, i.e., how closely the two are related to each other. Causation implies a causal relationship between the dependent variable and the independent variable. Two points distinguish the terms: 1) correlation does not necessarily imply causation, and 2) causation must be inferred from theory or previous research results.

PRF and SRF: The PRF specifies the relationship (in the population) between the mean (average) value of the dependent variable (Y) and each value of the independent variable (X). This relationship may be specified algebraically as E(Y | X_i) = B₁ + B₂X_i. Graphically, the population regression line is the line that passes through the conditional means of Y. The SRF specifies the relationship (in the sample) between Ŷ_i, the estimator of the population conditional mean E(Y | X_i), and its primary determinant X. Algebraically, it may be written as Ŷ_i = b₁ + b₂X_i.

A. Each entry represents a joint probability: it gives the probability that the random variable X takes a given value, like 1, and Y takes a given value, like 8.5.

The marginal PDF gives the probability that one variable assumes a given value, like X = 2, regardless of the values taken by the other variable. For example, f(X=2) = 0.40 and f(Y=17.5) = 0.28.

The conditional PDF is defined as f(Y | X) = P(Y=y | X=x). It gives the probability that Y takes the value y conditional on the knowledge that X has assumed the value x.

One simple method of calculating the conditional PDF is: f(Y | X) = f(X,Y) / f(X).

f(Y=11.5 | X=2) = 0.28 / 0.40 = 0.70.

E(Y | X=2) = Σ Y f(Y | X=2) = 8.5 × 0.10/0.40 + 11.5 × 0.28/0.40 + 17.5 × 0.02/0.40

= 8.5 × 0.25 + 11.5 × 0.70 + 17.5 × 0.05 = 2.125 + 8.05 + 0.875 = 11.05
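The conditional-mean arithmetic above can be checked with a short Python sketch. The three joint probabilities for X = 2 are the ones used in the worked answer; the function and variable names are illustrative assumptions, not part of the test.

```python
# Joint probabilities f(X=2, Y=y), taken from the worked answer above.
joint_x2 = {8.5: 0.10, 11.5: 0.28, 17.5: 0.02}


def conditional_pmf(joint_row):
    """Divide each joint probability by the marginal f(X=x)."""
    marginal = sum(joint_row.values())  # f(X=2) = 0.40 for this row
    return {y: p / marginal for y, p in joint_row.items()}


def conditional_mean(joint_row):
    """E(Y | X=x) = sum over y of y * f(y | x)."""
    pmf = conditional_pmf(joint_row)
    return sum(y * p for y, p in pmf.items())


cond = conditional_pmf(joint_x2)
mean_y_given_x2 = conditional_mean(joint_x2)
print(round(cond[11.5], 2), round(mean_y_given_x2, 2))
```

Dividing by the row sum rather than a hard-coded 0.40 means the same helper works for any other row of the joint table.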

A. Z = (X̄ − μ) / (σ/√n), and Z ~ N(0,1).

t = (X̄ − μ) / (S/√n), where S = sample standard deviation. The t variable defined above follows Student's t distribution with (n−1) degrees of freedom.

Difference? Both distributions have a mean of zero, so they are the same in that respect. The difference lies in the variance: the variance of the standard normal distribution is 1, while the variance of the t distribution is k/(k−2), where k is the degrees of freedom (k > 2). As the degrees of freedom increase, the t distribution approaches the standard normal distribution.
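To see how quickly the t variance approaches 1, the formula k/(k−2) can be tabulated for a few degrees of freedom. This is a minimal sketch; the function name and the chosen values of k are illustrative assumptions.

```python
def t_variance(k):
    """Variance of Student's t with k degrees of freedom (defined for k > 2)."""
    if k <= 2:
        raise ValueError("the variance k/(k-2) requires k > 2")
    return k / (k - 2)


# The variance shrinks toward 1, the variance of the standard normal.
for k in (5, 10, 30, 120):
    print(k, round(t_variance(k), 4))
```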

State whether the following statements are true, false, or uncertain.

An estimator of a parameter is a random variable, but the parameter is nonrandom.

True. An estimator is a rule or formula whose value varies from sample to sample, so it is a random variable; the parameter is a fixed (though usually unknown) number and is therefore nonrandom.

An unbiased estimator of a parameter, say μ_X, means that it will always be equal to μ_X.

False. An unbiased estimator may or may not be equal to the true parameter in any given sample. Unbiasedness means that the mean or expected value of the estimator is equal to the true parameter: E(μ̂_X) = μ_X.
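A small simulation can illustrate the distinction. The sketch below draws repeated samples from a normal population and records each sample mean. The population values (mean 70, standard deviation 3), the sample size, the replication count, and the seed are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_sample_means(mu, sigma, n, replications, seed=0):
    """Draw `replications` samples of size n and return their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(replications)
    ]


means = simulate_sample_means(mu=70.0, sigma=3.0, n=20, replications=5000)
average_of_means = statistics.fmean(means)

# Individual estimates scatter around 70; their average lands close to 70.
print(round(min(means), 2), round(max(means), 2), round(average_of_means, 2))
```

Individual sample means differ from the true mean, yet their average over many samples settles near it, which is what unbiasedness asserts.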

C. An efficient estimator means an estimator with minimum variance.

Uncertain. To be efficient, an estimator needs to be both unbiased and have minimum variance among unbiased estimators. Therefore, an efficient estimator means more than an estimator with minimum variance.

Refer to Figure 5-4 on page 132, Essentials of Econometrics.

State with reasons whether the following statements are true, false, or uncertain. Be precise.

The stochastic error term u_i and the residual term e_i mean the same thing.

False. The stochastic error term u_i is the error term in the PRF, while the residual e_i is the error term in the SRF; e_i is the estimate of u_i from a given sample. e_i = actual Y_i − estimated (predicted) Y_i.

A linear regression model means a model linear in the variables.

False. It is linear in the parameters.

In the population regression function, the regression coefficients (the Bs) are random variables. On the other hand, the regression coefficients b₁ and b₂ in the SRF are parameters.

False. The Bs are parameters, while the regression coefficients in the SRF are random variables because their values vary from sample to sample.

A. Since X ~ N(70, 9), Z = (X − μ_X) / √9. Z = (75 − 70) / 3 ≈ 1.67. Therefore, we want to find P(Z > 1.67). For a graphical illustration, refer to Figure 3-3 (A) on page 70, Essentials of Econometrics.
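The tail probability can also be computed directly instead of being read from a Z table. The sketch below uses the complementary error function from Python's standard library; the helper name is an illustrative assumption.

```python
import math


def normal_upper_tail(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Standardize X = 75 using mean 70 and variance 9 (standard deviation 3).
z = (75 - 70) / math.sqrt(9)
p = normal_upper_tail(z)
print(round(z, 2), round(p, 4))
```

Because the code keeps the unrounded z = 5/3, the result can differ slightly in the third or fourth decimal from a table lookup at z = 1.67.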

B. Since X ~ N(μ, σ²), the sample mean X̄ ~ N(μ_X, σ²_X/n). When σ is unknown and is replaced by the sample standard deviation S, (X̄ − μ) / (S/√n) follows the t distribution with (n−1) degrees of freedom.

For help, see the discussion on t distribution in the textbook.

H₀: μ = 7.5; H_A: μ ≠ 7.5

t = (6.5 − 7.5) / (2/√20) = −1 / (2/4.472) = −1/0.447 ≈ −2.24. Since the degrees of freedom are n − 1 = 19, the critical t value at the 5% level of significance for a two-tail test is 2.093 from the t-table. [Note: H_A: μ ≠ 7.5 allows for two possibilities, so we need a two-tail test.] Since the absolute value of the computed t statistic, 2.24, is larger than the critical t value, we reject the null hypothesis.
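The test statistic and the decision rule can be reproduced in a few lines of Python. The sample values and the critical value 2.093 come from the answer above; the function and variable names are illustrative assumptions.

```python
import math


def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """t = (sample mean - hypothesized mean) / (S / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (sample_sd / math.sqrt(n))


# Critical value for 19 degrees of freedom, 5% two-tail, as read from the t-table.
T_CRITICAL = 2.093

t_stat = one_sample_t(sample_mean=6.5, hypothesized_mean=7.5, sample_sd=2.0, n=20)

# Two-tail test: compare the absolute value of t with the critical value.
reject_null = abs(t_stat) > T_CRITICAL
print(round(t_stat, 3), reject_null)
```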

Alternatively, one can set up a confidence interval and check whether the hypothesized value falls inside or outside it, as illustrated in Example 4.3 in Essentials of Econometrics, p. 108. The textbook example uses α = 1%, while this test question asks you to use α = 5%. From the Appendix t-table, the critical t value for 19 degrees of freedom is 2.093.

The confidence interval is: X̄ ± t_c (S/√n), where S = sample standard deviation. Plugging numerical values into this formula, we obtain: 6.5 ± 2.093 (2/√20) = 6.5 ± 0.936. So the confidence interval is:

5.564 ≤ μ_X ≤ 7.436. Since the hypothesized value of the true mean, 7.5, lies outside the confidence interval, we reject the null hypothesis. This is the same result as we obtained from the t-test, which is simpler and easier to use.
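The confidence-interval approach can be sketched the same way. The inputs are the values from the answer above; the helper name is an illustrative assumption.

```python
import math


def t_confidence_interval(sample_mean, sample_sd, n, t_critical):
    """Return (lower, upper) for sample_mean ± t_critical * S / sqrt(n)."""
    margin = t_critical * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


lower, upper = t_confidence_interval(
    sample_mean=6.5, sample_sd=2.0, n=20, t_critical=2.093
)

# The null hypothesis is rejected when the hypothesized mean is outside the interval.
hypothesized_mean = 7.5
inside = lower <= hypothesized_mean <= upper
print(round(lower, 3), round(upper, 3), inside)
```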

A. Sketch the regression line.

B. Interpretation of the intercept: Strictly interpreted, the intercept says that when the retail price is zero, average coffee consumption is 2.691 cups per person per day. But this interpretation is absurd. What, then, does the intercept term represent? It essentially represents the influence of variables excluded from the regression.

C. Slope coefficient: When the retail price of coffee increases by one dollar per pound, coffee consumption decreases by 0.4795 cups per person per day on average. Remember: in regression analysis, we are examining the relationship between the conditional mean of Y and its determinants, say X.
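The intercept and slope quoted above define a fitted line that can be evaluated at any price. In this minimal sketch, the negative sign on the slope follows from the statement that consumption decreases as price rises; the function name and the example prices are illustrative assumptions.

```python
INTERCEPT = 2.691  # estimated intercept quoted in the answer
SLOPE = -0.4795    # estimated slope; negative because consumption falls as price rises


def predicted_consumption(price):
    """Estimated mean coffee consumption at a given retail price (dollars per pound)."""
    return INTERCEPT + SLOPE * price


# A one-dollar price increase changes the prediction by exactly the slope.
change = predicted_consumption(2.0) - predicted_consumption(1.0)
print(round(predicted_consumption(1.0), 4), round(change, 4))
```

Evaluating the line at a price of zero returns the intercept, which is why the strict interpretation of the intercept refers to a zero retail price.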
