How to...

tickersu · Apr 18, 2015

What is the best way to approach the Institute about a practice test (subject test) question that I’m pretty sure is incorrect? I’m asking about both the approach and the contact method; I didn’t notice an email address for the practice tests.
I took a QM practice test, “Charlent”, and the last question, about an improved model and multicollinearity, asks you to choose the statement that is least likely correct. I chose “The F-statistic is likely to be overstated”. Unless I’m missing something, this statement is factually incorrect. Multicollinearity does not affect the fit of the model; in other words, the population error variance is unaffected by multicollinearity (which is why you can still use the model for prediction in the presence of multicollinearity). If the error variance and model fit (indicated by R-squared) are unaffected, then the F-statistic is unaffected. They marked my answer as incorrect and said that [paraphrase] R-squared and the F-statistic are overstated in the presence of multicollinearity.
Anyone care to weigh in (preferably on both topics)?

Harrogath · Apr 18, 2015

What do you mean by “population error variance”, and what do you mean by “R-squared is unaffected”?
My understanding is that multicollinearity does affect the error variance of the coefficients and of the whole model, to the extent that what the coefficient t-tests say can contradict what the F-statistic says. You can reject the significance of practically all the coefficients and still conclude that the whole model is great (a high F, so a high R-squared too). I think all of those indicators are affected.
A model specification must be parsimonious, because if you add many variables that do not increase explanatory power or that are related to other independent variables, you are needlessly inflating the variance of the model, so the model is no longer efficient and consistent, or at least you increase the probability that it is not.

jounin83 · Apr 18, 2015

Hi Tickersu,
You can email them, as I am doing, with “potential errata investigated”, at: [email protected]
IMO, regarding the answer: multicollinearity -> F-test overstated (significant R^2)
Thanks

tickersu · Apr 18, 2015

Harrogath wrote:
What do you mean by “population error variance”, and what do you mean by “R-squared is unaffected”?

If the standard error of the estimate (SEE, or root MSE) is the standard deviation of the random error term in a regression, then I am referring to SEE squared (the variance of the random error term, MSE). The true (population) variance of the error term is unaffected (neither under- nor overstated) in the presence of multicollinearity. This also means that the true (population) R-squared is neither under- nor overstated in the presence of multicollinearity.

Harrogath wrote:
My understanding is that multicollinearity does affect the error variance of the coefficients

Correct, if you mean the variance of the coefficients (or the standard errors of the coefficients). This is because multicollinearity affects the accuracy with which we can estimate these coefficients (which is demonstrable through VIFs). It does not change the predictive power of our model, though (using values of the Xs to predict Y); therefore, the error variance (the variance of Y minus predicted Y) is estimated properly.

Harrogath wrote:
and of the whole model, to the extent that what the coefficient t-tests say can contradict what the F-statistic says. You can reject the significance of practically all the coefficients and still conclude that the whole model is great (a high F, so a high R-squared too). I think all of those indicators are affected.

The estimated coefficients and their standard errors (and therefore the t-tests) are affected. R-squared and the F-statistic are model-based statistics, and they are unaffected by multicollinearity (MC). The contradictory t-tests and F-test can occur because the t-statistics are deflated (via inflated standard errors for the coefficients), whereas the F-statistic is a ratio of variances that are not influenced by MC (one of which is the MSE, which is unaltered by multicollinearity). Also, “high” F-statistics and R-squared values are relative. An F-statistic is “high” if it exceeds the critical F value, and whether an R-squared is “high” might depend on previous research results (40% might be high if all previous studies were below 30%, for example).

Harrogath wrote:
A model specification must be parsimonious,

Not “must”, but it is a practical guideline in certain scenarios.

Harrogath wrote:
because if you add many variables that do not increase explanatory power

Agreed; in that case, remove the superfluous variables.

Harrogath wrote:
or that are related to other independent variables, you are needlessly inflating the variance of the model,

Multicollinearity isn’t always an issue. There are several practical scenarios where your model exhibits multicollinearity but you are not concerned about it. It’s a very common misinterpretation that multicollinearity is always bad and must be remedied.

Harrogath wrote:
so the model is no longer efficient and consistent, or at least you increase the probability that it is not.

Not true; the consistency of OLS is not affected by multicollinearity.
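
For anyone who wants to check the points above numerically, here is a minimal simulation sketch in Python (NumPy only). Everything in it is an assumption made for illustration and is not taken from the practice test: the two-predictor model, the slopes of 0.5, the correlation of 0.98, the sample size, and the helper names ols_summary and simulate. It compares an uncorrelated design with a highly collinear one and reports the coefficient standard errors, the MSE against the true error variance, and the sample R-squared against the population R-squared implied by the setup.

```python
import numpy as np

def ols_summary(X, y):
    """Fit OLS with an intercept and return the statistics discussed above."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])            # design matrix with intercept
    xtx_inv = np.linalg.inv(A.T @ A)
    beta = xtx_inv @ A.T @ y                        # OLS coefficient estimates
    resid = y - A @ beta
    sse = float(resid @ resid)                      # unexplained sum of squares
    sst = float(((y - y.mean()) ** 2).sum())        # total sum of squares
    mse = sse / (n - k - 1)                         # SEE squared
    se = np.sqrt(mse * np.diag(xtx_inv))            # coefficient standard errors
    r2 = 1.0 - sse / sst
    f = ((sst - sse) / k) / mse
    return {"se": se, "t": beta / se, "mse": mse, "r2": r2, "f": f}

def simulate(rho, n=2000, b=0.5, sigma=1.0, seed=0):
    """Hypothetical model: y = 1 + b*x1 + b*x2 + e, with corr(x1, x2) = rho."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    X = rng.multivariate_normal(np.zeros(2), cov, size=n)
    y = 1.0 + b * X[:, 0] + b * X[:, 1] + rng.normal(0.0, sigma, size=n)
    out = ols_summary(X, y)
    signal_var = 2.0 * b**2 * (1.0 + rho)           # Var(b*x1 + b*x2)
    out["pop_r2"] = signal_var / (signal_var + sigma**2)
    out["vif"] = 1.0 / (1.0 - rho**2)               # population VIF for each X
    return out

low = simulate(rho=0.0)     # no multicollinearity
high = simulate(rho=0.98)   # severe multicollinearity

for name, res in [("rho=0.00", low), ("rho=0.98", high)]:
    print(name,
          "| VIF:", round(res["vif"], 1),
          "| SE(b1):", round(res["se"][1], 3),
          "| MSE:", round(res["mse"], 3),
          "| R2:", round(res["r2"], 3),
          "| population R2:", round(res["pop_r2"], 3),
          "| F:", round(res["f"], 1))
```

One caveat on reading the output: the population R-squared differs between the two designs (positively correlated predictors with positive slopes carry more joint variance), so the two F values are not directly comparable with each other. The comparison that matters is within each design: the MSE should sit near the true error variance of 1 and the sample R-squared near its population value, while SE(b1) should be several times larger in the collinear case.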

tickersu · Apr 18, 2015

jounin83 wrote:
Hi Tickersu,
You can email them, as I am doing, with “potential errata investigated”, at: [email protected]
IMO, regarding the answer: multicollinearity -> F-test overstated (significant R^2)
Thanks

Thanks for the reply! I’ll work up the email.
On your answer: multicollinearity does not, in fact, overstate the F-statistic or R-squared, and the way they are calculated shows this. It is the t-statistics that can be understated, because the standard errors of the coefficients are inflated.
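
To spell out the calculation point, here are the standard textbook formulas for a regression with k slope coefficients and n observations. The notation is my own choice and not the Institute's: SSR is the regression (explained) sum of squares, SSE is the residual sum of squares, s^2 is the MSE, and R_j^2 is the R-squared from regressing X_j on the other X-variables.

```latex
F = \frac{\mathrm{SSR}/k}{\mathrm{SSE}/(n-k-1)}
  = \frac{R^2/k}{(1-R^2)/(n-k-1)},
\qquad
s^2 = \mathrm{MSE} = \frac{\mathrm{SSE}}{n-k-1}

\widehat{\operatorname{Var}}(b_j)
  = \frac{s^2}{(1-R_j^2)\sum_{i=1}^{n}(x_{ij}-\bar{x}_j)^2}
  = \frac{s^2}{\sum_{i=1}^{n}(x_{ij}-\bar{x}_j)^2}\cdot \mathrm{VIF}_j,
\qquad
\mathrm{VIF}_j = \frac{1}{1-R_j^2}
```

R_j^2, which measures how well X_j is explained by the other X-variables, appears only in the coefficient variance. As R_j^2 approaches 1 the standard error of b_j grows and its t-statistic shrinks, while F and R-squared depend only on the overall sums of squares.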
I believe the mix-up comes from one potential indicator of multicollinearity (MC) and the way the Institute (and a few other texts) phrase it: low t-stats together with a “high” F-stat and a high R-squared. Here, “high” is a relative term implying statistical significance, not an overstatement.
In practical terms, what they are describing is this: you run a regression and check the F-test, and it’s significant. Great, let’s check the explanatory power! You look at R-squared, and it indicates pretty good explanatory power for the model (things are going well). Then you glance at the t-statistics, which tell you that none of the X-variables are any good! So you scratch your head and wonder, “What’s going on? My t-tests say the model is trash (no X-variables are statistically significant), but my F-test says at least one X-variable is statistically significant (and R-squared says the model is relatively good at explaining the DV).” Ah, now I remember: this is a classic indication that multicollinearity might be involved. Since the model has a significant F-test, it is still useful if we want to plug in numbers for X and predict Y, but we should be careful about interpreting the relationships (coefficient estimates) between X and Y. That is the gist of that “sign” of MC.
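
The “significant F, insignificant t” pattern described above can be reproduced with a small Monte Carlo sketch in Python (NumPy only). All of the specifics are assumptions chosen for illustration: two predictors where the second is almost a copy of the first, slopes of 0.5, n = 50, 1000 replications, and approximate 5% critical values hard-coded for 47 residual degrees of freedom. The name one_sample is hypothetical.

```python
import numpy as np

# Approximate 5% critical values for n = 50 with two predictors (47 residual df).
T_CRIT = 2.01   # two-sided t-test
F_CRIT = 3.20   # F-test with (2, 47) df

def one_sample(rng, n=50, noise_sd=0.05):
    """Simulate one data set with two nearly identical predictors; return F, t1, t2."""
    x1 = rng.normal(size=n)
    x2 = x1 + noise_sd * rng.normal(size=n)         # x2 is almost a copy of x1
    y = 0.5 * x1 + 0.5 * x2 + rng.normal(size=n)    # both slopes are truly nonzero
    A = np.column_stack([np.ones(n), x1, x2])
    xtx_inv = np.linalg.inv(A.T @ A)
    beta = xtx_inv @ A.T @ y
    resid = y - A @ beta
    sse = resid @ resid
    sst = ((y - y.mean()) ** 2).sum()
    mse = sse / (n - 3)
    t = beta / np.sqrt(mse * np.diag(xtx_inv))      # t-statistics for each coefficient
    f = ((sst - sse) / 2) / mse                     # overall F-statistic
    return f, t[1], t[2]

rng = np.random.default_rng(1)
results = np.array([one_sample(rng) for _ in range(1000)])
f, t1, t2 = results.T

f_rate = np.mean(f > F_CRIT)                        # how often the F-test is significant
t_rate = np.mean(np.abs(t1) > T_CRIT)               # how often x1's t-test is significant
symptom_rate = np.mean((f > F_CRIT)
                       & (np.abs(t1) < T_CRIT)
                       & (np.abs(t2) < T_CRIT))     # significant F, no significant t

print("F-test significant:", f_rate)
print("t-test on x1 significant:", t_rate)
print("significant F with no significant t:", symptom_rate)
```

Under these assumptions the F-test should reject in nearly every replication while the individual t-tests rarely do, which is the head-scratching situation in the paragraph above. It says nothing about F being overstated: the model really does explain Y, it just cannot separate x1 from x2.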
Many statistics texts I’ve read describe it this way, and some state explicitly that model fit (which can be assessed with R-squared) is unaffected by MC.
