Omitted Variable Bias vs Multicollinearity

ar169

With multicollinearity, the standard errors become unreliable when the independent variables are correlated with one another. However, under omitted variable bias, the curriculum says:
“If the true regression model was Y = b0 + b1X1 + b2X2 + ε and we estimate the model Yi = a0 + a1X1i + εi then our regression model would be misspecified. What is wrong with the model? If the omitted variable (X2) is correlated with the remaining variable (X1), then the error term in the model will be correlated with (X1), and the estimated values of the regression coefficients a0 and a1 would be biased and inconsistent. In addition, the estimates of the standard errors of those coefficients will also be inconsistent, so we can use neither the coefficients estimates nor the estimated standard errors to make statistical tests.”
CFA Institute. CFA Institute Level II 2014 Volume 1: Ethical and Professional Standards, Quantitative Methods, and Economics. John Wiley & Sons P&T, 2013-07-12.

I don’t understand this. On one hand, including two correlated independent variables can be a problem, and on the other hand, omitting a variable that is correlated with an included one is also a problem?
 
Right. In one case, the model is properly specified, but some independent variables are correlated with one another, which is not a terrible problem (especially if the model is only going to be used for prediction). In the other case, you have left out a necessary piece of the puzzle. The omitted variable ends up in the error term, so your x-variable is now correlated with the error term, which is also problematic. The regressors should be exogenous (errors with mean zero and uncorrelated with the regressors, E(e|x)=0). If you have multicollinearity, you can fix the issue in a few ways. The other issue is more limited in its remedies: to satisfy the assumption, you need to include the omitted variable.
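To make the "x is correlated with the error term" point concrete, here is a minimal simulation sketch in Python with NumPy. The coefficients (1, 2, 3), the 0.8 link between x1 and x2, and the helper name ols are made up for illustration; none of them come from the curriculum.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data-generating process (all numbers chosen for illustration).
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=n)        # x2 is correlated with x1
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)   # true slope on x1 is 2


def ols(y, *cols):
    """Return OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones(len(y)), *cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


full = ols(y, x1, x2)     # correctly specified model
short = ols(y, x1)        # x2 omitted, so it is absorbed into the error term
delta = ols(x2, x1)[1]    # slope from regressing x2 on x1

# In-sample identity: short slope = full slope on x1 + (full slope on x2) * delta
implied = full[1] + full[2] * delta

print("slope on x1, full model: ", full[1])
print("slope on x1, x2 omitted: ", short[1])
print("implied by the identity: ", implied)
```

With these made-up numbers the short regression should land near 2 + 3 × 0.8 = 4.4 rather than 2, because x1 picks up the part of x2 that moves with it. If x2 were uncorrelated with x1, delta would be near zero and the bias would vanish.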
 
But multicollinearity is a huge issue: coefficients and standard errors are unreliable, and the recommended fix is to remove one of the variables (which is the complete opposite of the remedy for omitted variable bias).
 
Yes, but multicollinearity (excluding perfect multicollinearity) doesn’t violate any of the necessary assumptions. It can be mitigated by increasing the sample size, which reduces the standard errors of the coefficients. Additionally, if the x-variables are so highly correlated, do you really need both? Combining the two into one variable seems pretty reasonable (if the variables are SO correlated and similar).
Also, only the coefficients on the highly correlated variables will be unreliable; other variables in the model that are uncorrelated with them can still be estimated properly. If you are trying to interpret the partial effect of x1 on y, and x1 is uncorrelated (or only weakly correlated) with the other regressors, why fix something that isn’t affecting your analysis?
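One common way to check which coefficients are affected is the variance inflation factor, VIF = 1 / (1 - R²), where R² comes from regressing one regressor on the others. The sketch below is only an illustration: the 0.95 correlation, the extra regressor x3, and the helper name vif are my own assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical regressors: x1 and x2 are highly correlated, x3 is unrelated to both.
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + np.sqrt(1 - 0.95**2) * rng.normal(size=n)  # corr(x1, x2) about 0.95
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])


def vif(X, j):
    """VIF of column j: 1 / (1 - R^2) from regressing it on the other columns."""
    target = X[:, j]
    others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
    beta, *_ = np.linalg.lstsq(others, target, rcond=None)
    resid = target - others @ beta
    r2 = 1.0 - resid.var() / target.var()
    return 1.0 / (1.0 - r2)


vifs = [vif(X, j) for j in range(X.shape[1])]
for name, v in zip(["x1", "x2", "x3"], vifs):
    print(name, "VIF:", v)
```

Under these assumptions x1 and x2 should show a VIF of roughly 10, meaning their coefficient variances are about ten times what they would be without the overlap, while x3 should sit close to 1 and is essentially untouched.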
As you can see, there are many ways (more than these) to fix the problem of multicollinearity; some come at the expense of practical interpretation, and some do not.
So, as I said, multicollinearity is not that big of a deal if you are using the model only for prediction purposes (getting predicted values, confidence intervals, etc.). Once you try to make practical/economic interpretations of the coefficients, you might run into trouble, but then there are some solutions. It all comes down to evaluating the degree of multicollinearity, how it could be influencing your interpretations, and how you can fix the issue.
You are right, though, that dropping the variable solves one problem but creates another. The problem created by dropping a variable violates a basic regression assumption. This is usually perceived as a bigger deal (biased and inconsistent estimators, where the bias doesn’t dissipate with large samples) than some wonky coefficients and inflated variances.
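Here is a rough sketch of the prediction point, assuming a made-up setup where x2 is nearly a copy of x1 and the noise standard deviation is 1. The function name simulate and all numbers are hypothetical. It refits the model on many small samples and tracks both the x1 coefficient and the out-of-sample prediction error.

```python
import numpy as np

rng = np.random.default_rng(2)


def simulate(n):
    """Draw a sample from a hypothetical model with two nearly collinear regressors."""
    x1 = rng.normal(size=n)
    x2 = x1 + rng.normal(scale=0.05, size=n)             # x2 is nearly a copy of x1
    y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)   # noise s.d. is 1.0
    X = np.column_stack([np.ones(n), x1, x2])
    return X, y


X_test, y_test = simulate(5_000)   # fresh data with the same correlation structure

slopes, rmses = [], []
for _ in range(200):
    X, y = simulate(100)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    slopes.append(beta[1])
    rmses.append(np.sqrt(np.mean((y_test - X_test @ beta) ** 2)))

slope_sd = np.std(slopes)    # how much the x1 coefficient jumps around across samples
rmse_mean = np.mean(rmses)   # average out-of-sample prediction error

print("s.d. of estimated x1 slope:", slope_sd)
print("mean out-of-sample RMSE:   ", rmse_mean)
```

In this setup the x1 coefficient should swing widely from sample to sample, while the prediction error should stay close to the noise level. That only holds as long as new data keeps the same relationship between x1 and x2; if that relationship changes, the unstable coefficients start to matter for prediction too.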
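The trade-off can be sketched with a small Monte Carlo experiment. Everything here is an assumption for illustration: the true slope of 2 on x1, the 0.9 correlation, the sample sizes, and the helper name slopes_on_x1. It compares keeping both correlated variables against dropping x2 as the sample grows.

```python
import numpy as np

rng = np.random.default_rng(3)


def slopes_on_x1(n, reps=300):
    """Estimate the x1 slope `reps` times, with x2 included and with x2 dropped."""
    full, short = [], []
    for _ in range(reps):
        x1 = rng.normal(size=n)
        x2 = 0.9 * x1 + np.sqrt(1 - 0.81) * rng.normal(size=n)  # corr about 0.9
        y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)      # true x1 slope is 2
        ones = np.ones(n)
        b_full, *_ = np.linalg.lstsq(np.column_stack([ones, x1, x2]), y, rcond=None)
        b_short, *_ = np.linalg.lstsq(np.column_stack([ones, x1]), y, rcond=None)
        full.append(b_full[1])
        short.append(b_short[1])
    return np.array(full), np.array(short)


results = {}
for n in (50, 500, 5_000):
    full, short = slopes_on_x1(n)
    results[n] = (full.mean(), full.std(), short.mean(), short.std())
    print(f"n={n}: full mean={full.mean():.3f} sd={full.std():.3f} | "
          f"dropped mean={short.mean():.3f} sd={short.std():.3f}")
```

With both variables included, the estimates should center on 2 and their spread should shrink as n grows. With x2 dropped, the estimates should get tighter too, but around roughly 2 + 3 × 0.9 = 4.7, so more data only makes you more confident in the wrong number.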
 