Sunday, November 2, 2025

What is the Variance Inflation Factor?

## **Variance Inflation Factor (VIF)**


The **Variance Inflation Factor (VIF)** measures how much the variance of a regression coefficient is inflated due to multicollinearity in the model.

---

### **Formula**

For predictor \( X_k \):

\[

\text{VIF}_k = \frac{1}{1 - R_k^2}

\]

where \( R_k^2 \) is the R-squared value from regressing \( X_k \) on all other predictors.
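
To get a feel for how quickly the formula grows, here is a minimal sketch that evaluates it for a few values of \( R_k^2 \). The function name `vif_from_r2` is made up for illustration and is not part of any library.

```python
def vif_from_r2(r2):
    """Return VIF = 1 / (1 - R^2) for an R^2 value in [0, 1]."""
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("R^2 must lie between 0 and 1")
    if r2 == 1.0:
        # Perfect collinearity: the predictor is an exact linear
        # combination of the others, so the VIF is unbounded.
        return float("inf")
    return 1.0 / (1.0 - r2)


for r2 in (0.0, 0.5, 0.8, 0.9):
    print(f"R^2 = {r2:.1f} -> VIF = {vif_from_r2(r2):.2f}")
```

An \( R_k^2 \) of 0.8 corresponds to a VIF of 5 and an \( R_k^2 \) of 0.9 to a VIF of 10, so the VIF climbs steeply as \( R_k^2 \) approaches 1.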

---


### **Interpretation**

- **VIF = 1**: No multicollinearity (\( X_k \) is uncorrelated with the other predictors)

- **1 < VIF ≤ 5**: Moderate correlation (usually acceptable)

- **5 < VIF ≤ 10**: High multicollinearity (may be problematic)

- **VIF > 10**: Severe multicollinearity (coefficient estimates are unstable)

These cutoffs are common rules of thumb, not strict thresholds.
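
The bands above can be encoded as a small helper. This is only a sketch of the rule-of-thumb cutoffs listed here. The function name `interpret_vif` and the returned labels are illustrative choices, not a standard API.

```python
import math


def interpret_vif(vif):
    """Map a VIF value to the rule-of-thumb bands listed above."""
    if math.isclose(vif, 1.0):
        return "none"
    if vif < 1.0:
        # By definition VIF = 1 / (1 - R^2) can never fall below 1.
        raise ValueError("VIF cannot be smaller than 1")
    if vif <= 5.0:
        return "moderate"
    if vif <= 10.0:
        return "high"
    return "severe"


for value in (1.0, 3.2, 7.5, 25.0):
    print(value, interpret_vif(value))
```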

---

## **How VIF is Helpful**

1. **Detects Multicollinearity**

   - Identifies when predictors are highly correlated with each other

   - Helps understand which variables contribute to collinearity

2. **Assesses Regression Coefficient Stability**

   - High VIF → large standard errors → unreliable coefficient estimates

   - Helps decide if some variables should be removed or combined

3. **Guides Model Improvement**

   - Suggests when to:

     - Remove redundant variables

     - Combine correlated variables (e.g., using PCA)

     - Use regularization (e.g., Ridge regression)

4. **Better Model Interpretation**

   - With lower multicollinearity, coefficient interpretations are more reliable

   - Each predictor's effect can be isolated more clearly
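
The link between a high VIF and large standard errors (point 2) can be made explicit. As a sketch, assume an ordinary least squares model with an intercept and error variance \( \sigma^2 \), and let \( x_{ik} \) be the \( i \)-th observation of \( X_k \) with mean \( \bar{x}_k \) (these symbols are introduced here and do not appear elsewhere in this post). The sampling variance of the coefficient on \( X_k \) is then:

\[

\operatorname{Var}(\hat{\beta}_k) = \frac{\sigma^2}{\sum_{i=1}^{n} (x_{ik} - \bar{x}_k)^2} \cdot \text{VIF}_k

\]

The first factor is the variance the coefficient would have if \( X_k \) were uncorrelated with the other predictors, so the standard error is inflated by \( \sqrt{\text{VIF}_k} \). For example, a VIF of 4 doubles the standard error.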

---

### **Example Usage**

Suppose the predictors are House Size, Number of Rooms, and Number of Bathrooms. To compute the VIF for "Number of Rooms":

- Regress "Number of Rooms" on "House Size" and "Number of Bathrooms"

- High \( R^2 \) → High VIF → these variables contain overlapping information

- Possible fix: keep "House Size" and one other predictor, or create a composite feature
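
The steps above can be sketched in code. The example below uses NumPy only and runs on synthetic data: the house sizes, room counts, and bathroom counts are randomly generated for illustration, and the helper name `vif_scores` is an assumption of this sketch, not a library function.

```python
import numpy as np


def vif_scores(X):
    """Return the VIF of every column of X (rows are observations)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    scores = []
    for k in range(p):
        y = X[:, k]                                # predictor treated as the response
        others = np.delete(X, k, axis=1)           # all remaining predictors
        A = np.column_stack([np.ones(n), others])  # add an intercept column
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        scores.append(1.0 / (1.0 - r2))
    return np.array(scores)


# Synthetic data: rooms depend on size, bathrooms depend on rooms.
rng = np.random.default_rng(0)
n = 500
size = rng.normal(150, 40, n)
rooms = size / 30 + rng.normal(0, 0.5, n)
baths = rooms / 2 + rng.normal(0, 0.4, n)

X = np.column_stack([size, rooms, baths])
vifs = vif_scores(X)
for name, v in zip(["House Size", "Number of Rooms", "Number of Bathrooms"], vifs):
    print(f"{name}: VIF = {v:.2f}")
```

Because the room count is built from the house size here, its VIF comes out well above 1. With real data, the same loop shows which predictors overlap most.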

---

**Bottom line**: VIF helps build more robust, interpretable models by identifying and addressing multicollinearity issues.



 

