What is VIF (Variance Inflation Factor)

Multi-collinearity is a state in which several independent attributes are correlated with each other. In other words, one attribute is directly or indirectly related to other attributes, so they contribute similar predictive power to the model. In this blog, you will learn what the Variance Inflation Factor is and how to calculate it.

The correlation coefficient is one measure of multi-collinearity, but it only captures the correlation between two variables at a time. The Variance Inflation Factor (VIF) can therefore be used to measure collinearity among multiple attributes.
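To illustrate why a pairwise measure can miss multi-collinearity, here is a minimal sketch on synthetic data. The variables `x1` to `x4` and the noise level are assumptions made up for this example; the VIF is computed as 1 / (1 - R-squared), with R-squared taken from regressing one attribute on all the others.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Three independent attributes, plus a fourth that is almost their sum.
x1, x2, x3 = rng.normal(size=(3, n))
x4 = x1 + x2 + x3 + 0.1 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3, x4])

# Pairwise view: largest absolute correlation between any two attributes.
corr = np.corrcoef(X, rowvar=False)
max_pairwise = np.abs(corr - np.eye(4)).max()

# Multi-attribute view: regress x4 on x1, x2, x3 (with an intercept).
A = np.column_stack([np.ones(n), x1, x2, x3])
coef, *_ = np.linalg.lstsq(A, x4, rcond=None)
resid = x4 - A @ coef
r_squared = 1 - resid.var() / x4.var()
vif_x4 = 1 / (1 - r_squared)

print(f"largest pairwise correlation: {max_pairwise:.2f}")
print(f"VIF of x4: {vif_x4:.1f}")
```

In this setup no single pair of attributes looks strongly correlated, yet `x4` is almost fully explained by the other three taken together, which is the situation VIF is meant to expose.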

The formula for finding VIF of any attribute is below:

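$$\mathrm{VIF}_i = \frac{1}{1 - R_i^2}$$

where $R_i^2$ is the R-squared obtained by regressing attribute $i$ on all the other attributes.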

The VIF of each attribute is calculated by running a multiple regression model in which that attribute is the dependent variable and the other attributes are the independent variables.

For example, to find the VIF of x1 from a set of independent variables {x1, x2, x3}, a multiple linear regression model is built in which x1 acts as the dependent variable and x2 and x3 act as the independent variables. The VIF of x1 is then determined from the R-squared value of that regression, and the same procedure is repeated for each of the other variables.
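The x1 example can be sketched with plain NumPy as follows. The data is synthetic and the helper name `r_squared` is hypothetical, introduced only for this illustration; the regression includes an intercept, which is an assumption of this sketch.

```python
import numpy as np

def r_squared(y, X):
    """R-squared of an ordinary least squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    centered = y - y.mean()
    return 1 - (resid @ resid) / (centered @ centered)

rng = np.random.default_rng(42)
n = 500
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
# x1 is partly explained by x2 and x3, so its VIF should be well above 1.
x1 = 2 * x2 - x3 + rng.normal(size=n)

# x1 is the dependent variable; x2 and x3 are the independent variables.
rsq_x1 = r_squared(x1, np.column_stack([x2, x3]))
vif_x1 = 1 / (1 - rsq_x1)
print(f"R-squared for x1: {rsq_x1:.3f}, VIF for x1: {vif_x1:.2f}")
```

Repeating the same call with x2 and then x3 as the dependent variable would give the VIF of the remaining attributes.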

The higher the VIF of an attribute, the stronger the multi-collinearity between that attribute and the others. As a rule of thumb, you should remove attributes with VIF values above 10 to avoid multi-collinearity issues in the model.
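To get a feel for what a threshold of 10 means, it may help to invert the standard VIF definition and plug in a few R-squared values. The sample values below are chosen purely for illustration.

```latex
\[
\mathrm{VIF}_i = \frac{1}{1 - R_i^2}
\quad\Longleftrightarrow\quad
R_i^2 = 1 - \frac{1}{\mathrm{VIF}_i}
\]
\[
\begin{aligned}
R_i^2 = 0.50 &\;\Rightarrow\; \mathrm{VIF}_i = 2 \\
R_i^2 = 0.80 &\;\Rightarrow\; \mathrm{VIF}_i = 5 \\
R_i^2 = 0.90 &\;\Rightarrow\; \mathrm{VIF}_i = 10 \\
R_i^2 = 0.99 &\;\Rightarrow\; \mathrm{VIF}_i = 100
\end{aligned}
\]
```

So a VIF above 10 corresponds to an attribute for which more than 90% of the variance is explained by the other attributes.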

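import pandas as pd
import statsmodels.api as sm

def vif(input_data, dependent_col):
    vif_df = pd.DataFrame(columns=['Var', 'Vif'])
    x_vars = input_data.drop([dependent_col], axis=1)
    for i, col in enumerate(x_vars.columns):
        # Regress this attribute on all the others (plus an intercept) and use the R-squared
        others = sm.add_constant(x_vars.drop(columns=[col]))
        rsq = sm.OLS(x_vars[col], others).fit().rsquared
        vif_df.loc[i] = [col, round(1 / (1 - rsq), 2)]
    return vif_df.sort_values(by='Vif', axis=0, ascending=False, inplace=False)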

# Calculating Vif value
vif(input_data=housing, dependent_col="price")
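One common way to apply the rule of thumb is to drop the attribute with the largest VIF, recompute, and repeat until every remaining VIF is at or below the threshold. The sketch below is one possible version of that loop, not a procedure taken from this post; the helper names `vif_values` and `drop_high_vif`, the column names, and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def vif_values(X):
    """VIF of every column of X, each from a regression on the other columns plus an intercept."""
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        centered = y - y.mean()
        rsq = 1 - (resid @ resid) / (centered @ centered)
        out.append(1 / (1 - rsq))
    return np.array(out)

def drop_high_vif(X, names, threshold=10.0):
    """Repeatedly drop the column with the largest VIF until all are at or below threshold."""
    names = list(names)
    while X.shape[1] > 1:
        vifs = vif_values(X)
        worst = int(np.argmax(vifs))
        if vifs[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        del names[worst]
    return X, names

# Synthetic stand-in for a real dataset (hypothetical column names).
rng = np.random.default_rng(1)
n = 300
area = rng.normal(size=n)
rooms = rng.normal(size=n)
area_sqft = 10.76 * area + 0.01 * rng.normal(size=n)  # near-duplicate of area
X = np.column_stack([area, rooms, area_sqft])

X_kept, kept = drop_high_vif(X, ["area", "rooms", "area_sqft"])
print(kept)
```

Dropping one attribute at a time matters because removing a single collinear attribute often brings the VIF of its partners back down, so removing everything above the threshold in one pass can discard more than necessary.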

In this blog, we discussed what the Variance Inflation Factor is and how to calculate it.
