Polynomial Regression and Cross Validation

Polynomial regression is a type of regression analysis used in statistics and machine learning to model the relationship between a dependent variable (target) and one or more independent variables (predictors) as an nth-degree polynomial function. In simple terms, it extends linear regression by introducing higher-order terms of the independent variable(s), which lets the model capture non-linear patterns in the data. The equation for a polynomial regression model of degree n can be represented as:

$$Y = b_0 + b_1 X + b_2 X^2 + \dots + b_n X^n$$

Where:
$Y$ is still the dependent variable.
$X$ is the independent variable.
$b_0$ is the intercept.
$b_1, b_2, \dots, b_n$ are the coefficients of the polynomial terms.
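
To make the equation concrete, here is a minimal sketch of how the coefficients can be estimated by ordinary least squares. It uses NumPy only; the helper names `design_matrix` and `fit_polynomial` and the synthetic data are assumptions made for illustration and are not taken from the original analysis.

```python
import numpy as np

def design_matrix(x, degree):
    """Build a matrix whose columns are x**0, x**1, ..., x**degree.

    Each column corresponds to one coefficient b_0, b_1, ..., b_n.
    """
    return np.vander(x, degree + 1, increasing=True)

def fit_polynomial(x, y, degree):
    """Return the least-squares estimate of [b_0, b_1, ..., b_n]."""
    coeffs, *_ = np.linalg.lstsq(design_matrix(x, degree), y, rcond=None)
    return coeffs

# Synthetic, noise-free data generated from Y = 1 + 2X + 3X^2 (illustrative only).
x = np.linspace(-2.0, 2.0, 20)
y = 1.0 + 2.0 * x + 3.0 * x**2

coeffs = fit_polynomial(x, y, degree=2)
print(coeffs)  # estimates of b_0, b_1, b_2
```

Because the model is linear in the coefficients, polynomial regression is just linear regression on the expanded features, which is why a plain least-squares solver is enough here.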

  • We evaluated polynomial regression over a range of five degrees. For each degree, we created polynomial features, fit a polynomial regression model, and performed cross-validation to obtain R-squared scores.
  • We plotted the learning curve to visualize how the cross-validation score changes with the polynomial degree.
  • We identified the best degree as the one with the highest cross-validation R-squared score.
  • From the graph below, we can conclude that the best-fitting degree for the present data is 2.
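
The steps above can be sketched roughly as follows. This is not the code used for the analysis: the data is synthetic, the degrees 1 through 5 and the 5-fold split are assumptions, and the helpers `r2_score` and `cv_r2` are hypothetical names written with NumPy only. In practice, scikit-learn's `PolynomialFeatures`, `LinearRegression`, and `cross_val_score` are a common way to do the same thing.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def cv_r2(x, y, degree, k=5, seed=0):
    """Mean R-squared over k shuffled folds for a polynomial of the given degree."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit on the training folds, score on the held-out fold.
        coeffs = np.polyfit(x[train_idx], y[train_idx], degree)
        y_pred = np.polyval(coeffs, x[test_idx])
        scores.append(r2_score(y[test_idx], y_pred))
    return float(np.mean(scores))

# Synthetic data with a quadratic trend plus noise (assumption, for illustration).
rng = np.random.default_rng(42)
x = rng.uniform(-3.0, 3.0, 200)
y = 0.5 * x**2 - x + 2.0 + rng.normal(0.0, 1.0, 200)

degrees = range(1, 6)  # assumed range: degrees 1 through 5
mean_scores = {d: cv_r2(x, y, d) for d in degrees}
best_degree = max(mean_scores, key=mean_scores.get)

for d in degrees:
    print(f"degree {d}: mean CV R^2 = {mean_scores[d]:.3f}")
print("best degree:", best_degree)
```

Plotting `mean_scores` against the degree gives the kind of curve described above. When several degrees score almost identically, preferring the lowest of them is a reasonable way to avoid overfitting.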
