Hi Guys,

Here are a few things to note about Regularization:

Let’s start by training a Linear Regression machine learning model. It reported well on our training data with an accuracy score of 98%, but failed to do the same on the test data, scoring only 36%. That is not a good sign for any machine learning model we want to deploy! Right???

What happened?

Why did the Model fail?

Was there a fault in the metrics selected?

Lots of Questions Right!!!

**Let’s decode!!!** The machine learning model learns from the given training data (the data that is available) & fits itself to its patterns. The unseen data (test data) will have a different pattern, so the model is completely blindsided by it. This leads us to an overfit model.

Let’s take an example of data with overfit conditions for better understanding.

**Note: We are using Linear Regression aka Ordinary Least Squares Method.**

The above plots show the fit of a model on the training data with 100% accuracy, where the sum of residuals is 0. Perfect, right? Our machine learning model is in perfect shape.

Let’s take this model for a test ride (check it on the test data).

Our test ride took a huge diversion & the outcome was completely wrong: the sum of residuals is much larger & the accuracy has dropped drastically!!!

One more Ooooooops.

This represents the Overfitting of the model.
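To see this in code, here is a minimal sketch with made-up data (the points, the degree-7 polynomial, and all variable names are illustrative, not from the original plots). A model flexible enough to pass through every training point has training residuals of essentially zero, yet misses badly on unseen points:

```python
# Hypothetical overfitting demo: interpolate 8 noisy training points
# exactly, then evaluate the same curve on unseen test points.
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 8)
y_train = 2 * x_train + rng.normal(0, 0.3, size=8)

# A degree-7 polynomial has enough freedom to pass through all 8 points,
# so the training residuals are ~0 — a "perfect" fit on the training data.
coeffs = P.polyfit(x_train, y_train, deg=7)
train_residuals = y_train - P.polyval(x_train, coeffs)

# On unseen test points the same curve swings wildly between the samples.
x_test = np.linspace(0.05, 0.95, 8)
y_test = 2 * x_test + rng.normal(0, 0.3, size=8)
test_residuals = y_test - P.polyval(x_test, coeffs)

print(np.abs(train_residuals).sum())  # ~0 on training data
print(np.abs(test_residuals).sum())   # much larger on test data
```

The near-zero training residuals next to the large test residuals are exactly the train/test gap described above.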

# What is Overfit & Under Fit?

**Overfitting:** Good performance on the training data, bad/poor performance on other data (test data).

**Underfitting:** Poor performance on the training data, poor performance on other data (test data).

# How do we reduce this Overfit?

As the title suggests, we are going to use the regularization method. This means regularizing, or shrinking, the coefficients towards zero by adding a penalty term to the loss, which prevents the model from overfitting the data.
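Here is a tiny sketch of that shrinking effect with made-up one-feature data (the closed-form `fit_slope` helper and the lambda values are purely illustrative). Adding a penalty `lam * w**2` to the squared-error loss pulls the fitted slope towards zero:

```python
# Illustrative only: minimizing sum((y - w*x)^2) + lam * w^2 has the
# closed-form solution w = (x·y) / (x·x + lam); lam > 0 shrinks w.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=20)
y = 5 * x + rng.normal(0, 1, size=20)  # true slope is 5

def fit_slope(lam):
    # lam = 0 gives the ordinary least squares slope.
    return (x @ y) / (x @ x + lam)

print(fit_slope(0.0))    # OLS slope, near 5
print(fit_slope(100.0))  # penalized slope, shrunk towards zero
```

The larger the penalty, the smaller the fitted coefficient — that trade is what regularization buys us.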

# What is the term Regularization? What are the different Regularization Models?

Regularization is a technique used to reduce error by fitting a function appropriately on a given training data set, avoiding noise & overfitting issues.

## Ridge Regression aka L2 Regularization

Ridge Regression (also called Tikhonov Regularization) is a regularised version of Linear Regression & a technique for analyzing multiple-regression data (features) that suffer from multicollinearity. With multicollinearity, the estimates remain unbiased but have high variance, so they can differ hugely from the true values. Introducing a small bias reduces the variance & the standard errors, giving us a more optimal model.

Here the term lambda is a hyperparameter that can be tuned to get the best fit. The penalty is lambda times the squared slope, which pulls the best-fit line towards zero (towards the X-axis). A small increase in bias buys us a larger reduction in variance.
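A minimal sketch of this with scikit-learn, on made-up multicollinear data (the feature construction and alpha values are my own illustration; note that scikit-learn calls lambda `alpha`):

```python
# Sketch: Ridge shrinks coefficients towards zero as lambda (alpha in
# scikit-learn) grows, taming the unstable OLS fit on collinear features.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(42)
X = rng.normal(size=(50, 3))
# Make two nearly identical (multicollinear) features.
X[:, 1] = X[:, 0] + rng.normal(0, 0.01, size=50)
y = 3 * X[:, 0] + 2 * X[:, 2] + rng.normal(0, 0.1, size=50)

ols = LinearRegression().fit(X, y)
ridge_small = Ridge(alpha=1.0).fit(X, y)
ridge_large = Ridge(alpha=100.0).fit(X, y)

# OLS splits the effect of the collinear pair unstably; Ridge keeps the
# coefficient norm smaller, and more so as alpha increases.
print(np.linalg.norm(ols.coef_))
print(np.linalg.norm(ridge_small.coef_))
print(np.linalg.norm(ridge_large.coef_))
```

In practice, `alpha` is tuned by cross-validation (e.g. `sklearn.linear_model.RidgeCV`) rather than picked by hand.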

## Elastic Net Regression aka L1 & L2 Regularization

Elastic Net Regression sits in between Ridge & Lasso Regression: as a regularisation term, we add a mix of both the Ridge & Lasso penalty terms.
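A minimal sketch with scikit-learn on made-up data (the feature count, `alpha`, and `l1_ratio` values are illustrative). In scikit-learn, `l1_ratio` controls the mix: 1.0 is pure Lasso, 0.0 is pure Ridge:

```python
# Sketch: ElasticNet mixes the L1 (Lasso) and L2 (Ridge) penalties.
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
# Only the first two features actually drive the target.
y = 4 * X[:, 0] + 2 * X[:, 1] + rng.normal(0, 0.5, size=100)

# l1_ratio=0.5 applies half Lasso penalty, half Ridge penalty.
enet = ElasticNet(alpha=0.5, l1_ratio=0.5).fit(X, y)

# The L1 part drives irrelevant coefficients to exactly zero, while the
# L2 part keeps the surviving coefficients shrunken and stable.
print(enet.coef_)
```

The zeroed-out coefficients are the L1 behaviour; the shrinkage of the surviving ones is the L2 behaviour — both in one model.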

Overall summary: Regularization reduces the overfit of a linear model, & we achieve this by using either the L1, L2, or L1+L2 penalty.