Model Fitting: Underfitting vs Balanced vs Overfitting in Machine Learning
The goal of a machine learning model is to learn patterns from data and make accurate predictions on new examples. The central challenge, however, is striking the right balance in this learning process. Two very common mistakes a model can make during training are overfitting and underfitting. Let’s discuss both in detail, along with strategies to obtain a balanced model.
Underfitting
Underfitting occurs when your model is too simple to capture the underlying trend in your data. As a result, it generally performs poorly on both the training dataset and on new, unseen data.
Signs of Underfitting:
Accuracy is low on both the training data and the validation/testing data (see the sketch below).
The model does not capture the underlying trend in the data.
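As a quick illustration, here is a minimal sketch (assuming scikit-learn and a synthetic, purely illustrative dataset) of what those signs look like in practice: a linear model fitted to nonlinear data scores poorly on both splits.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic, illustrative data: the target follows a sine curve plus noise
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# A plain linear model is too simple for this sinusoidal pattern
model = LinearRegression().fit(X_train, y_train)

# Both scores come out low -- the classic signature of underfitting
print("train R^2:", model.score(X_train, y_train))
print("val   R^2:", model.score(X_val, y_val))
```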
Causes of Underfitting:
The model is too simple, e.g. a linear model applied to nonlinear data.
The model has not been trained enough.
Over-regularization, which constrains the model too heavily.
Techniques to Prevent Underfitting:
1. Increase Model Complexity: Use a more complex model or add more parameters.
2. Feature Engineering: Engineer more relevant features that capture underlying patterns.
3. Decrease Regularization: Reduce the regularization strength to give the model more flexibility (see the sketch after this list).
4. Increase Training Duration: Train the model for a longer time if it hasn’t converged yet.
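A minimal sketch of points 1 and 3 on the same kind of synthetic sine data as above (the polynomial degree and the alpha value are illustrative assumptions, not tuned results): adding polynomial features gives the model the capacity it was missing, and a small regularization strength keeps it from being over-constrained.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic, illustrative nonlinear data
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# 1. Increase model complexity: polynomial features let a linear model
#    represent the nonlinear trend.
# 3. Decrease regularization: a small alpha constrains the model only lightly.
model = make_pipeline(PolynomialFeatures(degree=6), Ridge(alpha=0.01))
model.fit(X_train, y_train)

# Both scores should now be much higher than with the plain linear model
print("train R^2:", model.score(X_train, y_train))
print("val   R^2:", model.score(X_val, y_val))
```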
Balanced Model
A balanced model performs well on both the training data and the validation or testing data. It generalizes well to new, unseen data because it captures the underlying patterns without fitting the noise.
Strategies to Obtain a Balanced Model:
1. Cross-Validation: Carry out cross-validation for hyperparameter tuning and model selection.
2. Model Selection: Select model complexity appropriate to the problem at hand.
3. Ensemble Methods: Combine several models (e.g. bagging or boosting) to reduce the risk of overfitting or underfitting.
4. Hyperparameter Tuning: Use grid search or random search to find the best hyperparameters (see the sketch after this list).
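As a sketch of strategies 1 and 4 (the estimator, the parameter grid, and the synthetic data are illustrative assumptions), cross-validated grid search selects the level of complexity that performs best on held-out folds rather than on the training data alone:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# Synthetic, illustrative data
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(300, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=300)

# Search over tree depth (model complexity) with 5-fold cross-validation,
# so the chosen depth is the one that generalizes best across folds.
search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid={"max_depth": [2, 3, 5, 8, None]},
    cv=5,
)
search.fit(X, y)

print("best max_depth:", search.best_params_["max_depth"])
print("best CV R^2:", search.best_score_)
```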
Overfitting
Overfitting occurs when a model learns the noise and fine detail in the training data to such an extent that it hurts its performance on new, unseen data. In other words, the model fits itself to irrelevant patterns that do not generalize.
Signs of Overfitting:
High accuracy on training data but poor performance on validation/testing data.
The model does great on the training set but generalizes badly to new data (see the sketch below).
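A minimal sketch of that gap (again using scikit-learn and synthetic data, so the exact scores are illustrative): a fully grown decision tree fits the noisy training points almost perfectly but scores noticeably lower on held-out data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic, illustrative data with noticeable noise
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=200)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# An unconstrained tree keeps splitting until it memorizes the training set
tree = DecisionTreeRegressor(max_depth=None, random_state=0).fit(X_train, y_train)

# Near-perfect training score, clearly lower validation score
print("train R^2:", tree.score(X_train, y_train))
print("val   R^2:", tree.score(X_val, y_val))
```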
Causes of Overfitting:
Complex models with large numbers of parameters
A dataset with a large number of features
A training dataset that is too small
Training the model for too long, i.e. for too many epochs
Techniques to Avoid Overfitting:
1. Cross-Validation: Apply techniques such as k-fold cross-validation so that the model is evaluated on different subsets of the data, not just a single split (illustrated in the sketch after this list).
2. Model Simplification: Simplify the model by reducing the number of features or parameters.
3. Regularization: Apply L1 (Lasso) or L2 (Ridge) regularization to penalize large coefficients.
4. Pruning: Remove branches of a decision tree that contribute little to its predictions.
5. Early Stopping: Monitor performance on a validation set during training and stop once the validation performance stops improving.
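Here is a minimal sketch of techniques 1 and 3 together (the degree-15 polynomial, the alpha value, and the tiny synthetic dataset are illustrative assumptions): k-fold cross-validation exposes how badly an unregularized high-degree fit generalizes, while L2 (Ridge) regularization penalizes large coefficients and tames it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# A small, noisy dataset: easy for a flexible model to overfit
rng = np.random.RandomState(0)
X = rng.uniform(-1, 1, size=(30, 1))
y = np.sin(3 * X).ravel() + rng.normal(scale=0.2, size=30)

# Unregularized degree-15 polynomial: huge coefficients, poor held-out scores
plain = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())

# Same features with L2 (Ridge) regularization: large coefficients are penalized
ridge = make_pipeline(PolynomialFeatures(degree=15), Ridge(alpha=1.0))

# 5-fold cross-validation scores each model on data it was not trained on
print("plain CV R^2:", cross_val_score(plain, X, y, cv=5).mean())
print("ridge CV R^2:", cross_val_score(ridge, X, y, cv=5).mean())
```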
Practical Example
Suppose we have a dataset of house prices. We want to predict the price of a house based on features such as size, number of bedrooms, and location.
Overfitting Scenario: If we use a very complex model, such as a deep neural network with many hidden layers, and train it for many epochs, it can end up memorizing the training dataset and generalizing poorly to new data.
Underfitting Scenario: If we use a simple linear regression model, it might miss the nonlinear relationships in the data and hence perform poorly on both the training and testing data.
Balanced Model: A balanced model would be something like a polynomial regression or a decision tree of the right depth, regularized to prevent overfitting and tuned so that it captures the underlying patterns without becoming too complex (a short sketch of this comparison follows).
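The sketch below puts the three scenarios side by side on a hypothetical, synthetic housing dataset (the feature names, the price formula, and the model choices are illustrative assumptions, and the exact scores will vary): the linear model scores modestly on both splits, the unconstrained tree shows a large train/test gap, and the moderate polynomial model scores well on both.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.tree import DecisionTreeRegressor

# Hypothetical housing data: price rises with size (with diminishing returns)
# and with the number of bedrooms, plus noise.
rng = np.random.RandomState(42)
n = 300
size = rng.uniform(20, 300, n)          # square metres
bedrooms = rng.randint(1, 6, n)
price = (400_000 * (1 - np.exp(-size / 60))
         + 25_000 * bedrooms
         + rng.normal(scale=30_000, size=n))

X = np.column_stack([size, bedrooms])
X_train, X_test, y_train, y_test = train_test_split(X, price, random_state=0)

models = {
    # Underfit: a plain linear model misses the diminishing-returns curve
    "linear (underfit)": LinearRegression(),
    # Overfit: an unconstrained tree memorizes the noisy training prices
    "deep tree (overfit)": DecisionTreeRegressor(random_state=0),
    # Balanced: a moderate polynomial captures the curve without chasing noise
    "poly-3 (balanced)": make_pipeline(PolynomialFeatures(degree=3),
                                       LinearRegression()),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name:20s} train R^2 = {model.score(X_train, y_train):.2f}  "
          f"test R^2 = {model.score(X_test, y_test):.2f}")
```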
Summary
In an underfitting condition the training error is high, so we have high bias + low variance.
In a generalized (balanced) condition the training error is low, so we have low bias + low variance.
In an overfitting condition the training error is close to zero but the test error is high, so we have low bias + high variance.