This content originally appeared on DEV Community and was authored by Super Kai (Kazuya Ito)
Overfitting:
- is the problem where a model makes accurate predictions on train data but much less accurate predictions on new data (including test data), because the model fits the train data far more closely than it generalizes to new data.
- occurs because:
  - train data is small (not enough), so the model can only learn a small number of patterns.
  - train data is imbalanced (biased): it contains a lot of specific (limited), similar or identical data but little variety, so the model can only learn a small number of patterns.
  - train data has a lot of noise (noisy data), so the model learns the patterns of the noise rather than the patterns of normal data. *Noise (noisy data) means outliers, anomalies or sometimes duplicated data.
  - the training time is too long, with too many epochs.
  - the model is too complex.
- can be mitigated by:
  - using larger train data.
  - using more varied data.
  - reducing noise.
  - stopping training early.
  - Dropout. *My post explains Dropout().
  - Ensemble learning.
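  - Regularization to reduce model complexity:
    *Memos:
    - There is L1 Regularization, which adds a penalty based on the L1 norm of the weights. Linear regression with L1 Regularization is called Lasso Regression.
    - There is L2 Regularization, which adds a penalty based on the squared L2 norm of the weights. Linear regression with L2 Regularization is called Ridge Regression.
    - My post explains linalg.norm().
    - My post explains linalg.vector_norm().
    - My post explains linalg.matrix_norm().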
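Two of the mitigations above, regularization and stopping training early, can be sketched in plain Python. This is only a minimal illustration under stated assumptions: the helper names (`l1_penalty`, `l2_penalty`, `early_stopping_epoch`), the `lam` strength, the `patience` value and the sample numbers are all hypothetical and are not taken from any library.

```python
def l1_penalty(weights, lam):
    """L1 regularization term: lam * sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 regularization term: lam * sum of squared weights."""
    return lam * sum(w * w for w in weights)


def early_stopping_epoch(val_losses, patience=2):
    """Return the 0-based epoch at which training would stop.

    Training stops once the validation loss has failed to improve on
    its best value for `patience` consecutive epochs. If that never
    happens, the last epoch is returned.
    """
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss        # new best validation loss
            bad_epochs = 0     # reset the counter
        else:
            bad_epochs += 1    # no improvement this epoch
            if bad_epochs >= patience:
                return epoch
    return len(val_losses) - 1


# Hypothetical weights of a small model.
weights = [0.5, -1.5, 2.0]
l1_term = l1_penalty(weights, lam=0.1)  # added to the training loss
l2_term = l2_penalty(weights, lam=0.1)  # added to the training loss

# Hypothetical validation losses: they improve, then start rising,
# which is the usual sign that the model is beginning to overfit.
val_losses = [0.9, 0.7, 0.6, 0.65, 0.7, 0.8]
stop_epoch = early_stopping_epoch(val_losses, patience=2)

print(l1_term, l2_term, stop_epoch)
```

In a real training loop the penalty term would be added to the loss before the gradient step, and the model weights from the best epoch would normally be restored after stopping.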
Underfitting:
- is the problem where a model cannot make accurate predictions on either train data or new data (including test data), because the model fits neither the train data nor new data.
- occurs because:
  - the model is too simple (not complex enough).
  - the training time is too short, with too few epochs.
  - excessive regularization is applied.
- can be mitigated by:
  - increasing model complexity.
  - increasing the training time with a larger number of epochs.
  - decreasing regularization.
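As a rough illustration of underfitting and of increasing model complexity, the sketch below fits two models to toy data that follows `y = 2x + 1`: a constant predictor (too simple) and a straight line. The data and the helper names (`mse`, `fit_constant`, `fit_line`) are assumptions made for this example only.

```python
def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def fit_constant(xs, ys):
    """Too-simple model: always predicts the mean of the train targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Slightly more complex model: least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# Toy data generated from y = 2x + 1 (an assumption for this sketch).
train_x, train_y = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
test_x, test_y = [4.0, 5.0], [9.0, 11.0]

constant = fit_constant(train_x, train_y)
line = fit_line(train_x, train_y)

# The constant model has a high error on BOTH train and new data,
# which is what underfitting looks like.
const_train = mse([constant(x) for x in train_x], train_y)
const_test = mse([constant(x) for x in test_x], test_y)

# The line has enough capacity for this data, so both errors drop.
line_train = mse([line(x) for x in train_x], train_y)
line_test = mse([line(x) for x in test_x], test_y)

print(const_train, const_test, line_train, line_test)
```

The contrast with overfitting is the train error: an underfit model is already inaccurate on the data it was trained on, while an overfit model is accurate on train data and inaccurate only on new data.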