Overfitting vs Underfitting in PyTorch

September 26, 2024

This content originally appeared on DEV Community and was authored by Super Kai (Kazuya Ito)

Overfitting is the problem in which a model makes accurate predictions on train data but much less accurate predictions on new data (including test data), so the model fits train data far better than new data.
It occurs because:
- train data is small (not enough), so the model can only learn a small number of patterns.
- train data is imbalanced (biased), with a lot of specific (limited), similar or identical data but little variety, so the model can only learn a small number of patterns.
- train data has a lot of noise (noisy data), so the model learns the patterns of the noise rather than the patterns of normal data. *Noise (noisy data) means outliers, anomalies or sometimes duplicated data.
- training runs too long, with too many epochs.
- the model is too complex.
Overfitting can be mitigated by:
- larger train data.
- more varied data.
- reducing noise.
- stopping training early.
- Dropout. *My post explains Dropout().
- Ensemble learning.
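- Regularization to reduce model complexity: *Memos:
  - L1 Regularization adds a penalty on the L1 norm of the weights (the sum of their absolute values). It is the penalty used in Lasso Regression.
  - L2 Regularization adds a penalty on the squared L2 norm of the weights. It is the penalty used in Ridge Regression.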
  - My post explains linalg.norm().
  - My post explains linalg.vector_norm().
  - My post explains linalg.matrix_norm().
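
Several of the mitigations above can be combined in one training loop. The following is only a minimal sketch under stated assumptions: the random toy data, the layer sizes, `l1_lambda`, `patience` and the learning rate are all hypothetical values chosen for illustration. It uses `Dropout()`, L2 regularization through the `weight_decay` argument of `SGD()`, a manually added L1 penalty, and early stopping based on validation loss.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy regression data (hypothetical): 64 train and 32 validation samples.
X_train, y_train = torch.randn(64, 10), torch.randn(64, 1)
X_val, y_val = torch.randn(32, 10), torch.randn(32, 1)

# Dropout randomly zeroes activations, but only in train mode.
model = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(32, 1),
)
loss_fn = nn.MSELoss()

# weight_decay applies an L2 penalty to the parameters.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)
l1_lambda = 1e-4  # strength of the manual L1 penalty (assumed value)

best_val_loss = float("inf")
patience = 5     # epochs to wait without improvement (assumed value)
bad_epochs = 0
epochs_run = 0

for epoch in range(200):
    model.train()  # enables Dropout
    optimizer.zero_grad()
    loss = loss_fn(model(X_train), y_train)
    # L1 penalty: sum of the absolute values of all parameters.
    l1_penalty = sum(p.abs().sum() for p in model.parameters())
    (loss + l1_lambda * l1_penalty).backward()
    optimizer.step()
    epochs_run += 1

    model.eval()  # disables Dropout for validation
    with torch.no_grad():
        val_loss = loss_fn(model(X_val), y_val).item()

    # Early stopping: stop once validation loss has not improved
    # for `patience` consecutive epochs.
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        bad_epochs = 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break
```

In practice, you would usually also save the model's `state_dict()` whenever `best_val_loss` improves and reload it after the loop, so that the returned model is the one from the best epoch rather than the last one.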

Underfitting is the problem in which a model cannot make accurate predictions on either train data or new data (including test data), so the model fits neither train data nor new data.
It occurs because:
- the model is too simple (not complex enough).
- training is too short, with too few epochs.
- excessive regularization is applied.
Underfitting can be mitigated by:
- Increasing model complexity.
- Increasing the training time with a larger number of epochs.
- Decreasing regularization.
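
The three mitigations above can be illustrated with a small comparison. This is a rough sketch, not a benchmark: the quadratic toy data, the helper name `train()`, the layer sizes, the epoch counts and the `weight_decay` values are all assumptions made for the example. A linear model trained briefly with strong regularization is compared with a larger model trained longer without regularization.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy data (hypothetical): a quadratic curve that a straight line cannot fit.
X = torch.linspace(-2, 2, 100).unsqueeze(1)
y = X ** 2

def train(model, epochs, weight_decay=0.0):
    """Train with full-batch Adam and return the final train loss."""
    # weight_decay applies an L2 penalty to the parameters.
    optimizer = torch.optim.Adam(
        model.parameters(), lr=0.01, weight_decay=weight_decay
    )
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        return loss_fn(model(X), y).item()

# Too simple, too few epochs, strong regularization: likely to underfit.
simple_model = nn.Linear(1, 1)
simple_loss = train(simple_model, epochs=20, weight_decay=0.1)

# More complex, more epochs, no regularization.
complex_model = nn.Sequential(
    nn.Linear(1, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
)
complex_loss = train(complex_model, epochs=500)

print(f"simple model train loss:  {simple_loss:.4f}")
print(f"complex model train loss: {complex_loss:.4f}")
```

A high loss on train data itself, as with `simple_model` here, is the sign of underfitting. Pushing complexity and training time too far in the other direction risks overfitting, so train loss should be checked alongside loss on new data.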
