Building a Heart Risk Detector from Scratch with NumPy – Lessons in Overfitting & Neural Networks



This content originally appeared on DEV Community and was authored by Sreehari S J

As part of my journey into neural networks, I recently built a Heart Risk Detector from scratch using only NumPy: no TensorFlow, no PyTorch. The goal was to predict the risk of heart disease from structured clinical data. Here’s a walkthrough of my experience, the technical choices I made, and the challenges I tackled.

Project Overview

The model is a binary classifier trained on a dataset of numeric and categorical features. I implemented everything from scratch: data preprocessing, forward and backward passes, and training with regularization. Key highlights:

  • Data split: 70% train, 15% validation, 15% test (randomized but reproducible).

  • Preprocessing: Min-max normalization of numeric features, one-hot encoding for categorical features (the split and preprocessing are sketched right after this list).

  • Model architecture (a forward-pass sketch follows the list):


  Input → Dense(16, ReLU) → Dropout → Dense(8, ReLU) → Dropout → Dense(1, Sigmoid)

  • Optimization: Mini-batch SGD with learning rate 0.01.
  • Regularization: L2 weight decay + Dropout.
  • Early stopping: Tracks validation loss to prevent overfitting.
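
To make the split and preprocessing concrete, here is a minimal sketch of how they could fit together. The 70/15/15 proportions follow the list above; the function and variable names (split_and_preprocess, X_num, X_cat_onehot) are illustrative assumptions, not my exact project code:

  import numpy as np

  def split_and_preprocess(X_num, X_cat_onehot, y, seed=42):
      """Reproducible 70/15/15 split with train-only min-max stats.

      X_num: (n, d_num) numeric features; X_cat_onehot: (n, d_cat)
      categorical features one-hot encoded over the full vocabulary,
      so every split shares the same columns; y: (n, 1) binary labels.
      """
      n = X_num.shape[0]
      rng = np.random.default_rng(seed)        # fixed seed -> same split every run
      idx = rng.permutation(n)
      n_tr, n_val = int(0.70 * n), int(0.15 * n)
      tr, val, te = idx[:n_tr], idx[n_tr:n_tr + n_val], idx[n_tr + n_val:]

      # Min-max stats come from the training rows ONLY, to avoid leakage.
      lo, hi = X_num[tr].min(axis=0), X_num[tr].max(axis=0)
      scale = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns

      def prep(rows):
          X_scaled = (X_num[rows] - lo) / scale
          return np.hstack([X_scaled, X_cat_onehot[rows]]), y[rows]

      return prep(tr), prep(val), prep(te)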
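
And here is one way the architecture itself might look as a NumPy forward pass. The layer sizes match the diagram above; he_init, the dropout hook, and the other helper names are assumptions for illustration (the dropout logic itself is sketched in the next section):

  import numpy as np

  rng = np.random.default_rng(0)

  def he_init(n_in, n_out):
      # He scaling keeps activation variance stable through ReLU layers.
      return rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))

  def relu(z):
      return np.maximum(0.0, z)

  def sigmoid(z):
      return 1.0 / (1.0 + np.exp(-z))

  def init_params(n_features):
      sizes = [n_features, 16, 8, 1]           # Input -> 16 -> 8 -> 1
      return [(he_init(a, b), np.zeros((1, b))) for a, b in zip(sizes, sizes[1:])]

  def forward(X, params, dropout=None):
      """dropout: optional callable applied to hidden activations (training only)."""
      A = X
      for i, (W, b) in enumerate(params):
          Z = A @ W + b
          is_output = (i == len(params) - 1)
          A = sigmoid(Z) if is_output else relu(Z)
          if dropout is not None and not is_output:
              A = dropout(A)                   # inverted dropout, see next section
      return A                                 # predicted risk probability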

Overfitting Struggles

At first, my model overfit quickly: training accuracy shot up while validation accuracy stagnated. I realized that my network was memorizing the training data rather than learning general patterns.

How I Solved It

  1. Dropout: Applied 30% inverted dropout in hidden layers during training to prevent co-adaptation of neurons (sketched below).

  2. L2 Regularization: Penalized large weights, which helped reduce model complexity.

  3. Early Stopping: Monitored validation loss and restored the best weights when no improvement was observed for 50 epochs (see the loop sketched below).

  4. Proper Data Preprocessing: Normalizing numeric features using only training stats and aligning categorical features across splits prevented subtle data leakage.
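
Here is a minimal sketch of the inverted-dropout step from item 1. keep_prob = 0.7 corresponds to the 30% drop rate, and the division by keep_prob is what makes it “inverted”: the expected activation is unchanged, so no rescaling is needed at test time. The names are assumptions:

  import numpy as np

  rng = np.random.default_rng(0)

  def inverted_dropout(A, keep_prob=0.7):
      # Zero out ~30% of activations and rescale the survivors so that
      # E[A] stays the same; dropout is simply switched off at test time.
      mask = (rng.random(A.shape) < keep_prob).astype(A.dtype)
      return A * mask / keep_prob

Note that the same mask has to be cached and applied to the gradient during backprop, so the forward and backward passes stay consistent.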
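
And a sketch of the early-stopping loop from item 3, with the 50-epoch patience mentioned above. train_one_epoch and val_loss stand in for the real training and evaluation code and are assumptions:

  import copy

  def train_with_early_stopping(params, train_one_epoch, val_loss,
                                patience=50, max_epochs=5000):
      # Stop when validation loss hasn't improved for `patience` epochs,
      # then restore the best weights seen so far.
      best_loss, best_params, since_best = float("inf"), None, 0
      for epoch in range(max_epochs):
          train_one_epoch(params)              # one pass of mini-batch SGD
          loss = val_loss(params)
          if loss < best_loss:
              best_loss, since_best = loss, 0
              best_params = copy.deepcopy(params)  # snapshot the best weights
          else:
              since_best += 1
              if since_best >= patience:
                  break                        # no improvement for 50 epochs
      return best_params if best_params is not None else params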

With these changes, training and validation accuracies became much closer, and the model generalized better to unseen test data.

Technical Insights

  • Forward & Backward Pass: Implemented ReLU and Sigmoid activations, and manually derived the gradients for backpropagation (a one-layer example follows this list).

  • Parameter Initialization: Used He/Xavier scaling to stabilize gradients.

  • Reproducibility: Fixed random seeds for data splits and dropout masks, ensuring consistent results across runs.
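
To illustrate the first bullet: with a sigmoid output and binary cross-entropy loss, the output-layer error simplifies to A - y, and ReLU’s gradient is just a 0/1 mask on the pre-activation. The sketch below shows one layer of that backward pass, with the L2 term from earlier folded into the weight gradient; the variable names are assumptions:

  import numpy as np

  def backward_output_layer(A_out, y, A_prev, W, lam=1e-3):
      # Gradients for the final Dense(1, Sigmoid) layer under BCE loss.
      # Sigmoid + BCE cancel nicely: dL/dZ = A_out - y (per example).
      m = y.shape[0]
      dZ = (A_out - y) / m                     # (m, 1)
      dW = A_prev.T @ dZ + (lam / m) * W       # L2 weight decay adds (lam/m) * W
      db = dZ.sum(axis=0, keepdims=True)
      dA_prev = dZ @ W.T                       # error flowing to the layer below
      return dW, db, dA_prev

  def relu_backward(dA, Z):
      # ReLU passes gradient only where the pre-activation was positive.
      return dA * (Z > 0)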

Lessons Learned

  • Building a neural network from scratch is valuable for understanding the math and mechanics behind high-level libraries.

  • Overfitting is inevitable with small datasets, but simple tools like dropout, L2, and early stopping can save the day.

  • Proper preprocessing and careful handling of categorical features can significantly affect model performance.

Next Steps

  • Experiment with the Adam optimizer and learning rate schedules (a sketch of the Adam update follows this list).

  • Implement Batch Normalization to mitigate internal covariate shift.

  • Explore imbalanced datasets handling with class weighting or focal loss.
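
For the first of these, the Adam update I would swap in for plain SGD looks roughly like the sketch below. The hyperparameter defaults are the standard ones from the Adam paper; treating each weight matrix separately is an assumption about how the parameters are stored:

  import numpy as np

  def adam_step(W, dW, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
      # One Adam update for a single weight matrix.
      # m, v: running first/second moment estimates (same shape as W);
      # t: 1-based step count used for bias correction.
      m = b1 * m + (1 - b1) * dW               # momentum-like first moment
      v = b2 * v + (1 - b2) * dW ** 2          # per-parameter second moment
      m_hat = m / (1 - b1 ** t)                # bias-corrected estimates
      v_hat = v / (1 - b2 ** t)
      W = W - lr * m_hat / (np.sqrt(v_hat) + eps)
      return W, m, v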

This project was not just technical work; it taught me patience, debugging skills, and how subtle design choices impact model performance.

If you’ve ever been stuck fighting overfitting, I’d love to hear about your experience!

Check out my LinkedIn post about this.

Give the project a star on GitHub if you like it!

