This content originally appeared on DEV Community and was authored by Ertugrul
“Faces don't lie, but landmarks sometimes do.”
Project Idea
Hey there! In this post, I'll share my journey of building a Face Landmark Detection pipeline using OpenCV DNN and Facemark LBF. The system takes a raw video as input, detects faces, extracts 68 facial landmarks, smooths them across frames, and finally outputs:
- an annotated video with landmarks and bounding boxes
- an optional CSV file with landmark coordinates for every frame
The idea was simple:
“Take a face → get the points.”
But to make it robust, I had to mix deep learning detection with classical landmarking and add a touch of signal processing.
Code Structure
The project is split into modular components:
- detector.py → Loads and runs the DNN-based face detector (SSD ResNet)
- landmarks.py → Drawing utilities for the 68-point facial structure
- helpers.py → Video I/O, CSV logging, smoothing, and per-frame pipeline
- main.py → Entry point to run the full pipeline (a rough wiring sketch follows below)
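To make the structure concrete, here is a minimal sketch of how main.py might wire these modules together. The class and helper names (FaceDetector, process_frame, CsvLogger) and the exact call signatures are illustrative assumptions for this post, not the repository's actual API:

import cv2 as cv
from detector import FaceDetector             # assumed class name
from helpers import process_frame, CsvLogger  # assumed helper names

def run_pipeline(in_path, out_path, csv_path=None):
    detector = FaceDetector(conf_thr=0.6)
    facemark = cv.face.createFacemarkLBF()
    facemark.loadModel("lbfmodel.yaml")        # assumed model path

    cap = cv.VideoCapture(in_path)
    fps = cap.get(cv.CAP_PROP_FPS) or 30.0
    size = (int(cap.get(cv.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv.CAP_PROP_FRAME_HEIGHT)))
    out = cv.VideoWriter(out_path, cv.VideoWriter_fourcc(*"mp4v"), fps, size)
    logger = CsvLogger(csv_path) if csv_path else None

    prev_pts, frame_idx = None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Detect → fit landmarks → smooth → draw → (optionally) log to CSV
        frame, prev_pts = process_frame(frame, detector, facemark,
                                        prev_pts, logger, frame_idx)
        out.write(frame)
        frame_idx += 1

    cap.release()
    out.release()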
Step 1: Face Detection
I used OpenCV's Deep Neural Network (DNN) SSD ResNet model. The detector takes each frame, converts it into a blob, and feeds it into the Caffe network:
blob = cv.dnn.blobFromImage(cv.resize(frame, (300, 300)), 1.0, (300, 300),
                            (104.0, 177.0, 123.0), False, False)
self.net.setInput(blob)
detections = self.net.forward()
This gives us bounding boxes with confidence scores. I kept only the ones above a threshold (conf_thr=0.6).
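For reference, the SSD model's output tensor has shape (1, 1, N, 7), where each row is [image_id, label, confidence, x1, y1, x2, y2] with coordinates normalized to [0, 1]. A minimal sketch of the filtering step (the variable names here are mine, not necessarily the repo's):

h, w = frame.shape[:2]
boxes = []
for i in range(detections.shape[2]):
    conf = float(detections[0, 0, i, 2])
    if conf < 0.6:  # conf_thr
        continue
    # Scale the normalized corners back to pixel coordinates
    x1, y1, x2, y2 = (detections[0, 0, i, 3:7] * [w, h, w, h]).astype(int)
    boxes.append((x1, y1, x2 - x1, y2 - y1))  # stored as (x, y, w, h)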
Step 2: Landmark Extraction
With face boxes ready, I used Facemark LBF to extract the 68 landmark points:
facemark = cv.face.createFacemarkLBF()
facemark.loadModel(LBF_MODEL)
ok, landmarks = facemark.fit(frame, np.array(boxes))
This returns arrays shaped (68, 2): coordinates for the jawline, eyebrows, eyes, nose, and lips.
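One detail worth noting: FacemarkLBF wants the face rectangles as an integer array of (x, y, w, h) rows rather than corner coordinates, and depending on the OpenCV build the fitted points may come back with an extra leading dimension, so a reshape is a cheap safeguard. A minimal sketch (the box packing is an assumption about the detector output):

import numpy as np

faces = np.array(boxes, dtype=np.int32)   # rows of (x, y, w, h)
ok, landmarks = facemark.fit(frame, faces)
if ok:
    for lm in landmarks:
        pts = lm.reshape(-1, 2)  # normalize to (68, 2) regardless of extra dims
        # pts is now ready for smoothing, drawing, and CSV logging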
Step 3: Landmark Smoothing
Raw landmarks jitter a lot between frames. To stabilize them, I applied an Exponential Moving Average (EMA):
if prev_pts is None:
    smooth = pts.copy()
else:
    smooth = alpha * prev_pts + (1.0 - alpha) * pts
This keeps the motion natural but removes frame-by-frame noise.
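Wrapped up as a small helper, the smoothing step looks roughly like this (the default alpha is illustrative; a higher alpha means smoother but laggier tracking):

import numpy as np

def smooth_landmarks(pts, prev_pts, alpha=0.6):
    """Exponential moving average over landmark coordinates.

    pts, prev_pts: float arrays of shape (68, 2); prev_pts is None on the
    first frame. Feed the return value back in as prev_pts for the next frame.
    """
    pts = np.asarray(pts, dtype=np.float32)
    if prev_pts is None:
        return pts.copy()
    return alpha * prev_pts + (1.0 - alpha) * pts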
Step 4: Drawing the Mesh
I grouped the 68 points into face regions and connected them with polylines:
- Jawline
- Eyebrows
- Nose bridge & base
- Eyes
- Inner & outer lips
for (x, y) in pts:
    cv.circle(frame, (int(x), int(y)), 1, (0, 255, 0), -1)
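The loop above only marks the individual points; to connect them into a mesh, the indices are grouped by region and joined with polylines. Here is a sketch of that grouping (the index ranges follow the standard 68-point annotation scheme; draw_mesh is an illustrative helper name):

import cv2 as cv
import numpy as np

# (index range, is the contour closed?)
REGIONS = [
    (range(0, 17), False),   # jawline
    (range(17, 22), False),  # right eyebrow
    (range(22, 27), False),  # left eyebrow
    (range(27, 31), False),  # nose bridge
    (range(31, 36), False),  # nose base
    (range(36, 42), True),   # right eye
    (range(42, 48), True),   # left eye
    (range(48, 60), True),   # outer lips
    (range(60, 68), True),   # inner lips
]

def draw_mesh(frame, pts, color=(0, 255, 0)):
    for idxs, closed in REGIONS:
        poly = pts[list(idxs)].astype(np.int32).reshape(-1, 1, 2)
        cv.polylines(frame, [poly], closed, color, 1, cv.LINE_AA)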
The result? A clear, real-time facial mesh overlay.
Outputs
Annotated Video:
Watch on YouTube
CSV Example:
frame_idx,x0,x1,...,y66,y67
0,123,130,...,200,205
1,124,129,...,199,206
This makes the system useful both for visualization and downstream ML tasks.
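The CSV layout above (frame index, then all x columns, then all y columns) can be produced with Python's built-in csv module. This is a minimal sketch, not necessarily the exact helper in helpers.py:

import csv

def open_landmark_csv(path, n_points=68):
    f = open(path, "w", newline="")
    writer = csv.writer(f)
    header = (["frame_idx"]
              + [f"x{i}" for i in range(n_points)]
              + [f"y{i}" for i in range(n_points)])
    writer.writerow(header)
    return f, writer

def log_landmarks(writer, frame_idx, pts):
    # pts: array of shape (68, 2); write all x values first, then all y values
    xs = [int(round(x)) for x, _ in pts]
    ys = [int(round(y)) for _, y in pts]
    writer.writerow([frame_idx] + xs + ys)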
Lessons Learned
- DNN face detection is robust, but combining it with traditional landmarking is still effective.
- Smoothing is mandatory: raw landmarks are too noisy for real use.
- CSV logging adds value for research/analytics beyond just visualization.
GitHub Repository
You can find the full code here:
GitHub: Face Landmarks Detection
“A single face in a frame is simple, but tracking it smoothly across time is where the real challenge begins.”