Complete Guide to MLOps: 10 Essential Steps from a Bird’s-Eye View



This content originally appeared on Level Up Coding – Medium and was authored by Karan Shingde

Learn how to turn a simple ML notebook into a production-ready system with MLOps.

What is MLOps?

MLOps is a well-known term in the tech industry, often described as DevOps for machine learning. But it is not that straightforward. In this blog, I will explain why ML projects need MLOps and how you can design a system using it.

Image by Karan Shingde (generated using ChatGPT)

Why MLOps?

We all use Jupyter Notebook or Google Colab to code our machine learning projects. We install the dependencies, import them, write the code, and run it step by step in cells.

In more ML terms:

  1. Install dependencies or libraries like NumPy, Pandas, and Torch.
  2. Import them in the top cells.
  3. Import data using Pandas, clean it, apply normalization techniques, and do a train-test split.
  4. Import models from Torch or Scikit-learn.
  5. Train the model.
  6. Evaluate the model and plot a confusion matrix or regression line.
  7. Save the notebook and the model, then feel like we know machine learning.

Points 1 through 6 work fine, but look at point 7. We save the model as a binary file (.pkl, .pth), but that file does nothing on its own: it cannot serve real-time predictions. So how can we say our ML project is complete?

This is where MLOps comes in. Before moving forward, let me explain. MLOps is a general term, but when you work on a project you usually work with pipelines. A pipeline is a sequence of modules or functions that move data and actions from start to end. So instead of learning MLOps only in general terms, let us dive deeper and understand what a pipeline is for machine learning projects.

Frame the problem statement

The first step is to understand the business objective. Ask what the product is expected to achieve. If it is a B2C product that customers will use directly, low latency becomes a priority because users expect fast responses. If it is a B2B product meant to automate business tasks or support decision-making, then accuracy and reliability are more important than speed.

Next, look at the data. This is where you will find both the problems and the solutions. As a Machine Learning Engineer, I highly recommend learning statistics and feature engineering because they are key to building strong models.

Once you understand the business expectations and the patterns in the data, you reach the most critical stage of the ML pipeline: choosing and experimenting with algorithms. Unless your company is focused on research, you rarely need to chase the state of the art. In my practice, if the data is tabular (classification, regression, or even time series), I usually start with XGBoost; in 2025 it remains one of the most powerful algorithms for classic ML problems. For text data, you can use BERT or Sentence-BERT, and for text generation you can simply load pre-trained LLMs like Llama, Qwen, and others.

The approach should be to first explore the data and do simple model training. Then evaluate the model’s performance and discuss the metrics with your team.
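A first pass can be this small. As an assumption for illustration, the sketch below uses scikit-learn's built-in breast-cancer dataset and `GradientBoostingClassifier` as a stand-in for XGBoost (whose `XGBClassifier` exposes a nearly identical fit/predict API):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small tabular dataset and train a boosted-tree baseline.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = GradientBoostingClassifier(random_state=42)
model.fit(X_train, y_train)

acc = accuracy_score(y_test, model.predict(X_test))
print(f"baseline accuracy: {acc:.3f}")
```

A baseline like this gives the team a concrete number to react to before anyone invests in heavier architectures.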

Remember, machine learning metrics matter only to the ML team. The business side cares about numbers and growth. So even if your model shows 90% accuracy with good precision and recall, it means the model is ready for A/B testing, not for direct production.
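For reference, here is how those precision and recall figures fall out of raw prediction counts (toy numbers, pure Python):

```python
# Precision and recall from raw counts: tp = true positives,
# fp = false positives, fn = false negatives. Counts are made up.

def precision_recall(tp: int, fp: int, fn: int):
    precision = tp / (tp + fp)  # of everything flagged, how much was right
    recall = tp / (tp + fn)     # of everything real, how much was caught
    return precision, recall

p, r = precision_recall(tp=90, fp=10, fn=30)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.90 recall=0.75
```

Numbers like these justify an A/B test; the business case still has to be made in business terms.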

Turning a notebook into a pipeline

This is where actual MLOps begins. Many beginner ML engineers think machine learning is only about building a model and saving it on their system. In real-world projects, we take the messy notebook code and convert it into a clean, modular Python format.

Here’s what you can do:

  1. Break each part of your project into separate pipelines. For example, have one for importing data, another for cleaning data, one for feature engineering, one for splitting the data into train, test, and validation sets, and others for training and evaluation.
  2. Write separate Python files for each module, and make sure you have a good understanding of Python modules and object-oriented programming (OOP).
  3. Create a main script that runs everything in sequence.
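Put together, that layout can be sketched in one file. Every function below stands in for what would be its own module in a real project (e.g. `src/ingest.py`, `src/train.py`, names assumed for illustration), and the "model" is a toy threshold classifier so the script runs as-is:

```python
def ingest():
    # Stand-in for loading from a database, API, or file.
    return [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)]  # (feature, label)

def clean(rows):
    # Stand-in for dropping bad records, deduplicating, etc.
    return [r for r in rows if r[0] is not None]

def split(rows, test_fraction=0.25):
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

def train(train_rows):
    # Toy model: threshold at the midpoint of the feature range.
    xs = [x for x, _ in train_rows]
    return (min(xs) + max(xs)) / 2

def evaluate(threshold, test_rows):
    preds = [1 if x > threshold else 0 for x, _ in test_rows]
    return sum(p == y for p, (_, y) in zip(preds, test_rows)) / len(test_rows)

def main():
    # The main script: run every module in sequence.
    train_rows, test_rows = split(clean(ingest()))
    model = train(train_rows)
    return evaluate(model, test_rows)

print(f"accuracy: {main()}")  # accuracy: 1.0
```

Because each stage is a separate function with clear inputs and outputs, you can test, swap, and schedule them independently, which is exactly what the notebook version cannot do.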

These are some of the most important things to consider before diving into MLOps.

Complete MLOps cycle

Here are the 10 essential steps of the MLOps lifecycle 👇

1. Problem Definition & Data Collection: Every ML project starts with a clear goal, like predicting sales or detecting fraud. Once the problem is defined, we collect raw data from databases, APIs, sensors, or logs. Data is the foundation, so gathering the right and reliable data is important.
Tools: SQL, MongoDB, Kafka, Google BigQuery, APIs.

2. Data Cleaning & Preprocessing: Raw data often has missing values, duplicates, and errors. Cleaning makes it reliable by filling gaps, removing noise, and standardizing formats. Preprocessing also includes normalizing and splitting data into training, validation, and testing sets.
Tools: Pandas, NumPy, PySpark.
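A minimal cleaning pass in Pandas might look like this (column names and values are made up):

```python
import pandas as pd

# Toy frame with a duplicate row, a missing value, and inconsistent casing.
df = pd.DataFrame({
    "age": [25, 25, None, 40],
    "city": ["NY", "NY", "SF", "sf"],
})

df = df.drop_duplicates()                      # remove exact-duplicate rows
df["age"] = df["age"].fillna(df["age"].median())  # fill gaps with the median
df["city"] = df["city"].str.upper()            # standardize formats

# Normalize a numeric column to [0, 1] (min-max scaling).
df["age_norm"] = (df["age"] - df["age"].min()) / (df["age"].max() - df["age"].min())
print(df)
```

The same steps scale up directly: on large data you would express them in PySpark instead, but the logic is identical.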

3. Data Versioning & Storage: Data keeps changing over time, so versioning is needed to track changes and ensure reproducibility. Storing processed datasets securely also makes collaboration easier. This step avoids confusion between different dataset versions.
Tools: DVC, Git-LFS.
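Conceptually, data versioning boils down to identifying each dataset state by a content hash, which is close to what DVC does under the hood. A stdlib-only illustration (not DVC's actual implementation):

```python
import hashlib

# Identify each dataset version by a hash of its contents. If the data
# changes in any way, the version ID changes, so you can always tell
# exactly which data a model was trained on.

def dataset_version(contents: bytes) -> str:
    return hashlib.md5(contents).hexdigest()

v1 = dataset_version(b"id,label\n1,0\n2,1\n")
v2 = dataset_version(b"id,label\n1,0\n2,1\n3,0\n")  # one row added

print(v1 != v2)  # True: the change produced a new version ID
```

Tools like DVC store these IDs in small metafiles that live in Git, so dataset versions travel with your code history.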

4. Model Development: Here, data scientists experiment with ML algorithms and architectures. They train models, compare performance, and tune parameters to achieve the best results. This step is like the “research” part of MLOps.
Tools: PyTorch, TensorFlow, Scikit-learn, Hugging Face, XGBoost.

5. Experiment Tracking: Many experiments are run with different settings, so tracking results is necessary. Tracking helps compare metrics, hyperparameters, and outcomes to choose the best model. This avoids confusion and makes research organized.
Tools: MLflow, Weights & Biases, Comet.
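To see what these tools automate, here is a bare-bones tracker written with the standard library alone. MLflow and W&B add persistent storage, UIs, and comparison dashboards on top of exactly this kind of record:

```python
import time
import uuid

# Record each run's hyperparameters and metrics so runs can be compared.
runs = []

def log_run(params: dict, metrics: dict):
    runs.append({
        "run_id": uuid.uuid4().hex,
        "timestamp": time.time(),
        "params": params,
        "metrics": metrics,
    })

log_run({"max_depth": 4, "lr": 0.1}, {"accuracy": 0.91})
log_run({"max_depth": 6, "lr": 0.05}, {"accuracy": 0.94})

# Pick the best run by its tracked metric.
best = max(runs, key=lambda r: r["metrics"]["accuracy"])
print(best["params"])  # {'max_depth': 6, 'lr': 0.05}
```

The hyperparameter names and values above are made up; the point is that every experiment leaves a queryable trace.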

6. Model Validation & Testing: Before deployment, models must be validated on unseen data to check accuracy, fairness, and robustness. Testing ensures the model is not biased and works well under real-world conditions. This step prevents failures later.
Tools: pytest.
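Validation checks are often written as ordinary pytest functions. The model below is a trivial stand-in; in a real project you would load your trained artifact instead:

```python
# pytest-style checks you might run before deployment.

def model_predict(x):
    # Hypothetical stand-in for a loaded model.
    return 1 if x > 0.5 else 0

def test_output_range():
    # Predictions must be valid class labels.
    assert model_predict(0.9) in (0, 1)

def test_known_cases():
    # Sanity checks on inputs with obvious answers.
    assert model_predict(0.99) == 1
    assert model_predict(0.01) == 0

def test_boundary_input():
    # The model should behave sensibly at edge values, not crash.
    assert model_predict(0.5) == 0

# pytest discovers test_* functions automatically; calling them directly also works.
test_output_range(); test_known_cases(); test_boundary_input()
print("all checks passed")
```

Wiring these into CI means a model that regresses on known cases never reaches deployment.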

7. Model Packaging & CI/CD: Once the model is ready, it is packaged into a deployable format (like Docker containers). CI/CD pipelines automate testing, integration, and deployment, reducing manual work. This makes the system reliable and repeatable.
Tools: Docker, GitHub Actions, Jenkins, CircleCI.

8. Model Deployment: Models are deployed into production so users or applications can use them. Deployment can be batch (scheduled jobs) or real-time (API-based). Proper scaling is also important to handle many requests.
Tools: FastAPI, Flask, Kubernetes, AWS SageMaker, GCP Vertex AI.
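The request/response contract of a real-time endpoint can be shown framework-free. With FastAPI you would attach a function like this to a POST route; the model logic and field names here are hypothetical:

```python
import json

def model_predict(features: dict) -> dict:
    # Hypothetical fraud model: flag large transactions.
    return {"fraud": features["amount"] > 1000}

def handle_request(body: str) -> str:
    # What the API layer does around your model:
    # parse the JSON request, run the model, return a JSON response.
    features = json.loads(body)
    return json.dumps(model_predict(features))

print(handle_request('{"amount": 2500}'))  # {"fraud": true}
```

Batch deployment is the same function applied to a file of records on a schedule instead of one request at a time.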

9. Monitoring & Logging: After deployment, models need monitoring to check performance, accuracy, and system health. Logs capture errors and unusual patterns, while monitoring helps detect model drift and data changes. This ensures reliability.
Tools: Prometheus, Grafana, ELK Stack.
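A simple form of drift detection compares live feature statistics against the training distribution. The sketch below flags a shift in the mean; real systems use richer tests (e.g. Kolmogorov-Smirnov), and the `k` factor here is an arbitrary choice:

```python
from statistics import mean, stdev

def drift_alert(train_values, live_values, k=3.0):
    # Flag when the live mean drifts more than k standard errors
    # away from the training mean.
    mu, sigma = mean(train_values), stdev(train_values)
    return abs(mean(live_values) - mu) > k * sigma / (len(live_values) ** 0.5)

train = [10, 11, 9, 10, 12, 8]
print(drift_alert(train, [15, 16, 14, 15]))  # True: distribution shifted
print(drift_alert(train, [10, 10, 11, 9]))   # False: still in range
```

In production this check would run on a monitoring schedule and fire an alert (Prometheus/Grafana) instead of printing.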

10. Continuous Training & Feedback Loop: Over time, data and user behavior change, so models must be retrained regularly. Continuous training uses new data to keep models updated. Feedback from users also helps improve accuracy and usefulness.
Tools: Airflow, Kubeflow, Prefect, MLflow Pipelines.
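The retraining decision itself can be a small check that an orchestrator such as Airflow or Prefect runs on a schedule; the accuracy threshold below is an arbitrary illustration:

```python
def should_retrain(recent_accuracy: float, threshold: float = 0.85) -> bool:
    # Retrain when accuracy on a fresh window of labeled data
    # drops below the agreed threshold.
    return recent_accuracy < threshold

print(should_retrain(0.91))  # False: model still healthy
print(should_retrain(0.78))  # True: trigger the training pipeline
```

When the check fires, the scheduler kicks off the same modular training pipeline built earlier, closing the feedback loop.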

Bonus:

MLOps is not just about tools. It is more about the practices you follow in your machine learning project to develop, deploy, and monitor systems quickly. As data keeps growing, we need to automate projects at the same pace, and this is where MLOps becomes important. The pipeline shown above is not fixed. It depends on your company and the tools being used.

If you are a beginner, I highly recommend learning Python modular coding and Docker at the very least. Learning FastAPI is also a great choice, and understanding how system design works in software engineering is essential. Concepts like response models, APIs, and rate limiting are very important for machine learning projects too.

In applied ML, where MLOps is most relevant, you need to be stronger in software engineering skills than in pure ML.

Sources:

  1. Book: Designing Machine Learning Systems by Chip Huyen
  2. Marvelous MLOps YouTube playlist on Databricks
  3. YouTube playlist in Hindi

I am writing more about MLOps on Substack (https://kmeanskaran.substack.com/). Subscribe to K-Means Karan if you want detailed technical content on MLOps. I am also building an MLOps project for beginner and intermediate engineers.

Do follow me on Medium for more content. You can also connect with me on X and LinkedIn, where I share thoughts on ML, MLOps, and my career journey.

Thanks for reading, and see you soon!

