This content originally appeared on DEV Community and was authored by Mwenda Harun Mbaabu
In the data engineering and data science space, reproducibility and environment management are just as critical as writing efficient code. Few things derail productivity faster than dependency hell when different projects require conflicting versions of Python libraries. This is where Anaconda comes in.
Anaconda is more than just a Python distribution; it’s a complete ecosystem. It ships with conda, a powerful package and environment manager, and comes with hundreds of preinstalled scientific packages such as NumPy, pandas, scikit-learn, and Jupyter Notebook, with thousands more (TensorFlow among them) available to install from its repositories. For engineers and scientists working across machine learning pipelines, ETL processes, or analytics workflows, Anaconda provides a clean, isolated, and dependable environment right out of the box.
In this article, we’ll walk through the step-by-step installation of Anaconda on Ubuntu, so you can get started quickly with a stable and well-managed Python setup on your local machine or a remote Linux server.
Prerequisites
- Ubuntu 20.04, 22.04, or later
- Sudo privileges
- At least 3–4 GB of free disk space
- Basic familiarity with the Linux terminal
Step 1: Update Your System
sudo apt update && sudo apt upgrade -y
Step 2: Install Utilities
sudo apt install wget curl git -y
Step 3: Download the Anaconda Installer
Check the Anaconda download page for the latest version. Example:
wget https://repo.anaconda.com/archive/Anaconda3-2024.02-1-Linux-x86_64.sh
Step 4: Verify Integrity (Optional)
sha256sum Anaconda3-2024.02-1-Linux-x86_64.sh
Compare the hash with the official checksum on Anaconda’s site.
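If you would rather not compare long hashes by eye, the check can be scripted. The following is a minimal sketch using only Python's standard library; the helper names sha256_of_file and matches_checksum are made up for this example, and the expected hash is something you copy from Anaconda's site yourself.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in chunks so a large installer is never loaded fully into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare the file's digest with a published value, ignoring case and surrounding whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_checksum("Anaconda3-2024.02-1-Linux-x86_64.sh", published_hash) should return True when the download is intact, where published_hash holds the value pasted from the official page.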
Step 5: Run the Installer
bash Anaconda3-2024.02-1-Linux-x86_64.sh
- Press Enter to review the license
- Type yes to accept
- Choose the installation directory (default: ~/anaconda3)
Step 6: Initialize Anaconda
~/anaconda3/bin/conda init
source ~/.bashrc
Step 7: Verify Installation
conda --version
python --version
Step 8: Create a Virtual Environment (Recommended)
conda create --name myenv python=3.11
conda activate myenv
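To confirm from inside Python that the new environment is really the one in use, a small check like the one below can help. It is a sketch that relies on the CONDA_DEFAULT_ENV variable, which conda activate exports; the function name active_conda_env is an assumption for this example.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the active conda environment name, or None if none is active.

    Reads CONDA_DEFAULT_ENV, which `conda activate` sets in the shell.
    """
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV") or None


if __name__ == "__main__":
    # The interpreter path should point inside ~/anaconda3/envs/myenv
    # when the environment from this step is active.
    print("Interpreter:", sys.executable)
    print("Python version:", sys.version.split()[0])
    print("Active conda env:", active_conda_env())
```

Running this right after conda activate myenv should report myenv as the active environment and an interpreter path under the environment's directory.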
Mini Project: Extract Cryptocurrency Data with Python
Now that Anaconda is installed and running, let’s put it to the test with a small project: fetching live cryptocurrency prices from a public API.
We’ll use the CoinGecko API, which is free and doesn’t require an API key.
Step 1: Install Dependencies
Inside your conda environment, install requests and pandas:
conda install requests pandas
Step 2: Write the Python Script
Create a file called crypto_data.py:
import requests
import pandas as pd
from datetime import datetime

# Define the API endpoint
url = "https://api.coingecko.com/api/v3/coins/markets"

# Define parameters
params = {
    "vs_currency": "usd",
    "ids": "bitcoin,ethereum,cardano,solana",
    "order": "market_cap_desc",
    "per_page": 10,
    "page": 1,
    "sparkline": False
}

# Fetch data
response = requests.get(url, params=params)

if response.status_code == 200:
    data = response.json()
    df = pd.DataFrame(data, columns=["id", "symbol", "current_price", "market_cap", "total_volume"])
    df["timestamp"] = datetime.now()
    print(df)
else:
    print("Error fetching data:", response.status_code)
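It can help to see what the request looks like on the wire: the params dictionary is turned into a query string appended to the endpoint. The sketch below rebuilds roughly that URL with the standard library so you can paste it into a browser or curl for debugging. The helper name build_markets_url is an assumption for this example, and the exact encoding produced by requests may differ in small details.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.coingecko.com/api/v3/coins/markets"


def build_markets_url(coin_ids, vs_currency="usd", per_page=10, page=1):
    """Build a /coins/markets URL for the given coin ids (illustrative helper)."""
    query = {
        "vs_currency": vs_currency,
        "ids": ",".join(coin_ids),  # the API takes a comma-separated list
        "order": "market_cap_desc",
        "per_page": per_page,
        "page": page,
        "sparkline": "false",
    }
    # urlencode percent-escapes special characters, so "," becomes "%2C".
    return f"{BASE_URL}?{urlencode(query)}"


if __name__ == "__main__":
    print(build_markets_url(["bitcoin", "ethereum", "cardano", "solana"]))
```

Keeping the URL construction in one small function also makes it easy to change the coin list or currency without touching the fetching code.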
Step 3: Run the Script
python crypto_data.py
Example output:
         id symbol  current_price  market_cap  total_volume            timestamp
0   bitcoin    btc        59000.0     1.1e+12       3.5e+10  2025-08-22 12:30:45
1  ethereum    eth         3100.0     4.0e+11       2.2e+10  2025-08-22 12:30:45
2   cardano    ada            0.7     2.5e+10       1.1e+09  2025-08-22 12:30:45
3    solana    sol          110.0     5.0e+10       8.9e+09  2025-08-22 12:30:45
Step 4: Extend the Project
Ideas to enhance the script:
- Save results to a CSV file for historical tracking.
- Schedule the script with cron to run every hour/day.
- Visualize crypto price trends using Matplotlib or Seaborn.
- Integrate with a dashboard tool (like Jupyter Notebook) for analysis.
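As a starting point for the first idea, here is a minimal sketch that appends each fetch to a CSV file so the history builds up over repeated runs. It uses only the standard library; the function name append_snapshot and any file name you pass in are assumptions for this example, and rows is expected to be the list of dictionaries returned by response.json().

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

FIELDS = ["id", "symbol", "current_price", "market_cap", "total_volume", "timestamp"]


def append_snapshot(rows, csv_path, timestamp=None):
    """Append one row per coin to csv_path and return the number of rows written.

    The header is written only when the file is new or empty.
    """
    path = Path(csv_path)
    if timestamp is None:
        timestamp = datetime.now(timezone.utc)
    stamp = timestamp.isoformat()
    write_header = not path.exists() or path.stat().st_size == 0

    written = 0
    with path.open("a", newline="", encoding="utf-8") as handle:
        # extrasaction="ignore" drops the many API fields we do not track.
        writer = csv.DictWriter(handle, fieldnames=FIELDS, extrasaction="ignore")
        if write_header:
            writer.writeheader()
        for row in rows:
            writer.writerow({**row, "timestamp": stamp})
            written += 1
    return written
```

In crypto_data.py, calling append_snapshot(data, "crypto_prices.csv") after a successful request would add the latest prices to the file; combined with a cron schedule, each run contributes one timestamped batch of rows.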
Conclusion
By installing Anaconda on Ubuntu, you’ve set up a reliable Python environment ready for data engineering and data science projects. We then tested the setup by pulling live cryptocurrency market data using Python and the CoinGecko API.
With conda managing environments, you can extend this workflow to build ETL pipelines, analyze financial data, or run machine learning models without worrying about dependency conflicts. Anaconda gives you the freedom to focus on insights, not environment headaches.