A Comparative Analysis of Julia and Python: Advancements in High-Performance Computing

This content originally appeared on Level Up Coding – Medium and was authored by Arjun Mathur

https://github.com/akmathur1/MedEval3D.jl

I used to underestimate the importance of programming. While growing up, I spent a lot of time playing video games and using applications like Google Earth and Sandbox. Now, I realize that there is a whole world of code behind every application. We often overlook this fact and tend to label those who work with code as "nerds". However, I've started paying more attention to the behind-the-scenes work, trying to understand how websites and companies function. We often take for granted the entertainment and conveniences that are made possible through complex coding. It's time to appreciate the detailed work that makes these experiences possible.

When I took my first introductory Computer Science course as a sophomore, I had no interest in software development. However, as I delved deeper into the world of programming, I began to appreciate its intricacies and immense potential. This journey led me to explore different programming languages, with Julia and Python standing out as significant players in high-performance computing. In this article, I will discuss why Julia is often considered superior to Python for certain applications, particularly in high-performance computing.

MedEval3D package

Performance and Speed

One of Julia's most significant advantages over Python is its performance. Julia was designed from the ground up to be fast. It achieves this by compiling code to native machine code just in time through the LLVM framework, allowing it to approach C's speed. This is particularly advantageous in high-performance computing tasks, where execution speed is critical.

For example, Julia's performance shines in medical image segmentation. The MedEval3D package, a CUDA-accelerated package for calculating medical segmentation metrics, leverages Julia's speed to significantly reduce computation time compared to traditional CPU-based methods. This performance boost is essential for processing large 3D medical datasets efficiently.

Ease of Use and Syntax

Julia combines the ease of use of Python with the speed of languages like C and Fortran. Its syntax is clean and intuitive, making it accessible to beginners and powerful enough for advanced users. This makes it an excellent choice for both prototyping and production.

Furthermore, Julia can call C libraries directly through its built-in ccall mechanism and Python libraries through packages such as PyCall, without hand-written wrapper code, which simplifies the development process. This interoperability ensures developers can leverage existing libraries and tools, making Julia a versatile addition to any tech stack.

Built for Numerical and Scientific Computing

Julia was explicitly designed for numerical and scientific computing. It includes features like multiple dispatch, which selects a function's implementation based on the types of all its arguments, making it highly flexible and efficient for mathematical operations. This is a significant advantage over Python, where achieving similar performance requires additional libraries like NumPy and SciPy.
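
To make the idea of multiple dispatch concrete, here is a minimal sketch in Python, which does not have the feature built in. The `dispatch` decorator, the `_registry` table, and the `combine` function are hypothetical names invented for this illustration. Unlike Julia, the sketch matches exact argument types only and does no subtype resolution.

```python
# Hypothetical registry mapping (function name, argument types) to an implementation.
_registry = {}


def dispatch(*types):
    """Register the decorated function for the given tuple of argument types."""
    def decorator(func):
        _registry[(func.__name__, types)] = func

        def caller(*args):
            # Look up the implementation using the types of ALL arguments.
            key = (func.__name__, tuple(type(a) for a in args))
            impl = _registry.get(key)
            if impl is None:
                raise TypeError(f"no method {func.__name__} for {key[1]}")
            return impl(*args)

        return caller
    return decorator


@dispatch(int, int)
def combine(a, b):
    return a + b  # numeric addition


@dispatch(str, int)
def combine(a, b):
    return a * b  # string repetition


@dispatch(list, list)
def combine(a, b):
    return [x + y for x, y in zip(a, b)]  # element-wise addition


print(combine(2, 3))
print(combine("ab", 2))
print(combine([1, 2], [3, 4]))
```

In Julia the same effect comes from simply defining several methods of one function with different type annotations, and the compiler specializes each method for the concrete argument types.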

In the context of deep learning and data science, Julia's native capabilities for handling mathematical computations and its performance optimizations make it a strong contender. Its ability to seamlessly integrate with machine learning frameworks further enhances its utility in these domains.

Robust Ecosystem

While Julia is a newer language than Python, it has rapidly developed a robust ecosystem of packages and libraries. The Julia community is active and growing, contributing to various domains, from data science and machine learning to finance and biology. This community support ensures that Julia remains up-to-date with the latest advancements and continues to evolve to meet the needs of its users.

Medical Image Segmentation and Who Should Use Julia?

The MedEval3D package exemplifies Julia's strengths in high-performance computing. This package uses CUDA acceleration to calculate medical segmentation metrics with unprecedented speed. By integrating MedEval3D with Python modules like nnunet_utilities, researchers can compute metrics such as Dice loss using both Julia and Python functions. Benchmarking results have shown that MedEval3D achieves up to 214 times faster execution than traditional CPU-based methods.

This performance is crucial in medical applications, where rapid and accurate image segmentation can significantly impact diagnosis and treatment planning. Julia's ability to handle large datasets efficiently and its compatibility with existing Python tools make it an invaluable asset in this field.

I was fortunate to work with Jakub Mitura, an accomplished Google engineer from Poland. He introduced me to the language and to a project meant to show that Julia is the best language for medical image segmentation analysis. To start my test, I wrote a Python script that computes a Dice loss on a small matrix dataset.

import numpy as np


def dice_loss(arrGold, arrAlgo):
    intersection = np.sum(arrGold * arrAlgo)
    union = np.sum(arrGold) + np.sum(arrAlgo)
    if union == 0:
        return 1.0
    return 1.0 - (2.0 * intersection/union)

# Test data
arrGold = np.array([
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    [[1, 1, 0], [0, 1, 1], [0, 0, 1]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
])

arrAlgo = np.array([
    [[1, 1, 0], [1, 0, 0], [0, 1, 1]],
    [[1, 0, 1], [0, 1, 0], [1, 1, 0]],
    [[0, 1, 0], [1, 0, 1], [1, 1, 0]]
])

# Calculate Dice loss
loss = dice_loss(arrGold, arrAlgo)
print("Python Dice Loss: ", loss)

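For the test data above, I received a Dice loss of about 0.643: the two masks overlap in 5 voxels and contain 13 and 15 foreground voxels, so the loss is 1 - (2 × 5)/28.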

Integration with SimpleITK in Python:

# Save this as test_simpleitk.py

import SimpleITK as sitk
import numpy as np

def dice_loss(arrGold, arrAlgo):
    intersection = np.sum(arrGold * arrAlgo)
    union = np.sum(arrGold) + np.sum(arrAlgo)
    if union == 0:
        return 1.0
    return 1.0 - (2.0 * intersection/union)

# Test data
arrGold = np.array([
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    [[1, 1, 0], [0, 1, 1], [0, 0, 1]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
])

arrAlgo = np.array([
    [[1, 1, 0], [1, 0, 0], [0, 1, 1]],
    [[1, 0, 1], [0, 1, 0], [1, 1, 0]],
    [[0, 1, 0], [1, 0, 1], [1, 1, 0]]
])

# Convert to SimpleITK images (the label overlap filter expects integer pixel types)
arrGold_sitk = sitk.GetImageFromArray(arrGold.astype(np.uint8))
arrAlgo_sitk = sitk.GetImageFromArray(arrAlgo.astype(np.uint8))

# Calculate Dice coefficient using SimpleITK
dice_filter = sitk.LabelOverlapMeasuresImageFilter()
dice_filter.Execute(arrGold_sitk, arrAlgo_sitk)
sitk_dice = dice_filter.GetDiceCoefficient()

# Calculate Dice loss using NumPy
numpy_loss = dice_loss(arrGold, arrAlgo)

print("SimpleITK Dice Coefficient: ", sitk_dice)
print("Numpy Dice Loss: ", numpy_loss)
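
Note that the two printed values measure opposite things: the Dice coefficient rewards overlap, while the Dice loss is one minus that coefficient. The following is a minimal pure-Python sketch of that relation on the same test data. The helper names `flatten` and `dice_coefficient` are introduced here for illustration and are not part of SimpleITK or MedEval3D.

```python
from itertools import chain


def flatten(volume):
    """Flatten a 3D nested list (slices -> rows -> voxels) into one flat list."""
    return list(chain.from_iterable(chain.from_iterable(volume)))


def dice_coefficient(gold, algo):
    """Dice coefficient of two binary masks given as flat 0/1 sequences."""
    intersection = sum(g * a for g, a in zip(gold, algo))
    total = sum(gold) + sum(algo)  # |G| + |A|, the denominator of the Dice formula
    if total == 0:
        return 0.0  # mirrors the article's convention of loss = 1.0 for two empty masks
    return 2.0 * intersection / total


def dice_loss(gold, algo):
    """Dice loss, defined as one minus the Dice coefficient."""
    return 1.0 - dice_coefficient(gold, algo)


gold = flatten([
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    [[1, 1, 0], [0, 1, 1], [0, 0, 1]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
])
algo = flatten([
    [[1, 1, 0], [1, 0, 0], [0, 1, 1]],
    [[1, 0, 1], [0, 1, 0], [1, 1, 0]],
    [[0, 1, 0], [1, 0, 1], [1, 1, 0]],
])

coefficient = dice_coefficient(gold, algo)
loss = dice_loss(gold, algo)
print("Dice coefficient:", coefficient)
print("Dice loss:", loss)
```

When comparing the SimpleITK output against the NumPy output, the coefficient should therefore be compared with one minus the loss, not with the loss itself.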


After verifying the correctness of both the Julia and Python implementations independently, you can integrate them using PyCall in Julia. Here’s the complete workflow:

using Pkg
Pkg.add("PyCall")
using PyCall

# Import the Python module
np = pyimport("numpy")
sitk = pyimport("SimpleITK")

# Define the Dice loss function in Julia
function dice_loss(arrGold::Array{Int64,3}, arrAlgo::Array{Int64,3})
    intersection = sum(arrGold .& arrAlgo)
    union = sum(arrGold) + sum(arrAlgo)
    if union == 0
        return 1.0
    end
    return 1.0 - (2.0 * intersection/union)
end

# Test data: three 3x3 slices stacked along the third dimension,
# which yields an Array{Int64,3} directly
arrGold = cat(
    [1 0 0; 0 1 0; 0 0 1],
    [1 1 0; 0 1 1; 0 0 1],
    [1 0 1; 0 1 0; 1 0 1];
    dims=3
)

arrAlgo = cat(
    [1 1 0; 1 0 0; 0 1 1],
    [1 0 1; 0 1 0; 1 1 0],
    [0 1 0; 1 0 1; 1 1 0];
    dims=3
)

# Calculate Dice loss using Julia
julia_loss = dice_loss(arrGold, arrAlgo)
println("Julia Dice Loss: ", julia_loss)

# Convert Julia arrays to NumPy arrays for compatibility with Python functions
arrGold_np = PyObject(arrGold)
arrAlgo_np = PyObject(arrAlgo)

# Define the NumPy Dice loss in Python, then call it from Julia
py"""
import numpy as np
def dice_loss(arrGold, arrAlgo):
    intersection = np.sum(arrGold * arrAlgo)
    union = np.sum(arrGold) + np.sum(arrAlgo)
    if union == 0:
        return 1.0
    return 1.0 - (2.0 * intersection/union)
"""
dice_loss_py = py"dice_loss"(arrGold_np, arrAlgo_np)
println("Python Dice Loss: ", dice_loss_py)

# Calculate Dice coefficient using SimpleITK in Python
arrGold_sitk = sitk.GetImageFromArray(arrGold)
arrAlgo_sitk = sitk.GetImageFromArray(arrAlgo)
dice_filter = sitk.LabelOverlapMeasuresImageFilter()
dice_filter.Execute(arrGold_sitk, arrAlgo_sitk)
sitk_dice = dice_filter.GetDiceCoefficient()
println("SimpleITK Dice Coefficient: ", sitk_dice)

Integrating the Python and Julia scores in one workflow also let me time both languages to see which one computes the Dice loss first. The winner was Julia, as it beat Python by 50 seconds. My research is still ongoing, with the addition of KernelAbstractions and tests with pandas and NumPy. More information is available via my GitHub page, linked at the start of this article.
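
For readers who want to take a comparable measurement on the Python side, the following is a minimal timing sketch. It does not reproduce the 50-second result above: the synthetic masks, the `time_call` helper, and the repeat count are assumptions made for illustration. An equivalent Julia measurement should call the function once before timing it, because the first call includes compilation time.

```python
import time


def dice_loss(gold, algo):
    """Dice loss for two binary masks given as flat sequences of 0/1."""
    intersection = sum(g * a for g, a in zip(gold, algo))
    total = sum(gold) + sum(algo)
    if total == 0:
        return 1.0
    return 1.0 - (2.0 * intersection / total)


def time_call(func, *args, repeats=5):
    """Return (best wall-clock seconds over `repeats` runs, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best, result


# Synthetic masks, made up for illustration only (not the article's dataset).
n_voxels = 64 * 64 * 64
gold = [1 if i % 2 == 0 else 0 for i in range(n_voxels)]
algo = [1 if i % 3 == 0 else 0 for i in range(n_voxels)]

best_seconds, loss = time_call(dice_loss, gold, algo)
print(f"Dice loss: {loss:.4f}, best of 5 runs: {best_seconds:.6f} s")
```

Taking the best of several runs reduces noise from other processes; reporting the mean or median alongside it is a reasonable alternative.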

While Python remains a popular and versatile language, Julia's advantages in performance, ease of use, and suitability for numerical and scientific computing make it a compelling choice for high-performance computing tasks. As more industries recognize the importance of speed and efficiency, particularly in fields like medical image analysis and deep learning, Julia's role is set to expand. By leveraging Julia's strengths, developers and researchers can push the boundaries of what is possible in computational science, driving innovation and achieving previously unattainable breakthroughs.

I plan to contribute to more papers that emphasize the importance of Julia, compare it to other languages, and explore its potential for web development and future deployment.
