Exploring Python 3.14’s Zstandard Compression



This content originally appeared on DEV Community and was authored by ZEZE1020

As a developer interested in exploring Python’s latest features, I tried Python 3.14 (I like that its version number is close to π), which is set to be released in October 2025, and discovered its new compression.zstd module. This module brings the Zstandard compression algorithm, known for its speed and high compression ratios, into Python’s standard library. I created PyChive, a simple project that compresses text files concurrently using this new feature. In this article, I’ll share what I learned while building PyChive, including code snippets and the challenges I faced.

@itseieio shared how they compressed 335 GB of chess data using zstd, saving significant space compared to JSON (X post).

What is Zstandard?

Zstandard, developed by Facebook, is a fast compression algorithm that balances high compression ratios with quick performance. According to the Zstandard website, it outperforms older algorithms like gzip and bzip2, offering tunable speed versus compression trade-offs. Python 3.14’s compression.zstd module, introduced via PEP 784, makes this algorithm accessible without external dependencies, supporting file compression, decompression, and advanced features like dictionary training (Python 3.14 Documentation).
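
To make the speed versus ratio trade-off concrete, here is a minimal round-trip sketch at a few compression levels. It assumes the one-shot compress() and decompress() functions of compression.zstd; the round_trip helper name and the zlib fallback are my own additions, and the fallback exists only so the sketch still runs on interpreters older than 3.14 (in which case it is exercising zlib, not Zstandard).

```python
import zlib

try:
    from compression import zstd  # available from Python 3.14
except ImportError:
    zstd = None  # older interpreter: fall back to zlib below


def round_trip(data: bytes, level: int = 3) -> tuple[int, bool]:
    """Compress and decompress data; return (compressed size, matches original)."""
    if zstd is not None:
        packed = zstd.compress(data, level=level)
        restored = zstd.decompress(packed)
    else:
        # Fallback only so the sketch runs before 3.14; this is zlib, not Zstandard.
        packed = zlib.compress(data, level)
        restored = zlib.decompress(packed)
    return len(packed), restored == data


if __name__ == "__main__":
    sample = b"Sample content for testing\n" * 1000
    for level in (1, 3, 9):
        size, ok = round_trip(sample, level)
        print(f"level={level}: {len(sample)} -> {size} bytes, round trip ok: {ok}")
```

Higher levels generally trade more CPU time for smaller output; on highly repetitive input like this sample the difference between levels may be small.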

PyChive: A Simple Compression Tool

PyChive is a Python script that compresses all .txt files in the current directory into .zst files using Zstandard. It uses ThreadPoolExecutor for concurrent processing, testing Python 3.14’s potential free-threaded mode (PEP 779). The project avoids complex features like template strings to keep it accessible, focusing on compression and basic reporting of file details (names, sizes, and compression ratios).

Setting Up Python 3.14 Beta

Since Python 3.14’s stable release is pending, I used the beta version (3.14.0b2). Here’s how to set it up with pyenv, which makes managing Python versions easy (Real Python: Managing Multiple Python Versions):

# Install pyenv (if not already installed)
curl https://pyenv.run | bash

# Install Python 3.14.0b2
pyenv install 3.14.0b2
pyenv global 3.14.0b2

# Verify version
python --version  # Should output: Python 3.14.0b2

Alternatively, download the beta from python.org. Beta releases can be unstable and are not meant for production, but they are a great way to explore new features.
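
A script can also check at run time that it is on a new enough interpreter before importing compression.zstd. The sketch below uses a hypothetical helper name, has_stdlib_zstd, and only compares version numbers; a 3.14 build compiled without the zstd library could still lack the module, so treat it as a first check rather than a guarantee.

```python
import sys


def has_stdlib_zstd(version_info: tuple = sys.version_info) -> bool:
    """Return True when the interpreter version is 3.14 or newer."""
    return tuple(version_info[:2]) >= (3, 14)


if __name__ == "__main__":
    if has_stdlib_zstd():
        print("compression.zstd should be importable")
    else:
        print(f"Python {sys.version.split()[0]} predates compression.zstd")
```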

PyChive’s Core Code

Here’s the main implementation of PyChive, a script that compresses .txt files and prints their details:

import compression.zstd
import os
import shutil
from concurrent.futures import ThreadPoolExecutor
import time

def compress_file(input_path: str, output_path: str) -> dict:
    """Compress a file using Zstandard and return compression stats."""
    with open(input_path, 'rb') as f_in, compression.zstd.open(output_path, 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)
    original_size = os.path.getsize(input_path)
    compressed_size = os.path.getsize(output_path)

    try:
        compression_ratio = original_size / compressed_size
    except ZeroDivisionError:
        compression_ratio = 0.0

    return {
        'original_filename': input_path,
        'compressed_filename': output_path,
        'original_size': original_size,
        'compressed_size': compressed_size,
        'compression_ratio': compression_ratio
    } 


def main():
    # Get all .txt files in the current directory
    files = [f for f in os.listdir('.') if os.path.isfile(f) and f.endswith('.txt')]

    if not files:
        print("No .txt files found in the current directory.")
        return

    # Compress files concurrently
    start_time = time.time()
    with ThreadPoolExecutor(max_workers=4) as executor:
        futures = [executor.submit(compress_file, file, file + '.zst') for file in files]
        results = [future.result() for future in futures]
    conc_time = time.time() - start_time

    # Print compression time and file details
    print(f"Concurrent compression time: {conc_time} seconds")
    for stats in results:
        print("Compression Report")
        print("------------------")
        print(f"Original file: {stats['original_filename']}")
        print(f"Compressed file: {stats['compressed_filename']}")
        print(f"Original size: {stats['original_size']} bytes")
        print(f"Compressed size: {stats['compressed_size']} bytes")
        print(f"Compression ratio: {stats['compression_ratio']:.2f}")
        print()

if __name__ == '__main__':
    main()
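
One thing to be aware of in main(): future.result() re-raises the first exception it meets, so a single unreadable file stops the whole report. A possible variation, sketched here with a hypothetical run_all helper that is not part of PyChive, collects results as they finish and records failures per file instead.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def run_all(worker: Callable[[str], dict], paths: list[str], max_workers: int = 4):
    """Run worker on every path; return (results, errors) without aborting early."""
    results, errors = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the path it is processing.
        future_to_path = {executor.submit(worker, p): p for p in paths}
        for future in as_completed(future_to_path):
            path = future_to_path[future]
            try:
                results.append(future.result())
            except Exception as exc:  # e.g. OSError from an unreadable file
                errors[path] = exc
    return results, errors
```

With PyChive's function this could be called as run_all(lambda p: compress_file(p, p + '.zst'), files). Results arrive in completion order, not input order.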

Code Breakdown

  • Compression Function: The compress_file function uses compression.zstd.open to compress a file, streaming data with shutil.copyfileobj for efficiency. It returns a dictionary with file details: original_filename, compressed_filename, original_size, compressed_size, and compression_ratio.
  • Main Function: The main function finds .txt files, compresses them concurrently using ThreadPoolExecutor, and prints stats directly from the stats dictionary.
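  • Concurrency: With max_workers=4, up to four files are compressed at once. Compression is mostly CPU-bound, so the largest gains should come on a free-threaded build of Python 3.14, where threads can run in parallel; on the default build, any speedup depends on how much time is spent in file I/O or in code that releases the GIL.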
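
When comparing timings it helps to know which kind of interpreter is running. The following sketch reports whether the build supports free threading and whether the GIL is currently enabled. The helper name free_threading_status is my own; sys._is_gil_enabled() is only present from Python 3.13, so the code assumes the GIL is on when it is missing.

```python
import sys
import sysconfig


def free_threading_status() -> dict:
    """Report whether this build supports free threading and whether the GIL is on."""
    # Py_GIL_DISABLED is 1 on free-threaded builds, 0 or None otherwise.
    build_supports = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists from Python 3.13; assume the GIL is on without it.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True
    return {"free_threaded_build": build_supports, "gil_enabled": gil_enabled}


if __name__ == "__main__":
    print(free_threading_status())
```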

Running PyChive

To run PyChive:

  1. Ensure Python 3.14.0b2 is installed.
  2. Create .txt files in the script’s directory:
   echo "Sample content for testing" > test1.txt
   echo "Sample content for testing" > test2.txt
  3. Run the script:
   python main.py
  4. Example output (illustrative; actual timings and sizes will differ, and files this small may not shrink at all):
   Concurrent compression time: 0.123456 seconds
   Compression Report
   ------------------
   Original file: test1.txt
   Compressed file: test1.txt.zst
   Original size: 26 bytes
   Compressed size: 10 bytes
   Compression ratio: 2.60

   Compression Report
   ------------------
   Original file: test2.txt
   Compressed file: test2.txt.zst
   Original size: 26 bytes
   Compressed size: 10 bytes
   Compression ratio: 2.60
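
One-line files are dominated by the compressed format's fixed overhead, so a more telling test uses larger, repetitive input. Here is a minimal sketch that writes such files from Python instead of echo. The helper name write_sample_files and the default of 10,000 repeated lines are arbitrary choices of mine.

```python
import os
import tempfile


def write_sample_files(directory: str, count: int = 2, repeats: int = 10_000) -> list[str]:
    """Write count text files of repeated lines into directory; return their paths."""
    line = "Sample content for testing\n"
    paths = []
    for i in range(1, count + 1):
        path = os.path.join(directory, f"test{i}.txt")
        # newline="\n" keeps the byte count identical across platforms.
        with open(path, "w", encoding="utf-8", newline="\n") as f:
            f.write(line * repeats)
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_sample_files(tmp):
            print(path, os.path.getsize(path), "bytes")
```

To feed PyChive, call write_sample_files('.') in the script’s directory and then run main.py as before.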

Challenges and Lessons Learned

While writing PyChive, I encountered several challenges, particularly with implementing t-templates, a new feature in Python 3.14 introduced via PEP 750. These hurdles provided valuable learning experiences, deepening my understanding of Python’s latest capabilities.

Understanding T-Templates

T-templates, or template strings, are designed to offer a safer alternative to f-strings for scenarios involving user input or dynamic content generation. Like f-strings, they evaluate their expressions immediately, but instead of a finished string they produce a Template object that keeps the static text and the interpolated values separate, so code can inspect or sanitize the values before rendering them.

Initially, I struggled with the syntax and usage of t-templates. In PyChive, I aimed to generate compression reports using placeholders like {original_filename}. However, I mistakenly tried to access variables directly, leading to a NameError:

# Incorrect approach
print(f"Original file: {original_filename}")  # NameError: name 'original_filename' is not defined

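This error occurred because original_filename was not a standalone variable but a key in the stats dictionary returned by the compress_file function. The same rule applies to t-strings, since their expressions are evaluated as soon as the literal runs. The fix is to index into stats inside the template and then turn the resulting Template object into text. Note that render_report is not a standard-library function; a minimal version is included here as a sketch:

from string.templatelib import Interpolation, Template

def render_report(template: Template) -> str:
    """Join a Template's static text and formatted values into one string."""
    parts = []
    for item in template:  # yields str and Interpolation objects in order
        if isinstance(item, Interpolation):
            parts.append(format(item.value, item.format_spec))
        else:
            parts.append(item)
    return "".join(parts)

# stats is the dictionary returned by compress_file
report_template = t"""
Compression Report
------------------
Original file: {stats['original_filename']}
Compressed file: {stats['compressed_filename']}
Original size: {stats['original_size']} bytes
Compressed size: {stats['compressed_size']} bytes
Compression ratio: {stats['compression_ratio']:.2f}
"""

# Render the template to a plain string
report = render_report(report_template)
print(report)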

For a detailed explanation of t-templates, I relied on Real Python’s guide (Real Python: Template Strings in Python 3.14), which clarified their syntax and use cases.

Key Takeaways

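  • T-Templates Differ from F-Strings: a t-string produces a Template object rather than a finished string, so it has to be rendered before it can be printed.
  • Syntax Mastery Is Essential: placeholders are ordinary expressions, so values held in a dictionary have to be indexed explicitly, as in {stats['original_filename']}.
  • Community Resources Shine: Real Python’s content was a lifeline, offering practical examples that bridged the gap between theory and application.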

Experimenting with t-templates in PyChive made me appreciate Python 3.14’s new features.

Why PyChive Matters

PyChive is a starting point for exploring Python 3.14’s capabilities. It shows how to:

  • Use compression.zstd for efficient file compression.
  • Use concurrency for performance gains.
  • Keep code simple for learning and experimentation.

As Python 3.14’s stable release approaches, PyChive can evolve, potentially adding a web interface or advanced features like Zstandard dictionaries. For now, it’s a practical example for you to dive into Python’s latest tools. I’d love to see what you’d come up with!

Conclusion

PyChive taught me the power of Zstandard compression and Python 3.14’s potential for efficient file handling. I encourage you to try PyChive, experiment with Python 3.14, and share your findings. Check out the Python 3.14 documentation and Real Python for more on Python’s new features.
