frenchintelligence.org
dataengineering
Dynamic Routing Lightweight ETL with AWS Lambda, DuckDB, and PyIceberg
September 2, 2025
Data Mesh: The Decentralized Revolution That Will Transform Your Data Architecture
September 1, 2025
Check Out 3 Awesome Open Source Tabular Data Wrangling Apps
August 29, 2025
Good work George
August 28, 2025
Why We Built Confidence Scoring Into Our Date Parser (And Why Every API Should)
August 28, 2025
Data Modeling: From Basics to Advanced Techniques for Business Impact
August 26, 2025
Is Prompt Engineering Just Hype for Now?
August 23, 2025
Lightweight ETL with AWS Lambda, DuckDB, and delta-rs
August 22, 2025
Personal Picks: Data Product News (August 20, 2025)
August 20, 2025
Tableau Sales Dashboard Performance (Updated for 2025)
August 19, 2025
Build a Lightweight Serverless ETL Pipeline to Iceberg Tables with AWS Lambda Athena
August 19, 2025
Building ML Infrastructure in TypeScript – Part 1: The Vision
August 13, 2025
Engineering with SOLID, DRY, KISS, YAGNI and GRASP
August 13, 2025
15 foundational concepts on Data Engineering
August 12, 2025
[Boost]
August 12, 2025
Core Concepts of Data Engineering: A Practical Guide for Modern Data Teams
August 12, 2025
The Case for Apache Airflow and Kafka in Data Engineering
August 11, 2025
Snowflake RBAC 101
August 11, 2025
A Recap of Data Engineering Concepts
August 11, 2025
Docker Persistence: When and How to Keep Your Container Data
August 9, 2025
What Is a Primary Key in SQL? Learn with Examples
August 8, 2025
AI-Powered Data Engineering Pipelines: Smarter, Faster, Scalable
August 8, 2025
Building My First Production-Ready ELT Pipeline: A Student’s Journey with Docker, PostgreSQL, dbt, and Airflow
August 7, 2025
Is your Vector Database Really Fast?
July 22, 2025
SQL Server 2025 – What’s New and How to Visualize the Schema
July 18, 2025
Apache Iceberg Table Optimization #10:
July 17, 2025
Apache Iceberg Table Optimization #9:
July 17, 2025
Apache Iceberg Table Optimization #8: Hidden Pitfalls — Compaction and Partition Evolution in Apache Iceberg
July 17, 2025
Apache Iceberg Table Optimization #7: Using Iceberg Metadata Tables to Determine When Compaction Is Needed
July 17, 2025
Apache Iceberg Table Optimization #5: Avoiding Metadata Bloat with Snapshot Expiration and Rewriting Manifests
July 17, 2025
Apache Iceberg Table Optimization #4: Smarter Data Layout — Sorting and Clustering Iceberg Tables
July 17, 2025
Apache Iceberg Table Optimization #3: Optimizing Compaction for Streaming Workloads in Apache Iceberg
July 17, 2025
Apache Iceberg Table Optimization #2: The Basics of Compaction — Bin Packing Your Data for Efficiency
July 17, 2025
Apache Iceberg Table Optimization #1: The Cost of Neglect — How Apache Iceberg Tables Degrade Without Optimization
July 17, 2025
Data and analytics reimagined with Terraform and DevOps principles
July 16, 2025
Big Data Fundamentals: data pipeline tutorial
July 15, 2025
Big Data Fundamentals: data lake
July 10, 2025
Big Data Fundamentals: delta lake example
July 9, 2025
Personal Picks: Data Product News (July 9, 2025)
July 9, 2025
How to Discover or Organize Lakehouse & Apache Iceberg Meetups
July 3, 2025
Big Data Fundamentals: big data tutorial
June 29, 2025
Big Data Fundamentals: big data tutorial
June 28, 2025
The Myth of Sisyphus in Data Engineering
June 27, 2025
How to Document SQL Server Schemas Visually in 2025
June 26, 2025
Data Engineering: The Hero Behind Smart Data Decisions
April 3, 2025
Why Pi-Shaped Teams Matter in This AI Era
March 19, 2025
How fault-tolerance works in Flink
March 16, 2025
Azure For Data Engineering
March 15, 2025
Data Modeling – Entities and Events
October 30, 2024
Real-Time Air Traffic Data Analysis with Spark Structured Streaming and Apache Kafka
October 28, 2024
Data Engineering in 2024: Innovations and Trends Shaping the Future
October 27, 2024
From a Unified Bronze Layer to Multiple Silver Layers: Streamlining Data Transformation in Databricks Unity Catalog
October 20, 2024
Mastering Informatica Intelligent Cloud Services (IICS) for Cloud Data Integration
October 18, 2024
Handling Outliers in Python – IQR Method
October 10, 2024
Go vs Python for File Processing: A Performance and Architecture Perspective
October 9, 2024
End-to-End ETL and Sales Dashboard on WWI dataset in Microsoft Fabric
October 8, 2024
Ultimate Directory of Apache Iceberg Resources
October 5, 2024
Clear Link Between DevSecOps and Data Engineering
September 13, 2024
Capture Browser XHR/Fetch API Response Automatically into JSON Files
September 12, 2024
One Minute: DatAasee
September 10, 2024
Building a User-Friendly, Budget-Friendly Alternative to dbt Cloud
September 8, 2024
Ensuring Data Integrity: Comparing Soda and Great Expectations for Quality Assurance
September 8, 2024
What Apache Iceberg REST Catalog is and isn’t
August 18, 2024
ETL Real Estate Data Engineering with Redfin: From Extraction to Visualization
August 18, 2024
Transforming Data Engineering: A Business Domain Approach with Data Mesh
August 18, 2024
Magic Mushrooms: Exploring and Handling Null Data with Mage
August 16, 2024
The Ultimate Guide to Data Analytics: Techniques and Tools.
August 16, 2024
Useful Python Libraries for AI/ML
August 10, 2024
Uploading Files Using Pre-Signed URLs to a Specific Storage Class
August 8, 2024
Data Lakehouse 101: The Who, What and Why of Data Lakehouses
August 5, 2024
A beginner’s guide to data engineering concepts, tools, and responsibilities.
August 5, 2024
A Beginner’s Guide To Data Engineering Concepts, Tools, And Responsibilities.
August 4, 2024
An Approach to Finding Missing Documents between 2 indices
August 3, 2024
Breaking Into Data Science: A Comprehensive Guide for Aspiring Data Scientists
August 3, 2024
🪄 Debezium: the magic behind data capture & async replication (for free)
July 22, 2024
Ways to load data in DW from External Data Source
July 21, 2024
Evolution of Data Sharding Towards Automation and Flexibility in Apache Doris
July 20, 2024
Apache Doris Job Scheduler for Task Automation
July 18, 2024
Aggregation in GROUP BY vs. Window Functions Using OVER()
July 14, 2024
Understanding RAID Levels: A Comprehensive Guide to RAID 0, 1, 5, 6, 10, and Beyond
July 11, 2024
The Key Points for Technical Business Documentation with Systems Architecture
July 10, 2024
BigQuery Schema Generation Made Easier with PyPI’s bigquery-schema-generator
July 9, 2024
MapReduce Vs Tez
July 7, 2024
Azure Synapse Analytics Security: Data Protection
July 4, 2024
Leveraging PySpark.Pandas for Efficient Data Pipelines
July 4, 2024
HNG STAGE ZERO: ANALYZING RETAIL SALES DATA AT FIRST GLANCE
July 3, 2024
From Messy Data to Super Mario Pipeline: My First Adventure in Data Engineering
June 20, 2024
The Data Professions
June 20, 2024
Database generated events: LiveSync’s database connector vs CDC
June 20, 2024
Working with Parquet files in Java using Carpet
June 19, 2024
Analyzing Svenskalag Data using DBT and DuckDB
June 17, 2024