Introduction to Apache Iceberg using MinIO



This content originally appeared on DEV Community and was authored by Minwook Je

Introduction

Apache Iceberg is one of three popular open table formats (OTF). The other two are Apache Hudi, which originated at Uber, and Delta Lake, which originated at Databricks.

In this post:

  1. Iceberg specification
  2. Docker Compose Hands-on
  3. Metadata

What is OTF?

Turn files into tables
Open Table Format is a specification for organizing a collection of files containing the same information such that they are presented as a single table.

Table
The word "table" implies that we want all these files to be viewable and updatable as if they were a single entity: the table.

We can interact with this collection of files the same way we would interact with a table in a database.

An OTF is only a specification, so various parties must implement it to produce usable software.
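
To make the "files into tables" idea concrete, here is a minimal sketch in plain Python. It is not Iceberg's actual metadata layout: the `table.json` file, its `data_files` key, and the helpers `write_data_file` and `scan_table` are hypothetical names chosen for illustration, and CSV stands in for the columnar files a real table would typically use.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_data_file(path, rows):
    """Write one data file (CSV here, purely for illustration)."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "name"])
        writer.writeheader()
        writer.writerows(rows)


def scan_table(metadata_path):
    """Yield the rows of every data file listed in the metadata, as one table."""
    metadata = json.loads(Path(metadata_path).read_text())
    for data_file in metadata["data_files"]:
        with open(data_file, newline="") as f:
            yield from csv.DictReader(f)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        part0 = root / "part-0.csv"
        part1 = root / "part-1.csv"
        write_data_file(part0, [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}])
        write_data_file(part1, [{"id": 3, "name": "c"}])

        # Hypothetical metadata file: the list of files that make up the table.
        metadata_path = root / "table.json"
        metadata_path.write_text(json.dumps({"data_files": [str(part0), str(part1)]}))

        # The reader only needs the metadata file, not the individual file names.
        return list(scan_table(metadata_path))


rows = demo()
print(rows)
```

The reader never lists the directory. It trusts the metadata file to say which files belong to the table, and that is what lets a collection of files behave like a single entity.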

Apache Iceberg specification (3)

To implement the Apache Iceberg specification, we need three things:

  1. Catalog: keeps track of all the metadata files
  2. Processing engine: reads and writes the table, e.g., a query engine
  3. Scalable storage: e.g., object storage such as MinIO
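
The division of labor between these three components can be sketched with in-memory stand-ins. This is a toy model under stated assumptions, not a real Iceberg implementation: the classes `ObjectStorage` and `Catalog`, the `query` function, the object keys, and the table name `demo.events` are all hypothetical.

```python
import json


class ObjectStorage:
    """Stand-in for scalable object storage: maps object keys to bytes."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


class Catalog:
    """Tracks which metadata file is the current one for each table."""

    def __init__(self):
        self._current = {}

    def commit(self, table, metadata_key):
        # Point the table at a new metadata file.
        self._current[table] = metadata_key

    def current_metadata(self, table):
        return self._current[table]


def query(catalog, storage, table):
    """Toy processing engine: catalog -> metadata file -> data files -> rows."""
    metadata = json.loads(storage.get(catalog.current_metadata(table)))
    rows = []
    for key in metadata["data_files"]:
        rows.extend(json.loads(storage.get(key)))
    return rows


storage = ObjectStorage()
catalog = Catalog()

# First write: one data file plus a metadata file that lists it.
storage.put("data/0.json", json.dumps([{"id": 1}]).encode())
storage.put("metadata/v1.json", json.dumps({"data_files": ["data/0.json"]}).encode())
catalog.commit("demo.events", "metadata/v1.json")
before = query(catalog, storage, "demo.events")

# Second write: a new data file and a new metadata file; the old ones stay untouched.
storage.put("data/1.json", json.dumps([{"id": 2}]).encode())
storage.put(
    "metadata/v2.json",
    json.dumps({"data_files": ["data/0.json", "data/1.json"]}).encode(),
)
catalog.commit("demo.events", "metadata/v2.json")
after = query(catalog, storage, "demo.events")

print(before, after)
```

In this sketch the engine never guesses which files are current. It always asks the catalog for the latest metadata file first, which is why the catalog is listed as a required component alongside storage and the engine.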

