This content originally appeared on DEV Community and was authored by Minwook Je
Introduction
Apache Iceberg is one of three popular open table formats (OTF).
The other two are Apache Hudi (originated at Uber) and Delta Lake (originated at Databricks).
In this post:
- Iceberg specification
- Docker Compose Hands-on
- Metadata
What is OTF?
Turn files into tables
Open Table Format is a specification for organizing a collection of files containing the same information such that they are presented as a single table.
Table
The implication is that we want all these files to be viewable and updateable as if they were a single entity: the table.
We can interact with this collection of files the same way we would with a table in a database.
Various parties must implement this specification to produce usable software.
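To make "files presented as a single table" concrete, here is a minimal sketch using only the Python standard library. It is not how Iceberg works internally; the CSV layout, the file names, and the helper names (`write_data_file`, `scan_table`, `build_demo_table`) are all assumptions made up for illustration.

```python
import csv
import tempfile
from pathlib import Path


def write_data_file(path, rows):
    """Write one data file (hypothetical layout: CSV with id and name columns)."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "name"])
        writer.writeheader()
        writer.writerows(rows)


def scan_table(data_files):
    """Yield every row of every file, as if the files were a single table."""
    for path in data_files:
        with open(path, newline="") as f:
            yield from csv.DictReader(f)


def build_demo_table(directory):
    """Create two data files and return the file list that defines the table."""
    directory = Path(directory)
    first = directory / "part-0.csv"
    second = directory / "part-1.csv"
    write_data_file(first, [{"id": "1", "name": "a"}, {"id": "2", "name": "b"}])
    write_data_file(second, [{"id": "3", "name": "c"}])
    return [first, second]


with tempfile.TemporaryDirectory() as tmp:
    table_files = build_demo_table(tmp)
    # The caller never opens individual files; it only scans "the table".
    rows = list(scan_table(table_files))
    print(len(rows), "rows read from", len(table_files), "files")
```

The point of the sketch is that the table is defined by the list of files, not by any single file. A real table format also has to specify schemas, updates, and concurrent writers, which this toy ignores.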
Apache Iceberg specification (3)
To implement the Apache Iceberg specification, we need three things:
- Catalog: keeps track of all the metadata files
- Processing engine: e.g., a query engine
- Scalable storage: object storage
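As a rough mental model of how the three pieces fit together, here is a toy sketch in plain Python. None of it is the Iceberg API: the dictionary standing in for object storage, the metadata layout, the table name `demo.events`, and the `scan` function are all hypothetical simplifications.

```python
# Toy stand-ins for the three components; none of this is the Iceberg API.

# Scalable storage: an in-memory dict standing in for an object store.
object_store = {
    "data/part-0.json": [{"id": 1}, {"id": 2}],
    "data/part-1.json": [{"id": 3}],
    "metadata/v1.json": {"data_files": ["data/part-0.json"]},
    "metadata/v2.json": {"data_files": ["data/part-0.json", "data/part-1.json"]},
}

# Catalog: maps a table name to the metadata file that currently describes it.
catalog = {"demo.events": "metadata/v2.json"}


def scan(table_name):
    """Processing engine: resolve name -> metadata -> data files, then read rows."""
    metadata = object_store[catalog[table_name]]
    rows = []
    for path in metadata["data_files"]:
        rows.extend(object_store[path])
    return rows


current_rows = scan("demo.events")
print(current_rows)
```

In this toy, pointing the catalog entry at `metadata/v1.json` instead would make the same `scan` call return only the rows from the first data file, which hints at why keeping track of metadata files is the catalog's job.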