data
Medallion Architecture
A layered data lakehouse design with Bronze (raw), Silver (cleansed), and Gold (aggregated) tiers.
Also known as
Lakehouse Architecture, Bronze Silver Gold
Medallion Architecture is a data design pattern used in modern data lakehouses that organises data into three progressive quality tiers:
| Tier | Also called | Contents |
|---|---|---|
| Bronze | Raw / Landing | Ingested data in its original form — no transformations, append-only. |
| Silver | Cleansed / Enriched | Deduplicated, validated, and joined data ready for analysis. |
| Gold | Aggregated / Business | Business-level aggregations, KPIs, and ML feature tables. |
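The three tiers can be illustrated with a small in-memory sketch. This is plain Python standing in for a real engine such as Spark or dbt, and the record fields (`order_id`, `customer`, `amount`) and function names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Bronze: raw records exactly as ingested (append-only). The sample includes
# a duplicate and a row with an unparseable amount. Field names are hypothetical.
bronze = [
    {"order_id": "1", "customer": "acme", "amount": "120.50"},
    {"order_id": "1", "customer": "acme", "amount": "120.50"},  # duplicate
    {"order_id": "2", "customer": "acme", "amount": "30.00"},
    {"order_id": "3", "customer": "globex", "amount": "n/a"},   # invalid amount
    {"order_id": "4", "customer": "globex", "amount": "75.25"},
]


def to_silver(raw_rows):
    """Silver: deduplicate on order_id, validate and type-cast the fields."""
    seen = set()
    silver_rows = []
    for row in raw_rows:
        if row["order_id"] in seen:
            continue  # drop duplicates
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skipped here; a real pipeline might quarantine the row
        seen.add(row["order_id"])
        silver_rows.append(
            {
                "order_id": int(row["order_id"]),
                "customer": row["customer"],
                "amount": amount,
            }
        )
    return silver_rows


def to_gold(silver_rows):
    """Gold: business-level aggregate, here total revenue per customer."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
```

Note that `bronze` is never modified: Silver and Gold are derived from it, so either can be rebuilt from the raw records if a cleansing rule changes.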
Why it matters
- Auditability: the Bronze layer preserves the raw source of truth for replay and debugging.
- Incremental quality: each hop to the next tier improves data quality, while the earlier tiers keep the history.
- Decoupling: downstream consumers query Gold without coupling to upstream raw formats.
Typical implementation
In a typical implementation, Apache Airflow DAGs orchestrate Spark or dbt jobs that transform data from one tier to the next. Each job writes its output to Delta Lake or Apache Iceberg tables held in object storage (S3, MinIO, GCS).
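The dependency ordering that an orchestrator enforces between tiers can be sketched with the Python standard library alone. This is not Airflow code: `graphlib.TopologicalSorter` stands in for the DAG scheduler, and the task names and placeholder task bodies are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

run_log = []  # records the order in which tasks actually ran


# Placeholder tasks; in practice each would launch a Spark or dbt job.
def ingest_bronze():
    run_log.append("ingest_bronze")


def build_silver():
    run_log.append("build_silver")


def build_gold():
    run_log.append("build_gold")


tasks = {
    "ingest_bronze": ingest_bronze,
    "build_silver": build_silver,
    "build_gold": build_gold,
}

# Each task maps to the set of tasks that must finish before it starts.
dependencies = {
    "ingest_bronze": set(),
    "build_silver": {"ingest_bronze"},
    "build_gold": {"build_silver"},
}


def run_pipeline():
    """Run every task once, respecting the tier dependencies."""
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()


run_pipeline()
```

Declaring the dependencies as data, rather than calling the functions in a fixed sequence, mirrors how an orchestrator lets new Silver or Gold jobs be added without rewriting the existing run order.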
See also
Data Lakehouse, Apache Spark, dbt