A layered data design pattern that progressively refines raw data into clean, business-ready insights — moving through Bronze, Silver, and Gold layers.
Design Principles
Medallion architecture is built on a set of practical engineering principles that reflect real-world data team needs:
- Decoupling — Separating data into layers maintains flexibility. Changes in a source system don’t cascade into your reporting layer.
- Operational reporting & analytics — Application teams often need near-real-time insights from operational data, which can in turn be used to improve the operational systems themselves.
- Historical retention — Teams need to keep data for years — for trend analysis, quality improvement, and the ability to reload or reprocess when pipelines change.
- Integration — Business questions rarely live within a single system. Combining data from multiple sources into integrated perspectives is a core reason the architecture exists.
- Duplication for performance — Different teams have different read patterns and performance requirements. Duplicating data across layers (e.g. a summarised Gold table alongside a detailed Silver one) is intentional, not wasteful.
Why layer data at all?
The principle mirrors classic data warehouse thinking (Inmon, Kimball, Data Vault). Medallion gives those separation-of-concerns ideas a modern, lakehouse-native home. Teams need historical data for auditing and reloading, integrated views for analytics, and performance-optimized reads for reporting — often simultaneously. Layering makes each concern independently manageable.
The Three Layers
🟤 Bronze — Raw, “as-is” The landing zone. Data is ingested as-is, immutable, and stored in original formats (CSV, JSON, Parquet) before being converted to Delta. Enriched with metadata like ingestion timestamps and data source labels. Supports both batch and streaming, and is valuable for debugging and historic reloads.
⚪ Silver — Cleaned & filtered Data quality rules are applied here: fixing dates, removing special characters, correcting formatting errors. Historization is typically handled with SCD2. The structure remains source-oriented but is queryable and used by operational analytics teams. Columns are renamed for readability and lightweight conformations to enterprise standards are applied.
🟡 Gold — Refined business-level Data is harmonized and integrated across sources, typically modelled using the Kimball dimensional style (facts and dimensions). Complex business rules, calculations, and enrichments live here. The output consists of consumer-ready data products with backwards compatibility and data contracts for downstream users.

Practical Example: Unified Ad ROI Analysis
The business question: A growing e-commerce company wants to understand the impact of marketing spend across channels. The project: a unified ROI analysis for multi-channel ad campaigns.
Data sources: Facebook Ads and Google Ads.
Reporting requirements: Cost, impressions, clicks — broken down by ad, campaign, attribution (1-day window), and demographics (age, gender, device, platform).

Bronze: Load everything
Facebook Insights exposes 134 fields plus additional parameter combinations. Google Ads exposes 329 fields. The recommendation is simple: load everything the APIs return.
- Downside: More storage used — but storage is cheap.
- Upside: You avoid the classic “This report is great! Can you just add this one more field?” conversation.
Transformations at this stage are minimal: remove corrupt or unusable records, convert to Delta format, enrich with metadata (data source, ingestion timestamp), and partition by date only if data volume becomes a bottleneck.
Silver: Three options for modelling
This is where the real design decisions happen. Facebook and Google use different field names for the same concepts — cost_micros vs spend, ad_group vs adset, segments.age_range vs age — and some fields simply don’t exist on both platforms.
Option 1 — Two separate source tables Keep a Facebook silver table and a Google silver table, each cleaned and renamed to a common naming convention. Null fields where a platform has no equivalent. Simple to build, preserves source nuance, but requires joining or unioning later.
Option 2 — Single unified ads table Map both sources into one shared table at Silver. Simpler for downstream consumers and avoids repeated union logic. Recommended when platform-specific nuances aren’t needed in reporting. Important: document the mapping decisions clearly — adset and ad_group are not identical concepts.
Option 3 — Data Vault Model entities as Hubs (Hub_Ad, Hub_Campaign, Hub_Ad_Group), relationships as Links (Link_Ad_To_Campaign), and metrics as Satellites (Sat_Ad). More modular and better at tracking changes to relationships over time. Best suited for complex, long-lived enterprise data platforms.
- Pro: Highly flexible, source-agnostic, great for auditability.
- Con: More complex to build and query (more joins).
Gold: Two options for serving
Option A — Dimensional model (Kimball) Build a Fact_Ad_Performance table joined to Dim_Campaign, Dim_Ad_Group, Dim_Ad, and Dim_Date. Clean, well-understood by BI tools, and easy to slice. This is the most common approach.
Option B — One Big Table (OBT) Denormalise everything into a single wide table. Easier for analysts and self-service reporting — no joins needed. Works well when the use case is stable and the audience isn’t doing complex multi-fact analysis.
The linking logic (unioning Facebook and Google silver tables into a single Gold fact) can be done as a separate step — which is recommended — or handled directly in joins and unions within the Gold notebook.

Challenges
Data Governance — Storing everything in Bronze can become a GDPR problem. Access rights need clear policies, especially when silver tables contain PII from multiple source systems.
The Human Factor — Medallion is a naming framework, not a rulebook. Without documented team agreements on what belongs in each layer, the architecture degrades into inconsistency and shadow pipelines.
Key Takeaways
- Load everything to Bronze — storage is cheap, but missing a field later is expensive.
- Silver should clean and conform; Gold should model and integrate — resist merging these concerns.
- Data modeling choices at Silver (flat tables vs. Data Vault) have long-lasting downstream consequences.
- Document your layer contracts. Consistency across teams depends on written, shared agreements.
Medallion architecture looks simple on a diagram but the real design decisions — especially at the Silver layer — can have long-lasting consequences. If you want to think through the right approach for your situation, feel free to book a meeting.
ABOUT THE AUTHORS
Pieter Jansen
Managing Partner Estonia
Discover more content from our blog

Medallion Architecture
A layered data design pattern that progressively refines raw data into clean, business-ready insights — moving through Bronze, Silver, and Gold layers.

Budgets belong in your data platform now
Planning in Fabric IQ is Microsoft’s newest capability — and it signals a genuine shift in how organisations can think about the relationship between analytics and planning.

Sparkle is now part of the Estonian Microsoft Fabric user group
The Estonian Microsoft Fabric User Group is a space where people come together to tackle real-world data engineering and architecture challenges using Microsoft Fabric.

Customer case: Nabuminds migrates to Microsoft Fabric platform
The move from fragmented, tenant-based BI to a unified platform delivered faster insights, stronger governance, and future-ready analytics capabilities.

Connecting the Dots: Neo4j Graph Intelligence Inside Microsoft Fabric
Microsoft Fabric unifies data engineering, analytics, and AI across the Microsoft ecosystem.
But many business problems like fraud detection, customer 360, supply-chain optimization, … depend not on single data points, but on how those points connect.

Microsoft Fabric Ideation Session
The Microsoft Fabric Ideation Session helps organizations move from curiosity to clarity. In just three hours, our experts guide your team through Fabric’s AI and analytics capabilities to identify high-impact use cases, connect business goals with technology, and build a practical roadmap for value-driven transformation.



