Faster Insight, Better Reuse, and Scalable Data Foundations
Healthcare organisations face growing demand for better use of data: improving operational performance, supporting population health management, enabling AI, and accelerating research. Yet many still rely on fragmented pipelines, duplicated transformations, and slow bespoke data requests.
At the same time, the economics of technology have changed. Modern cloud platforms now provide highly durable, scalable storage at costs that make retaining large volumes of raw data practical and economical. This has enabled a shift in data architecture design: rather than transforming data before storage, organisations can preserve source data first and refine it iteratively over time.
The Medallion Architecture, popularised by Databricks, provides a practical model for this approach. Data progresses through three logical layers:
- Bronze – raw source data retained with provenance
- Silver – cleansed, linked, standardised, reusable data assets
- Gold – trusted datasets optimised for operational, analytical, or research use
For healthcare, this model offers substantial advantages. Once common cleansing, linkage, terminology mapping, and quality controls are established in Silver, Gold datasets for specific use cases can be produced rapidly and repeatedly. This shortens time to insight, reduces duplicated engineering effort, and strengthens governance consistency.
Why Data Architecture Must Change
Traditional healthcare data estates often evolved around individual reporting needs, local applications, or one-off integrations. Common consequences include:
- Repeated extraction of the same source data
- Multiple inconsistent versions of metrics
- Long lead times for new data requests
- Limited ability to reuse previous work
- Poor lineage and provenance
- High cost of maintaining bespoke pipelines
Historically, these designs reflected technical constraints. Storage was expensive, compute was fixed, and systems favoured structured schemas.
Modern cloud platforms changed those assumptions:
- Cheap, scalable storage
- Elastic compute on demand
- Native support for structured and semi-structured data
- Streaming ingestion
- Separation of storage and compute resources
This creates the opportunity for a more durable and reusable model.
From ETL to ELT
Traditional warehouses commonly used ETL (Extract, Transform, Load):
- Extract from source systems
- Transform externally
- Load final shaped data
Modern platforms increasingly favour ELT (Extract, Load, Transform):
- Extract source data
- Load raw data quickly into platform storage
- Transform using in-platform compute
ELT is better suited to healthcare because it supports:
- Large data volumes
- Frequent source system changes
- Streaming feeds
- Reprocessing when business rules change
- Multiple downstream outputs from one source ingestion
- Preservation of source fidelity for audit and replay
Instead of deciding once what data should become, organisations can decide many times.
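The load-first pattern can be illustrated with a minimal sketch. Everything here is hypothetical: the in-memory list stands in for platform storage, and the field names (`id`, `ward`, `los_days`) are invented for illustration. The point is only that raw payloads are stored once, untouched, and more than one transformation is run over them later.

```python
import json

# Stand-in for platform storage; a real platform would use object storage or tables.
raw_store = []

def load_raw(message: str) -> None:
    """Load step: append the source payload exactly as received."""
    raw_store.append(message)

def transform_admissions_by_ward(store):
    """First transformation: count admissions per ward."""
    counts = {}
    for payload in store:
        record = json.loads(payload)
        counts[record["ward"]] = counts.get(record["ward"], 0) + 1
    return counts

def transform_long_stays(store, min_days):
    """Second transformation, added later, over the same raw data."""
    records = (json.loads(p) for p in store)
    return [r["id"] for r in records if r["los_days"] >= min_days]

# Hypothetical source messages.
for msg in [
    '{"id": "a1", "ward": "W1", "los_days": 2}',
    '{"id": "a2", "ward": "W2", "los_days": 9}',
    '{"id": "a3", "ward": "W1", "los_days": 12}',
]:
    load_raw(msg)

by_ward = transform_admissions_by_ward(raw_store)
long_stays = transform_long_stays(raw_store, min_days=7)
```

Neither transformation alters `raw_store`, so a third question can be answered later without going back to the source system.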
What Is a Medallion Architecture?
The Medallion model organises data into progressive layers of trust and usability.
Bronze Layer – Raw Data
The Bronze layer stores source data substantially as received.
Examples:
- EPR extracts
- Laboratory feeds
- Claims data
- HL7 / FHIR messages
- Device telemetry
- Scheduling events
- Legacy flat files
Key principles:
- Preserve original records
- Maintain timestamps and provenance
- Support replay and reprocessing
- Avoid premature data loss
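As a rough sketch of these principles, the following shows one way a Bronze record might carry provenance alongside the untouched payload. The class names, field names, and source system labels are assumptions made for illustration, and the in-memory list stands in for a real append-only table.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class BronzeRecord:
    payload: bytes          # source bytes, stored exactly as received
    source_system: str      # provenance: where the record came from
    received_at: datetime   # provenance: when it arrived (UTC)
    sha256: str             # content hash, to detect later alteration

class BronzeStore:
    """Append-only, in-memory stand-in for a Bronze table."""

    def __init__(self):
        self._records = []

    def ingest(self, payload: bytes, source_system: str) -> BronzeRecord:
        record = BronzeRecord(
            payload=payload,
            source_system=source_system,
            received_at=datetime.now(timezone.utc),
            sha256=hashlib.sha256(payload).hexdigest(),
        )
        self._records.append(record)
        return record

    def replay(self, source_system: str):
        """Yield stored payloads in arrival order, for reprocessing."""
        for record in self._records:
            if record.source_system == source_system:
                yield record.payload

store = BronzeStore()
first = store.ingest(b'{"test": "Hb", "value": 132}', source_system="lab_feed")
store.ingest(b'{"slot": "2024-03-01T09:00"}', source_system="scheduling")
store.ingest(b'{"test": "Na", "value": 140}', source_system="lab_feed")
lab_payloads = list(store.replay("lab_feed"))
```

The store exposes no update or delete method, which is one simple way to make "preserve original records" a property of the design rather than a convention.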
Silver Layer – Cleaned and Reusable Data
Silver is where data becomes enterprise-grade.
Typical processing:
- Validation and schema checks
- Deduplication
- Standardisation
- Identity linkage / pseudonymisation
- Reference data enrichment
- Coding alignment to SNOMED CT, ICD-10, OPCS-4
- Conformance to models such as OMOP
This layer creates reusable assets rather than one-off outputs.
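A few of these steps can be sketched in plain Python. This is an illustrative outline, not a reference implementation: the field names and the required-field list are assumptions, the codes are sample values only, and the keyed hash (HMAC-SHA256) is one common pseudonymisation technique chosen here for illustration. The key shown is a placeholder; in practice it would be held in a secrets manager.

```python
import hashlib
import hmac

# Placeholder only: a real key would be stored and rotated in a secrets manager.
SECRET_KEY = b"demo-key-not-for-production"

REQUIRED_FIELDS = {"patient_id", "event_id", "code"}

def pseudonymise(patient_id: str) -> str:
    """Keyed hash, so the same patient always maps to the same token."""
    return hmac.new(SECRET_KEY, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()

def to_silver(bronze_rows):
    """Validate, deduplicate, standardise, and pseudonymise Bronze rows."""
    silver, rejected, seen = [], [], set()
    for row in bronze_rows:
        # Validation: reject rows with missing required fields.
        if not REQUIRED_FIELDS.issubset(row) or not row["patient_id"]:
            rejected.append(row)
            continue
        # Deduplication on the source event identifier.
        if row["event_id"] in seen:
            continue
        seen.add(row["event_id"])
        silver.append({
            "patient_token": pseudonymise(row["patient_id"].strip()),
            "event_id": row["event_id"],
            # Standardisation: trim whitespace and upper-case the code.
            "code": row["code"].strip().upper(),
        })
    return silver, rejected

rows = [
    {"patient_id": "P001", "event_id": "e1", "code": " i10 "},
    {"patient_id": "P001", "event_id": "e1", "code": " i10 "},  # duplicate event
    {"patient_id": "P002", "event_id": "e2", "code": "E11"},
    {"event_id": "e3", "code": "J45"},                          # missing patient_id
]
silver, rejected = to_silver(rows)
```

Rejected rows are returned rather than silently dropped, so data quality issues remain visible and can be fed back to source system owners.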
Gold Layer – Consumption Ready Data
Gold contains products shaped for specific users.
Examples:
- Board performance dashboards
- Waiting list metrics
- Population health segmentation
- Clinical pathway analytics
- Service planning marts
- Research cohorts
- AI feature datasets
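To make the idea concrete, here is a small sketch of a waiting list metric derived from a Silver-style referral table. The table shape, field names, and dates are invented for illustration, and the metric definition (median days waiting as of a given date) is an assumption rather than an official measure.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical Silver rows: already cleansed and standardised.
silver_referrals = [
    {"specialty": "Cardiology", "referred_on": date(2024, 1, 1)},
    {"specialty": "Cardiology", "referred_on": date(2024, 2, 1)},
    {"specialty": "Oncology", "referred_on": date(2024, 2, 15)},
]

def waiting_list_gold(referrals, as_of):
    """Aggregate open referrals into one row per specialty."""
    waits = defaultdict(list)
    for referral in referrals:
        waits[referral["specialty"]].append((as_of - referral["referred_on"]).days)
    return [
        {
            "specialty": specialty,
            "patients_waiting": len(days),
            "median_wait_days": median(days),
        }
        for specialty, days in sorted(waits.items())
    ]

gold = waiting_list_gold(silver_referrals, as_of=date(2024, 3, 1))
```

Because the cleansing and standardisation already happened in Silver, the Gold step is a short aggregation that can be rerun for any reporting date.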
Why This Is Powerful in Healthcare
Healthcare repeatedly asks new questions of old data.
Examples:
- Which cohorts are at highest risk this winter?
- What predicts delayed discharge?
- How equitable is service access?
- Which interventions improved outcomes?
- Can a research cohort be assembled safely for oncology?
Without a layered architecture, each request may restart engineering effort.
With Medallion:
- Bronze already holds historical source data
- Silver already contains reusable linked and standardised assets
- Gold can be generated quickly for the new question
Only the final Gold step is new work, so delivery time for each new question falls sharply.
Population Health and Research Use Cases
Population Health
Gold datasets can support wide-scale analysis across:
- Long-term conditions
- Prevention opportunities
- Health inequalities
- Demand forecasting
- Primary / secondary care utilisation
- Place-based planning
Research and Innovation
Gold datasets can support approved linkable cohorts for:
- Oncology outcomes
- Cardiovascular studies
- Rare disease analysis
- Genomics-enabled studies
- Medicines safety
- Pathway redesign evaluation
Because much of the hard work is already completed in Silver, research mobilisation becomes faster and more repeatable.
Governance by Design
In healthcare, speed without trust fails.
A Medallion Architecture should embed:
- Data minimisation
- Role-based access control
- Pseudonymisation
- Full lineage
- Audit trails
- Approval workflows
- Safe outputs controls
- Policy-as-code enforcement where possible
This allows Gold datasets to be generated quickly without weakening governance.
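Two of these controls, role-based access and audit trails, can be sketched as policy-as-code. The roles, the layer permissions, and the function name below are hypothetical; a real platform would normally enforce this through its own access control features rather than application code.

```python
from datetime import datetime, timezone

# Hypothetical policy: which roles may read which layer.
POLICY = {
    "bronze": {"data_engineer"},
    "silver": {"data_engineer", "analyst"},
    "gold": {"data_engineer", "analyst", "researcher"},
}

audit_log = []

def can_read(role: str, layer: str) -> bool:
    """Check the policy and record every decision, allowed or not."""
    allowed = role in POLICY.get(layer, set())
    audit_log.append({
        "at": datetime.now(timezone.utc).isoformat(),
        "role": role,
        "layer": layer,
        "allowed": allowed,
    })
    return allowed

researcher_gold = can_read("researcher", "gold")      # permitted by the policy
researcher_bronze = can_read("researcher", "bronze")  # denied by the policy
```

Keeping the policy as data means it can be version-controlled and reviewed like any other change, and unknown layers are denied by default.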
Strategic Benefits
For executives, the model delivers:
Faster Time to Insight
Requests that once took weeks can be delivered in days, and some that took days can be delivered in hours.
Lower Total Cost
Reuse common transformation logic rather than rebuilding pipelines, and pseudonymise once at ingestion rather than in every downstream pipeline.
Better Quality
Shared standards reduce conflicting numbers.
Stronger Governance
Every Gold output can be traced back to its Bronze source, along with every transformation applied along the way.
AI Readiness
Curated Silver and Gold layers support analytics and machine learning. Pseudonymised Bronze data can also be used by privately hosted LLMs.
Future Flexibility
New use cases can be served from existing foundations.
Common Mistakes to Avoid
- Treating Bronze as a dumping ground with no metadata, ownership, or history
- Creating too many bespoke Gold datasets with no ownership
- Skipping data quality controls in Silver
- Ignoring governance until the end
- Designing around tools rather than the operating model
- Failing to assign product owners to Gold assets
Architecture alone does not solve accountability.
Recommended Healthcare Implementation Approach
Phase 1 – Foundation
Prioritise ingestion, metadata, lineage, and identity controls.
Phase 2 – Silver Core Assets
Create reusable patient, encounter, provider, activity, coding, and geography models.
Phase 3 – Gold Priority Products
Deliver highest-value dashboards and research datasets first.
Phase 4 – Scale and Automate
Add self-service, policy automation, data product ownership, and advanced analytics.
Conclusion
The Medallion Architecture reflects a modern reality: storage is affordable, compute is elastic, and healthcare needs faster answers from increasingly complex data.
By retaining raw data, building trusted reusable Silver assets, and rapidly generating Gold products, healthcare organisations can move from fragmented reporting toward an industrialised data capability.
For systems seeking better operational performance, stronger population health insight, and faster research enablement, Medallion Architecture is not just a technical option; it is a strategic operating model for data.