Tag Archives: UK FS SCD2 Bronze

The “Land It Early, Manage It Early” Series

This series sets out a practical doctrine for building data platforms in highly regulated Financial Services environments. Its core premise is simple: firms must be able to reconstruct what was known, when it was known, and why decisions were made — using preserved evidence, not hindsight.

The articles develop an architectural pattern that treats temporal fidelity as foundational. Historical state is captured early at ingestion, governed before consumption, and only then transformed for analytics, reporting, and AI. Slowly Changing Dimension Type 2 (SCD2) is used as enabling infrastructure rather than a modelling afterthought, allowing platforms to scale while remaining regulator-defensible.

Across the series, this approach is applied to real operational concerns: ingestion, identity, multi-source precedence, point-in-time reconstruction, consumption layers, AI integration, cost, security, and operating models. Taken together, the work describes what “land it early, manage it early” looks like in practice for modern regulated data platforms.

Operationalising SCD2 at Scale: Monitoring, Cost Controls, and Governance for a Healthy Bronze Layer

This article explains how to operationalise Slowly Changing Dimension Type 2 (SCD2) at scale in the Bronze layer of a medallion architecture, with a focus on highly regulated Financial Services environments. It outlines three critical pillars: monitoring, cost control, and governance, needed to keep historical data trustworthy, performant, and compliant. By tracking growth patterns, preventing meaningless updates, controlling storage and compute costs, and enforcing clear governance, organisations can ensure their Bronze layer remains a reliable audit-grade historical asset rather than an unmanaged data swamp.

Continue reading

Advanced SCD2 Optimisation Techniques for Mature Data Platforms

Advanced SCD2 optimisation techniques are essential for mature Financial Services data platforms, where historical accuracy, regulatory traceability, and scale demands exceed the limits of basic SCD2 patterns. Attribute-level SCD2 significantly reduces storage and computation by tracking changes per column rather than per row. Hybrid SCD2 pipelines, combining lightweight delta logs with periodic MERGEs into the main Bronze table, minimise write amplification and improve reliability. Hash-based and probabilistic change detection eliminate unnecessary updates and accelerate temporal comparison at scale. Together, these techniques enable high-performance, audit-grade SCD2 in platforms such as Databricks, Snowflake, BigQuery, Iceberg, and Hudi, supporting the long-term data lineage and reconstruction needs of regulated UK Financial Services institutions.

Continue reading

Using SCD2 in the Bronze Layer with a Non-SCD2 Silver Layer: A Modern Data Architecture Pattern for UK Financial Services

UK Financial Services firms increasingly implement SCD2 history in the Bronze layer while providing simplified, non-SCD2 current-state views in the Silver layer. This pattern preserves full historical auditability for FCA/PRA compliance and regulatory forensics, while delivering cleaner, faster, easier-to-use datasets for analytics, BI, and data science. It separates “truth” from “insight,” improves governance, supports Data Mesh models, reduces duplicated logic, and enables deterministic rebuilds across the lakehouse. In regulated UK Financial Services today, it is the only pattern I have seen that satisfies the full, real-world constraint set with no material trade-offs.

Continue reading

WTF Is SCD? A Practical Guide to Slowly Changing Dimensions

Slowly Changing Dimensions (SCDs) are how data systems manage attributes that evolve without constantly rewriting history. They determine whether you keep only the latest value, preserve full historical versions, or maintain a limited snapshot of changes. The classic SCD types (0–3, plus hybrids) define different behaviours… from never updating values, to overwriting them, to keeping every version with timestamps. The real purpose of SCDs is to make an explicit choice about how truth should behave in your analytics: what should remain fixed, what should update, and what historical context matters. Modern data platforms make tracking changes easy, but they don’t make the design decisions for you. SCDs are ultimately the backbone of reliable, temporal, reality-preserving analytics.

Continue reading