Tag Archives: Data Architecture

Using SCD2 in the Bronze Layer with a Non-SCD2 Silver Layer: A Modern Data Architecture Pattern for UK Financial Services

UK Financial Services firms increasingly implement SCD2 history in the Bronze layer while providing simplified, non-SCD2 current-state views in the Silver layer. This pattern preserves full historical auditability for FCA/PRA compliance and regulatory forensics, while delivering cleaner, faster, easier-to-use datasets for analytics, BI, and data science. It separates “truth” from “insight,” improves governance, supports Data Mesh models, reduces duplicated logic, and enables deterministic rebuilds across the lakehouse. In regulated UK Financial Services today, it is the only pattern I have seen that satisfies the full, real-world constraint set with no material trade-offs.

Continue reading

WTF Is SCD? A Practical Guide to Slowly Changing Dimensions

Slowly Changing Dimensions (SCDs) are how data systems manage attributes that evolve without constantly rewriting history. They determine whether you keep only the latest value, preserve full historical versions, or maintain a limited snapshot of changes. The classic SCD types (0–3, plus hybrids) define different behaviours… from never updating values, to overwriting them, to keeping every version with timestamps. The real purpose of SCDs is to make an explicit choice about how truth should behave in your analytics: what should remain fixed, what should update, and what historical context matters. Modern data platforms make tracking changes easy, but they don’t make the design decisions for you. SCDs are ultimately the backbone of reliable, temporal, reality-preserving analytics.

Continue reading

MapReduce: A 20-Year Retrospective on How Jeffrey Dean and Sanjay Ghemawat Revolutionised Data Processing

This article provides a retrospective on the 20th anniversary of Jeffrey Dean and Sanjay Ghemawat’s seminal paper, “MapReduce: Simplified Data Processing on Large Clusters”. It explores the paper’s lasting impact on data processing, its influence on the development of big data technologies like Hadoop, and its broader implications for industries ranging from digital advertising to healthcare. The article also looks ahead to future trends in data processing, including stream processing and AI, emphasising how MapReduce’s principles will continue to shape the future of distributed computing.

Continue reading