Tag Archives: slowly changing dimensions

Managing a Rapidly Growing SCD2 Bronze Layer on Snowflake: Best Practices and Architectural Guidance

Slowly Changing Dimension Type 2 (SCD2) patterns are widely used in Snowflake-based Financial Services platforms to preserve full historical change for regulatory, analytical, and audit purposes. However, Snowflake’s architecture differs fundamentally from file-oriented lakehouse systems, requiring distinct design and operational choices. This article provides practical, production-focused guidance for operating large-scale SCD2 Bronze layers on Snowflake. It explains how to use Streams, Tasks, micro-partition behaviour, batching strategies, and cost-aware configuration to ensure predictable performance, controlled spend, and long-term readiness for analytics and AI workloads in regulated environments.

Continue reading

Managing a Rapidly Growing SCD2 Bronze Layer on Databricks: Best Practices and Practical Guidance ready for AI Workloads

Slowly Changing Dimension Type 2 (SCD2) patterns are increasingly used in the Bronze layer of Databricks-based platforms to meet regulatory, analytical, and historical data requirements in Financial Services. However, SCD2 Bronze tables grow rapidly and can become costly, slow, and operationally fragile if not engineered carefully. This article provides practical, production-tested guidance for managing large-scale SCD2 Bronze layers on Databricks using Delta Lake. It focuses on performance, cost control, metadata health, and long-term readiness for analytics and AI workloads in regulated environments.

Continue reading

Event-Driven CDC to Correct SCD2 Bronze in 2025–2026

Broken history often stays hidden until remediation or skilled-person reviews. Why? Event-driven Change Data Capture fundamentally changes how history behaves in a data platform. When Financial Services organisations move from batch ingestion to streaming CDC, long-standing SCD2 assumptions quietly break — often without immediate symptoms. Late, duplicated, partial, or out-of-order events can silently corrupt Bronze history and undermine regulatory confidence. This article sets out what “correct” SCD2 means in a streaming world, why most implementations fail, and how to design Bronze pipelines that remain temporally accurate, replayable, and defensible under PRA/FCA scrutiny in 2025–2026.

Continue reading

WTF Is SCD? A Practical Guide to Slowly Changing Dimensions

Slowly Changing Dimensions (SCDs) are how data systems manage attributes that evolve without constantly rewriting history. They determine whether you keep only the latest value, preserve full historical versions, or maintain a limited snapshot of changes. The classic SCD types (0–3, plus hybrids) define different behaviours… from never updating values, to overwriting them, to keeping every version with timestamps. The real purpose of SCDs is to make an explicit choice about how truth should behave in your analytics: what should remain fixed, what should update, and what historical context matters. Modern data platforms make tracking changes easy, but they don’t make the design decisions for you. SCDs are ultimately the backbone of reliable, temporal, reality-preserving analytics.

Continue reading