Tag Archives: SCD2

Series Wrap-Up: Reconstructing Time, Truth, and Trust in UK Financial Services Data Platforms

This series explored how UK Financial Services data platforms can preserve temporal truth, reconstruct institutional belief, and withstand regulatory scrutiny at scale. Beginning with foundational concepts such as SCD2 and event modelling, it developed into a comprehensive architectural pattern centred on an audit-grade Bronze layer, non-SCD Silver consumption, and point-in-time defensibility. Along the way, it addressed operational reality, governance, cost, AI integration, and regulatory expectations. This final article brings the work together, offering a structured map of the series and a coherent lens for understanding how modern, regulated data platforms actually succeed. Taken together, this body of work describes what I refer to as a “land it early, manage it early” data platform architecture for regulated industries.

Continue reading →

The 2026 UK Financial Services Lakehouse Reference Architecture

An opinionated but practical blueprint for regulated, temporal, multi-domain data platforms: focused on authority, belief, and point-in-time defensibility. This article lays out a reference architecture for UK FS in 2026: not as a rigid blueprint, but as a description of what “good” now looks like in banks, insurers, payments firms, wealth platforms, and capital markets organisations operating under FCA/PRA supervision.

Continue reading →

Why Bronze-Level Temporal Fidelity Obsoletes Traditional Data Lineage Tools in Regulated Platforms

This article argues that in regulated financial services, true data lineage cannot be retrofitted through catalogues or metadata overlays. Regulators require temporal lineage: proof of what was known, when it was known, and how it changed. By preserving audit-grade temporal truth at the Bronze layer, lineage becomes an inherent property of the data rather than a post-hoc reconstruction. The article explains why traditional lineage tools often create false confidence and why temporal fidelity is the only regulator-defensible foundation for lineage.

Continue reading →

From Build to Run Without Losing Temporal Truth: Operating Model Realities for Regulated Financial Services Data Platforms

This article explores why most regulated data platforms fail operationally rather than technically. It argues that the operating model is the mechanism by which architectural intent survives change, pressure, and organisational churn. Focusing on invariants, authority, correction workflows, and accountability, it shows how platforms must be designed to operate safely under stress, not just in steady state. The piece bridges architecture and real-world execution, ensuring temporal truth and regulatory trust persist long after delivery.

Continue reading →

Cost Is a Control: FinOps and Cost Management in Regulated Financial Services Data Platforms

This article positions cost management as a first-class architectural control rather than a post-hoc optimisation exercise. In regulated environments, cost decisions directly constrain temporal truth, optionality, velocity, and compliance. The article explains why FinOps must prioritise predictability, authority, and value alignment over minimisation, and how poorly designed cost pressure undermines regulatory defensibility. By linking cost to long-term value creation and regulatory outcomes, it provides a principled framework for sustaining compliant, scalable data platforms.

Continue reading →

From Threat Model to Regulator Narrative: Security Architecture for Regulated Financial Services Data Platforms

This article reframes security as an architectural property of regulated financial services data platforms, not a bolt-on set of controls. It argues that true security lies in preserving temporal truth, enforcing authority over data, and enabling defensible reconstruction of decisions under scrutiny. By grounding security in threat models, data semantics, SCD2 foundations, and regulator-facing narratives, the article shows how platforms can prevent silent history rewriting, govern AI safely, and treat auditability as a first-class security requirement.

Continue reading →

Why Transactions Are Events, Not Slowly Changing Dimensions

This article argues that modelling transactions as slowly changing dimensions is a fundamental category error in financial data platforms. Transactions are immutable events that occur once and do not change; what evolves is the organisation’s interpretation of them through enrichment, classification, and belief updates. Applying SCD2 logic to transactions conflates fact with interpretation, corrupts history, and undermines regulatory defensibility. By separating immutable event records from mutable interpretations, platforms become clearer, auditable, and capable of reconstructing past decisions without rewriting reality.

Continue reading →

Why UK Financial Services Data Platforms Must Preserve Temporal Truth for Regulatory Compliance

A Regulatory Perspective (2025–2026). UK Financial Services regulation in 2025–2026 increasingly requires firms to demonstrate not just what is true today, but what was known at the time decisions were made. Across Consumer Duty, s166 reviews, AML/KYC, model risk, and operational resilience, regulators expect deterministic reconstruction of historical belief, supported by traceable evidence. This article explains where that requirement comes from, why traditional current-state platforms fail under scrutiny, and why preserving temporal truth inevitably drives architectures that capture change over time as a foundational control, not a technical preference.

Continue reading →

Common Anti-Patterns in Financial Services Data Platforms

Financial Services data platforms rarely fail because of tools, scale, or performance. They fail because architectural decisions are left implicit, applied inconsistently, or overridden under pressure. This article documents the most common and damaging failure modes observed in large-scale FS data platforms: not as edge cases, but as predictable outcomes of well-intentioned instincts applied at the wrong layer. Each pattern shows how trust erodes quietly over time, often remaining invisible until audit, remediation, or regulatory scrutiny exposes the underlying architectural fault lines.

Continue reading →

Operationalising Time, Consistency, and Freshness in a Financial Services Data Platform

This article translates the temporal doctrine established in Time, Consistency, and Freshness in a Financial Services Data Platform into enforceable architectural mechanisms. It focuses not on tools or technologies, but on the structural controls required to make time, consistency, and freshness unavoidable properties of a Financial Services (FS) data platform. The objective is simple: ensure that temporal correctness does not depend on developer discipline, operational goodwill, or institutional memory, but is instead enforced mechanically by the platform itself.

Continue reading →

Databricks vs Snowflake vs Fabric vs Other Tech with SCD2 Bronze: Choosing the Right Operating Model

Choosing the right platform for implementing SCD2 in the Bronze layer is not a tooling decision but an operating model decision. At scale, SCD2 Bronze forces trade-offs around change capture, merge frequency, physical layout, cost governance, and long-term analytics readiness. Different platforms optimise for different assumptions about who owns those trade-offs. This article compares Databricks, Snowflake, Microsoft Fabric, and alternative technologies through that lens, with practical guidance for Financial Services organisations designing SCD2 Bronze layers that must remain scalable, auditable, and cost-effective over time.

Continue reading →

From Partitioning to Liquid Clustering: Evolving SCD2 Bronze on Databricks at Scale

As SCD2 Bronze layers mature, even well-designed partitioning and ZORDER strategies can struggle under extreme scale, high-cardinality business keys, and evolving access patterns. This article examines why SCD2 Bronze datasets place unique pressure on static data layouts and introduces Databricks Liquid Clustering as a natural next step in their operational evolution. It explains when Liquid Clustering becomes appropriate, how it fits within regulated Financial Services environments, and how it preserves auditability while improving long-term performance and readiness for analytics and AI workloads.

Continue reading →

Probabilistic & Graph-Based Identity in Regulated Financial Services

This article argues that probabilistic and graph-based identity techniques are unavoidable in regulated Financial Services, but only defensible when tightly governed. Deterministic entity resolution remains the foundation, providing anchors, constraints, and auditability. Probabilistic scores and identity graphs introduce likelihood and network reasoning, not truth, and must be time-bound, versioned, and replayable. When anchored to immutable history, SCD2 discipline, and clear guardrails, these techniques enhance fraud and AML insight; without discipline, they create significant regulatory risk.

Continue reading →

Migrating Legacy EDW Slowly-Changing Dimensions to Lakehouse Bronze

From 20-year-old warehouse SCDs to a modern temporal backbone you can trust. This article lays out a practical, regulator-aware playbook for migrating legacy EDW SCD dimensions to a modern SCD2 Bronze layer in a medallion/lakehouse architecture. It covers what you are really migrating (semantics, not just tables), how to treat the EDW as a source system, how to build canonical SCD2 Bronze, how to run both platforms in parallel, and how to prove to auditors and regulators that nothing has been lost or corrupted in the process.

Continue reading →

Enterprise Point-in-Time (PIT) Reconstruction: The Regulatory Playbook

This article sets out the definitive regulatory playbook for enterprise Point-in-Time (PIT) reconstruction in UK Financial Services. It explains why PIT is now a supervisory expectation: driven by PRA/FCA reviews, Consumer Duty, s166 investigations, AML/KYC forensics, and model risk, and makes a clear distinction between “state as known” and “state as now known”. Covering SCD2 foundations, entity resolution, precedence versioning, multi-domain alignment, temporal repair, and reproducible rebuild patterns, it shows how to construct a deterministic, explainable PIT engine that can withstand audit, replay history reliably, and defend regulatory outcomes with confidence.

Continue reading →

Temporal RAG: Retrieving “State as Known on Date X” for LLMs in Financial Services

This article explains why standard Retrieval-Augmented Generation (RAG) silently corrupts history in Financial Services by answering past questions with present-day truth. It introduces Temporal RAG: a regulator-defensible retrieval pattern that conditions every query on an explicit as_of timestamp and retrieves only from Point-in-Time (PIT) slices governed by SCD2 validity, precedence rules, and repair policies. Using concrete implementation patterns and audit reconstruction examples, it shows how to make LLM retrieval reproducible, evidential, and safe for complaints, remediation, AML, and conduct-risk use cases.

Continue reading →

AI Agents for Temporal Repair, Metadata Enrichment, and Regulatory Automation

This article explains where AI agents genuinely add value in regulated Financial Services data platforms — and where they become indefensible. It distinguishes “useful” agents from regulator-defensible ones, showing how agents can safely detect temporal defects, propose repairs, enrich metadata, and draft regulatory artefacts without bypassing governance. Using practical patterns and real failure modes, it defines a strict operating model — detect, propose, approve, apply — that preserves auditability, replayability, and SMCR accountability while delivering operational leverage that survives PRA/FCA scrutiny.

Continue reading →

Integrating AI and LLMs into Regulated Financial Services Data Platforms

How AI fits into Bronze/Silver/Gold without breaking lineage, PIT, or SMCR: This article sets out a regulator-defensible approach to integrating AI and LLMs into UK Financial Services data platforms (structurally accurate for 2025/2026). It argues that AI must operate as a governed consumer and orchestrator of a temporal medallion architecture, not a parallel system. By defining four permitted integration patterns, PIT-aware RAG, controlled Bronze embeddings, anonymised fine-tuning, and agentic orchestration, it shows how to preserve lineage, point-in-time truth, and SMCR accountability while enabling practical AI use under PRA/FCA scrutiny.

Continue reading →

Foundational Architecture Decisions in a Financial Services Data Platform

This article defines a comprehensive architectural doctrine for modern Financial Services data platforms, separating precursor decisions (what must be true for trust and scale) from foundational decisions (how the platform behaves under regulation, time, and organisational pressure). It explains why ingestion maximalism, streaming-first eventual consistency, transactional processing at the edge, domain-first design, and freshness as a business contract are non-negotiable in FS. Through detailed narrative and explicit anti-patterns, it shows how these decisions preserve optionality, enable regulatory defensibility, support diverse communities, and prevent the systemic failure modes that quietly undermine large-scale financial data platforms.

Continue reading →

Time, Consistency, and Freshness in a Financial Services Data Platform

This article explains why time, consistency, and freshness are first-class architectural concerns in modern Financial Services data platforms. It shows how truth in FS is inherently time-qualified, why event time must be distinguished from processing time, and why eventual consistency is a requirement rather than a compromise. By mapping these concepts directly to Bronze, Silver, Gold, and Platinum layers, the article demonstrates how platforms preserve historical truth, deliver reliable current-state views, and enforce freshness as an explicit business contract rather than an accidental outcome.

Continue reading →