Tag Archives: Data Governance

Series Wrap-Up: Reconstructing Time, Truth, and Trust in UK Financial Services Data Platforms

This series explored how UK Financial Services data platforms can preserve temporal truth, reconstruct institutional belief, and withstand regulatory scrutiny at scale. Beginning with foundational concepts such as SCD2 and event modelling, it developed into a comprehensive architectural pattern centred on an audit-grade Bronze layer, non-SCD Silver consumption, and point-in-time defensibility. Along the way, it addressed operational reality, governance, cost, AI integration, and regulatory expectations. This final article brings the work together, offering a structured map of the series and a coherent lens for understanding how modern, regulated data platforms actually succeed. Taken together, this body of work describes what I refer to as a “land it early, manage it early” data platform architecture for regulated industries.

Continue reading →

The 2026 UK Financial Services Lakehouse Reference Architecture

An opinionated but practical blueprint for regulated, temporal, multi-domain data platforms: focused on authority, belief, and point-in-time defensibility. This article lays out a reference architecture for UK FS in 2026: not as a rigid blueprint, but as a description of what “good” now looks like in banks, insurers, payments firms, wealth platforms, and capital markets organisations operating under FCA/PRA supervision.

Continue reading →

Why Bronze-Level Temporal Fidelity Obsoletes Traditional Data Lineage Tools in Regulated Platforms

This article argues that in regulated financial services, true data lineage cannot be retrofitted through catalogues or metadata overlays. Regulators require temporal lineage: proof of what was known, when it was known, and how it changed. By preserving audit-grade temporal truth at the Bronze layer, lineage becomes an inherent property of the data rather than a post-hoc reconstruction. The article explains why traditional lineage tools often create false confidence and why temporal fidelity is the only regulator-defensible foundation for lineage.

Continue reading →

From Build to Run Without Losing Temporal Truth: Operating Model Realities for Regulated Financial Services Data Platforms

This article explores why most regulated data platforms fail operationally rather than technically. It argues that the operating model is the mechanism by which architectural intent survives change, pressure, and organisational churn. Focusing on invariants, authority, correction workflows, and accountability, it shows how platforms must be designed to operate safely under stress, not just in steady state. The piece bridges architecture and real-world execution, ensuring temporal truth and regulatory trust persist long after delivery.

Continue reading →

Cost Is a Control: FinOps and Cost Management in Regulated Financial Services Data Platforms

This article positions cost management as a first-class architectural control rather than a post-hoc optimisation exercise. In regulated environments, cost decisions directly constrain temporal truth, optionality, velocity, and compliance. The article explains why FinOps must prioritise predictability, authority, and value alignment over minimisation, and how poorly designed cost pressure undermines regulatory defensibility. By linking cost to long-term value creation and regulatory outcomes, it provides a principled framework for sustaining compliant, scalable data platforms.

Continue reading →

Collapsing the Medallion: Layers as Patterns, Not Physical Boundaries

The medallion model was never meant to be a physical storage mandate. It is a pattern language for expressing guarantees about evidence, interpretation, and trust. In mature, regulated platforms, those guarantees increasingly live in contracts, lineage, governance, and tests: not in rigid physical layers. Collapsing the medallion does not weaken regulatory substantiation; it strengthens it by decoupling invariants from layout. This article explains why layers were necessary, why they eventually collapse, and what must never be lost when they do.

Continue reading →

From Writes to Reads: Applying CQRS Thinking to Regulated Data Platforms

In regulated financial environments, data duplication is often treated as a failure rather than a necessity. Command Query Responsibility Segregation (CQRS) is an approach to separate concerns such as reads versus writes. This article reframes duplication through CQRS-style thinking, arguing that separating write models (which execute actions) from read models (which explain outcomes) is essential for both safe operation and regulatory defensibility. By making authority explicit and accepting eventual consistency, institutions can act in real time while reconstructing explainable, auditable belief over time. CQRS is presented not as a framework, but as a mental model for survivable data platforms.

Continue reading →

Edge Systems Are a Feature: Why OLTP, CRM, and Low-Latency Stores Must Exist

Modern data platforms often treat operational systems as legacy constraints to be eliminated. This article argues the opposite. Transactional systems, CRM platforms, and low-latency decision stores exist because some decisions must be made synchronously, locally, and with authority. These “edge systems” are not architectural debt but purpose-built domains of control. A mature data platform does not replace them or centralise authority falsely; it integrates with them honestly, preserving their decisions, context, and evolution over time.

Continue reading →

Authority, Truth, and Belief in Financial Services Data Platforms

Financial services data architectures often fail by asking the wrong question: “Which system is the system of record?” This article argues that regulated firms operate with multiple systems of authority, while truth exists outside systems altogether. What data platforms actually manage is institutional belief: what the firm believed at a given time, based on available evidence. By separating authority, truth, and belief, firms can build architectures that preserve history, explain disagreement, and withstand regulatory scrutiny through accountable, reconstructable decision-making.

Continue reading →

Eventual Consistency in Regulated Financial Services Data Platforms

In regulated financial services, eventual consistency is often treated as a technical weakness to be minimised or hidden. This article argues the opposite: eventual consistency is the only honest and defensible consistency model in a multi-system, regulator-supervised institution. Regulators do not require instantaneous agreement: they require explainability, reconstructability, and reasonableness at the time decisions were made. By treating eventual consistency as an explicit architectural and regulatory contract, firms can bound inconsistency, preserve historical belief, and strengthen audit defensibility rather than undermine it.

Continue reading →

Why UK Financial Services Data Platforms Must Preserve Temporal Truth for Regulatory Compliance

A Regulatory Perspective (2025–2026). UK Financial Services regulation in 2025–2026 increasingly requires firms to demonstrate not just what is true today, but what was known at the time decisions were made. Across Consumer Duty, s166 reviews, AML/KYC, model risk, and operational resilience, regulators expect deterministic reconstruction of historical belief, supported by traceable evidence. This article explains where that requirement comes from, why traditional current-state platforms fail under scrutiny, and why preserving temporal truth inevitably drives architectures that capture change over time as a foundational control, not a technical preference.

Continue reading →

Common Anti-Patterns in Financial Services Data Platforms

Financial Services data platforms rarely fail because of tools, scale, or performance. They fail because architectural decisions are left implicit, applied inconsistently, or overridden under pressure. This article documents the most common and damaging failure modes observed in large-scale FS data platforms: not as edge cases, but as predictable outcomes of well-intentioned instincts applied at the wrong layer. Each pattern shows how trust erodes quietly over time, often remaining invisible until audit, remediation, or regulatory scrutiny exposes the underlying architectural fault lines.

Continue reading →

Operationalising Time, Consistency, and Freshness in a Financial Services Data Platform

This article translates the temporal doctrine established in Time, Consistency, and Freshness in a Financial Services Data Platform into enforceable architectural mechanisms. It focuses not on tools or technologies, but on the structural controls required to make time, consistency, and freshness unavoidable properties of a Financial Services (FS) data platform. The objective is simple: ensure that temporal correctness does not depend on developer discipline, operational goodwill, or institutional memory, but is instead enforced mechanically by the platform itself.

Continue reading →

Databricks vs Snowflake vs Fabric vs Other Tech with SCD2 Bronze: Choosing the Right Operating Model

Choosing the right platform for implementing SCD2 in the Bronze layer is not a tooling decision but an operating model decision. At scale, SCD2 Bronze forces trade-offs around change capture, merge frequency, physical layout, cost governance, and long-term analytics readiness. Different platforms optimise for different assumptions about who owns those trade-offs. This article compares Databricks, Snowflake, Microsoft Fabric, and alternative technologies through that lens, with practical guidance for Financial Services organisations designing SCD2 Bronze layers that must remain scalable, auditable, and cost-effective over time.

Continue reading →

From Partitioning to Liquid Clustering: Evolving SCD2 Bronze on Databricks at Scale

As SCD2 Bronze layers mature, even well-designed partitioning and ZORDER strategies can struggle under extreme scale, high-cardinality business keys, and evolving access patterns. This article examines why SCD2 Bronze datasets place unique pressure on static data layouts and introduces Databricks Liquid Clustering as a natural next step in their operational evolution. It explains when Liquid Clustering becomes appropriate, how it fits within regulated Financial Services environments, and how it preserves auditability while improving long-term performance and readiness for analytics and AI workloads.

Continue reading →

From Graph Insight to Action: Decisions, Controls & Remediation in Financial Services Platforms

This article argues that financial services platforms fail not from lack of insight, but from weak architecture between detection and action. Graph analytics and models generate signals, not decisions. Collapsing the two undermines accountability, auditability, and regulatory defensibility. By separating signals, judgements, and decisions; treating decisions as time-qualified data; governing controls as executable policy; and enabling deterministic replay for remediation, platforms can move from reactive analytics to explainable, defensible action. In regulated environments, what matters is not what was known: but what was decided, when, and why.

Continue reading →

Networks, Relationships & Financial Crime Graphs on the Bronze Layer

Financial crime rarely appears in isolated records; it emerges through networks of entities, relationships, and behaviours over time. This article explains why financial crime graphs must be treated as foundational, temporal structures anchored near the Bronze layer of a regulated data platform. It explores how relationships are inferred, versioned, and governed, why “known then” versus “known now” matters, and how poorly designed graphs undermine regulatory defensibility. Done correctly, crime graphs provide explainable, rebuildable network intelligence that stands up to scrutiny years later.

Continue reading →

Probabilistic & Graph-Based Identity in Regulated Financial Services

This article argues that probabilistic and graph-based identity techniques are unavoidable in regulated Financial Services, but only defensible when tightly governed. Deterministic entity resolution remains the foundation, providing anchors, constraints, and auditability. Probabilistic scores and identity graphs introduce likelihood and network reasoning, not truth, and must be time-bound, versioned, and replayable. When anchored to immutable history, SCD2 discipline, and clear guardrails, these techniques enhance fraud and AML insight; without discipline, they create significant regulatory risk.

Continue reading →

WTF is the Fellegi–Sunter Model? A Practical Guide to Record Matching in an Uncertain World

The Fellegi–Sunter model is the foundational probabilistic framework for record linkage… deciding whether two imperfect records refer to the same real-world entity. Rather than enforcing brittle matching rules, it treats linkage as a problem of weighing evidence under uncertainty. By modelling how fields behave for true matches versus non-matches, it produces interpretable scores and explicit decision thresholds. Despite decades of new tooling and machine learning, most modern matching systems still rest on this logic… often without acknowledging it.

Continue reading →

Aligning the Data Platform to Enterprise Data & AI Strategy

This article establishes the data platform as the execution engine of Enterprise Data & AI Strategy in Financial Services. It bridges executive strategy and technical delivery by showing how layered architecture (Bronze, Silver, Gold, Platinum), embedded governance, dual promotion lifecycles (North/South and East/West), and domain-aligned operating models turn strategic pillars, architecture & quality, governance, security & privacy, process & tools, and people & culture, into repeatable, regulator-ready outcomes. The result is a platform that delivers control, velocity, semantic alignment, and safe AI enablement at scale.

Continue reading →