Tag Archives: Entity Resolution

Probabilistic & Graph-Based Identity in Regulated Financial Services

This article argues that probabilistic and graph-based identity techniques are unavoidable in regulated Financial Services, but only defensible when tightly governed. Deterministic entity resolution remains the foundation, providing anchors, constraints, and auditability. Probabilistic scores and identity graphs introduce likelihood and network reasoning, not truth, and must be time-bound, versioned, and replayable. When anchored to immutable history, SCD2 discipline, and clear guardrails, these techniques enhance fraud and AML insight; without discipline, they create significant regulatory risk.

WTF is the Fellegi–Sunter Model? A Practical Guide to Record Matching in an Uncertain World

The Fellegi–Sunter model is the foundational probabilistic framework for record linkage: deciding whether two imperfect records refer to the same real-world entity. Rather than enforcing brittle matching rules, it treats linkage as a problem of weighing evidence under uncertainty. By modelling how fields behave for true matches versus non-matches, it produces interpretable scores and explicit decision thresholds. Despite decades of new tooling and machine learning, most modern matching systems still rest on this logic, often without acknowledging it.
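
As a rough illustration of the scoring idea summarised above, the sketch below sums per-field log-likelihood ratios and compares the total against two thresholds. It is a minimal sketch, not the article's implementation: the field names, the m and u probabilities (the chance a field agrees for true matches and for non-matches respectively), and the threshold values are all hypothetical, and it assumes fields are conditionally independent.

```python
import math

# Hypothetical per-field probabilities (illustrative values only):
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
FIELD_PARAMS = {
    "surname": (0.95, 0.01),
    "date_of_birth": (0.98, 0.003),
    "postcode": (0.90, 0.02),
}


def match_weight(agreements, params=FIELD_PARAMS):
    """Sum the log2 likelihood ratios for each compared field.

    `agreements` maps a field name to True (agrees) or False (disagrees).
    Agreement adds log2(m/u); disagreement adds log2((1-m)/(1-u)).
    """
    total = 0.0
    for field, agrees in agreements.items():
        m, u = params[field]
        if agrees:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, lower=0.0, upper=10.0):
    """Apply two decision thresholds (assumed values) to a total weight."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible match"  # typically routed to clerical review


if __name__ == "__main__":
    pair = {"surname": True, "date_of_birth": False, "postcode": True}
    w = match_weight(pair)
    print(f"weight={w:.2f} decision={classify(w)}")
```

Because each field contributes its own term, the total score can be broken down field by field, which is what makes the result interpretable. In practice the m and u values would be estimated from data rather than set by hand.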

Migrating Legacy EDW Slowly-Changing Dimensions to Lakehouse Bronze

From 20-year-old warehouse SCDs to a modern temporal backbone you can trust. This article lays out a practical, regulator-aware playbook for migrating legacy EDW SCD dimensions to a modern SCD2 Bronze layer in a medallion/lakehouse architecture. It covers what you are really migrating (semantics, not just tables), how to treat the EDW as a source system, how to build canonical SCD2 Bronze, how to run both platforms in parallel, and how to prove to auditors and regulators that nothing has been lost or corrupted in the process.

Enterprise Point-in-Time (PIT) Reconstruction: The Regulatory Playbook

This article sets out the definitive regulatory playbook for enterprise Point-in-Time (PIT) reconstruction in UK Financial Services. It explains why PIT is now a supervisory expectation, driven by PRA/FCA reviews, Consumer Duty, s166 investigations, AML/KYC forensics, and model risk, and it draws a clear distinction between “state as known” and “state as now known”. Covering SCD2 foundations, entity resolution, precedence versioning, multi-domain alignment, temporal repair, and reproducible rebuild patterns, it shows how to construct a deterministic, explainable PIT engine that can withstand audit, replay history reliably, and defend regulatory outcomes with confidence.

Entity Resolution & Matching at Scale on the Bronze Layer

Entity resolution has become one of the hardest unsolved problems in modern UK Financial Services data platforms. This article sets out a Bronze-layer–anchored approach to resolving customers, accounts, and parties at scale using SCD2 as the temporal backbone. It explains how deterministic, fuzzy, and probabilistic matching techniques combine with blocking, clustering, and survivorship to produce persistent, auditable entity identities. By treating entity resolution as platform infrastructure rather than an application feature, firms can build defensible Customer 360 views, support point-in-time reconstruction, and meet growing FCA and PRA expectations.
