Tag Archives: statistics

WTF is the Fellegi–Sunter Model? A Practical Guide to Record Matching in an Uncertain World

The Fellegi–Sunter model is the foundational probabilistic framework for record linkage: deciding whether two imperfect records refer to the same real-world entity. Rather than enforcing brittle matching rules, it treats linkage as a problem of weighing evidence under uncertainty. By modelling how often each field agrees for true matches versus non-matches, it produces interpretable scores and explicit decision thresholds. Despite decades of new tooling and machine learning, most modern matching systems still rest on this logic, often without acknowledging it.
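To make the idea of weighing evidence concrete, here is a minimal sketch of the usual log-likelihood-ratio scoring. It assumes binary agree/disagree comparisons and conditional independence between fields. The field names, the m and u probabilities, and the two thresholds are all hypothetical values picked for illustration, not estimates from real data.

```python
import math

# Hypothetical parameters per field (illustrative only):
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
FIELD_PARAMS = {
    "surname": (0.95, 0.01),
    "birth_year": (0.90, 0.05),
    "postcode": (0.80, 0.02),
}

# Hypothetical decision thresholds on the total weight.
UPPER_THRESHOLD = 6.0  # at or above: treat as a link
LOWER_THRESHOLD = 0.0  # at or below: treat as a non-link


def match_weight(agreements):
    """Sum per-field log2 likelihood ratios for one record pair.

    `agreements` maps each field name to True (agrees) or False (disagrees).
    """
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        if agreements[field]:
            total += math.log2(m / u)              # evidence for a match
        else:
            total += math.log2((1 - m) / (1 - u))  # evidence against
    return total


def classify(weight):
    """Map a total weight onto the three decision regions."""
    if weight >= UPPER_THRESHOLD:
        return "link"
    if weight <= LOWER_THRESHOLD:
        return "non-link"
    return "possible link"  # typically sent for clerical review


if __name__ == "__main__":
    pair = {"surname": True, "birth_year": False, "postcode": False}
    weight = match_weight(pair)
    print(round(weight, 2), classify(weight))
```

Each field contributes a positive weight when it agrees and a negative one when it disagrees, so the total score can be read field by field. In practice the m and u values are estimated from data rather than set by hand, and the thresholds are chosen to reflect how costly false links and missed links are.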
