Why Deterministic Precedence Is the Line Between “Data Platform” and “Regulatory Liability”
Modern UK Financial Services organisations ingest customer, account, and product data from 5–20 different systems of record, each holding overlapping and often conflicting truth. Delivering a reliable “Customer 360” or “Account 360” requires deterministic, audit-defensible precedence rules, survivorship logic, temporal correction workflows, and regulatory point-in-time (PIT) reconstructions, all operating on an SCD2 Bronze layer. This article explains how mature banks resolve multi-source conflicts, maintain lineage, rebalance history when higher-precedence data arrives late, and produce FCA/PRA-ready temporal truth. It describes the real patterns used in Tier-1 institutions, and the architectural techniques required to make them deterministic, scalable, and regulator-defensible.
Contents
- 1. Introduction: The Multi-Source Conflict Problem
- 2. Why No Bank Has a Single System of Truth
- 3. Golden-Source Precedence Frameworks (What Actually Works)
- 4. Implementing Deterministic Precedence in Pipelines
- 5. Survivorship Rules for the Hardest Attributes
- 6. Late-Arriving Higher-Precedence Data
  - 6.1 Why This Breaks Naïve SCD2 Implementations
  - 6.2 The First Answer: Separate Belief-Time from Event-Time
  - 6.3 The Second Answer: Retroactive SCD2 Rewriting, Not Overwriting
  - 6.4 The Third Answer: Precedence Rules Must Be Time-Versioned
  - 6.5 The Fourth Answer: Dual Views Are Mandatory, Not Optional
  - 6.6 Why This Is the Real Test of the Platform
- 7. Regulatory Point-in-Time Reporting (PIT)
- 8. Reconciliation, Lineage and Daily Controls
- 9. Real-World War Stories (Anonymised)
- 10. Conclusion: You Cannot Do Customer 360 Without This
1. Introduction: The Multi-Source Conflict Problem
Modern UK Financial Services organisations ingest customer, account, and product data from many systems of record (typically 5–20), each optimised for a different purpose and operating under different controls. None of these systems is wrong in isolation. Together, they rarely agree.
Core banking, CRM, AML/KYC utilities, risk engines, payment platforms, and digital channels all maintain overlapping representations of the same real-world entities. Each claims partial authority. None provides a complete, uncontested view.
This is not an architectural flaw. It is the natural outcome of scale, regulation, and long system lifecycles.
The challenge is not collecting this data. It is deciding — deterministically and defensibly — which version of the truth the institution stands behind at any point in time, and proving that decision years later under regulatory scrutiny.
How Tier-1 Banks Actually Deliver Customer 360 / Account 360 With Conflicting Sources — and Survive a PRA Audit
Every UK Financial Services institution — retail bank, credit card issuer, insurer, wealth platform, payments processor — eventually faces the same architectural reality:
Customer and account data does not come from one source. It comes from many. And they disagree.
Conflicting updates flow from:
- Core Banking
- CRM
- AML/KYC
- Risk Engines
- Collections
- Digital Channels
- Onboarding / Identity Verification
- Legacy Data Warehouses
- Product Processors
- Payment Platforms
Each claims to hold some part of “the truth”.
None holds all of it.
This leads to the inevitable question:
When five systems say different things about the same customer, how do we decide what is correct — and defensible — at any point in time?
This article, Part 4 of the series, answers that question.
2. Why No Bank Has a Single System of Truth
The idea of a single system of truth collapses quickly in any bank that has grown beyond a handful of products or channels.
Different systems exist to serve different obligations. They evolve independently, update at different cadences, and are governed under different risk models. Over time, divergence is inevitable.
What matters is not eliminating disagreement — that is impossible — but ensuring the institution can explain, consistently and repeatably, how disagreement is resolved. In regulated environments, failing to resolve conflicting truth is not neutrality. It is abdication of responsibility.
The myth of “one system of truth” dies quickly once a bank reaches 20m+ customers or 200m+ accounts.
Why?
2.1 Systems have different purposes
Core Banking holds legal identifiers.
CRM holds preferred contact details.
AML/KYC holds risk scores and residency.
Risk systems hold credit attributes.
Digital systems hold device and behavioural fingerprints.
2.2 Systems update at different cadences
Core Banking is stable.
CRM updates constantly.
KYC re-verifies periodically.
Digital updates in real-time.
2.3 Systems disagree
- CRM says address = “Flat 2”
- Core Banking says address = “Flat 1–2”
- KYC says address = “Unit 2”
Who is correct?
2.4 Regulatory demands require reconstruction of historical state
The PRA and the FCA, including under the Consumer Duty, now expect banks to demonstrate:
“What data did you actually hold about Customer X at the time of Decision Y?”
This requires a consistent, deterministic precedence model applied across SCD2 Bronze.
3. Golden-Source Precedence Frameworks (What Actually Works)
After years of regulatory pressure, remediation programmes, and painful re-engineering, the industry has converged on a small number of ways to resolve multi-source conflict.
These are not theoretical models. They are the patterns that survive audits, support reconstruction, and scale across hundreds of attributes and millions of entities.
Choosing between them is not about technical preference. It is about how much nuance, governance, and accountability the institution is prepared to own.
Through painful iteration, the FS industry has converged on four real precedence models.
You only need one — done properly — but you must choose explicitly.
3.1 Strict Hierarchy (“Core Banking Wins”)
This is the simplest and most common in legacy banks.
Example hierarchy:
- Core Banking
- KYC / AML
- CRM
- Digital / Payments
- Everything else
Pros:
- simple
- predictable
- easy to explain to regulators
Cons:
- too crude for many attributes
- CRM may be more accurate for contact details
- KYC may be more accurate for residency
- core may be wrong for inactive customers
Used by: older banks, smaller building societies.
3.2 Attribute-Level Precedence Matrix (the modern standard)
This is the Tier-1 default.
Banks maintain a large matrix:
- ~150–400 attributes
- ~10–20 source systems
- each cell = precedence rank (1–N)
For example:
| Attribute | Core | CRM | KYC | Risk | Digital |
|---|---|---|---|---|---|
| Legal Name | 1 | 3 | 2 | 4 | 5 |
| Address (Postal) | 1 | 2 | 3 | 4 | 5 |
| Address (Residential) | 3 | 2 | 1 | 4 | 5 |
| [attribute name missing] | 4 | 1 | 3 | 2 | 5 |
| Risk Score | 5 | 3 | 1 | 2 | 4 |
This matrix is maintained as metadata.
Pros:
- precise
- attribute-specific
- regulator-defensible
- deterministic
Cons:
- must be governed
- changes over time
- requires metadata + test coverage
Used by: the top 5 UK retail banks, major insurers.
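As a minimal sketch of how such a matrix might be applied once it is held as metadata, the snippet below copies two rows of ranks from the example table and resolves a set of conflicting values. The function name `resolve`, the lower-case source keys, and the sample values are illustrative assumptions, not a description of any particular bank's implementation.

```python
# Ranks for two attributes, copied from the example matrix above (1 = highest precedence).
PRECEDENCE = {
    "legal_name": {"core": 1, "kyc": 2, "crm": 3, "risk": 4, "digital": 5},
    "residential_address": {"kyc": 1, "crm": 2, "core": 3, "risk": 4, "digital": 5},
}


def resolve(attribute, candidates):
    """Return (source, value) from the highest-precedence source that supplied a value.

    `candidates` maps source system -> value. Sources that are not in the matrix
    for this attribute, and sources that supplied None, are ignored.
    """
    ranks = PRECEDENCE[attribute]
    ranked = sorted(
        (ranks[src], src, val)
        for src, val in candidates.items()
        if src in ranks and val is not None
    )
    if not ranked:
        return None
    _, source, value = ranked[0]
    return source, value


addresses = {"crm": "Flat 2", "core": "Flat 1-2", "kyc": "Unit 2"}
print(resolve("residential_address", addresses))  # ('kyc', 'Unit 2')
print(resolve("legal_name", {"crm": "Jo Smith", "core": "Joanna Smith"}))  # ('core', 'Joanna Smith')
```

Because every source has a distinct rank per attribute, the outcome does not depend on the order in which candidates arrive.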
3.3 Best-Record Dynamic Scoring
Some banks score each candidate record based on:
- completeness
- freshness
- consistency
- data quality
The “best” record wins.
Pros:
- flexible
- adaptive
- handles emerging sources
Cons:
- harder to explain under audit
- must be fully deterministic
Used by: two Tier-1 UK banks, one large insurer.
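The requirement that scoring be fully deterministic can be illustrated with a small sketch. Everything here is hypothetical: the `Candidate` fields, the integer weights, and the freshness decay are placeholders chosen only to show the shape of the technique. The two points that matter are that the evaluation date is passed in explicitly (never read from the clock) and that the sort key ends in a stable tie-breaker.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Candidate:
    source: str
    fields: tuple        # attribute values; None marks a missing value
    updated_on: date
    dq_passed: bool      # outcome of upstream data-quality checks


# Hypothetical integer weights; in practice these would be governed, versioned metadata.
W_COMPLETE, W_FRESH, W_DQ = 50, 30, 20


def score(c: Candidate, as_of: date) -> int:
    """Integer score, so results do not depend on floating-point rounding."""
    completeness = 100 * sum(v is not None for v in c.fields) // len(c.fields)
    age_days = (as_of - c.updated_on).days
    freshness = max(0, 100 - age_days // 10)  # loses one point per 10 days of age
    quality = 100 if c.dq_passed else 0
    return W_COMPLETE * completeness + W_FRESH * freshness + W_DQ * quality


def best_record(candidates, as_of: date) -> Candidate:
    # Highest score wins; the source name breaks ties the same way on every run.
    return min(candidates, key=lambda c: (-score(c, as_of), c.source))


as_of = date(2024, 1, 1)
crm = Candidate("crm", ("Jo", "Flat 2", None), date(2023, 12, 1), True)
core = Candidate("core", ("Joanna", "Flat 1-2", "1980-01-01"), date(2022, 1, 1), True)
print(best_record([core, crm], as_of).source)  # crm
```

Re-running the same inputs with the same `as_of` always yields the same winner, which is what makes a scored model explainable after the fact.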
3.4 Temporal Precedence
For attributes where timing decides the winner, either:
- the most recent valid update wins, or
- the earliest-known value wins (e.g., DOB)
Pros:
- extremely simple
- useful for behavioural attributes
Cons:
- insufficient for identity data
- can be manipulated if upstream timestamps are wrong
4. Implementing Deterministic Precedence in Pipelines
Precedence only exists if it is executable.
If resolution logic lives in documentation, code comments, or analyst assumptions, it will drift, fragment, and eventually fail under audit. In regulated platforms, precedence must be expressed as data, applied consistently in pipelines, and replayable across the full historical record.
This is where governance becomes architecture. The mechanics matter, because anything that cannot be rebuilt deterministically cannot be defended.
Regardless of the model selected, the implementation patterns are identical.
4.1 Precedence Metadata Table
This is the heart of the system.
At minimum it contains:
| source_system | attribute_group | attribute_name | precedence_rank | valid_from | valid_to |
This enables:
- backdated rule changes
- future rulebook evolution
- clear lineage for regulators
This metadata table is essential for SM&CR accountability, Consumer Duty evidence, audit defensibility, and temporal reconstruction.
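A minimal sketch of how the `valid_from` / `valid_to` columns might be used to look up the ranks in force on a given date follows. The column names mirror the table above; the rule rows, the half-open interval convention, and the `ranks_in_force` helper are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PrecedenceRule:
    source_system: str
    attribute_group: str
    attribute_name: str
    precedence_rank: int
    valid_from: date           # inclusive
    valid_to: Optional[date]   # exclusive; None means still in force


RULES = [
    # Illustrative rows only: KYC overtakes Core for residential address from 2023.
    PrecedenceRule("core", "address", "residential", 1, date(2020, 1, 1), date(2023, 1, 1)),
    PrecedenceRule("kyc", "address", "residential", 2, date(2020, 1, 1), date(2023, 1, 1)),
    PrecedenceRule("kyc", "address", "residential", 1, date(2023, 1, 1), None),
    PrecedenceRule("core", "address", "residential", 2, date(2023, 1, 1), None),
]


def ranks_in_force(attribute_name: str, as_of: date) -> dict:
    """Map source_system -> precedence_rank under the rule version valid on `as_of`."""
    return {
        r.source_system: r.precedence_rank
        for r in RULES
        if r.attribute_name == attribute_name
        and r.valid_from <= as_of
        and (r.valid_to is None or as_of < r.valid_to)
    }


print(ranks_in_force("residential", date(2021, 6, 1)))  # {'core': 1, 'kyc': 2}
print(ranks_in_force("residential", date(2023, 6, 1)))  # {'kyc': 1, 'core': 2}
```

A historical rebuild would call the lookup with the date being reconstructed, so that a 2021 conflict is resolved under the 2021 rulebook rather than today's.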
4.2 Windowing + ROW_NUMBER / QUALIFY Patterns
The technical core of precedence resolution is a windowing pattern:
- partition by business key + attribute
- order by precedence, timestamp, quality
- pick ROW_NUMBER = 1
The same pattern applies, with minor syntax differences, in:
- Databricks
- Snowflake
- BigQuery
- Spark SQL
- dbt
Generic SQL / Spark / Databricks Pattern (using subquery + WHERE rn = 1)
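SELECT *
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id, attribute_name
               -- Lowest rank wins; newest timestamp breaks ties within a rank.
               ORDER BY precedence_rank ASC,
                        source_timestamp DESC
           ) AS rn
    FROM candidate_attribute_values
) AS ranked  -- some engines require an alias on a derived table
WHERE rn = 1;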
Equivalent Snowflake / BigQuery Pattern (using QUALIFY)
SELECT *
FROM candidate_attribute_values
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY customer_id, attribute_name
    ORDER BY precedence_rank ASC, source_timestamp DESC
) = 1;
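To see the subquery form of the pattern run end to end, the sketch below executes it against an in-memory SQLite database from Python. The `attribute_value` and `source_system` columns, the sample rows, and the extra `source_system` tie-breaker in the ORDER BY are assumptions added for illustration; window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE candidate_attribute_values (
        customer_id      TEXT,
        attribute_name   TEXT,
        attribute_value  TEXT,
        source_system    TEXT,
        precedence_rank  INTEGER,
        source_timestamp TEXT
    );
    INSERT INTO candidate_attribute_values VALUES
        ('C1', 'postal_address', 'Flat 2',       'crm',  2, '2024-03-01'),
        ('C1', 'postal_address', 'Flat 1-2',     'core', 1, '2023-11-15'),
        ('C1', 'postal_address', 'Unit 2',       'kyc',  3, '2024-01-10'),
        ('C1', 'legal_name',     'Joanna Smith', 'core', 1, '2023-11-15'),
        ('C1', 'legal_name',     'Jo Smith',     'crm',  3, '2024-03-01');
    """
)

RESOLVE_SQL = """
SELECT customer_id, attribute_name, attribute_value, source_system
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id, attribute_name
               ORDER BY precedence_rank ASC,
                        source_timestamp DESC,
                        source_system ASC      -- final tie-breaker for a total order
           ) AS rn
    FROM candidate_attribute_values
) AS ranked
WHERE rn = 1
ORDER BY attribute_name
"""

winners = conn.execute(RESOLVE_SQL).fetchall()
for row in winners:
    print(row)
# ('C1', 'legal_name', 'Joanna Smith', 'core')
# ('C1', 'postal_address', 'Flat 1-2', 'core')
```

The newer CRM address loses to Core because rank is compared before timestamp, which is the behaviour the ORDER BY in the pattern encodes.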
4.3 Idempotency & Reprocessing
A requirement in UK FS platforms:
If you reprocess 10 years of data, you MUST get the same result.
This requires:
- deterministic precedence rules
- versioned rule metadata
- idempotent processing
- stable ordering keys
- temporal repair logic
Without these, the platform is not regulator-defensible.
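One way to evidence the "same result on reprocessing" requirement is to fingerprint the resolved output and compare it across runs. The sketch below is a simplified illustration: the row layout, the `sort_key` ordering, and the use of a SHA-256 hash over sorted JSON are assumptions, not a prescribed control.

```python
import hashlib
import json
import random
from datetime import date


def sort_key(row):
    # Total ordering: rank, then newest timestamp, then source name as the final
    # tie-breaker. Assumes a source sends at most one value per key and timestamp.
    return (
        row["precedence_rank"],
        -date.fromisoformat(row["source_timestamp"]).toordinal(),
        row["source_system"],
    )


def resolve_all(rows):
    """Return the winning row per (customer_id, attribute_name), independent of input order."""
    winners = {}
    for row in rows:
        key = (row["customer_id"], row["attribute_name"])
        if key not in winners or sort_key(row) < sort_key(winners[key]):
            winners[key] = row
    return [winners[k] for k in sorted(winners)]


def fingerprint(resolved):
    """Stable hash of a resolved output, suitable for comparing two runs."""
    payload = json.dumps(resolved, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


rows = [
    {"customer_id": "C1", "attribute_name": "postal_address", "value": "Flat 2",
     "source_system": "crm", "precedence_rank": 2, "source_timestamp": "2024-03-01"},
    {"customer_id": "C1", "attribute_name": "postal_address", "value": "Flat 1-2",
     "source_system": "core", "precedence_rank": 1, "source_timestamp": "2023-11-15"},
    {"customer_id": "C1", "attribute_name": "postal_address", "value": "Unit 2",
     "source_system": "kyc", "precedence_rank": 3, "source_timestamp": "2024-01-10"},
]

baseline = fingerprint(resolve_all(rows))
shuffled = rows[:]
random.Random(7).shuffle(shuffled)
assert fingerprint(resolve_all(shuffled)) == baseline  # replay yields the same result
```

If the ordering key were not total, two rows could tie and the winner would depend on arrival order, which is exactly the drift this kind of check is meant to catch.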
5. Survivorship Rules for the Hardest Attributes
Not all attributes carry equal regulatory weight.
Some fields — identity, address, risk indicators — repeatedly determine customer treatment, eligibility, escalation, and remediation outcomes. Errors in these attributes are not cosmetic. They are consequential.
As a result, survivorship for these fields cannot be generic. It must be explicit, attribute-specific, and documented in a way that survives both operational turnover and regulatory challenge.
Some attributes require bespoke rules.
5.1 Name / Legal Name
- CRM wins for preferred name
- Core wins for legal name
- KYC wins for sanctions-sensitive names
5.2 Address
- Core provides postal address
- KYC provides residential address
- CRM provides correspondence address
Each has different regulatory meaning.
5.3 Risk Rating
- AML system wins for the first 180 days
- Risk system can override once validated
5.4 Date of Birth
- Core initial value
- If KYC corrects it, no system may override thereafter
- Fully regulator-defensible
5.5 KYC Flags
- raising a flag uses OR logic (any system can elevate)
- clearing a flag is restricted (only a designated system can clear)
These rules MUST be documented and versioned.
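The asymmetric flag rule in 5.5 can be sketched in a few lines. The `CLEARING_AUTHORITY` mapping, the flag name, and the event format are hypothetical; the sketch only shows the shape of "anyone can raise, only one system can clear".

```python
# Hypothetical metadata: the only system allowed to clear each flag.
CLEARING_AUTHORITY = {"pep_flag": "aml"}


def apply_flag_events(flag, events):
    """Replay (source, action) events in order; action is 'raise' or 'clear'.

    Any source may raise the flag. A clear is honoured only when it comes from
    the designated authority for that flag; otherwise it is ignored.
    """
    is_set = False
    for source, action in events:
        if action == "raise":
            is_set = True
        elif action == "clear" and source == CLEARING_AUTHORITY[flag]:
            is_set = False
    return is_set


print(apply_flag_events("pep_flag", [("aml", "raise"), ("crm", "clear")]))  # True
print(apply_flag_events("pep_flag", [("crm", "raise"), ("aml", "clear")]))  # False
```

In the first call the CRM clear is ignored and the flag survives; in the second the AML clear is honoured.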
6. Late-Arriving Higher-Precedence Data
Every mature Financial Services platform eventually encounters the same failure mode.
A decision is made using the best information available at the time. Weeks or months later, more authoritative data arrives — often from a source that should always have taken precedence.
At that moment, the institution is forced to confront a hard truth: being correct now is not the same as having been correct then. Regulators care about both.
How a platform handles this scenario determines whether it can support Consumer Duty look-backs, s166 reviews, and historical accountability — or whether it collapses under retroactive scrutiny.
This is the nightmare scenario every UK FS platform eventually faces.
Late-arriving higher-precedence data is not an edge case.
It is the inevitable consequence of operating regulated processes across asynchronous systems, third-party utilities, and long-running investigations.
The nightmare is simple to describe:
A decision is made.
A customer is treated a certain way.
Weeks or months later, better information arrives — from a source that should always have won.
At that point, the institution must answer two questions simultaneously:
- What did we believe at the time the decision was made?
- What should we now believe about that same point in time?
Most platforms can answer one.
Very few can answer both — and regulators expect both.
6.1 Why This Breaks Naïve SCD2 Implementations
In immature implementations, late-arriving data triggers one of three failure modes:
- the “current” record is overwritten
- history is patched manually
- the correction is applied only forward
All three are indefensible.
They destroy one or more of:
- historical belief
- auditability
- deterministic replay
- explainability
Under Consumer Duty and s166 scrutiny, this is where the platform is judged — not on steady-state operation, but on how it handles being wrong.
6.2 The First Answer: Separate Belief-Time from Event-Time
The foundational answer is conceptual.
A mature platform distinguishes between:
- event time – when something actually occurred
- belief time – when the institution became aware of it
Late-arriving higher-precedence data does not invalidate past belief.
It revises understanding of the past.
Both must be preserved.
This is why:
- Bronze must be append-only
- SCD2 rows must be additive
- corrections must never erase prior states
The platform must be able to say:
“This is what we believed then, and this is what we now know about then.”
6.3 The Second Answer: Retroactive SCD2 Rewriting, Not Overwriting
Handling late-arriving precedence correctly requires retroactive history insertion, not mutation.
Practically, this means:
- inserting a new SCD2 record with backdated effective dates
- closing adjacent records by appending superseding versions rather than updating them in place
- preserving prior rows untouched
- allowing the historical timeline to branch and be re-evaluated
This is not optional.
If the platform cannot reconstitute the historical timeline deterministically after a late correction, it cannot support Consumer Duty remediation or defend point-in-time decisions.
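A simplified sketch of retroactive insertion over an append-only log is shown below. It assumes precedence has already been resolved, so every logged row is an accepted value, and it assumes that where rows overlap in effective time the most recently recorded one takes effect. The `Version` fields and the `timeline` helper are illustrative names.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Version:
    value: str
    source: str
    effective_from: date   # inclusive
    effective_to: date     # exclusive
    recorded_at: date      # when the platform learned this


def timeline(log, known_at):
    """Derive non-overlapping SCD2 intervals from an append-only log.

    Only rows recorded on or before `known_at` are visible. Where visible rows
    overlap in effective time, the most recently recorded one wins.
    """
    visible = [v for v in log if v.recorded_at <= known_at]
    cuts = sorted({d for v in visible for d in (v.effective_from, v.effective_to)})
    out = []
    for start, end in zip(cuts, cuts[1:]):
        covering = [v for v in visible if v.effective_from <= start and end <= v.effective_to]
        if not covering:
            continue
        winner = max(covering, key=lambda v: v.recorded_at)
        if out and out[-1][1] == start and out[-1][2] == winner.value:
            out[-1] = (out[-1][0], end, winner.value)  # merge contiguous equal values
        else:
            out.append((start, end, winner.value))
    return out


END = date(9999, 12, 31)
log = [Version("Flat 2", "crm", date(2023, 1, 1), END, recorded_at=date(2023, 1, 1))]

# In June, KYC reports that the address had actually been "Unit 2" since March.
# The correction is appended; the original row is never modified.
log.append(Version("Unit 2", "kyc", date(2023, 3, 1), END, recorded_at=date(2023, 6, 1)))

print(timeline(log, known_at=date(2023, 4, 1)))  # one interval: "Flat 2" from January
print(timeline(log, known_at=date(2023, 6, 1)))  # "Flat 2" Jan to Mar, then "Unit 2"
```

The "closed" interval for the earlier value is derived at read time, so the history before the correction can still be reproduced exactly by choosing an earlier `known_at`.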
6.4 The Third Answer: Precedence Rules Must Be Time-Versioned
Late-arriving data often coincides with rule evolution.
What was authoritative in 2021 may not be authoritative in 2023.
Therefore:
- precedence rules must themselves be time-bound
- historical resolution must respect the rules as they existed at the time
- rebuilds must apply the correct rule version to the correct period
Without time-versioned precedence, historical truth becomes retroactively distorted — a subtle but critical regulatory failure.
6.5 The Fourth Answer: Dual Views Are Mandatory, Not Optional
When late-arriving higher-precedence data exists, the platform must explicitly support:
- State as known on date X
- State as now known for date X
Trying to collapse these into a single view is what creates regulatory confusion.
The correct response is not to choose one, but to make the distinction explicit and queryable.
This is why:
- Silver must be rebuildable
- Gold must declare its temporal assumptions
- reporting views must state which truth they represent
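The two views can be expressed as one query with two time parameters. The sketch below uses a hypothetical risk rating with a late, backdated correction; the tuple layout and the `state` helper are assumptions made for illustration, and each version is treated as open-ended until a later effective date supersedes it.

```python
from datetime import date

# (value, effective_from, recorded_at): append-only resolved versions of one attribute.
LOG = [
    ("low", date(2023, 1, 1), date(2023, 1, 1)),
    ("high", date(2023, 2, 1), date(2023, 6, 1)),  # arrived in June, backdated to February
]


def state(log, effective_on, known_at):
    """Value in effect on `effective_on`, using only rows recorded by `known_at`."""
    visible = [r for r in log if r[2] <= known_at and r[1] <= effective_on]
    if not visible:
        return None
    # Latest effective_from wins; latest recorded_at breaks ties between corrections.
    return max(visible, key=lambda r: (r[1], r[2]))[0]


decision_date = date(2023, 3, 15)

# State as known on date X: what the institution believed when it decided.
as_known_then = state(LOG, decision_date, known_at=decision_date)

# State as now known for date X: what it believes today about that same date.
as_now_known = state(LOG, decision_date, known_at=date(2024, 1, 1))

print(as_known_then, as_now_known)  # low high
```

Keeping `known_at` as an explicit parameter is what lets a reporting view declare which of the two truths it represents.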
6.6 Why This Is the Real Test of the Platform
Most data platforms work when the world is orderly.
Financial Services platforms must work when:
- investigations are ongoing
- facts are revised
- assumptions are invalidated
- regulators look backward
Late-arriving higher-precedence data is not a failure of upstream systems.
It is the moment the platform proves whether it understands time.
Platforms that handle this well:
- survive s166 reviews
- scale Consumer Duty remediation
- avoid forensic rebuilds under pressure
Platforms that do not:
- appear correct until the worst possible moment
- then collapse publicly and expensively
This is why this scenario is not just “a nightmare”.
It is the proving ground of the entire architecture.
7. Regulatory Point-in-Time Reporting (PIT)
Regulators do not ask for “the data”. They ask for the data as it was understood at a specific moment, under specific assumptions, using the rules in force at the time.
This distinction — between what was known then and what is known now about the past — is subtle, but fundamental. It cannot be patched in reporting layers or reconciled manually during an investigation.
If a platform cannot reconstruct both views from the same historical foundation, it is not regulator-ready.
Regulators require banks to answer two different questions:
7.1 “State as known on date X”
What the bank actually believed at the time.
7.2 “State as now known for date X”
What the bank would have believed if errors were corrected earlier.
Both views must be reconstructable.
This requires:
- SCD2 Bronze
- rule metadata versioning
- precedence rule history
- full lineage
- a PIT view per reporting requirement
These capabilities are mandatory for:
- Consumer Duty look-backs
- AML investigations
- PRA regulatory submissions
- complaints and remediation
- mis-selling reviews
- credit decision audits
8. Reconciliation, Lineage and Daily Controls
Deterministic logic is meaningless if it cannot be evidenced.
In practice, regulatory confidence comes not from design documents, but from daily proof that the platform is behaving as expected. Precedence decisions must be observable, traceable, and routinely validated, not rediscovered during an audit.
Reconciliation and lineage are not compliance overhead. They are the mechanisms by which the institution demonstrates ongoing control over its data estate.
Modern FS platforms perform:
8.1 Daily Golden-Source Reconciliation
Checking:
- mismatches between final entity values and incoming sources
- upstream data drift
- data quality regressions
- expected-vs-actual precedence resolution
- completeness of ingestion
8.2 Lineage
Lineage must show:
- final attribute value
- which source it came from
- which rule selected it
- which timestamp was used
- which pipeline produced it
Platforms use:
- Databricks Unity Catalog
- OpenLineage
- Snowflake Object Tagging
- dbt metadata lineage
This lineage is frequently requested in PRA audits.
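As a rough sketch, the five lineage facts listed in 8.2 could be captured as one record per resolved attribute value. The field names, identifiers, and JSON serialisation below are hypothetical and are not tied to any of the tools named above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AttributeLineage:
    customer_id: str
    attribute_name: str
    final_value: str        # final attribute value
    source_system: str      # which source it came from
    rule_id: str            # which precedence rule selected it
    source_timestamp: str   # which timestamp was used
    pipeline_run_id: str    # which pipeline run produced it


record = AttributeLineage(
    customer_id="C1",
    attribute_name="postal_address",
    final_value="Flat 1-2",
    source_system="core",
    rule_id="postal_address/core/rank-1/v2023-01-01",   # illustrative identifier
    source_timestamp="2023-11-15",
    pipeline_run_id="silver-resolve-2024-03-02",        # illustrative identifier
)

# Sorted keys keep the serialised form stable across runs.
print(json.dumps(asdict(record), sort_keys=True))
```

Emitting such a record alongside each resolved value means the answer to "why this value?" is stored at write time rather than reconstructed during an audit.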
9. Real-World War Stories (Anonymised)
Precedence failures rarely surface during normal operations. They surface under pressure — during regulatory reviews, large-scale remediation, or cost-driven re-engineering.
The following cases are anonymised, but representative. Each illustrates a different way that unresolved or implicit precedence turns from a technical inconvenience into a regulatory and financial event.
9.1 Case 1 — CRM Overrode a PEP (Politically Exposed Person) Flag
A CRM system cleared a PEP flag set by AML.
The precedence rules wrongly allowed it.
The bank was fined.
Fix: AML now always overrides; CRM cannot clear risk flags.
9.2 Case 2 — £18m Storage Savings From Killing “Union Everything”
A card processor stored all versions from all feeds — no precedence.
Hybrid SCD2 + precedence matrix reduced versions by 80%.
9.3 Case 3 — 12 Years of Customer History Rebuilt in 9 Days
A UK insurer had to reprocess a decade of history for Consumer Duty.
Only succeeded because precedence rules were deterministic and metadata-driven.
10. Conclusion: You Cannot Do Customer 360 Without This
SCD2 preserves history, but it does not resolve truth.
Without deterministic, time-aware precedence, Customer 360 is not a coherent view — it is a collection of conflicting assertions. Under regulatory scrutiny, that distinction matters.
Golden-source resolution, survivorship rules, temporal correction, and point-in-time reconstruction are not optional refinements. They are the difference between a platform that stores data and one that can defend institutional decisions years after they were made.
This is how Tier-1 UK Financial Services institutions make SCD2 work under scrutiny — and without it, no amount of optimisation elsewhere will save the platform.
Golden-source resolution, precedence rules, temporal PIT reconstruction, and lineage are the hardest and most essential components of a modern FS data platform.
Without them:
- SCD2 becomes untrustworthy
- regulatory defensibility collapses
- Consumer Duty look-backs fail
- Customer 360 degenerates into chaos
With them:
- the Bronze layer becomes an enterprise temporal truth
- Silver and Gold remain consistent
- lineage is defensible
- regulatory audits are survivable
- Customer 360 and Account 360 become real
This is how Tier-1 banks actually do it.
And without it, no amount of SCD2 optimisation can save a platform.