Turning catalogues, glossaries, and lineage tools into something regulators will actually trust. Most Financial Services organisations invest heavily in data catalogues, glossaries, and lineage tools, yet still struggle to satisfy regulators when asked to explain where data comes from, what it means, and how decisions were made over time. This article focuses on governance and metadata that actually survive scrutiny from the Prudential Regulation Authority (PRA) and the Financial Conduct Authority (FCA). It sets out a practical model for connecting semantics, lineage, rules, and accountability, so regulated firms can move beyond decorative governance and confidently explain historical outcomes with evidence, not anecdotes.
Contents
- 1. Introduction
- 2. The Gap Between Tooling and Trust
- 3. What Regulators Actually Care About
- 4. The Four Pillars of Regulator-Defensible Governance
- 5. Designing a Governance & Metadata Model That Ties It All Together
- 6. Implementing This on Modern Lakehouse Platforms
- 7. Change Management, Evidence, and Auditability
- 8. Common Failure Modes (and How to Avoid Them)
- 9. Example End-to-End: A Risk Metric Under PRA Scrutiny
- 10. Summary: A Practical Definition of “Good” Governance in 2026
1. Introduction
Governance and metadata sit at the boundary between data engineering and regulatory accountability. In Financial Services, they are expected to provide confidence that reported figures, customer decisions, and historical outcomes are not only correct, but explainable long after the fact. As platforms modernise and logic becomes increasingly temporal and rule-driven, traditional catalogue-centric approaches struggle to keep pace. This introduction sets the context for why governance must evolve from static documentation into an operational system that can withstand detailed regulatory scrutiny.
Most Financial Services organisations now “have” a data catalogue, a business glossary, lineage tooling, and a stack of governance processes. Yet when the PRA or FCA ask hard questions:
- “Show me where this metric comes from.”
- “Show me what you knew about this customer on 31 March 2021, and why.”
- “Show me how your golden-source and precedence rules are enforced.”
— many platforms still fall back to slideware, tribal knowledge, and hastily written SQL.
This article is about governance and metadata that actually work under regulatory scrutiny. Not just a list of tools, but a practical pattern for connecting four things that matter in UK FS in 2025–2026:
- What the data means (glossary and conceptual model)
- Where the data comes from (lineage and provenance)
- How the data is decided (precedence, business rules, SCD2/PIT logic)
- Who is accountable (ownership, controls, and change management)
The lens is simple: could you sit in a PRA review, walk through your governance story, and feel confident it will stand up when someone asks to drill down?
2. The Gap Between Tooling and Trust
Over the past decade, Financial Services firms have invested heavily in governance tooling. Catalogues are populated, glossaries exist, and lineage diagrams look impressive. Yet regulatory trust has not increased at the same pace. This section explores why the presence of governance artefacts does not automatically translate into confidence during regulatory scrutiny, and why many programmes fail precisely at the point where questions become specific and uncomfortable.
In 2025/2026, most banks, insurers, asset managers, and payment firms can say:
- “We’ve got Unity Catalog / Purview / Collibra / Atlan / Alation / <insert tool>.”
- “We’ve got a business glossary.”
- “We’ve got lineage.”
- “We’ve got owners.”
But when examined:
- glossary terms don’t match actual fields;
- lineage is technically impressive but semantically lightweight;
- precedence rules and SCD2 logic live in notebooks and ETL code, not in metadata;
- owners can’t see how changes propagate into regulatory reports;
- PIT logic is implicit, undocumented, or different in each team.
The result is governance that looks fine in board packs but crumbles when a skilled regulator, internal audit, or model risk function asks detailed questions.
The good news: the gap is fixable.
The less comfortable news: it requires connecting governance to engineering — especially to SCD2, precedence, entity resolution, and PIT.
3. What Regulators Actually Care About
Governance frameworks often focus on internal maturity models, but regulators approach data very differently. They are not interested in how comprehensive a catalogue looks; they are interested in whether decisions can be explained, justified, and reproduced. This section reframes governance around the core questions regulators consistently ask, cutting through tooling and terminology to focus on what must be demonstrable in practice.
Tools and frameworks come and go. The underlying regulatory questions have stayed remarkably consistent:
- Meaning
  - What does this metric, attribute, or flag mean?
  - Who defined it? When? For what purpose?
- Lineage
  - Where did this data come from originally?
  - Which systems, feeds, and transformations are involved?
  - Can I see the journey from source to report?
- Control
  - Who owns this data/metric?
  - What processes prevent incorrect changes?
  - How do you detect and respond to quality issues?
- Temporal Fidelity
  - Can you reconstruct what you believed on a specific date?
  - Can you explain historical changes and corrections?
- Precedence & Entity Decisions
  - When sources conflict, how do you decide which one “wins”?
  - How are those rules implemented, governed, and changed?
- Evidence
  - Can you demonstrate all of the above with artefacts, not anecdotes?
If your governance stack can answer these, your tools and diagrams matter much less.
If it can’t, more tooling won’t help.
4. The Four Pillars of Regulator-Defensible Governance
To withstand scrutiny, governance must be structured around a small number of interlocking capabilities rather than a collection of disconnected processes. This section introduces four foundational pillars that consistently appear in governance models that hold up during PRA and FCA reviews, providing a practical lens for assessing whether governance is merely documented or genuinely operational.
A governance model that works in practice, not just in frameworks, tends to rest on four pillars.
4.1 Conceptual & Business Semantics
You need a conceptual model and business glossary that:
- define key entities (Customer, Party, Account, Product, Contract, Transaction, Ledger Entry);
- define relationships (owns, party_to, beneficial_owner_of, linked_to, guaranteed_by);
- define measures (Exposure, EAD, LGD, PD, Impairment, Profitability, Risk Rating);
- define flags and attributes (PEP, Sanctioned, KYC risk level, Vulnerable customer, Conduct segment).
Crucially:
- these definitions must be linked to physical data structures, not just words in Confluence;
- changes to definitions must be governed and versioned.
4.2 Technical Lineage & Provenance
You need a technical view of how data moves:
- from source systems (core banking, CRM, KYC, payments, GL)
- via Raw/Base/Bronze/Silver/Gold/Platinum
- into reports, dashboards, models, APIs.
That lineage must be:
- accurate enough to debug issues;
- granular enough to see how a particular column is derived;
- integrated with your CI/CD and pipeline tooling;
- queryable — not just a visual graph.
But technical lineage on its own is not enough. It must be connected to semantics, rules, and owners.
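“Queryable, not just a visual graph” can be made concrete with a small sketch. Assuming lineage is available as column-level edges (most lineage tools expose something equivalent via an API or system tables), answering “which raw sources feed this regulatory column?” becomes a graph traversal. All table and column names below are illustrative.

```python
from collections import deque

# Hypothetical column-level lineage edges: (downstream, upstream).
# In practice these would be extracted from your lineage tool or platform
# system tables rather than hard-coded.
LINEAGE_EDGES = [
    ("gold.ead_report.ead", "silver.exposure.ead"),
    ("silver.exposure.ead", "bronze.account_scd2.balance"),
    ("silver.exposure.ead", "bronze.collateral_scd2.valuation"),
    ("bronze.account_scd2.balance", "raw.core_banking.balance"),
    ("bronze.collateral_scd2.valuation", "raw.collateral_system.valuation"),
]

def upstream_columns(column: str) -> set[str]:
    """Walk the lineage graph upstream from one column (breadth-first)."""
    edges: dict[str, list[str]] = {}
    for downstream, upstream in LINEAGE_EDGES:
        edges.setdefault(downstream, []).append(upstream)
    seen: set[str] = set()
    queue = deque([column])
    while queue:
        for parent in edges.get(queue.popleft(), []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

# Every raw source feeding the regulatory EAD figure:
sources = {c for c in upstream_columns("gold.ead_report.ead") if c.startswith("raw.")}
```

The same traversal, run in reverse over the edges, answers the impact-analysis question (“which reports does this source column reach?”).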
4.3 Rules, Precedence & Temporal Logic
This is the most neglected pillar.
Regulator-defensible governance must capture:
- how SCD2 history is constructed (change detection, effective dating, compaction);
- how late-arriving and out-of-order events are handled;
- how entity resolution works (matching, survivorship, cluster logic);
- how golden-source precedence applies (which system wins for which attribute, over which periods);
- how PIT is reconstructed (“state as known” vs “state as now known”).
These rules must be represented as metadata:
- precedence tables;
- rules engines;
- configuration stored alongside code;
- documented algorithms.
If these live only in notebooks and ETL scripts, governance will fail under pressure.
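To show what “precedence as metadata” looks like in practice, here is a minimal sketch: an effective-dated precedence table plus a resolver that picks the winning source for a conflicting attribute. The systems, ranks, and dates are invented for illustration; the point is that the rule lives in data, not buried in ETL code.

```python
from datetime import date

# Hypothetical golden-source precedence: lower rank wins, per attribute group,
# with effective dating so rule changes are themselves historical data.
PRECEDENCE = [
    # (attribute_group, source_system, precedence_rank, effective_from, effective_to)
    ("customer_address", "KYC", 1, date(2023, 1, 1), date(9999, 12, 31)),
    ("customer_address", "CRM", 2, date(2023, 1, 1), date(9999, 12, 31)),
    ("customer_address", "CRM", 1, date(2020, 1, 1), date(2022, 12, 31)),
]

def winning_value(attribute_group: str, candidates: dict[str, str], as_of: date) -> str:
    """Return the value from the highest-precedence source present on a date."""
    rules = [
        (rank, system)
        for group, system, rank, eff_from, eff_to in PRECEDENCE
        if group == attribute_group and eff_from <= as_of <= eff_to and system in candidates
    ]
    if not rules:
        raise ValueError(f"No precedence rule for {attribute_group} on {as_of}")
    _, system = min(rules)  # lowest rank wins
    return candidates[system]

conflicting = {"CRM": "1 Old Street", "KYC": "2 New Road"}
# KYC outranks CRM from 2023 onwards; before that, CRM won.
today_value = winning_value("customer_address", conflicting, date(2024, 6, 1))
hist_value = winning_value("customer_address", conflicting, date(2021, 6, 1))
```

Because the rule rows carry effective dates, the same resolver explains both today’s answer and the answer the platform would have given in 2021.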
4.4 Ownership, Controls & Change Management
Each important entity, domain, and metric must have:
- a clear business owner;
- a clear technical owner;
- defined change processes (design, review, approval, deployment);
- controls and checks (data quality, threshold alerts, reconciliation);
- a record of who changed what, when, and why.
Regulators don’t expect perfection. They do expect that people are accountable, that changes are managed, and that issues are visible and acted on.
5. Designing a Governance & Metadata Model That Ties It All Together
Having individual governance components is not enough; they must form a coherent system. This section focuses on how semantics, lineage, rules, and ownership can be represented as a single connected model, allowing teams to move fluidly from business definition to physical data and back again when challenged.
Instead of thinking “we have a catalogue, a glossary, and some lineage”, it’s more useful to think in terms of a single, connected graph of:
- concepts (Customer, Product, Exposure)
- business terms (Retail customer, SME customer, Vulnerable customer)
- physical objects (tables, columns, views, jobs, models)
- rules (precedence, SCD2 logic, matching logic, PIT algorithms)
- owners and responsibilities
5.1 The Minimum Graph You Need
For each business-critical attribute or metric, you should be able to answer:
- What is it?
  – Business term, definition, data type, domain.
- Where is it physically stored?
  – Columns in Bronze/Silver/Gold; views; models.
- How is it computed or derived?
  – Transformations, SCD2 rules, precedence rules, aggregations.
- Which source systems does it come from?
  – Core banking, CRM, KYC, AML, GL, etc.
- Who owns it and can change it?
  – Business owner, data steward, technical owner.
- How has it changed over time?
  – Definition history, rule changes, structural changes.
If you cannot walk that path, you have a gap.
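One way to make “walking that path” testable is to hold the six answers as one connected record per metric and check for gaps programmatically. This is a deliberately simplified sketch; the metric name, fields, and values are illustrative, and in a real platform the record would be assembled by joining the catalogue, lineage store, rule tables, and ownership register.

```python
# A minimal "connected graph" record for one metric, joining glossary,
# physical storage, derivation rules, sources, and owners (illustrative).
METADATA = {
    "Retail Secured EAD": {
        "definition": "Exposure at default for retail secured lending.",
        "columns": ["gold.ead_report.ead"],
        "derivation": ["scd2:account", "precedence:collateral_valuation", "pit:as_known"],
        "sources": ["core_banking", "collateral_system"],
        "owners": {"business": "Head of Credit Risk", "technical": "Risk Data Platform lead"},
        "history": ["2024-03: collateral precedence rule updated"],
    }
}

def walk_the_graph(metric: str) -> list[str]:
    """Answer the six questions for one metric, or report the gaps."""
    record = METADATA.get(metric)
    if record is None:
        return [f"GAP: {metric} is not in the governance graph"]
    questions = ["definition", "columns", "derivation", "sources", "owners", "history"]
    return [f"GAP: {q}" for q in questions if not record.get(q)] or ["complete"]

status = walk_the_graph("Retail Secured EAD")
```

Running this check in CI for every Tier 1 metric turns “do we have a gap?” from an opinion into a failing build.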
5.2 Making Rules First-Class Metadata
Two particularly important classes of rules must be elevated to first-class objects in governance:
- Precedence rules
  - Represented as tables or configs:
    attribute_group, source_system, precedence_rank, effective_from, effective_to.
  - Linked to:
    - glossary terms (“Customer Address”, “Risk Rating”);
    - physical transformations;
    - pipeline code.
- Temporal rules
  - How SCD2 is applied (e.g., attribute-level vs row-level, compaction windows).
  - How PIT is computed (e.g., “as known” vs “as now known” definitions).
  - How late-arriving data is inserted or corrected.
These should not be ad-hoc code fragments. They should be documented, versioned, and visible in the governance environment.
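The “as known” vs “as now known” distinction can be sketched with a bitemporal toy model: each row carries both a business-validity interval and the date the platform learned of it. The customer, dates, and ratings are invented; real SCD2 tables would also close out superseded rows, which this sketch omits for brevity.

```python
from __future__ import annotations
from datetime import date

# Bitemporal rows: valid_from/valid_to describe the real world;
# known_from describes when the platform learned the fact (illustrative).
ROWS = [
    # (customer, risk_rating, valid_from, valid_to, known_from)
    ("C1", "Low",  date(2021, 1, 1), date(2021, 6, 30),   date(2021, 1, 2)),
    ("C1", "High", date(2021, 7, 1), date(9999, 12, 31),  date(2021, 9, 15)),  # late-arriving
]

def rating(customer: str, on: date, as_known_at: date | None = None) -> str | None:
    """'As now known' by default; pass as_known_at for 'as known at the time'."""
    for cust, value, valid_from, valid_to, known_from in ROWS:
        if cust == customer and valid_from <= on <= valid_to:
            if as_known_at is None or known_from <= as_known_at:
                return value
    return None

# On 1 Aug 2021 the customer was really High risk, but the platform only
# learned this on 15 Sep 2021 — so "as known on 31 Aug" returns nothing
# for that date, exposing the late-arriving correction.
now_known = rating("C1", date(2021, 8, 1))
then_known = rating("C1", date(2021, 8, 1), as_known_at=date(2021, 8, 31))
```

Capturing both timelines is what lets a firm answer “what did you believe on 31 March 2021?” without re-running pipelines from backups.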
6. Implementing This on Modern Lakehouse Platforms
While governance principles are platform-agnostic, their implementation is shaped by the capabilities and constraints of modern data platforms. This section illustrates how the governance model can be realised on common lakehouse architectures, showing how metadata, rules, and lineage can be anchored in real systems without turning governance into a parallel, disconnected process.
The specifics depend on technology, but a few patterns generalise well.
6.1 Databricks-Centric Example
- Unity Catalog as the technical catalogue and governance backbone.
- Table and column comments used to link to business terms and glossary URIs.
- View definitions in Silver/Gold tied to:
  - SCD2 functions in Bronze
  - precedence lookup tables
  - entity resolution tables.
- Delta table properties used to tag regulatory-critical tables (e.g. regulatory_critical = true).
- Integration with Collibra/Atlan/Alation for:
  - richer glossary
  - business process mapping
  - stewardship workflows.
Key point: Unity Catalog holds what exists and how it moves; external tools hold what it means and who owns it — but both are tied together via shared IDs and tags.
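As a sketch of how the tagging and comment linkage might be automated, the snippet below generates the DDL a governed deployment job could execute via spark.sql(...). The table name, column name, and glossary URI are illustrative assumptions; the statement shapes follow standard Databricks SQL (ALTER TABLE ... ALTER COLUMN ... COMMENT and SET TBLPROPERTIES), but verify against your runtime version.

```python
# Illustrative mapping from fully-qualified columns to glossary URIs.
GLOSSARY_LINKS = {
    "gold.ead_report.ead": "https://glossary.example.com/terms/retail-secured-ead",
}

def governance_ddl(table: str, column_links: dict[str, str], critical: bool) -> list[str]:
    """Generate DDL stamping glossary links and regulatory tags onto a table."""
    statements = []
    for column, uri in column_links.items():
        col_table, col_name = column.rsplit(".", 1)
        if col_table == table:
            # Embed the glossary URI in the column comment so the external
            # catalogue can harvest it.
            statements.append(
                f"ALTER TABLE {col_table} ALTER COLUMN {col_name} COMMENT 'Glossary: {uri}'"
            )
    if critical:
        statements.append(
            f"ALTER TABLE {table} SET TBLPROPERTIES ('regulatory_critical' = 'true')"
        )
    return statements

ddl = governance_ddl("gold.ead_report", GLOSSARY_LINKS, critical=True)
# In a Databricks job you would then run: for s in ddl: spark.sql(s)
```

Generating the DDL from the same config that feeds the enterprise catalogue keeps the two views of the world from drifting apart.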
6.2 Snowflake-Centric Example
- Snowflake provides:
  - objects (tables, views, tasks, streams)
  - column-level tagging and classification
  - masking policies and row access policies
- External catalogue (Atlan/Collibra/etc.) provides:
  - glossary and conceptual model
  - process mapping
  - stewardship workflows
- dbt (or similar) enforces:
  - model definitions
  - tests and documentation
  - links between SQL and business terms.
Key point: Snowflake Object Tags and dbt metadata are wired back into the enterprise catalogue, so each tagged column can be traced back to a business term, a rule set, and an owner.
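The “wired back” claim is checkable. Assuming you can export tagged columns from Snowflake (e.g. from its tag-reference metadata) and catalogue entries from the enterprise tool, a small reconciliation finds every tagged column that does not resolve to a complete glossary entry. All names and records below are illustrative.

```python
# Hypothetical export of tag references: column -> glossary term id.
TAGGED_COLUMNS = {"SILVER.EXPOSURE.EAD": "ead_term"}

# Hypothetical enterprise catalogue entries keyed by term id.
CATALOGUE = {"ead_term": {"term": "Exposure at Default", "owner": "Head of Credit Risk"}}

def untraceable(tagged: dict[str, str], catalogue: dict[str, dict]) -> list[str]:
    """Columns whose tag does not resolve to a term AND an owner."""
    gaps = []
    for column, term_id in tagged.items():
        entry = catalogue.get(term_id, {})
        if not entry.get("term") or not entry.get("owner"):
            gaps.append(column)
    return gaps

wiring_gaps = untraceable(TAGGED_COLUMNS, CATALOGUE)  # empty list means fully wired
```

Scheduling this reconciliation alongside dbt tests makes catalogue drift a pipeline failure rather than an audit finding.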
6.3 Common Non-Negotiables
Regardless of platform, a few practices are non-negotiable if you want governance that survives a PRA review:
- SCD2 Bronze tables registered and discoverable in the catalogue, with clear descriptions.
- Precedence and ER tables treated as core data assets, not “hidden config”.
- PIT views documented as such — not just random views called *_snapshot.
- Regulatory-critical metrics and tables tagged and subject to stricter controls.
- Lineage capturing both code-level flows (jobs, SQL) and business-level flows (domain, process).
7. Change Management, Evidence, and Auditability
Governance is ultimately judged over time, not at a single point. Regulators care deeply about how definitions, rules, and controls evolve, and whether those changes are understood and controlled. This section examines how governance must handle change, preserve evidence, and support auditability long after original design decisions were made.
Governance is not just structure; it’s also behaviour over time.
7.1 Change Management for Rules
Changes to:
- SCD2 logic (e.g., compaction window)
- precedence rules (e.g., KYC now outranks CRM for address)
- entity resolution logic (new match scores or models)
- PIT definitions
must be:
- proposed;
- impact-assessed (which reports/metrics are affected?);
- reviewed and approved;
- tested in a temporal way;
- deployed via controlled pipelines;
- and recorded for future evidence.
The record should include:
- who approved;
- when;
- rationale;
- links to tests and reconciliations.
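The record above can be sketched as a structured object in which approver, rationale, and test evidence are mandatory fields rather than free-text afterthoughts. The rule id, approver, and file paths are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class RuleChange:
    """A minimal, illustrative change record for a governed rule change."""
    rule_id: str
    description: str
    approved_by: str
    approved_on: str            # ISO date string
    rationale: str
    test_links: list = field(default_factory=list)

    def evidence_gaps(self) -> list:
        """Fields a reviewer would flag as missing evidence."""
        gaps = []
        if not self.approved_by:
            gaps.append("missing approver")
        if not self.rationale:
            gaps.append("missing rationale")
        if not self.test_links:
            gaps.append("no test or reconciliation evidence")
        return gaps

change = RuleChange(
    rule_id="precedence/customer_address",
    description="KYC now outranks CRM for address",
    approved_by="Data Governance Forum",
    approved_on="2023-01-01",
    rationale="KYC address is verified; CRM address is self-reported.",
    test_links=["tests/pit/address_backfill.md"],
)
```

Storing these records as rows in the platform (rather than tickets in a workflow tool) means the change history is queryable with the same tooling as the data it governs.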
7.2 Evidence Packs
For each major domain (Customer, Account, Product, etc.) it is worth assembling evidence packs that can be produced quickly during a review.
Typical contents:
- conceptual model diagrams
- glossary extracts for key terms
- lineage snapshots Raw → Bronze → Silver → Gold → report
- rule definitions for precedence, SCD2, and PIT
- change logs for rules and definitions
- sample PIT reconstructions for specific dates
- test results for key temporal scenarios (late data, backfill, restatement)
When done well, these packs transform a regulatory review from adversarial to collaborative.
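An evidence pack is only useful if it can be produced quickly and completely, and that is itself testable. This sketch checks a domain’s pack against the expected contents listed above; the section names and file paths are illustrative.

```python
# Required sections mirror the typical contents listed above.
REQUIRED_SECTIONS = {
    "conceptual_model", "glossary_extracts", "lineage_snapshots",
    "rule_definitions", "change_logs", "pit_reconstructions", "temporal_tests",
}

def missing_sections(pack: dict) -> set:
    """Return the evidence sections absent or empty in a domain's pack."""
    return {section for section in REQUIRED_SECTIONS if not pack.get(section)}

# A hypothetical, fully-populated pack for the Customer domain.
customer_pack = {s: f"evidence/customer/{s}.pdf" for s in REQUIRED_SECTIONS}
customer_gaps = missing_sections(customer_pack)  # empty set: review-ready
```

Running this check on a schedule, per domain, turns “could we survive a review tomorrow?” into a dashboard metric.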
8. Common Failure Modes (and How to Avoid Them)
Even well-intentioned governance programmes tend to fail in predictable ways. This section distils the most common patterns that undermine regulator confidence, explaining why they occur and how they can be avoided by treating governance as an operational control system rather than a documentation exercise.
8.1 The “Tool-Only” Governance Programme
Buying an expensive tool but treating it as an inventory, not a connected control system.
Fix:
Start from concrete regulatory questions and work backwards. Ensure that for a small set of Tier 1 metrics, the catalogue can answer who/what/where/how/when, end-to-end.
8.2 Glossary/Schema Disconnection
Glossary defines “Retail Customer”, but nobody knows which columns implement it.
Fix:
Enforce a discipline where new tables/views must link columns to glossary terms, and these links are reviewed in data design forums.
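The review-time discipline can be backed by an automated check: every governed column must link to a glossary term, and every Tier 1 term must be implemented by at least one column. The mappings below are illustrative assumptions.

```python
# Hypothetical column -> glossary term links harvested from the catalogue.
COLUMN_TO_TERM = {
    "gold.customer.segment": "Retail Customer",
    "gold.customer.vuln_flag": "Vulnerable Customer",
}
TIER1_TERMS = {"Retail Customer", "Vulnerable Customer"}

def link_gaps(column_to_term: dict, tier1_terms: set) -> dict:
    """Find columns without terms and Tier 1 terms without implementations."""
    unlinked_columns = {c for c, term in column_to_term.items() if not term}
    orphan_terms = tier1_terms - set(column_to_term.values())
    return {"unlinked_columns": unlinked_columns, "orphan_terms": orphan_terms}

report = link_gaps(COLUMN_TO_TERM, TIER1_TERMS)
```

Wiring this into the data design forum’s checklist (or CI) means a disconnected glossary is caught at review time, not during a regulatory visit.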
8.3 Rules Buried in SQL/Notebooks
Precedence, survivorship, SCD2, and PIT logic exist only as code.
Fix:
Elevate rules to configuration tables / metadata and reference them in code. Document them and put them under governance.
8.4 Unversioned Governance
Definitions, rules, and ownership change, but there is no temporal record.
Fix:
Version control for:
- glossary definitions;
- rule sets;
- ownership assignments.
Link versions to change requests, approvals, and test results.
8.5 Governance that Ignores Time
Catalogues show “current state” only — no view of how structures and rules have changed over time.
Fix:
Treat schema evolution, rule evolution, and definitional evolution as temporal data. Version them, and store them in your platform like any other domain.
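Treating definitional evolution as temporal data can look like this: each glossary version is an effective-dated row, so “what did this term mean on date D?” becomes a query rather than an archaeology exercise. The term and definition text are illustrative (FG21/1 is the FCA’s vulnerable-customer guidance).

```python
from __future__ import annotations
from datetime import date

DEFINITION_HISTORY = [
    # (term, version, definition, effective_from)
    ("Vulnerable Customer", 1, "Flagged by agent judgement.", date(2020, 1, 1)),
    ("Vulnerable Customer", 2, "Meets FCA FG21/1 vulnerability drivers.", date(2021, 7, 1)),
]

def definition_as_of(term: str, on: date) -> str | None:
    """The definition in force on a date (latest effective_from <= date)."""
    candidates = [
        (effective_from, definition)
        for t, _version, definition, effective_from in DEFINITION_HISTORY
        if t == term and effective_from <= on
    ]
    return max(candidates)[1] if candidates else None

old_meaning = definition_as_of("Vulnerable Customer", date(2021, 1, 1))
new_meaning = definition_as_of("Vulnerable Customer", date(2022, 1, 1))
```

The same effective-dated pattern applies unchanged to rule sets and ownership assignments, so all three can live in ordinary platform tables.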
9. Example End-to-End: A Risk Metric Under PRA Scrutiny
Abstract principles become meaningful when applied to a concrete regulatory scenario. This section walks through how a single, high-impact risk metric can be explained end-to-end under PRA scrutiny, demonstrating how semantics, lineage, rules, ownership, and evidence come together in practice.
To make this concrete, imagine a PRA review where you’re asked:
“Show us how you compute ‘Retail Secured Exposure at Default’ and demonstrate that it is reliable over time.”
A credible response might look like this:
- Meaning
  - Business glossary entry: “Retail Secured Exposure at Default (EAD)” with definition, regulatory references, and owner.
- Conceptual model
  - Diagram showing how EAD relates to Loan, Collateral, Customer, Product, and Account.
- Lineage
  - Technical lineage graph showing:
    - source systems (core banking, collateral system)
    - Raw/Base tables
    - Bronze SCD2 tables (Customer, Account, Product, Collateral)
    - Silver and Gold views
    - final EAD dataset and regulatory report.
- Rules
  - Precedence rules for key attributes (e.g., which system wins for collateral valuation).
  - SCD2 rules for relevant dimensions (Customer, Account, Collateral).
  - PIT logic for “as known at calculation date”.
- Ownership
  - Business owner: Head of Credit Risk.
  - Technical owner: Risk Data Platform lead.
  - Data steward: identified by name.
- Evidence
  - Test results showing PIT reconstruction for historic dates.
  - Reconciliation between legacy EDW EAD and lakehouse EAD for a sample period.
  - Change logs showing rule updates and their impact.
When you can walk through that confidently, you’re not just compliant — you’re operating a genuinely well-governed data platform.
10. Summary: A Practical Definition of “Good” Governance in 2026
Governance maturity is often discussed in abstract terms, but regulators judge outcomes, not intentions. This final section consolidates the article’s arguments into a pragmatic definition of what “good” governance looks like in UK Financial Services today — measured by explainability, evidence, and confidence under questioning.
In 2026, in UK Financial Services, good governance and metadata management is not defined by:
- the presence of a particular tool;
- adherence to a fashionable framework;
- or the number of lineage graphs in a slide deck.
It is defined by whether you can:
- clearly explain what your data and metrics mean;
- show where they come from and how they are transformed;
- demonstrate which rules decide their values, especially when sources conflict;
- reconstruct what you believed, about whom, on any date;
- and provide evidence that changes to definitions, rules, and data are controlled, tested, and understood.
A simple test:
If a PRA or FCA reviewer picks one critical metric or attribute and says,
“Walk me from the business definition all the way back to the source system row, including all rules and changes along the way,”
can you do it calmly, with facts, in under 60 minutes?
If yes, your governance is not just decorative — it’s operational and defensible.
If not, the good news is that you now know what to build.