Horkan

a blog by Wayne Horkan


Governance & Metadata That Actually Survive a PRA Review


Turning catalogues, glossaries, and lineage tools into something regulators will actually trust.

Most Financial Services organisations invest heavily in data catalogues, glossaries, and lineage tools, yet still struggle to satisfy regulators when asked to explain where data comes from, what it means, and how decisions were made over time. This article focuses on governance and metadata that actually survive Prudential Regulation Authority (PRA) and Financial Conduct Authority (FCA) scrutiny. It sets out a practical model for connecting semantics, lineage, rules, and accountability, so regulated firms can move beyond decorative governance and confidently explain historical outcomes with evidence, not anecdotes.

Contents

  • 1. Introduction
  • 2. The Gap Between Tooling and Trust
  • 3. What Regulators Actually Care About
  • 4. The Four Pillars of Regulator-Defensible Governance
    • 4.1 Conceptual & Business Semantics
    • 4.2 Technical Lineage & Provenance
    • 4.3 Rules, Precedence & Temporal Logic
    • 4.4 Ownership, Controls & Change Management
  • 5. Designing a Governance & Metadata Model That Ties It All Together
    • 5.1 The Minimum Graph You Need
    • 5.2 Making Rules First-Class Metadata
  • 6. Implementing This on Modern Lakehouse Platforms
    • 6.1 Databricks-Centric Example
    • 6.2 Snowflake-Centric Example
    • 6.3 Common Non-Negotiables
  • 7. Change Management, Evidence, and Auditability
    • 7.1 Change Management for Rules
    • 7.2 Evidence Packs
  • 8. Common Failure Modes (and How to Avoid Them)
    • 8.1 The “Tool-Only” Governance Programme
    • 8.2 Glossary/Schema Disconnection
    • 8.3 Rules Buried in SQL/Notebooks
    • 8.4 Unversioned Governance
    • 8.5 Governance that Ignores Time
  • 9. Example End-to-End: A Risk Metric Under PRA Scrutiny
  • 10. Summary: A Practical Definition of “Good” Governance in 2026

1. Introduction

Governance and metadata sit at the boundary between data engineering and regulatory accountability. In Financial Services, they are expected to provide confidence that reported figures, customer decisions, and historical outcomes are not only correct, but explainable long after the fact. As platforms modernise and logic becomes increasingly temporal and rule-driven, traditional catalogue-centric approaches struggle to keep pace. This introduction sets the context for why governance must evolve from static documentation into an operational system that can withstand detailed regulatory scrutiny.

Most Financial Services organisations now “have” a data catalogue, a business glossary, lineage tooling, and a stack of governance processes. Yet when the PRA or FCA ask hard questions:

  • “Show me where this metric comes from.”
  • “Show me what you knew about this customer on 31 March 2021, and why.”
  • “Show me how your golden-source and precedence rules are enforced.”

— many platforms still fall back to slideware, tribal knowledge, and hastily written SQL.

This article is about governance and metadata that actually work under regulatory scrutiny. Not just a list of tools, but a practical pattern for connecting four things that matter in UK FS in 2025–2026:

  1. What the data means (glossary and conceptual model)
  2. Where the data comes from (lineage and provenance)
  3. How the data is decided (precedence, business rules, slowly changing dimension (SCD2) and point-in-time (PIT) logic)
  4. Who is accountable (ownership, controls, and change management)

The lens is simple: could you sit in a PRA review, walk through your governance story, and feel confident it will stand up when someone asks to drill down?

2. The Gap Between Tooling and Trust

Over the past decade, Financial Services firms have invested heavily in governance tooling. Catalogues are populated, glossaries exist, and lineage diagrams look impressive. Yet regulatory trust has not increased at the same pace. This section explores why the presence of governance artefacts does not automatically translate into confidence during regulatory scrutiny, and why many programmes fail precisely at the point where questions become specific and uncomfortable.

In 2025/2026, most banks, insurers, asset managers, and payment firms can say:

  • “We’ve got Unity Catalog / Purview / Collibra / Atlan / Alation / <insert tool>.”
  • “We’ve got a business glossary.”
  • “We’ve got lineage.”
  • “We’ve got owners.”

But when examined:

  • glossary terms don’t match actual fields;
  • lineage is technically impressive but semantically lightweight;
  • precedence rules and SCD2 logic live in notebooks and ETL code, not in metadata;
  • owners can’t see how changes propagate into regulatory reports;
  • PIT logic is implicit, undocumented, or different in each team.

The result is governance that looks fine in board packs but crumbles when a skilled regulator, internal audit, or model risk function asks detailed questions.

The good news: the gap is fixable.
The less comfortable news: it requires connecting governance to engineering — especially to SCD2, precedence, entity resolution, and PIT.

3. What Regulators Actually Care About

Governance frameworks often focus on internal maturity models, but regulators approach data very differently. They are not interested in how comprehensive a catalogue looks; they are interested in whether decisions can be explained, justified, and reproduced. This section reframes governance around the core questions regulators consistently ask, cutting through tooling and terminology to focus on what must be demonstrable in practice.

Tools and frameworks come and go. The underlying regulatory questions have stayed remarkably consistent:

  1. Meaning
    • What does this metric, attribute, or flag mean?
    • Who defined it? When? For what purpose?
  2. Lineage
    • Where did this data come from originally?
    • Which systems, feeds, and transformations are involved?
    • Can I see the journey from source to report?
  3. Control
    • Who owns this data/metric?
    • What processes prevent incorrect changes?
    • How do you detect and respond to quality issues?
  4. Temporal Fidelity
    • Can you reconstruct what you believed on a specific date?
    • Can you explain historical changes and corrections?
  5. Precedence & Entity Decisions
    • When sources conflict, how do you decide which one “wins”?
    • How are those rules implemented, governed, and changed?
  6. Evidence
    • Can you demonstrate all of the above with artefacts, not anecdotes?

If your governance stack can answer these, your tools and diagrams matter much less.
If it can’t, more tooling won’t help.

4. The Four Pillars of Regulator-Defensible Governance

To withstand scrutiny, governance must be structured around a small number of interlocking capabilities rather than a collection of disconnected processes. This section introduces four foundational pillars that consistently appear in governance models that hold up during PRA and FCA reviews, providing a practical lens for assessing whether governance is merely documented or genuinely operational.

A governance model that works in practice, not just in frameworks, tends to rest on four pillars.

4.1 Conceptual & Business Semantics

You need a conceptual model and business glossary that:

  • define key entities (Customer, Party, Account, Product, Contract, Transaction, Ledger Entry);
  • define relationships (owns, party_to, beneficial_owner_of, linked_to, guaranteed_by);
  • define measures (Exposure, EAD, LGD, PD, Impairment, Profitability, Risk Rating);
  • define flags and attributes (PEP, Sanctioned, KYC risk level, Vulnerable customer, Conduct segment).

Crucially:

  • these definitions must be linked to physical data structures, not just words in Confluence;
  • changes to definitions must be governed and versioned.

4.2 Technical Lineage & Provenance

You need a technical view of how data moves:

  • from source systems (core banking, CRM, KYC, payments, GL)
  • via Raw/Base/Bronze/Silver/Gold/Platinum
  • into reports, dashboards, models, APIs.

That lineage must be:

  • accurate enough to debug issues;
  • granular enough to see how a particular column is derived;
  • integrated with your CI/CD and pipeline tooling;
  • queryable — not just a visual graph.

But technical lineage on its own is not enough. It must be connected to semantics, rules, and owners.
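As a minimal sketch of what queryable lineage can mean in practice, column-level derivations can be held as edges and traversed with ordinary code rather than only drawn. Every table and column name below is hypothetical:

```python
# Illustrative sketch: column-level lineage stored as edges, so it can be
# queried rather than only visualised. All names here are hypothetical.

# Each edge: (downstream column, upstream column it is derived from)
LINEAGE_EDGES = [
    ("gold.ead_report.ead_amount", "silver.exposure.ead_amount"),
    ("silver.exposure.ead_amount", "bronze.loan_scd2.outstanding_balance"),
    ("silver.exposure.ead_amount", "bronze.collateral_scd2.valuation"),
    ("bronze.loan_scd2.outstanding_balance", "raw.core_banking.loan.balance"),
    ("bronze.collateral_scd2.valuation", "raw.collateral_system.valuation"),
]

def upstream_sources(column: str) -> set:
    """Walk the lineage graph back to columns with no further upstream."""
    direct = [up for down, up in LINEAGE_EDGES if down == column]
    if not direct:
        return {column}  # a root: an original source-system column
    sources = set()
    for up in direct:
        sources |= upstream_sources(up)
    return sources

print(sorted(upstream_sources("gold.ead_report.ead_amount")))
```

The same edge set answers both regulator directions: "where does this report figure come from?" and, inverted, "which reports does this source feed?".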

4.3 Rules, Precedence & Temporal Logic

This is the most neglected pillar.

Regulator-defensible governance must capture:

  • how SCD2 history is constructed (change detection, effective dating, compaction);
  • how late-arriving and out-of-order events are handled;
  • how entity resolution works (matching, survivorship, cluster logic);
  • how golden-source precedence applies (which system wins for which attribute, over which periods);
  • how PIT is reconstructed (“state as known” vs “state as now known”).

These rules must be represented as metadata:

  • precedence tables;
  • rules engines;
  • configuration stored alongside code;
  • documented algorithms.

If these live only in notebooks and ETL scripts, governance will fail under pressure.
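To make the contrast concrete, here is a minimal sketch of survivorship logic held as metadata rather than buried in ETL code; the sources, fields, and match scores are illustrative assumptions:

```python
# Illustrative survivorship sketch: when two matched records form one
# customer cluster, a documented rule (metadata, not hidden ETL code)
# decides which attribute value survives. All values are assumptions.
MATCHED_RECORDS = [
    {"source": "CRM", "name": "J. Smith",   "email": "js@example.com", "match_score": 0.91},
    {"source": "KYC", "name": "John Smith", "email": "",               "match_score": 0.97},
]

# Survivorship rule, held as configuration: prefer KYC for name.
SURVIVORSHIP = {"name": "KYC"}

def golden_value(attribute: str, records: list) -> str:
    """Apply the documented survivorship rule, then fall back to match score."""
    preferred = SURVIVORSHIP.get(attribute)
    for r in records:
        if r["source"] == preferred and r[attribute]:
            return r[attribute]
    # Fall back to the non-empty value from the highest-scoring match.
    candidates = [r for r in records if r[attribute]]
    return max(candidates, key=lambda r: r["match_score"])[attribute]

print(golden_value("name", MATCHED_RECORDS))   # John Smith
print(golden_value("email", MATCHED_RECORDS))  # js@example.com
```

Because the rule is data, it can be versioned, reviewed, and shown to a regulator without reading pipeline code.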

4.4 Ownership, Controls & Change Management

Each important entity, domain, and metric must have:

  • a clear business owner;
  • a clear technical owner;
  • defined change processes (design, review, approval, deployment);
  • controls and checks (data quality, threshold alerts, reconciliation);
  • a record of who changed what, when, and why.

Regulators don’t expect perfection. They do expect that people are accountable, that changes are managed, and that issues are visible and acted on.

5. Designing a Governance & Metadata Model That Ties It All Together

Having individual governance components is not enough; they must form a coherent system. This section focuses on how semantics, lineage, rules, and ownership can be represented as a single connected model, allowing teams to move fluidly from business definition to physical data and back again when challenged.

Instead of thinking “we have a catalogue, a glossary, and some lineage”, it’s more useful to think in terms of a single, connected graph of:

  • concepts (Customer, Product, Exposure)
  • business terms (Retail customer, SME customer, Vulnerable customer)
  • physical objects (tables, columns, views, jobs, models)
  • rules (precedence, SCD2 logic, matching logic, PIT algorithms)
  • owners and responsibilities

5.1 The Minimum Graph You Need

For each business-critical attribute or metric, you should be able to answer:

  • What is it?
    – Business term, definition, data type, domain.
  • Where is it physically stored?
    – Columns in Bronze/Silver/Gold; views; models.
  • How is it computed or derived?
    – Transformations, SCD2 rules, precedence rules, aggregations.
  • Which source systems does it come from?
    – Core banking, CRM, KYC, AML, GL, etc.
  • Who owns it and can change it?
    – Business owner, data steward, technical owner.
  • How has it changed over time?
    – Definition history, rule changes, structural changes.

If you cannot walk that path, you have a gap.
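A minimal sketch of that walk, with the graph reduced to plain records so the six questions become a simple completeness check; every name below is illustrative:

```python
# Hypothetical sketch of the "minimum graph" for one metric, showing that
# the six questions can be answered (or flagged as gaps) by traversal.

METRIC = {
    "term": "Retail Secured EAD",
    "definition": "Exposure at default for retail secured lending.",
    "physical": ["gold.ead_report.ead_amount"],
    "derived_by": ["rule:precedence.collateral_valuation", "rule:scd2.account"],
    "source_systems": ["core_banking", "collateral_system"],
    "owners": {"business": "Head of Credit Risk",
               "technical": "Risk Data Platform lead"},
    "history": [("2024-01-10", "Definition clarified after annual review")],
}

# One required key per question: what / where / how / from where / who / when.
REQUIRED = ["term", "definition", "physical", "derived_by",
            "source_systems", "owners", "history"]

def governance_gaps(metric: dict) -> list:
    """Return the questions this metric's metadata cannot yet answer."""
    return [k for k in REQUIRED if not metric.get(k)]

print(governance_gaps(METRIC))  # an empty list means the walk is complete
```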

5.2 Making Rules First-Class Metadata

Two particularly important classes of rules must be elevated to first-class objects in governance:

  1. Precedence rules
    • Represented as tables or configs:
      • attribute_group, source_system, precedence_rank, effective_from, effective_to.
    • Linked to:
      • glossary terms (“Customer Address”, “Risk Rating”);
      • physical transformations;
      • pipeline code.
  2. Temporal rules
    • How SCD2 is applied (e.g., attribute-level vs row-level, compaction windows).
    • How PIT is computed (e.g., “as known” vs “as now known” definitions).
    • How late-arriving data is inserted or corrected.

These should not be ad-hoc code fragments. They should be documented, versioned, and visible in the governance environment.
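As a hedged sketch, the precedence fields listed above can be held as versioned configuration rows and resolved by date rather than hard-coded in pipeline SQL; the systems, ranks, and dates are illustrative:

```python
from datetime import date

# Precedence rules as versioned configuration rows, using the fields named
# above. Systems, ranks, and dates are illustrative assumptions.
# (attribute_group, source_system, precedence_rank, effective_from, effective_to)
PRECEDENCE = [
    ("customer_address", "CRM", 1, date(2020, 1, 1), date(2023, 6, 30)),
    ("customer_address", "KYC", 2, date(2020, 1, 1), date(2023, 6, 30)),
    # From July 2023, KYC outranks CRM for address (rank 1 = highest).
    ("customer_address", "KYC", 1, date(2023, 7, 1), None),
    ("customer_address", "CRM", 2, date(2023, 7, 1), None),
]

def winning_source(attribute_group: str, as_of: date) -> str:
    """Return the source system that wins for an attribute on a given date."""
    live = [r for r in PRECEDENCE
            if r[0] == attribute_group
            and r[3] <= as_of
            and (r[4] is None or as_of <= r[4])]
    return min(live, key=lambda r: r[2])[1]

print(winning_source("customer_address", date(2022, 3, 31)))  # CRM
print(winning_source("customer_address", date(2024, 3, 31)))  # KYC
```

Because old rows are closed rather than overwritten, the same table answers both "which source wins today?" and "which source won when this historical figure was produced?".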

6. Implementing This on Modern Lakehouse Platforms

While governance principles are platform-agnostic, their implementation is shaped by the capabilities and constraints of modern data platforms. This section illustrates how the governance model can be realised on common lakehouse architectures, showing how metadata, rules, and lineage can be anchored in real systems without turning governance into a parallel, disconnected process.

The specifics depend on technology, but a few patterns generalise well.

6.1 Databricks-Centric Example

  • Unity Catalog as the technical catalogue and governance backbone.
  • Table and column comments used to link to business terms and glossary URIs.
  • View definitions in Silver/Gold tied to:
    • SCD2 functions in Bronze
    • precedence lookup tables
    • entity resolution tables.
  • Delta table properties used to tag regulatory-critical tables (e.g. regulatory_critical = true).
  • Integration with Collibra/Atlan/Alation for:
    • richer glossary
    • business process mapping
    • stewardship workflows.

Key point: Unity Catalog holds what exists and how it moves; external tools hold what it means and who owns it — but both are tied together via shared IDs and tags.

6.2 Snowflake-Centric Example

  • Snowflake provides:
    • objects (tables, views, tasks, streams)
    • column-level tagging and classification
    • masking policies and row access policies
  • External catalogue (Atlan/Collibra/etc.) provides:
    • glossary and conceptual model
    • process mapping
    • stewardship workflows
  • dbt (or similar) enforces:
    • model definitions
    • tests and documentation
    • links between SQL and business terms.

Key point: Snowflake Object Tags and dbt metadata are wired back into the enterprise catalogue, so each tagged column can be traced back to a business term, a rule set, and an owner.

6.3 Common Non-Negotiables

Regardless of platform, a few practices are non-negotiable if you want governance that survives a PRA review:

  • SCD2 Bronze tables registered and discoverable in the catalogue, with clear descriptions.
  • Precedence and entity resolution (ER) tables treated as core data assets, not “hidden config”.
  • PIT views documented as such — not just random views called *_snapshot.
  • Regulatory-critical metrics and tables tagged and subject to stricter controls.
  • Lineage capturing both code-level flows (jobs, SQL) and business-level flows (domain, process).
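A small illustration of treating those tags as a control rather than decoration: a check that every regulatory-critical table carries the metadata a review will ask for. The tag keys and table names are assumptions, not any platform's built-in schema:

```python
# Illustrative control check over catalogue metadata: tables tagged as
# regulatory-critical must carry an owner, a description, and a glossary
# link. Table names and tag keys are hypothetical.

CATALOG = {
    "gold.ead_report": {
        "regulatory_critical": True,
        "description": "Retail secured EAD, monthly regulatory feed.",
        "business_owner": "Head of Credit Risk",
        "glossary_term": "Retail Secured EAD",
    },
    "silver.exposure": {
        "regulatory_critical": True,
        "description": "Exposure facts joined to SCD2 dimensions.",
        "business_owner": "",            # missing: should fail the check
        "glossary_term": "Exposure",
    },
}

REQUIRED_TAGS = ["description", "business_owner", "glossary_term"]

def failing_tables(catalog: dict) -> list:
    """Regulatory-critical tables with missing required metadata."""
    return [name for name, tags in catalog.items()
            if tags.get("regulatory_critical")
            and any(not tags.get(t) for t in REQUIRED_TAGS)]

print(failing_tables(CATALOG))  # ['silver.exposure']
```

Run as a CI gate or scheduled job, a check like this stops "decorative" tagging from drifting silently out of date.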

7. Change Management, Evidence, and Auditability

Governance is ultimately judged over time, not at a single point. Regulators care deeply about how definitions, rules, and controls evolve, and whether those changes are understood and controlled. This section examines how governance must handle change, preserve evidence, and support auditability long after original design decisions were made.

Governance is not just structure; it’s also behaviour over time.

7.1 Change Management for Rules

Changes to:

  • SCD2 logic (e.g., compaction window)
  • precedence rules (e.g., KYC now outranks CRM for address)
  • entity resolution logic (new match scores or models)
  • PIT definitions

must be:

  • proposed;
  • impact-assessed (which reports/metrics are affected?);
  • reviewed and approved;
  • tested in a temporal way;
  • deployed via controlled pipelines;
  • and recorded for future evidence.

The record should include:

  • who approved;
  • when;
  • rationale;
  • links to tests and reconciliations.
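A minimal sketch of such a record as an append-only log; the field names and values are illustrative:

```python
from dataclasses import dataclass

# Minimal sketch of the change record described above, held as an
# append-only log. Field names and values are illustrative assumptions.
@dataclass(frozen=True)
class RuleChange:
    rule_id: str           # e.g. "precedence.customer_address"
    approved_by: str
    approved_on: str       # ISO date
    rationale: str
    evidence_links: tuple = ()  # tests, reconciliations

CHANGE_LOG = []

def record_change(change: RuleChange) -> None:
    """Append-only: changes are recorded, never edited in place."""
    CHANGE_LOG.append(change)

record_change(RuleChange(
    rule_id="precedence.customer_address",
    approved_by="Data Governance Forum",
    approved_on="2023-07-01",
    rationale="KYC address quality now exceeds CRM after remediation.",
    evidence_links=("tests/pit_address_2023.md", "recon/address_q2.md"),
))
print(len(CHANGE_LOG))  # 1
```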

7.2 Evidence Packs

For each major domain (Customer, Account, Product, etc.) it is worth assembling evidence packs that can be produced quickly during a review.

Typical contents:

  • conceptual model diagrams
  • glossary extracts for key terms
  • lineage snapshots Raw → Bronze → Silver → Gold → report
  • rule definitions for precedence, SCD2, and PIT
  • change logs for rules and definitions
  • sample PIT reconstructions for specific dates
  • test results for key temporal scenarios (late data, backfill, restatement)

When done well, these packs transform a regulatory review from adversarial to collaborative.
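One lightweight way to keep packs review-ready is a manifest check run before any submission; the artefact names and paths below are hypothetical:

```python
# Sketch of an evidence-pack manifest check: before a review, confirm each
# expected artefact exists for the domain. Names and paths are hypothetical.
EXPECTED_ARTEFACTS = [
    "conceptual_model.pdf",
    "glossary_extract.csv",
    "lineage_snapshot.json",
    "rule_definitions.md",
    "change_log.csv",
    "pit_reconstructions.md",
    "temporal_test_results.md",
]

def missing_artefacts(pack: dict) -> list:
    """Artefacts the pack should contain but does not."""
    return [a for a in EXPECTED_ARTEFACTS if a not in pack]

# A pack missing its temporal test results should be flagged.
customer_pack = {a: f"packs/customer/{a}" for a in EXPECTED_ARTEFACTS[:-1]}
print(missing_artefacts(customer_pack))  # ['temporal_test_results.md']
```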

8. Common Failure Modes (and How to Avoid Them)

Even well-intentioned governance programmes tend to fail in predictable ways. This section distils the most common patterns that undermine regulator confidence, explaining why they occur and how they can be avoided by treating governance as an operational control system rather than a documentation exercise.

8.1 The “Tool-Only” Governance Programme

Buying an expensive tool but treating it as an inventory, not a connected control system.

Fix:
Start from concrete regulatory questions and work backwards. Ensure that for a small set of Tier 1 metrics, the catalogue can answer who/what/where/how/when, end-to-end.

8.2 Glossary/Schema Disconnection

Glossary defines “Retail Customer”, but nobody knows which columns implement it.

Fix:
Enforce a discipline where new tables/views must link columns to glossary terms, and these links are reviewed in data design forums.

8.3 Rules Buried in SQL/Notebooks

Precedence, survivorship, SCD2, and PIT logic exist only as code.

Fix:
Elevate rules to configuration tables / metadata and reference them in code. Document them and put them under governance.

8.4 Unversioned Governance

Definitions, rules, and ownership change, but there is no temporal record.

Fix:
Version control for:

  • glossary definitions;
  • rule sets;
  • ownership assignments.

Link versions to change requests, approvals, and test results.

8.5 Governance that Ignores Time

Catalogues show “current state” only — no view of how structures and rules have changed over time.

Fix:
Treat schema evolution, rule evolution, and definitional evolution as temporal data. Version them, and store them in your platform like any other domain.

9. Example End-to-End: A Risk Metric Under PRA Scrutiny

Abstract principles become meaningful when applied to a concrete regulatory scenario. This section walks through how a single, high-impact risk metric can be explained end-to-end under PRA scrutiny, demonstrating how semantics, lineage, rules, ownership, and evidence come together in practice.

To make this concrete, imagine a PRA review where you’re asked:

“Show us how you compute ‘Retail Secured Exposure at Default’ and demonstrate that it is reliable over time.”

A credible response might look like this:

  1. Meaning
    • Business glossary entry: “Retail Secured Exposure at Default (EAD)” with definition, regulatory references, and owner.
  2. Conceptual model
    • Diagram showing how EAD relates to Loan, Collateral, Customer, Product, and Account.
  3. Lineage
    • Technical lineage graph showing:
      • source systems (core banking, collateral system)
      • Raw/Base tables
      • Bronze SCD2 tables (Customer, Account, Product, Collateral)
      • Silver and Gold views
      • final EAD dataset and regulatory report.
  4. Rules
    • Precedence rules for key attributes (e.g., which system wins for collateral valuation).
    • SCD2 rules for relevant dimensions (Customer, Account, Collateral).
    • PIT logic for “as known at calculation date”.
  5. Ownership
    • Business owner: Head of Credit Risk.
    • Technical owner: Risk Data Platform lead.
    • Data steward: identified by name.
  6. Evidence
    • Test results showing PIT reconstruction for historic dates.
    • Reconciliation between legacy EDW EAD and lakehouse EAD for a sample period.
    • Change logs showing rule updates and their impact.

When you can walk through that confidently, you’re not just compliant — you’re operating a genuinely well-governed data platform.

10. Summary: A Practical Definition of “Good” Governance in 2026

Governance maturity is often discussed in abstract terms, but regulators judge outcomes, not intentions. This final section consolidates the article’s arguments into a pragmatic definition of what “good” governance looks like in UK Financial Services today — measured by explainability, evidence, and confidence under questioning.

In 2026, in UK Financial Services, good governance and metadata management is not defined by:

  • the presence of a particular tool;
  • adherence to a fashionable framework;
  • or the number of lineage graphs in a slide deck.

It is defined by whether you can:

  • clearly explain what your data and metrics mean;
  • show where they come from and how they are transformed;
  • demonstrate which rules decide their values, especially when sources conflict;
  • reconstruct what you believed, about whom, on any date;
  • and provide evidence that changes to definitions, rules, and data are controlled, tested, and understood.

A simple test:

If a PRA or FCA reviewer picks one critical metric or attribute and says,
“Walk me from the business definition all the way back to the source system row, including all rules and changes along the way,”
can you do it calmly, with facts, in under 60 minutes?

If yes, your governance is not just decorative — it’s operational and defensible.
If not, the good news is that you now know what to build.

This entry was posted in article and tagged business glossary, change management, data auditability, Data Governance, Data Lineage, data ownership, Data Platform Architecture, data provenance, Financial Services Data, lakehouse governance, Metadata Management, point in time reporting, PRA FCA regulation, Regulatory Compliance, regulatory evidence, SCD2, Temporal Data on December 20, 2025 by Wayne Horkan.
