The missing middle in GenAI: data quality, MDM, and governance that actually scale


Everyone agrees GenAI can change how work gets done. However, what does that really mean for people on the ground—finance business partners, marketers, supply chain managers? If you’re improving Order to Cash, Purchase to Pay, or Record to Report, you probably want GenAI to automate the dull work and surface issues early. But here’s the uncomfortable truth: without solid data foundations, GenAI rarely gets beyond a clever proof of concept. 

Michael presents a simple model of how key business use cases, key business processes and core data components interact to enable GenAI to drive business value.

Problem statement 

Organizations often jump into GenAI with exciting use cases while leaving Data Quality, Master Data Management (MDM), and Data Governance for “later.” Typically, Commercial teams cleanse customer records while Supply Chain teams standardize product definitions: finished goods in one place, bills of materials in another, and elsewhere other product derivatives such as parts, components, materials and services. Meanwhile, Data Quality initiatives and MDM programs run ad hoc, in silos, outside a single, unified Data Governance framework. The result? Different naming conventions, conflicting definitions, varying levels of granularity, and ultimately data models that produce inconsistent reporting across business functions and units.

Because LLMs return the most probable answer rather than a verified correct one, misalignment between Data Governance, Data Quality and MDM typically has the following consequences when deploying GenAI:

  • Magnified biases: customer segments or product lines that are over-represented or more completely recorded naturally propagate bias into LLM outputs.
  • Amplified hallucinations: conflicting and duplicated records cause LLMs to synthesize incompatible facts, producing ‘confidently wrong’ answers.
  • Spurious correlations: without standardized taxonomies, models more easily find irrelevant patterns and miss real ones, to the disadvantage of particular customer segments, product categories, service lines and so on.

Author’s perspective and expertise

Michael Norejko, Data Engineering Lead, Cloud & Digital, PwC Poland

Michael Norejko brings 15 years of experience building data and analytics capabilities, with a focus on aligning Data Quality, Master Data Management, and Data Governance initiatives within large digital transformation programmes. Successfully deploying LLMs depends as much on the availability of data and a consistent ontology that defines the business as it does on compute.

Observations and learnings from recent projects

GenAI only scales when Data Quality and Master Data Management fall within a single Data Governance framework rather than running as three separate initiatives. If master records are inconsistent and quality is managed in silos, GenAI will simply amplify the confusion, and do so faster.

Data foundations

To mitigate bias, hallucinations and spurious correlations in LLMs, ensure that master data and metadata are continuously harmonized across sources, standardized against a defined standard, and enriched where attributes are missing. The standards need to be provided as part of a single Data Governance framework shared across priority business functions, domains and processes. Consider the following points of interaction:

  • Data Governance to Data Quality: data standards and policies inform data quality rules, thresholds, metrics and actions, enabling continuous evaluation and driving improvement in data quality.
  • Master Data Management to Data Quality: golden records are created from duplicate master records, improving overall data quality and reducing errors in business reporting.
  • Data Governance to Master Data Management: data standards and policies inform master data management workflows to ensure accurate matching and merging, reducing both false positive and false negative matches.
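
The first interaction, from governance standards to measurable quality rules, can be sketched in a few lines of Python. This is a minimal illustration and not a reference implementation: the `QualityRule` class, the `CUSTOMER_STANDARD` list, the attribute names and the thresholds are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


# Hypothetical governance standard: each rule names an attribute, a validity
# check, and the minimum share of records that must pass (the threshold).
@dataclass(frozen=True)
class QualityRule:
    attribute: str
    is_valid: Callable[[object], bool]
    threshold: float  # minimum pass rate, between 0.0 and 1.0


def evaluate(records: list[dict], rules: list[QualityRule]) -> dict[str, dict]:
    """Return the pass rate and threshold status for each rule."""
    report = {}
    for rule in rules:
        passed = sum(1 for r in records if rule.is_valid(r.get(rule.attribute)))
        rate = passed / len(records) if records else 0.0
        report[rule.attribute] = {
            "pass_rate": rate,
            "meets_threshold": rate >= rule.threshold,
        }
    return report


# Illustrative standard for customer master data (all values are assumptions).
CUSTOMER_STANDARD = [
    QualityRule("customer_id", lambda v: isinstance(v, str) and v.startswith("C"), 1.0),
    QualityRule("country", lambda v: v in {"PL", "GB", "DE"}, 0.95),
]

records = [
    {"customer_id": "C001", "country": "GB"},
    {"customer_id": "C002", "country": "UK"},  # non-standard country code
    {"customer_id": None, "country": "PL"},    # missing identifier
    {"customer_id": "C004", "country": "PL"},
]

report = evaluate(records, CUSTOMER_STANDARD)
for attribute, result in report.items():
    print(attribute, result)
```

Keeping the standard as data rather than as logic scattered across scripts is the design choice that matters here: the same rule set can be evaluated by every function that touches customer records.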

As an example, let's look at this from the perspective of a Commercial Manager responsible for overseeing the Order-to-Cash process. A sales representative enters “Acme Ltd.”, a finance analyst sees “ACME” and a customer service agent uses “ACME Holdings UK.” Without MDM matching and merging logic and supporting Data Quality rules, GenAI may allocate contractually agreed discounts across three different entities that in reality are one and the same customer. With standardized customer names, IDs and address attributes, and master records harmonized across sources, the same LLM can reconcile invoices to purchase orders and identify missing discounts for that single customer.
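
A deliberately simplified Python sketch of the matching and merging step for this example follows. The `NOISE_TOKENS` list, the `match_key` and `merge` helpers, the source system IDs and the survivorship rule are all hypothetical; production MDM tools typically use richer, often probabilistic, matching.

```python
import re
from collections import defaultdict

# Assumed list of legal-form and descriptor tokens to ignore when matching.
# In practice this list would come from the governance standard, not from code.
NOISE_TOKENS = {"ltd", "limited", "holdings", "uk", "plc", "inc"}


def match_key(name: str) -> str:
    """Normalize a customer name into a comparison key."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    core = [t for t in tokens if t not in NOISE_TOKENS]
    return " ".join(core)


def merge(records: list[dict]) -> list[dict]:
    """Group records by match key and build one golden record per group."""
    groups = defaultdict(list)
    for rec in records:
        groups[match_key(rec["name"])].append(rec)

    golden = []
    for key, members in groups.items():
        # Survivorship rule (an assumption): keep the longest name for display.
        display = max((m["name"] for m in members), key=len)
        golden.append({
            "match_key": key,
            "name": display,
            "source_ids": sorted(m["source_id"] for m in members),
        })
    return golden


records = [
    {"source_id": "CRM-1", "name": "Acme Ltd."},
    {"source_id": "ERP-7", "name": "ACME"},
    {"source_id": "SVC-3", "name": "ACME Holdings UK"},
    {"source_id": "CRM-2", "name": "Globex Inc"},
]

golden_records = merge(records)
for rec in golden_records:
    print(rec)
```

Note that stripping tokens such as "holdings" is aggressive and could merge legally distinct entities, which is a false positive match. Deciding which tokens may be ignored is exactly the kind of rule that belongs in the shared governance standard.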

To support this practical example, research shows that deploying GenAI with robust Data Governance and supporting Master Data Management practices accelerates time-to-value by up to 40% (Gartner, 2025). Furthermore, organizations that invest in data governance and master data management see up to 3x the return on their data investments compared with those without governance.

A word of caution 

  • One-off cleansing exercises do not last: If you “fix” data without standards, deviations will return, and with GenAI they will return with greater variability.
  • Do not confuse activity with impact: A higher match rate is meaningless if duplicate products, services and vendors still slip through and the capture of missing discounts does not improve.
  • Avoid governance theatre: Policies that live in slide decks will not help. Embed rules in pipelines, scripts and integration endpoints so that they are applied automatically.
  • Watch out for the hidden costs: Inference, storage, and reprocessing can balloon if upstream data keeps changing. Stabilize master data first to avoid paying twice. 
  • Siloed wins can create enterprise losses: Local optimizations without shared standards often break cross-functional processes. 
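
The caution about activity versus impact can be made concrete by scoring proposed matches against a manually verified sample instead of counting how many matches were made. The sketch below is illustrative only: the `pair` and `match_quality` helpers and the vendor IDs are assumptions, and the labeled sample would in practice come from data stewards.

```python
def pair(a: str, b: str) -> frozenset:
    """Represent an unordered pair of record IDs."""
    return frozenset((a, b))


def match_quality(predicted: set, actual: set) -> dict:
    """Compare proposed duplicate pairs against verified duplicate pairs."""
    true_pos = len(predicted & actual)
    false_pos = len(predicted - actual)  # merged, but not real duplicates
    false_neg = len(actual - predicted)  # real duplicates that slipped through
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(actual) if actual else 0.0
    return {
        "false_positives": false_pos,
        "false_negatives": false_neg,
        "precision": precision,
        "recall": recall,
    }


# Pairs proposed by the matching workflow (hypothetical vendor IDs).
predicted = {pair("V-1", "V-2"), pair("V-3", "V-4"), pair("V-5", "V-6"), pair("V-7", "V-8")}

# Pairs confirmed as true duplicates by manual review (hypothetical).
actual = {pair("V-1", "V-2"), pair("V-3", "V-4"), pair("V-9", "V-10"), pair("V-11", "V-12")}

quality = match_quality(predicted, actual)
print(quality)
```

In this made-up sample the workflow proposes four matches, which looks productive, yet half of them are wrong and half of the real duplicates are missed. Tracking precision and recall alongside a business measure such as discounts recovered gives a more honest picture than match rate alone.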

Concluding point 

If you want GenAI to move beyond experiments, do not treat Data Quality, Master Data Management and Data Governance as an afterthought. Nor should you run them as separate initiatives; instead, consolidate all three into a single program that exploits the synergies between them.

Supporting perspectives 

For complementary viewpoints on turning GenAI into measurable results: 

  • Applying GenAI to Strategic Initiatives like Revenue Optimization by Adam Rogalewicz explores how to improve pricing, offers, and trade terms with strong data foundations. Read Adam’s perspective.
  • Automating and accelerating software development with AI agents by Wiktor Witkowski shows how engineering teams can compress design, testing, and delivery cycles without sacrificing quality. Read Wiktor’s perspective.
