A.I. Adoption Is Surging. Data Governance Is Not Keeping Up.

A few years after the launch of consumer A.I., companies are racing to build governance structures, appointing chief A.I. officers, drafting policies and formalizing oversight processes. The goal is to ensure that A.I. adoption delivers measurable value while minimizing operational, legal and reputational risks. But in the scramble to establish new frameworks, organizations are overlooking something fundamental: the state of their own data.

Most companies have been collecting transactional, operational and customer data for decades. In the A.I. era, how that existing data is managed will determine whether A.I. systems deliver meaningful returns or amplify existing weaknesses. 

The data foundations of A.I. governance

Public debate around A.I.’s data use has largely centered on model developers scraping the open internet, including social media, books and journalism, to train generative A.I. systems. These practices have provoked backlash on privacy and copyright grounds, exposing unresolved questions about what constitutes fair use in the digital age.

Less attention has been paid to how enterprises themselves are using data. Stanford's 2025 A.I. Index Report found that 88 percent of organizations are adopting A.I. in some capacity. These companies feed internal data into A.I. models to streamline operations and generate insights. But much of that data was not collected with A.I. deployment in mind. From a governance perspective, enterprise data is frequently incomplete, inconsistently labeled, poorly documented or inadequately protected where it contains personal or sensitive information.

This creates a structural gap. While companies invest in A.I. governance, many neglect the data foundations on which those systems depend. Based on our experience advising organizations on responsible A.I. and data management programs, the conclusion is clear: A.I. governance begins with data governance.

The risks of poorly governed data

When A.I. systems are built on weak data foundations, risk is inevitable. Start with reliability. A.I. systems fed incomplete or non-representative data will produce flawed outputs. Starbucks’ deployment of an A.I.-powered inventory tool illustrates the point: designed to automate stock counts and refills, the system was given inaccurate data. The result was inventory waste and product shortages, culminating in reduced sales. Instead of driving efficiency, the system introduced new costs.

Bias presents a second, more complex risk. A.I. models trained on datasets favoring certain groups will produce biased outputs. A 2025 Nature study of large language models trained on emergency department data found the tools were more likely to recommend invasive medical treatments to Black, LGBTQ+ and unhoused patients than to other groups, replicating biases embedded in the training data. Similar concerns are emerging across hiring, lending, insurance and law enforcement applications, where biased data can directly influence access to jobs, credit and public services. For businesses, adopting A.I. tools that produce biased outputs carries legal, financial and reputational consequences that are difficult, and expensive, to reverse.

Poor data governance also erodes transparency and accountability. Where training data, validation processes and model performance are poorly documented, organizations accumulate “documentation debt.” This debt limits their ability to explain how decisions are made, with knock-on effects for regulatory compliance, incident investigations and audits.

The risks extend further still. Repurposing data without a clear lawful basis can breach data protection laws. Weak data provenance controls increase the likelihood of inadvertently using protected intellectual property. Biased or incomplete datasets can create downstream human rights impacts, especially when automated systems influence employment, healthcare, financial access or housing. 

These risks are not isolated compliance failures. Rather, they are the structural consequences of treating data governance as secondary to A.I. deployment.

Making your data A.I.-ready

Unlike model development, which generally depends on external vendors, data governance sits firmly within an organization’s control. Companies looking to extract value from A.I. should start there.

The first step is building a comprehensive data inventory. Organizations need a clear record of what data they hold, where it originates and the legal basis for its use. This includes identifying what additional assessments—including privacy, legal or risk-related—are needed before the data can be repurposed for A.I. A well-executed inventory not only supports compliance but enables faster, more confident deployment of A.I. systems by reducing uncertainty around data quality and risk exposure.
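To make the inventory step concrete, here is a minimal sketch of what one inventory record might look like in code. The field names, example assets and the "ready once all assessments are done" rule are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass, field

# Illustrative data-inventory record; field names are assumptions,
# not a standard schema.
@dataclass
class DataAsset:
    name: str                 # what data the organization holds
    origin: str               # where the data originates
    legal_basis: str          # lawful basis for use (e.g. consent, contract)
    # Privacy, legal or risk assessments still outstanding before reuse:
    pending_assessments: list = field(default_factory=list)

    def ai_ready(self) -> bool:
        # An asset is ready for A.I. repurposing only once every
        # outstanding assessment has been completed.
        return not self.pending_assessments

# Hypothetical inventory entries for illustration only.
inventory = [
    DataAsset("customer_orders", "e-commerce platform", "contract"),
    DataAsset("support_emails", "helpdesk", "legitimate interest",
              pending_assessments=["privacy impact assessment"]),
]

# Only assets with no outstanding assessments qualify for A.I. use.
ready = [asset.name for asset in inventory if asset.ai_ready()]
```

Even a simple record like this captures the three questions an inventory must answer: what the data is, where it came from and on what legal basis it can be reused.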

Second, organizations should establish a data classification policy. Data assets should be categorized according to their sensitivity, value and regulatory obligations. The aim is to protect the confidentiality, integrity and availability of data used in A.I. systems while ensuring it meets both legal requirements and operational standards. Developing such a policy requires answering several deceptively simple but often overlooked questions: What data do we hold? How sensitive is it? What rules govern its use?
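A classification policy can be expressed as a simple lookup from category to handling rules. The tiers and rules below are illustrative assumptions; a real policy would be driven by the organization's regulatory obligations.

```python
# Minimal sketch of a data classification policy as a lookup table;
# the tiers and rules are assumptions, not drawn from any standard.
CLASSIFICATION_POLICY = {
    "public":       {"encryption_required": False, "ai_training_allowed": True},
    "internal":     {"encryption_required": True,  "ai_training_allowed": True},
    "confidential": {"encryption_required": True,  "ai_training_allowed": False},
    "restricted":   {"encryption_required": True,  "ai_training_allowed": False},
}

def may_train_on(classification: str) -> bool:
    """Answer 'what rules govern this data's use?' for A.I. training."""
    rules = CLASSIFICATION_POLICY.get(classification)
    if rules is None:
        # Unclassified data defaults to the most protective treatment.
        return False
    return rules["ai_training_allowed"]
```

Note the default: data that has not been classified is treated as if it were the most sensitive tier, which keeps unanswered questions from silently becoming training data.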

Third, roles and responsibilities must be clearly defined. Effective data governance depends on accountability. Data owners should be responsible for accuracy and classification, data custodians for secure storage and handling, and data users for appropriate application. Establishing these roles enables organizations to create safeguards when transitioning their data to A.I. systems.
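The accountability check described above can be sketched as a gate: an asset should not move into an A.I. system until all three roles are filled. The role names follow the text; the gating function and example assignments are assumptions for illustration.

```python
# Governance roles and their duties, as described in the text.
ROLES = {
    "owner": "accuracy and classification",
    "custodian": "secure storage and handling",
    "user": "appropriate application",
}

def roles_assigned(asset_roles: dict) -> bool:
    # Illustrative safeguard: block the transition of an asset into an
    # A.I. system until every accountability role has an assignee.
    return all(asset_roles.get(role) for role in ROLES)

# Hypothetical assignments for a single data asset.
complete = roles_assigned(
    {"owner": "Finance", "custodian": "IT", "user": "Analytics"}
)
incomplete = roles_assigned({"owner": "Finance"})
```

The point of the gate is that accountability gaps surface before deployment, not during an incident investigation afterward.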

Existing standards and legislation offer practical guidance. The EU A.I. Act sets baseline requirements for data quality and governance in A.I. systems. International standards such as ISO/IEC 42001 establish data-related guidelines for A.I. management systems, while ISO/IEC 27001 and ISO/IEC 38500 set broader information security and IT governance requirements. Even in less regulated markets, these frameworks offer a practical starting point for building internal governance maturity.

Data readiness is A.I. readiness

Business leaders should not just be asking whether their organizations are ready for A.I. They should also consider whether their data is. A.I. systems cannot compensate for weak data foundations. Without coherent, well-governed data, organizations risk investing in tools that amplify inefficiencies, introduce new liabilities and fail to deliver returns. 

Policymakers, experts and enterprises are still debating where the responsibility for A.I. governance should lie. But on the question of internal data quality, there is no ambiguity: accountability belongs to the organization. Businesses that treat data governance as a prerequisite are the ones best positioned to turn A.I. investment into competitive advantage, and to defend their decisions when scrutiny arrives.

Amelia Williams is a Senior Research Impact Officer at Trilateral Research with expertise in scientific communication at the intersection of emerging technologies, environmental issues, ethics, and policy. At Trilateral, she supports the development and implementation of research projects alongside policy, media and industry engagement.