Why generative AI is just the latest technology getting in the way of smart data management
The problem persists, but don't blame large language models for it
Unsurprisingly, given how fast it spread, generative AI has already reached its peak of inflated expectations, according to Gartner’s Hype Cycle for Emerging Technologies.
From profit-making private enterprises to public agencies and third-sector entities, leaders seem fixated on the same question: How might we adopt Gen AI to gain productivity, reduce costs, shorten time to market, improve customer experience, and get ahead of the competition?
Hopefully soon we’ll get past the next phases of the hype cycle to arrive at Gen AI’s plateau of productivity. But that won’t solve a problem that for decades has plagued all kinds of organizations.
I’m talking about the well-known issue of accumulating big piles of data without achieving insights that lead to better choices and lower risks.
For a long time, studies have shown that companies that adopt a data-driven attitude toward decision making are on average more productive and profitable than their competitors. But too many organizations eager to follow that path fall into the trap of assuming they just need to collect and store high volumes of data to get there, when this couldn’t be further from the truth. So they stop their data strategy initiatives prematurely, long before they have achieved the agility needed to fulfill even a fraction of their emerging needs for data reporting and analytics.
Because I’m a consultant, my sample is naturally biased toward entities struggling to escape this “data rich, insight poor” state. But you can easily check if your organization is any different. Think of a question whose answer would drive smarter decisions and more accurate investments but isn’t routinely tracked in a dashboard.
Now, think about how fast you can get that answer:
If you are an executive at a manufacturing company about to make a consequential decision that requires understanding lead times, can you effortlessly get a list of the top factors causing shipping delays for each product?
If you are responsible for ensuring a safe working environment across multiple construction sites, can you quickly report how many workers got injured last year and where those injuries occurred?
If you manage marketing campaigns, can you pull up a current list of active customers for which an upsell opportunity exists?
Consider yourself very lucky if reliable answers to questions like these are at your fingertips. Give your organization extra points if different teams searching for the same information within their own systems would arrive at the same results.
In my decades working with numerous entities in the private, public, and third sectors, I can’t think of one case where leaders weren’t constantly having to make critical decisions based on partial and often conflicting information. If a senior executive requires a complete answer and has the luxury of waiting for it to be produced, managers will reach out to the data team. Then they’ll impatiently wait for analysts to piece together information from various disparate systems with data stored in incompatible formats, often requiring “reconstructive surgery” to get it into a state where it can finally be queried for answers.
With all attention now turned to Gen AI, the root causes of this problem continue to be left untouched:
Ambiguous and mutable data definitions. Definitions of what constitutes the “truth” are missing or nonstandardized, causing stakeholders to squander time and resources trying to navigate disparate interpretations of the data.
Vague or inconsistently applied data rules. Rules for aggregating, integrating, and transforming data are unclear, conflicting, or simply not followed, making it difficult or impossible to replicate transformations and leverage information across the organization.
Needless duplication of work. Complex data analyses, such as predictive modeling, that are relevant across the organization must be redone by different groups due to the lack of mechanisms to share the information through internal channels.
We all know the solution to this recurring problem. It requires investments in:
processes to optimize the extraction, standardization, storage, transformation, enrichment, modeling, and visualization of data, and
solutions to support the seamless information flow between departments and teams.
But with all eyes now on Gen AI, and other emerging technologies with “transformation potential” already vying for attention, why would a CIO or CTO feel motivated to allocate resources toward such mundane data management approaches?
And this is why I expect those data management initiatives, unglamorous but vital to high performance, to remain neglected. Ironically, as the hype subsides, the organizational impact of generative AI will depend largely on a robust data management strategy.
I’ll talk about why that’s the case in my next article.