Up until recently, most of the AI headlines have revolved around competitive jockeying to develop and improve LLMs and other AI models, along with NVIDIA’s rise to fame as the AI processor of choice. As the AI industry matures two years after the launch of ChatGPT, the tide is turning toward the end goal: putting AI to use, or inferencing. To do so with proper data governance, most enterprise IT organizations will need to rearchitect their data management.
Let’s take a closer look at why.
The Business Value of AI Inferencing
AI inferencing is not all the same. While some tasks, like writing social media copy, can be accomplished simply by opening your favorite generative AI application and typing in a prompt, the greater value for enterprises is leveraging internal data to customize outputs to business and customer needs. The business value of AI with corporate data is easy to see: analyzing chat records for customer sentiment and satisfaction trends, for example, or summarizing clinical notes from hospital patient encounters to surface disease treatment and prevention trends.
Why AI Inferencing Requires a New Data Architecture
AI inferencing is shaping up to be a massive market with massive IT infrastructure consequences. “Several sources, including companies in the technology sector such as Amazon or NVIDIA, estimate that inference can exceed the cost of training in pervasive systems and that inference accounts for up to 90% of the machine learning costs for deployed AI systems,” according to research published on ScienceDirect.
With so much time, cost, and energy being invested in AI inferencing, IT leaders will want to take a close look at their supporting technology stack, from data platforms to data center infrastructure (whether on-premises or in the cloud) to the laptops and phones that will be optimized for AI processing. They may need to upgrade these systems for efficiency, performance, and cost optimization to carry them into the future.
Breaking Down the New Requirements
Although AI inferencing is the next step after model development, its requirements are sufficiently different that it demands a new data paradigm – especially for unstructured data:
- Manage routine use of corporate data: First, let’s look at the high-level process behind AI inferencing, as explained by IBM Research: “During inference, an AI model goes to work on real-time data, comparing the user’s query with information processed during training and stored in its weights, or parameters. The response the model comes back with depends on the task, whether identifying spam, converting speech to text, or distilling a long document into key takeaways. The goal of AI inference is to calculate and output an actionable result.” In other words, AI inferencing requires organizations to augment what the model was trained on with relevant corporate data (the first sketch after this list illustrates the pattern).
- Must be easy for any employee, not just data scientists: Whereas model development is a highly specialized discipline for data scientists, AI inferencing targets any user and, therefore, must be easy to use and deploy. How can you make it easy for employees to find and feed the correct corporate data to the proper AI process? How can you ensure that the data workflow is captured for audits? AI model training is like building a power plant; very few have the specialized expertise and resources to build AI models. But AI inferencing should be as easy as consuming electricity: anyone should be able to flip the switch on or off for their purposes.
- Need to search and feed unstructured data for AI: Today, with AI and ML, it’s all about the unstructured data—the millions or billions of files such as documents, messages, images, video, and machine data spread across directories and shares in the enterprise. Because of its sheer size and its distribution across many data silos, moving this data into a single system is not viable, in either cost or time. Users rely on many AI tools rather than one giant data warehouse.
- Moving data first no longer works: Since unstructured data can easily reach petabytes and is expensive to move, the traditional extract, transform, and load (ETL) paradigm of first moving data into a data lake or data lakehouse and then operating on it no longer works. IT teams need a way to systematically access and manage unstructured data without moving it to one place. They also need automated tools to prepare their data: filtering out the deadwood, adding context and structure via metadata enrichment, and creating secure workflows to find and move the right data sets to the right tools. A unified global index provides a sensible starting point for data visibility and fast search across hybrid data estates (see the second sketch after this list).
- Automated data governance: In addition, AI data governance is becoming a paramount capability for data management. To avoid sensitive data leaking into AI tools, false outcomes, hallucinations, and lawsuits over copyright infringement, organizations will need ways to monitor and track data movement into AI tools. They need automated methods to locate sensitive data such as PII and move it to locations where employees cannot feed it to AI. By automating the workflows through which users access AI, organizations can ensure that data governance is tracked and enforced (the third sketch after this list shows the idea).
- A new data management model for AI: The emerging “database” layer for unstructured data is virtual and does not actually store all the data. We’ll see it evolve to encompass the above capabilities and more. It will need to work across all storage technologies and environments – from the data center to the edge to the cloud – and operate efficiently under the heavy weight of petabytes of file and object data.
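To make the first requirement concrete, here is a minimal Python sketch of the retrieval-augmented pattern described above: relevant corporate documents are retrieved and combined with the user’s query before anything reaches the model. The document names, the keyword-overlap scoring, and the prompt format are illustrative assumptions, not any particular product’s API.

```python
# Minimal sketch of augmenting an AI query with relevant corporate data
# (a retrieval-augmented pattern). The corpus, scoring, and prompt format
# are illustrative assumptions, not a specific product's API.

def retrieve_context(query: str, documents: dict[str, str], top_k: int = 3) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    terms = set(query.lower().split())
    scored = sorted(
        documents.items(),
        key=lambda item: len(terms & set(item[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:top_k]]

def build_augmented_prompt(query: str, documents: dict[str, str]) -> str:
    """Combine the user's question with retrieved corporate context."""
    context = "\n---\n".join(retrieve_context(query, documents))
    return (
        "Using only the context below, answer the question.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

if __name__ == "__main__":
    corpus = {  # stand-in for indexed corporate documents
        "chat_2024_q1.txt": "Customers praised support response times in Q1 chat transcripts.",
        "chat_2024_q2.txt": "Q2 chats show rising customer frustration with billing errors.",
        "handbook.txt": "Employee handbook: expense policy and travel guidelines.",
    }
    prompt = build_augmented_prompt("What are the customer sentiment trends?", corpus)
    print(prompt)  # this augmented prompt is what would be sent to the model
```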
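The second sketch shows the unified global index idea: catalog metadata about files in place across silos, then search that catalog, without moving the underlying data. The silo mount paths and the in-memory index are assumptions for illustration; a production index would be a persistent, distributed catalog.

```python
# Minimal sketch of a unified global index over unstructured data: record
# file metadata in place rather than moving the files. The silo paths and
# in-memory index are illustrative assumptions.

import os
import time

def index_silo(root: str, index: list[dict]) -> None:
    """Walk one storage silo and record metadata only -- the data stays put."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                stat = os.stat(path)
            except OSError:
                continue  # skip unreadable entries
            index.append({
                "path": path,
                "silo": root,
                "bytes": stat.st_size,
                "modified": time.ctime(stat.st_mtime),
                "type": os.path.splitext(name)[1].lower() or "none",
            })

def search(index: list[dict], file_type: str, min_bytes: int = 0) -> list[dict]:
    """Find candidate files for an AI workflow without touching the data."""
    return [e for e in index if e["type"] == file_type and e["bytes"] >= min_bytes]

if __name__ == "__main__":
    global_index: list[dict] = []
    for silo in ("/mnt/nas_share", "/mnt/cloud_cache"):  # hypothetical mounts
        if os.path.isdir(silo):
            index_silo(silo, global_index)
    # e.g., locate large PDFs to curate for an inferencing pipeline
    for entry in search(global_index, ".pdf", min_bytes=1_000_000):
        print(entry["path"], entry["bytes"])
```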
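Finally, the third sketch illustrates automated governance ahead of AI access: scan candidate files for PII patterns, release only the clean ones, and write an audit record for every decision. The regex patterns and file paths are simplified stand-ins for a real classification engine.

```python
# Minimal sketch of automated governance before data reaches an AI tool:
# scan candidate files for PII patterns, release only clean files, and
# write an audit record for each decision. Patterns and paths are
# simplified stand-ins for a real classification engine.

import json
import re
import time

PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def scan_for_pii(text: str) -> list[str]:
    """Return the PII categories detected in the text."""
    return [label for label, pattern in PII_PATTERNS.items() if pattern.search(text)]

def gate_for_ai(path: str, text: str, audit_log: list[dict]) -> bool:
    """Decide whether a file may be fed to an AI tool; log the decision."""
    findings = scan_for_pii(text)
    audit_log.append({
        "path": path,
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "pii_found": findings,
        "released_to_ai": not findings,
    })
    return not findings

if __name__ == "__main__":
    audit: list[dict] = []
    samples = {  # stand-ins for files surfaced by the global index
        "notes/board_meeting.txt": "Contact jane.doe@example.com about the merger.",
        "notes/product_faq.txt": "The widget ships in three colors.",
    }
    approved = [p for p, body in samples.items() if gate_for_ai(p, body, audit)]
    print("approved for AI:", approved)
    print(json.dumps(audit, indent=2))  # the audit trail for compliance review
```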
Critical Questions to Ask Before Adoption
Before investing in any new technology for AI, stakeholders must come together to determine their top goals and the restrictions they will need in place to manage risk. Here are some questions to consider:
- Which departments and/or use cases will deliver quick wins for learning opportunities with the lowest risk?
- What are the top long-term goals and expected outcomes from using AI, especially for using corporate data with AI?
- How do they want users to be able to interact with AI systems? Will there be specific AI tools authorized for use, or can users decide?
- What internal policies will be required for AI use, and how will they be enforced? For instance, an organization may want to dictate which AI tools can be used at work and/or the types of data that can be fed into systems. They may wish to restrict, for instance, meeting notes from executive board meetings, R&D sessions, or calls with top customers.
- What industry regulations must be followed when using corporate data with AI?
- How will outcomes or derivative works from AI be assessed for accuracy and legitimacy?
- How can derivative works be leveraged internally or externally?
- What AI governance and security framework will be needed, and what gaps in the security technology infrastructure will need to be filled?
- What systems can run in the cloud, and which should stay on-premises?
- How will users search, find, and feed corporate data to AI?
- How will the outcomes of AI be persisted?
When used correctly, AI inferencing can help organizations gain a significant competitive advantage. However, it also creates data risk, since corporate data must be exposed to AI tools. AI inferencing needs a new data architecture because the traditional model of loading data into a data warehouse or data lake and acting on it in one place no longer works. It requires efficient indexing, search, curation, and mobilization of unstructured data to AI, with automated auditing and tracking.