PoCs in generative AI have proliferated, mainly in large companies. But few initiatives have been industrialized due to their complexity.
Hanan Ouazan, Partner and Generative AI lead at Artefact, takes stock of the situation and looks at ways to improve in this article by Christophe Auffray, journalist at LeMagIT.
“The adoption of generative artificial intelligence is far from uniform among French companies. Size remains a determining factor. But most major groups have taken action (strategy, PoC, etc.); only a few are still lagging behind today,” says Hanan Ouazan, Generative AI Lead at Artefact.
In contrast, VSEs and SMEs are more hesitant. “Not all organizations are on the same wavelength,” Hanan points out. “Large companies are the first to take up the issue. But that doesn’t mean the ‘big boys’ have successfully tamed the technology.”
The complexity of industrialization
“A year ago, we’d already anticipated the main difficulty of generative AI projects, namely their industrialization. […] A few years ago, a PoC in [classical] artificial intelligence took three to four months. For GenAI, two to three weeks may suffice. Industrialization, on the other hand, is a completely different story. Companies are realizing the limitations of PoC initiatives.”
This is compounded by the problem of adoption, especially for “embedded AI” solutions such as Copilots. Usage is not a given. What’s more, the ROI of these tools has yet to be determined.
In-house solutions, i.e., “made” as opposed to “bought”, are not immune to adoption issues either.
Despite these obstacles, some sectors are proving dynamic when it comes to generative AI, including retail and luxury goods, even if, for the latter, progress has been slower than initially expected, according to Artefact’s GenAI specialist. Banking and healthcare are said to be lagging further behind, largely due to cloud and compliance issues.
RAG, Search 2.0, Automation, Creativity: Four families and four maturities
Two-speed progress can also be observed at the sector level. According to Hanan, industry is ahead in the race for adoption.
But even among the most mature players, industrialization and scaling (in terms of adoption and technology) remain obstacles. Use cases focus on Retrieval-Augmented Generation (RAG) for document retrieval in “frequently neglected databases.”
Content indexing is aimed at both internal (for productivity) and external (through conversational interfaces) applications. Hanan cites the example of a manufacturer involved in a project to make its product catalog accessible to customers online – via a chatbot.
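As a rough illustration of that retrieve-then-generate pattern, the sketch below indexes a few catalog entries, retrieves the most relevant one for a customer question, and passes it to a model as context. The catalog lines, the toy bag-of-words similarity, and the call_llm() stub are placeholders, not the manufacturer’s actual implementation.

```python
# Illustrative sketch only: a toy RAG loop over a product catalog.
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy "embedding": word counts. A real system would use a vector model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def call_llm(prompt: str) -> str:
    # Stand-in for whatever model client the project actually uses.
    return "[model answer grounded in the retrieved catalog extract]"

catalog = [
    "Ref 1042: stainless steel valve, 2 inch, maximum pressure 16 bar",
    "Ref 2210: brass fitting, 1 inch, for drinking water networks",
]
index = [(doc, embed(doc)) for doc in catalog]

def answer(question: str, top_k: int = 1) -> str:
    query = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    context = "\n".join(doc for doc, _ in ranked[:top_k])
    prompt = f"Answer using only this catalog extract:\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)

print(answer("What is the maximum pressure of the 2 inch steel valve?"))
```

In production, the toy similarity would be replaced by a proper embedding model and a vector store, but the retrieve-then-generate structure stays the same.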
In retail, generative AI is being used to rethink search. Search 2.0, based on GenAI, would transform the search experience by letting users express a need in their own words.
Without naming the retailer, Artefact is collaborating on the design of a solution built around intent-based search. “Tomorrow, search will be part of an experience logic,” predicts the Generative AI Lead.
A third category of use cases is automation. This includes analyzing messages and transcripts from distributor call centers to identify irritants or problems.
“We collect all the calls, transcribe them, and then analyze them to create a dashboard. This view allows us to see, for example, that 3,000 calls are about a particular product that customers think is smaller than the image on the product sheet. You can go down to that level of detail.”
Previously, this kind of sentiment analysis was not effective enough. Now it can even detect irony.
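A minimal sketch of that kind of pipeline, assuming transcripts have already been produced by a speech-to-text step: each transcript is tagged with a topic and a sentiment, then counts are aggregated for a dashboard. The classify() stub stands in for an LLM call; the labels and transcripts are invented.

```python
# Illustrative sketch of the call-centre analysis: tag each transcript, then
# aggregate counts for a dashboard. classify() is a stand-in for an LLM call.
from collections import Counter

def classify(transcript: str) -> tuple[str, str]:
    # A real pipeline would prompt a model; this rule keeps the example runnable.
    if "smaller" in transcript.lower():
        return ("product smaller than the photo", "negative")
    return ("other", "neutral")

transcripts = [
    "The item looks much smaller than on the product page photo.",
    "I just wanted to check my delivery date.",
]

dashboard = Counter(classify(t) for t in transcripts)
for (topic, sentiment), count in dashboard.most_common():
    print(f"{topic:32s} {sentiment:9s} {count} call(s)")
```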
Automation also covers applications such as trend analysis on social networks or real-time data capture for back-office banking and insurance.
The fourth family is creativity. “This area is less advanced than I expected, but it’s starting to take off,” notes Hanan. Artefact works with clients to automate direct marketing, such as personalizing (“contextualizing”) SMS messages to generate subscriptions based on criteria such as the recipient’s device.
Image generation for advertising is also emerging. However, these uses are hampered by the legal uncertainty surrounding copyright and training data for image-generating AIs. The legal guarantees promised by certain publishers, including Adobe, could remove barriers in this area.
The applications of generative AI are now well formalized. However, the impact in terms of actual transformation needs to be qualified, as few projects are in production and even fewer are accessible to end users.
“There is an abundance of secure GPT-type tools. But it’s not the most difficult thing to develop. What’s more, since the applications are internal, the risks are limited.”
Cost, user experience, quality and change are key for industrialization
For Hanan Ouazan, the reason behind this 2024 assessment is clear: the complexity of industrialization. There are a number of reasons for this, not the least of which is cost: usage fees can escalate quickly.
However, not all companies have anticipated the ROI of their use cases, and LLM consumption generates expenses that can far exceed revenues, anticipated or not. And while the price per call of these solutions has dropped significantly over the past year, LLMs still make applications more complex and harder to maintain over time.
This cost issue can be addressed with “optimization logic” to reduce the cost of a chatbot conversation, for example. A chat exchange is billed according to the size of the question and the answer, and each new question resends the growing conversation history, increasing the cost of every subsequent query.
“For one industrial customer, we monitor every chatbot interaction. This allows us to measure costs and causes. For example, it could be that the chatbot is having difficulty identifying a product, leading to a multiplication of exchanges. Monitoring is essential to take corrective action.”
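The sketch below illustrates both points with made-up prices and token counts: because each turn resends the conversation history, billed input tokens (and therefore cost) grow with every question, which is exactly what per-interaction monitoring is meant to surface. It is an illustration, not Artefact’s tooling.

```python
# Made-up prices and token counts: the history is re-sent every turn, so billed
# input tokens (and cost) grow with each question. Logging every turn is the
# kind of per-interaction monitoring described above.
PRICE_PER_1K_INPUT = 0.01   # hypothetical $ per 1,000 prompt tokens
PRICE_PER_1K_OUTPUT = 0.03  # hypothetical $ per 1,000 completion tokens

def log_turn(history_tokens: int, question_tokens: int, answer_tokens: int) -> float:
    input_tokens = history_tokens + question_tokens  # the history is re-sent every turn
    cost = (input_tokens * PRICE_PER_1K_INPUT + answer_tokens * PRICE_PER_1K_OUTPUT) / 1000
    print(f"input={input_tokens:4d} output={answer_tokens:4d} cost=${cost:.4f}")
    return cost

history, total = 0, 0.0
for question_tokens, answer_tokens in [(40, 120), (30, 150), (25, 200)]:  # three turns
    total += log_turn(history, question_tokens, answer_tokens)
    history += question_tokens + answer_tokens  # the answer joins the history
print(f"conversation total: ${total:.4f}")
```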
User experience is also a critical parameter. To deliver it, latency must be kept low, which comes at a cost. On Azure, it may be necessary to subscribe to Provisioned Throughput Units (PTUs), “dedicated and expensive managed resources.”
No successful adoption without taking people into account
Hanan Ouazan also cites quality as a challenge for industrialization. Measuring and monitoring quality means “putting the right evaluation building blocks in place […] On certain sensitive user paths, we deploy an AI whose function is to validate the response of another AI. Real-time evaluation processes ensure that the quality of the generative AI’s responses is maintained,” he explains. “We also integrate feedback loops that leverage user responses to improve results.”
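A minimal sketch of that validation pattern follows, with generate(), judge(), and the feedback log stubbed out for illustration: a second model screens each answer before it is returned, and user feedback is stored for the improvement loop. None of the names or logic below come from Artefact’s implementation.

```python
# Minimal sketch of one AI validating another before the answer reaches the
# user, plus a feedback log. generate() and judge() are stubs, not a real API.
feedback_log: list[dict] = []

def generate(question: str) -> str:
    return "You can return the item within 30 days."  # stand-in first model

def judge(question: str, answer: str) -> bool:
    # In practice, a second LLM would check grounding, tone, and policy here.
    return "return" in answer.lower()

def respond(question: str) -> str:
    answer = generate(question)
    if not judge(question, answer):
        return "Let me hand this over to a human advisor."  # fallback on failed validation
    return answer

def record_feedback(question: str, answer: str, helpful: bool) -> None:
    # User feedback feeds the improvement loop mentioned above.
    feedback_log.append({"question": question, "answer": answer, "helpful": helpful})

reply = respond("How do I return my order?")
record_feedback("How do I return my order?", reply, helpful=True)
print(reply, feedback_log)
```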
Industrialization eventually encounters the human obstacle. Success at this stage requires process changes. For example, moving from a traditional chatbot to a version based on generative AI changes the customer service profession.
“Such change needs to be supported, especially in the process of validating chatbot responses. Part of the conversation can be entrusted to an AI for validation. And when humans intervene, it’s better to assist them to reduce processing time. And when a human annotates, the AI needs to learn from those annotations, starting a virtuous cycle.”
Hanan also calls for supporting user adoption and keeping user experience in mind. “Adoption should not be painful. This is the approach Artefact implemented in a project to improve data quality in a product database for a manufacturer. Individuals should not have to adapt to generative AI. It’s up to the AI to adapt to them, especially to the way they work.”
For example, Artefact rethought its product datasheet generation solution, which was originally designed to integrate directly into the PIM (Product Information Management system), a tool little appreciated by the employees on the project in question.
“Our method for generative AI projects is to select a business function and map its daily tasks, to distinguish those where AI will replace the human from those where it will modify the tool. The challenge is to automate what can be automated, particularly what is burdensome for the role, and to augment people wherever their working conditions can be improved. And this cannot be done without interacting with the target user. Otherwise, adoption will be nil.”
User training: Adjusting the slider
What about prompt training? For Hanan, it’s all about the user. Artefact trains employees in the CRAFT method (Context, Role, Action, Format, Target Audience), which defines the correct use of AI. For mature users, it is possible to train them in the operation of models and the tools that integrate them. However, some populations are still accustomed to the use of keywords.
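One possible reading of the CRAFT structure as a prompt template is sketched below; the wording and the example values are illustrative, not Artefact’s training material.

```python
# Illustrative CRAFT-style prompt template (Context, Role, Action, Format,
# Target audience); the field wording and example values are assumptions.
def craft_prompt(context: str, role: str, action: str, fmt: str, audience: str) -> str:
    return (
        f"Context: {context}\n"
        f"Role: You are {role}.\n"
        f"Action: {action}\n"
        f"Format: {fmt}\n"
        f"Target audience: {audience}"
    )

print(craft_prompt(
    context="Results of our Q3 customer-satisfaction survey are pasted below.",
    role="a customer-experience analyst",
    action="Summarise the three main irritants and suggest one fix for each.",
    fmt="A bulleted list of no more than 120 words.",
    audience="The head of customer service, non-technical.",
))
```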
The company is educating employees about the possibility of making queries in natural language. It has also introduced reformulation: when keywords (“puk code”) are entered, the tool rephrases them as “Are you looking for your PUK code?”
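A sketch of that reformulation step, with a deliberately crude keyword-detection rule and phrasing chosen purely for illustration:

```python
# Sketch of the reformulation step: if the query looks like bare keywords, the
# tool proposes a natural-language rephrasing before answering.
def looks_like_keywords(query: str) -> bool:
    return len(query.split()) <= 3 and not query.strip().endswith("?")

def reformulate(query: str) -> str:
    if looks_like_keywords(query):
        return f"Are you looking for your {query.strip()}?"
    return query

print(reformulate("puk code"))  # -> "Are you looking for your puk code?"
```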
“There is a slider and it needs to be adjusted according to the population for which the tool is intended. But I think as the capabilities of generative AI solutions improve, eventually there will be no need to prompt,” clarifies Hanan.
Two future projects in focus
In 2024, there are two main areas of focus in the field of generative AI. The first concerns adoption, identifying changes in occupations, careers, and skills.
The second is scaling up. This is being piloted at two levels: governance (prioritizing needs, ROI and steering initiatives) and the GenAI platform.
Hanan observes an increase in the number of RAG applications, a phenomenon driven by the many PoCs being carried out. As a result, companies will need to rationalize the building blocks used in their experiments.
“Tomorrow we’ll have to think of RAG as a data product. It has a place as a product in the enterprise, just as the LLM has a place as a product.”
“Many people thought GenAI was a magical thing that would free them from the constraints of the past. In reality, GenAI brings even more constraints,” cautions Artefact’s expert.
This evolution is part of a continuum. After DevOps, and then MLOps to deal specifically with AI, LLMOps is now emerging. “The safeguards defined with MLOps are still relevant. But we need to add to them to take into account costs, hallucinations, and the generative dimension of models,” concludes Hanan.
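As a hedged illustration of those extra safeguards, the sketch below screens each response for two things classic MLOps monitoring does not cover: per-call cost against a budget and a crude lexical proxy for grounding in the retrieved sources. Thresholds, names, and the grounding test are assumptions made for the example.

```python
# Hedged sketch of LLMOps-style checks on top of classic MLOps monitoring:
# per-call cost against a budget and a crude lexical proxy for grounding.
def llmops_alerts(response: str, sources: list[str], cost_usd: float,
                  max_cost_usd: float = 0.05) -> list[str]:
    alerts = []
    if cost_usd > max_cost_usd:
        alerts.append(f"cost ${cost_usd:.3f} above budget ${max_cost_usd:.2f}")
    content_words = [w for w in response.lower().split() if len(w) > 4]
    grounded = any(w in s.lower() for w in content_words for s in sources)
    if not grounded:
        alerts.append("possible hallucination: response shares no terms with the sources")
    return alerts

print(llmops_alerts("Delivery takes 48 hours.",
                    ["Orders are delivered within 48 hours."], 0.02))
```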