Scott Zoldi, Chief Analytics Officer, FICO.
I can’t think of any technology in my professional lifetime that’s been as hyped—or as dissed—as generative artificial intelligence (GenAI).
As a data scientist who strongly believes in deploying all AI responsibly, I’ve been thinking hard about how to replicate the key benefits of Responsible AI (AI that is ethical, explainable, robust and auditable) to benefit its generative cousin. In fact, I have developed a research roadmap to achieve Responsible GenAI that is successfully progressing at my company, FICO, and is intended to be implemented at organizations in a wide range of industries.
The reality is companies need to have more control over the machine to trust and derive business value from GenAI. That’s what Responsible AI is all about. Control allows us to treat GenAI like a tool, not a magic box. The good news is that we can focus on achieving Responsible GenAI right now.
Introducing Focused Language Models
But first, a bit of history and context. Almost two years ago, the term "large language models" (LLMs) exploded into the popular lexicon.
These humanlike chatbots, which can produce what appear to be highly intelligent responses to user prompts, also have a well-known capacity to “hallucinate.” Because users have no control over the data used to train LLMs, or the data’s provenance, the potential for bias can be significant and magnified by the non-deterministic nature of GenAI technology. These limitations have made many organizations cautious about using LLMs and other GenAI technologies.
Small language models (SLMs) have cropped up as an industry response. Smaller and less complex than LLMs, they’re designed to efficiently perform specific language tasks and are built with fewer parameters and less training data. Like LLMs, SLMs are available from multiple providers and come with many of the same challenges as LLMs, although often at reduced risk.
My approach to achieving Responsible GenAI concentrates LLM applications further into a "focused language model" (FLM), a new concept in that SLM development is focused on a very narrow domain or task. A fine level of specificity ensures the appropriate data is chosen; later, you can painstakingly tune the model (“task tuning”) to further ensure its correctness.
The FLM approach is distinctly different from commercially available LLMs and SLMs, which offer no control over the data used to build the model and are not suited to task tuning. Both factors are crucial for preventing hallucinations and harm.
A Focused New Approach
The concept of a domain-specific GenAI language model isn’t new, but the focus of an FLM is.
Bloomberg, for example, trained a finance-specific LLM, but that model has 50 billion parameters. In contrast, an FLM would define the vocabulary of the problem domain, ensuring that data is aggressively filtered. An FLM would cover only a narrow problem domain, for instance credit risk, financial inclusion or payment card fraud, or any other domain where the utmost accuracy, reduced hallucinations and detailed data design are all required for trusted use.
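To make the idea of vocabulary-driven filtering concrete, here is a minimal sketch in Python. It is not FICO's method: the vocabulary, the coverage measure and the 0.2 threshold are all hypothetical choices made for illustration, and a real pipeline would rely on a vocabulary curated by domain experts and far richer filtering.

```python
import re

# Hypothetical domain vocabulary (an assumption for this sketch);
# in practice it would be curated by domain experts.
DOMAIN_VOCAB = {
    "credit", "risk", "loan", "default", "fraud",
    "payment", "card", "lender", "borrower", "score",
}


def domain_coverage(text, vocab=DOMAIN_VOCAB):
    """Return the fraction of word tokens in `text` that belong to `vocab`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return sum(token in vocab for token in tokens) / len(tokens)


def filter_corpus(docs, min_coverage=0.2, vocab=DOMAIN_VOCAB):
    """Keep only documents whose domain coverage meets the threshold.

    The 0.2 default is an arbitrary illustrative value, not a recommendation.
    """
    return [doc for doc in docs if domain_coverage(doc, vocab) >= min_coverage]


docs = [
    "The borrower missed a loan payment and the credit score fell.",
    "The team won the football match on Saturday.",
]
kept = filter_corpus(docs)  # only the first document passes the filter
```

The point of the sketch is only that a defined vocabulary gives data scientists an explicit, inspectable rule for what enters the training set.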
For many organizations, a go/no-go decision to invest in building an FLM is based on whether they could otherwise trust commercially available GenAI technology to solve the problem at hand. If full transparency and control of the data used to build the model are required for it to operate responsibly (that is, without hallucinations and other accuracy issues), then an FLM can be the right choice.
To recap, an FLM enables GenAI to be used responsibly because:
• It affords transparency into how a core domain-focused language model is built.
• On top of industry domain-focused language models, users can create task-specific focused language models with tight vocabulary and training contexts for the task at hand.
• Most importantly, the resulting FLM is accompanied by a trust score with every response, giving users a yardstick by which to measure its accuracy and, therefore, reliability.
How To Build Responsible GenAI
Let’s break down the three components of a purpose-built FLM:
Training Data
Organizations can safely train domain-specific FLMs using two types of trusted data: 1. open-source training data sets available from an emerging class of providers and 2. their internal data.
Data scientists should carefully vet and inspect data sets to ensure they are free from bias and contain correct information. (There are numerous examples of what can happen when data isn’t properly vetted.)
Training The FLM From Scratch
Building an FLM can be a straightforward exercise. It starts with a base language model that learns language and vocabulary bounded to certain domains (like finance and banking). The process of pre-training FLMs and then aligning them with enterprise-specific supervised tasks (task-specific FLMs) ensures full transparency and customization.
Additionally, developing a highly efficient, distributed storage and software stack, combined with robust experimentation tracking (such as with blockchain technology), can enable continuous optimization and faster iterations.
Trust Score
The FLM should be used in conjunction with a secondary analytic model that provides a trust score from 1 to 999. The score reflects the probability that the answer draws on the key contexts (such as product documentation) on which the task-specific FLM was trained.
In this way, users can use the score to decide if they will trust the FLM’s answer as being supported.
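As a rough illustration of how a probability could be expressed on a 1 to 999 scale, consider the Python sketch below. The secondary analytic model itself is not described here, so `support_probability` is a deliberately crude, hypothetical stand-in based on token overlap, and the linear mapping in `to_trust_score` is likewise an assumption rather than the actual scoring method.

```python
import re


def _tokens(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_probability(answer, contexts):
    """Crude stand-in for a trust model (an assumption, not the real one):
    the fraction of distinct answer tokens found in any context document."""
    answer_tokens = _tokens(answer)
    if not answer_tokens:
        return 0.0
    context_tokens = set()
    for context in contexts:
        context_tokens |= _tokens(context)
    return len(answer_tokens & context_tokens) / len(answer_tokens)


def to_trust_score(probability):
    """Map a probability in [0, 1] linearly onto the integers 1..999."""
    p = min(max(probability, 0.0), 1.0)  # clamp out-of-range inputs
    return 1 + round(p * 998)


def trust_score(answer, contexts):
    """Score an answer against the contexts the FLM was trained on."""
    return to_trust_score(support_probability(answer, contexts))


contexts = ["Refunds take five days to process."]
supported = trust_score("Refunds take five days", contexts)
unsupported = trust_score("Bananas are yellow", contexts)
```

A user could then set a threshold on the score, accepting answers above it and routing the rest to a human or to the source documentation.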
Why Context Is Everything
An FLM approach to achieving Responsible GenAI puts organizations back in control from square one. The data scientists building the model can create the organization’s own domain FLM, curtail contexts for task-specific FLMs and define reduced and relevant vocabularies, which then form the basis of the trust score.
Most importantly, these controls allow the data scientists building the FLM to have confidence in which context will be associated with each prompt, helping to ensure that the responses are human-like, understandable and validated through context-matching.
Finally, the trust model scores the answers based on their quality compared to a pre-determined context corpus.
A Framework For Innovation
When it comes to deploying GenAI in business use cases, innovation and responsible use don’t need to be at odds. FLMs provide a framework for Responsible GenAI to be implemented, audited and enforced—the springboard to trust.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.
{Categories} _Category: Takes{/Categories}
{URL}https://www.forbes.com/councils/forbestechcouncil/2024/10/17/responsible-genai-earning-trust-may-be-easier-than-you-think/{/URL}
{Author}Scott Zoldi, Forbes Councils Member{/Author}
{Image}https://imageio.forbes.com/specials-images/imageserve/66b0db520ab1dd064e4ddc66/0x0.jpg?format=jpg&height=600&width=1200&fit=bounds{/Image}
{Keywords}Innovation,/innovation,Innovation,/innovation,technology,standard{/Keywords}
{Source}POV{/Source}