Agencies and brands, driven by strategic business decisions to adopt generative artificial intelligence, are increasingly turning to small language models (SLMs) for more task-driven solutions.
“As we work with clients, we plan to use [SLMs] because the data set [to train] is smaller, and its tasks are defined to a particular brand’s needs,” said Michael Olaye, senior vice president and managing director of strategy and innovation at R/GA, which began testing SLMs in early January.
Interest in SLMs bubbled up last November when Microsoft announced the launch of its own SLM, Phi-2. In its latest earnings call, Microsoft revealed that customers including Anker, Ashley, AT&T, EY and Thomson Reuters are exploring Phi for their AI applications.
The rise of SLMs indicates a shift from costly and resource-intensive large language models (LLMs) toward more efficient and adaptable alternatives, making it easier for agencies and brands to accomplish task-driven initiatives.
“The key takeaway for advertisers in 2024 is to be aware of [SLMs] as a developing gen AI area,” said Cristina Lawrence, executive vp of consumer and content experience at Razorfish. “If discoveries are made that reveal valuable use cases, they could enhance efficiency and reduce cost.”
Here’s what you need to know about SLMs.
What are SLMs?
SLMs are slimmed-down versions of LLMs that are easier to train on narrower data sets, produce fewer inappropriate responses and deliver more relevant outputs, all at lower cost.
“An LLM is trained on an expansive, broad set of publicly available data covering massive amounts of information,” said Lawrence. “But specializing an AI model in brand knowledge, or instructional data sets, can make the models more focused and deliver a more targeted user experience. It can also be costly to train an LLM with the processing power required, but when you tighten the scope of data, it becomes more accessible for companies to experiment with.”
For instance, a dedicated SLM could be used solely to generate dynamic creative assets in real time. This contrasts with multimodal LLMs like Microsoft’s Copilot, which are trained to perform multiple tasks, such as writing code or generating images from text.
There are a handful of SLMs in the market, including Microsoft’s Phi-2 and Orca 2 (which is built on Meta’s open-sourced Llama 2), Google’s T5-Small and BERT, and EleutherAI’s GPT-Neo, an open-source model patterned on OpenAI’s GPT-3.
These models can also run locally, such as on a mobile device, which is driving much of the interest around SLMs today, said Lawrence.
And while training an LLM can take months, sometimes years, an SLM can be trained in one week, according to Olaye.
What are its use cases?
AT&T began using SLMs late last year for simpler tasks that require less complex reasoning, such as subdocument summarization and classification within portions of its question-and-answer Ask AT&T chat applications for internal documents, said Mark Austin, the company’s vp of data science.
“While there is a cost savings, the main focus was for speed, which is important if you’re using it to build metadata, for example, across hundreds of thousands of documents,” said Austin.
While R/GA’s brands, limited by copyright and privacy concerns, have yet to explore SLMs for consumer-facing campaigns, some are using the technology to streamline internal processes.
For example, one brand used an SLM-powered chatbot, trained on a small set of that brand’s assets, to streamline its legal process in support of the rest of the business and third parties, according to Olaye, who wouldn’t share specifics about the brand.
“[Brands’] legal and business affairs team take a lot of calls from people asking, ‘Can I use this asset?’ ‘Is this the right copy?’” he said. “We went into a process of automating that. Now, the bot can bypass a lot of the questions that you normally pick up the phone to talk to legal about.”
What are the limitations?
The technology is still in its infancy. SLMs mitigate hallucinations to some degree, but hallucinations may still occur, albeit less frequently than with LLMs, said Olaye.
And while narrower data makes SLMs more specific, it also limits their breadth of information, which hinders their ability to execute complex tasks compared with multimodal LLMs.
“There’s a lot of unknown about SLMs and where exactly they fit,” said Lawrence.
Many SLMs are open-source, which raises concerns regarding data privacy and security and could hinder widespread adoption.
“Responsible AI use means understanding the risks and how to safely navigate them, and that includes only sharing information that is safe to share,” said Lawrence. “Just because a model is customized to train on specific data doesn’t mean it shouldn’t go through the same protections, so the same approach to responsible use should apply regardless of the model size.”
{URL}https://www.adweek.com/media/introducing-small-language-models-the-ad-industrys-latest-gen-ai-fix/{/URL}
{Author}Trishla Ostwal{/Author}