Agents Are The Future Of AI. Where Are The Startup Opportunities?

AI agents, popularized in science fiction works like the 2013 film "Her", are fast becoming a reality.
Source: Her

If you are wondering what the next great chapter in artificial intelligence will be, here is your answer.

“This seems like as good of a time as any to talk about how we view the future,” wrote OpenAI leaders Sam Altman and Greg Brockman recently. “Users will increasingly interact with systems – composed of many multimodal models plus tools – which can take actions on their behalf, rather than talking to a single model.”

This is as clear a description as any of the concept of “agents,” which has taken the field of artificial intelligence by storm over the past year.

Agents are AI systems that can act autonomously in pursuit of open-ended, loosely defined goals. This can involve making long-term plans, using “tools” (say, an internet browser), and dynamically trying new approaches in response to new information.

A concrete example will help illustrate the concept. Consider a system that automatically books your airfare for an upcoming trip, with no input required from you. To do this effectively, the agent would need to review your email or calendar to know when and where you are traveling; remember your travel preferences (aisle or window, red-eye or daytime flight); research and select the best flight for you; retrieve your personal and payment information; and use the airline’s booking system (e.g., via web browser or API) to buy your tickets.
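To make the steps in this booking example tangible, here is a minimal sketch in Python. Everything in it is hypothetical: the function names, the flight numbers and the prices are invented for illustration, and simple stubs stand in for the email, flight-search and payment integrations that a real agent would need.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical sketch: stubbed data stands in for real email, calendar,
# flight-search and booking integrations.


@dataclass
class Trip:
    origin: str
    destination: str
    date: str


@dataclass
class Flight:
    number: str
    price: float
    red_eye: bool


def read_calendar() -> Trip:
    # Stub: a real agent would parse the user's email or calendar.
    return Trip("SFO", "JFK", "2024-09-12")


def search_flights(trip: Trip) -> list[Flight]:
    # Stub: a real agent would call a flight-search API or drive a browser.
    return [
        Flight("XX100", 310.0, red_eye=True),
        Flight("XX200", 355.0, red_eye=False),
        Flight("XX300", 420.0, red_eye=False),
    ]


def choose_flight(flights: list[Flight], avoid_red_eye: bool) -> Flight:
    # Apply the remembered preference, then pick the cheapest remaining option.
    candidates = [f for f in flights if not (avoid_red_eye and f.red_eye)]
    return min(candidates or flights, key=lambda f: f.price)


def book(flight: Flight, seat: str) -> str:
    # Stub: a real agent would submit payment details to the airline.
    return f"Booked {flight.number} ({seat} seat) for ${flight.price:.2f}"


def run_booking_agent(preferences: dict) -> str:
    trip = read_calendar()
    flights = search_flights(trip)
    flight = choose_flight(flights, preferences["avoid_red_eye"])
    return book(flight, preferences["seat"])


print(run_booking_agent({"avoid_red_eye": True, "seat": "aisle"}))
# prints: Booked XX200 (aisle seat) for $355.00
```

The hard part in practice is everything the stubs hide: reading messy real-world data, recovering from failed steps and deciding when to ask the user for help.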

AI agents are the source of tremendous hype today, which can make it hard to separate signal from noise in this space. But it is important not to lose sight of the big picture here: agentic capabilities will define the next great wave of progress in AI.

In the words of Andrew Ng: “AI agent workflows will drive massive AI progress this year—perhaps even more than the next generation of foundation models. This is an important trend, and I urge everyone who works in AI to pay attention to it.”

Or, as Andrej Karpathy put it: “It’s very obvious that AGI will take the form of some kind of AI agent.”

This article will (1) explore the technological underpinnings of AI agents and then (2) canvass some of the most exciting young AI agent startups today. If you think today’s AI systems are powerful—buckle up for what’s coming.

Agents 101
Where did the concept of AI agents come from?

Unlike other breakthroughs in artificial intelligence like, say, the transformer or direct preference optimization (DPO), the idea of an AI agent cannot be traced back to one foundational paper or one particular research group. It is too general and expansive a concept for that.

Rather, over the past two years, AI practitioners have made a series of interrelated advances that have built upon one another to enable progressively more sophisticated autonomous behavior from AI systems.

The overarching theme of these advances has been to build structures and flows around the core intelligence of large language models (LLMs) that unlock the ability for AI to act autonomously.

A brief word on terminology before we proceed. “Agentic” is often used in AI circles as the adjective form of “agent.” We agree with Andrew Ng’s well-articulated point on this topic: the word “agentic” serves a helpful purpose by allowing for more nuance and flexibility when discussing this fast-moving technology. Rather than needing to classify a given AI system as either an agent or not an agent, it can be instructive to think of AI systems as having agent-like characteristics—being agentic—to varying degrees. This helps avoid semantic hairsplitting over whether a given AI system “counts” as an agent.

One seminal work that helped lay the groundwork for agents was the 2022 Google Brain paper that introduced the concept of “chain-of-thought prompting.” This paper showed that LLMs have the ability to break complex problems down into smaller intermediate steps and then to work through each step in succession to solve the overall problem.

Chain-of-thought prompting was not originally developed in pursuit of AI agents; the paper does not contemplate AI models interacting with the external world in any way. But chain-of-thought techniques significantly enhance LLMs’ multi-step reasoning and planning abilities, which lie at the heart of agentic behavior.

Perhaps the first research effort that explicitly aimed to combine LLMs’ ability to reason with their ability to act was ReAct in 2022, also from Google Brain. While the ReAct system broke important conceptual ground, its functionality was limited.
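The general pattern of interleaving reasoning with action can be sketched as a simple loop: the model produces a thought and an action, the system executes the action and feeds the resulting observation back in. The sketch below is an illustration of that pattern only, not the ReAct authors' implementation. The `scripted_model` function is a hypothetical stand-in for an LLM, and `lookup` is a hypothetical stand-in for a tool such as a search engine.

```python
from __future__ import annotations

# Hypothetical sketch of a reason-then-act loop. `scripted_model` stands in
# for an LLM; `lookup` stands in for a real tool such as a search engine.

FACTS = {"capital of France": "Paris"}


def lookup(query: str) -> str:
    return FACTS.get(query, "no result")


def scripted_model(question: str, history: list[tuple[str, str]]) -> dict:
    # A real system would prompt an LLM with the question and the history
    # of (thought, observation) pairs gathered so far.
    if not history:
        return {"thought": "I should look this up.",
                "action": "lookup", "input": question}
    last_observation = history[-1][1]
    return {"thought": "I have what I need.",
            "action": "finish", "input": last_observation}


def run_agent(question: str, max_steps: int = 5) -> str:
    history: list[tuple[str, str]] = []
    for _ in range(max_steps):
        step = scripted_model(question, history)
        if step["action"] == "finish":
            return step["input"]
        # Execute the chosen action and record what came back.
        observation = lookup(step["input"])
        history.append((step["thought"], observation))
    return "gave up"
```

Calling `run_agent("capital of France")` takes two passes through the loop: one to look up the answer and one to decide that the task is finished. The `max_steps` cap is a design choice that keeps a confused agent from looping forever.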

One essential ingredient for a capable agent is the ability to make use of external applications: browsing the internet, sending an email, making an online purchase, calling an Uber, building a website, updating a database, submitting a pull request, or any of the infinite number of other possible digital actions. In the realm of AI agents, this general capability is often referred to as “tool use.”

A landmark research effort on agentic tool use was Toolformer, published by Meta researchers in 2023. The Toolformer team fine-tuned a large language model to learn how and when to make API calls in order to leverage outside applications like a calculator, a calendar and a language translation program.
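One way to picture API-based tool use is a model that writes a tool call directly into its text, with a thin layer of software executing the call and splicing in the result. The sketch below assumes a made-up bracket syntax and a one-entry tool registry. It does not reproduce Toolformer's actual format or training method, and the text passed in would come from a model rather than being typed by hand.

```python
import ast
import operator
import re

# Hypothetical sketch: the bracket syntax and the tool registry are
# assumptions made for illustration.

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def calculator(expression: str) -> str:
    # Evaluate +, -, * and / on plain numbers without using eval().
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return str(ev(ast.parse(expression, mode="eval").body))


TOOLS = {"Calculator": calculator}

# Matches inline calls of the form [ToolName(argument)].
CALL = re.compile(r"\[(\w+)\((.*?)\)\]")


def execute_tool_calls(text: str) -> str:
    def replace(match):
        name, argument = match.group(1), match.group(2)
        tool = TOOLS.get(name)
        # Unknown tools are left untouched rather than guessed at.
        return tool(argument) if tool else match.group(0)
    return CALL.sub(replace, text)


print(execute_tool_calls("The order costs [Calculator(12*7)] dollars."))
# prints: The order costs 84 dollars.
```

The calculator walks the parsed expression instead of calling `eval()`, since text produced by a model should not be trusted to run as arbitrary code.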

More recent efforts, including Gorilla and Chain-of-Abstraction, have built on Toolformer’s API-based approach to enable more sophisticated and flexible forms of tool use.

Instead of a small number of hand-selected tools, the Gorilla approach makes it possible for AI agents to choose from a dynamic landscape of thousands or millions of different APIs. Chain-of-Abstraction, meanwhile, enables agents to create multi-step plans to use different tools in combination, including factoring in how the output from one tool might inform the input of another. Such big-picture planning about tool use unlocks more powerful and versatile agentic behavior.
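The idea that one tool's output can inform another tool's input can be illustrated with a small plan executor. This is a sketch under stated assumptions, not the Chain-of-Abstraction method itself: the plan format, the "$" reference convention and the two toy tools are all hypothetical, and a real agent would have a model generate the plan.

```python
from __future__ import annotations

# Hypothetical plan format: each step names a tool, and any argument that
# starts with "$" refers to the output of an earlier step.


def convert_currency(amount: float, rate: float) -> float:
    return amount * rate


def add(a: float, b: float) -> float:
    return a + b


TOOLS = {"convert": convert_currency, "add": add}


def run_plan(plan: list[dict]) -> dict:
    results: dict = {}
    for step in plan:
        # Resolve references to earlier outputs before calling the tool.
        args = [
            results[a[1:]] if isinstance(a, str) and a.startswith("$") else a
            for a in step["args"]
        ]
        results[step["id"]] = TOOLS[step["tool"]](*args)
    return results


plan = [
    {"id": "hotel_usd", "tool": "convert", "args": [200.0, 1.5]},
    {"id": "total", "tool": "add", "args": ["$hotel_usd", 50.0]},
]

print(run_plan(plan)["total"])
# prints: 350.0
```

Writing the whole plan up front, with placeholders for values that are not yet known, lets the system check the plan before any tool is actually called.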

One final component of agentic systems, which has emerged more recently and which shows tremendous promise, is the concept of multi-agent architectures.

The basic insight behind multi-agent architectures is that—as with humans—while a single AI agent acting alone can be useful, many AI agents working in concert can be far more powerful.

A popular open-source example of a multi-agent system is ChatDev, in which a group of AI agents work together to build software programs. Agents in the ChatDev system assume roles including CEO, CTO, software programmer, software reviewer and test engineer. Each agent focuses on its specific responsibilities (e.g., the CTO architects the overall system, the programmers translate it into code, the reviewer examines the code for bugs) while collaborating with one another in order to achieve the overall goal of building a software application.
A visual depiction of the ChatDev agent team at work.
"ChatDev: Communicative Agents for Software Development" (arXiv:2307.07924)

Intuitively, since all the agents are ultimately powered by the same source of intelligence (an LLM), it may seem unnecessary to create a multi-agent system and divide up roles in this way. In practice, however, multi-agent systems perform better than single-agent systems, especially in more complex settings. Why is this?

A big part of the answer is specialization and modularization. When an individual agent is prompted to focus on a specific subtask, it does a better job on that subtask than if one monolithic agent is prompted to complete the entire project. From the human developers’ perspective, too, a multi-agent framework is conceptually useful in that it decomposes a complex system into discrete modules that can be independently improved and evaluated.
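The modularity argument is easier to see in code. In the sketch below, each agent wraps the same underlying model with a different role prompt, and the team passes work down the line. The class and function names are hypothetical, `fake_llm` is a stand-in for a real model call, and none of this reflects how ChatDev or any particular framework is actually built.

```python
from __future__ import annotations

from typing import Callable


def fake_llm(role_prompt: str, message: str) -> str:
    # Stand-in for a real LLM call; it simply tags the message with the role.
    role = role_prompt.split(":")[0]
    return f"[{role}] handled: {message}"


class RoleAgent:
    def __init__(self, role: str, instructions: str,
                 llm: Callable[[str, str], str]):
        self.role = role
        # Same model for every agent; only the role prompt differs.
        self.prompt = f"{role}: {instructions}"
        self.llm = llm

    def act(self, message: str) -> str:
        return self.llm(self.prompt, message)


def run_team(agents: list[RoleAgent], task: str) -> list[str]:
    transcript = []
    message = task
    for agent in agents:
        # Each agent works on the previous agent's output.
        message = agent.act(message)
        transcript.append(message)
    return transcript


team = [
    RoleAgent("CTO", "design the system", fake_llm),
    RoleAgent("Programmer", "write the code", fake_llm),
    RoleAgent("Reviewer", "look for bugs", fake_llm),
]
```

Because each role is a separate object with its own prompt, a developer can swap out or evaluate the reviewer without touching the programmer, which is the independent-improvement benefit described above.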

The first widely used open-source framework for multi-agent orchestration was AutoGen. Others, including MetaGPT and LangChain’s LangGraph, have followed.

Multi-agent systems remain a nascent and fast-evolving technology area, with best practices still being formulated. What hierarchical relationships work best for groups of agents working together? How can agents best share information and learn from one another? When and how should new agents be generated on the fly in response to changing circumstances? What is the best way to manage computational needs as the number of agents in a system massively scales? Answers to these questions and more are being hashed out by AI builders in real time.

Where Are the Startup Opportunities in Agents Today?
Tomorrow’s leading AI applications will be agentic at their core. This will be one of the defining themes of artificial intelligence in the years ahead. This leaves the question: what are the most compelling opportunities for startups to pursue in this area today?

One common mental model in the world of early-stage technology is to categorize startups as either infrastructure companies or application companies. In a nutshell, infrastructure companies build the underlying tools and platforms that serve as enablers on top of which application companies build products for end customers.

Conventional wisdom has it that, in any new technology wave, opportunities at the infrastructure layer tend to precede opportunities at the application layer. It makes intuitive sense, after all, that the right infrastructure needs to be in place first in order to support the development of robust, mature, scalable applications. Venture capitalists have long been fond of the “picks and shovels” thesis. (“When everyone is looking for gold, it’s a good time to be in the pick and shovel business,” as a saying often attributed to Mark Twain goes.)

And there is certainly a lot of startup activity happening at the infrastructure layer for agents today. Startups have recently emerged to build tools for agents in areas like orchestration, memory, authentication and hosting.

Yet utilization of all this tooling remains very low, despite the fact that the number of agentic applications has surged in recent months.

In our view, it remains unclear how much space there is to build massive businesses that sit between the foundation model providers on one hand and agentic applications on the other.
The state of agent infrastructure startups today?
Source: X

Especially at this early stage of the technology’s life cycle, before product architectures have become standardized and interoperable, most of today’s agent-based products are powered by internally built tooling that is coupled tightly with the application. And as the underlying foundation models continue to advance, they will be able to handle more and more of the “heavy lifting” that agentic infrastructure would otherwise be designed to solve for. (Don’t be surprised if GPT-5 is natively agentic in its architecture and capabilities.)

For all of these reasons, we believe the biggest and most attractive market opportunities for agent startups are at the application layer. This is where the action is today.

Application-Layer Agent Startups
We will walk through a few specific application areas in which we see tremendous opportunity for agent startups today. But first, what general observations can we make about agent startups at the application layer and what makes them successful?

A few brief thoughts.

To begin with, fully horizontal, general-purpose agents do not work reliably. The technology is simply not there yet. In order to build an agentic product that can be deployed in production with customers today, it is essential to limit its degrees of freedom by customizing it for a specific end market or vertical.

End markets that are particularly conducive to being “agentized” (to invent a word) are those that involve structured, repeatable activities. Software engineering, sales development (the work of sales development representatives, or SDRs) and regulatory compliance are all examples of such functions. Though they involve very different activities, each of these functions consists of routine workflows with consistent patterns that can be learned and audited.

A second characteristic that makes an application area particularly attractive for the deployment of AI agents is the existence of what one might call a “natural human in the loop.”

Agent technology is not yet totally reliable. Edge cases abound. Some degree of human oversight can help make these systems “ready for primetime.” Yet it would be unscalable and uneconomical for an agent startup to employ people to manually check its system’s outputs.

Conveniently, some workflows already include a human who is in a position to review and approve an agent’s actions without much added friction.

Customer support is a good example. In any customer support interaction, there is always a human involved who can review and sign off on any major action: the customer herself. Depending on the system’s design, a human customer support manager can also function as an additional “human in the loop” for an AI agent. These humans’ input can help course-correct the agent and ensure a productive outcome.
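In software terms, a “natural human in the loop” amounts to an approval gate in front of consequential actions. The following is a minimal sketch of that idea. The `ProposedAction` type and the `approve` callback are hypothetical, and in a real product the callback would surface a confirmation prompt to the customer or to a support manager.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProposedAction:
    description: str
    high_stakes: bool


def execute_with_review(action: ProposedAction,
                        approve: Callable[[ProposedAction], bool]) -> str:
    # Low-stakes actions run automatically; high-stakes ones wait for a human.
    if action.high_stakes and not approve(action):
        return f"Cancelled: {action.description}"
    return f"Done: {action.description}"


refund = ProposedAction("refund $500 to the customer", high_stakes=True)
reset = ProposedAction("send a password reset link", high_stakes=False)

# Here the reviewer is simulated; a real reviewer would be a person.
print(execute_with_review(refund, approve=lambda a: False))
# prints: Cancelled: refund $500 to the customer
print(execute_with_review(reset, approve=lambda a: False))
# prints: Done: send a password reset link
```

Only the refund is routed to the reviewer. Keeping routine actions out of the review path is what makes the oversight cheap enough to be practical.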

It is worth making one final general point about why AI agents represent such a massive market opportunity.

Organizations spend far more on people than they do on software: on average, companies devote about 70% of their budgets to employees, compared to well under 10% for software products.

Agentic applications are such a revolutionary concept because they are not just another software product to enhance worker productivity; rather, they are workers themselves. For certain roles, they can do everything that an employee can do. This means that they will be able to command pricing more in line with an employee’s salary than with a software tool. This unlocks far greater pools of spend than were accessible to earlier generations of technology startups, translating to massive addressable markets.

And indeed, some of today’s leading agent startups are already having success tapping into customers’ hiring budgets as opposed to their IT budgets.

Without further ado, let us walk through a few specific application areas in which agentic AI startups are poised to create enormous value.

Customer Support
Customer support is an unglamorous but essential function for any business. It is also an enormous market: the global market size for contact centers (a useful proxy) was an estimated $332 billion in 2023, projected to grow to over $500 billion by 2030.

In many ways, customer support represents an archetypal end market for AI agents. It is a standardized, formulaic activity in which most types of customer requests (say, help with a forgotten password) occur over and over. And as noted above, it includes a “natural human in the loop”—the customer herself and/or a customer support manager—who can provide oversight and signoff before any high-stakes action is finalized.

For these reasons, customer support is one of the first areas in which agents are already in production and creating real value for enterprises today.

Fintech unicorn Klarna is a case in point. Earlier this year, Klarna announced that it had deployed an AI assistant powered by OpenAI to automate its customer service engagements. According to the company, this AI assistant has been able to handle two-thirds of all customer service requests (2.3 million conversations in its first month alone), automating the work of 700 full-time human reps and driving an estimated $40 million in added profit for the company this year.

A number of young startups have emerged to build AI customer support agents.

The most high-profile and well-capitalized of these startups is Sierra AI, which has raised over $100 million to date from blue-chip venture capital firms Benchmark and Sequoia. What sets Sierra apart? Its world-class founding team. Sierra CEO/cofounder Bret Taylor—former Salesforce co-CEO, former Facebook CTO, former board chairman at Twitter and current board chairman at OpenAI—is one of the most admired technology executives in the world.

Sierra’s AI customer support agents can respond in real time to customer queries; retrieve all necessary customer information by integrating with internal systems and calling the appropriate APIs; and take action when needed to satisfy a customer request (say, updating a customer’s address or canceling an international data plan).

Sierra plans to price its agents based on work completed rather than the more conventional software subscription model. As discussed above, this notion of charging for work rather than for software represents an important business model paradigm shift made possible by agents.

“We think outcome-based pricing is the future of software. I think with AI we finally have technology that isn’t just making us more productive but actually doing the job. It’s actually finishing the job,” said Taylor.

Two other promising startups building agentic solutions for customer support are Decagon and Maven AGI, both of which recently announced Series A rounds.

Maven claims that its agents can autonomously handle 93% of all customer questions while reducing resolution times by 60%.

Decagon, meanwhile, boasts an impressive list of early customers that includes Eventbrite, Rippling and Substack.

“Technology differentiation is an interesting question in this category,” said Decagon CEO/cofounder Jesse Zhang. “Everyone is using the same underlying AI models, whether it’s OpenAI’s models or open-source models like Llama. So the differentiator is in the infrastructure, the orchestration that you build around those models. Companies building agents today are basically building graphs, where each node in the graph is an API call or an LLM call or so on. We have our own views on the best way to architect that graph.”
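The graph framing Zhang describes can be sketched in a few lines. In this hypothetical example, each node is a function that updates a shared state and names the next node to run. The node names and the keyword-matching stub that stands in for an LLM call are invented for illustration and say nothing about how Decagon or any other company architects its own graph.

```python
# Hypothetical sketch: each node returns (updated state, name of next node).
# A next-node value of None means the graph has finished.


def classify(state):
    # Stand-in for an LLM call that labels the customer's request.
    is_address = "address" in state["message"].lower()
    intent = "address_change" if is_address else "other"
    next_node = "update_address" if is_address else "escalate"
    return {**state, "intent": intent}, next_node


def update_address(state):
    # Stand-in for an API call to an internal system.
    return {**state, "reply": "Your address has been updated."}, None


def escalate(state):
    return {**state, "reply": "Routing you to a human agent."}, None


GRAPH = {
    "classify": classify,
    "update_address": update_address,
    "escalate": escalate,
}


def run_graph(message, start="classify", max_steps=10):
    state = {"message": message}
    node = start
    for _ in range(max_steps):
        if node is None:
            break
        state, node = GRAPH[node](state)
    return state


print(run_graph("Please change my address")["reply"])
# prints: Your address has been updated.
```

Representing the workflow as a graph makes each decision point explicit, which in turn makes it possible to test, log and tune individual nodes.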

Regulatory Compliance
Companies spend many tens of billions of dollars each year to ensure that their decisions and activities are in compliance with all applicable regulations.

Regulatory requirements touch all facets of a company’s operations: what it communicates externally, how it crafts its internal company policies, how it executes business transactions, what data privacy measures it implements, what reporting and disclosures it carries out, how it handles its tax obligations, and so forth.

Compliance workflows are particularly well-suited to be handed off to AI agents, for a few reasons.

First, compliance work is highly structured, pattern-based and repeatable.

In addition, it is typical for compliance teams to consist of front-line analysts—responsible for flagging potential regulatory violations and suggesting remedies—together with managers who oversee and make final decisions on compliance actions. This presents an opportunity to slot in an AI agent while maintaining a “natural human in the loop”: the agent can substitute for the front-line analyst while the higher-level manager continues to provide human review before any high-stakes decision is finalized.

One prominent startup building AI agents for regulatory compliance is New York-based Norm Ai, which has raised nearly $40 million in recent months in two successive rounds led by Coatue.

Norm’s agentic system can review a company’s operations on an ongoing basis, identify when a certain activity is not in compliance with a certain regulation, and suggest remedial actions to ensure compliance.

Among the laws and regulations that Norm’s agents understand and support compliance for today are the Clean Air Act (213,796 words), the Affordable Care Act (371,810 words) and the Americans with Disabilities Act (22,481 words). Given the length and complexity of these laws, the ability to automatically analyze and apply them is compelling.

Another promising early-stage player in this category is Greenlite AI. In contrast to Norm, which seeks to build agents for the full range of compliance activities, Greenlite is initially focused specifically on Anti-Money Laundering and Know Your Customer (AML/KYC) operations. Greenlite’s agents can, for instance, automatically carry out routine investigations on companies by reviewing documents and searching the internet.

“Leading banks and fintech companies already trust our agents to automate AML workflows in production settings,” said Greenlite CEO/cofounder Will Lawrence. “The status quo is often to rely on offshore contract workers to complete these tasks. So using Greenlite means swapping out an outsourced worker sitting in a different country with our AI. And our AI brings tremendous advantages—in terms of cost, speed, accuracy and transparency.”

Data Science
One of the largest and most compelling application areas for agents is software development. There is enormous buzz around this use case today (for good reason), with companies like Cognition AI—recently valued at $2 billion less than six months after its founding—leading the way. Much has been written already about the opportunity for agents in software engineering.

A thematically analogous opportunity for agents that gets much less attention is data science.

Like software engineering, data science entails complex and highly paid yet structured and repeatable activities that agentic systems are well-suited to tackle.

Data science (or “predictive machine learning”) use cases can be found everywhere in enterprises today: for instance, personalization, demand forecasting, recommendation systems, dynamic pricing and fraud detection.

One exciting startup building agents for data science is Delphina. Founded by two long-time data science leaders from Uber, Delphina’s agents automate the full data science lifecycle: framing the problem, selecting and transforming data, carrying out feature engineering, training the model, and monitoring and improving the model after deployment.

As Delphina cofounders Jeremy Hermann and Duncan Gilchrist describe it: “Delphina’s agents can be thought of as junior data scientists. They take care of the time-consuming and routine elements of data science workflows, the way an entry-level data scientist might, freeing up human data scientists to spend more time on big-picture reflection and ideation.”

Personal Assistants
Let us end with perhaps the most obvious and clear-cut of all use cases for an AI agent: a personal assistant.

The concept of an AI personal assistant has featured in science fiction books and movies going back decades (think J.A.R.V.I.S. from Iron Man or Samantha from Her). Perhaps because it is so obvious—unoriginal, even—this use case has actually attracted less hype and activity from today’s agent-focused founders and investors than many of the other categories mentioned in this article.

Previous generations of startups have tried and failed to build software that could automate the work of an executive assistant or a personal helper. These products have always proven too brittle for the infinite variability of situations, communications and requests that occur in daily life.

The advent of large language models—and the agentic systems built around them—may finally bring the vision of a competent AI personal assistant within reach.

Compared to use cases like, say, customer support or compliance, building an AI agent that serves as a general-purpose personal assistant is a more unconstrained and open-ended undertaking. A key challenge for startups pursuing this vision, therefore, will be to find ways to put enough structure and boundaries around the problem space that their agents function reliably, while at the same time not limiting their flexibility so much that users get little value out of them.

One promising startup building an agent-powered personal assistant is Mindy.

Mindy describes itself as “everybody’s personal Chief of Staff.” Users can ask Mindy to, for instance, schedule a lunch and invite attendees; shop for a given item online; or carry out market research on a certain industry or company.

Mindy’s cofounders hail from the “PayPal Mafia,” which helps explain why Roelof Botha from Sequoia and Peter Thiel from Founders Fund—two of the PayPal Mafia’s leading members—led the company’s $6 million seed round earlier this year.

The Mindy agent lives in email, and users communicate with it the same way that they would communicate with a human assistant or colleague.

The Mindy team explained the logic behind this key design choice: “Email is the original Internet technology and is still the most ubiquitous tool used to communicate in the business world. Allowing users to cc Mindy to schedule a meeting or forward Mindy a document for summarization delivers the value of generative AI without having to leave their day-to-day workflow or having to learn how to ‘prompt.’ Over 4 billion people around the world have an email account.”

The asynchronous nature of email enables Mindy to carry out deeper research and analysis before responding to a user, rather than needing to produce an immediate response the way a chatbot like ChatGPT does. It also, conveniently, makes it easier to incorporate some degree of human review before Mindy responds.

The Mindy agent is available today for anyone to try out for free.

Another interesting startup in this category is Ario.

Ario is built specifically for consumers rather than for enterprise users. Ario helps with tasks such as managing your family’s calendar, coordinating your Amazon returns and building personalized itineraries for vacations.

In order to understand you, Ario starts by ingesting all of your data from all of the consumer applications you regularly use, from Instagram to Google Calendar to DoorDash to Fitbit. (The company emphasizes its commitment to data privacy and security.) It can then use all this context to proactively help you manage your life: for example, reminding you that your daughter’s birthday is coming up and suggesting personalized party ideas based on her current interests.

If personal assistant agents like Mindy and Ario actually work—and they need not work perfectly, just well enough to be useful—there is little doubt that they will be wildly successful products.

The big question is whether it is possible, with clever engineering, to harness today’s large language models to enable useful agentic behavior over such a wide-ranging and unconstrained set of topics and tasks. We will soon find out.

Looking Forward
These four categories are illustrative examples of promising application areas for agent startups today. But this is far from an exhaustive list.

From software engineering to revenue operations, from healthcare patient management to sales development representatives, from product analytics to data engineering, many other categories are similarly ripe to be transformed by AI agents.

And these are just the functions that agents are well-positioned to tackle today. As the underlying AI continues to improve at breathtaking speed, the set of human activities that can be handed off to agents will rapidly grow. How long will it be before an agentic system can fully automate the work of a lawyer? An investigative journalist? A policymaker? A venture capitalist? An AI researcher?

Agents are not just another overhyped AI buzzword. They are the inevitable future form factor for artificial intelligence systems. Before you know it, you will be interacting with many different agents on a daily basis.

Things are only going to get weirder and more magical from here.
Note: The author is a partner at Radical Ventures, which is an investor in Delphina.

{Categories} _Category: Takes{/Categories}
{URL}https://www.forbes.com/sites/robtoews/2024/07/09/agents-are-the-future-of-ai-where-are-the-startup-opportunities/{/URL}
{Author}Rob Toews, Contributor{/Author}
{Image}https://imageio.forbes.com/specials-images/imageserve/668c9e902efc4effa9ed4074/0x0.jpg?format=jpg&height=600&width=1200&fit=bounds{/Image}
{Keywords}AI,/ai,Innovation,/innovation,AI,/ai,Innovation,AI,standard{/Keywords}
{Source}POV{/Source}
