OpenAI’s new “Strawberry” AI model is drawing praise from industry observers for its reasoning capabilities, though they also note its limitations.
The company unveiled its latest AI model, dubbed “OpenAI o1” and nicknamed “Strawberry,” on Thursday (Sept. 12). The o1 model family, available in o1-preview and o1-mini versions, aims to advance artificial intelligence (AI) problem-solving and reasoning.
Scott Dylan, founder of NexaTech Ventures, a venture capital firm focused on AI, called the new model “an exciting leap forward in AI development.” He told PYMNTS that “the model’s ability to handle complex problems in fields like science, coding, and mathematics by spending more time thinking before responding sets it apart.”
Benchmarks and Early Performance
According to OpenAI, o1-preview ranked in the 89th percentile on competitive programming questions from Codeforces. In mathematics, it scored 83% on an International Mathematics Olympiad qualifying exam, compared to GPT-4o’s 13%.
Some early users reported mixed experiences, however, saying o1 doesn’t consistently outperform GPT-4o across all metrics. Others criticized its slower response times, which OpenAI attributes to more complex processing.
OpenAI Product Manager Joanne Jang addressed concerns on social media. “There’s a lot of o1 hype on my feed, so I’m worried that it might be setting the wrong expectations,” she wrote on X. Jang described o1 as “the first reasoning model that shines in really hard tasks” but cautioned it isn’t a “miracle model that does everything better than previous models.”
One area of interest is whether the model is a step toward artificial general intelligence (AGI), which refers to highly autonomous systems that outperform humans at most economically valuable work. Unlike narrow AI systems designed for specific tasks, AGI would possess human-like general intelligence and adaptability across various domains.
“While it’s not quite AGI, it’s a strong step in that direction,” Dylan said.
Explainable AI and Reasoning
Steve Wilson, chief product officer at the AI security company Exabeam, told PYMNTS he was impressed by o1’s ability to explain its reasoning. “The biggest takeaway from OpenAI’s o1 is its ability to explain its reasoning. The new o1 model uses step-by-step reasoning rather than relying solely on ‘next token’ logic,” he said.
Wilson provided an example: “I posed a riddle to o1, asking it ‘What has 18 legs and catches flies?’ It responded: a baseball team. A baseball team has nine players on the field (totaling 18 legs), and they catch ‘flies,’ which are fly balls hit by the opposing team.”
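For readers who want to run the same test, the sketch below shows one way to pose Wilson’s riddle to o1-preview through OpenAI’s Python SDK. This is a minimal illustration, not Wilson’s exact setup: it assumes the openai package is installed, an API key is set in the environment, and your account has access to the o1-preview model; the wording of the model’s answer will vary from run to run.

```python
# Minimal sketch: posing Wilson's riddle to o1-preview via the OpenAI Python SDK.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set in the
# environment; o1-preview availability depends on your account's access tier.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {"role": "user", "content": "What has 18 legs and catches flies?"}
    ],
)

# Print the model's final answer (the internal reasoning is not returned verbatim).
print(response.choices[0].message.content)
# Expected: something along the lines of "a baseball team," with an explanation
# that nine fielders have 18 legs and catch fly balls.
```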
He noted a new feature that shows users how o1 arrives at its conclusions. “This feels like a huge step forward! The concept of explainability has always been a huge topic and a major challenge for applications based on machine learning,” Wilson added.
Dylan sees significant potential in specific sectors: “Industries such as healthcare, legal tech and scientific research will see the greatest benefits.” He elaborated, “In healthcare, the model can help interpret complex genomics or protein data with far greater accuracy; in legal tech, its ability to analyze nuanced legal language could lead to more thorough contract reviews.”
The slower processing may challenge industries like customer service or real-time data analysis, where speed is essential, Dylan noted. “For tasks requiring precision, like medical diagnostics or complex legal cases, this model could be a game-changer,” he said.
Future Implications
Wilson underscored the significance of o1’s explainability feature. Explainability in AI refers to the ability of a system to provide clear, understandable reasons for its outputs or decisions. This feature lets users see how the AI model arrives at its conclusions, making the decision-making process more transparent.
“What’s exciting about my initial testing isn’t so much that it’s going to ‘score better on benchmarks’ but that it offers a level of ‘explainability’ that has never been present in production AI/LLM models,” he said.
Looking ahead, Wilson predicted, “When you start to combine these reasoning models with multi-modal vision models and voice interaction, we’re in for a radical shift in the next 12 months.”
OpenAI credits o1’s advancements to a novel reinforcement learning approach. This method teaches the model to spend more time analyzing problems before responding, similar to human reasoning processes.
Researchers and developers are now testing o1 to determine its capabilities and limitations. The release has reignited discussion about the current state and future of AI reasoning technology.
“The o1 model isn’t just an upgrade; it’s a shift toward more careful, calculated reasoning in AI, which will likely reshape how we solve real-world problems,” Dylan said.