Cerebras Launches CePO, Enabling Realtime Reasoning Capabilities for Llama AI Models
Published Dec 10, 2024
CePO significantly enhances Llama 3.3 70B’s reasoning capabilities, enabling it to outperform the larger Llama 3.1 405B model across key benchmarks including MATH, MMLU-Pro (Math), GPQA and CRUX. Business Wire

Cerebras Planning and Optimization (CePO) enables Llama 3.3 70B to outperform the flagship Llama 3.1 405B model and leading closed-source models
SUNNYVALE, Calif. & VANCOUVER, British Columbia — Today at NeurIPS 2024, Cerebras Systems, the pioneer in accelerating generative AI, announced CePO (Cerebras Planning and Optimization), a powerful framework that dramatically enhances the reasoning capabilities of Meta’s Llama family of models. Through sophisticated test-time computation techniques, CePO enables Llama 3.3-70B to outperform Llama 3.1 405B across challenging benchmarks while maintaining interactive speeds of 100 tokens per second – a first among test-time reasoning models.
The framework represents a significant breakthrough in making advanced reasoning capabilities accessible to the open-source AI community. While models like OpenAI o1 and Alibaba QwQ have demonstrated the power of additional computation at inference time, CePO brings these capabilities to Llama – the world’s most popular open-source LLM family.
“CePO represents a significant advancement in LLM reasoning capabilities,” said Ganesh Venkatesh, Head of Applied ML at Cerebras Systems. “By combining step-by-step reasoning with comparative analysis and structured outputs, we’ve enabled Llama 3.3-70B to surpass the performance of Llama 3.1-405B across multiple challenging benchmarks. Our results on MMLU-Pro (Math), GPQA, and CRUX demonstrate that sophisticated reasoning techniques can dramatically enhance model performance without requiring larger parameter counts.”
CePO’s effectiveness is demonstrated through its performance on challenging reasoning tasks that typically trip up even the most advanced AI models. In direct comparisons with GPT-4 Turbo and Claude 3.5 Sonnet, Llama 3.3-70B with CePO achieved comparable performance on the CRUX, LiveCodeBench, and GPQA benchmarks, while significantly outperforming them on MATH evaluations. The framework has also shown remarkable success on classic reasoning challenges such as the Strawberry Test and a modified Russian Roulette problem, demonstrating true reasoning capabilities rather than mere pattern matching.
The CePO reasoning framework achieves these improvements through an innovative four-stage pipeline:
1. Step-by-step planning for complex problem decomposition
2. Multiple execution paths to ensure solution robustness
3. Cross-execution analysis to identify and correct inconsistencies
4. Structured confidence scoring within a Best-of-N framework
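The four stages above can be sketched in code. The following is a minimal, hypothetical illustration of the control flow only, not Cerebras’ actual implementation: `cepo`, `llm`, and `toy_llm` are stand-in names, and `toy_llm` is a deterministic stub used purely to make the pipeline runnable.

```python
import itertools

def cepo(question, llm, n_plans=2, n_paths=2):
    # Stage 1: step-by-step planning for problem decomposition
    plans = [llm(f"PLAN: {question}") for _ in range(n_plans)]
    # Stage 2: multiple execution paths per plan for robustness
    candidates = [llm(f"EXECUTE: {p} | {question}")
                  for p in plans for _ in range(n_paths)]
    # Stage 3: cross-execution analysis to surface inconsistencies
    critique = llm("COMPARE: " + " || ".join(candidates))
    # Stage 4: structured confidence scoring, then Best-of-N selection
    scored = [(float(llm(f"SCORE ({critique}): {c}")), c) for c in candidates]
    return max(scored, key=lambda t: t[0])[1]

_calls = itertools.count()

def toy_llm(prompt):
    """Deterministic stand-in for a real model call, for illustration only."""
    if prompt.startswith("PLAN:"):
        return "compute the sum step by step"
    if prompt.startswith("EXECUTE:"):
        return ["4", "5", "4", "4"][next(_calls) % 4]
    if prompt.startswith("COMPARE:"):
        return "three of four paths agree on 4"
    if prompt.startswith("SCORE"):
        return "9" if prompt.endswith("4") else "2"
    return ""
```

With the stub, `cepo("What is 2+2?", toy_llm)` selects the highest-confidence candidate ("4"). The many calls per question in this sketch also illustrate why such test-time approaches consume far more output tokens than a single one-shot generation.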
CePO combines multiple reasoning techniques, generating several plans and checking its own work, and consumes 10-20x more output tokens than one-shot approaches. Thanks to Cerebras’ hardware optimization, however, it still achieves speeds of 100 tokens per second – comparable to best-in-class chat applications like GPT-4 Turbo and Claude 3.5 Sonnet.
“CePO’s ability to enhance reasoning capabilities while maintaining interactive speeds opens new possibilities for AI applications,” said Andrew Feldman, CEO and co-founder of Cerebras Systems. “By bringing these capabilities to the Llama family of models, we’re democratizing access to sophisticated reasoning techniques previously limited to closed commercial systems. This advancement will enable developers to build more sophisticated AI applications that require complex, multi-step reasoning in real-time scenarios.”
To accelerate innovation in AI reasoning capabilities, Cerebras will open source the CePO framework, enabling researchers and developers worldwide to build upon and enhance these breakthrough techniques. The company’s roadmap includes developing advanced prompting frameworks that leverage comparative reasoning, creating synthetic datasets optimized for inference-time computation, and building enhanced verification mechanisms for complex reasoning chains. For more information about CePO, read our technical blog post at cerebras.ai/blog/cepo.
About Cerebras Systems
Cerebras Systems is a team of pioneering computer architects, computer scientists, deep learning researchers, and engineers of all types. We have come together to accelerate generative AI by building from the ground up a new class of AI supercomputer. Our flagship product, the CS-3 system, is powered by the world’s largest and fastest commercially available AI processor, our Wafer-Scale Engine-3. CS-3s are quickly and easily clustered to form the largest AI supercomputers in the world, and they make placing models on those supercomputers simple by avoiding the complexity of distributed computing. Cerebras Inference delivers breakthrough inference speeds, empowering customers to create cutting-edge AI applications. Leading corporations, research institutions, and governments use Cerebras solutions to develop pathbreaking proprietary models and to train open-source models with millions of downloads. Cerebras solutions are available through the Cerebras Cloud and on premises. For further information, visit cerebras.ai or follow us on LinkedIn or X.
View source version on businesswire.com: https://www.businesswire.com/news/home/20241210664485/en/
Contacts
Media Contact
PR@zmcommunications.com