Nvidia’s new microservices APIs promise to speed up AI development

By Mike Wheatley

Nvidia Corp. said today it’s adding a new, microservices-based layer to its popular Nvidia AI Enterprise platform, giving generative artificial intelligence model developers, platform providers and others the ability to run custom AI models in any location.

The new microservices were announced at the Nvidia GTC 2024 conference in San Jose alongside a major update to the Nvidia Edify platform, which is a multimodal architecture for visual generative AI workloads. Nvidia Edify gains 3D asset generation capabilities, along with more controls over AI image generation, the company said.

Generative AI microservices to simplify model deployment
Nvidia said the Nvidia NIM microservices, built atop its Nvidia CUDA platform, enable the optimized inference of popular generative AI models from both itself and its partner ecosystem. Besides NIM, it also announced that its accelerated software development kits, libraries and tools are now available as Nvidia CUDA-X microservices for tasks such as retrieval-augmented generation, guardrails, data processing, high-performance computing and more.

The company explained that NIM microservices are prebuilt containers for reducing the deployment time of inference software such as Triton Inference Server and TensorRT-LLM from weeks to a matter of minutes. The microservices come with industry-standard application programming interfaces for AI domains such as language and drug discovery, and make it simpler for developers to build AI applications that can leverage their own data, on any platform, including cloud servers, on-premises systems and even workstations and laptops.

The NIM microservices cater to Nvidia’s own catalog of models, as well as those from partners such as AI21 Labs Inc., Adept AI Labs Inc., Cohere Inc., Getty Images Holdings Inc. and Shutterstock Inc., plus open-source models from the likes of Meta Platforms Inc., Hugging Face Inc., Stability AI Ltd. and Google LLC.

The company said customers can access NIM microservices via the Nvidia AI Enterprise platform, as well as Microsoft Azure AI, Google Cloud Vertex AI, Google Kubernetes Engine and Amazon SageMaker, and integrate with AI frameworks, including LangChain, LlamaIndex and Deepset.

Early access customers include ServiceNow Inc., which is using NIM to develop and deploy a series of more cost-effective, domain-specific generative AI copilots. Others include companies such as Adobe Inc., CrowdStrike Holdings Inc., Getty Images, SAP SE and Shutterstock.

Nvidia co-founder and Chief Executive Jensen Huang said the NIM microservices can be thought of as the building blocks enterprises need to become AI companies. “Established enterprise platforms are sitting on a goldmine of data that can be transformed into generative AI copilots,” he explained.

CUDA-X microservices for RAG, data processing, guardrails, HPC
As for the CUDA-X microservices, they provide the building blocks for essential AI development tasks such as data preparation, customization and training. They include Nvidia Riva, a microservice for customized speech and translation AI, Nvidia cuOpt for routing optimization and Nvidia Earth-2 for high-resolution climate and weather forecasting.

Others include NeMo Retriever microservices, which make it simple to link AI applications with proprietary business data so they can generate more accurate and contextually relevant responses. These RAG capabilities enable organizations to feed more data to their copilots, chatbots and generative AI productivity tools, the company said.
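To make the retrieval-augmented generation idea concrete, here is a minimal Python sketch of the general pattern: find the most relevant piece of business data for a question, then place it in the prompt sent to a model. It is an illustration only, not the NeMo Retriever API. The sample documents and the function names (`tokenize`, `score`, `retrieve`, `build_prompt`) are hypothetical, and simple word overlap stands in for the embedding models and vector search a production system would likely use.

```python
import re
from collections import Counter

# Hypothetical stand-in for proprietary business data (an assumption for
# illustration; a real system would query an indexed document store).
DOCUMENTS = [
    "Refunds are processed within 14 days of receiving the returned item.",
    "Support is available by phone on weekdays from 9 a.m. to 5 p.m.",
    "Enterprise plans include single sign-on and audit logging.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, document):
    """Count the words shared by the query and the document."""
    query_words = Counter(tokenize(query))
    document_words = Counter(tokenize(document))
    return sum((query_words & document_words).values())


def retrieve(query, documents, k=1):
    """Return the k documents with the highest word overlap."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Combine the retrieved context and the question into one prompt."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("How long do refunds take?", DOCUMENTS))
```

The prompt produced by `build_prompt` would then be passed to whichever generative model the application uses. That step is omitted here because it depends on the deployment.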

The Nvidia CUDA-X microservices are also available in Nvidia AI Enterprise 5.0, and will be supported on public cloud infrastructure platforms and on-premises server systems, including Nvidia-certified systems from the likes of Dell Technologies Inc., Hewlett Packard Enterprise Co. and Lenovo Group Ltd. They’re also compatible with infrastructure software platforms such as VMware Inc.’s Private AI Foundation with Nvidia and Red Hat OpenShift. AI and machine learning ecosystem partners, including Anyscale Inc., Dataiku Inc. and Weights & Biases Inc., are also adding support for CUDA-X microservices.

Nvidia Edify enhances visual generative AI development
Nvidia said its Edify platform for visual generative AI models is being enhanced with new APIs that give users finer control over image and scene generation. The Edify AI models can be accessed as an API through Nvidia NIM or via Nvidia Picasso, which is an AI development foundry built on the Nvidia DGX Cloud platform.

One early customer is the livestreaming platform BeLive Studios Ltd., which has used Nvidia Picasso and Edify to create real-time generative AI that automates the creation of visual scenes.

Shutterstock is another early adopter, partnering with HP Inc. to demonstrate how Edify 3D can enhance customized 3D printing with various AI-generated designs. This, the company said, will enable designers to quickly iterate on new prototypes to aid in product design. Shutterstock has also created an Edify-powered tool to light 3D scenes using 360-degree HDRi environments generated from text and image prompts.

Meanwhile, Getty Images has announced new Edify-powered APIs to enhance its image generation AI tools. They include an API for inpainting, which enables users to add, remove or replace objects in an image, plus outpainting, which can be used to extend the creative canvas. Getty Images is also adding Edify-based APIs that provide more control over generative AI image output and the ability to fine-tune the Edify foundation models to a company’s brand and visual style.
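For readers unfamiliar with inpainting, the sketch below shows the basic idea in its simplest classical form: pixels marked by a mask are filled in from their surroundings. This is not how the Edify or Getty Images APIs work, since those use generative models guided by prompts. The `inpaint` function and the tiny grayscale grid are hypothetical, chosen only to show what "remove or replace a region" means at the pixel level.

```python
def inpaint(image, mask, iterations=20):
    """Fill masked pixels by repeatedly averaging their 4-connected neighbors.

    image: list of rows of grayscale values (floats).
    mask:  same shape, True where a pixel should be replaced.
    Assumes at least one pixel is left unmasked.
    """
    height, width = len(image), len(image[0])

    # Start every masked pixel at the mean of the known pixels.
    known = [image[y][x] for y in range(height) for x in range(width)
             if not mask[y][x]]
    fill = sum(known) / len(known)
    result = [[fill if mask[y][x] else image[y][x] for x in range(width)]
              for y in range(height)]

    # Smooth the masked region toward its surroundings.
    for _ in range(iterations):
        updated = [row[:] for row in result]
        for y in range(height):
            for x in range(width):
                if not mask[y][x]:
                    continue
                neighbors = [
                    result[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < height and 0 <= nx < width
                ]
                updated[y][x] = sum(neighbors) / len(neighbors)
        result = updated
    return result


if __name__ == "__main__":
    # A flat gray image with one unwanted bright pixel in the center.
    image = [[10.0, 10.0, 10.0], [10.0, 99.0, 10.0], [10.0, 10.0, 10.0]]
    mask = [[False, False, False], [False, True, False], [False, False, False]]
    for row in inpaint(image, mask):
        print(row)
```

Outpainting follows the same principle in reverse: the masked region lies outside the original image borders, so the canvas is extended rather than repaired.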

Other APIs offered by Getty deliver sketch, depth and segmentation features, allowing users to provide a sketch as a prompt, follow the composition of reference images with a depth map and segment parts of an image to add, remove or retouch any character or object.

The new Edify APIs are designed to give users much greater control and flexibility over the output of image-based generative AI tools, making them much more viable for creative design processes.

Images: Nvidia

{URL}https://siliconangle.com/2024/03/18/nvidias-new-microservices-apis-promise-speed-ai-development/{/URL}
{Author}Mike Wheatley{/Author}
{Source}SiliconANGLE{/Source}
