NVIDIA unveils NeMo Retriever for accurate AI application responses
NVIDIA has unveiled NeMo Retriever, a generative AI microservice that lets enterprises connect custom large language models (LLMs) to their own data in order to produce highly accurate responses for their AI applications.
NeMo Retriever is a new addition to the NVIDIA NeMo framework family, which is highly regarded for building, customising, and deploying generative AI models. It provides enterprise-grade retrieval-augmented generation (RAG) capabilities, deployed as a semantic-retrieval microservice, and uses NVIDIA-optimised algorithms to help generative AI applications deliver more accurate responses.
With NeMo Retriever, AI developers can connect their applications to business data wherever it is stored, from clouds to data centres. The microservice also enhances NVIDIA's RAG capabilities with AI foundries and is part of the NVIDIA AI Enterprise software platform, available in the AWS Marketplace.
Well-known firms, including Cadence, Dropbox, SAP and ServiceNow, are among the early adopters of this technology. They are collaborating with NVIDIA to incorporate production-ready RAG capabilities into their custom generative AI offerings.
"Generative AI applications with RAG capabilities are the next killer app of the enterprise," said Jensen Huang, the founder and CEO of NVIDIA. NeMo Retriever enables developers to create customised generative AI chatbots, copilots and summarisation tools that access business data and provide valuable AI intelligence, thereby transforming productivity across enterprises.
Among these early adopters is Cadence, a leader in electronic systems design that serves sectors ranging from hyperscale computing and 5G communications to automotive, mobile, aerospace, consumer and healthcare markets. The company has been using NeMo Retriever to build RAG features for generative AI applications in industrial electronics design.
Anirudh Devgan, the President and CEO of Cadence, explained how generative AI offers innovative ways to address customer needs. "Our researchers are working with NVIDIA to use NeMo Retriever to further boost the accuracy and relevance of generative AI applications. This will help to reveal issues early in the process and assist our customers in bringing high-quality products to market faster," he stated.
Unlike open-source RAG toolkits, NeMo Retriever supports production-ready generative AI with commercially viable models, API stability, security patches, and enterprise support. Its high-accuracy results come from the NVIDIA-optimised algorithms that power NeMo Retriever's embedding models, which capture relationships between words and allow LLMs to process and analyse textual data.
Enterprises using NeMo Retriever can connect their LLMs to various data sources and knowledge bases. This facilitates user interaction with data and ensures they receive up-to-date, accurate answers through simple, conversational prompts. As a result, businesses can allow users secure access to diverse data modalities, including text, PDFs, images and videos.
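To make the retrieval step concrete, the following is a minimal sketch of the general RAG pattern in Python. It is not NeMo Retriever's API: the document list, function names and prompt wording are all hypothetical, and a simple word-count vector stands in for a learned embedding model, which would capture semantic relationships that word counts cannot. The sketch embeds a question, ranks stored passages by cosine similarity, and places the best match into the prompt that would be sent to an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory knowledge base. A real deployment would index
# enterprise documents in a vector database instead.
DOCUMENTS = [
    "The travel policy allows economy class flights for trips under six hours.",
    "Expense reports must be submitted within 30 days of purchase.",
    "The VPN client is required for remote access to internal systems.",
]


def embed(text: str) -> Counter:
    """Toy embedding: counts of lowercased word tokens.

    This is only a stand-in for a learned embedding model.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, embed(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Assemble an LLM prompt that grounds the answer in retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    question = "When must expense reports be submitted?"
    print(build_prompt(question, retrieve(question, DOCUMENTS)))
```

Because the passages are fetched at question time, updating the stored documents changes the answers without retraining the model, which is the property the article attributes to RAG.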
NVIDIA believes that the use of NeMo Retriever results in more accurate outcomes with less training, speeds up bringing products to market, and supports energy efficiency in generative AI application development.
Enterprises can deploy NeMo Retriever-powered applications on NVIDIA-accelerated computing in virtually any data centre or cloud. NVIDIA AI Enterprise supports accelerated, high-performance inference with NVIDIA NeMo, among other NVIDIA AI software.
Developers can sign up for early access to NVIDIA NeMo Retriever. As the world continues its rapid move towards digitalisation, this new offering by NVIDIA is expected to enhance the precision and relevance of generative AI applications in a variety of sectors.