At its 2024 GTC conference, NVIDIA announced NIM, a software platform aimed at simplifying the deployment of pre-trained artificial intelligence models into production environments.
NIM builds on the software work NVIDIA has already done around inference and model optimization, making it easily accessible by combining a given model with an optimized inference engine and packaging the two into a container, which is then delivered as a ready-to-deploy microservice.
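In practice, a containerized inference microservice of this kind is typically queried over an HTTP API. The sketch below shows how a client might assemble a chat-completion request for a locally deployed container; the endpoint URL and model name are illustrative placeholders, not guaranteed NIM defaults.

```python
import json

# Illustrative endpoint for a locally running inference container
# (placeholder, not a guaranteed default).
NIM_ENDPOINT = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Assemble an OpenAI-style JSON payload for a chat-completion call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("meta/llama3-8b-instruct",
                             "Summarize NIM in one sentence.")
print(json.dumps(payload))

# Actually sending the request requires a running container, e.g.:
# import urllib.request
# req = urllib.request.Request(
#     NIM_ENDPOINT,
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```

The point of the packaging is that the client never sees the inference engine: the container presents one stable API regardless of the model behind it.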
Building comparable software packages would typically take developers weeks, if not months, and that assumes the company has in-house artificial intelligence expertise at all.
With NIM, NVIDIA aims to create an ecosystem of ready-made containers for artificial intelligence applications that use its hardware as the foundational layer, with these curated microservices as the core software layer for enterprises that want to accelerate their AI roadmaps.
NIM currently supports models from NVIDIA itself as well as from AI21, Adept, Cohere, Getty Images, and Shutterstock, along with open models from Google, Hugging Face, Meta, Microsoft, Mistral AI, and Stability AI.
NVIDIA is working with Amazon, Google, and Microsoft to make these NIM microservices available on SageMaker, Google Kubernetes Engine, and Azure AI, respectively. They will also be integrated into frameworks like Deepset, LangChain, and LlamaIndex.
Manuvir Das, head of enterprise computing at NVIDIA, said: “We believe that NVIDIA GPUs are the best place to run inference for these models, and we believe that NIM is the best software package, the best runtime, for developers to build on top of, so that they can focus on their enterprise applications.”
For the inference engine itself, NVIDIA uses its Triton Inference Server, TensorRT, and TensorRT-LLM. Among the NVIDIA microservices available through NIM are Riva for customizing speech and translation models, cuOpt for routing optimization, and Earth-2 for weather and climate simulation.
Over time, the company plans to add further capabilities, including making its retrieval-augmented generation (RAG) LLM operator available as a NIM microservice, which promises to make it much easier to build generative AI chatbots that can pull in custom data.
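The RAG pattern such a microservice automates can be sketched in a few lines: retrieve the documents most relevant to a query, then prepend them as context to the prompt sent to the language model. The word-overlap scoring below is a deliberately toy stand-in for a real embedding-based retriever, not NVIDIA's pipeline.

```python
# Minimal RAG sketch: toy word-overlap retrieval plus prompt assembly.

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by shared words with the query; return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_rag_prompt(query: str, docs: list[str]) -> str:
    """Combine the retrieved context with the user's question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

docs = [
    "NIM packages a model with an optimized inference engine in a container.",
    "Earth-2 simulates weather and climate.",
    "Riva customizes speech and translation models.",
]
print(build_rag_prompt("What does NIM package in a container?", docs))
```

A production retriever would use vector embeddings and a vector store instead of word overlap, but the control flow, retrieve then augment the prompt, is the same.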
Current NIM users include Box, Cloudera, Cohesity, DataStax, Dropbox, and NetApp.