NVIDIA Corporation and the French company Mistral AI have announced Mistral NeMo 12B, a large language model (LLM) designed for enterprise tasks such as chatbots, data summarization, and working with program code.
Mistral NeMo 12B has 12 billion parameters and a context window of 128,000 tokens. Inference uses the FP8 data format, which is said to reduce memory requirements and speed up deployment without any loss of response accuracy.
Image Source: Pixabay.com
The model was trained with the Megatron-LM library, part of the NVIDIA NeMo platform, on 3,072 NVIDIA H100 accelerators in DGX Cloud. Mistral NeMo 12B is claimed to handle multi-turn dialogues, mathematical problems, programming, and similar workloads well, and to exhibit "common sense" and "world knowledge." Overall, NVIDIA reports accurate and reliable performance across a wide range of applications.
The model is released under the Apache 2.0 license and is offered as a NIM container. Deploying the LLM, according to its creators, takes minutes rather than days. A single NVIDIA L40S accelerator, a GeForce RTX 4090, or an RTX 4500 GPU is enough to run the model. Among the key advantages of deployment via NIM are high efficiency, low computational cost, and security and privacy.
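NIM containers expose an OpenAI-compatible HTTP API, so once the container is running, a client can talk to it with an ordinary chat-completions request. A minimal sketch follows; the port, endpoint path, and the model identifier `mistral-nemo-12b` are assumptions for illustration, not confirmed values from the announcement — check the container's documentation for the actual ones.

```python
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "mistral-nemo-12b") -> dict:
    """Build an OpenAI-style chat-completions payload.

    The model name here is a hypothetical placeholder; the real identifier
    is whatever the deployed NIM container registers.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }

def query_nim(prompt: str, base_url: str = "http://localhost:8000/v1") -> str:
    """Send a chat request to a locally running NIM container.

    The base URL assumes a default local deployment on port 8000;
    adjust it to match how the container was actually started.
    """
    payload = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible responses carry the answer in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

Because the API mirrors the OpenAI schema, existing client libraries and tooling built for that schema can generally be pointed at the local endpoint without code changes.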