Chinese company DeepSeek unveils the open AI model DeepSeek V3 – faster than GPT-4o and far cheaper to train

The Chinese company DeepSeek has introduced a powerful open artificial-intelligence model, DeepSeek V3. Its license allows the model to be freely downloaded, modified and used in most projects, including commercial ones.

Image source: and machines / unsplash.com

DeepSeek V3 handles a variety of text-processing tasks, including writing articles and emails, translation, and code generation. According to benchmark results published by the developer, the model outperforms most open and closed alternatives: in programming tasks it proved stronger than Meta's Llama 3.1 405B, OpenAI's GPT-4o and Alibaba's Qwen 2.5 72B. DeepSeek V3 also beat its competitors in the Aider Polyglot test, which evaluates, among other things, a model's ability to generate code for existing projects.

The model was trained on a data set of 14.8 trillion tokens. When deployed on the Hugging Face platform, DeepSeek V3 showed a size of 685 billion parameters – about 1.6 times more than Llama 3.1 405B, which, as the name suggests, has 405 billion parameters. The number of parameters – the internal variables a model uses to predict responses and make decisions – typically correlates with the model's skill: the more parameters, the more capable it is. But running such AI systems requires correspondingly more computing resources.
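To illustrate why parameter count drives hardware requirements, the sketch below makes a back-of-the-envelope estimate of the memory needed just to hold a model's weights at different numeric precisions. The figures and the helper function are illustrative assumptions, not published deployment requirements; real usage also includes activations, the KV cache and runtime overhead.

```python
# Rough memory footprint of a model's weights alone (a sketch,
# ignoring activations, KV cache and runtime overhead).
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the weight storage size in GiB."""
    return num_params * bytes_per_param / 1024**3

# Parameter counts taken from the article.
for name, params in [("DeepSeek V3", 685e9), ("Llama 3.1 405B", 405e9)]:
    fp16 = weight_memory_gb(params, 2)  # 16-bit weights
    int8 = weight_memory_gb(params, 1)  # 8-bit quantized weights
    print(f"{name}: ~{fp16:,.0f} GiB at FP16, ~{int8:,.0f} GiB at INT8")
```

Even at 8-bit precision, a 685-billion-parameter model needs hundreds of gigabytes of memory for its weights, which is why such systems are served from multi-accelerator clusters rather than single GPUs.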

DeepSeek V3 was trained over two months in a data center on Nvidia H800 accelerators, whose delivery to China is now prohibited by American sanctions. The developer claims the model's training cost $5.5 million – significantly less than what OpenAI spends for the same purposes. At the same time, DeepSeek V3 is politically filtered: it refuses to answer questions that official Beijing considers sensitive.

In November, the same developer presented DeepSeek-R1, an analogue of OpenAI's "reasoning" model o1. One of DeepSeek's investors is the Chinese hedge fund High-Flyer Capital Management, which makes trading decisions using AI and operates several of its own clusters for training models. One of the latest reportedly contains 10,000 Nvidia A100 accelerators and cost 1 billion yuan ($138 million). High-Flyer's stated goal is to help DeepSeek develop "superintelligent" AI that will outperform humans.
