With the growing popularity of artificial intelligence, the high power consumption of AI models is becoming an increasingly pressing problem. Although tech giants such as Nvidia, Microsoft, and OpenAI have so far said little about the issue, clearly downplaying its significance, researchers at BitEnergy AI have developed a technique that can sharply reduce energy consumption without a significant loss in the quality or speed of AI models.


According to the study, the new method can cut energy use by up to 95%. The team calls its discovery Linear-Complexity Multiplication, or L-Mul for short. As TechSpot reports, the technique is built on integer addition and requires significantly less energy and fewer operations than the floating-point multiplication that dominates AI workloads.

Today, floating-point numbers are used throughout AI to handle very large and very small values. They are essentially scientific notation in binary form, which lets algorithms carry out complex calculations accurately. That accuracy, however, demands enormous resources and is already raising concerns, since some AI models consume huge amounts of electricity. ChatGPT, for example, uses about 564 MWh per day, as much as 18,000 US households. Analysts at the Cambridge Center for Alternative Finance estimate that by 2027 the AI industry could consume between 85 and 134 TWh annually.
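
To make the "scientific notation in binary" picture concrete, here is a minimal Python sketch (our own illustration, not from the study) showing how a float decomposes into a mantissa and a power-of-two exponent. Multiplying two such numbers means multiplying the mantissas, which is the expensive part in hardware, while the exponents are merely added.

```python
import math

# A binary float stores a value as mantissa * 2**exponent, i.e. scientific
# notation in base 2. Multiplying two floats multiplies the mantissas (the
# costly step in hardware) and adds the exponents (cheap). The example
# values below are arbitrary and chosen only for illustration.
for value in (6.75, 0.15625):
    mantissa, exponent = math.frexp(value)  # value == mantissa * 2**exponent
    print(f"{value} = {mantissa} * 2**{exponent}")
```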

The L-Mul algorithm tackles this problem by replacing complex floating-point multiplications with simpler integer additions. In testing, AI models maintained their accuracy while energy consumption fell by 95% for tensor operations and by 80% for scalar operations.
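
The general idea can be illustrated at the bit level: for positive floats, simply adding their raw bit patterns adds the exponent fields exactly and the mantissa fields approximately, which approximates multiplication. The sketch below is a simplified, classic bit-trick approximation written by us for illustration; it is not the exact L-Mul algorithm from the BitEnergy AI paper, and the function names are our own.

```python
import struct

def float_to_bits(x: float) -> int:
    # Reinterpret a 32-bit float's bit pattern as an unsigned integer.
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(b: int) -> float:
    # Reinterpret an unsigned 32-bit integer as a float.
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]

# Bit pattern of 1.0 in float32; subtracting it cancels the doubled exponent bias.
FLOAT32_ONE = 0x3F800000

def approx_mul(a: float, b: float) -> float:
    # Adding the raw bit patterns adds the exponents exactly and the
    # mantissas approximately, so the result approximates a * b
    # (positive inputs only in this simplified sketch).
    return bits_to_float(float_to_bits(a) + float_to_bits(b) - FLOAT32_ONE)

if __name__ == "__main__":
    for x, y in [(1.5, 2.25), (3.1, 0.47), (8.0, 0.125)]:
        approx = approx_mul(x, y)
        exact = x * y
        print(f"{x} * {y}: exact={exact:.4f}, approx={approx:.4f}, "
              f"rel. error={(approx - exact) / exact:+.2%}")
```

On these sample values the relative error stays within a few percent, which conveys why replacing hardware multipliers with adders can preserve model accuracy while saving energy.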

L-Mul also improves computational efficiency. The algorithm was found to outperform current 8-bit precision standards, delivering higher precision with fewer bit-level operations. Across a variety of AI tasks, including natural language processing and computer vision, the performance loss was only 0.07%, which experts considered a minor cost compared to the huge energy savings.

That said, transformer-based models such as GPT stand to benefit the most from L-Mul, since the algorithm integrates readily into all of the key components of these systems. Tests on popular AI models such as Llama and Mistral even showed improved accuracy on some tasks.

The bad news is that L-Mul requires specialized hardware, and current AI accelerators are not optimized for the method. The good news is that work on such hardware and the corresponding application programming interfaces (APIs) is already underway.

One possible obstacle is resistance from large chipmakers such as Nvidia, which could slow adoption of the new technology. Nvidia is the leading producer of AI hardware, and it is unlikely to cede ground to more energy-efficient solutions easily.
