Enthusiasts ran the modern AI model Llama on an ancient PC with a Pentium II and Windows 98

Experts from EXO Labs were able to run a fairly powerful large language model (LLM) from the Llama family on a 26-year-old computer running the Windows 98 operating system. The researchers showed an old PC equipped with a 350 MHz Intel Pentium II processor and 128 MB of RAM booting up, launching the neural network, and then responding to prompts.

Image source: GitHub

To run the LLM, EXO Labs specialists used their own inference engine, Llama98.c, built on top of the Llama2.c engine written in the C programming language by former OpenAI and Tesla engineer Andrej Karpathy. After the model was loaded, it was prompted to write a story about Sleepy Joe. Surprisingly, the AI model actually works even on such an ancient PC, and the story is generated at a decent speed.
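
Llama2.c is Karpathy's minimal pure-C inference engine for Llama-architecture models. As a rough illustration of how such an engine is structured, the sketch below shows a greedy token-generation loop in plain C; the forward() stub stands in for the real transformer pass and is not taken from the Llama98.c source.

```c
/* Minimal sketch of a llama2.c-style generation loop in plain C.
   This is NOT the actual Llama98.c code; forward() is a stub that
   stands in for the real transformer forward pass. */
#include <stdio.h>

#define VOCAB_SIZE 32   /* tiny toy vocabulary for the sketch */
#define MAX_STEPS  16   /* how many tokens to generate */

/* Stub: the real engine would run the transformer over the current
   token and position and fill `logits` with one score per vocab entry. */
static void forward(int token, int pos, float *logits) {
    int i;
    for (i = 0; i < VOCAB_SIZE; i++) {
        /* deterministic dummy scores so the example runs anywhere */
        logits[i] = (float)((token * 31 + pos * 17 + i) % VOCAB_SIZE);
    }
}

/* Greedy sampling: pick the highest-scoring token. */
static int argmax(const float *logits, int n) {
    int i, best = 0;
    for (i = 1; i < n; i++)
        if (logits[i] > logits[best]) best = i;
    return best;
}

int main(void) {
    float logits[VOCAB_SIZE];
    int token = 1;              /* BOS-style start token */
    int pos;
    for (pos = 0; pos < MAX_STEPS; pos++) {
        forward(token, pos, logits);   /* score candidates for the next token */
        token = argmax(logits, VOCAB_SIZE);
        printf("%d ", token);          /* a real engine prints decoded text */
    }
    printf("\n");
    return 0;
}
```

The C89-style declarations are deliberate: code written this way compiles with the elderly toolchains available on a Windows 98 machine.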

The mysterious organization EXO Labs, formed by researchers and engineers from Oxford University, emerged from the shadows in September this year. It reportedly advocates for the openness and accessibility of artificial-intelligence technologies. Representatives of the organization believe that advanced AI should not be in the hands of a handful of corporations, as is the case now. Going forward, they hope to “build an open infrastructure for training advanced AI models, allowing anyone to run them anywhere.” In their view, demonstrating that an LLM can run on an ancient PC proves that AI algorithms can run on almost any device.

In their blog, the enthusiasts said that to carry out the experiment they purchased an old Windows 98 PC on eBay. After connecting the machine to the network via its Ethernet port, they transferred the necessary data to its storage over FTP. Compiling modern code for Windows 98 probably turned out to be the harder task, and it was solved with the help of Andrej Karpathy's work published on GitHub. Ultimately, they achieved a text generation speed of 35.9 tokens per second with a 260K-parameter model using the Llama architecture, which is quite good considering the modest computing capabilities of the machine.
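
For context on the 35.9 tokens-per-second figure, the sketch below shows the usual way such throughput is measured: time the generation loop and divide the token count by the elapsed seconds. The loop body is a placeholder, not the actual Llama98.c benchmark code, and the step count is an arbitrary choice for illustration.

```c
/* Sketch of a tokens-per-second measurement using the standard C clock().
   The loop body is a placeholder for one forward pass + sampling step. */
#include <stdio.h>
#include <time.h>

int main(void) {
    const int steps = 256;          /* tokens to generate for the benchmark */
    clock_t start, end;
    double elapsed, tokens_per_sec;
    int i;

    start = clock();
    for (i = 0; i < steps; i++) {
        /* placeholder: one transformer forward pass and sampling step */
    }
    end = clock();

    elapsed = (double)(end - start) / CLOCKS_PER_SEC;
    if (elapsed > 0.0)
        tokens_per_sec = steps / elapsed;
    else
        tokens_per_sec = 0.0;       /* guard against a zero-length measurement */
    printf("%.1f tokens/s\n", tokens_per_sec);
    return 0;
}
```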
