Large language models demand significant resources not only during training but also at inference time: they need plenty of RAM and powerful GPUs. The creators of Exo, a free program for running AI models distributed across several devices, offer an alternative. It works almost like torrents, only for running AI.

Image source: github.com/exo-explore/exo

The application allows combining the computing resources of several computers, smartphones and even single-board computers, including Raspberry Pi, to run models that none of the user’s existing systems could handle on their own. The devices’ resources are combined via a peer-to-peer network.

Exo dynamically distributes the load of a large language model across the devices available on the network, placing the model's layers according to each device's available RAM and computing power. LLaMA, Mistral, LLaVA, Qwen, and DeepSeek models are supported. The program installs on devices running Linux, macOS, Android, or iOS; there is no Windows version yet. Exo requires Python 3.12.0 or later and, on Linux machines with Nvidia graphics, a number of additional components.
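The idea of RAM-proportional layer placement can be sketched in a few lines. This is an illustrative toy, not Exo's actual API: the function name, device names, and numbers are all made up; it simply assigns each device a contiguous slice of layers sized by its share of the cluster's total memory.

```python
# Hypothetical sketch of RAM-proportional layer placement, the general
# idea behind Exo's partitioning (names and numbers are illustrative,
# not Exo's real API).

def partition_layers(num_layers, device_ram_gb):
    """Assign each device a contiguous range of layers, proportional
    to its share of the cluster's total RAM."""
    total = sum(device_ram_gb.values())
    assignments = {}
    start = 0
    devices = list(device_ram_gb.items())
    for i, (name, ram) in enumerate(devices):
        if i == len(devices) - 1:
            count = num_layers - start  # last device takes the remainder
        else:
            count = round(num_layers * ram / total)
        assignments[name] = (start, start + count)
        start += count
    return assignments

# A 32-layer model split across a laptop, a phone, and a Raspberry Pi:
print(partition_layers(32, {"laptop": 16, "phone": 8, "pi5": 8}))
# → {'laptop': (0, 16), 'phone': (16, 24), 'pi5': (24, 32)}
```

With twice the RAM, the laptop hosts half the layers; the phone and the Pi each take a quarter.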

An AI model that requires 16GB of RAM can be run on two laptops with 8GB each; the powerful DeepSeek R1, which requires 1.3TB of RAM, could in theory run on a cluster of 170 Raspberry Pi 5s with 8GB each. Network bandwidth and latency can bottleneck inference, and Exo's developers warn that low-end devices can slow the model down, but each device added to the network increases overall performance. There are also the security risks that inevitably come with spreading workloads across multiple machines. Even with these caveats, Exo looks like a promising alternative to the cloud.
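The cluster sizes above follow from simple arithmetic: divide the model's memory requirement by the RAM per device and round up. A minimal check (illustrative only; real deployments need headroom for activations, the OS, and network buffers, which is presumably why the 170-Pi figure exceeds the bare minimum):

```python
import math

def min_devices(model_ram_gb, per_device_ram_gb):
    """Smallest number of identical devices whose combined RAM
    covers the model's memory requirement."""
    return math.ceil(model_ram_gb / per_device_ram_gb)

print(min_devices(16, 8))    # 2  -- two 8GB laptops for a 16GB model
print(min_devices(1300, 8))  # 163 -- bare minimum of 8GB Pi 5s for ~1.3TB
```

The gap between the theoretical 163 and the quoted 170 leaves a few gigabytes of slack per node.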
