Exo, a new application, has been launched to facilitate running AI models distributed across multiple devices. Similar to torrents but tailored for AI inference, Exo lets users pool resources from computers, smartphones, and single-board computers.
Compatibility and Platform Support
The Exo application supports devices running Linux, macOS, Android, and iOS; there is currently no Windows version. Setup requires Python 3.12.0, and Linux systems with Nvidia graphics need additional components.
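Since the stated requirement is Python 3.12.0, a quick preflight check before installing can save a failed setup. A minimal sketch (the helper name is ours for illustration, not part of Exo):

```python
import sys

def meets_exo_python_requirement(version_info=None):
    """Return True if the interpreter satisfies Exo's stated Python 3.12.0
    requirement. Helper name is illustrative, not part of Exo itself."""
    info = version_info if version_info is not None else sys.version_info
    return tuple(info[:2]) >= (3, 12)
```

Running `meets_exo_python_requirement()` on an older interpreter returns `False`, signalling that an upgrade is needed before installing Exo.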
- Exo can dynamically distribute model layers based on each device's memory and computational power.
- Supported models include LLaMA, Mistral, LLaVA, Qwen, and DeepSeek.
- Even highly demanding models, such as DeepSeek R1 with its roughly 1.3 TB RAM requirement, could in principle run on a cluster of Raspberry Pi devices.
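The memory-weighted distribution described above can be sketched as a simple proportional split of a model's layers. This is an illustrative model of the idea, not Exo's actual partitioning code or API:

```python
def partition_layers(num_layers, device_memory_gb):
    """Split a model's layers across devices in proportion to each device's
    available memory -- a simplified sketch of memory-weighted partitioning,
    not Exo's actual algorithm."""
    total = sum(device_memory_gb)
    # Provisional share for each device, rounded down
    shares = [num_layers * mem // total for mem in device_memory_gb]
    # Hand any leftover layers to the devices with the most memory
    leftover = num_layers - sum(shares)
    by_memory = sorted(range(len(shares)),
                       key=lambda i: device_memory_gb[i], reverse=True)
    for i in by_memory[:leftover]:
        shares[i] += 1
    return shares

# Two equal 8 GB devices split a 32-layer model evenly:
print(partition_layers(32, [8, 8]))    # [16, 16]
# A 12 GB device takes proportionally more than an 8 GB one:
print(partition_layers(32, [12, 8]))   # [20, 12]
```

Devices with more memory receive proportionally more layers, which is the intuition behind running one large model on several small machines.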
Functional Advantages and Challenges
Because Exo distributes a model's layers across machines, models that require substantial resources can run collectively on less powerful devices. For instance, a model needing 16 GB of RAM can run across two laptops with 8 GB each.
However, network speed and device latency can hurt performance, and the weakest device can slow processing for the whole cluster. Exo's developers also note potential security risks when sharing workloads across devices. Still, it offers a promising alternative to cloud resources for distributed machine-learning tasks.
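The bottleneck effect can be made concrete with a back-of-the-envelope pipeline model: a token passes through every device's layers in sequence, so per-token latency is the sum of all compute and transfer steps, while sustained throughput is capped by the single slowest step. This is a generic pipeline estimate under our own simplifying assumptions, not a performance model published by the Exo project:

```python
def pipeline_latency_ms(stage_ms, link_ms):
    """Per-token latency of a sequential pipeline: every device's compute
    time plus every inter-device transfer (illustrative model, not Exo's)."""
    return sum(stage_ms) + sum(link_ms)

def pipeline_throughput_tps(stage_ms, link_ms):
    """With many tokens in flight, throughput is limited by the slowest
    single step -- the weakest device or slowest link sets the pace."""
    bottleneck_ms = max(stage_ms + link_ms)
    return 1000.0 / bottleneck_ms

# Two devices at 10 ms each joined by a 5 ms link:
print(pipeline_latency_ms([10, 10], [5]))      # 25 ms per token
print(pipeline_throughput_tps([10, 10], [5]))  # 100.0 tokens/s
```

Swapping one device for a slower one (say, 40 ms) leaves the others idle waiting on it, which is why a weak node drags down the whole cluster.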