Stability AI, the developer of the popular Stable Diffusion neural network, has introduced a music AI model called Stable Audio Open Small, which generates stereo audio and can work on smartphones without an internet connection. The model was created in collaboration with chipmaker Arm, whose processors are used in most mobile devices, and is capable of quickly generating high-quality audio even on devices with limited computing resources.
Unlike competitors such as Suno and Udio, which require cloud processing, Stable Audio Open Small runs locally. In addition, as TechCrunch notes, the model was trained only on data from the free audio libraries Free Music Archive and Freesound, which reduces the risk of copyright infringement and distinguishes it from some other AI services that use protected content.
The model contains 341 million parameters and is optimized for Arm processors. It is designed to quickly create short audio samples and sound effects, such as drums or instrumental parts. According to Stability AI, on a smartphone, the AI can generate 11 seconds of audio in less than eight seconds.
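To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. The parameter count and the timing come from the article; the bytes-per-parameter values are assumptions about common numeric formats, not a statement about how Stability AI or Arm actually ship the model, and the function names are made up for illustration.

```python
PARAMS = 341_000_000  # parameter count reported in the article

# Bytes per parameter for common numeric formats.
# Assumed for illustration; the article does not say which format is used.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_size_mb(params: int, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in megabytes (10**6 bytes)."""
    return params * bytes_per_param / 1e6


def realtime_factor(audio_seconds: float, compute_seconds: float) -> float:
    """Seconds of audio produced per second of compute."""
    return audio_seconds / compute_seconds


if __name__ == "__main__":
    for fmt, nbytes in BYTES_PER_PARAM.items():
        print(f"{fmt}: ~{weight_size_mb(PARAMS, nbytes):.0f} MB of weights")
    # "11 seconds of audio in less than eight seconds" gives a lower bound.
    print(f"real-time factor: at least {realtime_factor(11, 8):.3f}x")
```

Under these assumptions the weights alone would occupy roughly 341 MB to 1.4 GB depending on precision, and generation runs somewhat faster than real time. Actual memory use on a device would also include activations and runtime overhead, which this estimate ignores.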
At the same time, Stable Audio Open Small has some limitations. For example, it only understands text prompts in English, and it can’t create realistic vocals or complex musical compositions. In addition, the company admits that because the model was trained mostly on Western-oriented data, it performs better on Western musical styles than on others.
Another complication is the terms of use. The AI model is free for researchers, hobbyists, and small businesses, but organizations with annual revenue above $1 million must purchase a commercial license. While this is a good deal for indie developers, it can be a hurdle for larger projects.
As a reminder, Stability AI, best known for its deep learning model Stable Diffusion that generates images from text descriptions, has been trying to restore its reputation in recent months after financial problems under former CEO Emad Mostaque. The company has raised funds, appointed a new CEO, and added film director James Cameron to its board. In parallel, it continues to release new generative models, including new tools for creating images.