The Open Source Initiative (OSI), which has defined open-source software standards for decades, has introduced a definition of “open AI.” For an AI model to be considered truly open, OSI now requires access to the data used to train it, the full source code, and all the parameters and weights that determine its behavior. These conditions could have a significant impact on the tech industry, as AI models such as Meta’s Llama do not meet them.

Image source: BrianPenny / Pixabay

Not surprisingly, Meta takes a different view, arguing that the OSI approach fails to account for the specifics of modern AI systems. Company spokeswoman Faith Eischen emphasized that while Meta supports many OSI initiatives, it disagrees with the proposed definition because, in her words, “there is no single standard for open AI.” She added that Meta will continue to work with OSI and other organizations to ensure “responsible expansion of access to AI” regardless of formal criteria. Meanwhile, Llama’s license restricts commercial use in applications with an audience of more than 700 million users, which conflicts with OSI standards requiring complete freedom of use and modification.

The OSI principles defining open-source software standards have been recognized and actively used by the developer community for 25 years. Thanks to these principles, developers can freely build on the work of others without fear of legal claims. The new OSI definition applies similar openness principles to AI models, but for tech giants such as Meta this could pose a serious challenge. Recently, the non-profit Linux Foundation also entered the discussion, offering its own interpretation of “open AI,” which underscores the growing importance of this topic for the entire IT industry.

OSI Executive Director Stefano Maffulli noted that developing the new definition of “open AI” took two years and involved consultations with experts in machine learning (ML) and natural language processing (NLP), philosophers, representatives of Creative Commons, and other specialists. This process allowed OSI to create a definition that could become the basis for combating so-called “open washing,” in which companies claim to be open while actually limiting how their products can be used and modified.

Meta justifies its reluctance to disclose AI training data by citing security concerns, but critics point to other motives, including minimizing legal risk and preserving a competitive advantage. Many AI models are likely trained on copyrighted material: in the spring, The New York Times reported that Meta had acknowledged the presence of such content in its training data, since filtering it out is almost impossible. While Meta and other companies, including OpenAI and Perplexity, face lawsuits over possible copyright infringement, the Stable Diffusion AI model remains one of the few examples of openly available AI training data.

Maffulli draws a parallel between Meta’s actions and Microsoft’s stance in the 1990s, when it viewed open-source software as a threat to its business. Meta, according to Maffulli, emphasizes the scale of its investment in the Llama model, suggesting that such resource-intensive development is within the reach of only a few. Proprietary training data, in his view, has become the “secret sauce” that allows corporations to maintain a competitive advantage and protect their intellectual property.
