Apple said its OpenELM artificial intelligence model does not underpin any of the AI or machine learning features in the company’s commercial products, including Apple Intelligence. The company issued the clarification after it emerged that OpenELM had been trained on data of questionable origin.
It was previously reported that Apple and other technology giants used subtitles from YouTube videos to train their AI models, including material from some of the platform’s biggest creators. The subtitles, which are effectively transcripts of the videos, were part of the Pile, a public dataset published by the non-profit organization EleutherAI. Downloading them from YouTube is a direct violation of the platform’s rules.
Apple said OpenELM is the company’s contribution to the research community: work that advances the development of open large language models. OpenELM, the company told 9to5Mac, was created solely for research purposes and not to power any Apple Intelligence features. The model is published as open source and is available to anyone, including through the Apple Machine Learning Research website.
Because OpenELM is not part of Apple Intelligence, the allegedly illegally obtained YouTube subtitles have no connection to the commercial system. Apple has previously emphasized that Apple Intelligence was trained “on licensed data, including data selected to improve certain functions, and also publicly available data collected by our web crawler.” Apple has no plans to develop new versions of OpenELM.