Just two months after releasing the Llama 3.1 large language model, Meta has introduced its updated version, Llama 3.2, the company's first open AI model that can process both images and text.
Image source: Gerd Altmann / pixabay.com
Meta Llama 3.2 lets developers build advanced AI applications: augmented reality platforms with real-time video recognition, visual search engines that sort images by content, and document analysis systems that summarize long passages of text. According to Meta, getting started with the new model will be straightforward for developers: they only need to add multimodality support, that is, "show Llama images and have it communicate."
OpenAI and Google launched their own multimodal AI models last year, so Meta now finds itself playing catch-up. Image support is important as Meta continues to expand AI capabilities across devices, including its Ray-Ban Meta glasses. The Llama 3.2 family includes two models that work with images (with 11 and 90 billion parameters) and two lightweight text-only models (with 1 and 3 billion parameters). The smaller models are designed to run on Qualcomm, MediaTek and other Arm chips, so Meta clearly expects them to be used on mobile devices. That said, Llama 3.1, released in July, remains a strong offering: its largest version has 405 billion parameters and should still outperform the newer models at text generation.