Meta showcases AI for the metaverse and an alternative to traditional large language models

Meta has reported the results of its latest artificial intelligence research conducted under the FAIR (Fundamental AI Research) program. The company’s specialists have developed an AI model responsible for believable movements of virtual characters; a model that operates not on tokens (language units) but on concepts; and much more.

Image source: Google DeepMind / unsplash.com

The Meta Motivo model controls the movements of virtual humanoid characters performing complex tasks. It was trained with reinforcement learning on an unlabeled dataset of human body movements and can serve as an auxiliary system for designing characters’ movements and poses. “Meta Motivo is capable of performing a wide range of whole-body control tasks, including motion tracking and reaching a target pose, without any additional training or planning,” the company said.
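As a rough illustration of the zero-shot prompting such a behavioral foundation model enables, here is a minimal sketch; the interface and all names are hypothetical rather than Meta’s published API:

```python
import torch
import torch.nn as nn

# Hypothetical sketch (not Meta's actual API): one frozen policy, pretrained on
# unlabeled motion data, is conditioned on a task embedding z inferred from a
# target pose, so a new task needs no extra training or planning.
obs_dim, pose_dim, z_dim, act_dim = 64, 32, 16, 8
goal_encoder = nn.Linear(pose_dim, z_dim)        # target pose -> task embedding
policy = nn.Sequential(nn.Linear(obs_dim + z_dim, 128), nn.Tanh(),
                       nn.Linear(128, act_dim))  # stands in for the frozen policy

def act(observation, goal_pose):
    z = goal_encoder(goal_pose)                  # the "prompt" selecting a behavior
    return policy(torch.cat([observation, z], dim=-1))

action = act(torch.randn(1, obs_dim), torch.randn(1, pose_dim))
```

The point of the design is that the policy itself never changes: a new task is expressed purely as a conditioning vector, which is why no per-task training or planning is required.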

An important achievement was the creation of the Large Concept Model (LCM), an alternative to traditional large language models. Meta’s researchers observed that today’s advanced AI systems operate at the level of tokens (language units that typically represent a fragment of a word) and do not demonstrate explicit hierarchical reasoning. In the LCM, the reasoning mechanism is decoupled from the linguistic representation, much as a person first forms a sequence of concepts and only then puts it into words. When giving a series of talks on the same topic, for example, a speaker already has the sequence of concepts formed, but the wording may change from one event to the next.

When generating a response to a query, the LCM predicts a sequence not of tokens but of concepts, each represented as a whole sentence in a multimodal and multilingual embedding space. According to the developers, the LCM architecture becomes more computationally efficient than token-level models as the input context grows. In practice, this work should help improve the performance of language models across modalities (data formats) and when producing responses in any language.
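A minimal sketch of the concept-level idea, assuming an external sentence encoder/decoder that maps text to and from fixed-size embeddings (Meta’s paper pairs the model with its SONAR embedding space; the architecture and training objective below are simplified assumptions):

```python
import torch
import torch.nn as nn

# Sketch of concept-level autoregression: instead of predicting the next token
# id over a vocabulary, the model regresses the next *sentence embedding*.
class ConceptModel(nn.Module):
    def __init__(self, dim=1024, layers=8, heads=16):
        super().__init__()
        block = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(block, layers)
        self.head = nn.Linear(dim, dim)   # predicts the next concept vector

    def forward(self, concepts):          # concepts: (batch, seq, dim)
        n = concepts.size(1)
        causal = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)
        return self.head(self.backbone(concepts, mask=causal))

model = ConceptModel()
doc = torch.randn(2, 12, 1024)            # 12 sentence embeddings per document
pred = model(doc[:, :-1])                 # predict concepts 2..12 from 1..11
loss = nn.functional.mse_loss(pred, doc[:, 1:])
loss.backward()
```

Here the next concept is regressed with a simple MSE loss; Meta’s paper also explores diffusion-based and quantized variants of this prediction step.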

Image source: Meta

The Meta Dynamic Byte Latent Transformer also offers an alternative to language tokens, but instead of expanding tokens into concepts it goes the other way, building a hierarchical model at the byte level. According to the developers, this improves efficiency on long sequences during both training and inference. The companion Meta Explore Theory-of-Mind tool is designed to instill social intelligence skills in AI models during training, to evaluate models on such tasks, and to fine-tune already trained AI systems. Explore Theory-of-Mind is not limited to a predefined set of interactions but generates its own scenarios.
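In the published work, patch boundaries come from a small byte-level model: a new patch starts where the next byte’s entropy spikes. The sketch below imitates that rule with a simple running byte-frequency model, so the scoring and the threshold are assumptions rather than the paper’s method:

```python
import math
from collections import Counter

def surprise(counts: Counter, byte: int, total: int) -> float:
    """Bits of surprise of the next byte under a Laplace-smoothed unigram model."""
    p = (counts[byte] + 1) / (total + 256)
    return -math.log2(p)

def dynamic_patches(data: bytes, threshold: float = 6.0) -> list[bytes]:
    """Group bytes into variable-size patches, splitting at surprising bytes."""
    counts, total = Counter(), 0
    patches, current = [], bytearray()
    for b in data:
        if current and surprise(counts, b, total) > threshold:
            patches.append(bytes(current))   # close the patch before a surprise
            current = bytearray()
        current.append(b)
        counts[b] += 1
        total += 1
    if current:
        patches.append(bytes(current))
    return patches

text = "the model reads raw bytes, so no tokenizer is needed".encode()
print(dynamic_patches(text))  # predictable runs merge into longer patches
```

Predictable stretches of bytes end up in long patches that cost the large model a single step, which is where the claimed efficiency gain on long sequences comes from.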

Meta’s Memory Layers at Scale technique aims to optimize the factual-memory mechanisms of large language models: as the number of parameters grows, working with factual memory requires ever more resources, and the new mechanism is designed to save them. The Meta Image Diversity Modeling project, carried out with the involvement of third-party experts, aims to prioritize AI-generated images that more accurately correspond to real-world objects; it is also meant to help developers create images with AI more safely and responsibly.
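As a rough sketch of the memory-layer idea: a large trainable key-value table is queried sparsely, so only a handful of slots are read and updated per token. The published work uses a product-key lookup to avoid scoring every key; this naive version scores them all for clarity:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryLayer(nn.Module):
    """Toy trainable key-value memory intended as a drop-in FFN replacement."""
    def __init__(self, dim=512, n_slots=16384, topk=4):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(n_slots, dim) * dim ** -0.5)
        self.values = nn.Embedding(n_slots, dim)  # only selected rows get gradients
        self.topk = topk

    def forward(self, x):                  # x: (batch, seq, dim)
        scores = x @ self.keys.t()         # similarity to every memory key
        w, idx = scores.topk(self.topk, dim=-1)
        w = F.softmax(w, dim=-1)           # weights over the selected slots
        v = self.values(idx)               # (batch, seq, topk, dim)
        return (w.unsqueeze(-1) * v).sum(dim=-2)

layer = MemoryLayer()
out = layer(torch.randn(2, 16, 512))       # same shape in and out, like an FFN
```

Because each token touches only `topk` slots, the table can grow very large without a matching growth in per-token compute, which is the resource saving the paragraph above describes.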

The Meta CLIP 1.2 model is a new version of the system designed to link text and visual data; it is also used to train other AI models. The Meta Video Seal tool applies watermarks to AI-generated video: the mark is invisible to the naked eye but can be detected to establish a video’s origin, and it survives editing, including blurring and re-encoding with various compression algorithms. Finally, Meta recalled its Flow Matching paradigm, which can be used to generate images, video, sound, and even three-dimensional structures, including protein molecules; the method models how samples flow from noise toward data and acts as an alternative to the diffusion mechanism.
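The core of Flow Matching fits in a few lines. Below is a minimal sketch using straight-line probability paths, one simple instance of the framework; the toy data and the tiny network are assumptions for illustration:

```python
import torch
import torch.nn as nn

# A network v(x, t) is trained to match the constant velocity (x1 - x0) of the
# straight line from a noise sample x0 to a data sample x1.
net = nn.Sequential(nn.Linear(3, 128), nn.SiLU(), nn.Linear(128, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(1000):
    x1 = torch.randn(256, 2) * 0.1 + 2.0     # stand-in "data" distribution
    x0 = torch.randn(256, 2)                 # noise samples
    t = torch.rand(256, 1)
    xt = (1 - t) * x0 + t * x1               # point on the linear path
    pred = net(torch.cat([xt, t], dim=-1))
    loss = ((pred - (x1 - x0)) ** 2).mean()  # regress the path's velocity
    opt.zero_grad()
    loss.backward()
    opt.step()

# Sampling: integrate dx/dt = v(x, t) from t = 0 (noise) to t = 1 (data).
x = torch.randn(256, 2)
for i in range(100):
    t = torch.full((256, 1), i / 100)
    x = x + net(torch.cat([x, t], dim=-1)) / 100
```

Unlike diffusion training, there is no noise schedule to invert: the network directly regresses the velocity field, and sampling is a single ODE integration from noise to data.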
