Meta showed AI for the metaverse and created an alternative to traditional large language models

Meta has reported the results of its latest artificial intelligence research carried out under FAIR (Fundamental AI Research). The company’s specialists have developed a model that produces believable movements for virtual characters, a model that operates on concepts rather than tokens (language units), and much more.


The Meta Motivo model controls the movements of virtual humanoid characters as they perform complex tasks. It was trained with reinforcement learning on an unlabeled dataset of human body movements, and it can be used as an aid in designing characters’ movements and poses. “Meta Motivo is capable of performing a wide range of full-body control tasks, including motion tracking and target posture, without any additional training or planning,” the company said.

Another important result is the Large Concept Model (LCM), an alternative to traditional large language models. Meta researchers note that today’s advanced AI systems operate at the level of tokens, language units that typically represent a fragment of a word, and do not demonstrate explicit hierarchical reasoning. In an LCM, the reasoning mechanism is separated from the linguistic representation, much as a person first forms a sequence of concepts and then puts it into words. For example, a speaker giving a series of presentations on one topic already has a fixed sequence of concepts in mind, but the wording may change from one event to the next.

When generating a response to a query, an LCM predicts a sequence not of tokens but of concepts, each corresponding to a full sentence, in a multimodal and multilingual representation space. According to the developers, the LCM architecture becomes comparatively more computationally efficient as the input context grows. In practice, this work is expected to help improve the performance of language models regardless of modality (that is, data format) or output language.
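To make the difference in granularity concrete, here is a minimal sketch in Python. It is not Meta's implementation: splitting on words and punctuation is only a crude stand-in for a real tokenizer, and treating each sentence as one "concept" is an assumption made for illustration. The function names are hypothetical. The point is simply that a sentence-level sequence is much shorter than a token-level one for the same text.

```python
import re


def split_into_tokens(text: str) -> list[str]:
    """Crude stand-in for a tokenizer: words and punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def split_into_concepts(text: str) -> list[str]:
    """Crude stand-in for concept segmentation: one unit per sentence."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


text = (
    "The speaker prepares a talk. The ideas stay the same every time. "
    "Only the wording changes between events."
)

tokens = split_into_tokens(text)
concepts = split_into_concepts(text)

# A token-level model predicts one step per token;
# a concept-level model would predict one step per sentence.
print(len(tokens), "token-level steps")
print(len(concepts), "concept-level steps")
```

In a real system each sentence would be mapped to a numerical representation before prediction; that step is omitted here.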


The Meta Dynamic Byte Latent Transformer also offers an alternative to language tokens, though not by expanding them into concepts; instead, it builds a hierarchical model at the byte level. According to the developers, this improves efficiency on long sequences both when training and when running models. The Meta Explore Theory-of-Mind tool is designed to instill social intelligence skills in AI models as they are trained, to evaluate the models’ performance on these tasks, and to fine-tune already trained AI systems. It is not limited to a given range of interactions but generates its own scenarios.
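The byte-level idea behind the Dynamic Byte Latent Transformer can be illustrated with a short Python sketch. It only shows what "working with bytes instead of tokens" looks like and how bytes might be grouped into larger units to form a second level of a hierarchy. The fixed patch size is a placeholder assumption chosen for simplicity, not the grouping rule used by Meta's model, and the function names are hypothetical.

```python
def to_bytes(text: str) -> list[int]:
    """Represent text as raw UTF-8 byte values (0..255), with no tokenizer."""
    return list(text.encode("utf-8"))


def group_into_patches(byte_values: list[int], patch_size: int) -> list[list[int]]:
    """Group consecutive bytes into patches (fixed size here, as a simplification)."""
    return [
        byte_values[i:i + patch_size]
        for i in range(0, len(byte_values), patch_size)
    ]


text = "naïve café"
byte_values = to_bytes(text)
patches = group_into_patches(byte_values, patch_size=4)

# Non-ASCII characters occupy more than one byte, so the byte sequence
# is longer than the character count.
print(len(text), "characters")
print(len(byte_values), "bytes")
print(len(patches), "patches")
```

A model built this way would process the short sequence of patches at the upper level and the individual bytes inside each patch at the lower level.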

The Meta Memory Layers at Scale technology aims to optimize the factual memory mechanisms of large language models. As the number of parameters in a model grows, working with factual memory requires more and more resources, and the new mechanism is intended to save them. The Meta Image Diversity Modeling project, carried out with the involvement of outside experts, aims to prioritize AI-generated images that more accurately correspond to real-world objects; it also helps make AI image generation safer and more responsible.

The Meta CLIP 1.2 model is a new version of the system designed to link text and visual data. It is also used to train other AI models. The Meta Video Seal tool creates watermarks for AI-generated videos: the mark is invisible to the naked eye but can be detected to determine the origin of a video. The watermark survives editing, including blurring, as well as encoding with various compression algorithms. Finally, Meta highlighted the Flow Matching paradigm, which can be used to generate images, video, sound, and even three-dimensional structures, including protein molecules. This approach helps to use information about movement between different parts of the image and acts as an alternative to the diffusion mechanism.
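As a rough illustration of the flow matching idea, the Python sketch below moves a starting point toward a target point by following a velocity field. It assumes a straight-line path between the two points, which is one common choice and not something the announcement specifies. In a real system a neural network would be trained to predict the velocity; here the known velocity stands in for that network, and all function names are hypothetical.

```python
def interpolate(x0: list[float], x1: list[float], t: float) -> list[float]:
    """Point on the straight path from x0 to x1 at time t in [0, 1]."""
    return [(1 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0: list[float], x1: list[float]) -> list[float]:
    """Velocity along the straight path; it is constant in t for this path."""
    return [b - a for a, b in zip(x0, x1)]


def euler_integrate(x0: list[float], velocity: list[float], steps: int) -> list[float]:
    """Follow the velocity from t=0 to t=1 in equal Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for _ in range(steps):
        x = [xi + dt * vi for xi, vi in zip(x, velocity)]
    return x


x0 = [0.0, 0.0]   # starting sample (stands in for noise)
x1 = [2.0, -1.0]  # target sample (stands in for data)

v = target_velocity(x0, x1)
x_mid = interpolate(x0, x1, 0.5)
x_end = euler_integrate(x0, v, steps=10)

print("midpoint on the path:", x_mid)
print("end of integration:", x_end)
```

Training would consist of sampling points such as `x_mid` and teaching a model to output `v` at those points; generation then amounts to the integration loop.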
