Meta shows off AI for the metaverse and an alternative to traditional large language models

Meta has reported the results of its latest artificial intelligence research under the FAIR (Fundamental AI Research) program. The company's specialists have developed an AI model responsible for believable movements of virtual characters; a model that operates not on tokens (language units) but on concepts; and much more.

Image Source: Google DeepMind / unsplash.com

The Meta Motivo model controls the movements of virtual humanoid characters performing complex tasks. It was trained with reinforcement learning on an unlabeled dataset of human body movements, and it can serve as an auxiliary system for designing characters' movements and poses. "Meta Motivo is capable of performing a wide range of full-body control tasks, including motion tracking and target posture, without any additional training or planning," the company said.

An important achievement is the creation of the Large Concept Model (LCM), an alternative to traditional large language models. Meta researchers note that today's advanced AI systems operate at the level of tokens (language units that typically represent a fragment of a word) and do not demonstrate explicit hierarchical reasoning. In the LCM, the reasoning mechanism is separated from the linguistic representation, much as a person first forms a sequence of concepts and only then puts it into words. Thus, when giving a series of talks on the same topic, a speaker already has a formed sequence of concepts, but the wording of the speech may change from one event to the next.

When generating a response to a query, the LCM predicts a sequence not of tokens but of concepts, each represented as a full sentence in a multimodal and multilingual embedding space. According to the developers, the LCM architecture scales more efficiently in compute as the input context grows. In practice, this work should help improve the performance of language models with any modality (that is, data format) and when outputting responses in any language.
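The difference between token-level and concept-level prediction can be illustrated with a toy sketch. This is only an illustration, not Meta's implementation: the real LCM predicts vectors in a learned multilingual sentence-embedding space, whereas here the "embeddings" are random vectors assigned to whole sentences and the "predictor" is a simple average.

```python
import numpy as np

rng = np.random.default_rng(0)

sentences = [
    "The model reads the context.",
    "It forms a plan of concepts.",
    "Only then does it produce words.",
]
# One embedding per sentence: the unit of prediction is a whole concept,
# not a sub-word token.
concept_space = {s: rng.normal(size=8) for s in sentences}

def predict_next_concept(context_vecs):
    """Stand-in predictor: the mean of the context embeddings (a real LCM
    would run a trained transformer over the embedding sequence)."""
    return np.mean(context_vecs, axis=0)

def decode_concept(vec):
    """Map a predicted vector back to the nearest known sentence."""
    return min(concept_space,
               key=lambda s: np.linalg.norm(concept_space[s] - vec))

context = [concept_space[sentences[0]], concept_space[sentences[1]]]
print(decode_concept(predict_next_concept(context)))
```

Because prediction happens once per sentence rather than once per token, the number of autoregressive steps shrinks as documents grow, which is the intuition behind the claimed efficiency on long contexts.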

Image source: Meta

The Meta Dynamic Byte Latent Transformer mechanism also offers an alternative to language tokens, but instead of expanding them into concepts it goes the other way, building a hierarchical model at the byte level. According to the developers, this increases efficiency on long sequences both when training and when running models. The companion tool Meta Explore Theory-of-Mind is designed to instill social intelligence skills in AI models during training, to evaluate models on such tasks, and to fine-tune already trained AI systems. Meta Explore Theory-of-Mind is not limited to a fixed set of interactions but generates its own scenarios.
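The byte-level idea can be sketched as grouping raw bytes into variable-length patches. Hedged heavily: the real Byte Latent Transformer segments bytes with a learned entropy model; the crude heuristic below starts a new patch at whitespace and punctuation purely as a stand-in for "high-surprise" byte boundaries.

```python
def byte_patches(text: str) -> list[bytes]:
    """Split a UTF-8 byte stream into variable-length patches."""
    data = text.encode("utf-8")
    patches, current = [], bytearray()
    for b in data:
        current.append(b)
        if chr(b) in " .,!?":      # stand-in boundary signal
            patches.append(bytes(current))
            current = bytearray()
    if current:                    # flush the trailing patch
        patches.append(bytes(current))
    return patches

print(byte_patches("Hello, byte world."))
# → [b'Hello,', b' ', b'byte ', b'world.']
```

The patches, not fixed tokens, become the units the transformer attends over, so common stretches of bytes are compressed into fewer, longer units while unusual ones stay fine-grained.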

Meta's Memory Layers at Scale technology aims to optimize the factual memory mechanisms of large language models. As the number of parameters in a model grows, working with factual memory requires ever more resources, and the new mechanism is aimed at saving them. The Meta Image Diversity Modeling project, carried out with the involvement of third-party experts, aims to prioritize AI-generated images that correspond more accurately to real-world objects; it also helps improve safety and developer responsibility when creating images with AI.
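A memory layer can be pictured as a sparse key-value lookup inside the network. This is a minimal sketch, not Meta's architecture: the published approach uses trainable product-key memories inside a transformer, while here a single query simply activates the top-k nearest keys and returns a weighted sum of their values.

```python
import numpy as np

rng = np.random.default_rng(1)
num_slots, dim, k = 16, 4, 2
keys = rng.normal(size=(num_slots, dim))    # addresses of memory slots
values = rng.normal(size=(num_slots, dim))  # stored "facts"

def memory_lookup(query):
    scores = keys @ query                    # similarity to every key
    top = np.argsort(scores)[-k:]            # sparse: only k slots activate
    w = np.exp(scores[top]); w /= w.sum()    # softmax over selected slots
    return w @ values[top]                   # weighted readout

out = memory_lookup(rng.normal(size=dim))
print(out.shape)
```

The compute savings come from sparsity: only k of the num_slots slots are touched per query, so the memory can grow large without a proportional increase in per-token cost.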

The Meta CLIP 1.2 model is a new version of the system designed to connect text and visual data; it is also used to train other AI models. The Meta Video Seal tool creates watermarks on AI-generated videos: the marking is invisible to the naked eye but can be detected to establish a video's origin, and it survives editing, including blurring and re-encoding with various compression algorithms. Finally, Meta recalled the Flow Matching paradigm, which can be used to generate images, video, sound, and even three-dimensional structures, including protein molecules. This approach models the flow that transports samples from noise toward the data and serves as an alternative to the diffusion mechanism.
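Sampling with Flow Matching amounts to integrating a velocity field from noise toward the data. The sketch below is purely illustrative: a real model learns a neural velocity field from training data, whereas here the field is the closed-form straight-line flow toward a single fixed "data point", so Euler integration just shows the mechanics.

```python
import numpy as np

target = np.array([2.0, -1.0])          # pretend data point

def velocity(x, t):
    """Straight-line (rectified) flow from noise at t=0 to data at t=1."""
    return (target - x) / max(1.0 - t, 1e-3)

def sample(steps=100):
    x = np.zeros(2)                     # start from a "noise" sample
    for i in range(steps):
        t = i / steps
        x = x + velocity(x, t) / steps  # Euler step along the flow
    return x

print(np.round(sample(), 3))
```

Unlike diffusion, which denoises through many stochastic steps, the flow here is a deterministic transport: integrating the velocity field carries the starting point to (near) the data distribution in one smooth trajectory.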
