Meta showed off AI for the metaverse and an alternative to traditional large language models

Meta✴ has reported the results of its latest artificial intelligence research conducted under the FAIR (Fundamental AI Research) program. The company’s researchers have developed an AI model responsible for believable movements of virtual characters; a model that operates not on tokens, the basic units of language, but on concepts; and much more.

Image Source: Google DeepMind / unsplash.com

The Meta✴ Motivo model controls the movements of virtual humanoid characters performing complex tasks. It was trained with reinforcement learning on an unlabeled dataset of human body movements, and it can serve as an auxiliary tool for designing character movements and body positions. “Meta Motivo is capable of performing a wide range of full-body control tasks, including motion tracking and target posture, without any additional training or planning,” the company said.

An important achievement was the creation of the Large Concept Model (LCM), an alternative to traditional large language models. Meta✴ researchers note that today’s advanced AI systems operate at the level of tokens, language units that typically represent a fragment of a word, and do not demonstrate explicit hierarchical reasoning. In the LCM, the reasoning mechanism is separated from the linguistic representation, much as a person first forms a sequence of concepts and only then puts it into words. For example, a speaker giving a series of presentations on one topic already has a formed sequence of concepts, but the wording of the speech may change from one event to another.

When generating a response to a query, the LCM predicts a sequence not of tokens but of concepts, each represented as a full sentence in a shared multimodal and multilingual embedding space. According to the developers, the LCM architecture becomes more computationally efficient as the input context grows. In practice, this work should help improve the performance of language models with any modality, that is, any data format, and with output in any language.
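The concept-level idea can be sketched in a few lines. This is a hypothetical toy illustration, not Meta’s code: a stand-in sentence encoder maps each sentence to a fixed-size vector (the “concept”), and an untrained stand-in model predicts the next concept vector instead of the next token.

```python
import zlib
import numpy as np

# Toy sketch of concept-level prediction (hypothetical illustration of
# the LCM idea, not Meta's implementation): each "concept" is a
# fixed-size sentence embedding, and the model predicts the next
# concept vector rather than the next token.

DIM = 8  # toy embedding dimension

def encode_sentence(sentence: str) -> np.ndarray:
    """Stand-in for a real multilingual sentence encoder: a
    deterministic hash-seeded random vector, normalized to unit length."""
    rng = np.random.default_rng(zlib.crc32(sentence.encode()))
    v = rng.standard_normal(DIM)
    return v / np.linalg.norm(v)

def predict_next_concept(concept: np.ndarray) -> np.ndarray:
    """Untrained stand-in for the concept-sequence model (here just an
    identity map); a real LCM would be a trained transformer operating
    over sentence embeddings."""
    out = np.eye(DIM) @ concept
    return out / np.linalg.norm(out)

sentences = ["The cat sat on the mat.", "It purred softly."]
concepts = [encode_sentence(s) for s in sentences]
pred = predict_next_concept(concepts[0])
# Predicted concepts live in the same space as real sentence embeddings,
# so they can be scored by cosine similarity and decoded back to text.
similarity = float(pred @ concepts[1])
print(f"cosine similarity to the true next concept: {similarity:.3f}")
```

Because the sequence is over sentences rather than sub-word tokens, the number of prediction steps grows with the number of sentences, not the number of words, which is one intuition for the claimed efficiency on long contexts.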

Image source: Meta✴

The Meta✴ Dynamic Byte Latent Transformer mechanism also offers an alternative to language tokens, but instead of expanding them into concepts it builds a hierarchical model at the byte level. According to the developers, this increases efficiency on long sequences during both training and inference. The companion tool Meta✴ Explore Theory-of-Mind is designed to instill social-intelligence skills in AI models during training, to evaluate model performance on these tasks, and to fine-tune already trained AI systems. Meta✴ Explore Theory-of-Mind is not limited to a predefined set of interactions but generates its own scenarios.
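One way to picture working below the token level is dynamic byte patching: instead of a fixed vocabulary, raw bytes are grouped into variable-length patches. The sketch below is a hypothetical toy heuristic (not Meta’s method), cutting a new patch wherever a sliding-window byte entropy rises above a threshold, so predictable runs form long patches and surprising regions form short ones.

```python
import math
from collections import Counter

# Toy sketch of byte-level patching (hypothetical heuristic, not the
# Dynamic Byte Latent Transformer itself): low-entropy byte runs are
# merged into long patches, while high-entropy regions are cut into
# shorter ones.

def window_entropy(window: bytes) -> float:
    """Shannon entropy (bits) of the byte distribution in a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def patch_bytes(data: bytes, win: int = 4, threshold: float = 1.5):
    """Cut a patch boundary wherever the trailing-window entropy
    exceeds the threshold."""
    patches, start = [], 0
    for i in range(win, len(data)):
        if window_entropy(data[i - win:i]) > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return [p for p in patches if p]

text = b"aaaaaaaaXYZQaaaaaaaa"
print(patch_bytes(text))
```

The repetitive `a` runs stay in large patches while the unpredictable `XYZQ` region is split finely; a hierarchical model can then spend more computation on the short, information-dense patches.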

Meta✴ Memory Layers at Scale technology aims to optimize the factual-memory mechanisms of large language models. As the number of parameters in a model grows, working with factual memory requires ever more resources, and the new mechanism is aimed at saving them. The Meta✴ Image Diversity Modeling project, carried out with the involvement of third-party experts, aims to prioritize AI-generated images that more accurately correspond to real-world objects; it also helps improve safety and developer responsibility when creating images with AI.
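The general idea of a memory layer can be illustrated with a sparse key-value lookup. This is a minimal hypothetical sketch (the real Memory Layers at Scale design is more elaborate): a query activates only its top-k most similar keys and reads out a weighted sum of their values, so memory capacity can grow without a dense computation over all parameters.

```python
import numpy as np

# Toy sketch of a key-value memory layer (hypothetical illustration,
# not Meta's implementation): a query selects the top-k most similar
# keys and returns a softmax-weighted sum of their values, touching
# only k of the N_KEYS memory slots.

rng = np.random.default_rng(0)
N_KEYS, DIM, TOP_K = 1024, 16, 4

keys = rng.standard_normal((N_KEYS, DIM))    # trainable memory keys
values = rng.standard_normal((N_KEYS, DIM))  # trainable memory values

def memory_lookup(query: np.ndarray) -> np.ndarray:
    scores = keys @ query                           # similarity to every key
    top = np.argpartition(scores, -TOP_K)[-TOP_K:]  # indices of top-k keys
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()                        # softmax over top-k only
    return weights @ values[top]                    # sparse weighted readout

out = memory_lookup(rng.standard_normal(DIM))
print(out.shape)
```

Because only `TOP_K` of the `N_KEYS` value rows participate in the readout, the memory can be scaled up largely independently of per-query compute, which matches the resource-saving goal described above.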

The Meta✴ CLIP 1.2 model is a new version of the system designed to link text and visual data; it is also used to train other AI models. The Meta✴ Video Seal tool creates watermarks on AI-generated videos: the marking is invisible to the naked eye but can be detected to determine a video’s origin, and it survives editing, including blurring and re-encoding with various compression algorithms. Finally, Meta✴ recalled the Flow Matching paradigm, which can be used to generate images, video, sound, and even three-dimensional structures such as protein molecules; the approach exploits information about movement between different parts of the image and serves as an alternative to the diffusion mechanism.
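The core training target of flow matching, in its standard straight-path formulation from the literature (not Meta’s exact code), is compact enough to sketch: sample a point on the line between a noise sample and a data sample, and regress a network’s predicted velocity toward the constant direction of that line.

```python
import numpy as np

# Minimal sketch of the flow-matching training target (standard
# linear-path formulation, a hypothetical illustration rather than
# Meta's implementation): interpolate between noise x0 and data x1,
# and use the path's constant velocity x1 - x0 as the regression
# target for a velocity network.

rng = np.random.default_rng(0)
DIM = 4

def flow_matching_pair(x0, x1, t):
    """Interpolated sample and its target velocity at time t in [0, 1]."""
    x_t = (1.0 - t) * x0 + t * x1   # point on the straight-line path
    v_target = x1 - x0              # velocity of that path (constant)
    return x_t, v_target

x0 = rng.standard_normal(DIM)   # noise sample
x1 = rng.standard_normal(DIM)   # data sample (e.g. image latents)
x_t, v = flow_matching_pair(x0, x1, t=0.5)
# A network v_theta(x_t, t) would be trained with MSE against v;
# generation then integrates dx/dt = v_theta(x, t) from t=0 to t=1.
print(np.allclose(x_t, 0.5 * (x0 + x1)))
```

Unlike diffusion, which learns to reverse a gradual noising process, this objective directly learns a velocity field transporting noise to data, which is one reason it generalizes to modalities such as 3D structures.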
