Waymo and Gemini will teach robotaxis to cope with difficult traffic situations

Waymo, a subsidiary of Alphabet, has introduced a new approach to training its self-driving vehicles built on Gemini, Google's multimodal large language model (MLLM). The model is intended to improve how autonomous cars navigate and to help them cope with complex road situations.


In a new research paper, Waymo describes its development as an “End-to-End Multimodal Model for Autonomous Driving” (EMMA), which processes sensor data and helps robotaxis decide where to drive while avoiding obstacles. According to The Verge, Waymo has long emphasized the strategic advantage it gains from access to the artificial intelligence (AI) research of Google DeepMind, formerly the British company DeepMind Technologies.

The new EMMA system represents a fundamentally different approach to training autonomous vehicles. Instead of traditional modular systems that separate functions into perception, route planning and other tasks, EMMA offers a unified approach intended to process data holistically, avoid the errors that arise when data is passed between modules, and adapt better to new, unfamiliar road conditions in real time.
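The difference between the two designs can be shown with a toy sketch. This is not Waymo's code: the `Detection` type, the 0.5 confidence threshold, the 30-meter safety distance and both driving functions are assumptions invented for the illustration. It shows only how a modular hand-off can discard evidence that a single model would still see.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """Hypothetical sensor evidence for one object (illustrative only)."""
    label: str
    distance_m: float
    confidence: float


def perceive(raw: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Modular stage 1: pass on only confident detections.

    Anything below the threshold is dropped at the module boundary.
    """
    return [d for d in raw if d.confidence >= threshold]


def plan(obstacles: List[Detection], cruise_speed: float = 10.0) -> float:
    """Modular stage 2: pick a target speed from what stage 1 handed over."""
    if any(d.distance_m < 30.0 for d in obstacles):
        return 0.0  # stop for a nearby obstacle
    return cruise_speed


def modular_drive(raw: List[Detection]) -> float:
    """Chain the two modules; the planner never sees discarded evidence."""
    return plan(perceive(raw))


def end_to_end_drive(raw: List[Detection], cruise_speed: float = 10.0) -> float:
    """Stand-in for one unified model: uses all raw evidence at once.

    Speed is scaled down by the strongest nearby risk, however uncertain.
    """
    risk = max((d.confidence for d in raw if d.distance_m < 30.0), default=0.0)
    return cruise_speed * (1.0 - risk)


# An uncertain detection of an animal 20 m ahead.
scene = [Detection("animal", distance_m=20.0, confidence=0.25)]
modular_speed = modular_drive(scene)        # evidence dropped, keeps cruising
unified_speed = end_to_end_drive(scene)     # evidence kept, slows down
```

In this toy scene the modular pipeline keeps its cruise speed because the uncertain detection never reaches the planner, while the unified function slows down. A real end-to-end model learns this mapping from data rather than using a hand-written rule.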

One of the key benefits of MLLMs, and Gemini in particular, is their ability to generalize knowledge gleaned from vast amounts of Internet data. This lets the models adapt better to unusual situations on the road, such as animals appearing unexpectedly or road repair work. Additionally, models built on Gemini are capable of “chain-of-thought” reasoning, a technique that breaks complex problems down into sequential, logical steps and so improves decision making.
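As a rough illustration of step-by-step reasoning, the sketch below builds a text prompt that asks a model to work through a driving scene in a fixed order before giving a decision. The function name, the step wording and the example scene are all hypothetical; the article does not describe the prompts Waymo actually uses, and no model is called here.

```python
from typing import List


def build_reasoning_prompt(scene: str, steps: List[str]) -> str:
    """Assemble a prompt that requests ordered, numbered reasoning steps."""
    numbered = [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    lines = [
        f"Scene: {scene}",
        "Reason through these steps in order:",
        *numbered,
        "Final answer: state the driving decision.",
    ]
    return "\n".join(lines)


# Hypothetical reasoning steps, chosen only for this example.
STEPS = [
    "Describe what is visible on the road.",
    "Identify the objects that matter for safety.",
    "Predict what those objects are likely to do next.",
]

prompt = build_reasoning_prompt("A dog runs onto the road ahead.", STEPS)
print(prompt)
```

Spelling out the intermediate steps makes the model's path to a decision visible, which is the property the technique is valued for.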

Despite these successes, Waymo acknowledges that EMMA has limitations. For example, the model does not yet process 3D data from sensors such as lidar or radar because of the high computational cost. EMMA can also handle only a limited number of image frames at a time. Waymo emphasizes that further research is needed to overcome these limitations before the model can be fully deployed in real-world conditions.

Waymo also recognizes the risks of using MLLMs to drive autonomous vehicles. Models like Gemini can make mistakes or “hallucinate” even on simple tasks, which is unacceptable on the road. Still, the hope is that further research and improvements in the architecture of AI models for autonomous driving will overcome these problems.
