Waymo and Gemini will teach robotaxis to cope with difficult traffic situations

Waymo, a subsidiary of Alphabet, has introduced a new approach to training its self-driving vehicles based on Gemini, Google's multimodal large language model (MLLM). The approach is intended to improve how autonomous cars navigate and to help them cope better with complex road situations.


In a new research paper, Waymo describes its system as an “End-to-End Multimodal Model for Autonomous Driving” (EMMA), which processes sensor data and helps robotaxis decide where to drive while avoiding obstacles. According to The Verge, Waymo has long emphasized the strategic advantage it gains from access to the artificial intelligence (AI) research of Google DeepMind, which grew out of the British company DeepMind Technologies.

The new EMMA system represents a fundamentally different approach to training autonomous vehicles. Traditional modular systems split driving into separate functions such as perception and route planning. EMMA instead takes a unified approach that is meant to process data holistically, avoid the errors that arise when data is handed from one module to the next, and adapt better to new, unfamiliar road conditions in real time.
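The difference between the two designs can be shown with a small Python sketch. This is only an illustration of the general idea, not Waymo's code: `SensorFrame`, `Waypoint`, and all of the `toy_*` functions are hypothetical stand-ins, and a real end-to-end model would be a learned network rather than a hand-written rule.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Waypoint = Tuple[float, float]  # (x, y) in metres; hypothetical format


@dataclass
class SensorFrame:
    """Hypothetical stand-in for one batch of camera data."""
    obstacle_ahead: bool


def modular_drive(
    frame: SensorFrame,
    perceive: Callable[[SensorFrame], List[str]],
    plan: Callable[[List[str]], List[Waypoint]],
) -> List[Waypoint]:
    # Each stage sees only what the previous stage hands over,
    # so anything dropped at a hand-off is unavailable downstream.
    detected = perceive(frame)
    return plan(detected)


def end_to_end_drive(
    frame: SensorFrame,
    model: Callable[[SensorFrame], List[Waypoint]],
) -> List[Waypoint]:
    # A single model maps the raw input straight to a trajectory.
    return model(frame)


# Toy stand-ins so the sketch runs; nothing here is learned.
def toy_perceive(frame: SensorFrame) -> List[str]:
    return ["obstacle"] if frame.obstacle_ahead else []


def toy_plan(detected: List[str]) -> List[Waypoint]:
    # Stay in place if an obstacle was reported, otherwise move ahead.
    return [(0.0, 0.0)] if "obstacle" in detected else [(0.0, 5.0)]


def toy_model(frame: SensorFrame) -> List[Waypoint]:
    return [(0.0, 0.0)] if frame.obstacle_ahead else [(0.0, 5.0)]


frame = SensorFrame(obstacle_ahead=True)
modular_path = modular_drive(frame, toy_perceive, toy_plan)
unified_path = end_to_end_drive(frame, toy_model)
```

In the modular version, the planner only ever receives the list of labels that perception chose to pass on. In the unified version, there is no intermediate hand-off at which information could be lost.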

One of the key benefits of MLLMs, and of Gemini in particular, is their ability to generalize from knowledge gleaned from vast amounts of Internet data. This lets the models adapt better to unusual situations on the road, such as animals appearing unexpectedly or road repair work. Additionally, models built on Gemini are capable of “chain-of-thought” reasoning, a technique that breaks complex problems down into sequential, logical steps and thereby improves decision making.

Despite these successes, Waymo acknowledges that EMMA has limitations. For example, the model does not yet support 3D input from sensors such as lidar or radar because of the high computational cost. Additionally, EMMA can only process a limited number of image frames at a time. Waymo emphasizes that further research will be needed to overcome these limitations before the model can be fully deployed in real-world conditions.
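To see what a limit on image frames means in practice, consider the Python sketch below. It is a hypothetical illustration only: the `FrameWindow` class and the limit of three frames are assumptions made for the example, and the article does not say how many frames EMMA handles or how it manages them.

```python
from collections import deque
from typing import Deque, List


class FrameWindow:
    """Keeps only the most recent frames (hypothetical helper)."""

    def __init__(self, max_frames: int) -> None:
        # A deque with maxlen silently discards the oldest item
        # once the limit is reached.
        self._frames: Deque[int] = deque(maxlen=max_frames)

    def push(self, frame_id: int) -> None:
        self._frames.append(frame_id)

    def context(self) -> List[int]:
        # What a model with this limit could look at right now.
        return list(self._frames)


window = FrameWindow(max_frames=3)  # 3 is an arbitrary example value
for frame_id in range(5):
    window.push(frame_id)

visible = window.context()  # frames 0 and 1 have been dropped
```

Anything that happened only in the discarded frames is no longer available to the model, which is why a short frame window limits how far back its reasoning can reach.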

Waymo also recognizes the risks of using MLLMs to drive autonomous vehicles. Models like Gemini can make mistakes or “hallucinate” even on simple tasks, which is unacceptable on the road. However, the hope is that further research and improvements to the architecture of AI models for autonomous driving will overcome these problems.

