At its I/O 2025 conference, Google presented Veo 3, its latest AI model for generating video from text descriptions. The model produces not only the picture but also the accompanying audio. Unlike competing models, it interprets the content of the frames and generates matching sound without additional prompts. To protect against deepfakes, all videos will be marked with an invisible watermark.


The model can create sound effects, background noise, and even dialogue, synchronizing them with the image. According to Demis Hassabis, head of Google DeepMind, users can describe characters and environments and even specify how lines should sound. The company does not disclose what data Veo 3 was trained on. As TechCrunch notes, YouTube material was most likely used: Google, which owns the platform, has previously confirmed that its content “can” be used to train models.

The generative video market is already crowded, with Runway, OpenAI, Alibaba, and dozens of startups releasing similar models. Google has gone further by introducing full-fledged audio. DeepMind previously developed a video-to-audio technology that analyzes the pixels of a video and automatically selects appropriate audio, and it likely formed the basis for the new system. To combat the spread of misinformation and deepfakes, all Veo 3 videos carry SynthID, an invisible embedded watermark.

At the same time, many artists and animators are voicing concern about these developments. According to a study commissioned by the Animation Guild, a union representing Hollywood animators, about 100,000 jobs in the U.S. film, television, and animation industries could be disrupted by AI by 2026.

Experts say Veo 3 could be a serious contender in the crowded generative video market, provided Google delivers on its audio quality promises. The model is already available in the Gemini app to subscribers of the company’s $249.99-per-month AI Ultra plan.
