ChatGPT gains vision: the bot can now understand live video from a smartphone camera and a shared screen

OpenAI announced that ChatGPT, its generative AI chatbot, can now process a video stream and talk with users about what it “observes” through a smartphone or computer camera, or what it sees on the device’s screen. The new feature is available in Advanced Voice Mode.

Image source: OpenAI

The vision capability lets ChatGPT “see” through the user’s smartphone camera or through screen sharing. Subscribers to the paid ChatGPT Plus, Team and Pro plans now have access to Advanced Voice Mode with vision. The company says ChatGPT Enterprise and Edu subscribers won’t get the feature until January, and that there is no timetable for its launch in the EU, Switzerland, Iceland, Norway and Liechtenstein.

In a recent demo on CBS’s 60 Minutes, OpenAI President Greg Brockman showed Advanced Voice Mode with vision to correspondent Anderson Cooper, testing the chatbot’s knowledge of anatomy. When Cooper drew body parts on a board, ChatGPT “understood” what he was drawing. In the same mode, however, ChatGPT made an error on a geometry problem, a reminder that it remains prone to hallucination.

Since announcing the feature in May, the company has delayed the launch of Advanced Voice Mode with vision several times. In April, OpenAI promised that the mode would be available to users “within a few weeks” but admitted months later that it would take longer than planned. And when Advanced Voice Mode launched for some users in September, it did not include the vision capability.

Google and Meta are also working on similar capabilities for their chatbots. This week, Google made Project Astra, its real-time video analysis AI feature, available to a group of “trusted testers” on the Android platform.

Published by
admin
