Data from thousands of now-private GitHub repositories is still available in Copilot, researchers find

Data that was publicly available online, even briefly, can remain accessible to generative AI chatbots like Microsoft Copilot long after public access to it has been removed, according to research from Israeli cybersecurity company Lasso, which specializes in emerging generative AI threats.

Image Source: Windows/unsplash.com

The issue affects thousands of once-public GitHub repositories belonging to a number of major companies, including Microsoft, that have since been made private, Lasso told TechCrunch.

According to Lasso co-founder Ofir Dror, the company discovered that content from its own GitHub repository was appearing in Copilot because it had been indexed and cached by Microsoft’s Bing search engine. The repository had briefly been made public by mistake and is now private. Attempting to access it on GitHub returns a “Page not found” message.

“On Copilot, oddly enough, we found one of our own private repositories,” Dror said. “If I were browsing the web, I wouldn’t see this data. But anyone who asks Copilot the right question can get it.”

Lasso then investigated further, compiling a list of repositories that were publicly accessible at some point in 2024 and identifying those that have since been removed or made private. Using Bing’s caching mechanism, the company found that more than 20,000 now-private GitHub repositories belonging to more than 16,000 organizations were still accessible through Copilot. The affected organizations include Amazon Web Services, Google, IBM, PayPal, Tencent, and Microsoft.
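
The first step of that investigation, separating repositories that are still public from those that no longer are, can be sketched roughly as follows. This is a minimal illustration and not Lasso's actual tooling: the function name `find_no_longer_public`, the `http_status` callback, and the `example-org` repository names are all hypothetical. The sketch relies only on the behavior described above, namely that GitHub shows “Page not found” (HTTP 404) to an anonymous visitor for a repository that has been removed or made private.

```python
from typing import Callable, Iterable, List


def find_no_longer_public(
    once_public: Iterable[str],
    http_status: Callable[[str], int],
) -> List[str]:
    """Return the "owner/name" entries whose GitHub page now answers 404.

    once_public: repositories known to have been public at some point.
    http_status: hypothetical callback that performs an unauthenticated
                 request for a URL and returns the HTTP status code.
    """
    no_longer_public = []
    for full_name in once_public:
        url = f"https://github.com/{full_name}"
        # A 404 for an anonymous visitor means the repository was
        # either removed or switched to private.
        if http_status(url) == 404:
            no_longer_public.append(full_name)
    return no_longer_public


# Stand-in for real HTTP requests so the sketch runs offline.
# These repositories and status codes are made up for illustration.
fake_statuses = {
    "https://github.com/example-org/still-public": 200,
    "https://github.com/example-org/now-private": 404,
}

candidates = ["example-org/still-public", "example-org/now-private"]
print(find_no_longer_public(candidates, fake_statuses.__getitem__))
```

Passing the status lookup in as a callback keeps the sketch runnable without network access; a real check would substitute an actual HTTP client and would still need a separate step to test whether cached copies of each flagged repository can be retrieved.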

Dror said Lasso contacted all companies that were “seriously affected” by the data exposure and advised them to rotate or revoke any compromised keys.

Lasso notified Microsoft of its findings in November 2024, but the software giant told the company it considered the issue to be of “low severity,” saying the caching behavior was “acceptable.” Microsoft said it would stop including Bing cache links in search results as of December 2024.

Lasso says, however, that Copilot still had access to the data after the caching feature was disabled, even though the data no longer appeared in web search results.

Published by
admin