Anthropic has announced the launch of an expanded bug bounty program, offering third-party cybersecurity experts rewards of up to $15,000 for identifying critical flaws in its artificial intelligence systems.


The initiative aims to uncover “universal jailbreaks”: attack techniques that can reliably bypass AI safety measures in high-risk areas such as chemical, biological, radiological and nuclear (CBRN) threats, as well as in the cyber domain. According to VentureBeat, Anthropic will invite ethical hackers to probe its safety systems before public release, so that potential exploits that could lead to abuse of its AI models can be closed in advance.

Interestingly, this approach differs from the strategies of other major AI players. OpenAI and Google, for example, run bug bounty programs, but they focus more on traditional software vulnerabilities than on exploits specific to AI models. Meta, meanwhile, has recently come under fire for its relatively closed stance on AI safety research. Anthropic’s explicit focus on openness, by contrast, sets a new standard for transparency in this area.

However, whether bug bounty programs can address the full range of AI security problems remains a matter of debate. Experts note that a more comprehensive approach may be required, including extensive testing, improved interpretability, and perhaps new governance structures to ensure AI systems reliably align with human values.

The program is launching as an invitation-only initiative in partnership with the well-known HackerOne platform, but Anthropic plans to open it up more broadly over time and to turn it into a model for industry-wide collaboration on AI security.
