Artificial intelligence company Anthropic is collaborating with the US Department of Energy on first-of-their-kind tests of its Claude 3 Sonnet model. The experiment is designed to verify that the AI will not share potentially dangerous nuclear-related information, in particular knowledge that could aid in building a weapon.
As Axios has learned, experts from the National Nuclear Security Administration (NNSA) at the US Department of Energy have been evaluating the Claude 3 Sonnet model since April of this year to ensure it cannot be used to help create atomic weapons. In these red-teaming exercises, the experts probe the system with adversarial prompts, trying to “jailbreak” it.
Anthropic says such tests, conducted in a top-secret environment, are the first of their kind and could pave the way for similar partnerships with other government agencies. “While American industry is leading the way in developing cutting-edge AI models, the federal government is gaining unique expertise needed to evaluate AI systems for specific national security risks,” said Marina Favaro, head of national security policy at Anthropic.
NNSA officials likewise emphasized the importance of this work. Wendin Smith, an NNSA deputy administrator, said AI is “a key technology that requires continued attention in the context of national security.”
Anthropic plans to continue working with the government to develop stronger safeguards for its systems. The pilot program, which also covers the newer Claude 3.5 Sonnet, will run until February 2025. The company promises to share its findings with scientific laboratories and other organizations.