Categories: Network newsTechnology and IT market. news

OpenAI will improve the safety of its AI models using a “hierarchy of instructions”

OpenAI has developed a new technique called Instruction Hierarchy to improve the security of its large language models (LLMs). This method, first used in the new GPT-4o Mini, aims to prevent unwanted AI behavior caused by unscrupulous users manipulating certain commands.

Image source: Copilot

OpenAI API platform lead Olivier Godement explained that the “hierarchy of instructions” will prevent dangerous injections of prompts using hidden hints that users use to bypass the limitations and initial settings of the model, and block “ignore all previous instructions” attacks.

The new method, according to The Verge, gives priority to the developer’s original instructions, making the model less susceptible to end-user attempts to force it to perform unwanted actions. In the event of a conflict between system instructions and user commands, the model will give highest priority to system instructions, refusing to perform injections.

OpenAI researchers believe that other, more sophisticated protections will be developed in the future, especially for agent-based use cases in which AI agents are created by developers for their own applications. Given that OpenAI faces ongoing security challenges, the new method applied to the GPT-4o Mini has significant implications for its subsequent approach to AI model development.

admin

Next Lack of contract progress forced Samsung to postpone construction of a plant in South Korea »

Previous « AMD said its Ryzen AI 300 processors are faster than the Apple M3 Pro

TikTok stopped working in the US prematurely

Short video service TikTok has stopped working in the United States. This happened after months…

4 minutes ago

Chinese developers of robots and self-driving electric vehicles believe they are ahead of American competitors in a number of areas

US sanctions against China are aimed at curbing the technological development of the latter country,…

4 hours ago

OpenAI will improve the safety of its AI models using a “hierarchy of instructions”

Recent Posts

TikTok stopped working in the US prematurely

Chinese developers of robots and self-driving electric vehicles believe they are ahead of American competitors in a number of areas

Scientists have found a way to ensure fast charging and long service life of lithium-sulfur batteries

The US government considers GlobalFoundries a good candidate to save Intel

Microsoft and Ubisoft have solved the problem of Assassin’s Creed compatibility with Windows 11 24H2

Windows 11 will become smarter: Microsoft is testing AI file search