
AI has surpassed scientists in generating innovative research ideas

Large language models (LLMs) have proven capable of generating new research ideas at the expert level. Moreover, according to a new study, these ideas turned out to be more original and interesting than those proposed by experts. This calls into question the uniqueness of human intelligence in the field of scientific innovation and opens new horizons for the development of AI in the scientific community.

Image source: NickyPe/Pixabay

Advances in large language models have sparked a wave of enthusiasm among researchers. It turns out that AI models such as OpenAI’s ChatGPT and Anthropic’s Claude are capable of independently generating and confirming new scientific hypotheses. Until recently, creating new knowledge and making scientific discoveries were considered the exclusive prerogative of humans, in contrast to AI’s mechanical recombination of knowledge from its training data. However, having already supplanted humans in the areas of artistic expression, music, and programming, AI has now taken aim at science, showing the ability to generate research ideas that are, on average, more novel than those proposed by scientists.

To test this hypothesis, a study was conducted in the field of natural language processing (NLP). NLP is a branch of AI that deals with communication between humans and machines in natural language. It covers not only basic syntax, but also the nuances of language, the understanding of context and, more recently, even verbal tone and the emotional nuances of speech. The study involved 100 NLP experts (holders of doctoral degrees from 36 different institutions), who entered into a kind of scientific competition with LLM-based “idea agents”. The goal was to find out whose research ideas would be more original, interesting and feasible.

To ensure the integrity of the experiment, 49 experts formulated ideas on 7 specific NLP topics, while an AI agent prepared by the researchers generated ideas on the same topics. To motivate the participants to produce quality ideas, the experts were paid $300 for each concept they proposed, and each of the top five human ideas earned an additional $1,000. Once the ideas were collected, an LLM was used to standardize the writing style of every proposal while preserving its original content, to even the odds and make the comparison as unbiased as possible.

Comparing the novelty of scientific ideas proposed by humans and AI under three conditions: human ideas, AI ideas, and human-edited AI ideas. Image Source: @ChengleiSi/X

All submitted proposals were then blindly reviewed by 79 external experts. The panel submitted 298 reviews, so each idea received two to four independent reviews. The results were striking. AI-generated ideas received statistically significantly higher ratings for novelty and excitement than human ideas. AI ideas also scored slightly lower on feasibility and slightly higher on effectiveness than human ideas, although these differences were not statistically significant.
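To make "statistically significant" concrete, here is a minimal sketch of how two groups of reviewer scores can be compared. The article does not say which test the researchers applied; Welch's t-test is assumed here as one common choice for groups of unequal variance. The function name `welch_t` and both score lists are hypothetical and invented purely for illustration. They are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    # Per-group variance of the sample mean (sample variance divided by n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical novelty scores on a 1-10 scale (illustration only, not study data).
ai_scores = [6, 7, 5, 6, 7, 6, 5, 7]
human_scores = [5, 4, 6, 5, 4, 5, 6, 4]

t, df = welch_t(ai_scores, human_scores)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A larger absolute value of t, read against the t distribution with `df` degrees of freedom, means the gap between the group means is less likely to be due to chance. When several criteria are tested at once, as in this study, the significance threshold is usually tightened to account for multiple comparisons.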

The study also revealed some shortcomings in AI performance, such as a lack of diversity of ideas. Even when clearly instructed not to repeat itself, the AI quickly began producing duplicates. Additionally, the AI was unable to evaluate ideas reliably, and its judgments showed low agreement with those of human reviewers. The study also has methodological limitations. In particular, assessing the “originality” of an idea remains subjective, even for a group of experts. The researchers therefore plan a more comprehensive study in which ideas generated by both AI and humans will be developed into full projects, which will allow for a more in-depth study of their impact in real-life scenarios. However, the first results of the study are certainly impressive.
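The lack of diversity can be pictured as a deduplication problem: generate many ideas, then count how many are genuinely distinct. The sketch below is an assumption-laden illustration, not the researchers' method. It uses simple word-overlap (Jaccard) similarity; the `0.6` threshold, the function names and the example idea titles are all hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two short texts."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def unique_ideas(ideas, threshold=0.6):
    """Keep an idea only if it is not too similar to any idea already kept."""
    kept = []
    for idea in ideas:
        if all(jaccard(idea, k) < threshold for k in kept):
            kept.append(idea)
    return kept


# Hypothetical idea titles, invented for illustration.
ideas = [
    "prompting with retrieved counterexamples to reduce hallucination",
    "reduce hallucination by prompting with retrieved counterexamples",
    "calibrating confidence through multilingual self-consistency",
]

distinct = unique_ideas(ideas)
print(f"{len(distinct)} distinct ideas out of {len(ideas)}")
```

If the share of distinct ideas keeps falling as more are generated, the generator is mostly repeating itself, which is the pattern the study describes. A real pipeline would more likely compare sentence embeddings than raw word overlap.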

Comparison of the ratings of scientific ideas proposed by humans and AI according to five key criteria: novelty, excitement, feasibility, effectiveness and overall assessment. Image Source: @ChengleiSi/X

Although AI models are becoming incredibly powerful tools, they still suffer from unreliability and a tendency to “hallucinate,” which becomes critical in scientific work that demands accurate and reliable information. By some estimates, at least 10% of scientific papers are now co-authored by AI. On the other hand, the potential of AI to accelerate progress in some areas of human activity should not be underestimated. A striking example is DeepMind’s GNoME system, which in a few months achieved the equivalent of about 800 years of research in materials science, generating the structures of about 380,000 new inorganic crystals that could revolutionize a variety of fields.

AI is now the fastest growing technology humanity has ever seen, and so it is reasonable to expect that many of its shortcomings will be corrected within the next couple of years. Many AI researchers believe that humanity is approaching the birth of general superintelligence—the point at which general-purpose AI will surpass human expertise in virtually every field. The ability of AI to generate more original and exciting ideas than scientists can lead to a rethinking of the process of scientific discovery and the role of humans in it.

admin
