A recent article in Nature discussed the results of a ChatGPT-based AI system designed to mimic the scientific process of analyzing data and generating a scientific data report from scratch. The results were both impressive and concerning, and they suggest at least two ways of improving generative AI tools:
- Designing the AI system to evaluate the novelty and validity of its results against existing knowledge.
- Providing transparency into how the AI system works so that the reported results can be tested and repeated.
Prompting AI to evaluate generated results for novelty so that the system moves beyond the first trivial solution or idea
The generative AI system discussed in Nature produced a well-written manuscript based on a solid analysis of the data set. The problem was that the manuscript reported “eating more fruit and vegetables and exercising is linked to a lower risk of diabetes” as a finding that “addresses a gap in the literature,” even though this finding is already well known.
The reported generative AI system was designed to mimic the scientific process by allowing it to explore data, determine the study goal, analyze the data given the study goal, and prepare a report. The system even included a review of the initial text to mimic scientific peer review. However, the review step did not evaluate the results against existing knowledge to determine their novelty and validity, which is an integral part of scientific peer review.
Accordingly, the generative AI system may be further improved by prompting it to compare its initial results with the existing literature and, if those results are already known or not valid, to conduct further data exploration and generate new hypotheses. It will be interesting to see if AI systems can help accelerate this iterative and time-consuming problem-solving process familiar to scientists and lawyers alike.
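As a rough illustration of what such a loop might look like, the Python sketch below wraps a generation step in a reviewer-style novelty check. It is not the system described in the Nature article; the `ask_model` helper and the prompt wording are hypothetical placeholders for whatever LLM client and prompts a developer would actually use.

```python
# Minimal sketch (not the system from the Nature article) of an iterative
# "generate, then check novelty" loop. `ask_model` is a hypothetical helper
# that sends a prompt to a chat-based LLM and returns its text reply.

def ask_model(prompt: str) -> str:
    raise NotImplementedError("wire this to your LLM client of choice")

def generate_finding(data_summary: str, prior_findings: list[str]) -> str:
    avoid = "; ".join(prior_findings) or "none yet"
    return ask_model(
        "You are analyzing the following data summary:\n"
        f"{data_summary}\n"
        f"Findings already proposed (do not repeat): {avoid}\n"
        "Propose one new, testable finding supported by the data."
    )

def is_novel_and_valid(finding: str) -> bool:
    verdict = ask_model(
        "Acting as a peer reviewer, compare this finding with well-established "
        "knowledge in the field and with the data provided. Answer NOVEL only "
        "if it is both new to the literature and supported by the analysis; "
        f"otherwise answer KNOWN or INVALID.\nFinding: {finding}"
    )
    return verdict.strip().upper().startswith("NOVEL")

def iterate_until_novel(data_summary: str, max_rounds: int = 5) -> str | None:
    proposed: list[str] = []
    for _ in range(max_rounds):
        finding = generate_finding(data_summary, proposed)
        if is_novel_and_valid(finding):
            return finding          # stop once the reviewer step accepts it
        proposed.append(finding)    # otherwise explore further and try again
    return None                     # give up after max_rounds attempts
```

The design point is that the reviewer-style prompt acts as a gate: the loop stops only when a proposed finding is judged both new and supported by the data, rather than after the model's first answer.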
Transparency of the AI system is essential for providing testable and repeatable results
A scientific hypothesis should be testable and results should be repeatable. Accordingly, the researchers behind this study highlighted that the detailed process and code used by the AI system to explore data, develop study goals, analyze, interpret, and report the results are included with the generated paper, so that anyone can understand and repeat the AI-based process that produced it.
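For illustration only, one simple way to provide that kind of transparency is to record every prompt and response so the run can be shipped alongside the paper and replayed. The sketch below is an assumption about how that might be done, not the logging used in the study; the step names and the `ask_model` helper are hypothetical.

```python
# Minimal sketch of recording an AI-driven analysis so it can be audited and
# repeated. Illustrative only; `ask_model` is a hypothetical LLM helper.
import json
import time

def ask_model(prompt: str) -> str:
    raise NotImplementedError("wire this to your LLM client of choice")

def run_step(log: list[dict], step: str, prompt: str) -> str:
    response = ask_model(prompt)
    log.append({
        "step": step,              # e.g. "explore data", "set study goal"
        "prompt": prompt,          # exact text sent to the model
        "response": response,      # exact text the model returned
        "timestamp": time.time(),  # when the step ran
    })
    return response

def save_provenance(log: list[dict], path: str = "provenance.json") -> None:
    # Ship this file with the generated paper so others can inspect and
    # replay every prompt that produced it.
    with open(path, "w") as f:
        json.dump(log, f, indent=2)
```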
Designing improved AI systems that mimic the scientific process to accelerate problem solving
The ability of AI to quickly explore large data sets and identify meaningful patterns suggests that AI systems may be useful for generating new ideas for solving problems that would be difficult for humans alone to identify. However, this process needs to include an evaluation of the novelty and validity of the AI-generated results against existing knowledge to ensure that the AI system moves beyond the first trivial solution or idea.
A pair of scientists has produced a research paper in less than an hour with the help of ChatGPT — a tool driven by artificial intelligence (AI) that can understand and generate human-like text. The article was fluent, insightful, and presented in the expected structure for a scientific paper, but researchers say that there are many hurdles to overcome before the tool can be truly helpful.