In a new experiment comparing different types of collaboration, researchers found that pairs of humans working together produced more original ideas than individuals collaborating with artificial intelligence or using internet search tools. The findings suggest that human interaction still holds a creative edge—especially when it comes to generating novel ideas—despite the growing capabilities of generative AI like ChatGPT.
Generative artificial intelligence has made headlines for its apparent creative capabilities, from composing music to brainstorming business ideas. These systems, such as ChatGPT, can generate content based on patterns in massive datasets. As they become increasingly integrated into everyday tasks, many researchers have begun to ask whether AI can actually enhance human creativity—or even surpass it.
To investigate this question, a team of researchers led by Min Tang at the University Institute of Schaffhausen compared creative performance across several types of collaboration. Their goal was to determine how working with AI stacks up against other sources of external input—like working with another human or using the internet for inspiration. The study was published in The Journal of Creative Behavior.
The researchers recruited 202 university students in Germany, mostly studying business-related fields, and assigned them to one of four conditions: human–human dyads, human–internet collaboration (using Google search), or one of two human–ChatGPT conditions that differed in the instructions participants received for working with the AI. Each participant or pair completed four creative tasks: two alternate uses tests (e.g., finding unusual uses for pants or a fork), a consequences task (e.g., imagining a world without food), and a creative problem-solving activity.
Before and after the tasks, participants answered surveys about their creative confidence and perceptions of the collaboration. The researchers also evaluated participants’ creative output using both trained human judges and an automated scoring system based on a large language model.
When it came to generating divergent ideas—the kinds of ideas that branch out and explore many possibilities—human–human pairs consistently performed best. Across all three divergent thinking tasks, their responses were rated as more original and clever by human judges than those produced by participants who used ChatGPT or Google.
The most striking difference came in the “fork” task, where human pairs significantly outshone the other groups. The researchers found no meaningful difference in performance between those who collaborated with ChatGPT and those who used internet search tools.
Interestingly, the human–human pairs were also the only group to show an increase in creative confidence after completing the tasks. Participants in these pairs reported feeling more capable and creative at the end of the session, suggesting that working with another person not only inspired better ideas, but also helped people feel better about their own creativity. Those who worked with ChatGPT or Google did not experience a similar boost.
The study also highlighted differences in how participants perceived their collaborators. Those in the human–human condition saw their partners as contributing equally to the task. But people who used Google tended to view themselves as the main driver of the ideas, while those who used ChatGPT saw the AI as doing most of the creative heavy lifting. Although ChatGPT was seen as more helpful than Google, participants often attributed the success of the collaboration to the AI rather than to their own input.
One of the more surprising findings came from the automated scoring system, which rated the ChatGPT-assisted ideas as more creative than those from human–human teams. This result was the opposite of what human judges concluded. After further analysis, the researchers discovered that the AI scoring system was heavily influenced by the length of the responses.
Since the ChatGPT-assisted responses tended to be longer and more elaborate, the automated system may have mistaken verbosity for creativity. Once the researchers accounted for response length, the apparent advantage for ChatGPT disappeared.
This discrepancy between human and AI evaluations points to what the researchers call “elaboration bias”—a tendency for automated scoring systems to overvalue longer, more detailed responses, even if they are not especially novel. The findings raise questions about whether current AI tools can reliably assess creativity, especially in languages or contexts they were not extensively trained on.
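The article does not spell out the exact statistical adjustment the researchers used, but a common way to check for this kind of length confound is to residualize the automated scores on response length and then compare the groups on what remains. The sketch below illustrates that idea with made-up numbers; the variable names, data values, and the use of a simple linear regression are assumptions for illustration, not the authors' actual analysis.

```python
import numpy as np
from scipy import stats

# Illustrative (made-up) data: automated originality ratings and response
# lengths in words for two conditions. 0 = human-human, 1 = human-ChatGPT.
auto_scores = np.array([3.1, 3.4, 2.9, 3.6, 4.4, 4.1, 4.6, 4.2])
word_counts = np.array([42, 55, 38, 60, 120, 105, 140, 110])
condition = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Step 1: regress the automated scores on response length and keep the
# residuals, i.e., the part of each score that length does not explain.
slope, intercept, r_value, p_value, std_err = stats.linregress(word_counts, auto_scores)
residuals = auto_scores - (intercept + slope * word_counts)

# Step 2: compare the conditions on the raw and the length-adjusted scores.
raw_diff = auto_scores[condition == 1].mean() - auto_scores[condition == 0].mean()
adj_diff = residuals[condition == 1].mean() - residuals[condition == 0].mean()
t_stat, t_p = stats.ttest_ind(residuals[condition == 1], residuals[condition == 0])

print(f"score-length correlation:   r = {r_value:.2f}")
print(f"raw group difference:       {raw_diff:.2f}")
print(f"length-adjusted difference: {adj_diff:.2f} (t = {t_stat:.2f}, p = {t_p:.3f})")
```

The same logic can also be expressed as a regression with both response length and condition as predictors; either way, the question is whether a group difference survives once length is held constant.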
The researchers caution that their study only looked at a specific kind of creativity—divergent thinking—where originality and unusualness are key. They did not find any significant differences between the groups on the problem-solving task, which involved selecting the most serious consequence from the earlier task and coming up with a creative but useful solution. It’s possible that AI tools may still be helpful in tasks that require refining or converging on an idea, rather than generating a wide range of new ones.
There are also limitations in how much the study can tell us about real-world creative collaborations. Participants used ChatGPT and Google in a lab setting, with constraints on how they could interact with the tools. The researchers did not analyze the actual back-and-forth between people and AI, which could reveal more about how ideas are accepted, rejected, or transformed during the creative process. In future studies, recording these interactions might help explain why AI partnerships seem less effective at boosting creativity—and why people sometimes feel less ownership over ideas generated with the help of a machine.
While generative AI may still play a role in helping people think outside the box, the new study suggests it hasn’t replaced the unique spark that can come from two people bouncing ideas off each other. Collaboration between humans continues to generate not only more original ideas, but also more confidence in one’s own creative abilities. As the researchers put it, “creativity is a unique human endowment that is not easily replicated by AI.”
The study, “‘Who’ Is the Best Creative Thinking Partner? An Experimental Investigation of Human–Human, Human–Internet, and Human–AI Co-Creation,” was authored by Min Tang, Sebastian Hofreiter, Christian H. Werner, Aleksandra Zielińska, and Maciej Karwowski.