Artificial intelligence has been making headlines for its capabilities in various fields, but can it make us laugh? A recent study published in the journal PLOS ONE suggests that AI might have the upper hand in humor as well. The research shows that AI-generated humor was rated as funny as or funnier than human-created jokes, even when pitted against professional satirists from The Onion.
Creating humor is notoriously difficult. To be perceived as funny, jokes need to strike a balance between being surprising and benign. Most people develop their sense of humor through exposure and practice, picking up on patterns that make jokes work. Researchers wanted to see if large language models (LLMs), a type of artificial intelligence designed to understand, generate, and manipulate human language, could replicate this human skill.
LLMs are built using vast amounts of textual data and complex algorithms to create models capable of predicting and generating text. These models learn by processing and analyzing extensive datasets, which enables them to recognize patterns, understand context, and produce coherent text responses to prompts.
The study aimed to explore whether LLMs could generate humor that resonates with people. This question is particularly relevant given the entertainment industry’s ongoing debate about the use of AI in creative fields. The study’s lead researcher, Drew Gorenz of the University of Southern California, noted that recent strikes by Hollywood writers and actors highlight the fear that AI could threaten jobs and creativity in the entertainment industry.
The researchers conducted two main studies to compare the humor production abilities of AI and humans. They used OpenAI’s ChatGPT 3.5 for the AI-generated content. The first study compared ChatGPT’s humor with that of laypeople, while the second compared it with that of professional satirists from The Onion.
In the first study, 105 participants from Amazon Mechanical Turk, an online workforce platform, were asked to complete three humor tasks. These tasks involved creating humorous phrases for given acronyms, answering fill-in-the-blank prompts humorously, and crafting roast jokes in response to hypothetical scenarios. Participants were explicitly told to use their own imagination and not to copy jokes from other sources.
ChatGPT 3.5 was given the same tasks, producing 20 responses for each prompt. These AI-generated jokes were then mixed with human-created jokes and evaluated by a separate group of 200 participants, who rated their funniness on a seven-point scale.
The AI’s jokes were consistently rated as funnier across all three tasks. Overall, ChatGPT’s jokes outperformed those of most human participants, with the AI excelling particularly in the roast joke task.
Specifically, ChatGPT outperformed 73% of the human participants in the acronyms task, 63% of the human participants in the fill-in-the-blank task, and 87% of human participants in the roast joke task.
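To make these percentages concrete, here is a minimal sketch of one way such a share could be computed, assuming that ChatGPT and each human participant are each summarized by a mean funniness rating on the seven-point scale. The function name and the ratings below are hypothetical illustrations, not the study's data, and this is not necessarily the calculation the authors used.

```python
def share_outperformed(ai_mean, human_means):
    """Return the percentage of human participants whose mean
    funniness rating falls strictly below the AI's mean rating."""
    if not human_means:
        raise ValueError("human_means must not be empty")
    beaten = sum(1 for rating in human_means if rating < ai_mean)
    return 100.0 * beaten / len(human_means)


# Hypothetical mean ratings (seven-point scale), invented for illustration.
human_means = [2.1, 3.4, 2.8, 3.9, 1.7, 3.0, 4.2, 2.5]
ai_mean = 3.5

# Six of the eight hypothetical participants score below 3.5.
print(share_outperformed(ai_mean, human_means))  # 75.0
```

Under this reading, a figure such as 87% in the roast joke task would mean that ChatGPT's average rating exceeded the average rating of 87% of the human joke writers.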
In the second study, the researchers compared AI-generated satirical headlines to those from The Onion. They used a convenience sample of 217 students from the University of Southern California. Each student rated the funniness of a mix of headlines generated by ChatGPT and The Onion, without knowing the source of each headline.
The results showed no significant difference in the average funniness ratings between the AI-generated headlines and those from The Onion. Among the four highest-rated headlines, two were generated by ChatGPT and two by The Onion. Notably, the highest-rated headline was an AI-generated one: “Local Man Discovers New Emotion, Still Can’t Describe It Properly.” This suggests that ChatGPT can produce satirical content that is on par with that of professional writers.
These findings indicate that AI, specifically ChatGPT 3.5, has a surprising proficiency in humor production. Despite lacking emotions and personal experiences, the AI was able to analyze patterns and create jokes that resonated well with people.
“Since ChatGPT can’t feel emotions itself but it tells novel jokes better than the average human, these studies provide evidence that you don’t need to feel the emotions of appreciating a good joke to tell a really good one yourself,” Gorenz said.
The researchers also explored whether demographic factors influenced humor ratings. They found that age, sex, and political orientation did not significantly affect participants’ preferences for AI-generated versus human-generated jokes. This suggests that the AI’s humor appeal was broad and not limited to specific demographic groups.
While the study’s findings are intriguing, they come with several caveats. For example, the humor tasks were text-based and did not involve delivery, which is a critical component of humor. AI-generated jokes might not perform as well in formats that require timing and presentation, such as stand-up comedy or sketch shows.
“That ChatGPT can produce written humor at a quality that exceeds laypeople’s abilities and equals some professional comedy writers has important implications for comedy fans and workers in the entertainment industry,” the researchers wrote. “For professional comedy writers, our results suggest that LLMs can pose a serious employment threat. The implications are more positive for people who merely want to reap the benefits of elevating their everyday communications with a dose of humor. They can turn to LLMs for help.”
The study, “How funny is ChatGPT? A comparison of human- and A.I.-produced jokes,” was authored by Drew Gorenz and Norbert Schwarz.