Introduction to AI-Generated Content
A recent study in the peer-reviewed journal PNAS reports that large language models (LLMs) tend to prefer content written by other LLMs over content written by humans. These models underpin many AI systems, so the preference could shape how online content is discovered and ranked.
About the Study
The study was conducted by a team of researchers led by Walter Laurito and Jan Kulveit. They compared human-written and AI-written versions of the same items across three categories: marketplace product descriptions, scientific paper abstracts, and movie plot summaries. Popular LLMs, including GPT-3.5 and GPT-4, acted as selectors: each prompt showed the model a pair of versions of the same item and forced it to pick exactly one.
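The pairwise setup described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the `Selector` callable, the prompt wording, and the randomized presentation order are all assumptions introduced here, the last one as a guard against a model simply favoring whichever option appears first.

```python
import random
from typing import Callable, Sequence, Tuple

# Hypothetical interface: a selector takes a prompt and replies "1" or "2".
Selector = Callable[[str], str]


def build_prompt(option_1: str, option_2: str) -> str:
    """Assumed prompt wording; the study's actual prompts may differ."""
    return (
        "Two descriptions of the same item follow. "
        "Reply with only '1' or '2' to indicate your pick.\n\n"
        f"Option 1:\n{option_1}\n\nOption 2:\n{option_2}\n"
    )


def ai_pick_rate(
    pairs: Sequence[Tuple[str, str]], selector: Selector, seed: int = 0
) -> float:
    """Return the fraction of valid picks that went to the AI-written text.

    Each pair is (human_text, ai_text). Presentation order is randomized
    per pair; replies other than "1" or "2" are skipped.
    """
    rng = random.Random(seed)
    ai_picks = 0
    valid = 0
    for human_text, ai_text in pairs:
        ai_first = rng.random() < 0.5
        first, second = (ai_text, human_text) if ai_first else (human_text, ai_text)
        answer = selector(build_prompt(first, second)).strip()
        if answer not in ("1", "2"):
            continue  # unparseable reply, not counted either way
        valid += 1
        picked_first = answer == "1"
        if picked_first == ai_first:
            ai_picks += 1
    if valid == 0:
        raise ValueError("selector returned no valid picks")
    return ai_picks / valid
```

Any model client can be wrapped as a `Selector`, and the same function can score human raters if their answers are recorded as "1" or "2".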
Key Findings
The results show a consistent tendency for LLMs to prefer LLM-written options. Given a choice between two comparable pieces of content, one written by a human and the other by an LLM, the LLM selector is more likely to choose the AI-written one, and it does so more often than human raters do. The study found that:
- For marketplace product descriptions, LLMs chose the AI-written version 89% of the time, while human raters chose it 36% of the time.
- For scientific paper abstracts, LLMs chose the AI-written version 78% of the time, while human raters chose it 61% of the time.
- For movie summaries, LLMs chose the AI-written version 70% of the time, while human raters chose it 58% of the time.
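One way to read these figures is as the gap, in percentage points, between how often LLMs and human raters picked the AI-written version. The short sketch below only restates the numbers listed above; the `preference_gap` helper and the dictionary layout are illustrative choices, and the gap is a reading aid rather than a statistic reported by the study.

```python
# (LLM pick rate, human pick rate) for the AI-written version, in percent.
rates = {
    "product descriptions": (89, 36),
    "paper abstracts": (78, 61),
    "movie summaries": (70, 58),
}


def preference_gap(llm_pct: int, human_pct: int) -> int:
    """Percentage-point gap between LLM and human pick rates."""
    return llm_pct - human_pct


gaps = {category: preference_gap(*pair) for category, pair in rates.items()}

for category, gap in gaps.items():
    print(f"{category}: {gap} percentage points")
# product descriptions: 53 percentage points
# paper abstracts: 17 percentage points
# movie summaries: 12 percentage points
```

The gap is widest for product descriptions, the only category where human raters picked the AI-written version less than half the time.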
Why This Matters
The preference matters wherever an LLM sits between content and its audience. If marketplaces, chat assistants, or search experiences use LLMs to score or summarize listings, AI-assisted copy may be more likely to be selected in those systems. This could create a "gate tax," where businesses feel compelled to pay for AI writing tools to avoid being down-selected by AI evaluators.
Limits and Questions
While the study’s findings are intriguing, there are some limitations to consider. The human baseline in the study is small, and the pairwise choices don’t measure sales impact. The findings may also vary by prompt design, model version, domain, and text length. The mechanism behind the preference is still unclear, and the authors call for follow-up work on stylometry and mitigation techniques.
Looking Ahead
As AI-mediated ranking continues to expand in commerce and content discovery, it’s reasonable to consider AI assistance where it directly affects visibility. However, this should be treated as an experimentation lane rather than a blanket rule. Human writers should still be kept in the loop for tone and claims, and customer outcomes should be used to validate the effectiveness of AI-assisted content.
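Validating with customer outcomes could take the form of a simple A/B comparison between human-written and AI-assisted copy. The sketch below is one hedged option, not something the study prescribes: it applies a standard two-proportion z-test to conversion counts, and the function name, the variant labels, and the idea of using conversions as the outcome are assumptions made for illustration.

```python
import math
from typing import Tuple


def two_proportion_z(
    conv_a: int, n_a: int, conv_b: int, n_b: int
) -> Tuple[float, float]:
    """Two-sided two-proportion z-test with a pooled standard error.

    Variant A is assumed to be human-written copy and variant B
    AI-assisted copy. Returns (z, p_value); a positive z means
    variant B converted at a higher rate than variant A.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0 or all 1")
    z = (p_b - p_a) / se
    # Two-sided p-value under the normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

For example, `two_proportion_z(50, 1000, 80, 1000)` compares hypothetical counts of 50 and 80 conversions out of 1,000 visitors each. The normal approximation is only reasonable when each variant has a fair number of conversions and non-conversions.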
Conclusion
The study’s findings suggest that LLMs prefer AI-written content, which matters wherever LLMs rank, score, or summarize what people see. As AI plays a larger role in content discovery and ranking, keeping human writers in the loop and validating with customer outcomes can help ensure that AI-assisted content is used effectively and responsibly.