AI writes stories in a fundamentally different way from humans, and even repeated changes to the prompt rarely break the pattern. On May 28th, a professor from the Wharton School shared on a social platform a study jointly published by the Department of Computer Science at the University of Maryland and Google DeepMind. The study shows that a fully automated analysis pipeline called StoryScope can detect AI-written text with up to 93.2% accuracy, using only narrative features such as plot, subject, and temporal structure.

In this large-scale "literary dissection" experiment, the research team collected tens of thousands of writing prompts and compared more than 60,000 stories written by human authors or generated by five large language models: Claude, DeepSeek, Gemini, GPT, and Kimi. The results show that when AI invents a story it falls back on a narrow default narrative template, leaving highly recognizable "narrative fingerprints."

Specifically, the study identifies five fundamental differences in how AI writes. First, AI tends to be overly didactic: it states the theme of the story outright in nearly 80% of cases and even forces philosophical discussion into dialogue, far more often than human authors do. Second, where human authors make skilled use of nonlinear flashbacks and digressions, AI moves straight ahead: nearly 80% of its stories have no subplots, and their endings tend to follow an uplifting, morally tidy pattern built on the protagonist's sudden realization.

More interestingly, AI is almost addicted to describing bodily sensations. Having no real emotional experience, it cannot directly perceive sadness or fear, so it stacks up physiological reactions and environmental metaphors in a textbook manner, and the resulting descriptions are often jarring and overwrought. AI also lacks "reader awareness" and rarely breaks the fourth wall to address the reader.
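
To make the idea of detection from narrative features alone more concrete, here is a minimal sketch in Python. It is not StoryScope's actual pipeline: the feature names, the choice of a Bernoulli naive Bayes classifier, and the eight toy stories are all assumptions invented for illustration, loosely based on the traits described above.

```python
import math
from collections import defaultdict

# Hypothetical binary narrative features (NOT StoryScope's real feature set).
FEATURES = [
    "states_theme_explicitly",
    "has_subplot",
    "nonlinear_timeline",
    "epiphany_ending",
    "breaks_fourth_wall",
]


def story(*flags):
    """Build a feature dict from 0/1 flags given in FEATURES order."""
    return dict(zip(FEATURES, (bool(f) for f in flags)))


# Invented toy data for illustration only; not drawn from the study.
TOY_STORIES = [
    (story(1, 0, 0, 1, 0), "ai"),
    (story(1, 0, 0, 1, 0), "ai"),
    (story(1, 0, 0, 0, 0), "ai"),
    (story(0, 0, 0, 1, 0), "ai"),
    (story(0, 1, 1, 0, 0), "human"),
    (story(0, 1, 0, 0, 1), "human"),
    (story(1, 1, 1, 0, 0), "human"),
    (story(0, 0, 1, 0, 1), "human"),
]


def train(examples):
    """Count how often each feature is present per label."""
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for features, label in examples:
        class_counts[label] += 1
        for name in FEATURES:
            if features.get(name, False):
                feature_counts[label][name] += 1
    return class_counts, feature_counts


def predict(model, features):
    """Return the label with the highest log-posterior under naive Bayes."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for name in FEATURES:
            # Laplace-smoothed probability that the feature is present.
            p = (feature_counts[label][name] + 1) / (n + 2)
            score += math.log(p if features.get(name, False) else 1 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TOY_STORIES)
print(predict(model, story(1, 0, 0, 1, 0)))
print(predict(model, story(0, 1, 1, 0, 1)))
```

The point of the sketch is that the classifier never looks at word choice or style: it sees only a handful of structural yes/no answers about each story. The hard part in practice is extracting such features from raw text automatically, which this toy example simply takes as given.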

Each of the models also has its own characteristic weaknesses. The study points out that Claude's plots usually progress in a flat, uneventful way; GPT frequently relies on dream sequences for turning points in a story; and Gemini tends to describe characters coldly from an external perspective, as if reading out character profiles.

The research team has now made the project's full code and narrative text corpus publicly available. This not only gives the literary world a "mirror that reveals true nature," but also serves as a warning to the many writers who rely on AI for creative assistance: although large models may imitate the style and technique of any great writer, they can never live real life on behalf of humans, and lived experience is the ultimate source of good stories.