
COIG-Writer: A High-Quality Dataset for Chinese Creative Writing with Thought Processes

Yunwen Li, Shuangshuang Ying, Xingwei Qu, Xin Li, Sheng Jin, Minghao Liu, Zhoufutu Wen, Tianyu Zheng, Xeron Du, Qiguang Chen, Jiajun Shi, Wangchunshu Zhou, Jiazhan Feng, Wanjun Zhong, Libo Qin, Stephen Huang, Wanxiang Che, Chenghua Lin, Eli Zhang

2025-10-17


Summary

This paper investigates why large language models struggle with creative writing, especially in languages other than English, and introduces a new dataset designed to help them improve.

What's the problem?

Large language models aren't very good at creative writing, and this problem is much worse for languages like Chinese because there's less training data available. Existing datasets only show the final product (the story), but don't explain *how* a writer thought through the process of creating it. This lack of understanding of the creative process hinders the models' ability to learn to write creatively.

What's the solution?

The researchers created a new dataset called COIG-Writer specifically for Chinese creative writing. This dataset is different because it includes not just the finished story, but also the original prompt that inspired it *and* a detailed breakdown of the reasoning and decisions a writer made while crafting it. They then trained and tested models on this dataset, finding that learning from the writing process itself (process supervision) works best when stabilized with general language data: at least twelve general examples are needed for every creative example, and with less general data performance steadily drops. They also discovered that creative writing skills don't transfer between languages, and that using a wider variety of words doesn't necessarily mean the writing is better.
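To make the mixing idea concrete, here is a minimal sketch of how training data might be blended at the 1:12 creative-to-general ratio the paper reports as the stability threshold. The function name, sample format, and sampling strategy are illustrative assumptions, not the authors' actual pipeline.

```python
import random

def mix_samples(creative, general, ratio=12, seed=0):
    """Blend creative-writing samples with general-purpose samples at
    roughly 1 creative : `ratio` general, then shuffle.

    Hypothetical sketch: the paper reports that below about 1:12
    (too little general data relative to creative data), win rates
    progressively degrade, so the general pool acts as a stabilizer.
    """
    rng = random.Random(seed)
    # Keep every creative sample; draw `ratio` general samples per creative one,
    # cycling through the general pool if it is smaller than needed.
    needed = len(creative) * ratio
    pool = [general[i % len(general)] for i in range(needed)]
    mixed = list(creative) + pool
    rng.shuffle(mixed)
    return mixed

creative = [{"type": "creative", "id": i} for i in range(3)]
general = [{"type": "general", "id": i} for i in range(100)]
batch = mix_samples(creative, general)
print(len(batch))  # 3 creative + 36 general = 39
```

In practice the ratio would be applied over a much larger corpus, but the invariant is the same: the creative triplets never outnumber one-twelfth of the mixture.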

Why it matters?

This research shows that truly creative writing isn't just about knowing a lot of words; it's about having a solid logical structure and understanding how to build a narrative. It highlights the need for datasets that capture the *thinking* behind writing, not just the final result, and that different languages require tailored approaches to improve creative AI. It's like how you need both math skills and language skills to really understand something – one doesn't replace the other.

Abstract

Large language models exhibit systematic deficiencies in creative writing, particularly in non-English contexts where training data is scarce and lacks process-level supervision. We present COIG-Writer, a novel Chinese creative writing dataset that captures both diverse outputs and their underlying thought processes through systematic reverse-engineering of high-quality texts. Unlike existing datasets that provide only input-output pairs, COIG-Writer comprises 1,665 meticulously curated triplets spanning 51 genres, each containing: (1) a reverse-engineered prompt, (2) detailed creative reasoning documenting decision-making processes, and (3) the final text. Through comprehensive experiments, we identify a two-component model of creative writing: narrative logic (provided by process supervision) and linguistic expression (maintained by general-purpose data). Our findings reveal three critical insights: (1) process supervision is highly effective but requires stabilization with general data, with a ratio of at least one creative sample to twelve general samples needed to achieve optimal performance; below this threshold, the win rate progressively degrades (from 62.75% down to 35.78%); (2) creative capabilities are culturally bound with no cross-lingual transfer (an 89.26pp gap between Chinese and English performance); and (3) lexical diversity inversely correlates with creative quality (the TTR paradox), suggesting that high diversity signals compensatory behavior for logical deficiencies. These findings establish that creative excellence emerges from the interaction between logical scaffolding and linguistic grounding, analogous to how mathematical reasoning enhances but cannot replace linguistic competence in foundation models.
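The "TTR paradox" refers to the type-token ratio, a standard lexical-diversity measure: unique word types divided by total word tokens. A minimal sketch of the metric (whitespace/word-character tokenization is an assumption here; the paper's exact tokenization for Chinese text would differ):

```python
import re

def type_token_ratio(text):
    """Type-token ratio: unique tokens / total tokens.

    Higher TTR means more lexical variety. The abstract's finding is
    that TTR correlates *inversely* with judged creative quality,
    i.e. very diverse wording can mask weak narrative logic.
    """
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

# 6 tokens, 5 unique types ("the" repeats)
print(round(type_token_ratio("the cat sat on the mat"), 3))  # 0.833
```

Note that raw TTR is length-sensitive (longer texts naturally repeat more words), which is why it is usually compared across texts of similar length.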