PoeTone: A Framework for Constrained Generation of Structured Chinese Songci with LLMs

Authors

  • Zhan Qu (Technische Universität Dresden; ScaDS.AI)
  • Shuzhou Yuan (Technische Universität Dresden; ScaDS.AI)
  • Michael Färber (Technische Universität Dresden; ScaDS.AI)

DOI:

https://doi.org/10.1609/aaai.v40i39.40555

Abstract

This paper presents a systematic investigation into the constrained generation capabilities of large language models (LLMs) in producing Songci, a classical Chinese poetry form characterized by strict structural, tonal, and rhyme constraints defined by Cipai templates. We first develop a comprehensive, multi-faceted evaluation framework that includes: (i) a formal conformity score, (ii) automated quality assessment using LLMs, (iii) human evaluation, and (iv) classification-based probing tasks. Using this framework, we evaluate the generative performance of 18 LLMs, including 3 proprietary models and 15 open-source models across 4 families, under five prompting strategies: zero-shot, one-shot, completion-based, instruction-based, and chain-of-thought. Finally, we propose a Generate-Critic architecture in which the evaluation framework functions as an automated critic. Leveraging the critic’s feedback as a scoring function for best-of-N selection, we fine-tune 3 lightweight open-source LLMs via supervised fine-tuning (SFT), resulting in improvements of up to 5.88% in formal conformity. Our findings offer new insights into the generative strengths and limitations of LLMs in producing culturally significant and formally constrained literary texts.
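The best-of-N selection step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the scoring function below checks only line lengths, whereas the paper's formal conformity score also covers tonal and rhyme constraints. The names `TEMPLATE`, `split_lines`, `length_conformity`, and `best_of_n`, the line-length pattern, and the placeholder candidate texts are all assumptions made for illustration.

```python
import re
from typing import Callable, Sequence

# Hypothetical line-length pattern, used only for illustration.
TEMPLATE = [1, 7, 3, 5]


def split_lines(poem: str) -> list[str]:
    """Split a poem on common Chinese punctuation and whitespace, dropping empty pieces."""
    return [seg for seg in re.split(r"[，。、；！？\s]+", poem) if seg]


def length_conformity(poem: str, template: Sequence[int]) -> float:
    """Toy score in [0, 1]: share of lines whose character count matches the template.

    Assumption: a real critic would also check tone patterns and rhyme.
    """
    if not template:
        return 0.0
    lines = split_lines(poem)
    matches = sum(1 for got, want in zip(lines, template) if len(got) == want)
    # Dividing by the longer length penalises missing and extra lines alike.
    return matches / max(len(lines), len(template))


def best_of_n(candidates: Sequence[str], score: Callable[[str], float]) -> str:
    """Return the candidate with the highest critic score (first one wins ties)."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=score)


# Placeholder candidates standing in for N sampled model outputs.
CANDIDATES = [
    "月月，春风春风。花花花，月月月月月。",
    "月，春风春风春风春。花花花，月月月月月。",
    "花花花。",
]

best = best_of_n(CANDIDATES, lambda p: length_conformity(p, TEMPLATE))
print(best, length_conformity(best, TEMPLATE))
```

In a pipeline of the kind the abstract describes, the selected candidates would then serve as training targets for supervised fine-tuning; that step is outside the scope of this sketch.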

Published

2026-03-14

How to Cite

Qu, Z., Yuan, S., & Färber, M. (2026). PoeTone: A Framework for Constrained Generation of Structured Chinese Songci with LLMs. Proceedings of the AAAI Conference on Artificial Intelligence, 40(39), 32764–32772. https://doi.org/10.1609/aaai.v40i39.40555

Section

AAAI Technical Track on Natural Language Processing IV