IntrEx: A Dataset for Modeling Engagement in Educational Conversations
Xingwei Tan, Mahathi Parvatham, Chiara Gambi, Gabriele Pergola
2025-09-15
Summary
This research focuses on understanding what makes conversations between teachers and students learning a new language actually *interesting*, and what keeps those students engaged, something that's surprisingly hard to pinpoint.
What's the problem?
While we know a fair amount about what makes written learning materials engaging, we don't really understand *why* some conversations are more captivating than others in a language-learning context. It's not enough to look at individual responses in isolation; engagement builds up over the course of a conversation, and previous studies haven't captured that dynamic.
What's the solution?
The researchers created a new dataset called IntrEx, which contains conversations between teachers and students learning a second language. What makes IntrEx special is that it's annotated with ratings of how interesting the conversations are, not just for single turns but for whole stretches of dialogue. Over 100 second-language learners rated the conversations by comparing them against each other (an approach borrowed from reinforcement learning from human feedback, or RLHF), and these ratings were then used to fine-tune smaller AI models (7B/8B parameters) to predict what humans find interesting. Surprisingly, these smaller, specialized models performed better than much larger, general-purpose models like GPT-4o. The researchers also analyzed which specific language features drive engagement, such as how concrete the language is, how easy it is to understand, and how well teachers pick up on what students say.
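The summary doesn't spell out how pairwise comparisons become per-dialogue scores, but a standard way to do it, and the one underlying most RLHF-style preference data, is a Bradley-Terry model. The sketch below is purely illustrative (the function name and toy comparison data are ours, not the authors' pipeline):

```python
# Minimal sketch (not the authors' code): turning pairwise "which dialogue
# is more interesting?" judgments into scalar scores with a Bradley-Terry
# model, fitted by simple gradient ascent on the log-likelihood.
import math

def bradley_terry(comparisons, n_items, lr=0.05, epochs=200):
    """comparisons: list of (winner_id, loser_id) pairs from annotators."""
    scores = [0.0] * n_items  # latent interestingness, one per dialogue
    for _ in range(epochs):
        grads = [0.0] * n_items
        for w, l in comparisons:
            # P(w beats l) under the current scores (logistic model)
            p = 1.0 / (1.0 + math.exp(scores[l] - scores[w]))
            grads[w] += 1.0 - p   # push the winner's score up
            grads[l] -= 1.0 - p   # push the loser's score down
        scores = [s + lr * g for s, g in zip(scores, grads)]
    return scores

# Hypothetical usage: dialogue 0 preferred over 1 twice, 1 over 2 once.
print(bradley_terry([(0, 1), (0, 1), (1, 2)], n_items=3))
```

The appeal of comparison-based annotation is exactly what the paper notes: annotators agree more reliably on "A is more interesting than B" than on absolute 1-to-5 scores.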
Why it matters?
This work is important because it provides a way to actually measure and understand engagement in language learning conversations. By building a dataset and training AI models to recognize interesting interactions, we can potentially develop tools and strategies to help teachers create more engaging lessons and improve student motivation, ultimately leading to better language acquisition.
Abstract
Engagement and motivation are crucial for second-language acquisition, yet maintaining learner interest in educational conversations remains a challenge. While prior research has explored what makes educational texts interesting, little is known about the linguistic features that drive engagement in conversations. To address this gap, we introduce IntrEx, the first large dataset annotated for interestingness and expected interestingness in teacher-student interactions. Built upon the Teacher-Student Chatroom Corpus (TSCC), IntrEx extends prior work by incorporating sequence-level annotations, allowing for the study of engagement beyond isolated turns to capture how interest evolves over extended dialogues. We employ a rigorous annotation process with over 100 second-language learners, using a comparison-based rating approach inspired by reinforcement learning from human feedback (RLHF) to improve agreement. We investigate whether large language models (LLMs) can predict human interestingness judgments. We find that LLMs (7B/8B parameters) fine-tuned on interestingness ratings outperform larger proprietary models like GPT-4o, demonstrating the potential for specialised datasets to model engagement in educational settings. Finally, we analyze how linguistic and cognitive factors, such as concreteness, comprehensibility (readability), and uptake, influence engagement in educational dialogues.
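To make the last point concrete, here is a minimal sketch of two of the features the abstract names: comprehensibility via the standard Flesch reading-ease formula, and concreteness as the mean of per-word concreteness norms. The tiny `CONCRETENESS` dictionary and the vowel-group syllable heuristic are stand-ins (real analyses typically use published norms such as Brysbaert et al.'s and a pronunciation dictionary); this is not the paper's released code:

```python
# Minimal sketch (assumed feature definitions, not the released pipeline):
# two engagement-related features, computed per dialogue turn.
import re

# Toy stand-in for published concreteness norms (1 = abstract, 5 = concrete).
CONCRETENESS = {"apple": 5.0, "run": 4.5, "idea": 1.6, "freedom": 1.4}

def count_syllables(word):
    """Crude vowel-group heuristic; real pipelines use a pronouncing dict."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    """Higher = easier to read; a common proxy for comprehensibility."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n = max(1, len(words))
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (n / sentences) - 84.6 * (syllables / n)

def mean_concreteness(text):
    """Average concreteness of the words covered by the norms."""
    vals = [CONCRETENESS[w] for w in re.findall(r"[a-z']+", text.lower())
            if w in CONCRETENESS]
    return sum(vals) / len(vals) if vals else None

turn = "The apple is a concrete idea to run with."
print(flesch_reading_ease(turn), mean_concreteness(turn))
```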