ParaStudent evaluates whether LLM-generated code can mimic real student progress, capturing error patterns, incremental improvements, and stylistic variations, using fine-tuning and multi-dimensional evaluation.

This paper presents ParaStudent, a system that trains large language models to write code the way real students do, including making mistakes and fixing them over time, and evaluates how closely the generated code resembles real student work.

ParaStudent: Generating and Evaluating Realistic Student Code by Teaching LLMs to Struggle

Summary

What's the problem?

What's the solution?

Why it matters?

Abstract