Personalizable Long-Context Symbolic Music Infilling with MIDI-RWKV

Christian Zhou-Zheng, Philippe Pasquier

2025-06-17

Summary

This paper presents MIDI-RWKV, a new AI model that helps musicians create music by filling in missing parts of a piece represented as symbolic musical data (MIDI). The model is based on the RWKV-7 architecture, which is efficient enough to run on edge devices. Musicians can personalize the model with very little training data, which makes it easier to work creatively by going back and forth with the AI during the music-making process.

What's the problem?

Most music generation systems either produce complete new pieces or continue existing ones, and do not let users easily edit or regenerate selected parts of a song. These systems also struggle with long and complex musical contexts, which makes it hard for musicians to use AI tools interactively or to personalize what the model generates.

What's the solution?

The solution is MIDI-RWKV, which uses a linear-time RWKV-7 model that can take long pieces of music as context and selectively fill in the sections that the musician wants to change. It also introduces a way to personalize the model’s initial state with very little data from the user, so the model can adapt quickly to the musician’s style. MIDI-RWKV runs efficiently on edge devices, making the creative process smoother and more accessible.
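
To make the idea of selective infilling concrete, here is a minimal sketch of how a piece could be rearranged into a prompt before being handed to a sequence model. It is an illustration under assumptions, not the paper's actual tokenization: the marker names `<FILL>` and `<FILL_START>`, the note tokens, and the helper functions `build_infill_prompt` and `splice_infill` are all hypothetical.

```python
from typing import List, Sequence

# Hypothetical marker tokens, chosen for illustration only.
FILL_PLACEHOLDER = "<FILL>"
FILL_START = "<FILL_START>"


def build_infill_prompt(
    bars: Sequence[Sequence[str]], start: int, end: int
) -> List[str]:
    """Flatten bars into one token list, replacing bars[start:end] with a
    single placeholder and ending with a marker that asks for the fill."""
    if not 0 <= start < end <= len(bars):
        raise ValueError("invalid bar range")
    prompt: List[str] = []
    for i, bar in enumerate(bars):
        if i == start:
            prompt.append(FILL_PLACEHOLDER)  # marks where content is missing
        if start <= i < end:
            continue  # hide the bars the musician wants regenerated
        prompt.extend(bar)
    prompt.append(FILL_START)  # the model would generate tokens after this
    return prompt


def splice_infill(
    bars: Sequence[Sequence[str]],
    start: int,
    end: int,
    generated_bars: Sequence[Sequence[str]],
) -> List[List[str]]:
    """Put generated bars back in place of the hidden range."""
    kept_before = [list(b) for b in bars[:start]]
    kept_after = [list(b) for b in bars[end:]]
    return kept_before + [list(b) for b in generated_bars] + kept_after


# Toy piece: three bars, the middle one to be regenerated.
piece = [["Bar", "C4"], ["Bar", "E4"], ["Bar", "G4"]]
prompt = build_infill_prompt(piece, 1, 2)
edited = splice_infill(piece, 1, 2, [["Bar", "D4"]])
```

In this sketch the model sees the music on both sides of the gap before it is asked to generate, which is what distinguishes infilling from plain continuation. The musician can call the two helpers repeatedly on different bar ranges to iterate on a piece.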

Why does it matter?

This matters because it makes computer-assisted music composition more practical and creative: musicians can interact with the AI on long compositions, customize it to their style, and do all of this on devices with limited computing power, such as personal laptops or tablets. This can help more artists use AI tools in their music creation without needing massive computing resources.

Abstract

MIDI-RWKV, a novel RWKV-7 based model, enables efficient and coherent musical infilling on edge devices with personalizable initial states, enhancing the computer-assisted composition process.