JEN-1 DreamStyler: Customized Musical Concept Learning via Pivotal Parameters Tuning

Boyu Chen, Peike Li, Yao Yao, Alex Wang

2024-06-19

Summary

This paper introduces JEN-1 DreamStyler, a new method for creating customized music from a short reference piece (about two minutes of music). It allows users to generate new music that reflects a specific musical concept by fine-tuning a pre-trained text-to-music model on the reference.

What's the problem?

While large models for generating music from text have improved a lot, they often struggle to create music that meets specific user needs. When users want music that captures a particular style or concept, a text prompt alone may not describe it precisely enough. Additionally, directly fine-tuning all of the model's parameters on a short reference can lead to overfitting, where the model becomes too tailored to the reference music and loses its ability to create diverse compositions.

What's the solution?

To solve these issues, the authors developed a method called Pivotal Parameters Tuning. This approach fine-tunes only the most important (pivotal) parameters of the music generation model while keeping the rest frozen, which lets the model learn a new musical concept without losing its original capabilities. The paper also addresses a conflict that can arise when incorporating multiple musical concepts at once: they introduce a concept enhancement strategy that helps the model distinguish between different concepts, so it can generate music reflecting one style or several styles together. Finally, they created a new dataset and evaluation protocol specifically for this task.
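The paper does not give pseudocode for how pivotal parameters are chosen, but the general pattern of "score each parameter's importance, then update only the top fraction" can be sketched in a few lines. The sketch below is a minimal illustration, not the authors' implementation: it assumes gradient magnitude as the importance score and a fixed selection fraction, both stand-ins for whatever criterion the paper actually uses.

```python
import numpy as np

def pivotal_mask(grads, fraction=0.25):
    """Mark the top `fraction` of parameters (by gradient magnitude,
    an assumed importance score) as pivotal."""
    scores = np.abs(grads).ravel()
    k = max(1, int(len(scores) * fraction))
    # k-th largest score becomes the selection threshold
    threshold = np.partition(scores, -k)[-k]
    return np.abs(grads) >= threshold

def masked_update(params, grads, mask, lr=0.01):
    """Gradient step applied only to pivotal parameters;
    non-pivotal parameters stay frozen."""
    return params - lr * grads * mask

# Toy example with 8 parameters
rng = np.random.default_rng(0)
params = rng.normal(size=8)
grads = rng.normal(size=8)

mask = pivotal_mask(grads, fraction=0.25)
new_params = masked_update(params, grads, mask)

print(int(mask.sum()))                                 # 2 of 8 selected
print(bool(np.allclose(new_params[~mask], params[~mask])))  # frozen ones unchanged
```

Freezing the non-pivotal parameters is what preserves the pretrained model's general generative ability; only the small pivotal subset adapts to the reference concept.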

Why it matters?

This research is important because it empowers users to create personalized music that aligns with their tastes and preferences. By improving how models can learn and generate music based on specific concepts, the JEN-1 DreamStyler could enhance various applications in music production, education, and entertainment. This advancement opens up new possibilities for musicians and creators to explore unique musical styles and ideas.

Abstract

Large models for text-to-music generation have achieved significant progress, facilitating the creation of high-quality and varied musical compositions from provided text prompts. However, input text prompts may not precisely capture user requirements, particularly when the objective is to generate music that embodies a specific concept derived from a designated reference collection. In this paper, we propose a novel method for customized text-to-music generation, which can capture the concept from a two-minute reference music and generate a new piece of music conforming to the concept. We achieve this by fine-tuning a pretrained text-to-music model using the reference music. However, directly fine-tuning all parameters leads to overfitting issues. To address this problem, we propose a Pivotal Parameters Tuning method that enables the model to assimilate the new concept while preserving its original generative capabilities. Additionally, we identify a potential concept conflict when introducing multiple concepts into the pretrained model. We present a concept enhancement strategy to distinguish multiple concepts, enabling the fine-tuned model to generate music incorporating either individual or multiple concepts simultaneously. Since we are the first to work on the customized music generation task, we also introduce a new dataset and evaluation protocol for the new task. Our proposed Jen1-DreamStyler outperforms several baselines in both qualitative and quantitative evaluations. Demos will be available at https://www.jenmusic.ai/research#DreamStyler.