Improving Inference-Time Optimisation for Vocal Effects Style Transfer with a Gaussian Prior

Chin-Yun Yu, Marco A. Martínez-Ramírez, Junghyun Koo, Wei-Hsiang Liao, Yuki Mitsufuji, György Fazekas

2025-05-19

Summary

This paper presents a way to improve vocal effects style transfer, in which a program automatically processes one vocal recording so that its effects match the sound of a reference recording.

What's the problem?

When transferring vocal effects from one recording to another, the program sometimes estimates the effect settings poorly, so the processed audio does not match the style of the reference as closely as it should.

What's the solution?

To solve this, the researchers derived a Gaussian prior from a dataset of vocal effect presets and used it to guide the program towards better effect settings during the transfer process. This helps the program make the new audio sound more like the reference style.
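As a rough illustration of how a Gaussian prior can guide an optimisation, the sketch below fits a mean and covariance to a set of preset vectors and then adds a penalty that pulls the estimated settings towards typical presets. Everything here is an assumption made for illustration: the function names, the synthetic preset data, and the toy squared-distance matching loss are hypothetical and are not taken from the paper, whose actual loss and parameterisation are not described in this summary.

```python
import numpy as np

def fit_gaussian_prior(presets, ridge=1e-3):
    """Estimate mean and precision (inverse covariance) of preset vectors.

    `presets` is a hypothetical (n_presets, n_params) array; `ridge` is a
    small diagonal term that keeps the covariance invertible.
    """
    mean = presets.mean(axis=0)
    cov = np.cov(presets, rowvar=False) + ridge * np.eye(presets.shape[1])
    return mean, np.linalg.inv(cov)

def optimise_settings(target, mean, precision, prior_weight, steps=200, lr=0.05):
    """Gradient descent on a toy matching loss plus a Gaussian prior penalty.

    Loss (illustrative only):
        ||theta - target||^2
        + 0.5 * prior_weight * (theta - mean)^T precision (theta - mean)
    A real system would compare audio features instead of raw parameters.
    """
    theta = mean.copy()  # start from the most typical preset
    for _ in range(steps):
        grad_match = 2.0 * (theta - target)
        grad_prior = prior_weight * (precision @ (theta - mean))
        theta = theta - lr * (grad_match + grad_prior)
    return theta

# Synthetic stand-in for a preset dataset: 200 presets, 3 parameters each.
rng = np.random.default_rng(0)
presets = rng.normal(loc=[0.5, -1.0, 2.0], scale=0.5, size=(200, 3))
mean, precision = fit_gaussian_prior(presets)

# A poor estimate of the reference settings, far from typical presets.
noisy_target = np.array([3.0, 3.0, 3.0])
theta_no_prior = optimise_settings(noisy_target, mean, precision, prior_weight=0.0)
theta_with_prior = optimise_settings(noisy_target, mean, precision, prior_weight=1.0)
print("without prior:", theta_no_prior)
print("with prior:   ", theta_with_prior)
```

With `prior_weight=0.0` the optimiser simply follows the matching loss, while a positive weight trades off matching against staying close to settings that resemble the presets. How strongly to weight the prior is a design choice that this sketch leaves open.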

Why does it matter?

This matters because it can help musicians, producers, and even hobbyists create more realistic and high-quality vocal mixes, making it easier to experiment with different sounds and styles in music production.

Abstract

Incorporating Gaussian prior knowledge derived from a vocal preset dataset enhances audio effects transfer by improving parameter accuracy and matching reference styles more effectively than existing methods.
