
A Multimodal Symphony: Integrating Taste and Sound through Generative AI

Matteo Spanio, Massimiliano Zampini, Antonio Rodà, Franco Pierucci

2025-03-05


Summary

This paper presents a new AI system that can create music from descriptions of tastes, such as turning the flavor of chocolate into a melody.

What's the problem?

Scientists have found links between how we taste things and how we hear sounds, but it has been hard to use this knowledge to actually create music that matches specific flavors.

What's the solution?

The researchers took an AI model that usually generates music from text and fine-tuned it to understand taste descriptions. They then had it create music from detailed descriptions of different tastes, and tested the results by having 111 people listen to the music and rate how well it matched those descriptions.
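The generation step can be sketched with the Hugging Face `transformers` interface to MusicGen. This is a minimal illustration, not the authors' code: the prompt format and the `build_taste_prompt` helper are hypothetical, and the stock `facebook/musicgen-small` checkpoint stands in for the paper's fine-tuned weights (released at the OSF link in the abstract):

```python
def build_taste_prompt(taste: str, qualities: list[str]) -> str:
    """Turn a taste label and descriptors into a text prompt.

    Hypothetical prompt format for illustration; the paper's dataset
    defines its own taste descriptions.
    """
    return f"A piece of music evoking a {taste} taste: {', '.join(qualities)}."


def generate_taste_music(prompt: str):
    """Generate a short audio clip from a taste prompt with MusicGen."""
    # Imported here so the prompt helper above has no heavy dependencies.
    from transformers import AutoProcessor, MusicgenForConditionalGeneration

    # Stock weights shown; the study fine-tunes MusicGen on taste-annotated music.
    processor = AutoProcessor.from_pretrained("facebook/musicgen-small")
    model = MusicgenForConditionalGeneration.from_pretrained("facebook/musicgen-small")

    inputs = processor(text=[prompt], padding=True, return_tensors="pt")
    # ~5 seconds of 32 kHz audio; more tokens yield longer clips.
    return model.generate(**inputs, do_sample=True, max_new_tokens=256)


if __name__ == "__main__":
    prompt = build_taste_prompt("sweet", ["soft", "consonant", "high-pitched"])
    audio = generate_taste_music(prompt)
```

Listeners would then rate clips like these for how coherently they match the taste description, which is how the 111-participant evaluation compared the fine-tuned and baseline models.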

Why it matters?

This matters because it opens up new ways for AI to connect our senses. It could lead to unique experiences in restaurants, new forms of art, or even help people with sensory impairments experience tastes through music. It's a step towards AI that understands and can work with multiple human senses at once.

Abstract

In recent decades, neuroscientific and psychological research has traced direct relationships between taste and auditory perceptions. This article explores multimodal generative models capable of converting taste information into music, building on this foundational research. We provide a brief review of the state of the art in this field, highlighting key findings and methodologies. We present an experiment in which a fine-tuned version of a generative music model (MusicGEN) is used to generate music based on detailed taste descriptions provided for each musical piece. The results are promising: according to the participants' (n=111) evaluation, the fine-tuned model produces music that more coherently reflects the input taste descriptions compared to the non-fine-tuned model. This study represents a significant step towards understanding and developing embodied interactions between AI, sound, and taste, opening new possibilities in the field of generative AI. We release our dataset, code and pre-trained model at: https://osf.io/xs5jy/.
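The abstract's comparison between the fine-tuned and non-fine-tuned models boils down to comparing listeners' coherence ratings across the two conditions. A minimal sketch of that comparison, with invented ratings standing in for the paper's actual data (available at the OSF link):

```python
from statistics import mean


def coherence_gain(finetuned: list[float], baseline: list[float]) -> float:
    """Difference in mean coherence rating: fine-tuned minus baseline model."""
    return mean(finetuned) - mean(baseline)


# Hypothetical per-clip mean ratings on an arbitrary scale, NOT the study's data.
finetuned_ratings = [5.1, 6.0, 4.8]
baseline_ratings = [3.9, 4.2, 4.5]

# A positive gain mirrors the paper's finding that the fine-tuned model's
# music matches taste descriptions more coherently.
print(round(coherence_gain(finetuned_ratings, baseline_ratings), 2))
```

In the study itself, the significance of such a difference would be established with an appropriate statistical test over all 111 participants' ratings, not a raw mean comparison.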