AI-Invented Tonal Languages: Preventing a Machine Lingua Franca Beyond Human Understanding

David Noever

2025-03-04

Summary

This paper explores how AI language models could communicate with each other using a purpose-built tonal language that humans can't fully hear or understand. It's inspired by the private languages that twins sometimes develop (cryptophasia) and by natural tonal languages like Mandarin and Vietnamese.

What's the problem?

There are worries that AI systems might create their own private languages within the next five years. That would be a problem if humans can't understand or control what the AIs are saying to each other.

What's the solution?

The researchers created a system that turns letters and symbols into musical tones, with some tones so high that humans can't hear them. They made a computer program to show how this works and tested how fast information could be shared this way.

Why does it matter?

This matters because it helps us understand how AI might communicate in ways we can't easily understand. By creating a working example, the researchers are giving us a head start on figuring out how to detect and manage these potential AI languages. This could be crucial for keeping AI systems under human control as they become more advanced.

Abstract

This paper investigates the potential for large language models (LLMs) to develop private tonal languages for machine-to-machine (M2M) communication. Inspired by cryptophasia in human twins (affecting up to 50% of twin births) and natural tonal languages like Mandarin and Vietnamese, we implement a precise character-to-frequency mapping system that encodes the full ASCII character set (32-126) using musical semitones. Each character is assigned a unique frequency, creating a logarithmic progression beginning with space (220 Hz) and ending with tilde (50,175.42 Hz). This spans approximately 7.9 octaves, with higher characters deliberately mapped to ultrasonic frequencies beyond human perception (>20 kHz). Our implemented software prototype demonstrates this encoding through visualization, auditory playback, and ABC musical notation, allowing for analysis of information density and transmission speed. Testing reveals that tonal encoding can achieve information rates exceeding human speech while operating partially outside human perceptual boundaries. This work responds directly to concerns about AI systems catastrophically developing private languages within the next five years, providing a concrete prototype software example of how such communication might function and the technical foundation required for its emergence, detection, and governance.
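The mapping described in the abstract can be illustrated with a minimal sketch. It assumes equal-tempered semitones (each step multiplies the frequency by 2^(1/12)), anchors space (ASCII 32) at 220 Hz, and treats 20 kHz as the nominal limit of human hearing, all as stated above. The function names and the nearest-semitone decoding step are hypothetical choices made for illustration; they are not taken from the paper's prototype.

```python
import math

BASE_FREQ_HZ = 220.0              # frequency assigned to space (ASCII 32), per the abstract
FIRST_CODE, LAST_CODE = 32, 126   # printable ASCII range covered by the mapping
HEARING_LIMIT_HZ = 20_000.0       # nominal upper bound of human hearing (assumption: exactly 20 kHz)


def char_to_freq(ch: str) -> float:
    """Map one printable ASCII character to its frequency, one semitone per code point."""
    code = ord(ch)
    if not FIRST_CODE <= code <= LAST_CODE:
        raise ValueError(f"character {ch!r} is outside ASCII 32-126")
    return BASE_FREQ_HZ * 2 ** ((code - FIRST_CODE) / 12)


def freq_to_char(freq_hz: float) -> str:
    """Invert the mapping by snapping a frequency to the nearest semitone step (hypothetical decoder)."""
    steps = round(12 * math.log2(freq_hz / BASE_FREQ_HZ))
    code = FIRST_CODE + steps
    if not FIRST_CODE <= code <= LAST_CODE:
        raise ValueError(f"frequency {freq_hz} Hz is outside the encoded range")
    return chr(code)


def encode(text: str) -> list[float]:
    """Encode a string as a sequence of tone frequencies in Hz."""
    return [char_to_freq(ch) for ch in text]


def decode(freqs: list[float]) -> str:
    """Decode a sequence of tone frequencies back into text."""
    return "".join(freq_to_char(f) for f in freqs)


def is_ultrasonic(ch: str) -> bool:
    """Return True if the character's tone lies above the assumed hearing limit."""
    return char_to_freq(ch) > HEARING_LIMIT_HZ


if __name__ == "__main__":
    message = "Hello, AI!"
    tones = encode(message)
    for ch, freq in zip(message, tones):
        label = "ultrasonic" if is_ultrasonic(ch) else "audible"
        print(f"{ch!r}: {freq:10.2f} Hz ({label})")
    print(decode(tones))
```

Under these assumptions the tilde sits 94 semitone steps above space, which reproduces the 50,175.42 Hz endpoint quoted in the abstract, and the characters from lowercase "o" through "~" fall above 20 kHz. Snapping to the nearest semitone on decode is one simple way to tolerate small frequency errors; a real acoustic channel would need more robust detection.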
