New research applying artificial intelligence to the study of sperm whale vocalizations has identified what appears to be a combinatorial structure in their communication, with elements that function roughly like human vowels. The work, highlighted by researcher Ethan Mollick, represents a continued push to apply machine learning techniques to decode non-human communication systems.
What the Research Found
The core finding, based on analysis of the extensive "Dominica Sperm Whale Project" dataset, is that sperm whale codas—the patterned clicks they use to communicate—contain identifiable, reusable components. Researchers describe these components as functioning analogously to vowels in human language: discrete units that can be combined in different sequences to alter meaning. This combinatorial property is a foundational feature of human language and a significant indicator of a complex communication system.
The AI Methodology
While specific architectural details from the latest work are not public, the field typically employs self-supervised deep learning models, such as transformers or convolutional neural networks, trained on vast audio datasets. These models learn to identify patterns, clusters, and structures within the acoustic data without human-labeled categories. The goal is to discover the underlying "phonetic" and syntactic rules of whale codas by finding predictable patterns in sequences of clicks, inter-click intervals, and rhythms.
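As a toy illustration of the unsupervised side of such a pipeline, the sketch below clusters synthetic inter-click intervals into discrete timing classes with a tiny 1-D k-means. Everything here is invented for illustration — the interval values, the cluster count, and the data are assumptions, not details from the actual research.

```python
import random

def kmeans_1d(values, k, iters=50):
    """Tiny Lloyd's k-means on scalars: group inter-click intervals
    into k timing classes without any human labels."""
    vs = sorted(values)
    # Deterministic spread initialization: evenly spaced quantiles.
    centroids = [vs[int(i * (len(vs) - 1) / (k - 1))] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Recompute each centroid as its cluster mean (keep old if empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Synthetic inter-click intervals (seconds) drawn from three hidden rhythm types.
rng = random.Random(1)
icis = ([rng.gauss(0.10, 0.01) for _ in range(200)] +
        [rng.gauss(0.25, 0.02) for _ in range(200)] +
        [rng.gauss(0.50, 0.03) for _ in range(200)])

centroids = kmeans_1d(icis, k=3)
print(centroids)  # roughly [0.10, 0.25, 0.50]
```

Real systems operate on spectrograms with deep networks rather than scalar intervals, but the principle is the same: the model recovers discrete categories from raw acoustic statistics alone.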
This approach builds on prior work like Project CETI (Cetacean Translation Initiative), which uses natural language processing techniques to map sperm whale communication. The discovery of vowel-like elements suggests the codas are not monolithic signals but are built from smaller, recombinable parts.
Context and Implications
Decoding animal communication, particularly in highly social and intelligent species like sperm whales, is a long-standing scientific challenge. Evidence of combinatoriality—using a finite set of elements to create a large set of meaningful expressions—would place whale communication closer to human language in complexity than previously confirmed. This research does not claim to have translated whale "language" but has identified a crucial structural feature that must exist for a translatable language to be possible.
Successful decoding could transform fields like ethology and conservation, providing deeper insight into whale society, culture, and decision-making. It also serves as a stress test for AI's ability to find structure in complex, non-human data where ground truth is unknown.
gentic.news Analysis
This update fits squarely within the accelerating trend of applying large-scale AI models to fundamental scientific questions. As we covered in our analysis of Google DeepMind's AlphaFold 3, the pattern is clear: self-supervised learning on massive, unlabeled datasets is becoming a primary tool for discovery in domains from protein folding to animal communication. The whale research leverages the same core paradigm—using AI to detect patterns invisible to human analysts.
The work is almost certainly linked to the ongoing Project CETI, a multidisciplinary initiative we reported on in 2024, which aims to apply advanced machine translation models to sperm whale codas. CETI's team, involving AI researchers from MIT and Harvard, has been collecting one of the largest bioacoustic datasets in the world. This new finding of combinatorial "vowels" likely represents a mid-stream breakthrough from that or a similar consortium, validating their data-driven approach. It suggests the roadmap—record, process with AI, search for linguistic primitives—is yielding results.
For AI practitioners, this is a notable example of the field expanding beyond text, images, and code into entirely novel modalities. The techniques being refined here—unsupervised discovery of semantic units in sequential data—could have downstream applications in other areas, such as analyzing network traffic logs, financial time series, or any complex system where the "language" is unknown. The key takeaway is methodological: when you lack labels, train a model to find the grammar itself.
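A minimal illustration of that takeaway: spotting a reusable unit in an unlabeled symbol stream purely from its recurrence statistics. The stream and the hidden unit here are invented for the example.

```python
from collections import Counter

# Unlabeled symbol stream with a hidden recurring "unit" ("xyz").
stream = "abxyzcdxyzefxyzgh"

def top_ngrams(s, n, k=3):
    """Count all length-n substrings; ones that recur far more often
    than chance hint at reusable building blocks."""
    counts = Counter(s[i:i + n] for i in range(len(s) - n + 1))
    return counts.most_common(k)

print(top_ngrams(stream, 3))  # "xyz" tops the list with count 3
```

Modern systems replace raw n-gram counts with learned sequence models, but the logic is identical: frequency and predictability expose structure that no label ever named.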
Frequently Asked Questions
What are sperm whale codas?
Sperm whale codas are short, patterned series of clicks used for communication. They are distinct from the regular, evenly spaced clicks whales produce for echolocation. Different coda patterns (e.g., "1+1+3," "5R") have been observed in different social contexts, suggesting they carry specific meanings.
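For readers unfamiliar with the notation, a "1+1+3" coda is five clicks delivered in rhythmic groups of one, one, and three. A trivial, hypothetical helper makes the convention concrete:

```python
def clicks_in_coda(pattern: str) -> int:
    """Total clicks in a grouped coda pattern such as '1+1+3':
    one click, a pause, one click, a pause, then three rapid clicks."""
    return sum(int(group) for group in pattern.split("+"))

print(clicks_in_coda("1+1+3"))  # 5
```

(The "5R" style — five regularly spaced clicks — uses a different convention and is not handled by this sketch.)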
Has AI translated whale language?
No. The research marks a significant step forward by identifying a structural building block (combinatorial, vowel-like elements) within whale codas. Translation—assigning human-interpretable meanings to specific coda sequences—remains a distant and much more complex goal.
Why is combinatoriality important?
Combinatoriality, or duality of patterning, is a core design feature of human language. It allows a small set of meaningless sounds (phonemes) to be combined into a vast set of meaningful words. Finding evidence of this in another species suggests their communication system may have a similar capacity for open-ended expression, rather than being a fixed set of holistic signals.
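The expressive payoff is easy to quantify. Assuming a hypothetical inventory of just three discrete units, the number of distinct short sequences grows geometrically with length:

```python
from itertools import product

units = ["a", "i", "u"]  # hypothetical discrete vowel-like elements
max_len = 4

# Every ordered sequence of length 1..4 built from the same three units.
sequences = [s for n in range(1, max_len + 1)
             for s in product(units, repeat=n)]
print(len(sequences))  # 3 + 9 + 27 + 81 = 120
```

Three meaningless units already yield 120 distinct sequences of length four or less — which is why combinatorial structure is such a strong hint of open-ended expressive capacity.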
What AI models are used for this research?
While not specified in this brief update, related projects like Project CETI typically use deep learning architectures suited for sequence data, such as transformers (the backbone of large language models) or convolutional neural networks, trained in a self-supervised manner on audio spectrograms to discover patterns and clusters.