Guest Blog Authored by Sho Araiba, Ph.D., BCBA-D

Sho Araiba, Ph.D., BCBA-D, is a behavior analyst and a filmmaker. He earned his doctorate in psychology (behavior analysis) from the City University of New York. His research interests encompass a wide range of philosophical, basic, and applied topics within behavior analysis. With 15+ years of experience as an ABA therapist, he has served neurodiverse people both in the U.S. and Japan. He also teaches at Leeward Community College, the University of Hawaii. He is also known as Dr. Sho, the creator and host of the Dr. Sho Show on YouTube: https://www.youtube.com/c/DrShoShow.
Correspondence concerning this article should be addressed to Sho Araiba. Email: info@araibabehavioranalysis.net.
New Discoveries about Animal Language in Ethology
Skinner (1986) once assumed that speaker behavior is uniquely human; however, as Charles Catania often says, Skinner would be happy to change his view given enough evidence. And now, after many years of work by ethologists and behavior analysts, we have ample evidence to convince Skinner (and ourselves) that animals use complex verbal communication systems that qualify as language. In this blog, we will venture into a wonderful world of animal verbal behavior and its evolutionary insight into human language. I will then connect this discovery to behavior analysis (both experimental and applied). This blog is based on my recent article, A Search for Language in Birds in the Lab and the Wild, published in the Journal of the Experimental Analysis of Behavior (Araiba, 2025).
Across the animal kingdom, ethologists have been busy discovering many complex communication systems. Honeybees perform the famous waggle dance, in which forager bees communicate the direction and distance of food sources to nestmates via precise motion patterns (e.g., Ai et al., 2019). Sperm whales use “codas” (click sequences that vary in rhythm and tempo) that may function somewhat like a phonetic alphabet, with recognizable “clan-dialects,” to identify individuals and to form their clans (e.g., Sharma et al., 2024). Meerkats use different alarm calls depending on predator type, urgency, and context (e.g., Manser, 2001). Ring-tailed lemurs have distinct alarm calls and vocal repertoires, and use scent marking and postures for communication, in ways that are highly adapted to social structure, predator detection, and territorial behavior (e.g., Bolt et al., 2015). Elephants communicate via low-frequency rumbling, which can travel long distances, combined with visual displays and tactile signals (e.g., Eleuteri et al., 2024). These are just a few examples. What we see in the animal kingdom is that many species have multiple signals or calls for different functions (alarm, recruitment, mating, territory, social hierarchy, etc.), and that these signals or calls can be conceived as “words.” Animals also have “listener responses” that are reliably differentiated by call type. Thus, there is incredible variation in animal verbal behavior, which was once thought impossible in non-human animals.

Honeybee: Bob Peterson from North Palm Beach, Florida, Planet Earth!, CC BY-SA 2.0 <https://creativecommons.org/licenses/by-sa/2.0>, via Wikimedia Commons
Sperm Whale: Gabriel Barathieu, CC BY-SA 2.0 <https://creativecommons.org/licenses/by-sa/2.0>, via Wikimedia Commons
Meerkat: Charles J. Sharp, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons
Ring-tail Lemurs: Musicaline, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons
Yet it is rare in non-human species to find evidence that two or more signals/calls are combined in a sequence (to make a sentence), that such a sequence has a pattern (grammar), and that the order in which the signals are sequenced changes the meaning of the sentence (composition). To qualify as “true language,” as linguists call it, we need evidence of syntax (grammar, sentence) that has compositionality (a principle that the order of words determines the meaning of the sentence). This was once believed to be a uniquely human capability.
Chimpanzees’ Two-Word Sentence
Recent research shows that chimpanzees use structured two-call combinations that resemble early forms of compositional syntax. Leroux et al. (2023) found that chimps encountering snakes combined an “alarm-huu” call (a word to express surprise) and a “waa-bark” call (a word to recruit other members for a hunt or aggression). Two different words are put together to express something different: “Come help me! There is a snake!” Although research with chimps is promising, we need more definitive evidence of compositional syntax.
Japanese Tits’ Compositional Syntax
Suzuki and his team (2016, 2017) studied Japanese tits, a small songbird species known for its sophisticated vocal communication. Japanese tits use different calls to change their flock members’ listener behavior. For example, they use alarm calls to warn of the presence of different predators such as crows, snakes, and martens. The flock members’ listener behavior toward each alarm call is also different. To a call that indicates a crow, they scan the sky. To a call that indicates a snake, they search the ground. They also use various social calls for food, mating, and parenting. For example, a recruitment call is used to gather flock members to the speaker.

Photo by Laitche, CC BY-SA 4.0 <https://commons.wikimedia.org/w/index.php?curid=53756352>, via Wikimedia Commons
What is even more fascinating is that, just like chimps, these birds not only produce single calls but also put two calls together (Suzuki et al., 2016, 2017). Specifically, they combine an alarm call and a recruitment call into a sequence: the alarm-recruitment call. On hearing the alarm-recruitment call, the tits scan the sky (for a crow) and gather together to prepare for mobbing (anti-predator aggression).
Is this compositional syntax? Suzuki and his team (2016, 2017) conducted a quasi-experiment to investigate whether this two-call sequence is actually a sentence or just two random calls. They recorded Japanese tits’ alarm calls and recruitment calls and artificially combined them in two orders: a) the alarm-recruitment call sequence and b) the recruitment-alarm call sequence. They then went into a forest where Japanese tits lived and played these recorded calls. When Japanese tits heard the alarm-recruitment sequence, they scanned the sky for predators and then approached their companions. But when the experimenters played the recruitment-alarm sequence, the birds did not respond. This asymmetry in listener behavior led Suzuki and colleagues to argue that Japanese tits have compositional syntax. That is, the meaning of the call depends on the order of its components. They “understand” the alarm-recruitment call but not the recruitment-alarm call (in English, a “red car” makes sense but a “car red” does not). Therefore, there is a grammatical rule in Japanese tits’ two-call combinations. This was one of the first demonstrations of compositional syntax in non-human animals.
When Japanese Tits Meet Behavior Analysis
Some behavior analysts would read the results of Japanese tits’ listener behavior and recall a line of behavior analytic research by Urcuioli and his colleagues. They have devoted years to understanding how non-human animals behave toward higher-order stimulus relations such as stimulus equivalence (e.g., Lionello-DeNolf & Urcuioli, 2002). In a typical experiment, pigeons learn to peck a key in response to specific visual stimuli, such as colors or shapes. After pigeons are trained to match one stimulus to another (say, a red key as a sample stimulus and a triangle key as a comparison) in a match-to-sample procedure, researchers test for symmetry: that is, whether the pigeon will match a triangle as a sample to a red key as a comparison without direct training. This task, seemingly easy to human eyes, turned out to be very difficult not only for Urcuioli’s pigeons but also for many other non-human animals such as rats, monkeys, chimpanzees, and baboons (Bruce et al., 2022; D’Amato et al., 1985; Dugdale & Lowe, 2000; Medam et al., 2016). They all failed the symmetry test.
Urcuioli saw this “failure” of symmetry performance not as an error but rather as a characteristic responding pattern in non-human animals. In his theory of pigeons’ equivalence-class formation, Urcuioli (2008) postulated that for pigeons, a red key as a sample stimulus and a red key as a comparison stimulus are two completely different stimuli because they differ in stimulus properties such as position and temporal order. Because these two red keys are two different stimuli, pigeons that were trained to respond to the sequence “red key sample, then triangle comparison” would not respond to the sequence “triangle sample, then red key comparison” in testing. The latter sequence is a completely new set of stimuli that the pigeons have never seen before. Based on this insight, Urcuioli and his colleagues (Urcuioli, 2008, 2011; Urcuioli & Swisher, 2015; see also Frank & Wasserman, 2005) painstakingly controlled all the properties of a functional stimulus in their experiments so that even when these stimuli appear as sample and comparison, they are treated as “the same.” Eventually, they demonstrated symmetry and other relations (identity and reflexivity) in pigeons in what was the first successful demonstration of stimulus equivalence in non-human animals (see Araiba, 2025, for details).
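For readers who like to see an idea spelled out, here is a minimal Python sketch of the point above. It is only an illustration under a simplifying assumption of mine, not Urcuioli’s actual model: each functional stimulus is encoded as a pair of its nominal label (what the experimenter calls it) and its role in the trial. The names `FunctionalStimulus`, `trial`, and `nominal_labels` are hypothetical and chosen for this example.

```python
from typing import NamedTuple


class FunctionalStimulus(NamedTuple):
    nominal: str   # the experimenter's label, e.g. "red" or "triangle"
    position: str  # role within the trial: "sample" or "comparison"


def trial(sample: str, comparison: str) -> frozenset:
    """Return the set of functional stimuli presented in one matching trial."""
    return frozenset({
        FunctionalStimulus(sample, "sample"),
        FunctionalStimulus(comparison, "comparison"),
    })


def nominal_labels(stimuli: frozenset) -> set:
    """Strip the position, leaving only what the experimenter sees."""
    return {s.nominal for s in stimuli}


trained = trial("red", "triangle")        # baseline training
symmetry_test = trial("triangle", "red")  # reversed roles, never trained

# To the experimenter, both trials contain the same two stimuli.
same_for_experimenter = nominal_labels(trained) == nominal_labels(symmetry_test)

# Under the assumed encoding, the two trials share no functional stimulus.
shared_for_pigeon = trained & symmetry_test

print(same_for_experimenter)   # True
print(len(shared_for_pigeon))  # 0
```

Under this assumed encoding, the symmetry test has nothing in common with the trained trial, which is one way to picture why the test sequence is “completely new” to the pigeon even though it looks like a simple reversal to us.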
Evidence of Syntax or Failure of Symmetry —or Both?
Let’s go back to Japanese tits. Tits responded to the alarm–recruitment sequence, but not to the recruitment–alarm combination. Suzuki et al. (2016, 2017) saw these differential responses as evidence of compositional syntax. But in the light of Urcuioli’s theory, this can be interpreted as a failure of symmetry performance. Maybe the recruitment call in the alarm-recruitment sequence is a completely different stimulus from that in the recruitment-alarm sequence because these two recruitment calls occupy different temporal positions (one comes last in the sequence and the other comes first). If so, what should we make of the Japanese tits’ differential responding to those two call sequences?
One way to see it is that the failure of symmetry performance and compositional syntax are two sides of the same coin. That is, the fact that it is very difficult for birds to perceive two recruitment calls in different positions as the same might have given rise to the birds’ unique “grammar.” In other words, Japanese tits might be communicating verbally based on this “limitation” in sequencing calls.
Parallels exist with human linguistic cognition. Linguists like Deutscher (2005) argue that human syntax also arises from cognitive constraints. For example, the proximity principle states that humans can only make sense of a sentence when adjectives are close to the noun they describe. The sentence “an old wooden chair broke” can hardly be rearranged as “a chair broke old wooden” because humans cannot perceive adjectives such as “old” and “wooden” as a part of “chair” when they are placed far from it. Likewise, the Japanese tits’ call sequence might reflect a species-specific pattern of responding to sequential stimuli.
A Possibility of Language Evolution Across Species
With fascinating new discoveries in animal communication, we can now view language as something that evolved gradually across species rather than appearing suddenly in humans. Suzuki and his colleagues (2019) proposed a model describing this evolution. In Stage 1, animals use single signals (“A” or “B”). In Stage 2, some species begin combining signals into short sequences (AB or BA), though order doesn’t yet matter. In Stage 3, AB might fuse into a new sound (as in bird songs) or differ in meaning from BA, marking the emergence of compositional syntax, as seen in the Japanese tit’s alarm–recruitment calls. Finally, in Stage 4, human language appears. But Urcuioli’s theory adds an intriguing twist by emphasizing the listener’s role in this evolution. Suzuki’s framework focuses on how speakers combine calls, but grammar may depend just as much on how listeners respond. If listeners fail to react meaningfully to certain combinations, those “grammatical rules” would go extinct, while effective ones persist and shape communication over generations. In this way, grammar could have co-evolved through speaker–listener dynamics, a collaborative process that not only shaped human language but continues to echo across the animal kingdom.
A Word of Caution
Of course, we should be careful when comparing pigeons and Japanese tits. These species are very different, and even animals within the same group don’t always follow the same behavioral rules. Urcuioli’s pigeons were trained in strict lab settings, while Suzuki’s tits were studied in the wild. The tits show a range of complex behaviors in response to calls—like scanning or approaching—whereas pigeons mostly respond by pecking a key. Their vocal abilities also differ: pigeons don’t naturally combine sounds, but songbirds like the tits do. So, the comparisons I’m making in this blog are quite speculative. Still, these studies have opened an exciting new window for behavior analysts to explore animal communication. There’s so much more to discover about how animals use and understand signals—it’s an incredible time to study animal language!
Implications for Human Verbal Behavior Research
Non-human animals’ use of two-call combinations can also inspire human verbal behavior research. Palmer (2007, 2023) argues that grammar and syntax are “structural phenomena” and urges behavior analysts to investigate them. The simplest structural unit is the two-word combination. Even with just a combination of two words, the word function changes drastically (e.g., “blue light” vs. “light blue”). Despite decades of verbal behavior research, this structural stimulus control (compositionality) of verbal behavior has rarely been studied directly. In Applied Behavior Analysis, practitioners using the VB-MAPP (Sundberg, 2008) are quite familiar with programs such as teaching two-word combinations in mands or tacts, but we have virtually no experimental investigations of how to establish such a two-word combination repertoire in children. Experimenters can explore how reinforcement histories shape structural stimulus control, the behavioral foundation of grammar.
Conclusion
By now, I hope you (and Skinner) are convinced that many non-human animals engage in complex verbal behavior that we can call language. Both ethologists and behavior analysts can contribute to the progress in this endeavor together. Whether in the laboratory or the forest canopy, both lines of research illuminate how animals respond to complex stimulus sequences and how such behaviors might underlie the emergence of language itself. These studies remind us that communication—human or otherwise—arises not from words alone but from the behavioral contingencies that shape how we respond to them. Next time you go outside, listen to the birds — they may be chatting with one another.
References
Ai, H., Okada, R., Sakura, M., Wachtler, T., & Ikeno, H. (2019). Neuroethology of the waggle dance: how followers interact with the waggle dancer and detect spatial information. Insects, 10(10), 336. https://doi.org/10.3390/insects10100336
Araiba, S. (2025). A Search for Language in Birds in the Lab and the Wild. Journal of the Experimental Analysis of Behavior. Advance online publication. https://doi.org/10.1002/jeab.70063
Bolt, L. M., Sauther, M. L., Cuozzo, F. P., & Youssouf Jacky, I. A. (2015). Antipredator vocalization usage in the male ring-tailed lemur (Lemur catta). Folia Primatologica, 86(1–2), 124–133. https://doi.org/10.1159/000369064
Bruce, K., Dyer, K., Phasukkan, T., & Galizio, M. (2022). Two directions in a search for symmetry in rats. The Psychological Record, 72(3), 465–480. https://doi.org/10.1007/s40732-021-00490-x
D’Amato, M. R., Salmon, D. P., Loukas, E., & Tomie, A. (1985). Symmetry and transitivity of conditional relations in monkeys (Cebus apella) and pigeons (Columba livia). Journal of the Experimental Analysis of Behavior, 44(1), 35–47. https://doi.org/10.1901/jeab.1985.44-35
Deutscher, G. (2005). The Unfolding of Language: An evolutionary tour of mankind’s greatest invention. Macmillan.
Dugdale, N., & Lowe, C. F. (2000). Testing for symmetry in the conditional discriminations of language-trained chimpanzees. Journal of the Experimental Analysis of Behavior, 73(1), 5–22. https://doi.org/10.1901/jeab.2000.73-5
Eleuteri, V., Bates, L., Rendle-Worthington, J., Hobaiter, C., & Stoeger, A. (2024). Multimodal communication and audience directedness in the greeting behaviour of semi-captive African savannah elephants. Communications Biology, 7(1), 472. https://doi.org/10.1038/s42003-024-06133-5
Frank, A., & Wasserman, E. (2005). Associative symmetry in the pigeon after successive matching-to-sample training. Journal of the Experimental Analysis of Behavior, 84, 147–165. https://doi.org/10.1901/jeab.2005.115-04
Leroux, M., Schel, A. M., Wilke, C., Chandia, B., Zuberbühler, K., Slocombe, K. E., & Townsend, S. W. (2023). Call combinations and compositional processing in wild chimpanzees. Nature Communications, 14(1), 2225. https://doi.org/10.1038/s41467-023-37816-y
Lionello‐DeNolf, K. M., & Urcuioli, P. J. (2002). Stimulus control topographies and tests of symmetry in pigeons. Journal of the Experimental Analysis of Behavior, 78(3), 467–495. https://doi.org/10.1901/jeab.2002.78-467
Manser, M. B. (2001). The acoustic structure of suricates’ alarm calls varies with predator type and the level of response urgency. Proceedings of the Royal Society of London. Series B: Biological Sciences, 268(1483), 2315–2324. https://doi.org/10.1098/rspb.2001.1773
Medam, T., Marzouki, Y., Montant, M., & Fagot, J. (2016). Categorization does not promote symmetry in Guinea baboons (Papio papio). Animal Cognition, 19(5), 987–998. https://doi.org/10.1007/s10071-016-1003-4
Palmer, D. C. (2007). Verbal behavior: What is the function of structure? European Journal of Behavior Analysis, 8(2), 161–175. https://doi.org/10.1080/15021149.2007.11434280
Palmer, D. C. (2023). Toward a behavioral interpretation of English grammar. Perspectives on Behavior Science, 46(3), 521–538. https://doi.org/10.1007/s40614-023-00368-z
Sharma, P., Gero, S., Payne, R., Gruber, D. F., Rus, D., Torralba, A., & Andreas, J. (2024). Contextual and combinatorial structure in sperm whale vocalisations. Nature Communications, 15(1), 3617. https://doi.org/10.1038/s41467-024-47221-8
Sundberg, M. L. (2008). VB-MAPP Verbal Behavior Milestones Assessment and Placement Program: a language and social skills assessment program for children with autism or other developmental disabilities: guide. Mark Sundberg.
Suzuki, T. N., Griesser, M., & Wheatcroft, D. (2019). Syntactic rules in avian vocal sequences as a window into the evolution of compositionality. Animal Behaviour, 151, 267–274. https://doi.org/10.1016/j.anbehav.2019.01.009
Suzuki, T. N., Wheatcroft, D., & Griesser, M. (2016). Experimental evidence for compositional syntax in bird calls. Nature Communications, 7(1), 10986. https://doi.org/10.1038/ncomms10986
Suzuki, T. N., Wheatcroft, D., & Griesser, M. (2017). Wild birds use an ordering rule to decode novel call sequences. Current Biology, 27(15), 2331–2336. https://doi.org/10.1016/j.cub.2017.06.031
Urcuioli, P. J. (2008). Associative symmetry, antisymmetry, and a theory of pigeons’ equivalence-class formation. Journal of the Experimental Analysis of Behavior, 90(3), 257–282. https://doi.org/10.1901/jeab.2008.90-257
Urcuioli, P. J. (2011). Emergent identity matching after successive matching training, I: Reflexivity or generalized identity? Journal of the Experimental Analysis of Behavior, 96(3), 329–341. https://doi.org/10.1901/jeab.2011.96-329
Urcuioli, P. J., & Swisher, M. J. (2015). Transitive and anti-transitive emergent relations in pigeons: Support for a theory of stimulus-class formation. Behavioural Processes, 112, 49–60. https://doi.org/10.1016/j.beproc.2014.07.006
