Functional Convergence in Bat and Toothed Whale Biosonars

P. T. Madsen, A. Surlykke

Abstract

Echolocating bats and toothed whales hunt and navigate by emission of sound pulses and analysis of returning echoes to form a self-generated auditory scene. Here, we demonstrate a striking functional convergence in the way these two groups of mammals independently evolved the capability to sense with sound in air and water.

Echolocation: Perception Through Active Sensing

Echolocating bats and toothed whales emit sound pulses and listen for returning echoes to form an actively generated auditory scene for navigation and foraging. The independent evolution of echolocation has, along with the capabilities to fly or perform long breath-hold dives, allowed bats and toothed whales to exploit dark foraging niches with little competition from predators that rely on vision. The evolution of biosonars allowed a successful speciation to ∼1,100 species of bats and some 80 species of toothed whales living in a large range of different habitats and accounting for ∼25% of all known extant mammalian species.

Echolocation is an active sense where the sensing animal must produce the energy that eventually carries information (30) about the surrounding environment back to its auditory system via the delay, the amplitude, and the spectral and temporal properties of the returning echo (FIGURE 1A). This implies that information is primarily gathered when the echolocating animal phonates and that the properties and emission rates of the generated sounds will dominate the information flow available for processing and perceptual organization of the animal's Umwelt (26). Using echolocation to find, select, and capture prey in darkness involves close coordination of vocal and motor outputs with sensory inputs via a vocal-motor feedback loop that informs and times body movements and the properties and rates of the emitted sonar pulses. Such adjustments are part of an acoustic gaze control, wherein the hearing (e.g., Refs. 31, 38) and the call rates, levels, frequencies (3, 12), and beam widths (14, 28) of the emitted sound pulses are all manipulated to focus dynamically on targets of interest (27, 42, 45).

FIGURE 1.

The sonar equation

A: schematic outline of the sonar equation. The echolocating animal derives the target range from the delay between the outgoing pulse and the returning echo. The echolocating animal emits a sound pulse with a source level (SL) that suffers from a transmission loss (TL) to the target. Depending on the target strength (TS) of the prey, a proportion of the received sound level will be reflected back toward the echolocating animal as an echo that again suffers from a TL to form the received echo level (EL). The spectral and temporal properties of the echo will thus be affected by the TL and the properties and geometry of the ensonified target. B: sound radiation from a Daubenton's bat (Myotis daubentonii) producing a sound pulse with a directionality index (DI) of 16 dB (half-power beam width of 29°) along with the waveform (SL of 138 dB re 20 μPa, pp), spectrogram, and energy accumulation of the emitted call (42). C: sound radiation from a bottlenose dolphin (Tursiops truncatus) with a DI of 26 dB (half-power beam width of 9°) along with the waveform (SL of 226 dB re 1 μPa, pp), spectrogram, and energy accumulation of the emitted click (44).

These requirements put an evolutionary premium not only on an acute auditory system but also on the properties and the control of the emitted sounds to allow the echolocating animal to find, track, and intercept small prey moving in a dark, three-dimensional space. For example, echolocating bats, flying at 3–5 m/s, have <1 s between detecting and capturing a prey item, and echolocating dolphins may successfully echolocate for fish buried 30 cm into the seabed. These remarkable sonar performances surpass those of man-made sonars at short ranges (3), making biosonar systems intriguing from both biomimetic and basic science points of view. Accordingly, since the discovery of echolocation in bats in the late 1930s (10) and in toothed whales in the late 1950s (19, 32), a large range of studies have been undertaken to understand the operation of biosonar systems in both air and water.

These two media have very different physical properties with significant implications for the production, propagation, and reflection of sound, and the means and speeds with which echolocating predators and their prey can move and maneuver. The completely independent evolution of biosonar systems of very different-sized mammals in two such different media would at first glance likely be predicted to entail very different properties, frequency ranges, and sampling rates. However, in this review, we use recent field data to show that bats and toothed whales have evolved remarkably similar acoustic means to locate and catch prey with ultrasound, offering a fascinating area of research in comparative sensory physiology.

As with any research on complex systems, most biosonar studies have taken a reductionist approach by studying the performance and function of selected aspects of the echolocation process. This dedicated effort has led to a firm understanding of how echolocating animals produce, receive, and process ultrasound, but laboratory studies cannot provide the full picture of how free-ranging animals use echolocation in the habitats for which their sonar systems evolved. A natural environment will often comprise an extremely complex soundscape compared with the controlled, simple surroundings of a laboratory where specific tasks, such as ranging, detection, and discrimination in noise and clutter, are often studied one at a time using a stationed animal. Such experimental conditions are in stark contrast to the situation faced by moving, echolocating animals in the wild, where the properties and behavior of live echoic targets vary in time and space with their returning echoes buried in clutter, background noise, or even sounds from other echolocators. Furthermore, psychophysical laboratory studies are designed so that the experimental animal is encouraged via food rewards to optimize its pay-off matrix by focusing exclusively on solving the defined task at hand while ignoring other factors that it would normally have to negotiate in the wild. For example: Do bats flying at 3 m/s in dense clutter toward a prey that they have <500 ms to detect, track, and capture employ the same discrimination and ranging resolution as stationed bats with much more time on a platform in an anechoic laboratory chamber? Will a breath-holding toothed whale at a depth of 1,000 m dedicate the same number of clicks to a moving elusive target in the deep-scattering layer as when stationed at 1-m depth in a fixed target detection experiment?

Such questions about ecological validity should, in our view, concern all comparative physiologists and drive formation of hypotheses and technological developments to form a strong and informed synergy between laboratory and field studies. That need was recognized early on by Don Griffin, the discoverer of echolocation, who championed the combination of laboratory and field studies to understand how bats use biosonars to capture prey (10). In this review, we seek to address how recent developments of tagging and recording technology have enabled increasingly detailed studies of echolocation in the field to reveal a dynamic use of acoustic gaze in the active process of sensing with ultrasound. We collate recent laboratory and field studies with earlier data to show a remarkable functional convergence in the way biosonars are operated in bats and toothed whales despite very different evolutionary starting points in air and water.

What Does It Take to Echolocate?

Echolocating animals all go through the three phases of echo-guided search, approach, and capture as defined by Griffin (10). These phases pose different challenges: in the search phase, the animal must detect and classify enough potential prey items; after detection, it must track moving prey and precisely time its vocal-motor feedback during close-up capture attempts. To solve these tasks, echolocating animals use high-powered, ultrasonic sound pulses to create a sufficiently large sensory volume to forage efficiently. Despite being about as different as mammals can be in size and general morphology, and despite living in the two very different media of air and water, echolocating bats and toothed whales use surprisingly similar call frequencies to search for, approach, and capture prey (FIGURE 1, B AND C). This striking evolutionary convergence is based in part on fundamental shared spectral and temporal features of the mammalian auditory system but also on the often opposing physical constraints involved in production, propagation, and reflection of sound in air and water that set the evolutionary stage for efficient biosonar operation.

Call Frequencies and Directionality

A functional biosonar system for locating and discriminating small prey items requires high frequencies to provide directional sound beams (FIGURE 1, B AND C). Directionality provides increased source levels for the same power and reduced clutter levels. High frequencies are also needed to provide geometric backscatter and form a spectral basis for target discrimination. Bats are several orders of magnitude smaller than toothed whales, and the difference in size of their prey is equally large. Accordingly, larger echolocating animals targeting large prey can use lower frequencies to achieve the same directionality, relative spectral resolution, and geometric backscatter off their prey as small echolocating animals that must use higher frequencies to detect smaller prey. Such scaling indeed seems to be supported, at least in general, for both bats and toothed whales, where the biggest echolocators in both media operate at ∼15 kHz and the smallest beyond 130 kHz, suggesting that directionality is likely a major driving force for call frequencies in both bats [directionality indexes (DIs) of ∼10–16 dB] and toothed whales (DIs of ∼24–32 dB) (15, 20). The substantial size differences between bats and toothed whales and their respective prey predict that bats should echolocate at much higher frequencies than toothed whales to achieve the same directionality and relation between object size and wavelength for backscatter. However, such an effect is in part offset by the almost five times slower sound speed, and hence almost five times shorter wavelengths, in air compared with water (FIGURE 2, A AND B). Nevertheless, bats should operate their sonar at frequencies well over twice as high as those of toothed whales to achieve the same directionalities because their transmitting apertures are much more than 10 times smaller than those of toothed whales, but the severe atmospheric absorption of ultrasound in air compared with water (FIGURE 2, A AND B) strongly reduces the functional value of such high frequencies (FIGURE 2C). As a consequence, bats and toothed whales produce echolocation signals in a surprisingly similar frequency range from 10 to 200 kHz, and the acoustic fields of view in bats are two to six times broader than those of echolocating toothed whales (FIGURE 1, B AND C).
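
The wavelength part of this argument follows from λ = c/f. The short Python sketch below illustrates it under assumed nominal sound speeds of 343 m/s in air and 1,500 m/s in water; these round values and the function name are illustrative assumptions and are not taken from this review.

```python
# Nominal sound speeds (assumed round values, for illustration only).
C_AIR_M_S = 343.0
C_WATER_M_S = 1500.0


def wavelength_m(frequency_hz: float, sound_speed_m_s: float) -> float:
    """Return the acoustic wavelength in metres, lambda = c / f."""
    if frequency_hz <= 0 or sound_speed_m_s <= 0:
        raise ValueError("frequency and sound speed must be positive")
    return sound_speed_m_s / frequency_hz


if __name__ == "__main__":
    # Frequencies spanning the range of echolocation signals discussed above.
    for f_khz in (15, 50, 130):
        f_hz = f_khz * 1e3
        lam_air_mm = 1e3 * wavelength_m(f_hz, C_AIR_M_S)
        lam_water_mm = 1e3 * wavelength_m(f_hz, C_WATER_M_S)
        print(f"{f_khz:>4} kHz: {lam_air_mm:6.2f} mm in air, "
              f"{lam_water_mm:6.2f} mm in water")
```

With these assumed speeds, the wavelength at any given frequency is about 4.4 times longer in water than in air, which is in line with the "almost five times" difference stated above.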

FIGURE 2.

Target size that generates geometric backscatter and absorption

Minimum target size that generates geometric backscatter (blue, left y-axis) and absorption in dB/m (green, right y-axis) as a function of frequency in air (A) and water (B). Absorption is calculated at 25°C and assuming 60% humidity and 1-m depth, respectively. C: the combined transmission loss from spherical spreading and absorption for four different frequencies as a function of range in meters away from reference distances of 0.1 m (bats) and 1 m (toothed whales). Notice how the high absorption at high frequencies in air gives rise to very high transmission loss (TL), which in turn precludes effective biosonar operation.
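
The combined loss described for panel C can be sketched as a one-way transmission loss TL(r) = 20 log10(r/r0) + α(r − r0), i.e., spherical spreading plus absorption beyond the reference distance r0. In the Python sketch below, the function name and the two absorption coefficients are placeholders chosen for illustration (differing by roughly two orders of magnitude); they are not values read off FIGURE 2.

```python
import math


def transmission_loss_db(range_m: float, ref_m: float,
                         absorption_db_per_m: float) -> float:
    """One-way TL in dB: spherical spreading plus absorption beyond ref_m."""
    if ref_m <= 0 or range_m < ref_m:
        raise ValueError("need 0 < ref_m <= range_m")
    spreading_db = 20.0 * math.log10(range_m / ref_m)
    absorption_db = absorption_db_per_m * (range_m - ref_m)
    return spreading_db + absorption_db


if __name__ == "__main__":
    # Hypothetical absorption coefficients in dB/m (placeholders, not data).
    ALPHA_AIR = 3.0
    ALPHA_WATER = 0.03
    for r in (1.0, 5.0, 10.0, 20.0):
        tl_air = transmission_loss_db(r, 0.1, ALPHA_AIR)      # bat reference: 0.1 m
        tl_water = transmission_loss_db(r, 1.0, ALPHA_WATER)  # whale reference: 1 m
        print(f"r = {r:5.1f} m: TL air = {tl_air:6.1f} dB, "
              f"TL water = {tl_water:6.1f} dB")
```

Because the absorption term grows linearly with range while the spreading term grows only logarithmically, a large absorption coefficient quickly dominates the total loss.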

The Sonar Equation: Detection Ranges in Air and Water

The echo level (EL) returning to an echolocating animal can be estimated with the active sonar equation (FIGURE 1A), which in a simple form on a dB scale includes the source level of the emitted sound pulse (SL), the transmission loss (TL) back and forth to the target, and the target strength (TS), a measure of backscattering from the target: EL = SL + TS − 2TL.
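
As a minimal worked instance of EL = SL + TS − 2TL, the Python sketch below evaluates the equation for one set of inputs. The SL of 138 dB is the bat value quoted in FIGURE 1B; the target strength and one-way transmission loss are hypothetical numbers chosen only to make the arithmetic concrete, and the function name is likewise an assumption.

```python
def echo_level_db(source_level_db: float, target_strength_db: float,
                  one_way_tl_db: float) -> float:
    """Active sonar equation on a dB scale: EL = SL + TS - 2 * TL."""
    return source_level_db + target_strength_db - 2.0 * one_way_tl_db


if __name__ == "__main__":
    sl_db = 138.0   # dB re 20 uPa (pp), bat call from FIGURE 1B
    ts_db = -30.0   # hypothetical target strength of a small insect
    tl_db = 40.0    # hypothetical one-way transmission loss
    el_db = echo_level_db(sl_db, ts_db, tl_db)
    print(f"EL = {el_db:.1f} dB re 20 uPa (pp)")
```

The transmission loss enters twice because the sound travels to the target and back, so each extra dB of one-way loss costs 2 dB of echo level.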

Detection of a returning echo will happen when the EL on a statistical basis is higher than the detection threshold in the auditory system of the echolocating animal, which in turn may be defined by ambient noise, clutter, or the hearing threshold of the animal. Echolocating animals in both air and water must produce high SLs to forage successfully with sound. For that reason, bats and toothed whales produce some of the most powerful biologically generated sounds. Field recordings from recent years show that bat echolocation calls can reach SLs of up to 140 dB re 20 μPa (pp) @ 0.1 m in air, 20–40 dB higher than in the laboratory (41). Toothed whales may generate SLs up to 225 dB re 1 μPa (pp) @ 1 m in water (44) and even up to 240 dB re 1 μPa (pp) in the case of the sperm whale (29). Such high source levels should, however, not be compared directly across the water-air interface for several reasons. First, the two source levels use different reference values and different reference distances. Second, the mammalian ear is an energy detector that integrates sound intensity over a certain integration window in two media of very different impedances (3, 39). Because the impedance in air is much lower than in water, it is very difficult to make high sound pressure levels in air compared with water, and bats likely operate up against the limit at which sound pressure can be made effectively in air (41). They compensate for this by emitting pulses that may be 30–1,000 times longer than toothed whale echolocation clicks and hence carry much more energy for a given pressure (FIGURE 1, B AND C). When the different pulse durations and impedances of the two media are taken into account, a 2-ms bat call in air with an SL of 138 dB re 20 μPa (pp) @ 10 cm will have an energy flux density of ∼5 × 10⁻⁵ J/m² and a 50-μs-long toothed whale click with a source level of 226 dB re 1 μPa (pp) @ 1 m will have an energy flux density of 0.04 J/m² (FIGURE 1, B AND C). Thus echolocating bats emit about three orders of magnitude less energy per unit of area on-axis per sonar signal than toothed whales (FIGURE 1, B AND C) but ∼1–2 orders of magnitude more total acoustic energy per unit of body mass (3–100 g for bats and 40–60,000 kg for toothed whales).

However, prey detection ranges not only depend on the SL and detection thresholds but also on the TS of the prey and the TL back and forth. Due to the very large impedance differences between an insect cuticle and air compared with the much lower impedance differences between water and tissues of fish and squid, the 0.5- to 5-cm prey typical of bats (6) have about the same TS as the 10- to 50-cm aquatic prey typical of toothed whales (5), despite their size being an order of magnitude smaller. The TL is given by the sum of the geometric spreading loss and absorption. The latter is approximately two orders of magnitude larger in air compared with water at echolocation frequencies (FIGURE 2, A AND B). Furthermore, the absorption increases steeply with frequency, and bats that hunt in uncluttered conditions and call at high frequencies seemingly compensate in part for that effect by emitting louder calls than larger aerial hunting bats that use lower frequencies (41).

Absorption in water at ultrasonic frequencies is thus much smaller than in air (FIGURE 2, A AND B), which in combination with much higher source levels provides whales with target detection ranges that are one to two orders of magnitude larger than for bats; toothed whales can, under noise-limited conditions, echolocate their prey at ranges between 20 and 500 m compared with 2–10 m for bats (FIGURE 2C). Echolocating animals thus face a trade-off with frequencies (9, 11): they must use ultrasonic pulses to generate directional sound beams that provide high resolution and geometric backscatter off their prey targets, but they also need to keep the transmission loss down to a level that allows them to detect prey at long enough ranges to find enough food to meet their energetic requirements (FIGURE 2). Larger species can generate high directionalities and geometric backscatter from their larger prey at lower frequencies, which in turn gives them lower transmission losses and hence longer detection ranges (FIGURE 2C).
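
The trade-off between absorption and detection range can be explored numerically by finding the largest range at which the echo level from the sonar equation still reaches a detection threshold. The Python sketch below does this by bisection, assuming spherical spreading plus linear absorption. All input values in the demonstration (source level, target strength, threshold, absorption coefficients) are hypothetical placeholders rather than data from this review, and the function names are assumptions.

```python
import math


def echo_level_db(sl_db: float, ts_db: float, range_m: float,
                  ref_m: float, alpha_db_per_m: float) -> float:
    """EL = SL + TS - 2*TL with one-way TL = spreading + absorption."""
    tl_db = (20.0 * math.log10(range_m / ref_m)
             + alpha_db_per_m * (range_m - ref_m))
    return sl_db + ts_db - 2.0 * tl_db


def max_detection_range_m(sl_db: float, ts_db: float, threshold_db: float,
                          ref_m: float, alpha_db_per_m: float,
                          search_limit_m: float = 1.0e5) -> float:
    """Largest range (m) at which EL still reaches the threshold."""
    if echo_level_db(sl_db, ts_db, ref_m, ref_m, alpha_db_per_m) < threshold_db:
        return 0.0  # not detectable even at the reference distance
    if echo_level_db(sl_db, ts_db, search_limit_m, ref_m,
                     alpha_db_per_m) >= threshold_db:
        return search_limit_m  # still detectable at the search limit
    lo, hi = ref_m, search_limit_m
    # EL decreases monotonically with range, so bisection converges.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if echo_level_db(sl_db, ts_db, mid, ref_m, alpha_db_per_m) >= threshold_db:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    # Hypothetical inputs: same emitter and target, two absorption values.
    for alpha in (0.5, 3.0):
        r = max_detection_range_m(sl_db=130.0, ts_db=-30.0, threshold_db=20.0,
                                  ref_m=0.1, alpha_db_per_m=alpha)
        print(f"alpha = {alpha:.1f} dB/m -> max range = {r:.2f} m")
```

For fixed source level, target strength, and threshold, a larger absorption coefficient always yields a shorter maximum range in this model, which is the qualitative point of the trade-off described above.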

Reactive vs. Deliberate Sensorimotor Operations

Bats move forward at speeds between 3 and 8 m/s (7), by which they typically cover about a prey detection range per second. Toothed whales, on the other hand, normally move at speeds of ∼2 m/s (35) and hence only cover fractions of a typical detection range per second, leaving much more time to gather echo information from the ensonified target compared with bats (23). Consequently, bats go through their approach and capture phases on time scales that are one to two orders of magnitude faster than toothed whales (FIGURE 3). This means that toothed whales can employ a deliberate mode of sensorimotor operation (37) in which the sensory volume is large compared with the stopping volume (23). This is very different from bats that operate in a reactive mode where they normally have <1 s between detection and interception of prey (18). These very different ratios between maximum detection ranges and speeds of forward motion and the resulting differences in sensorimotor operation may explain why there is little evidence for prey selection by bats (6) and considerable evidence for prey selection by toothed whales in the wild (22), even though bats in the laboratory in fact can discriminate targets based on infinitesimal spectral echo differences when given the time (36). It also follows that bats must employ a much faster vocal-motor feedback loop in their sonar system to guide motor patterns in split-second interception attempts. They do, on the other hand, have the advantage compared with toothed whales that their capture mechanism involves parts of the tail or wing membrane that make up an area considerably bigger than their mouth, whereas toothed whales must maneuver precisely around their larger prey to engulf it with a relatively much smaller mouth area. Nevertheless, bat echolocation must provide higher absolute prey location capabilities in much shorter time than is the case for toothed whales (FIGURE 3).
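
The difference between the two modes can be made concrete by dividing a detection range by a forward speed. The Python sketch below uses the ranges and speeds quoted in this review (2–10 m and 3–8 m/s for bats; 20–500 m and about 2 m/s for toothed whales); pairing the extremes of range and speed is an illustrative assumption, as is the function name.

```python
def seconds_to_cover(detection_range_m: float, speed_m_s: float) -> float:
    """Time in seconds needed to travel one detection range."""
    if speed_m_s <= 0:
        raise ValueError("speed must be positive")
    return detection_range_m / speed_m_s


if __name__ == "__main__":
    # Shortest and longest times obtained by pairing the quoted extremes.
    bat_fast = seconds_to_cover(2.0, 8.0)
    bat_slow = seconds_to_cover(10.0, 3.0)
    whale_short = seconds_to_cover(20.0, 2.0)
    whale_long = seconds_to_cover(500.0, 2.0)
    print(f"bats:   {bat_fast:.2f} to {bat_slow:.2f} s per detection range")
    print(f"whales: {whale_short:.0f} to {whale_long:.0f} s per detection range")
```

Under these pairings, bats have from a fraction of a second to a few seconds per detection range, whereas toothed whales have tens to hundreds of seconds.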

FIGURE 3.

Call intervals

Both toothed whales (A) and bats (B) decrease click or call intervals (ICI) as well as source level (emitted energy color coded) when closing in on a prey item. A: three species of toothed whales show the same pattern of SL reductions and increased click repetition rates in the buzz, where resolution is traded for output. Note the scaling where the sperm whale (Physeter) buzzes at ICIs comparable to the ICIs in the search phase of a porpoise (Phocoena) with a body mass that is three orders of magnitude smaller. B: a similar pattern is seen in three species of bats but on a time scale that is an order of magnitude smaller, demonstrating the reactive mode of sensorimotor operation for bats. Note the larger changes in ICIs compared with the toothed whales.

Acoustic Gaze Adjustments

Echolocating animals actively update their auditory scene through discrete acoustic sampling at a rate given by the pulse interval of their sonar emissions. Bats and toothed whales both wait for the echoes to return before emitting the next sonar pulse (3): if they emitted a new biosonar pulse before the echoes generated by the previous emission had arrived, they would face range ambiguity problems. On the other hand, if they use pulse intervals that are too long with respect to the speed at which they or their prey move, they may miss potential prey targets or fail to obtain enough feedback to avoid obstacles or track targets in time and space. The longer the detection range of the biosonar system and the slower the sound speed, the longer the pulse intervals must be to avoid range ambiguity. Hence, because of the roughly five times slower sound speed in air compared with water, bats must wait about five times longer for the echo to return compared with a whale ensonifying a target at the same range. On the other hand, the detection ranges of toothed whale sonars are one to two orders of magnitude longer than those of bats, so overall the pulse rates of bats and whales are comparable, on the order of 2–25 Hz during the search phase (FIGURE 3).
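
The relation between range, sound speed, and pulse interval is the two-way travel time t = 2r/c, so the highest pulse rate that avoids range ambiguity is c/(2r). The Python sketch below illustrates this with assumed nominal sound speeds of 343 m/s in air and 1,500 m/s in water and with example ranges picked from within the detection ranges quoted in this review. It ignores any additional processing time the animal may need after the echo arrives, and all function names are assumptions.

```python
# Nominal sound speeds (assumed round values, for illustration only).
C_AIR_M_S = 343.0
C_WATER_M_S = 1500.0


def two_way_travel_time_s(range_m: float, sound_speed_m_s: float) -> float:
    """Delay between pulse emission and echo arrival, t = 2r / c."""
    return 2.0 * range_m / sound_speed_m_s


def range_from_delay_m(echo_delay_s: float, sound_speed_m_s: float) -> float:
    """Target range recovered from the echo delay, r = c * t / 2."""
    return 0.5 * sound_speed_m_s * echo_delay_s


def max_unambiguous_rate_hz(range_m: float, sound_speed_m_s: float) -> float:
    """Highest pulse rate at which each echo returns before the next pulse."""
    return sound_speed_m_s / (2.0 * range_m)


if __name__ == "__main__":
    for label, r, c in (("bat, 5 m target in air", 5.0, C_AIR_M_S),
                        ("whale, 100 m target in water", 100.0, C_WATER_M_S)):
        delay_ms = 1e3 * two_way_travel_time_s(r, c)
        rate_hz = max_unambiguous_rate_hz(r, c)
        print(f"{label}: echo delay {delay_ms:.1f} ms, "
              f"max rate {rate_hz:.1f} Hz")
```

At equal range, the echo delay in air is longer than in water by the ratio of the two sound speeds, which is about 4.4 with the values assumed here.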

In both taxa, most species reduce the interpulse intervals and the output levels in both the approach and buzz phases to accommodate the faster returns of echoes and higher echo levels (FIGURE 3) (4, 13). Such acoustic gaze adjustments are most dramatic in the buzz phase, where the high pulse rates provide fast updates on the location of a targeted prey, whereas the low SLs reduce the complexity of the auditory scene (17, 45). Although such buzz behavior has been known for aerial hunting bats since Griffin's early studies (10), it has largely been overlooked for toothed whales through more than 40 years of biosonar studies with a biomimetic focus (3). Field and laboratory studies with free-swimming toothed whales catching prey have recently shown that all species studied under such circumstances switch to a buzz when they are just under their own body length away from a prey item (1, 2, 22, 24, 25, 43). Thus both bats and toothed whales use low-level, fast-repetition-rate buzzes for fine-scale tracking of their prey for capture, trading output intensity for update rate of their actively generated auditory scene (FIGURE 3). The central auditory processing mechanisms at these high sampling rates are not understood at this time (34), but the ubiquitous nature of high-rate buzzes in both air and water, when bats and whales home in on mobile prey, is not only striking and interesting but highly suggestive of a key function for sonar perception in those last moments of prey capture (FIGURE 3). Interestingly, the buzz pulse rates are scaled to the size of the animal so that the buzz rate of a sperm whale is comparable to the clicking rate of a small porpoise during search (FIGURE 3). This scaling is at play even though porpoises and sperm whales swim at comparable speeds when closing in on prey, meaning that sperm whales get about an order of magnitude fewer updates on prey location per distance covered during buzzing than porpoises. It therefore seems that sampling rates during buzzing are related to absolute predator and prey maneuverability rather than to closing speeds during prey captures.

Although both toothed whales and bats change acoustic gaze when they approach their prey items, the degree to which they do it differs (FIGURE 3). Both groups reduce the energy output per sonar pulse during buzzing, but whereas bats do it by reducing both the peak pressure and the duration (33), toothed whales generally do not seem to change their click durations but instead reduce their peak pressures dramatically (FIGURE 3). This is a part of a broader conclusion that bats have a much more plastic sound production system, where bandwidth, duration, sweep rate, and peak power can be adjusted to produce a range of different biosonar signals within a single bat species to handle the dual needs of energy for detection during search and provide range resolution during buzzing. This plasticity is based on superfast muscle control of the vocal cords (8), allowing for fast changes in vocal outputs. Toothed whales produce much shorter clicks normally by using the right pair of phonic lips in their nasal complex (21) with a mass and configuration that offers less plasticity in terms of duration and frequency of their high-pressure, ultra-short broadband clicks (FIGURE 1, B AND C).

In conclusion, echolocation in bats and toothed whales involves surprisingly similar call frequencies and acoustic behavior and offers as such an example of striking functional convergence where two very distantly related groups of mammals independently evolved the capability to hunt and navigate in the dark using ultrasound in much the same way. The last 60 years of research have formed a solid understanding of how these animals use echolocation to detect, discriminate, and track targets in noise and clutter. More recently, the advent of microcontroller technology (e.g., Ref. 16) has enabled studies of echolocating animals in the wild, revealing a much more dynamic use of acoustic gaze to sense with ultrasound. We argue that to understand the evolution and function of sensory systems, they must also be studied in the wild, and we hope that future technological advances will enable increasingly detailed, long-term field studies on a broader range of species to provide a strong synergy between laboratory and field studies.

Footnotes

  • No conflicts of interest, financial or otherwise, are declared by the author(s).
