by Michael Towsey, Anthony Truskinger, and Paul Roe

Preface

This page is an adaptation of a presentation produced by Michael Towsey that was given to various external research groups in February and March 2016. It is a good summary of our current research.

You can contact Michael or our research group by going to the contact us page.

Introduction

Slide 1.

This presentation describes work being done in the Eco-acoustics Laboratory at the Queensland University of Technology (QUT). An important part of our research is the visualisation, navigation, and analysis of long-duration recordings of the environment. Recordings of the environment help ecologists to monitor species diversity, endangered species, and the effects of climate change. It is fortunate that three major groups of vocal animals (birds, frogs, and insects) are also good indicators of environmental health. Acoustic recordings can therefore help in long-term studies of environmental change, whether due to negative factors such as pollution, habitat loss, and climate change, or to positive factors such as conservation and restoration projects.

Many of our recordings have been obtained at the Samford Ecological Research Facility (SERF) using SM2 recording boxes manufactured by Wildlife Acoustics (see the image in Slide 1).

Spectrograms

Slide 2.

A short snippet of the Lewin's Rail on the edges of Brisbane City.

Ecologists have long worked with spectrograms: two-dimensional representations of sound, with time on the x-axis and frequency (hertz or kilohertz) on the y-axis. Sound amplitude is coded by grey-scale intensity. A typical spectrogram is a few seconds long, or as long as required to demonstrate the animal call of interest. We have used acoustic recordings to monitor the cryptic Lewin's Rail on the edges of Brisbane City. The sounds at 8 kHz and above in the spectrogram are due to crickets. There are at least four other birds also calling. However, a spectrogram of a 24-hour recording at a scale similar to that of Slide 2 would require a computer screen 1.2 kilometres wide!

An additional problem with acoustic recordings is that technological advances now make it possible to record days, months, or even years of audio, far in excess of what can ever be listened to. Clearly, some kind of audio data reduction is required. Visualisation of sound is a promising approach because, of all the human senses, the visual sense has the greatest capacity to synthesise and integrate large amounts of information.

Long-duration spectrograms

Slide 3.

A good way to reduce the information in acoustic recordings of the environment is by using acoustic indices. An acoustic index can be understood as a statistic, that is, a single value to summarise some aspect of the distribution of acoustic energy in a recording. Just as weather reports give average temperature to summarise the weather over an entire day, so we can also use acoustic indices to summarise acoustic properties in a segment of recording.

There are now many studies showing that certain indices, such as the Acoustic Complexity Index (ACI) [1], can detect the presence of biological sounds but remain unresponsive to various kinds of monotonous machine noise (see, for example, the papers of Almo Farina (Italy), Jerome Sueur (France), Stuart Gage (USA), and their students, who collectively developed this field of research). This is useful when one wishes to characterise sound in a semi-urban environment. ACI measures the amount of relative change in sound amplitude from one time instant to the next.

It is therefore possible to prepare a spectrogram of a long-duration recording using just the ACI index. We illustrate the method using a four-hour recording (4-8pm) obtained on 10 October 2013 (see Slide 3). We break the four-hour recording into 240 one-minute segments. We prepare a standard spectrogram for each one-minute segment and then calculate the ACI value for each of its 256 frequency bins. Each minute of recording therefore yields a single ACI spectrum containing 256 values. We then join the 240 spectra to make a spectrogram. As illustrated in Slide 3, the ACI spectrogram reveals a lot of acoustic structure. For example, the two tracks after sunset (right side of the spectrogram) are due to the chirps of two different species of cricket; note how the tracks decrease in frequency as temperature declines with the onset of night. Other dominant features in the spectrogram are due to birds.
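The procedure can be sketched in a few lines of Python. This is a minimal illustration rather than the lab's implementation: it assumes the commonly published form of ACI (the summed absolute change between consecutive amplitude values in a frequency bin, divided by the summed amplitude in that bin), and the function names `aci_spectrum` and `long_duration_spectrogram` are hypothetical.

```python
def aci_spectrum(spectrogram):
    """Return one ACI value per frequency bin for a single segment.

    `spectrogram` is a list of time frames; each frame is a list of
    non-negative amplitude values, one per frequency bin.
    """
    n_bins = len(spectrogram[0])
    spectrum = []
    for b in range(n_bins):
        column = [frame[b] for frame in spectrogram]
        total = sum(column)
        # Sum of absolute changes between consecutive time frames.
        change = sum(abs(column[t + 1] - column[t])
                     for t in range(len(column) - 1))
        spectrum.append(change / total if total > 0 else 0.0)
    return spectrum


def long_duration_spectrogram(minute_spectrograms):
    """Join one ACI spectrum per minute into a long-duration spectrogram."""
    return [aci_spectrum(s) for s in minute_spectrograms]


# Toy segment with two frequency bins: bin 0 is steady, bin 1 fluctuates.
toy_minute = [[1, 1], [1, 3], [1, 1], [1, 3]]
print(aci_spectrum(toy_minute))  # [0.0, 0.75]
```

For the four-hour recording described above, `minute_spectrograms` would hold 240 spectrograms of 256 bins each, giving a 240 × 256 image. The steady bin scores zero, which is consistent with ACI being unresponsive to monotonous noise.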

Different indices – different views

Slide 4.

In like manner, we can make additional spectrograms derived from other acoustic indices. In fact, one can experiment with many kinds of index in order to reveal the acoustic structure of interest. In this image, you see long-duration spectrograms prepared from three different acoustic indices. The H(t) index is derived from temporal entropy. It measures the degree of (temporal) concentration of acoustic energy in a frequency bin, so it runs in the opposite sense to a conventional entropy, which is highest when energy is evenly spread. With continuous background noise, the acoustic energy is highly dispersed, yielding a low value for H(t). If the only sound over the minute is a single dog bark, the sound is concentrated and the value of H(t) will be high.

CVR is short for cover. It simply measures the fraction of spectrogram cells in a given frequency bin where the acoustic energy exceeds a threshold, typically 3 decibels (dB). These are comparatively simple indices, but the important point to note is that each index reveals different components or events in the acoustic sound-space. For example, a 30-minute cicada chorus is only seen in the CVR spectrogram starting at 6pm at 900 Hz. A strong event is shown at 4:30pm (1.8 kHz) in the H(t) spectrogram which is not so obvious in the other spectrograms.
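Both indices are simple to compute for one frequency bin of a one-minute spectrogram. The sketch below is illustrative only. It assumes that the H(t) index can be modelled as one minus the normalised Shannon entropy of the energy distribution over time, which reproduces the behaviour described above (low for dispersed energy, high for concentrated energy). It also assumes that the cover threshold is applied to decibel values. The function names are hypothetical.

```python
import math


def temporal_concentration(energies):
    """One minus normalised temporal entropy for one frequency bin.

    `energies` holds the acoustic energy in each time frame of the bin.
    Returns 0.0 for evenly spread energy and 1.0 for a single burst.
    """
    total = sum(energies)
    n = len(energies)
    if total <= 0 or n < 2:
        return 0.0
    entropy = 0.0
    for e in energies:
        if e > 0:
            p = e / total
            entropy -= p * math.log2(p)
    return 1.0 - entropy / math.log2(n)


def cover(decibels, threshold_db=3.0):
    """Fraction of cells in the bin whose level exceeds the threshold."""
    return sum(1 for d in decibels if d > threshold_db) / len(decibels)


steady_noise = [1.0, 1.0, 1.0, 1.0]   # continuous background noise
single_bark = [0.0, 0.0, 8.0, 0.0]    # one brief, loud event
print(temporal_concentration(steady_noise))  # 0.0
print(temporal_concentration(single_bark))   # 1.0
print(cover([0.0, 5.0, 2.0, 4.0]))           # 0.5
```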

We have now introduced two new terms: sound-space and acoustic event. Sound-space is the abstract notion of a space with five dimensions: one of time, one of frequency, and three of physical space. It is the “space” in which acoustic events occur. An acoustic event is typically a single uninterrupted sound that can be attributed to the same source. The source might be a single animal or a chorus of many animals (such as cicadas). A spectrogram reveals the distribution of acoustic energy in the time-frequency dimensions for a fixed point in space.

False-colour spectrograms

Slide 5.

The next step in the production of long-duration spectrograms was inspired by false-colour satellite imagery, in which pictures of the earth’s surface are rendered in three colours by assigning the red, green and blue (RGB) colours to sensors that respond to different parts of the electromagnetic spectrum. In this case we assign RGB to the three different spectrograms and thereby produce the long-duration, false-colour spectrogram seen on the right-hand side of this slide [2]. Note how the complexity of acoustic structure in the sound-space is clearly revealed.
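The colour assignment can be sketched as follows. This is a minimal illustration under stated assumptions: each index is scaled to the range 0 to 1 using hypothetical lower and upper bounds, and the channel order (ACI as red, H(t) as green, CVR as blue) is inferred from the colour chart described for Slide 6 rather than stated explicitly in the text. The function names are hypothetical.

```python
def normalise(value, lo, hi):
    """Scale value into 0..1, clipping anything outside [lo, hi]."""
    if hi <= lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def false_colour_image(red, green, blue, bounds):
    """Combine three index spectrograms into rows of (R, G, B) pixels.

    Each spectrogram is a list of rows of index values; all three must
    have the same shape. `bounds` holds one (lo, hi) pair per channel.
    """
    image = []
    for r_row, g_row, b_row in zip(red, green, blue):
        row = []
        for values in zip(r_row, g_row, b_row):
            pixel = tuple(round(255 * normalise(v, lo, hi))
                          for v, (lo, hi) in zip(values, bounds))
            row.append(pixel)
        image.append(row)
    return image


# Made-up values for one row of two cells.
aci = [[0.75, 0.25]]
ent = [[0.0, 1.0]]
cvr = [[0.5, 0.5]]
bounds = [(0.25, 0.75), (0.0, 1.0), (0.0, 0.5)]  # hypothetical ranges
print(false_colour_image(aci, ent, cvr, bounds))
# [[(255, 0, 255), (0, 255, 255)]]
```

The first cell has high ACI and CVR and renders purple; the second has high H(t) and CVR and renders cyan.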

The utility of false-colour spectrograms

Slide 6.

Here we see just how much extra acoustic structure can be seen in a false-colour spectrogram. We took a full 24-hour recording (starting at midnight and finishing the following midnight; this is the same recording from which the four hours of the previous slides were extracted) and processed it using both a standard audio software tool (Audacity) and our false-colour technique. Ordinarily you would not ask Audacity to open a 24-hour recording, but it can be done (with a lot of patience!) on a high-performance computer. Audacity achieves the required 3000-fold reduction by averaging over long segments of recording. However, this smooths out the acoustic structure and makes most events invisible.

The basic structure of the forest soundscape at SERF is easily interpreted. (Soundscape is a new word: it refers to the collectivity of acoustic events that occur in a sound-space, and is analogous to landscape in three-dimensional space.) The morning chorus is clearly visible at 4:45am. It tends towards white in colour because all indices have high values. The onset of evening is indicated by the onset of acoustic tracks due to insects. But clearly there are many other events.

A colour chart is added at the top of the image to help with interpretation. For example, purple events have high values for ACI and CVR. Cyan events have high values for H(t) and CVR. This will depend on the characteristic distribution of acoustic energy in the animal calls that contribute to the event. With some experience you begin to recognise the different sounds of interest.

Different indices – different views

Slide 7.

Different combinations of indices give different views of the soundscape. Here we see two false-colour spectrograms of the same recording. Note that more structure is visible in the bottom spectrogram than in the top, because two of the three indices used to construct the top spectrogram are correlated. So here we have an important “trick” in making this technique work. Each of the three indices used to construct a false-colour spectrogram should be independent of the others, or at least not positively correlated with them. In the top image, the two indices AVG and CVR are highly correlated, and therefore the third index (CVR) adds no new information to the other two. The EVN index in the second spectrogram is a measure of the number of acoustic events per second in each frequency bin.
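One way to screen candidate indices before choosing a colour combination is to compute pairwise correlations. The sketch below is an illustration, not the method used to produce the slides: it uses the Pearson correlation coefficient, the threshold of 0.8 is an arbitrary assumption, and the index values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def redundant_pairs(indices, threshold=0.8):
    """List index pairs whose positive correlation exceeds `threshold`."""
    names = sorted(indices)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(indices[a], indices[b]) > threshold:
                pairs.append((a, b))
    return pairs


# Made-up index values for four one-minute segments in one frequency bin.
toy_indices = {
    "AVG": [1.0, 2.0, 3.0, 4.0],
    "CVR": [2.0, 4.0, 6.0, 8.0],
    "ACI": [3.0, 1.0, 4.0, 2.0],
}
print(redundant_pairs(toy_indices))  # [('AVG', 'CVR')]
```

A pair flagged in this way would contribute little extra information if both members were assigned to colour channels in the same image.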

Biophony

Slide 8.

In this slide, we identify some of the birds that produced the acoustic events in the two spectrograms. While the top spectrogram is not as informative, it does show up the cicada chorus at 6pm. Note that much of the structure in these images comes about because a bird (or birds) tends to sit in the same tree for a few minutes and is likely to repeat the same song/call. Even though a bird may only call once or twice in a minute, one or other acoustic index will “detect” the call and calls over consecutive minutes will leave a clearly visible trace in the spectrogram. Note that these false-colour spectrograms were not designed to identify bird species. The fact that this is possible is “icing-on-the-cake”.

You can see an interactive version of this slide here.

Geophony

Slide 9.

The typical soundscape will be filled with sounds from many different sources, not just from animals. Soundscape ecologists broadly recognise three categories of sound source, which they label biophony, geophony, and anthropophony; sometimes a fourth, technophony, is added (to distinguish musical sounds and speech from machine noise). In this slide we see the traces left in a false-colour spectrogram by various kinds of geophony. In the top-left image you will note that bird song (in green) occurs in between gusts of wind (in blue). An image like this can direct an ecologist to those parts of the recording in which birds are singing, thereby saving a lot of time.

The sound of rain as it appears in a false-colour spectrogram can vary a lot, obviously depending on the volume of rain but also on local characteristics, such as leaf size, leaf litter, and even the acoustic response of the recording box to the percussive effects of raindrops.

Anthropophony

Slide 10.

This recording has a Koala call in it.

These images show two sources of anthropophony/technophony due to planes and a helicopter. Plane noise is typically in the low frequency band and has a pyramid shape in the spectrogram. The pyramid shape is due to the fact that low frequency sounds travel further than high frequency sounds. As a plane approaches the microphone, the low frequency sounds are heard first and the high frequency sounds are heard only when the plane is closer. The reverse happens as the plane flies away. The speed and distance of the plane can be determined from the shape of the pyramid. Note the dual components of helicopter noise, a low frequency component due to the engine and a high frequency component due to the whipping of the rotors.

Adelbert Ranges, Papua New Guinea

Slide 11. Audio courtesy of Eddie Game and the Nature Conservancy

This false-colour spectrogram was obtained from the Adelbert Ranges, Papua New Guinea, by The Nature Conservancy, a global conservation organisation that is attempting to preserve some of the natural forests of PNG. See more info on the Adelbert Mountains project here and here. The local terrain for this recording is mountainous jungle. Note how the entire sound-space is filled with acoustic activity. Most of the activity is due to insects. Birds are, for the most part, restricted to the lower part of the spectrum. The morning chorus is not as well-defined as in the previous recording from Australian eucalypt woodland. It is difficult to imagine how you could cram another vocal species into this sound-space! This brings us to the notion of the sound-space as a finite ecological resource. Somehow all the insects, birds and frogs at this location have to partition their acoustic activity in a way that allows them to complete their life-cycles, i.e. so that the mating pairs of each species can find each other in a cacophony of noise.

Sturt National Park

Slide 12. Audio courtesy of Dave Watson.

By way of complete contrast to the previous spectrogram, here is a 24-hour spectrogram of a semi-desert environment in the Sturt National Park, about 1200 km inland from Brisbane, Australia. Note the highly attenuated morning chorus around 7am, the almost complete absence of sound at night when the winter temperatures are likely to be below freezing, and the gusting winds in the middle of the day (insufficient baffles on the microphones!). There are occasional bird calls at night – see 2040h. This shows once again the utility of the H(t) index in picking out brief bursts of sound in long periods of silence.

For more information on the ongoing Sturt National Park acoustics project contact Prof. David Watson from Charles Sturt University (@D0CT0R_Dave).

Cross-site comparison (varying latitudes)

Slide 13.

This slide compares three 24-hour, false-colour spectrograms of three soundscapes from different latitudes. All these recordings were obtained in the first week of July (winter) 2015. The top PNG recording is dominated by insects. The middle Brisbane recording is dominated by birds and the bottom desert recording is dominated by wind.

Recordings courtesy of:

  • (top) Eddie Game, The Nature Conservancy
  • (middle) Yvonne Phillips, QUT Ecoacoustics Research Group
  • (bottom) David Watson, Charles Sturt University

Cross-site comparison (varying altitudes)

Slide 14. Audio courtesy of Eddie Game and the Nature Conservancy.

This slide compares spectrograms from four Adelbert Range locations that are quite close together but differ in altitude. Several differences are apparent, particularly in the 0-2 kHz band. At 280 metres there is a lot of background noise (red colour) in this band, whereas at 900 metres there is more bird activity. Spectrograms such as these can help ecologists to spot differences between recording sites and to frame interesting hypotheses.

Fresh water recordings

Slide 15. Audio courtesy of Simon Linke and Toby Gifford.

Not many people know that fish make a lot of sound, as do underwater insects and crustaceans. This slide shows the false-colour spectrogram of a 24-hour recording taken with a hydrophone in a pond in northern Queensland. Note the change in sound during the day compared to the night. All the acoustic activity in this recording is due to insects and is dominant in the 1-2 kHz band. You can listen to 21 seconds of the underwater recording here; all the sound is due to insects.

This recording (courtesy of Simon Linke) shows acoustic activity in a pond in northern Queensland.

The audio shown here was provided by Dr. Simon Linke and Dr. Toby Gifford, of Griffith University, Brisbane, Australia. If you’d like more information on their project please contact them.

Marine recordings

Slide 16. Audio courtesy of Aaron Rice, Cornell Lab of Ornithology.

This spectrogram is derived from a 24-hour hydrophone recording taken 15 km off the coast of Georgia, USA. The water is only 15 m deep and the hydrophone was positioned 2 m off the ocean floor. Note that, in this case, the horizontal gridlines are 200 Hz apart with 1 kHz full scale. The red events in the bottom spectrogram are due to passing ships. Sound can travel long distances underwater and the passage of a ship can be acoustically evident for two or more hours. Note the tendency towards pyramid-shaped events, because low frequencies travel further than high frequencies. In the second ship from the left, you can observe interesting interference effects due to a phenomenon called Lloyd’s Mirror. Sound arriving at the hydrophone comes directly from the ship but also indirectly, after reflection at the ocean surface. Interference effects result because the two sound paths have slightly different lengths. Each ship has a different acoustic signature that depends on its depth in the water, its propulsion mechanism, and noises from inside the ship due to generators and winches.
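The geometry behind the Lloyd’s Mirror effect can be worked through with a short calculation. This is a simplified sketch that assumes a flat sea surface, straight-line sound paths, a nominal sound speed of 1500 m/s, and a surface reflection that inverts the phase of the sound. The receiver depth of 13 m follows from the 15 m water depth and the hydrophone sitting 2 m off the floor; the source depth and ranges are hypothetical.

```python
import math

SOUND_SPEED_M_S = 1500.0  # assumed nominal speed of sound in sea water


def path_lengths(range_m, source_depth_m, receiver_depth_m):
    """Direct and surface-reflected path lengths (flat surface, straight rays)."""
    direct = math.hypot(range_m, source_depth_m - receiver_depth_m)
    # The reflected path equals the straight line from a mirror-image
    # source located the same distance above the surface.
    reflected = math.hypot(range_m, source_depth_m + receiver_depth_m)
    return direct, reflected


def cancelled_frequencies(path_difference_m, count=3):
    """First few frequencies (Hz) at which the two arrivals cancel.

    Assumes the surface reflection inverts the phase, so cancellation
    occurs when the path difference is a whole number of wavelengths.
    """
    return [n * SOUND_SPEED_M_S / path_difference_m
            for n in range(1, count + 1)]


receiver_depth = 15.0 - 2.0   # hydrophone 2 m above a 15 m deep sea floor
source_depth = 5.0            # hypothetical depth of the ship's noise source
for range_m in (100.0, 500.0, 2000.0):
    direct, reflected = path_lengths(range_m, source_depth, receiver_depth)
    diff = reflected - direct
    print(range_m, round(diff, 3),
          [round(f) for f in cancelled_frequencies(diff)])
```

Under these assumptions the path difference shrinks as the ship moves away, so the cancelled frequencies rise with distance and fall as the ship approaches, which would appear as curved bands in the spectrogram.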

The dominant bio-acoustic activity in this recording is due to fish clicks at around 100 Hz. However, in many marine recordings of the deep oceans many of the sounds have yet to be identified. Earthquakes and sonar activity (due to military and to oil exploration) also contribute to the marine soundscape. Marine noise pollution is now recognised as a major problem and a contributing factor to whale and dolphin strandings.

The audio provided is courtesy of Aaron Rice, Cornell Lab of Ornithology, Cornell University, NYS, USA.

44 days of ocean recordings

Slide 17. Audio courtesy of Aaron Rice, Cornell Lab of Ornithology.

So far the longest false-colour spectrogram we have viewed is 24 hours long, and this takes up the full width of a typical computer screen. In order to view recordings several months long, a different approach must be adopted. This image (Slide 17) represents 44 days of continuous marine recording from the same site as the previous slide. The 24-hour spectrograms have been reduced to a height of 32 pixels while retaining the full 24-hour width. Concatenating the daily spectrograms nicely reveals long-term seasonal acoustic patterns.

The noise from passing ships is clearly apparent. The direction, speed and proximity of each ship can be determined by the shape and extension of its pyramid shape. The second dominating component of this soundscape is the cyan-blue acoustic events at night from day 33 onwards. These are due to the chorusing of the black drum fish. Because of their low frequency, the calls of the black drum fish carry a great distance, even 15 kilometres towards the coast.

A short recording demonstrating dominant acoustic activity in a marine recording off the Georgia Coast, USA. Note the low-frequency noises of the black drum fish (Pogonias cromis). Extract courtesy of Cornell Lab of Ornithology.

The third dominating component of this soundscape occurs during days 9-13. The green lines are due to unidentified “knocking” sounds, something hitting or biting the hydrophone. There were no hurricanes or other meteorological events before or during days 9-13 that might explain these acoustic events. Fishing net strikes are a major problem for marine mammals in the area and may be a possible explanation. In order to get a better understanding of these events, we can add other annotations to the spectrograms as in the next slide.

The audio provided is courtesy of Aaron Rice, Cornell Lab of Ornithology, Cornell University, NYS, USA.

44 days of ocean recordings–with sunrise, sunset, and tide markings

Slide 18. Audio courtesy of Aaron Rice, Cornell Lab of Ornithology.

This is the same 44-day false-colour spectrogram as shown in the previous slide, but with sunrise, sunset, high-tide, and low-tide times superimposed. Day length increases as the recording proceeds through spring. While black drum fish chorusing is dominant at night and is likely to be triggered by increasing day length, the chorusing does not appear to be constrained by daylight. However, it is apparent that the knocking sounds in days 9-13 peak between high and low tides, when coastal currents are at a maximum, and in particular in the period between low tide and high tide. Clearly the “knocking” events are associated with something drifting in the ocean currents. The asymmetrical nature of the events (around low tide) may be due to the additive effect of the clockwise circulation of ocean currents up the east coast of the USA. And there the mystery must remain.

It is worth noting that these hundreds of hours of marine recording were made in order to detect the presence of the North Atlantic Right Whale (NARW), the most threatened of the whale species. The hydrophone was located in the middle of the NARW calving grounds towards the end of the calving season. The usual approach to identifying NARW calls is to write computer code specifically designed to recognise its three kinds of call. Due to the highly specific purpose of automated recognisers, they are designed not to pick up any other acoustic events. The false-colour spectrograms reveal a wealth of additional information about the marine soundscape. They complement the information obtained by call recognisers because they reveal the acoustic environment in which the NARW live.

Four month diel plot

Slide 19. Audio courtesy of Yvonne Phillips, QUT Ecoacoustics Research Group.

Even with the 32-pixel-high spectrograms in the previous slide, there is a limit to how many days can be concatenated to produce a long-duration image. Further compression can be obtained by reducing the spectrogram height to a single pixel. Strictly, we are then no longer dealing with spectrograms but with summary acoustic indices. Unlike a spectrogram index, which has one value per frequency bin, a summary index is a single value representing the distribution of acoustic energy in an entire minute of recording. In this image (Slide 19), you see a representation of four months of recording. Each day occupies a single row of pixels, with midnight at the left-hand side and the following midnight at the right-hand side. These plots are known as diel plots. The times of civil dawn and civil dusk are superimposed, and it is obvious that, unlike the previous marine soundscape, the terrestrial soundscape is highly constrained by daylight. The emerging red patches (bottom right) are due to the onset of insect chorusing as spring temperatures increase. The long horizontal lines are due to extended periods of rain. The red patches within the blue lines are due to extremely heavy rain bursts. The morning chorus is clearly aligned with civil dawn (around 30 minutes before actual sunrise), and the bright-green specks that are most apparent around sunrise and sunset are due to kookaburra choruses.
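The layout of a diel plot amounts to reshaping a long series of per-minute summary values into one row per day. A minimal sketch, assuming the series starts at midnight and covers whole days (the function names are hypothetical):

```python
MINUTES_PER_DAY = 1440


def diel_rows(minute_values):
    """Split a per-minute series into rows of 1440 values, one per day."""
    if len(minute_values) % MINUTES_PER_DAY != 0:
        raise ValueError("expected a whole number of days")
    return [minute_values[start:start + MINUTES_PER_DAY]
            for start in range(0, len(minute_values), MINUTES_PER_DAY)]


def column_for(hour, minute=0):
    """Column (minute of day) corresponding to a clock time."""
    return hour * 60 + minute


# Three days of placeholder values: each value is its own minute number.
toy_values = list(range(3 * MINUTES_PER_DAY))
rows = diel_rows(toy_values)
print(len(rows), len(rows[0]))   # 3 1440
print(rows[2][column_for(17)])   # 3900
```

To produce a colour diel plot, the same reshaping would be applied to three summary indices, one per colour channel.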

Audio provided courtesy of Yvonne Phillips, QUT Ecoacoustics Research Group. The audio was recorded in Gympie National Park, Queensland, Australia.

Four month diel plot–alternate indices

Slide 20. Audio courtesy of Yvonne Phillips, QUT Ecoacoustics Research Group.

Here is another diel plot of exactly the same recording, but this time using different summary indices. In this case the RGB channels represent low, middle and high frequency band activity respectively. The red patches at 5pm on the last days of September are caused by thunderstorms on consecutive days. Note also a change in the dominant frequency of the morning chorus during mid-September. Just as with false-colour spectrograms, diel plots derived from different indices provide different views of the soundscape.

Audio provided courtesy of Yvonne Phillips, QUT Ecoacoustics Research Group. The audio was recorded from Gympie, Queensland, Australia.

The classification of soundscapes

Slide 21. Pictures courtesy of Yvonne Phillips, QUT Ecoacoustics Research Group.

Apart from visualisation, acoustic indices can be used to reveal other ecological insights. For example, we can classify different locations according to their soundscapes. Consider the two locations shown in this slide (Slide 21). One has higher rainfall and therefore supports denser vegetation cover. However, the bird species are quite similar at the two sites. What are the consequences for the soundscape? How will the soundscapes differ despite similar bird species?

Clustering soundscapes

Slide 22. [Sankupellay M., Towsey, M., Truskinger, A., & Roe, P. (2015)](#fn:SAN).

In a study reported by Sankupellay et al. [3], four sites were selected at location A and two sites at location B. Two consecutive days of recording were made at each site, giving 12 days of recording in total.

Summary acoustic indices were calculated at one-minute resolution over all 12 days (12 × 1440 = 17,280 one-minute recording segments). The vectors of ten indices were normalised and then clustered using a 10×10 node self-organising map (SOM). The 100 nodes were further clustered to yield 27 clusters, each representing a distinct “acoustic regime”.

The contents of the 27 clusters were identified by selecting the false-colour spectrum of each minute (see top image of Slide 22). Cluster Y contained very quiet night-time recording segments, while cluster V included the morning chorus and other segments with much bird activity.

A 24-hour cluster occupancy histogram was prepared for each of the 12 days (see middle image in slide) and these cluster occupancy histograms were in turn hierarchically clustered. The resulting dendrogram (bottom right image in the slide) clearly shows that the soundscapes of consecutive days at the same site are more similar than those at different sites. The dendrogram also separates the two locations. To sum up, the use of acoustic indices enables the calculation of acoustic signatures that characterise the soundscapes at different locations.
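The cluster occupancy step can be sketched as follows. This is a simplified illustration with made-up data: it builds an occupancy histogram for each day from the cluster label of each minute and then finds the closest pair of days by Euclidean distance, which corresponds only to the first merge of a hierarchical clustering. The distance measure is an assumption, the toy data use 3 clusters and 10 minutes per day rather than 27 clusters and 1440 minutes, and the names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def occupancy_histogram(labels, n_clusters):
    """Fraction of a day's minutes assigned to each cluster."""
    counts = Counter(labels)
    return [counts.get(c, 0) / len(labels) for c in range(n_clusters)]


def distance(h1, h2):
    """Euclidean distance between two occupancy histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


def closest_pair(histograms):
    """Names of the two days whose histograms are most alike."""
    return min(combinations(sorted(histograms), 2),
               key=lambda pair: distance(histograms[pair[0]],
                                         histograms[pair[1]]))


# Made-up cluster labels (one per minute) for three recording days.
toy_days = {
    "A1-day1": [0] * 6 + [1] * 3 + [2] * 1,
    "A1-day2": [0] * 5 + [1] * 4 + [2] * 1,
    "B1-day1": [0] * 1 + [1] * 2 + [2] * 7,
}
toy_histograms = {name: occupancy_histogram(labels, 3)
                  for name, labels in toy_days.items()}
print(toy_histograms["A1-day1"])     # [0.6, 0.3, 0.1]
print(closest_pair(toy_histograms))  # ('A1-day1', 'A1-day2')
```

In this toy example the two days from the same site are the most similar, mirroring the pattern reported for the dendrogram.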

Two acoustic disciplines: Bio-acoustics and Eco-acoustics

Slide 23.

Until recently, biological interest in acoustics was restricted to the vocalising mechanisms and behaviours of individual animals or species. This science is known as bioacoustics. Technological limitations did not permit long recordings. However, in the last few years the technological impediments to long recordings have largely disappeared. The effect has been to open up an entirely new science, which is variously called eco-acoustics or soundscape ecology. Soundscape ecology studies the interactions between soundscapes and the underlying ecosystem processes.

There is a huge (orders of magnitude) difference in the scale of bioacoustics versus eco-acoustics. Bioacoustics studies acoustic phenomena that last only a few seconds, whereas eco-acoustic patterns may be days, months, or even years in duration. Bioacoustics investigates the behaviour of individuals or single species, whereas eco-acoustics studies, as the name implies, the sounds made by ensembles of thousands of interacting species.

The question arises as to how we can bridge the divide between bioacoustics and eco-acoustics.

In the remaining slides we look at four ways in which work in our lab attempts to bridge this divide.

Bridging the divide between bioacoustics and eco-acoustics

1. Using soundscape indices to help study species diversity

Slide 24.

Typically, acoustic indices are calculated at one-minute resolution, too coarse a time resolution to be of much interest to bio-acousticians, who are interested in animal calls that may last only a few seconds. However, acoustic indices can pick up the occurrence of different sounds even if they cannot identify what they are. The spectrogram in Slide 24 is dominated by wind (blue vertical lines), but when the wind drops, birds begin to sing (yellow-green lines). So one use of acoustic indices (calculated at one-minute resolution) is to classify one-minute audio segments according to their general acoustic content. It is relatively easy to construct a five-class machine learning problem where the task is to identify minutes containing bird calls versus silence, wind, rain and insect sounds. The classifier can be used as a filter to remove recording segments that do not contain bird sounds, as shown in the next slide.

Slide 25. [Zhang, L., Towsey, M., Zhang, J. & Roe, P. (2015)](#fn:ZHA)

This image (Slide 25) shows a histogram of all the 1440 one-minute segments in one day. The minutes are binned according to how many different bird species are calling in the minute segment. The blue bars indicate the distribution of bird call densities before use of a classifier to filter out non-bird segments. The zero-bird-species bin is by far the largest. The red bars indicate distribution after filtering. The classifier has removed most of the one-minute segments that do not contain bird calls. This means that the ecologist is required to listen to less audio when trying to determine species diversity.

The work in this slide is the research product of Liang Zhang, QUT Ecoacoustics Research Group [4].

2. Using acoustic indices to monitor sperm whales

Slide 26. Figure courtesy of Hervé Glotin

Slide 26 shows the Mediterranean Sea and the south coast of France. The city of Toulon is on the left. A few kilometres off the coast, the sea floor drops suddenly from a depth of 200 m to 2000 m. The submarine cliff face is carved out by canyons in which sperm whales like to hunt. A hydrophone rig (known as BOMBYX) is placed at the top of one of these canyons. Sperm whales hunt for their prey by emitting echolocation clicks, and their presence can be detected by identifying clicks at a characteristic periodicity of about one second. You can listen to a short recording of sperm whale clicks here. The first four seconds of the recording are dominated by man-made sonar clicks. When they cease, you can more easily hear the sperm whale clicks.

This recording (courtesy of Hervé Glotin) demonstrates sonar beeps and the clicks produced by a sperm whale (Physeter macrocephalus).

Typically, purpose-written code would be used to automate the recognition of sperm whale clicks. Another approach is to calculate generic acoustic indices, but at 0.1-second resolution rather than one-minute resolution.

The figure is provided by Prof. Hervé Glotin. More information on the BOMBYX project can be found here.

Slide 27.

The generic index which proves to be most useful to detect sperm whale clicks is the temporal entropy index. Slide 27 (top image) shows a spectrogram derived from the H(t) index calculated at 0.1s resolution. The “green” events in the spectrogram are those picked up by H(t). Note that the recording also contains many other “click” sounds, in particular, dominant sonar clicks used to monitor shipping. The sonar clicks must be removed prior to analysis and their harmonics can be easily identified using an averaged spectrum (as shown in top-right image).

The sperm whale clicks are then revealed (centre-left image) and the dominant period between clicks can be shown to be 0.9 seconds (lower-right image). The important idea being demonstrated here is that generic acoustic indices can be used to solve a very specific problem. This makes generic acoustic indices extremely useful.
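
One way to estimate a dominant inter-click period from an index track is autocorrelation. The sketch below is an assumption about how such a step could be done, not a description of the method behind Slide 27. It takes any per-segment click score (for example, one value per 0.1 s) and returns the lag with the strongest self-similarity. The function name `dominant_period` and the `min_lag` parameter are hypothetical.

```python
def dominant_period(track, resolution_s=0.1, min_lag=2):
    """Estimate the dominant repetition period (seconds) of a click-score track.

    track        -- one score per segment (higher = more click-like)
    resolution_s -- duration of each segment in seconds
    min_lag      -- smallest lag (in segments) to consider, to skip lag 0 and 1
    """
    n = len(track)
    if n // 2 <= min_lag:
        raise ValueError("track is too short for the requested min_lag")
    mean = sum(track) / n
    centred = [v - mean for v in track]
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, n // 2):
        # Unnormalised autocorrelation at this lag.
        score = sum(centred[i] * centred[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag * resolution_s

# Usage with a synthetic track that has a click every 9 segments (0.9 s):
# track = [1.0 if i % 9 == 0 else 0.0 for i in range(200)]
# period = dominant_period(track, resolution_s=0.1)
```

Because the autocorrelation is left unnormalised, shorter lags have more overlapping terms, which favours the fundamental period over its multiples.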

3. Using acoustic indices for animal welfare

Slide 28. Courtesy of Katherine Herborn, Alan McElligott, and Lucy Asher

The obvious advantages of acoustic approaches to monitoring animal behaviour are that recordings are cheap and easy to obtain, and that they provide a permanent and objective record. Our lab is working with Newcastle University (UK) and Queen Mary University of London, using acoustics to monitor poultry health. The study involves finding correlations between hen vocalisations and standard measures of animal well-being (e.g. weight loss). The spectrogram in this slide (Slide 28) shows the 24-hour soundscape of a poultry shed. The hens fall silent during “lights-off” but are vocally active with lights on. The dominant vocal band is 1-3 kHz but there is also much activity at high frequencies. Changes in the pattern of calling frequency could provide valuable indicators of poultry health.

Recording courtesy of Dr. Katherine Herborn, Dr. Alan McElligott (QMUL), and Dr. Lucy Asher (Newcastle University).

4. Using acoustic indices at all time-scales

Slide 29. [Towsey, M., Truskinger, A. & Roe, P. (2015)](#fn:ZOO)

In previous slides we demonstrated the use of acoustic indices calculated at one-minute resolution to reveal the acoustic structure in a 24-hour soundscape. In Slides 26 and 27, we showed the use of acoustic indices calculated at finer resolution (0.1 seconds) to detect the clicks of a sperm whale. The resolution of a standard spectrogram (as seen, for example, in Slide 2) is around 0.01 seconds per frame. In fact, it is possible to calculate acoustic indices at multiple time scales and use them to span the entire temporal scale from years and months down to milliseconds. Some indices reveal acoustic structure better at coarse resolution (for example, ACI) and others reveal structure better at fine resolution. We have developed techniques [5] in our lab to “stitch” all these different scale images together to make a pyramid stack, as shown in Slide 29. But much more useful is that we are now able to “zoom” in and out of long-duration recordings in the same way that Google Maps allows you to zoom in and out from planet Earth. This makes it easy to navigate recordings of arbitrary duration. Try a demo of this concept for yourself on a 24-hour recording by going to http://www.ecosounds.org/Zoom/zoom.
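
To give a feel for why a pyramid of scales is needed, the following sketch counts how many image columns a 24-hour recording requires at the three resolutions mentioned above (one minute, 0.1 seconds, and 0.01 seconds per column). It is simple arithmetic only. The function name `pyramid_widths` is hypothetical, and a real pyramid would contain intermediate levels that are not listed here.

```python
def pyramid_widths(duration_ms, scales_ms):
    """Image width (in columns) needed at each time scale.

    duration_ms -- recording length in milliseconds
    scales_ms   -- milliseconds represented by one column at each level
    Integer milliseconds are used so that the division is exact.
    """
    # Ceiling division: a partly filled final column still needs a pixel.
    return {scale: -(-duration_ms // scale) for scale in scales_ms}

one_day_ms = 24 * 60 * 60 * 1000
widths = pyramid_widths(one_day_ms, [60_000, 100, 10])
for scale, width in widths.items():
    print(f"{scale / 1000:g} s per column -> {width} columns")
```

At one minute per column the whole day fits in 1,440 columns, while at 0.01 seconds per column it needs 8,640,000, which is why only a small window of the finest level can ever be shown on screen at once.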

We are currently integrating the zooming false-colour spectrogram technology into our acoustic workbench software. Soon every audio recording hosted on ecosounds.org will benefit from the enhanced visualisation capability of zooming spectrograms.

Conclusion

Slide 30.

And finally, a picture of our lab at the Gardens Point campus of the Queensland University of Technology. This picture is taken from across the Brisbane River. Imagine the sounds in this scene. And then consider the submerged soundscape within the Brisbane River.

We are at an exciting stage in our research. Ecoacoustics is a rapidly developing field with lots of potential and diverse applications. We tremendously enjoy working with our collaborators. If you see other possible applications for the visualisation and use of acoustic recordings, please get in touch via our contact page.

References

  1. Pieretti, N., Farina, A. & Morri, D. (2011). A new methodology to infer the singing activity of an avian community: The Acoustic Complexity Index (ACI). Ecological Indicators, 11(3), 868-873, http://dx.doi.org/10.1016/j.ecolind.2010.11.005

  2. Towsey, M. et al. (2014). Visualization of long-duration acoustic recordings of the environment. Proceedings of the International Conference on Computational Science (ICCS 2014), Cairns, Australia, 9-12 June 2014, http://dx.doi.org/10.1016/j.procs.2014.05.063

  3. Sankupellay, M., Towsey, M. W., Truskinger, A. & Roe, P. (2015). Visual fingerprints of the acoustic environment: The use of acoustic indices to characterise natural habitats. IEEE International Symposium on Big Data Visual Analytics (BDVA 2015), Hobart, Tas, 22-25 September 2015, http://dx.doi.org/10.1109/BDVA.2015.7314306

  4. Zhang, L., Towsey, M., Zhang, J. & Roe, P. (2015). Computer-assisted Sampling of Acoustic Data for More Efficient Determination of Bird Species Richness. IEEE International Conference on Data Mining workshop Environmental Acoustic Data Mining, Atlantic City, NJ, USA, 14-17 November, 2015, http://dx.doi.org/10.1109/ICDMW.2015.42

  5. Towsey, M., Truskinger, A. & Roe, P. (2015). The Navigation and Visualisation of Environmental Audio using Zooming Spectrograms. IEEE International Conference on Data Mining, Workshop on Environmental Acoustic Data Mining, Atlantic City, NJ, USA, 14-17 November, 2015, http://eprints.qut.edu.au/91822/