The difference between being heard and understood

Being heard is not hard. Even toddlers are experts on the matter, but that does not mean they always make themselves understood. This important distinction between a sound being audible and it being intelligible is what defines efficient and intelligent communication solutions. We’d like to share our view on intelligibility: what to look for and how to obtain it.

Signal vs noise

For your message to be understood, it first has to stand out. This is the one thing that almost all babies get right: making sure their message stands out from the surrounding background noise. This is where the signal-to-noise ratio, or SNR, comes in. In audio communication, a good SNR is 10 to 15 dB. This means that the message has an average sound pressure level 10 to 15 decibels above the background noise.
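As a rough illustration of the arithmetic, the sketch below computes SNR in two ways: from levels already expressed in dB, and from RMS sound pressures. The function names and example figures are assumptions chosen for illustration, not taken from any standard.

```python
import math


def snr_from_levels(signal_db: float, noise_db: float) -> float:
    """SNR in dB when both levels are already in dB SPL: a plain difference."""
    return signal_db - noise_db


def snr_from_pressures(signal_rms_pa: float, noise_rms_pa: float) -> float:
    """SNR in dB from RMS sound pressures in pascals: 20 * log10 of the ratio."""
    if signal_rms_pa <= 0 or noise_rms_pa <= 0:
        raise ValueError("RMS pressures must be positive")
    return 20.0 * math.log10(signal_rms_pa / noise_rms_pa)


def is_good_snr(snr_db: float, minimum_db: float = 10.0) -> bool:
    """True when the SNR reaches the lower end of the 10 to 15 dB range above."""
    return snr_db >= minimum_db


# Hypothetical example: speech at 70 dB SPL over 57 dB SPL of background noise.
snr = snr_from_levels(70.0, 57.0)
print(snr, is_good_snr(snr))
```

A tenfold ratio in sound pressure corresponds to 20 dB, so a 10 to 15 dB margin means the message's sound pressure is roughly three to six times that of the noise.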

Intelligibility = being understood

But, as any parent can tell you, having a good SNR does not guarantee intelligibility. The message must be in a language that the recipient understands and be articulated clearly enough for the recipient to decipher. A person in distress will probably be able to formulate the right words and provide ample signal-to-noise, but the delivery or articulation may fail them, leading to “low speech intelligibility”.

We can actually measure this with several tests, subjective and objective, such as CIS, %ALcons and STI. The test most commonly used today is STI (Speech Transmission Index), which is represented by a score between 0.00 and 1.00. A score of 0.50 is commonly taken (by the EN 60849 standard, at least) as the thin red line between high and low intelligibility. In practice, a score of 0.50 or more means that it’s more likely that your message will be understood. An STI over 0.60 is considered good, and over 0.75 it’s excellent!
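The thresholds above can be sketched as a small lookup. This is a minimal illustration, not the rating scheme of any standard: the function name is hypothetical, and the label "acceptable" for the 0.50 to 0.60 band is an assumption, since the text only names the bands above and below it.

```python
def rate_sti(sti: float) -> str:
    """Map an STI score to the qualitative bands described in the text."""
    if not 0.0 <= sti <= 1.0:
        raise ValueError("STI is expected to lie between 0 and 1")
    if sti > 0.75:
        return "excellent"
    if sti > 0.60:
        return "good"
    if sti >= 0.50:
        return "acceptable"  # assumed label for the 0.50 to 0.60 band
    return "low"


for score in (0.42, 0.50, 0.65, 0.80):
    print(f"{score:.2f} -> {rate_sti(score)}")
```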

So, how does one achieve a high STI?

As discussed, the first requisite is a sufficient sound pressure level, or SPL, ideally 15 dB above the background noise. Secondly, the STI procedure tests the signal-to-noise ratio in different frequency bands. To accurately represent the human voice, the communication device must be able to reproduce frequencies from 125 Hz to 8 kHz. The STI test signal is modulated, and any echo or reverberation will blur that modulation, reducing both intelligibility and STI scores. The higher the energy (SPL) of the signal and the more reflective the surfaces are, the more intense the reverberations will be and the longer it will take them to die out.

An excessively high SPL can actually be detrimental to STI, both through increased reverberation and through an effect called level-dependent auditory masking: when we are exposed to loud sounds, muscles in the middle ear contract and tense up to protect the ear from damage. In addition, the brain equalizes very quiet and very loud sounds, a phenomenon known as compression. The result of both of these mechanisms is that intelligibility is reduced at very high SPL.

If you have ever put earplugs (or fingers) in your ears at a concert, you’ve probably noticed how much “clearer” the sound becomes. This is accounted for in the STI standard (IEC 60268-16), and the maximum STI is achieved at around 70-80 dB. For areas with a static background noise level, the output level can be set during installation. For areas where noise levels vary over time (street, school corridor, train platform), AVC, or Automatic (or Adaptive) Volume Control, can ensure that the output stays in the ideal range for optimal STI.
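To make the idea of AVC more concrete, here is a minimal sketch of one possible control loop. It assumes the device can measure ambient noise in dB SPL, smooths those readings so the volume does not jump with every passing sound, and aims for a fixed margin above the noise within the limits of the hardware. The class name, the smoothing factor and the default limits are all assumptions for illustration; this is not a description of how any particular product implements AVC.

```python
class AutoVolumeControl:
    """Toy AVC: track ambient noise and aim for a fixed margin above it."""

    def __init__(self, margin_db: float = 15.0, min_out_db: float = 50.0,
                 max_out_db: float = 105.0, smoothing: float = 0.2):
        self.margin_db = margin_db      # desired SNR above the noise estimate
        self.min_out_db = min_out_db    # assumed lowest useful output level
        self.max_out_db = max_out_db    # assumed hardware or safety ceiling
        self.smoothing = smoothing      # 0..1, higher reacts faster
        self._noise_db = None           # running noise estimate in dB SPL

    def update(self, measured_noise_db: float) -> float:
        """Feed one noise reading and return the target output level in dB SPL."""
        if self._noise_db is None:
            self._noise_db = measured_noise_db
        else:
            # Exponential smoothing of the noise estimate.
            self._noise_db += self.smoothing * (measured_noise_db - self._noise_db)
        target = self._noise_db + self.margin_db
        # Clamp the target to the range the device can deliver.
        return max(self.min_out_db, min(self.max_out_db, target))


avc = AutoVolumeControl()
for reading in (35.0, 60.0, 90.0, 90.0):
    print(f"noise {reading:.0f} dB -> target {avc.update(reading):.1f} dB")
```

The smoothing is a design choice: a faster response tracks sudden noise (a passing train) more closely, while a slower one avoids audible pumping of the volume.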

How can you ensure intelligibility? 


The key qualities in any communication solution are: 

  • At minimum, a wide-range (125 Hz to 7 kHz) frequency response. This is often marketed as “HD Voice” or “Wideband audio”. Ideally, it should be full-range (20 Hz to 20 kHz) where possible.
  • Sufficient sound pressure to overcome background noise. Depending on your environment, background noise levels can range from as low as 35 dB (a library) to over 90 dB (a busy street or an industrial area). This means an actual output of 50-105 dB at normal listening distance, depending on the application.
  • A low-distortion audio path from transmitter to receiver, via amplifier and loudspeaker. With digital and IP technology, the transport is rarely an issue, but the microphone, amplifier, loudspeaker and enclosure should be designed for low distortion at all relevant frequencies and SPLs.
  • Technologies such as AVC and ANR (adaptive noise reduction), which help maximize intelligibility by providing the right SPL and reducing the ambient noise on the transmitting end before it’s sent to the receiver.
  • Most important, though, is proper placement and acoustic room treatment to reduce reverberation and provide even, homogeneous coverage. For two-way communications, this is usually not as critical as it is for announcements or public address applications covering large areas.

To learn more about how Zenitel achieves high intelligibility in demanding environments, check out our showcase on Vingtor-Stentofon Turbine.