(see AUDIO FILES for information on the *.wav files)
This demonstration is based on the work of Shannon et al. (see
references below). The speech utterance of the numbers "1 2
3" is first filtered by five gamma-tone filters (each with a
Q of 10) centered at 200 Hz, 400 Hz, 800 Hz, 1600 Hz, and 3200
Hz. The envelope of the filtered speech waveform at the output of
each filter is extracted by rectifying the filtered waveform and
low-pass filtering at 100 Hz. A broadband noise is then filtered
with the same five gamma-tone filters that were used to filter
the speech waveform. Each of these filtered noises is then
multiplied by the envelope extracted from the respective filtered
speech waveform. That is, the 200-Hz speech envelope is
multiplied by the 200-Hz filtered noise, the 400-Hz
envelope by the 400-Hz filtered noise, and so on. The figure shows this
signal processing and generation procedure for two filter bands: a
low-frequency band (low CF) shown at top and a high-frequency
band (high CF) shown at bottom. The two envelope-modulated
noise bands could be added together or presented in isolation.
Clicking on 1-2-3 plays the original speech waveform.
Clicking on 200-Hz Band plays the 200-Hz band of noise
modulated by the envelope of the speech waveform filtered at
200 Hz.
Clicking on 200+400-Hz Bands plays the 200-Hz band of noise
modulated by the envelope of the speech waveform filtered at
200 Hz, plus the 400-Hz band of noise multiplied by its
respective envelope.
Clicking on 200+400+800-Hz Bands plays the 200-Hz band of noise
modulated by the envelope of the speech waveform filtered at
200 Hz, plus the 400-Hz and 800-Hz bands of noise, each
multiplied by its respective envelope.
Clicking on 200+400+800+1600-Hz Bands plays the 200-Hz band of
noise modulated by the envelope of the speech waveform filtered
at 200 Hz, plus the 400-Hz, 800-Hz, and 1600-Hz bands of noise,
each multiplied by its respective envelope.
Clicking on 200+400+800+1600+3200-Hz Bands plays the 200-Hz band
of noise modulated by the envelope of the speech waveform
filtered at 200 Hz, plus the 400-Hz, 800-Hz, 1600-Hz, and 3200-Hz
bands of noise, each multiplied by its respective envelope.
You will notice that the intelligibility of the speech-like sound increases as the number of bands that are added together increases. The five-band condition (200+400+800+1600+3200-Hz) produces a highly intelligible version of "1-2-3."
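The processing chain and the five cumulative band conditions described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the code used to make the demonstration: the sample rate, the filter order, the impulse-response length, the use of full-wave rectification, the windowed-sinc low-pass filter, and all function names are assumptions. The gamma-tone decay rate is set to CF/Q, which only approximates a Q of 10.

```python
import numpy as np

FS = 16000                                  # sample rate in Hz (assumed; not given in the text)
CENTER_FREQS = [200, 400, 800, 1600, 3200]  # band center frequencies from the text
Q = 10                                      # filter Q from the text


def gammatone_ir(cf, fs=FS, q=Q, order=4, dur=0.2):
    """FIR approximation of a gamma-tone impulse response centered at cf (order assumed)."""
    t = np.arange(int(dur * fs)) / fs
    b = cf / q  # decay rate in Hz, taken as CF / Q (an approximation)
    ir = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * cf * t)
    return ir / np.sqrt(np.sum(ir ** 2))  # normalize to unit energy


def lowpass_ir(cutoff=100.0, fs=FS, taps=801):
    """Windowed-sinc low-pass FIR with unit gain at DC (filter type assumed)."""
    n = np.arange(taps) - (taps - 1) / 2
    h = np.sinc(2 * cutoff / fs * n) * np.hamming(taps)
    return h / h.sum()


def band_filter(x, cf):
    """Filter x with the gamma-tone band centered at cf, keeping the input length."""
    return np.convolve(x, gammatone_ir(cf))[: len(x)]


def envelope(band):
    """Rectify (full-wave, assumed) and low-pass filter at 100 Hz."""
    return np.convolve(np.abs(band), lowpass_ir(), mode="same")


def modulated_band(speech, noise, cf):
    """Noise band at cf multiplied by the envelope of the speech band at cf."""
    return envelope(band_filter(speech, cf)) * band_filter(noise, cf)


def cumulative_conditions(speech, noise, cfs=CENTER_FREQS):
    """Build the listening conditions by adding bands one at a time, low to high."""
    conditions, total = {}, np.zeros(len(speech))
    for k, cf in enumerate(cfs, start=1):
        total = total + modulated_band(speech, noise, cf)
        conditions["+".join(str(c) for c in cfs[:k]) + "-Hz"] = total
    return conditions
```

Here `speech` and `noise` are assumed to be equal-length one-dimensional arrays sampled at `FS`, longer than the 801-tap low-pass filter. The returned dictionary is keyed by labels such as `"200-Hz"` and `"200+400-Hz"`, mirroring the conditions listed above.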
Clicking on Altered Band plays the waveform generated when the
bands used to filter the speech differ from those used to filter
the noise. In particular, the center frequencies of the bands
used to filter the noise were lowered one octave below those used
to filter the speech waveform. That is, the envelope extracted
from the 200-Hz band of speech was multiplied by a band of noise
filtered at a center frequency of 100 Hz, the 400-Hz envelope was
multiplied by a band of noise centered at 200 Hz, and so on,
until the 3200-Hz envelope was multiplied by a band of noise
centered at 1600 Hz. All five bands of noise were then added
together, as in the 200+400+800+1600+3200-Hz Bands condition.
The Altered Band example amounts to shifting the speech envelopes
downward in frequency. One might do this as a possible hearing
aid strategy for a person with good low-frequency hearing and
poor high-frequency hearing. However, as you can hear,
intelligibility is very poor in the Altered Band condition. The
research done to date using this technique suggests that almost
anything that dissociates the envelope extracted from a
particular band of speech from the corresponding band of noise
used for the multiplication will significantly lower
intelligibility.
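The Altered Band mapping, in which each speech envelope modulates a noise band one octave lower, can be sketched as follows. As before, this is a hypothetical illustration rather than the original processing code: the sample rate, filter order, impulse-response length, rectification type, low-pass design, and the names `carrier_cf` and `altered_band` are all assumptions.

```python
import numpy as np

FS = 16000  # sample rate in Hz (assumed; not given in the text)


def gammatone_ir(cf, fs=FS, q=10, order=4, dur=0.3):
    """FIR approximation of a gamma-tone impulse response centered at cf (order assumed)."""
    t = np.arange(int(dur * fs)) / fs
    ir = t ** (order - 1) * np.exp(-2 * np.pi * (cf / q) * t) * np.cos(2 * np.pi * cf * t)
    return ir / np.sqrt(np.sum(ir ** 2))  # normalize to unit energy


def band_filter(x, cf):
    """Filter x with the gamma-tone band centered at cf, keeping the input length."""
    return np.convolve(x, gammatone_ir(cf))[: len(x)]


def envelope(band, cutoff=100.0, fs=FS, taps=801):
    """Full-wave rectify (assumed) and low-pass at 100 Hz with a windowed-sinc FIR."""
    n = np.arange(taps) - (taps - 1) / 2
    h = np.sinc(2 * cutoff / fs * n) * np.hamming(taps)
    return np.convolve(np.abs(band), h / h.sum(), mode="same")


def carrier_cf(speech_cf, shift_octaves=-1.0):
    """Center frequency of the noise band paired with a given speech band."""
    return speech_cf * 2.0 ** shift_octaves  # -1 octave: 200 -> 100, ..., 3200 -> 1600


def altered_band(speech, noise, speech_cfs, shift_octaves=-1.0):
    """Sum of noise bands, each modulated by the envelope of a shifted speech band."""
    out = np.zeros(len(speech))
    for cf in speech_cfs:
        env = envelope(band_filter(speech, cf))                      # envelope from speech at cf
        carrier = band_filter(noise, carrier_cf(cf, shift_octaves))  # noise one octave lower
        out += env * carrier
    return out
```

Setting `shift_octaves=0.0` pairs each envelope with the noise band at the same center frequency, which corresponds to the unaltered five-band condition.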
Suggested References:
Shannon, R.V., Zeng, F., Kamath, V., Wygonski, J., Ekelid, M., Speech Recognition with Primarily Temporal Cues, Science 270, 303-304, 1995
Grant, K.W., Braida, L.D., Renn, R.J., Single Band Amplitude Envelope Cues as an Aid to Speechreading, Quarterly Journal of Experimental Psychology 43(A), 621-645, 1991