07/05/06

Figure 6: A real-ear directional test on an adaptive instrument with active compression and noise reduction. (Directional test not available in RM500SL)

3.2 Real-speech signal analysis

One of the most widely used measures of a speech signal is the long-term average speech spectrum (LTASS). This is a 1/3 octave spectrum averaged over a portion of the speech material long enough to provide a stable curve. In practice, a 10-second average meets this requirement and, for this reason, all Speechmap passages are at least 10 seconds long.
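As a rough illustration of long-term averaging, the sketch below computes the long-term level of a single 1/3 octave band from a list of short-term band levels in dB. It assumes the average is taken on power rather than on the dB values themselves; the text does not state which the instruments use, and the function name is hypothetical.

```python
import math


def ltass_band_level(short_term_levels_db):
    """Long-term average level (dB) for one 1/3 octave band.

    Assumption: levels are averaged as power, not as dB values.
    """
    if not short_term_levels_db:
        raise ValueError("need at least one short-term level")
    # Convert each dB level to relative power, average, convert back to dB.
    powers = [10 ** (level / 10) for level in short_term_levels_db]
    mean_power = sum(powers) / len(powers)
    return 10 * math.log10(mean_power)
```

Repeating this for every 1/3 octave band gives one curve for the whole passage. Because the average is taken on power, the higher short-term levels dominate the result.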

The dynamic nature of speech is often characterized by the distribution of short-term levels in each 1/3 octave band. These levels are determined by calculating a spectrum for each of a series of short time periods within the passage. Historically, time periods of 120, 125 or 128 ms have been used. The Verifit and RM500SL use a 128 ms time period, resulting in 100 levels (or samples) in each 1/3 octave band for a 12.8 second passage.

The level in each band that is exceeded by 1% of the samples (called L1, the 1st or 99th percentile) has historically been referred to as the speech peak for that band. The curve for these 1% levels is approximately 12 dB above the LTASS. The level in each band that is exceeded by 70% of the samples (called L70, the 70th or 30th percentile) has historically been called the valley of speech for that band. The curve for these 70% levels is approximately 18 dB below the LTASS. The region between these two curves is often called the speech region, speech envelope or speech “banana”.

The speech envelope, when derived in this way, has significance in terms of both speech detection and speech understanding. Generally, speech will be detectable if the 1% level is at or near threshold. The Speech Intelligibility Index (SII) is maximized when the entire speech envelope (idealized as a 30 dB range) is above (masked) threshold. This will not be an SII of 100% (or 1) because of loudness distortion factors, but higher SII values will not produce significantly higher scores on most test material. The speech-reception threshold (SRT) is attained when the LTASS is approximately at threshold, depending on the test material and the individual. These scenarios are shown in Figures 7 to 9, which follow.
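The percentile analysis described above can be sketched as follows for a single 1/3 octave band. This is a minimal illustration, not the instruments' actual algorithm: the function names are hypothetical, and the simple rank-based rule used to pick the level "exceeded by" a given percentage of samples is an assumption.

```python
def count_windows(passage_ms, window_ms=128):
    """Number of whole analysis windows that fit in the passage."""
    return passage_ms // window_ms


def level_exceeded_by(levels_db, percent):
    """Level (dB) that is exceeded by `percent` of the samples.

    Assumption: a simple rank rule on the samples sorted from
    highest to lowest; other percentile conventions differ slightly.
    """
    if not levels_db:
        raise ValueError("need at least one short-term level")
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    ordered = sorted(levels_db, reverse=True)
    # Number of samples that lie above the returned level.
    index = min(int(percent * len(ordered) / 100), len(ordered) - 1)
    return ordered[index]


def speech_envelope(levels_db):
    """Return (L1, L70): the speech peak and speech valley for one band."""
    return level_exceeded_by(levels_db, 1), level_exceeded_by(levels_db, 70)
```

For a 12.8 second passage, `count_windows(12800)` gives the 100 samples per band mentioned above, so L1 is the level with one sample above it and L70 the level with 70 samples above it.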

Analysis methods that use shorter time periods produce higher peak levels and significantly different speech envelopes. To produce results that can be directly compared to measures of threshold (and UCL), the analysis time period needs to approximate the integration time of the ear. Although this varies with frequency and between individuals, a value between 100 and 200 ms is likely. The Verifit and RM500SL use a

07/05/06	© Etymonic Design Incorporated, 41 Byron Ave., Dorchester, ON, Canada N0L 1G0	Page 6
	USA 800-265-2093 519-268-3313 FAX 519-268-3256 www.audioscan.com

