HP RM500SL manual FastFacts 19.5 Speech signal analysis

Models: RM500SL


19.5 Speech signal analysis

One of the most widely used measures of a speech signal is the long-term average speech spectrum (LTASS). This is a 1/3 octave spectrum averaged over a portion of the speech material long enough to provide a stable curve. In practice a 10-second average meets this requirement and, for this reason, all RM500SL passages are at least 10 seconds long.
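The averaging step can be sketched in a few lines of Python. This is only an illustration, not the RM500SL's own algorithm: the text does not say how the average is formed, so the sketch assumes the band levels have already been measured in a series of short frames and that the frames are combined as a power (rms) average, the usual convention for long-term spectra. The function names and the dictionary layout are hypothetical.

```python
import math


def long_term_level_db(frame_levels_db):
    """Power-average short-term levels (dB) for one band into one long-term level (dB).

    Assumption: an rms/power average, not an arithmetic mean of the dB values.
    """
    if not frame_levels_db:
        raise ValueError("need at least one frame level")
    # Convert each dB level to relative power, average, convert back to dB.
    mean_power = sum(10 ** (level / 10) for level in frame_levels_db) / len(frame_levels_db)
    return 10 * math.log10(mean_power)


def ltass_db(band_frames_db):
    """Apply the long-term average to every band.

    band_frames_db maps a band centre frequency (Hz) to its list of frame levels (dB).
    """
    return {band: long_term_level_db(levels) for band, levels in band_frames_db.items()}
```

Because the average is taken over power, louder frames dominate: two frames at 50 and 60 dB average to about 57.4 dB, not 55 dB.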

The dynamic nature of speech is often characterized by the distribution of short-term levels in each 1/3 octave band. These levels are determined by calculating a spectrum for each of a series of short time periods within the passage. Historically, time periods of 120, 125 or 128 ms have been used. The RM500SL uses a 128 ms time period, resulting in 100 levels (or samples) in each 1/3 octave band for a 12.8-second passage. The level in each band that is exceeded by 1% of the samples (called either the 1st or 99th percentile) has historically been referred to as the speech peak for that band. The curve for these 1% levels is approximately 12 dB above the LTASS. The level in each band that is exceeded by 70% of the samples (called either the 70th or 30th percentile) has historically been called the valley of speech for that band. The curve for these 70% levels is approximately 18 dB below the LTASS. The region between these two curves is often called the speech region, speech envelope or speech “banana”.

The speech envelope, when derived in this way, has significance in terms of both speech detection and speech understanding. Generally, speech will be detectable if the 1% level is at or near threshold. The Speech Intelligibility Index (SII) is maximized when the entire speech envelope (idealized as a 30 dB range) is above (masked) threshold. This will not be an SII of 100% (or 1) because of loudness distortion factors, but higher SII values will not produce significantly higher scores on most test material. The speech-reception threshold (SRT) is attained when the LTASS is at threshold (approximately, depending on the test material and the individual).
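The percentile analysis described above can be illustrated with a short Python sketch. It is not the RM500SL's implementation: the function names are hypothetical, and the rank-based percentile rule (no interpolation between samples) is an assumption, since the text does not specify one. The 12 dB and 18 dB offsets in the last function are the approximate figures quoted above.

```python
def frames_in_passage(passage_ms: int, frame_ms: int = 128) -> int:
    """Number of whole analysis frames in a passage (12800 ms / 128 ms gives 100)."""
    return passage_ms // frame_ms


def exceedance_level(levels_db, percent_exceeded):
    """Level in one band that is exceeded by about `percent_exceeded` % of the samples.

    Assumption: simple rank-based rule on the sorted samples, no interpolation.
    """
    if not levels_db:
        raise ValueError("need at least one sample")
    ordered = sorted(levels_db, reverse=True)  # loudest first
    k = round(len(ordered) * percent_exceeded / 100)
    k = min(max(k, 0), len(ordered) - 1)  # keep the index in range
    return ordered[k]


def idealized_envelope(ltass_band_db):
    """Approximate (peak, valley) for one band: LTASS + 12 dB and LTASS - 18 dB."""
    return ltass_band_db + 12.0, ltass_band_db - 18.0


# Synthetic ramp of 100 levels, used only to exercise the functions (not speech data).
levels = [35 + 0.25 * i for i in range(frames_in_passage(12800))]
peak = exceedance_level(levels, 1)     # level exceeded by 1% of the samples
valley = exceedance_level(levels, 70)  # level exceeded by 70% of the samples
print(peak, valley)
```

With 100 samples per band, the 1% level is the value that exactly one sample exceeds, and the 70% level is the value that 70 samples exceed. The idealized envelope spans 12 + 18 = 30 dB, matching the 30 dB range mentioned for the SII.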


0611

RM500SL User’s Guide Version 2.8
