Pulse Source of Excitation in Speech Signal


Abstract

The properties of speech bursts at the release of a closure are studied using a database of 39 speakers that contains single-digit and multi-digit numerals, with signals recorded in parallel through a telephone handset and a directional microphone. Speech bursts are detected by short-term and long-term detectors of spectral-temporal inhomogeneities, and by a detector based on a similarity measure between the eigenfunctions of the consonant burst spectrum and the current spectrum of the speech burst. The probability that a closure is voiced or voiceless is estimated in the spaces of the amplitude spectrum and the group delay spectrum from the ratio of energy in the high- and low-frequency ranges. The place of articulation of a back-lingual consonant affects the probability distributions of four quantities: the duration of the interval between the onset of the speech burst and the onset of the vowel, the frequency of the maximum-amplitude peak in the high-frequency region, the ratio of energy in the high- and low-frequency regions of the speech burst spectrum, and the similarity measures between the eigenfunctions of the consonant burst spectrum and the current spectrum of the speech burst.
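The energy-ratio feature mentioned in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. The band edges (0.8-3 kHz and 3-6 kHz) are borrowed from the caption of Fig. 13, while the 16 kHz sampling rate, the 256-sample frame, the high/low direction of the ratio, and the function names `band_energy`, `high_low_ratio`, and `tone` are all assumptions made for illustration. Only the amplitude spectrum is covered; the group delay spectrum is not.

```python
import cmath
import math


def band_energy(frame, sample_rate, f_lo, f_hi):
    """Sum of squared DFT magnitudes over bins with f_lo <= frequency < f_hi."""
    n = len(frame)
    energy = 0.0
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if f_lo <= freq < f_hi:
            # Naive DFT coefficient for bin k (adequate for short frames).
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(frame))
            energy += abs(coeff) ** 2
    return energy


def high_low_ratio(frame, sample_rate, low_hz=800.0, split_hz=3000.0, high_hz=6000.0):
    """Energy in [split_hz, high_hz) divided by energy in [low_hz, split_hz).

    The band edges are illustrative assumptions, not values confirmed by the paper.
    """
    low = band_energy(frame, sample_rate, low_hz, split_hz)
    high = band_energy(frame, sample_rate, split_hz, high_hz)
    return high / low if low > 0 else math.inf


def tone(freq_hz, sample_rate=16000, n=256):
    """Hypothetical test signal: a pure sine of the given frequency."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]
```

With these assumptions, a frame dominated by energy above 3 kHz gives a ratio greater than one, and a frame dominated by energy below 3 kHz gives a ratio less than one. A real analysis would apply a window to the frame and use an FFT routine instead of the naive transform.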


About the authors

V. Sorokin

Institute for Information Transmission Problems of the Russian Academy of Sciences

Author for correspondence.
Email: vns@iitp.ru
Russian Federation, Moscow


Supplementary files

Fig. 1. Area Svt (-), Svs (-); pressure Puvs on voiceless (-∙-) and voiced (-∙-) closures; pressure Pvt on voiceless (-) and voiced (-) closures; velocity V on voiceless (-) and voiced (-) closures; pulse source W on voiceless (-) and voiced (-) closures
Fig. 2. Sound sequence /apatakA/. (a) From top to bottom: speech signal, total response of the short-term dynamic detector, sonogram of the speech signal; the dotted line marks the beginning of the speech burst. (b) Three-dimensional sonogram of the short-term dynamic detector
Fig. 3. Distributions of the difference between the detector peak position d1 and the onset time of the pulse source from manual marking
Fig. 4. Semi-automatic estimation of the onset of the pulsed speech burst on the segments /loudspeaker-vowel/
Fig. 5. Mismatch between the position of maximum similarity of the eigenfunctions P!, Th, Kh and the manually marked onset of the speech burst
Fig. 6. Automatic estimation of the burst onset in the word /sto/
Fig. 7. Distributions of ∆Tvot for speech bursts of voiceless and voiced consonants; manual marking
Fig. 8. Distribution of ∆Tvot for voiceless plosive consonants, based on manual marking
Fig. 9. Distributions of the relative energy of the pulse bursts T! and D! in the amplitude spectrum; directional microphone
Fig. 10. Amplitude and phase relations for voiceless and voiced speech bursts. The abscissa is the relative energy of the amplitude spectrum; the ordinate is the relative energy of the group delay spectrum
Fig. 11. Spectra of voiced and voiceless speech bursts: short-term dynamic detector (-); burst (---)
Fig. 12. Distribution of the frequency of the maximum spectral peak of the speech burst for voiceless plosives
Fig. 13. Distribution of the ratio of the average energy in the 0.8-3 kHz band to the energy in the 3-6 kHz band
Fig. 14. Similarity measure between the eigenfunctions of the speech burst spectrum Ph (-) and the spectra of the speech bursts Th and Kh
Fig. 15. Similarity measure between the eigenfunctions of the short-term detector spectrum of the speech burst Kh (-) and the short-term detector spectra of the speech bursts Ph and Th

Copyright © The Russian Academy of Sciences, 2024