Linear prediction coefficients correction method for digital speech processing systems with data compression based on the autoregressive model of a voice signal


Abstract

The problem of distortion of the autoregressive model of a voice signal under additive background noise is considered for digital speech processing systems with data compression based on linear prediction. In the frequency domain, these distortions manifest as attenuation of the main formants responsible for the intelligibility of the speaker’s speech. To compensate for the formant attenuation, it is proposed to modify the parameters of the autoregressive model (the linear prediction coefficients) using the impulse response of a recursive shaping filter. While the formant amplitudes are amplified, their frequencies remain unchanged so that the speaker’s voice stays recognizable. The effectiveness of the method was studied experimentally using specially developed software. Based on the experimental results, it is concluded that the relative level of the formants in the power spectrum of the corrected voice signal increases significantly.
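A minimal Python sketch of the general idea described in the abstract, under stated assumptions rather than as the paper's exact method: the linear prediction coefficients (LPC) of a noisy frame are modified so that the poles of the recursive shaping filter 1/A(z) move radially toward the unit circle, which amplifies the formant peaks of the spectral envelope while leaving the formant frequencies (the pole angles) unchanged. The radial scaling rule a_k -> gamma^k * a_k used below is a standard bandwidth-modification device, not the authors' correction based on the impulse response of the shaping filter; the sampling rate, LPC order, test formants, SNR, and gamma are illustrative choices.

import numpy as np
from scipy.signal import lfilter

FS = 8000   # sampling rate, Hz (assumed)
P = 10      # linear prediction order (assumed)


def lpc_autocorr(x, order):
    # LPC coefficients a_0..a_p (a_0 = 1) via the autocorrelation method
    # (Levinson-Durbin recursion).
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a


def envelope_db(a, nfft=512):
    # Spectral envelope of the AR (shaping) filter 1/A(z) in dB.
    return -20.0 * np.log10(np.abs(np.fft.rfft(a, nfft)) + 1e-12)


# Crude vowel-like AR frame plus additive background noise (illustrative).
rng = np.random.default_rng(0)
poles = []
for f_hz in (700, 1200, 2600):                 # rough /a/-like formants, Hz
    w = 2.0 * np.pi * f_hz / FS
    poles += [0.97 * np.exp(1j * w), 0.97 * np.exp(-1j * w)]
a_true = np.real(np.poly(poles))               # "clean" AR coefficients
clean = lfilter([1.0], a_true, rng.standard_normal(2000))
noise = rng.standard_normal(len(clean))
noisy = clean + noise * np.std(clean) / np.std(noise)   # SNR = 0 dB

# LPC from the noisy frame, then radial pole scaling a_k -> gamma**k * a_k.
a_noisy = lpc_autocorr(noisy, P)
gamma = 1.02          # gamma > 1 pushes poles toward the unit circle;
                      # it must be kept small enough that gamma*|pole| < 1
a_corr = a_noisy * gamma ** np.arange(P + 1)

boost = envelope_db(a_corr) - envelope_db(a_noisy)
print("peak envelope gain after correction: %.1f dB" % boost.max())

Because only the coefficient vector is touched, such a correction could in principle be applied on the decoder side of an LPC-based codec without changing the transmitted bitstream; this is a design observation about the sketch, not a claim from the paper.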

About the authors

V. Savchenko

Editorial office of the journal “Radio Engineering and Electronics”

Corresponding author
Email: vvsavchenko@yandex.ru
Russia, Mokhovaya St., 11, bldg. 7, Moscow, 125009

L. Savchenko

National Research University Higher School of Economics

Email: vvsavchenko@yandex.ru
Russia, B. Pecherskaya St., 25, Nizhny Novgorod, 603155


Supplementary files

1. JATS XML
2. Fig. 1. Estimate of the power spectral density (PSD) envelope (3) of the vowel phoneme “a” signal at SNR q² equal to 0 (1), 10 (2), and 20 dB (3).

3. Fig. 2. Estimates of the linear prediction coefficients (LPC) of the phoneme “a” signal at SNR q² equal to 0 (1), 10 (2), and 20 dB (3), compared with the LPC vector in the absence of noise (dotted line).

4. Fig. 3. Impulse response (5) of the shaping filter (4) at SNR q² equal to 0 (1), 10 (2), and 20 dB (3).

5. Fig. 4. Corrected impulse response (6) at c = 0.01 (1), 0.03 (2), and 0.05 (3) for SNR q² = 0 dB, compared with the impulse response (5) without correction (dotted line).

6. Fig. 5. PSD envelope (3) of the synthesized voice signal at c = 0.01 (1), 0.03 (2), and 0.05 (3) for SNR q² = 0 dB and without correction (dotted line).

7. Fig. 6. Fragments of the synthesized vowel phoneme “a” signal at c = 0.01 (1), 0.03 (2), and 0.05 (3) for SNR q² = 0 dB and without correction (dotted line).

8. Fig. 7. Schuster periodogram (10) of the vowel phoneme “a” signal synthesized from the AR model (2) at c = -0.06 (solid curve) and c = 0 (dotted line).

9. Fig. 8. Schuster periodogram (10) of the fricative speech sound “w” signal synthesized from the AR model (2) at c = 0.06 (solid curve) and c = 0 (dotted line).


Copyright © Russian Academy of Sciences, 2024