Vestnik Yuzhno-Ural'skogo Gosudarstvennogo Universiteta. Seriya "Vychislitelnaya Matematika i Informatika"
Vestnik Yuzhno-Ural'skogo Gosudarstvennogo Universiteta. Seriya "Vychislitelnaya Matematika i Informatika", 2022, Volume 11, Issue 2, Pages 43–58
DOI: https://doi.org/10.14529/cmse220204
(Mi vyurv277)
 

Method of an acoustic echo suppression based on recurrent neural network and clustering

G. M. Shahoud, O. L. Ibryaeva

South Ural State University (pr. Lenina 76, Chelyabinsk, 454080 Russia)
Abstract: The article addresses acoustic echo suppression using a neural network that estimates an ideal binary mask (IBM) from features extracted from a mixture of near-end and far-end signals. The novelty of the proposed method lies in the use of a clustering algorithm in addition to the bidirectional recurrent neural network (BLSTM). To evaluate the EM, Mean-Shift, and k-Means clustering algorithms, the models were trained and tested on the TIMIT database, and the ERLE, PESQ, and STOI metrics were calculated for each model to characterize its quality. At a signal-to-echo ratio of 10 dB, the EM and Mean-Shift clustering algorithms proved ineffective compared with the BLSTM alone. At a signal-to-echo ratio of 6 dB, BLSTM+Mean-Shift gave a marginal improvement in the PESQ metric over the BLSTM. The experiments show that the proposed model, which combines the BLSTM with the k-Means algorithm, is more effective than a pure BLSTM for echo cancellation in double-talk scenarios: at a signal-to-echo ratio of 10 dB, the STOI metric, which characterizes speech intelligibility, improved by 7%, and the PESQ metric, which characterizes the quality of speech restoration, improved by 18.8%.
Keywords: ideal binary mask, near-end signal, far-end signal, bidirectional recurrent neural network, clustering, double-talk.
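For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of how an ideal binary mask and the ERLE metric are commonly computed. It is not the authors' implementation: the function names, the 0 dB threshold, and the use of magnitude spectrograms as input are assumptions made for illustration, and the abstract does not specify how the clustering step is applied to the network output.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log of zero


def ideal_binary_mask(near_mag, echo_mag, threshold_db=0.0):
    """Hypothetical helper: 1 where the near-end signal dominates the echo.

    near_mag, echo_mag: magnitude spectrograms of equal shape.
    threshold_db: local criterion in dB (0 dB is an assumed default).
    """
    ratio_db = 20.0 * np.log10((near_mag + EPS) / (echo_mag + EPS))
    return (ratio_db > threshold_db).astype(np.float32)


def erle_db(mic, residual):
    """Echo return loss enhancement in dB.

    mic: microphone signal (containing echo), residual: signal after
    suppression. Higher values mean more echo energy was removed.
    """
    mic_energy = np.sum(np.square(mic))
    residual_energy = np.sum(np.square(residual))
    return 10.0 * np.log10((mic_energy + EPS) / (residual_energy + EPS))


if __name__ == "__main__":
    near = np.array([[2.0, 0.5], [1.0, 3.0]])
    echo = np.array([[1.0, 1.0], [2.0, 1.0]])
    print(ideal_binary_mask(near, echo))

    mic = np.ones(4)
    residual = 0.1 * np.ones(4)
    print(erle_db(mic, residual))
```

In a mask-based system of the kind the abstract describes, a network would be trained to predict such a mask from the mixture, and the predicted mask would then be applied to the microphone spectrogram before resynthesis.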
Received: 01.04.2022
Document Type: Article
UDC: 004.032.26, 004.048
Language: Russian
Citation: G. M. Shahoud, O. L. Ibryaeva, “Method of an acoustic echo suppression based on recurrent neural network and clustering”, Vestn. YuUrGU. Ser. Vych. Matem. Inform., 11:2 (2022), 43–58
Citation in format AMSBIB
\Bibitem{ShaIbr22}
\by G.~M.~Shahoud, O.~L.~Ibryaeva
\paper Method of an acoustic echo suppression based on recurrent neural network and clustering
\jour Vestn. YuUrGU. Ser. Vych. Matem. Inform.
\yr 2022
\vol 11
\issue 2
\pages 43--58
\mathnet{http://mi.mathnet.ru/vyurv277}
\crossref{https://doi.org/10.14529/cmse220204}
Linking options:
  • https://www.mathnet.ru/eng/vyurv277
  • https://www.mathnet.ru/eng/vyurv/v11/i2/p43