Modelirovanie i Analiz Informatsionnykh Sistem
Modelirovanie i Analiz Informatsionnykh Sistem, 2018, Volume 25, Number 4, Pages 435–458
DOI: https://doi.org/10.18255/1818-1015-2018-4-435-458
(Mi mais640)
 

This article is cited in 1 scientific paper (total in 1 paper)

Thesauri

Russian-language thesauri: automated construction and application for natural language processing tasks

N. S. Lagutina, K. V. Lagutina, A. S. Adrianov, I. V. Paramonov

P.G. Demidov Yaroslavl State University, 14 Sovetskaya str., Yaroslavl, 150003, Russia
Abstract: The paper reviews existing Russian-language thesauri in digital form, together with methods for their automatic construction and application. The authors analyzed the main characteristics of open-access thesauri for scientific research and evaluated their development trends and their effectiveness in solving natural language processing tasks. They studied statistical and linguistic methods of thesaurus construction that make it possible to automate development and reduce the labor costs of expert linguists. In particular, the authors considered algorithms for extracting keywords and semantic thesaurus relationships of all types, as well as the quality of the thesauri generated with these tools. To illustrate the features of various methods for constructing thesaurus relationships, the authors developed a combined method that generates a specialized thesaurus fully automatically from a text corpus in a particular domain and several existing linguistic resources. Using the proposed method, they conducted experiments with two Russian-language text corpora from two subject areas: articles about migrants and tweets. The resulting thesauri were evaluated with an integrated assessment, developed in the authors' previous study, that makes it possible to analyze various aspects of a thesaurus and the quality of the generation methods. The analysis revealed the main advantages and disadvantages of the various approaches to thesaurus construction and to the extraction of semantic relationships of different types, and made it possible to determine directions for future study.
Keywords: thesaurus, semantic relationships, automatic thesaurus construction, automatic relationship extraction, keyword extraction.
Received: 01.08.2018
Document Type: Article
UDC: 004.912
Language: Russian
Citation: N. S. Lagutina, K. V. Lagutina, A. S. Adrianov, I. V. Paramonov, “Russian-language thesauri: automated construction and application for natural language processing tasks”, Model. Anal. Inform. Sist., 25:4 (2018), 435–458
Citation in format AMSBIB
\Bibitem{LagLagAdr18}
\by N.~S.~Lagutina, K.~V.~Lagutina, A.~S.~Adrianov, I.~V.~Paramonov
\paper Russian-language thesauri: automated construction and application for natural language processing tasks
\jour Model. Anal. Inform. Sist.
\yr 2018
\vol 25
\issue 4
\pages 435--458
\mathnet{http://mi.mathnet.ru/mais640}
\crossref{https://doi.org/10.18255/1818-1015-2018-4-435-458}
\elib{https://elibrary.ru/item.asp?id=35452930}
Linking options:
  • https://www.mathnet.ru/eng/mais640
  • https://www.mathnet.ru/eng/mais/v25/i4/p435