Modelirovanie i Analiz Informatsionnykh Sistem
Modelirovanie i Analiz Informatsionnykh Sistem, 2023, Volume 30, Number 1, Pages 86–100
DOI: https://doi.org/10.18255/1818-1015-2023-1-86-100
(Mi mais792)
 

This article is cited in 1 scientific paper (total in 1 paper)

Theory of data

Annotation of text corpora by sentiment and presence of irony within a project of citizen science

I. V. Paramonov, A. Yu. Poletaev

P. G. Demidov Yaroslavl State University, 14 Sovetskaya str., Yaroslavl 150003, Russia
Abstract: The paper describes the construction of three corpora: a corpus of sentences annotated by general sentiment into four classes (positive, negative, neutral, and mixed), a corpus of phrasemes annotated by sentiment into three classes (positive, negative, and neutral), and a corpus of sentences annotated by the presence or absence of irony. The annotation was performed by volunteers within the project “Prepare texts for algorithms” on the portal “People of science”.
Guidelines for annotators were developed from the existing domain knowledge about each task. A technique for statistical analysis of the annotation results was also developed, based on the distributions of the annotations made by different annotators and on measures of agreement between them. For the annotation of sentences by irony and of phrasemes by sentiment, agreement was rather high (full agreement rate of 0.60–0.99), whereas for the annotation of sentences by general sentiment it was low (full agreement rate of 0.40), presumably because the task is more complex. It was also shown that the results of automatic algorithms for detecting the sentiment of sentences improved by 12–13% when they used a corpus on which all annotators (from 3 to 5) agreed, compared with a corpus annotated by only one volunteer.
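The full agreement rate mentioned in the abstract can be illustrated with a minimal sketch. It assumes the rate is the share of items on which every annotator chose the same label; the abstract does not give the exact definition. The function names and the sample labels below are hypothetical and are not taken from the paper.

```python
from typing import Hashable, List, Sequence, Tuple


def full_agreement_rate(annotations: Sequence[Sequence[Hashable]]) -> float:
    """Share of items whose annotators all assigned the same label.

    Assumed definition: an item counts as "fully agreed" when the set of
    its labels has exactly one element.
    """
    if not annotations:
        raise ValueError("no annotated items")
    agreed = sum(1 for labels in annotations if len(set(labels)) == 1)
    return agreed / len(annotations)


def keep_unanimous(
    items: Sequence[str], annotations: Sequence[Sequence[Hashable]]
) -> List[Tuple[str, Hashable]]:
    """Keep only the items on which all annotators agreed, with their label."""
    return [
        (item, labels[0])
        for item, labels in zip(items, annotations)
        if len(set(labels)) == 1
    ]


# Hypothetical data: five sentences, each labelled by 3 to 5 annotators.
sample_items = ["s1", "s2", "s3", "s4", "s5"]
sample_labels = [
    ["positive", "positive", "positive"],
    ["negative", "negative", "neutral"],
    ["mixed", "mixed", "mixed", "mixed"],
    ["neutral", "positive", "neutral", "neutral", "neutral"],
    ["negative", "negative", "negative"],
]

rate = full_agreement_rate(sample_labels)
unanimous = keep_unanimous(sample_items, sample_labels)
```

Filtering with `keep_unanimous` mirrors the idea of training on a corpus where all annotators agreed; how the authors actually built that subset is not stated in the abstract.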
Keywords: sentiment analysis, text corpus, statistical analysis, agreement measures, citizen science.
Funding agency
The research was performed within the YarSU citizen science project No. CS-02/2022.
Received: 03.02.2023
Revised: 24.02.2023
Accepted: 27.02.2023
Document Type: Article
UDC: 004.912
MSC: 68T50
Language: Russian
Citation: I. V. Paramonov, A. Yu. Poletaev, “Annotation of text corpora by sentiment and presence of irony within a project of citizen science”, Model. Anal. Inform. Sist., 30:1 (2023), 86–100
Citation in format AMSBIB
\Bibitem{ParPol23}
\by I.~V.~Paramonov, A.~Yu.~Poletaev
\paper Annotation of text corpora by sentiment and presence of irony within a project of citizen science
\jour Model. Anal. Inform. Sist.
\yr 2023
\vol 30
\issue 1
\pages 86--100
\mathnet{http://mi.mathnet.ru/mais792}
\crossref{https://doi.org/10.18255/1818-1015-2023-1-86-100}
Linking options:
  • https://www.mathnet.ru/eng/mais792
  • https://www.mathnet.ru/eng/mais/v30/i1/p86