This article is cited in 3 scientific papers (total in 3 papers)
Method for description of multiword connectives in Supracorpora databases
O. Yu. Inkova, M. G. Kruzhkov
Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
Abstract:
This article presents a new method for describing the structure of multiword connectives, implemented in the Supracorpora database (SCDB) of connectives. The structure of connectives is currently under-investigated, and criteria for determining the boundaries of connectives and their components are lacking. The proposed method is based on the cognitive-semantic approach, which treats multiword connectives as more or less free word combinations generated in the process of speech. A two-tier faceted classification is proposed that allows annotating, on the one hand, specific tokens of connectives in texts (context annotation) and, on the other hand, the inner structure of connectives (structural annotation). The structural annotation is based on two aspects: the structural type and the structural components of connectives. Based on the proposed annotation method, a system of cross-clusters is implemented that extends the search and statistical capabilities of SCDB. In addition, the method helps researchers reduce subjectivity during annotation and fill some gaps in linguistic knowledge, for example by gathering new data on the combinatorial capabilities of Russian connectives.
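As a rough illustration of the two-tier idea described in the abstract, the following Python sketch models a context annotation (attached to one token of a connective in a text) that carries a structural annotation (structural type plus components), and then groups annotations by pairs of facets. All class names, field names, and tag values (`StructuralAnnotation`, `ContextAnnotation`, `cross_clusters`, "multiword", "causal", and so on) are hypothetical and chosen only for illustration; they are not taken from SCDB, and the grouping shown here is only one possible reading of "cross-clusters".

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StructuralAnnotation:
    """Tier 2 (hypothetical schema): inner structure of a connective."""
    structural_type: str          # illustrative label, e.g. "multiword"
    components: tuple[str, ...]   # illustrative component word forms


@dataclass(frozen=True)
class ContextAnnotation:
    """Tier 1 (hypothetical schema): one token of a connective in a text."""
    token_text: str
    context_tags: frozenset[str]  # illustrative context facets
    structure: StructuralAnnotation


def cross_clusters(annotations):
    """Group annotations by (structural type, context tag) pairs.

    This is an assumed interpretation of cross-clustering, not the
    SCDB implementation.
    """
    clusters = defaultdict(list)
    for ann in annotations:
        for tag in ann.context_tags:
            clusters[(ann.structure.structural_type, tag)].append(ann)
    return dict(clusters)


# Illustrative data only; the tags are not SCDB categories.
sample = [
    ContextAnnotation("потому что", frozenset({"causal"}),
                      StructuralAnnotation("multiword", ("потому", "что"))),
    ContextAnnotation("и поэтому", frozenset({"causal"}),
                      StructuralAnnotation("multiword", ("и", "поэтому"))),
    ContextAnnotation("но", frozenset({"adversative"}),
                      StructuralAnnotation("single-word", ("но",))),
]
clusters = cross_clusters(sample)
```

Keeping the structural annotation as a separate immutable object means the same structure description can be shared by many tokens, while queries can combine facets from both tiers.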
Keywords:
connectives, linguistic items structure, linguistic items variation, corpus linguistics, annotation, faceted classification, supracorpora databases.
Received: 05.09.2018
Citation:
O. Yu. Inkova, M. G. Kruzhkov, “Method for description of multiword connectives in Supracorpora databases”, Sistemy i Sredstva Inform., 28:4 (2018), 168–181
Linking options:
https://www.mathnet.ru/eng/ssi616 https://www.mathnet.ru/eng/ssi/v28/i4/p168