Dataflow-driven crowdsourcing: relational models and algorithms
D. A. Ustalov
N.N. Krasovskii Institute of Mathematics and Mechanics of the Ural Branch of the Russian Academy of Sciences,
Sofia Kovalevskaya str., 16, Yekaterinburg, 620990, Russia
Microtask crowdsourcing has recently become a popular approach to various data mining problems. Crowdsourcing workflows for such problems consist of several data processing stages, which require a consistent representation to make the work reproducible. This paper addresses the reproducibility and formalization of the microtask crowdsourcing process. A computational model for microtask crowdsourcing is proposed, based on an extended relational model and a dataflow computational model. The proposed collaborative dataflow computational model processes the input data sources by executing annotation stages and automatic synchronization stages simultaneously. Data processing stages and the connections between them are expressed as collaborative computation workflows represented as weakly connected directed acyclic graphs. A synchronous algorithm for executing such workflows is described. The computational model is evaluated on two tasks from computational linguistics: refining concept lexicalizations in electronic thesauri and establishing hierarchical relations between such concepts. The “Add–Remove–Confirm” procedure adds missing lexemes to concepts while removing superfluous ones. The “Genus–Species–Match” procedure establishes “is-a” relations between concepts provided with the corresponding word pairs. Experiments involving both volunteers from popular online social networks and paid workers from crowdsourcing marketplaces confirm the applicability of these procedures for enhancing lexical resources.
crowdsourcing, dataflow model, relational model, computational linguistics.
Russian Foundation for Basic Research
Russian Humanitarian Science Foundation
The reported study was funded by RFBR according to the research project no. 16-37-00354 мол_а “Adaptive Crowdsourcing Methods for Linguistic Resources”. This work was supported by the Russian Foundation for the Humanities project no. 13-04-12020 “New Open Electronic Thesaurus for Russian” and project no. 16-04-12019 “RussNet and YARN thesauri integration”.
D. A. Ustalov, “Dataflow-driven crowdsourcing: relational models and algorithms”, Model. Anal. Inform. Sist., 23:2 (2016), 195–210