This article is cited in 2 scientific papers (total in 2 papers)
Criteria for choosing the factorization model dimensionality
M. P. Krivenko Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
Abstract:
The paper addresses the choice of model dimensionality in matrix factorization when some elements are missing. The parameters of the adopted data model are estimated by multidimensional optimization. Estimating the reduced dimensionality is a typical model selection problem: alternatives arise during data analysis, and choosing among them means either ranking the individual options or singling out the “best” one. The selection criteria usually applied are based on the likelihood function, which requires probabilistic assumptions about the data. When the parameters of the factor model under consideration are estimated, however, no such assumptions are made, and introducing them is impractical because it would compromise the generality of the dimensionality reduction problem as formulated. Therefore, an attempt was made to turn to the idea of reusing the available data for statistical inference. None of the existing approaches (bootstrap, jackknife, cross-validation, and permutation tests) is suitable, so an original method is proposed that generates new data by additionally omitting elements of the original matrix. To process the resulting samples, a mixture of normal distributions combined with kernel smoothing is suggested. The proposed solutions make it possible to justify the dimensionality of the adopted factorization model in a sound way. The exposition is illustrated by an example of processing synthetic data.
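The resampling idea in the abstract, generating new data by omitting further elements of the original matrix, can be illustrated with a minimal sketch. This is not the paper's procedure. The fitting routine (alternating ridge least squares), the error measure (root-mean-square error on the additionally omitted entries), and every name and parameter (`als_factorize`, `held_out_errors`, `drop_frac`, `n_rep`, `reg`) are assumptions introduced here for illustration only.

```python
import numpy as np

def als_factorize(X, mask, rank, n_iter=50, reg=1e-3, seed=0):
    """Fit X ~ U @ V.T on entries where mask is True.

    Assumed fitting method: alternating ridge least squares.
    Values of X where mask is False are never read.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = rng.standard_normal((n, rank))
    V = rng.standard_normal((m, rank))
    ridge = reg * np.eye(rank)
    for _ in range(n_iter):
        # Update each row factor from the observed entries of that row.
        for i in range(n):
            obs = mask[i]
            Vo = V[obs]
            U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ X[i, obs])
        # Update each column factor from the observed entries of that column.
        for j in range(m):
            obs = mask[:, j]
            Uo = U[obs]
            V[j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ X[obs, j])
    return U, V

def held_out_errors(X, mask, rank, n_rep=20, drop_frac=0.1, seed=0):
    """Return one prediction error per replicate.

    Each replicate omits a random fraction of the observed entries,
    refits the factorization, and scores it on the omitted entries.
    """
    rng = np.random.default_rng(seed)
    obs_idx = np.argwhere(mask)
    n_drop = max(1, int(drop_frac * len(obs_idx)))
    errors = np.empty(n_rep)
    for r in range(n_rep):
        pick = rng.choice(len(obs_idx), size=n_drop, replace=False)
        rows, cols = obs_idx[pick, 0], obs_idx[pick, 1]
        train = mask.copy()
        train[rows, cols] = False  # the additional omissions
        U, V = als_factorize(X, train, rank, seed=r)
        pred = np.sum(U[rows] * V[cols], axis=1)
        errors[r] = np.sqrt(np.mean((X[rows, cols] - pred) ** 2))
    return errors
```

Calling `held_out_errors` for each candidate rank yields one sample of errors per dimensionality. How such samples are compared is a separate step; the abstract names a mixture of normal distributions combined with kernel smoothing for it, which this sketch does not attempt to reproduce.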
Keywords:
low-rank matrix approximation, missing data, criteria for model selection, resampling methods, kernel smoothing.
Received: 27.01.2023
Citation:
M. P. Krivenko, “Criteria for choosing the factorization model dimensionality”, Inform. Primen., 17:2 (2023), 50–56
Linking options:
https://www.mathnet.ru/eng/ia844 https://www.mathnet.ru/eng/ia/v17/i2/p50