Combinatorial bounds of overfitting for threshold classifiers
Sh. Kh. Ishkina
Federal Research Center УComputer Science and ControlФ of RAS,
Vavilova str. 44/2,
119333, Moscow, Russia
Estimating the generalization ability
is a fundamental objective of statistical learning theory.
However, accurate and computationally efficient bounds are still unknown even for many very simple cases.
In this paper, we study one-dimensional threshold decision rules.
We use the combinatorial theory of overfitting based on a single probabilistic assumption that
all partitions of a set of objects into an observed training sample and a hidden test sample are of equal probability.
We propose a polynomial algorithm for computing both probability of overfitting and of complete cross-validation.
The algorithm exploits the recurrent calculation of the
number of admissible paths while walking over a three-dimensional lattice between two
prescribed points with restrictions of special form. We compare the obtain sharp estimate of the generalized ability and demonstrate that the known upper bound are too overstated and they can not be applied for practical problems.
computational learning theory, empirical risk minimization, combinatorial theory of overfitting,
probability of overfitting, complete cross-validation, generalization ability, threshold classifier, computational complexity.
|Russian Foundation for Basic Research
|The work is supported by RFBR under projects no. 15-37-50350 mol nr and no. 14-07-00847.
PDF file (552 kB)
Ufa Mathematical Journal, 2018, 10:1, 49–63 (PDF, 471 kB); https://doi.org/10.13108/2018-10-1-49
MSC: 68Q32, 60C05
Sh. Kh. Ishkina, “Combinatorial bounds of overfitting for threshold classifiers”, Ufimsk. Mat. Zh., 10:1 (2018), 50–65; Ufa Math. J., 10:1 (2018), 49–63
Citation in format AMSBIB
\paper Combinatorial bounds of overfitting for threshold classifiers
\jour Ufimsk. Mat. Zh.
\jour Ufa Math. J.
Citing articles on Google Scholar:
Related articles on Google Scholar:
|Number of views:|