Abstract:
This paper is a direct followup of a recent paper of the author. We continue to analyze approximation and recovery properties with respect to systems satisfying the universal sampling discretization property and a special unconditionality property. In addition, we assume that the subspace spanned by our system satisfies some Nikol'skii-type inequalities. We concentrate on recovery with an error measured in the $L_p$-norm for $2\le p<\infty$. We apply a powerful nonlinear approximation method — the Weak Orthogonal Matching Pursuit (WOMP), also known under the name of the Weak Orthogonal Greedy Algorithm (WOGA). We establish that the WOMP based on good points for $L_2$-universal discretization provides good recovery in the $L_p$-norm for $2\le p<\infty$. For our recovery algorithms we obtain both Lebesgue-type inequalities for individual functions and error bounds for special classes of multivariate functions.
We combine two deep and powerful techniques — Lebesgue-type inequalities for the WOMP and the theory of universal sampling discretization — in order to obtain new results on sampling recovery.
Bibliography: 19 titles.
This research was performed at Lomonosov Moscow State University and supported by the Russian Science Foundation under grant no. 23-71-30001, https://rscf.ru/en/project/23-71-30001/.
This paper is a followup of the recent paper [18]. We formulate here the results from [18] that we use in our analysis. The reader can find in [18] a discussion of the previous results connected directly with the ones in [18] and in this paper. Results on discretization can be found in the recent survey papers [2] and [10].
We begin with a brief description of some necessary concepts on sparse approximation. Let $X$ be a Banach space with the norm $\|\,{\cdot}\,\|:=\|\,{\cdot}\,\|_X$, and let $\mathcal{D}=\{g_i\}_{i=1}^\infty $ be a given (countable) system of elements of $X$. Given a finite subset $J\subset \mathbb{N}$, we define $V_J(\mathcal{D}):=\operatorname{span}\{g_j\colon j\in J\}$. For a positive integer $v$, we denote by $\mathcal{X}_v(\mathcal{D})$ the collection of all linear spaces $V_J(\mathcal{D})$ with $|J|=v$, and denote by $\Sigma_v(\mathcal{D})$ the set of all $v$-term approximants with respect to $\mathcal{D}$, that is, $\Sigma_v(\mathcal D):=\bigcup_{V\in\mathcal X_v(\mathcal D)} V$. Given $f\in X$, we set
In this paper we consider the case when $X=L_p(\Omega,\mu)$, $1\leqslant p<\infty$. More precisely, let $\Omega$ be a compact subset of $\mathbb{R}^d$ with the probability measure $\mu$ on it. By the $L_p$-norm, $1\leqslant p< \infty$, of a complex-valued function defined on $\Omega$, we understand
In the case when $X=L_p(\Omega,\mu)$ we sometimes write for brevity $\sigma_v(\,{\cdot}\,{,}\,{\cdot}\,)_p$ instead of $\sigma_v(\,{\cdot}\,{,}\,{\cdot}\,)_{L_p(\Omega,\mu)}$.
We study systems that have the universal sampling discretization property.
Definition 1.1. Let $1\leqslant p<\infty$. We say that a set $\xi:= \{\xi^j\}_{j=1}^m \subset \Omega$ provides $L_p$-universal sampling discretization for the collection $\mathcal{X}:= \{X(n)\}_{n=1}^k$ of finite-dimensional linear subspaces $X(n)$ if
We denote by $m(\mathcal{X},p)$ the minimum $m$ such that there exists a set $\xi$ of $m$ points that provides the $L_p$-universal sampling discretization (1.1) for the collection $\mathcal{X}$.
We will use the brief form $L_p$-usd for $L_p$-universal sampling discretization (1.1).
We let $C$, $C'$ and $c$, $c'$ denote various positive constants. Their arguments indicate the parameters they can depend on. Normally, these constants do not depend on the function $f$ or the running parameters $m$, $v$ and $u$. We use the following symbols for brevity. For two nonnegative sequences $a=\{a_n\}_{n=1}^\infty$ and $b=\{b_n\}_{n=1}^\infty$ the relation $a_n\ll b_n$ means that there is a number $C(a,b)$ such that for all $n$ we have $a_n\leqslant C(a,b)b_n$. The relation $a_n\gg b_n$ means that $b_n\ll a_n$, and $a_n\asymp b_n$ means that $a_n\ll b_n$ and $a_n \gg b_n$. For a real number $x$ we denote by $[x]$ the integer part of $x$, and $\lceil x\rceil$ is the smallest integer greater than or equal to $x$.
where $\delta_\mathbf{x}$ denotes the Dirac measure supported at a point $\mathbf{x}$.
For convenience we will use the notation $\|\,{\cdot}\,\|_{L_p(\nu)}$ to denote the norm in $L_p$ with respect to the measure $\nu$ on $\Omega$. Sometimes we use the notation $\|\,{\cdot}\,\|_{p}$.
The main goal of this paper is to study optimal recovery on specific classes of multivariate functions.
For a function class $\mathbf{F}\subset \mathcal{C}(\Omega)$ we define (see [19])
where $\mathcal{M}$ ranges over all mappings $\mathcal{M}\colon \mathbb{C}^m \to L_p(\Omega,\mu)$ and $\xi$ ranges over all subsets $\{\xi^1,\dots,\xi^m\}$ of $m$ points in $\Omega$. Here we use the index ‘o’ to indicate optimality.
In [18] we studied the behaviour of $\varrho_m^{\rm o}(\mathbf{F},L_p)$ for a new kind of classes, $\mathbf{A}^r_\beta(\Psi)$. We obtained the order of $\varrho_m^{\rm o}(\mathbf{A}^r_\beta(\Psi),L_p)$ (up to a factor logarithmic in $m$) in the case $1\leqslant p\leqslant 2$ for the class of uniformly bounded orthonormal systems $\Psi$. We give a definition of the classes $\mathbf{A}^r_\beta(\Psi)$. Given $p$, $1\leqslant p\leqslant \infty$, let $\Psi =\{\psi_{\mathbf{k}}\}_{\mathbf{k} \in \mathbb{Z}^d}$, $\psi_\mathbf{k} \in \mathcal{C}(\Omega)$, $\|\psi_\mathbf{k}\|_p \leqslant B$, $\mathbf{k}\in\mathbb{Z}^d$, be a system in the space $L_p(\Omega,\mu)$. We consider functions representable in the form of an absolutely convergent series:
For $\beta \in (0,1]$ and $r>0$ consider the class $\mathbf{A}^r_\beta(\Psi)$ of functions $f$ that have representations (1.3) satisfying the following conditions:
In the special case when $\Psi$ is the trigonometric system $\mathcal{T}^d := \{e^{i(\mathbf{k},\mathbf{x})}\}_{\mathbf{k}\in \mathbb{Z}^d}$ we proved the following lower bound in [18] (see Theorem 4.6 there):
In [18] we complemented this lower bound by the following upper bound. Assume that $\Psi$ is a uniformly bounded orthonormal system: $\|\psi_\mathbf{k}\|_\infty \leqslant B$, $\mathbf{k}\in\mathbb{Z}^d$. Let ${1\leqslant p\leqslant 2}$, $\beta \in (0,1]$ and $r>0$; then
It was noted in [18] that $(\log(m+1))^{-5}$ in (1.6) can be replaced by $(\log(m+1))^{-4}$. Bounds (1.5) and (1.6) show that, generally, for uniformly bounded orthonormal systems $\Psi$, for instance, for the trigonometric system $\mathcal{T}^d$, the gap between the upper and lower bounds is in terms of an extra factor logarithmic in $m$.
In this paper, alike [18], we study the special case when $\Psi$ satisfies certain restrictions. We formulate these restrictions in the form of conditions imposed on $\Psi$.
Condition A. Assume that $\Psi$ is a uniformly bounded system. Namely, assume that $\Psi:=\{\varphi_j\}_{j=1}^\infty$ is a system of uniformly bounded functions on $\Omega \subset \mathbb{R}^d$ such that
Condition B3. Assume that $\Psi$ is a Bessel system, that is, there exists a constant $K>0$ such that for all $N\in\mathbb{N}$ and any $(a_1,\dots, a_N) \in\mathbb{C}^N$,
Clearly, Condition B1 implies Condition B2 for $R_1=R_2=1$. Condition B2 implies Condition B3 for $K=R_1^{-2}$.
In the case $2<p<\infty$ the following upper bounds were proved in [18] (see Theorem 4.2 and Remark 6.1 there).
Theorem 1.1 ([18]). Assume that $\Psi$ is a uniformly bounded Riesz system (in the space $L_2(\Omega,\mu)$) satisfying (1.7) and (1.8) for some constants ${0<R_1\leqslant R_2<\infty}$. Let $2<p<\infty$, $\beta \in (0,1]$ and $r>0$. Then there exist constants $c'=c'(r,\beta,p,R_1,R_2,d)$ and $C'=C'(r,\beta,p,d)$ such that for each $v\in\mathbb{N}$ we have the bound
$$
\begin{equation*}
m\geqslant c' v^{p/2} (\log v)^{2}.
\end{equation*}
\notag
$$
In the proof of Theorem 1.1 known results about universal $L_p$-discretization played an important role. We formulate the corresponding result.
Theorem 1.2 ([3]). Assume that $\mathcal{D}_N=\{\varphi_j\}_{j=1}^N$ is a uniformly bounded Riesz system satisfying (1.7) and (1.8) for some constants $0<R_1\leqslant R_2<\infty$. Let $2<p<\infty$, and let $1\leqslant u\leqslant N$ be an integer. Then for a sufficiently large constant $C=C(p,R_1,R_2)$ and any $\varepsilon\in (0, 1)$ there exist $m$ points $\xi^1,\dots, \xi^m\in \Omega$, where
Moreover, the bound (1.13) is provided by a simple greedy algorithm, the Weak Orthogonal Matching Pursuit (see the definition below). The upper bound (5.11) and the lower bound from Proposition 5.1 give us the following inequalities in the special case of the $d$-variate trigonometric system $\mathcal{T}^d$:
The Weak Orthogonal Matching Pursuit (WOMP) is a greedy algorithm defined with respect to a given dictionary (system) $\mathcal{D}=\{g\}$ in a Hilbert space equipped with an inner product $\langle\,\cdot\,{,}\,\cdot\,\rangle$ and a norm $\|\,{\cdot}\,\|_H$. It is also known under the name of the Weak Orthogonal Greedy Algorithm (see, for instance, [15]).
The Weak Orthogonal Matching Pursuit (WOMP)
Let $\mathcal{D}=\{g\}$ be a system of nonzero elements of $H$ such that $\|g\|_H\leqslant 1$. Let $\tau:=\{t_k\}_{k=1}^\infty\subset [0, 1]$ be a given sequence of weakness parameters. Given $f_0\in H$, we define a sequence $\{f_k\}_{k=1}^\infty\subset H$ of residuals for $k=1,2,\dots$ recursively as follows.
In the case when $t_k=1$ for $k=1,2,\dots$ the WOMP is called the Orthogonal Matching Pursuit (OMP). In this paper we only consider the case when $t_k=t\in (0, 1]$ for $k=1,2,\dots$ . The term weak in the definition of the WOMP means that at step (1) we do not shoot for the optimal element of the dictionary which delivers the corresponding supremum. The obvious reason for this is that we do not know in general if an optimal element exists. Another practical reason is that the weaker the assumption, the easier it is to realize in practice. Clearly, $g_k$ may not be unique. However, all results formulated below are independent of the choice of the $g_k$.
For the sake of convenience in applications that follow we use the notation $\mathrm{WOMP}(\mathcal{D}; t)_H$ to denote the WOMP defined with respect to a given weakness parameter $t\in (0, 1]$ and a system $\mathcal{D}$ in a Hilbert space $H$.
$\mathbf{UP}(u,D)$: the $(u,D)$-unconditional property
We say that a system $\mathcal{D}=\{\varphi_i\}_{i\in I}$ of elements in a Hilbert space $H=(H, \|\,{\cdot}\,\|)$ is ($u,D$)-unconditional with constant $U>0$ for some integers $1\leqslant u\leqslant D$ if for each $f=\sum_{i\in A} c_i \varphi_i\in \Sigma_u(\mathcal{D})$, where $A\subset I$, and each $J\subset I\setminus A$ such that $|A|+|J| \leqslant D$ we have
where $ V_J(\mathcal{D}):=\operatorname{span}\{\varphi_i\colon i\in J\}$.
We gave the above definition for a countable (or finite) system $\mathcal{D}$. A similar definition can also be given for any system $\mathcal{D}$.
Theorem 2.1 ([11], Corollary 1.1). Let $\mathcal{D}$ be a dictionary in a Hilbert space $H=(H, \|\,{\cdot}\,\|)$ having the property $\mathbf{UP}(u,D)$ with constant $U>0$ for some integers $1\leqslant u\leqslant D$. Let $f_0\in H$, and let $t\in (0, 1]$ be a given weakness parameter. Then there exists a positive constant $c_\ast:=c(t,U)$, depending only on $t$ and $U$, such that the $\mathrm{WOMP}(\mathcal{D}; t)_H$, as applied to $f_0$, gives
where $C>1$ is an absolute constant and $f_k$ denotes the residual of $f_0$ after the $k$th iteration of the algorithm.
We will consider the Hilbert space $L_2(\Omega_m,\mu_m)$ instead of $L_2(\Omega,\mu)$, where $\Omega_m=\{\xi^\nu\}_{\nu=1}^m$ is a set of points providing good universal discretization, and $\mu_m$ is the uniform probability measure on $\Omega_m$, that is, $\mu_m\{\xi^\nu\}=1/m$, $\nu=1,\dots,m$. Let $\mathcal{D}_N(\Omega_m)$ denote the restriction of the system $\mathcal{D}_N$ to the set $\Omega_m$. Theorem 2.2 below guarantees that the simple greedy algorithm WOMP gives us the corresponding Lebesgue-type inequality in the norm $L_2(\Omega_m,\mu_m)$, and therefore provides good sparse recovery.
Theorem 2.2 ([4]). Let $\mathcal{D}_N=\{\varphi_j\}_{j=1}^N$ be a uniformly bounded Riesz basis in $L_2(\Omega, \mu)$ satisfying (1.7) and (1.8) for some constants $0<R_1\leqslant R_2<\infty$. Let $\Omega_m=\{\xi^1,\dots, \xi^m\}$ be a finite subset of $\Omega$ providing $L_2$-universal discretization (1.1) for the collection $\mathcal{X}_u(\mathcal{D}_N)$ and the given integers $1\leqslant u\leqslant N$. Given a weakness parameter $0<t\leqslant 1$, there exists a constant integer $c=c(t,R_1,R_2)\geqslant 1$ such that for any integer $0\leqslant v\leqslant u/(1+c)$ and any $f_0\in \mathcal{C}(\Omega)$ the
and that $X$ is said to be uniformly smooth if $\eta(w)/w\to 0$ as $w\to 0+$. It is well known that for $1< p<\infty$ the $L_p$-space is a uniformly smooth Banach space with
In addition to the conditions on a system $\Psi$ formulated in the introduction (§ 1) we need one more property of the system $\Psi$.
The $u$-term Nikol’skii inequality
Let $1\leqslant q\leqslant p\leqslant \infty$, and let $\Psi$ be a system in $L_p:=L_p(\Omega,\mu)$. For a natural number $u$ we say that the system $\Psi$ satisfies the $u$-term Nikol’skii inequality for the pair $(q,p)$ with constant $H$ if the following inequality holds:
In such a case we write $\Psi \in \mathrm{NI}(q,p,H,u)$.
Note, that obviously $H\geqslant 1$.
Theorem 3.1. Let $\mathcal{D}_N=\{\varphi_j\}_{j=1}^N$ be a uniformly bounded Riesz system in $L_2(\Omega, \mu)$ satisfying (1.7) and (1.8) for some constants $0<R_1\leqslant R_2<\infty$. Let $\Omega_m=\{\xi^1,\dots, \xi^m\}$ be a finite subset of $\Omega$ that provides $L_2$-universal discretization for the collection $\mathcal{X}_u(\mathcal{D}_N)$, $1\leqslant u\leqslant N$. Assume, in addition, that $\mathcal{D}_N \in \mathrm{NI}(2,p,H,u)$ for some $p\in [2,\infty)$. Then, given a weakness parameter $0<t\leqslant 1$, there exists a constant integer $c=c(t,R_1,R_2)\geqslant 1$ such that for any integer $0\leqslant v\leqslant u/(1+c)$ and any $f_0\in \mathcal{C}(\Omega)$, the
where the $C_i$, $i=0,1$, are absolute constants, $f_k$ denotes the residual of $f_0$ after the $k$th iteration of the algorithm, and $\mu_\xi$ was defined in (1.2).
Proof. We begin with inequality (3.2). It follows from (2.2) in Theorem 2.2. For completeness we give its simple proof here. Using (1.8) we obtain that for any sets $A\subset \{1,2,\dots, N\}$ and $\Lambda\subset \{1,\dots, N\}\setminus A$ and any sequences $\{x_i\}_{i\in A}$ and $\{c_i\}_{i\in\Lambda}\subset\mathbb{C}$,
meaning that the system $\mathcal{D}_N$ has the $\mathbf{UP}(v,N)$ property with constant $U_1 = R_2/R_1$ in the space $L_2(\Omega,\mu)$ for any integer $1\leqslant v< N$. Then it follows from the discretization inequalities (1.1) for $\mathbf{\mathcal{X}}=\mathbf{\mathcal{X}}_u(\mathcal{D}_N)$ that the system $\mathcal{D}:=\mathcal{D}_N(\Omega_m)$, which is the restriction of $\mathcal{D}_N$ to $\Omega_m=\{\xi^1, \dots,\xi^m\}$, has the property $\mathbf{UP}(v,u)$ in the space $L_2(\Omega_m,\mu_m)$ with constant $U_2 =U_1 3^{1/2}$ for any integers $1\leqslant v\leqslant u$. Thus, applying Theorem 2.1 to the discretized dictionary $\mathcal{D}_N(\Omega_m)$ in the Hilbert space $L_2(\Omega_m,\mu_m)$, we conclude that the algorithm
Second, using that $h - G_{cv}(f_0,\mathcal{D}_N(\Omega_m)) \in \Sigma_u(\mathcal{D}_N)$, from our assumption $\mathcal{D}_N \in \mathrm{NI}(2,p,H,u)$ and the discretization (1.1) we conclude that
For brevity set $L_p(\xi) := L_p(\Omega_m,\mu_m)$, where $\Omega_m=\{\xi^\nu\}_{\nu=1}^m$ and $\mu_m(\xi^\nu) =1/m$, $\nu=1,\dots,m$. Let $B_v(f,\mathcal{D}_N,L_p(\xi))$ denote the best $v$-term approximation of $f$ in the $L_p(\xi)$-norm with respect to the system $\mathcal{D}_N$. Note that $B_v(f,\mathcal{D}_N,L_p(\xi))$ may not be unique. It is obvious that
Theorem 3.2 ([5]). Let $1\leqslant p<\infty$ and let $m$, $v$ and $N$ be given natural numbers such that $2v\leqslant N$. Let $\mathcal{D}_N\subset \mathcal{C}(\Omega)$ be a system of $N$ elements. Assume that there exists a set $\xi:= \{\xi^j\}_{j=1}^m \subset \Omega$ that provides one-sided $L_p$-universal discretization,
Theorem 3.3. Let $2\leqslant p\leqslant \infty$, and let $m$, $v$, and $N$ be given natural numbers such that $2v\leqslant N$. Let $\mathcal{D}_N\subset \mathcal{C}(\Omega)$ be a system of $N$ elements such that $\mathcal{D}_N \in \mathrm{NI}(2,p,H,2v)$. Assume that there exists a set $\xi:= \{\xi^j\}_{j=1}^m \subset \Omega$ that provides one-sided $L_2$-universal discretization
Proof. The proofs of inequalities (3.11) and (3.12) are similar. We begin with (3.11). For brevity set $g:= B_v(f,\mathcal{D}_N,L_2(\xi))$ and $h:=B_v(f,\mathcal{D}_N,L_p(\mu_\xi))$. Then
Second, using the obvious fact that $h-g\in \Sigma_{2v}(\mathcal{D}_N)$, from the assumption $\mathcal{D}_N \in \mathrm{NI}(2,p,H,2v)$ we obtain
$$
\begin{equation*}
\|h-g\|_p \leqslant H \|h-g\|_2
\end{equation*}
\notag
$$
and continue by using (3.10) and the trivial inequality $\|f-g\|_{L_2(\xi)} \leqslant \|f-h\|_{L_2(\xi)}$:
$$
\begin{equation*}
\begin{aligned} \, \|h-g\|_p &\leqslant HD \|h-g\|_{L_2(\xi)} \leqslant 2HD \|f-h\|_{L_2(\xi)} \\ &\leqslant 2 HD \|f-h\|_\infty=2 HD \sigma_v(f,\mathcal D_N)_\infty. \end{aligned}
\end{equation*}
\notag
$$
Combining the above inequalities we complete the proof of (3.12).
The theorem is proved.
§ 4. New results on lower bounds
Let us discuss lower bounds for the nonlinear characteristic $\varrho_m^{\rm o}(\mathbf{A}^r_\beta(\Psi),L_p)$. We do this in the special case when $\Psi$ is the trigonometric system $\mathcal{T}^d \mkern-1mu\!:=\!\mkern-1mu \{\mkern-1.5mu e^{i(\mathbf{k},\mathbf{x})}\mkern-1.5mu\}_{\mathbf{k}\in \mathbb{Z}^d}$. For $\mathbf{N} = (N_1,\dots,N_d)$, $N_j\in \mathbb{N}_0$, $j=1,\dots,d$, set
In this section $\Omega = \mathbb{T}^d$ and $\mu$ is the normalized Lebesgue measure on $\mathbb{T}^d$.
Lemma 4.1. Let $1\leqslant q\leqslant p\leqslant \infty$, and let $\mathcal{T}(\mathbf{N},d)_q$ denote the unit $L_q$-ball of the subspace $\mathcal{T}(\mathbf{N},d)$. Then for $m\leqslant \vartheta(\mathbf{N})/2$ we have
Let $g_\xi\in T(\xi)$ and a point $\mathbf{x}^*$ be such that $|g_\xi(\mathbf{x}^*)|=\|g_\xi\|_\infty=1$. For the further argument we need some classical trigonometric polynomials. The univariate Fejér kernel of order $j - 1$ is
Let $\mathcal{K}_\mathbf{j}(\mathbf{x}):= \prod_{i=1}^d \mathcal{K}_{j_i}(x_i)$ be the $d$-variate Fejér kernels for $\mathbf{j} = (j_1,\dots,j_d)$ and $\mathbf{x}=(x_1,\dots,x_d)$. We set
This, relations (4.3), (4.5) and the fact that both $f/\|f\|_q$ and $-f/\|f\|_q$ belong to $\mathcal{T}(2\mathbf{N},d)_q$ complete the proof of Lemma 4.1.
We now show how Theorem 3.2 can be used to show that Theorem 1.2 cannot be improved substantially in the sense of relation between $m$ and $u$.
Proposition 4.1. Let $d\in \mathbb{N}$ and $p\in (2,\infty)$. Then there exists a positive constant $C(d,p)$ with the following property. For any
and $m\leqslant \vartheta(\mathbf{N})/2$ there is no set of $m$ points that provides the one-sided $L_p(\mathbb{T}^d,\mu)$-universal discretization (3.7) for $\mathcal{X}_{2v}(\mathcal{D}_N)$ with
$$
\begin{equation*}
v \geqslant C(d,p)(2D+1)^2 (\vartheta({\mathbf N}))^{2/p}.
\end{equation*}
\notag
$$
Here $\mu$ is the normalized Lebesgue measure on $\mathbb{T}^d$.
Proof. The proof goes by contradiction. Assume that there exists a set $\xi:= \{\xi^j\}_{j=1}^m \subset \mathbb{T}^d$ that provides the one-sided $L_p$-universal discretization (3.7) for $\mathcal{X}_{2v}(\mathcal{D}_N)$. Then by Theorem 3.2 inequality (3.8) holds for any $f\in\mathcal{C}(\mathbb{T}^d)$. We take the function $f$ from the proof of Lemma 4.1, as defined in (4.2). By the definition of $f$ we obtain $f(\xi^j)=0$ for all $j=1,\dots,m$. Therefore, $B_v(f,\mathcal{D}_N,L_p(\xi))=0$, and from (4.5) we obtain
From (2.5) we obtain that $\eta(L_p(\mathbb{T}^d,\mu_\xi),w) \leqslant (p-1)w^2/2$. It is known (see, for instance, [15], p. 342) that for any dictionary $\mathcal{D} = \{g\}$, $\|g\|_X \leqslant 1$, in a Banach space $X$ with $\eta(X,w) \leqslant \gamma w^q$, $1<q\leqslant 2$, we have
Note that (4.8) was proved in [15] in the case of real Banach spaces. It is clear that we can obtain (4.8) for complex systems $\mathcal{D}_N$ from the corresponding results for real trigonometric polynomials in $\mathcal{T}(2\mathbf{N},d)$.
We now apply inequality (4.8) and find from (4.7) that
Theorem 5.1 ([18]). Let $1<p<\infty$, $r>0$ and $\beta\in (0,1]$. Assume that $\|\psi_\mathbf{k}\|_{L_p(\Omega,\mu)} \leqslant B$, $\mathbf{k}\in\mathbb{Z}^d$, for some probability measure $\mu$ on $\Omega$. Then there exist two constants $c^*=c(r,\beta,p,d)$ and $C=C(r,\beta,p,d)$ such that for any $v\in \mathbb{N}$ there is $J\in\mathbb{N}$ independent of the measure $\mu$ such that $2^J \leqslant v^{c^*}$ and
( $p^*:= \min\{p,2\}$). Moreover, this bound is provided by a simple greedy algorithm.
We proceed to a new result on sampling recovery. In the proof of Theorem 5.3 below we need the following known result on universal discretization.
Theorem 5.2 ([4]). Let $1\leqslant p\leqslant 2$. Assume that $ \mathcal{D}_N=\{\varphi_j\}_{j=1}^N\subset \mathcal{C}(\Omega)$ is a system satisfying conditions (1.7) and (1.9) for some constant $K\geqslant 1$. Let $\xi^1,\dots, \xi^m$ be independent random points on $\Omega$ that are identically distributed with respect to $\mu$. Then there exist constants $C=C(p)>1$ and $c=c(p)>0$ such that, given any integers $1\leqslant u\leqslant N$ and
$$
\begin{equation*}
m \geqslant C Ku \log N (\log(2Ku ))^2 (\log (2Ku )+\log\log N),
\end{equation*}
\notag
$$
hold with probability $\geqslant 1-2 \exp( -cm/(Ku\log^2 (2Ku)))$.
Theorem 5.3. Assume that $\Psi$ is a uniformly bounded Riesz system (in the space $L_2(\Omega,\mu)$) satisfying (1.7) and (1.8) for some constants $0<R_1\leqslant R_2<\infty$. Let $2\leqslant p<\infty$ and $r>0$. For any $v\in\mathbb{N}$ let $u := \lceil (1+c)v\rceil$, where $c$ is from Theorem 3.1. Assume, in addition, that $\Psi \in \mathrm{NI}(2,p,H,u)$ for $p\in [2,\infty)$. Then there exist constants $c'=c'(r,\beta,p,R_1,R_2,d)$ and $C'=C'(r,\beta,p,d)$ such that the bound
$$
\begin{equation}
\varrho_{m}^{\rm o}({\mathbf A}^r_\beta(\Psi),L_p(\Omega,\mu)) \leqslant C' H v^{1/2 -1/\beta -r/d}
\end{equation}
\tag{5.2}
$$
holds for any $m$ satisfying
$$
\begin{equation*}
m\geqslant c' v (\log(2v ))^4.
\end{equation*}
\notag
$$
Moreover, this bound is provided by the WOMP.
Proof. First we use Theorem 5.1 in the space $L_p(\Omega,\mu)$. In our case $B=1$ and $2<p<\infty$, which implies that $p^*=2$. Consider the system $\Psi_J := \{\psi_\mathbf{k}\}_{\|\mathbf{k}\|_\infty <2^J}$, $2^J \leqslant v^{c^*}$, from Theorem 5.1. Note, that $J$ does not depend on $\mu$. By Theorem 3.1 for $\mathcal{D}_N = \Psi_J$ and Theorem 5.2 with $p=2$ and $K=R_1^{-2}$ there exist $m$ points $\xi^1,\dots,\xi^m\in \Omega$, where
$$
\begin{equation}
m\leqslant C u (\log N)^4,
\end{equation}
\tag{5.3}
$$
such that, given $f_0\in \mathcal{C}(\Omega)$, the WOMP with weakness parameter $t$, as applied to $f_0$, with respect to the system $\mathcal{D}_N(\Omega_m)$ in the space $L_2(\Omega_m,\mu_m)$, provides for any integer $1\leqslant v \leqslant u/(1+c)$ the inequality
In order to bound the right-hand side of (5.4) we use Theorem 5.1 in the space $L_p(\Omega, \mu_\xi)$. To do this it is sufficient to check that $\|\psi_\mathbf{k}\|_{L_p(\Omega, \mu_\xi)} \leqslant 1$, $\mathbf{k}\in\mathbb{Z}^d$. This follows from the assumption that $\Psi$ satisfies (1.7). Thus, by Theorem 5.1, for $f_0 \in \mathbf{A}^r_\beta(\Psi)$ we obtain
Corollary 5.1. Assume that $\Psi$ is a uniformly bounded Riesz system (in the space $L_2(\Omega,\mu)$) satisfying (1.7) and (1.8) for some constants $0<R_1\leqslant R_2<\infty$. Let ${2\leqslant p<\infty}$ and $r>0$. Then there exist constants $c=c(r,\beta,p,R_1,R_2,d)$ and $C=C(r,\beta,p,d)$ such that the bound
$$
\begin{equation*}
m\geqslant c v (\log(2v ))^4.
\end{equation*}
\notag
$$
Moreover, this bound is provided by the WOMP.
Proof. We prove that a system $\Psi$ satisfying the assumptions of Corollary 5.1 also satisfies the condition $\Psi \in \mathrm{NI}(2,p,H,u)$ for $H= C(R_2) u^{1/2-1/p}$, as required in Theorem 5.3. Then Corollary 5.1 follows from Theorem 5.3. Indeed, let
Using the following well-known simple inequality $\|f\|_p \leqslant \|f\|_2^{2/p}\|f\|_\infty^{1-2/p}$, ${2\!\leqslant\! p\!\leqslant\! \infty}$, we obtain
We have derived Theorem 5.3 from Theorem 3.1. Now we formulate an analogue of Theorem 5.3, which can be derived from Theorem 3.3 in the same way as Theorem 5.3 is derived from Theorem 3.1.
Theorem 5.4. Let $2\leqslant p<\infty$, and let $m$ and $v$ be given natural numbers. Let $\Psi$ be a uniformly bounded Bessel system satisfying (1.7) and (1.9) such that $\Psi \in \mathrm{NI}(2,p,H,2v)$. Then there exist constants $c=c(r,\beta,p,K,d)$ and $C=C(r,\beta,p,d)$ such that the bound
It was pointed out in [4] that known results on the restricted isometry property for the trigonometric system can be used to improve results on universal discretization in the $L_2$-norm in the case of the trigonometric system. We explain this in more detail. Let $M\in \mathbb{N}$ and $d\in \mathbb{N}$. Let $\Pi(M) := [-M,M]^d$ denote the $d$-dimensional cube. Consider the system $\mathcal{T}^d(M) := \mathcal{T}^d(\Pi(M))$ of functions $e^{i(\mathbf{k},\mathbf{x})}$, $\mathbf{k} \in \Pi(M)$, defined on $\mathbb{T}^d =[0,2\pi)^d$. Then $\mathcal{T}^d(M)$ is an orthonormal system in $L_2(\mathbb{T}^d,\mu)$, where $\mu$ is the normalized Lebesgue measure on $\mathbb{T}^d$. The cardinality of this system is $N(M):= |\mathcal{T}^d(M)| = (2M+1)^d$. We are interested in bounds on $m(\mathcal{X}_v(\mathcal{T}^d(M)),2)$ in the special case when $M\leqslant v^c$ for some constant $c$, which can depend on $d$. Then Theorem 5.2 for $p=2$ gives
Now we prove that (5.11) cannot be substantially improved. We use Lemma 4.1. Take $n\in\mathbb{N}$ and set $N:=2^n-1$, $\mathbf{N} := (N,\dots,N)$. Then for $f\in \mathcal{T}(\mathbf{N},d)_2$, by Hölder’s inequality with parameter $2/\beta$ we have
In this paper we emphasize the importance of the classes $\mathbf{A}^r_\beta(\Psi)$, which are defined by structural conditions, rather than smoothness conditions. We give a brief history of the development of this idea. The first result connecting the best approximations $\sigma_v(f,\Psi)_2$ with the convergence of the series $\sum_{k}|\langle f,\psi_k\rangle|$ was obtained by Stechkin [13] in 1955, in the case of an orthonormal system $\Psi$. We formulate his result and a more general result below. The following classes were introduced in [6], in the study of sparse approximation with respect to arbitrary systems ( dictionaries) $\mathcal{D}$ in a Hilbert space $H$. Given a general dictionary $\mathcal{D}$ and $\beta>0$, we define the class of functions (elements)
and we define $\mathcal{A}_\beta(\mathcal{D},M)$ as the closure of $\mathcal{A}^{\rm o}_\beta(\mathcal{D},M)$ (in $H$). Furthermore, we define $\mathcal{A}_\beta(\mathcal{D})$ as the union of the classes $\mathcal{A}_\beta(\mathcal{D},M)$ over all $M>0$. In the case when $\mathcal{D}$ is an orthonormal system Stechkin [13] proved (see a discussion of this result in [7], § 7.4) that
$$
\begin{equation}
f\in \mathcal A_1(\mathcal D) \quad \text{if and only if}\quad \sum_{v=1}^\infty (v^{1/2}\sigma_v(f,\mathcal D)_2)\frac{1}{v} <\infty.
\end{equation}
\tag{6.1}
$$
A version of (6.1) for the classes $\mathcal{A}_\beta(\mathcal{D})$, $\beta\in (0,2)$, was obtained in [6]:
$$
\begin{equation}
f\in \mathcal A_\beta(\mathcal D) \quad \text{if and only if}\quad \sum_{v=1}^\infty (v^{\alpha}\sigma_v(f,\mathcal D)_H)^\beta\frac{1}{v} <\infty,
\end{equation}
\tag{6.2}
$$
where $\alpha := 1/\beta-1/2$. In particular, (6.2) implies that for $f\in \mathcal{A}_\beta(\mathcal{D})$ and $\beta\in (0,2)$, we have
We recall that relations (6.1)–(6.3) were proved for an orthonormal system $\mathcal{D}$. It is very interesting to note that (6.3) actually holds for a general normalized system $\mathcal{D}$, provided that $\beta \in (0,1]$. In the case $\beta =1$ it was proved by Maurey (see [12]). For $\beta \in (0,1]$ it was proved in [6]. We note that the classes $\mathcal{A}_1(\mathcal{D})$ and their generalizations to the case of Banach spaces play a fundamental role in the study of best $v$-term approximation and the convergence properties of greedy algorithms with respect to general dictionaries $\mathcal{D}$ (see [15]). This fact has encouraged experts to introduce and study function classes defined in terms of some restrictions on the coefficients of the functions’ expansions. We explain this approach in detail.
As above, let the system $\Psi:=\{\psi_\mathbf{k}\}_{\mathbf{k}\in \mathbb{Z}^d}$ be indexed by $\mathbf{k}\in\mathbb{Z}^d$. Consider a sequence of subsets $\mathcal{G}:=\{G_j\}_{j=1}^\infty$, $G_j \subset \mathbb{Z}^d$, $j=1,2,\dots$, such that
For $\beta \in (0,1]$ and $r>0$ consider the following class $\mathbf{A}^r_\beta(\Psi,\mathcal{G})$ of functions $f$ that have representations (6.5) satisfying the conditions
One can also consider the following narrower version $\mathbf{A}^r_\beta(\Psi,\mathcal{G},\infty)$, the class of functions $f$ with representations (6.5) satisfying the condition
The first realization of the idea of the classes $\mathbf{A}^r_\beta(\Psi,\mathcal{G})$ was probably carried out in [16] in the special case when $\Psi$ is the trigonometric system $\mathcal{T}^d := \{e^{i(\mathbf{k},\mathbf{x})}\}_{\mathbf{k}\in\mathbb{Z}^d}$. We now proceed to the definition of the classes $\mathbf{W}^{a,b}_A(\mathcal{T}^d)$ from [16], which correspond to the case $\beta=1$. We introduce the necessary notation. Let $\mathbf s=(s_1,\dots,s_d )$ be a vector whose coordinates are nonnegative integers, and set
The following classes, which are convenient in the studies of sparse approximation with respect to the trigonometric system, were introduced and considered in [16]. For $f\in L_1(\mathbb{T}^d)$ we set
and the classes $\mathbf{W}^{a,b}_A$ with $b=0$ are similar to the classes $\mathbf{A}^a_1(\mathcal{T}^d,\mathcal{G})$ defined above. A more narrow version $\mathbf{A}^a_1(\mathcal{T}^d,\mathcal{G},\infty)$ of these classes was studied recently in [9].
The classes $\mathbf{A}^r_\beta(\Psi)$ considered in this paper correspond to the case of $\mathbf{A}^r_\beta(\Psi,\mathcal{G})$ where
Note that the classes $\mathbf{A}^r_\beta(\mathcal{T}^d)$ are related to the periodic isotropic Nikol’skii classes $H^r_q$. There are several equivalent definitions of the classes $H^r_q$. We give a definition convenient for us. Let $r>0$ and $1\leqslant q\leqslant \infty$. The class $H^r_q$ consists of the periodic functions $f$ of $d$ variables satisfying the conditions
For instance, it is easy to see that in the case when $q=2$ and $r/d >1/2$ the class $H^r_q$ is embedded in $\mathbf{A}^{r-d/2}_1(\mathcal{T}^d)$. However, the class $\mathbf{A}^{r-d/2}_1(\mathcal{T}^d)$ is substantially larger than $H^r_2$. For instance, take $\mathbf{k} \in G_j\setminus G_{j-1}$, where $G_j$ is defined in (6.8). Then the function $g_\mathbf{k} := 2^{-(r-d/2)j}e^{i(\mathbf{k},\mathbf{x})}$ belongs to $\mathbf{A}^{r-d/2}_1(\mathcal{T}^d)$ but does not belong to any $H^{r'}_2$ for $r'> r-d/2$.
Now we give a brief comparison of sampling recovery results for the classes $H^r_q$ and $\mathbf{A}^r_\beta(\mathcal{T}^d)$. Recall the setting of optimal linear recovery. For fixed $m$ and a set of points $\xi:=\{\xi^j\}_{j=1}^m\subset \Omega$, let $\Phi$ be a linear operator from $\mathbb{C}^m$ into $L_p(\Omega,\mu)$. For a class $\mathbf{F}$ (usually, a centrally symmetric and compact subset of $L_p(\Omega,\mu)$; see [14]) set
Clearly, (6.9) implies the same upper bound for $\varrho^{\rm o}_m(H^r_q,L_p)$. The results on lower bounds for $\varrho^{\rm o}_m(\mathcal{T}(\mathbf{N},d)_q,L_p)$ from Lemma 4.3 in [18] and Lemma 4.1 in this paper show that
Relations (6.11)–(6.13) mean that we obtain close bounds for the class $H^r_2$ and the larger class $\mathbf{A}^{r-d/2}_1(\mathcal{T}^d)$, $r>d/2$. Note that for $H^r_2$ we obtain the same bounds for linear sampling recovery. It was proved in [18] that for linear sampling recovery in $\mathbf{A}^r_\beta(\mathcal{T}^d)$ we have
Inequalities (6.14) for $\beta=1$ and (6.13) for $p=2$ demonstrate that nonlinear sampling recovery provides better error guarantees than linear sampling recovery.
Bibliography
1.
J. Bourgain, “An improved estimate in the restricted isometry problem”, Geometric aspects of functional analysis, Lecture Notes in Math., 2116, Springer, Cham, 2014, 65–70
2.
F. Dai, A. Prymak, V. N. Temlyakov and S. Yu. Tikhonov, “Integral norm discretization and related problems”, Russian Math. Surveys, 74:4 (2019), 579–630
3.
F. Dai and V. Temlyakov, “Universal sampling discretization”, Constr. Approx., 58:3 (2023), 589–613
4.
F. Dai and V. Temlyakov, “Random points are good for universal discretization”, J. Math. Anal. Appl., 529:1 (2024), 127570, 28 pp. ; arXiv: 2301.12536
5.
F. Dai and V. Temlyakov, Lebesgue-type inequalities in sparse sampling recovery, arXiv: 2307.04161v1
6.
R. A. DeVore and V. N. Temlyakov, “Some remarks on greedy algorithms”, Adv. Comput. Math., 5:2–3 (1996), 173–187
7.
Dinh Dũng, V. Temlyakov and T. Ullrich, Hyperbolic cross approximation, Adv. Courses Math. CRM Barcelona, Birkhäuser/Springer, Cham, 2018, xi+218 pp.
8.
I. Haviv and O. Regev, “The restricted isometry property of subsampled Fourier matrices”, Geometric aspects of functional analysis, Lecture Notes in Math., 2169, Springer, Cham, 2017, 163–179
9.
T. Jahn, T. Ullrich and F. Voigtlaender, Sampling numbers of smoothness classes via $\ell^1$-minimization, arXiv: 2212.00445v1
10.
B. Kashin, E. Kosov, I. Limonova and V. Temlyakov, “Sampling discretization and related problems”, J. Complexity, 71 (2022), 101653, 55 pp.
11.
E. D. Livshitz and V. N. Temlyakov, “Sparse approximation and recovery by greedy algorithms”, IEEE Trans. Inform. Theory, 60:7 (2014), 3989–4000; arXiv: 1303.3595v1
12.
G. Pisier, “Remarques sur un résultat non publié de B. Maurey”, Seminaire d'analyse fonctionalle, Exp. No. V, v. 1980–1981, École Polytechnique, Centre de Mathématiques, Palaiseau, 1981, 12 pp.
13.
S. B. Stechkin, “On absolute convergence of orthogonal series”, Dokl. Akad. Nauk SSSR, 102 (1955), 37–40 (Russian)
14.
V. N. Temlyakov, “On approximate recovery of functions with bounded mixed derivative”, J. Complexity, 9:1 (1993), 41–59
15.
V. Temlyakov, Greedy approximation, Cambridge Monogr. Appl. Comput. Math., 20, Cambridge Univ. Press, Cambridge, 2011, xiv+418 pp.
16.
V. N. Temlyakov, “Constructive sparse trigonometric approximation and other problems for functions with mixed smoothness”, Sb. Math., 206:11 (2015), 1628–1656; arXiv: 1412.8647v1
17.
V. Temlyakov, Multivariate approximation, Cambridge Monogr. Appl. Comput. Math., 32, Cambridge Univ. Press, Cambridge, 2018, xvi+534 pp.
18.
V. Temlyakov, Sparse sampling recovery by greedy algorithms, arXiv: 2312.13163v2
19.
J. F. Traub, G. W. Wasilkowski and H. Woźniakowski, Information-based complexity, Comput. Sci. Sci. Comput., Academic Press, Inc., Boston, MA, 1988, xiv+523 pp.
Citation:
V. N. Temlyakov, “Sparse sampling recovery in integral norms on some function classes”, Sb. Math., 215:10 (2024), 1406–1425