Abstract:
The paper presents elements of the theory of local extremum in the problem of optimal control with free right end and, in general, uncertain initial position of trajectories on the basis of exact formulae for the increment (variations of infinite order) of the objective functional. Necessary conditions for optimality of ‘feedback’ type are obtained: their formulations involve auxiliary feedback controls which generate program descent controls (in the minimum problem). The conditions proposed in this work provide an alternative to the classical Pontryagin principle (and even improve it in some special cases) and open the way to constructing indirect methods for local search without procedures for adjustment of the parameters of ‘descent depth’.
Bibliography: 26 titles.
Keywords:
optimal control, exact formulae for the increment of the objective functional, feedback necessary conditions of optimality, Pontryagin's maximum principle, continuity equation.
This work elaborates the subject of [1]–[3] (an asymmetric counterpart of Krotov’s sufficient optimality conditions; see [4]); it is devoted to so-called feedback necessary optimality conditions for program controls and to the corresponding methods of descent in the classical problem of dynamic optimization
and in one of its qualitative generalizations, namely, the problem of control for an ensemble of trajectories (§ 3). The first condition in (1.1) is assumed to hold almost everywhere (in the sense of the Lebesgue measure) on the interval $I \doteq [0,1]$; we let $x=x[u]$ denote the Carathéodory solution of the Cauchy problem (1.1) which corresponds to the control $u$.
Assume that a given process $\overline \sigma=(\overline x, \overline u)$, $\overline u \in \mathcal U$, $\overline x=x[\overline u]$, has to be examined for optimality. The conventional necessary conditions for a strong (as well as a Pontryagin or an $L_1$-) extremum (see [5] and [6]) are based on classes of needle-shaped and weak variations of the control $\overline u$, which generate processes $\sigma=(x, u)$ with the property of potential descent with respect to the functional
We adopt the terminology of [1] and say that such variations reject $\overline u$, and we refer to the corresponding processes as comparison processes.
In positional (feedback) necessary conditions the role of rejecting variations is played by feedback controls (as the term ‘feedback’ itself indicates). These variations widen the set of comparison processes by augmenting it with some sliding modes, namely, admissible processes for the convexified problem $(\operatorname{co} \mathrm{P})$. (The consistency of such an extension is based on an approximation of sliding modes by admissible processes for problem $(\mathrm{P})$ and a continuous extension of the objective functional.) Trajectories admitted to comparison with $\overline x$ are curves (in general, inadmissible in $(\mathrm{P})$) generated by feedback controls
where $\overline \varphi \colon I \times \mathbb{R}^n \to \mathbb{R}$ is a sufficiently regular weakly nonincreasing solution of the boundary problem for the Hamilton–Jacobi inequality (a function $\varphi\colon I \times \mathbb{R}^n \to \mathbb{R}$, $(t,\mathrm x) \mapsto \varphi_t(\mathrm x)$, is said to satisfy some property of monotonicity weakly with respect to the control system (1.1), (1.2) if the corresponding property is exhibited by its composition $\varphi\circ x \colon I\to \mathbb{R}$, $t \mapsto \varphi_t(x(t))$, with at least one admissible trajectory $x$)
associated with $\overline \sigma$ in one way or another (in the general case the boundary condition can contain an arbitrary majorant of the function $\ell$); $\nabla_{\mathrm x}$ is the gradient operator with respect to the variable $\mathrm x \in \mathbb{R}^n$, and the dot denotes the scalar product of two vectors.
Once $\overline \varphi$ has been defined, the strategy (1.4) of control for the system that delivers a minimum to (1.5) gives us a formal structure of control for descent from $\overline \sigma$, as in dynamic programming. Let us put aside for a while the (nontrivial) aspect of the implementation of this strategy and discuss the problem of the selection of the function $\overline \varphi$, which was called in [1] the majorant of the objective functional at the point $\overline \sigma$. Among all solutions of inequality (1.5) it is reasonable to select one of those for which the construction (1.4) provides the deepest descent. It should be noted that in general conditions (1.5) themselves do not guarantee any descent at all: it can occur (see Example 2 in § 8.3) that for a badly chosen $\overline \varphi$ all curves $x=x[w]$ synthesized by the rule (1.4) are definitely ‘worse’ than $\overline x$, that is, $\ell(x(1)) > \ell(\overline x(1))$.
The problem of the construction of adequate majorants in the nonlinear problem $(\mathrm{P})$ was qualified in [3] as an open problem of the theory. (The ideal majorant is, of course, the Bellman function, but finding it is much harder than solving problem $(\mathrm{P})$ itself.) It is a common practice to consider linear or quadratic (in the state variable) functions generated by constructions of Pontryagin’s maximum principle (PMP) and second-order conditions (solutions of the conjugate system, matrix impulses of Gabasov, and solutions of Riccati type). These functions do indeed generate controls for descent in the corresponding particular classes of problems, linear and linear-quadratic in the state variable; however, in the general case their use has a heuristic nature.
In this work we propose a universal (and fairly simple) class of nonlinear majorants in problem $(\mathrm{P})$: the sought-for majorant is the function $\overline p\colon I \times \mathbb{R}^n \to \mathbb{R}$ which takes a constant value along the flow $\overline X_{t,1}$ of the vector field $(t,\mathrm x)\mapsto f(\mathrm x,\overline u(t))$ on the interval $[t,1]$ and coincides with $\ell$ at the terminal instant. In other words, the majorant is defined by the relation $\overline p_t(\mathrm x)\doteq\ell(\overline X_{t,1}(\mathrm x))$ for all $(t,\mathrm x)\in I\times \mathbb{R}^n$.
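Since this majorant requires no PDE solve, it can be evaluated pointwise by ODE integration. Below is a minimal numerical sketch (assuming SciPy; the scalar dynamics, reference control and terminal cost are placeholder choices, not taken from the paper):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Placeholder data: scalar dynamics f(x, v) = v * x, reference control
# u_bar(t) = 1, terminal cost l(x) = x**2.
f = lambda x, v: v * x
u_bar = lambda t: 1.0
ell = lambda x: x ** 2

def p_bar(t, x):
    """Majorant p_bar_t(x) = l(X_bar_{t,1}(x)): move x along the reference
    flow from time t to time 1, then apply the terminal cost."""
    if t >= 1.0:
        return ell(x)  # boundary condition p_bar_1 = l in (1.6)
    z = solve_ivp(lambda s, w: f(w, u_bar(s)), (t, 1.0), [x], rtol=1e-10)
    return ell(z.y[0, -1])

# p_bar is 'weakly constant': constant along the reference trajectory.
y = 0.5
xbar = solve_ivp(lambda s, w: f(w, u_bar(s)), (0.0, 1.0), [y],
                 dense_output=True, rtol=1e-10).sol
vals = [p_bar(t, xbar(t)[0]) for t in np.linspace(0.0, 1.0, 6)]
print(max(vals) - min(vals))  # ~0; the common value is l(x_bar(1))
```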
Note that under the standard regularity assumptions on the vector field $f$ the mapping $(t,\mathrm x) \mapsto \overline X_{t,1}(\mathrm x)$, as well as the majorant $\overline p$, is absolutely continuous in $t$ and continuously differentiable in $\mathrm x$. In particular, for each $\mathrm x\in \mathbb{R}^n$ we have the equalities (see § 3.1)
$$
\begin{equation}
\partial_t \overline p_t(\mathrm x)+\nabla_{\mathrm x} \overline p_t(\mathrm x) \cdot f(\mathrm x,\overline u(t))=0 \quad\text{for almost all } t \in I; \qquad \overline p_1(\mathrm x)=\ell(\mathrm x).
\end{equation}
\tag{1.6}
$$
Below we interpret the transport equation in this ‘pointwise’ sense. Note at the same time that the function $\overline p$ is also the unique weak solution (that is, a solution in the sense of distributions) of problem (1.6); see [7].
The majorant constructed above possesses all required properties. First, it is weakly monotone, namely, ‘weakly constant’. Second, $\overline p$ satisfies the Hamilton–Jacobi inequality (1.5):
at all points of $I \times \mathbb{R}^n$ where the derivative $\partial_t \overline p_t(\mathrm x)$ exists. Finally, the corresponding feedback strategies
generate (at least one) trajectory (of a sliding mode) satisfying (1.3). We establish this result in § 7. This fact is obvious under the assumption that some implementation of the synthesis of $w$ gives a program control $u \in \mathcal U$ and the corresponding solution $x=x[u]$ for which
Then $({d}/{dt})\overline p_t(x(t)) \leqslant 0$ and $\ell(x(1))=\overline p_1(x(1)) \leqslant \overline p_0(\mathrm y)=\overline p_1(\overline x(1))= \ell(\overline x(1))$. Here the first and last equalities follow from the boundary condition (1.6), the inequality is due to the nonincreasing behaviour of the composition $\overline p\circ x$, and the second equality is a consequence of the fact that $\overline p$ is constant along $\overline x$.
Below we will see that the function $\overline p$ is a solution of the boundary problem (1.5) for which the cost of the process synthesized by the rule (1.4) is certainly at most $\mathcal I[\overline u]=\ell(\overline x(1))$. This will be established as a consequence of an exact formula for the increment of the functional, which generalizes such formulae for the linear and linear-quadratic settings (see [8]).
By $\mathcal P(\mathbb R^s)$ we denote the set of all probability measures on $\mathbb R^s$ and by $\mathcal P_c(\mathbb{R}^n)$ the subset of measures with compact support. The latter set can be endowed with the structure of a metric space by equipping it with the Kantorovich ${\mathrm p}$-metric $W_{\mathrm p}$, ${\mathrm p}\geqslant 1$ (see [9]) (this space is not complete, but it is dense in any complete space $(\mathcal P_{\mathrm p}(\mathbb{R}^n), W_{\mathrm p})$, where $\mathcal P_{\mathrm p}(\mathbb{R}^n)$ is the subset of $\mathcal P(\mathbb{R}^n)$ consisting of all measures with finite $\mathrm p$th moment: see [10]). It is always assumed below that $\mathcal P_c(\mathbb{R}^n)$ denotes the metric space $(\mathcal P_c(\mathbb{R}^n), W_1)$ without further qualification.
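For orientation, in dimension one the $W_1$-distance between two empirical (hence compactly supported) measures can be computed directly; a tiny illustration assuming SciPy's `wasserstein_distance` (an implementation detail external to the paper):

```python
import numpy as np
from scipy.stats import wasserstein_distance

# W_1 between the empirical measures of two samples on the real line;
# for a measure translated by a shift, W_1 equals the size of the shift.
rng = np.random.default_rng(0)
a = rng.uniform(0.0, 1.0, size=10_000)   # sample of U[0, 1]
b = a + 0.5                              # the same measure shifted by 0.5
print(wasserstein_distance(a, b))        # = 0.5 (exactly, for a pure shift)
```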
We distinguish two Borel measures on $\mathbb{R}^s$, namely, $\mathcal L^s$ is the classical Lebesgue measure and $\delta_{\mathrm x}$ is the atomic Dirac measure concentrated at a point $\mathrm x \in \mathbb{R}^s$. The abbreviation ‘a.e.’ stands for ‘almost everywhere’ or for ‘almost all’ points of the corresponding set with respect to the measure indicated. When the measure is not indicated, it is assumed to be the one-dimensional Lebesgue measure $\mathcal L^1$.
Given a norm $|\cdot|$ on $\mathbb{R}^n$, we denote the matrix norm consistent with it by the same symbol. Finally, $D_{\mathrm x} V$ is the matrix of partial derivatives $(\partial_{\mathrm x_i} V^j)$ of the vector field $V\colon \mathbb{R}^n \to \mathbb{R}^n$, where the $V^j$ are components of $V$.
In this work the following basic assumptions are supposed to hold.
Remark 1. We restrict our considerations to the autonomous case, since the approach elaborated here involves (in § 6) elements of the theory developed by Krasovskii and Subbotin [11], in which the continuity of the vector field with respect to the variable $t$ plays an essential role. At the same time the case when the dependence mentioned above is sufficiently regular reduces to the autonomous case by extending the phase space.
§ 2. Variational approach. Bilinear problem
The method of the derivation of feedback necessary conditions described above does not emphasize their variational character. However, the condition corresponding to the feedback variation (1.7) can be derived in a classical way, by considering the increment of the functional. On the whole this approach reproduces the standard algorithm for deriving the PMP; it is most demonstrative in the case of a bilinear problem.
For the sake of simplicity let us make an additional assumption:
$(\mathbf A_4)$ the mapping $\upsilon \mapsto f(\mathrm x, \upsilon)$ is affine and the set $U$ is convex.
Recall that in this case the operator $u \mapsto x[u]$ is continuous as a function $L_\infty(I; \mathbb{R}^m)\to C(I; \mathbb{R}^n)$, where $L_\infty$ is endowed with the weak* topology $\sigma(L_\infty, L_1)$ of duality to $L_1$. The way the main results can be carried over to the general case (sliding modes of control) is discussed in § 7.
In the PMP theory the canonical class of rejecting controls is formed by needle-shaped variations, however, in the $u$-affine case it is sufficient to restrict the consideration to so-called weak variations:
in a neighborhood of the origin (the first variation of the functional $\mathcal I$, the Gateaux derivative in the direction $u-\overline u$) can be represented in the form
where $H(\mathrm x, {\mathrm p}, \upsilon) \doteq {\mathrm p} \cdot f(\mathrm x, \upsilon)$ is the classical Pontryagin function and $\overline \psi=\psi[\overline u]$ is the solution of the conjugate problem
which guarantees that for sufficiently small values of $\lambda$ controls of the form (2.1) are ‘not worse than’ $\overline u$. In combination with a certain method for adjusting the parameter $\lambda$ (the method of linear descent) formulae (2.1) and (2.5) describe an algorithm of consecutive approximations in problem $(\mathrm{P})$, an analogue of the method of gradient descent (see, for example, [8]). The condition of the unimprovability of the control $\overline u$ and the halting criterion is, obviously, the validity of the relation
which determines the classical Pontryagin principle (in the form of the minimum principle).
It is obvious that if the data of the problem are sufficiently regular, then one can go even further and consider the Taylor series expansion of the function $\lambda \mapsto \Delta_{\overline u^\lambda} \mathcal I[\overline u]$ by taking the second and higher variations (a common practice is not to go beyond the second variation). This gives necessary conditions of higher order and more sophisticated methods for local search.
It is known (see [8]) that in the particular case when problem $(\mathrm{P})$ has a linear-quadratic structure there is an analogue of representation (2.2), (2.3) without remainder terms, hence, without any parameters adjusting the closeness of the controls $\overline u$ and $u$. Such a representation can be considered an ‘infinite-order’ variation of $\mathcal I$. Moreover, if the functions $\mathrm x \mapsto f(\mathrm x, \upsilon)$ and $\mathrm x \mapsto \ell(\mathrm x)$ are linear (that is, the problem is linear in state), then this representation is fully expressed in terms of standard constructions of the PMP (it is easily derived from the relation $\Delta_u \mathcal I[\overline u]=\overline\varphi_1(x(1))$, where $\overline \varphi_t(\mathrm x)=\overline \psi(t) \cdot (\mathrm x-\overline x(t))$):
Let us compare (2.7) and (2.3). The only difference between them is that now the first argument is a point on the ‘new’ trajectory $x=x[u]$ rather than on the reference trajectory. As above, one can provide the inequality $\Delta_u \mathcal I[\overline u]\leqslant 0$ by choosing $u$ so as to satisfy the condition of pointwise minimization of the integrand
However, in contrast to the pointwise condition (2.5), which gives us a control in an explicit form, the operator equation (2.8) determines the function $u$ implicitly. If the control $\overline u$ is not extremal, then at first sight it is by no means obvious that there exist any solutions of this equation in the class $\mathcal U$.
One could act in the following way: first, distinguish a ‘control construction’ in the form of the feedback $w(t, \mathrm x)$ as a solution of the problem
and, finally, if the last system has a Carathéodory solution $x$, put $u(t)=w(t, x(t))$. This new program control would provide an optimality criterion for $\overline u$ which has the same form as (2.6):
This is what a ‘feedback analogue’ of Pontryagin’s principle might look like.
Let us emphasize once again that, in contrast to the PMP, which involves only the process $\overline \sigma$ to be tested for optimality, the formulation of the feedback condition assumes that there exists an additional comparison process $\sigma=(x=x[u], u)$ which is not supposed to be close to $\overline \sigma$ either in control or in trajectory.
Of course, in the case of general position the function $\mathrm x \mapsto w(t, \mathrm x)$ is not continuous, the system (2.9) has no Carathéodory solutions, and nonclassical solutions (such as Krasovskii–Subbotin motions) are not generated by any control $u \in \mathcal U$ (for a nonconvex $U$ or a nonconvex set $f(x, U) \doteq \{f(x, \upsilon)\mid\upsilon \in U\}$). Moreover, formula (2.7) itself is valid only in the case when problem $(\mathrm{P})$ is linear in state.
To carry over the idea presented above to the general nonlinear case it is convenient to embed the classical problem $(\mathrm{P})$ into a weaker statement which is linear in the corresponding state variable, namely, the problem of control for an ensemble of trajectories (on the metric space of probability measures). This relaxation is discussed in § 3. The linearity of the weakened problem allows us to involve elements of the duality theory (§ 4). In § 5 we derive two symmetric exact formulae for the increment of the objective functional in the transformed problem (and, as a corollary, in $(\mathrm{P})$), which are similar to Weierstrass’ classical formula in the calculus of variations (see [12]). As a corollary of these formulae, we obtain a series of necessary optimality conditions similar to (2.10): in § 6 the corresponding conditions are derived in the case of affine dependence on $u$ and a convex set $U$; this result is generalized in § 7 to the general statement by going over to sliding modes of control. Section 8 is devoted to the discussion of the status of the necessary conditions obtained in this work among similar results, in particular, their relation to Pontryagin’s principle. In § 9 we formulate the method of descent along the functional, which involves the extremal construction of feedback controls (1.7), and establish its convergence. Sections 10 and 11 contain necessary technical results.
§ 3. Relaxation
Let us show that a representation similar to (2.7) is valid for problem $(\mathrm{P})$ of the general form. Below we use the notation $f_{\upsilon}(\mathrm x) \doteq f(\mathrm x, \upsilon)$ and $\displaystyle\int \doteq \displaystyle\int_{\mathbb{R}^n}$.
3.1. Flows of vector fields. The transport equation
We start by recalling some necessary facts. Suppose that the assumptions $(\mathbf A_2)$ and $(\mathbf A_3)$ are satisfied. Then for each $u \in \mathcal U$ the function $(t, \mathrm x)\mapsto f_{u(t)}(\mathrm x)$ generates the mapping $X=X[u]\colon (s, t, \mathrm x)\mapsto X_{s,t}(\mathrm x)$, which is called the flow of the nonautonomous vector field $f$. Here $t \mapsto X_{s,t}(\mathrm x)$ is the solution of the Cauchy problem
For all $s,t \in \mathbb{R}$ the mapping $X_{s,t}\colon \mathrm x\mapsto X_{s,t}(\mathrm x)$ is a $C^1$-diffeomorphism $\mathbb{R}^n\to \mathbb{R}^n$ with the property $X_{\tau,t} \circ X_{s,\tau}=X_{s,t}$ for all $ s, \tau, t \in \mathbb{R}$. These facts, in particular, imply the invertibility of $X_{s,t}$ and the relation $(X_{s,t})^{-1}=X_{t,s}$.
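These group properties are easy to test numerically; a minimal check (assuming SciPy; the pendulum field below stands in for $f_{u(t)}$ with the control frozen, a placeholder choice):

```python
import numpy as np
from scipy.integrate import solve_ivp

# X_{s,t} maps a state at time s to the state at time t along the field.
field = lambda t, x: np.array([x[1], -np.sin(x[0])])  # placeholder f_{u(t)}

def flow(s, t, x):
    return solve_ivp(field, (s, t), np.asarray(x, float),
                     rtol=1e-11, atol=1e-12).y[:, -1]

x = np.array([0.3, -0.2])
# Semigroup property X_{tau,t} o X_{s,tau} = X_{s,t}:
print(np.linalg.norm(flow(0.5, 1.0, flow(0.0, 0.5, x)) - flow(0.0, 1.0, x)))
# Inverse flow (X_{s,t})^{-1} = X_{t,s}: integrate backward in time.
print(np.linalg.norm(flow(1.0, 0.0, flow(0.0, 1.0, x)) - x))  # ~1e-10
```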
Fixing some $s$ we introduce the shorthand notation $P_t=X_{s,t}$ and $Q_t=X_{t,s}$. Clearly,
where $\mathrm{id}$ denotes the identity mapping $\mathbb{R}^n \to \mathbb{R}^n$. Since the expression in parentheses vanishes for all values $\mathrm z=P_{t}(\mathrm x)$ and the mapping $\mathrm x \mapsto P_{t}(\mathrm x)$, $\mathbb{R}^n \to \mathbb{R}^n$, is bijective for each $t \in I$, it can be concluded that $t \mapsto Q_t$ satisfies on $I \times \mathbb{R}^n$ the conditions
which are treated in the same pointwise sense as equation (1.6). This yields a useful expression for the derivative of the flow with respect to the first index:
where $s \mapsto J_{t,s}[u](\mathrm x)$ for each $\mathrm x\in \mathbb{R}^n$ is a solution (see [13], Theorem 2.3.2) of the Cauchy problem for the linear matrix equation
$E=E_n$ denotes the identity matrix of size $n\times n$. Let $\xi \in C^1(\mathbb{R}^n; \mathbb{R})$. Then it turns out that the function $p\doteq \xi \circ Q$ is a solution (in the sense indicated above; this is easily verified by direct differentiation if one puts the equality $p_t=\xi \circ Q_t$ into the equivalent form $p_t \circ P_{t}=\xi$) of the nonconservative transport equation
3.2. The problem of control for a (statistical) ensemble of trajectories
Note that the phase space $\mathbb{R}^n$ in problem $(\mathrm{P})$ can be endowed with the natural structure of a probability space $(\mathbb{R}^n, \mathcal F, \mathbb P)$ by introducing the canonical probability measure $\mathbb P=\delta_\mathrm y$ (for this choice of $\mathbb P$ the particular choice of the $\sigma$-algebra $\mathcal F$ obviously does not matter). In this case the function $t \mapsto X_{t}(\mathrm x) \doteq X_{0,t}[u](\mathrm x) \doteq x[u](t)$ must be treated as a (deterministic) random process. For each $t \in I$ the distribution of the random variable $\mathrm x\mapsto X_{t}(\mathrm x)$ is determined by the probability measure $\mu_t \doteq (X_{t})_\sharp \delta_{\mathrm y}=\delta_{x[u](t)} \in \mathcal P(\mathbb{R}^n)$. It is well known that under the standard regularity assumptions the function $\mu=\mu[u]\colon t \mapsto \mu_t$, which describes the behaviour of this measure in time, is a weak solution (see the definition below) of the linear partial differential equation
This formal equation is a direct generalization of the classical continuity equation to the case of arbitrary probability (or nonnegative) measures. If $\mathbb P=\delta_\mathrm y$, then it is equivalent to (its characteristic) ordinary differential equation (1.1): the only weak solution of the continuity equation on $I$ with the initial condition $\mu_0=\delta_\mathrm y$ for a control $u \in \mathcal U$ is the curve $t \mapsto \delta_{x[u](t)}$.
In turn, the quality criterion for the problem $(\mathrm{P})$ can be formulated in terms of the linear mapping $\mathcal P(\mathbb{R}^n) \to \mathbb{R}$,
the minimum of which over all $\mu=\mu[u]$, $u \in \mathcal U$, coincides with the value of $(\mathrm{P})$. Thus, we arrive at an equivalent statement of the original control problem, which is now linear in the new state variable.
Now we put aside the particular choice of the probability structure $(\mathcal F, \mathbb P)$ and consider the extremal problem
$$
\begin{equation}
(\mathrm{RP})\colon\qquad \mathcal J[u] \doteq \langle \ell, \mu_1[u]\rangle \to \min, \qquad u \in \mathcal U.
\end{equation}
\notag
$$
Here the role of states is played by the probability measures $\mu_t \in \mathcal P(\mathbb{R}^n)$ on the phase space of problem $(\mathrm{P})$; the initial distribution $\vartheta \in \mathcal P(\mathbb{R}^n)$ is specified. The class $\mathcal U$ of admissible controls remains the same. Assumptions $(\mathbf A_1)$–$(\mathbf A_4)$ are still supposed to be fulfilled, along with the additional assumption
$(\mathbf A_5)$ the measure $\vartheta$ has a compact support ($\vartheta \in \mathcal P_c(\mathbb{R}^n)$).
A weak solution of equation (3.6) is a function $\mu \in C(I; \mathcal P(\mathbb{R}^n))$ for which the Newton–Leibniz formula holds:
Remark 2. We adopt the nonclassical definition of a weak solution (with a wider class of test functions), which is here equivalent to the classical one; see [9], Remark 2.5 and Lemma 2.6.
It is known (see [9]) that under the assumptions $(\mathbf A_2)$, $(\mathbf A_3)$ and $\vartheta \in \mathcal P_1(\mathbb{R}^n)$ (in particular, $(\mathbf A_5)$) the weak solution of the Cauchy problem (3.6), (3.7) does exist, is unique and admits the representation
Here the operator $F_\sharp\colon \mathcal P(\mathbb{R}^n) \to \mathcal P(\mathbb{R}^n)$ defines the image of the measure under the action of the Borel measurable vector field $F\colon \mathbb{R}^n \to \mathbb{R}^n$:
Remark 3. Under assumption $(\mathbf A_5)$ the family of measures (3.9) satisfies condition (3.8) for an arbitrary $\varphi \in C^1(I\times \mathbb{R}^n)$ ($\varphi$ need not have compact support with respect to $\mathrm x$).
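In particle form the representation (3.9) is straightforward: sample the initial measure and push each sample through the flow. A minimal sketch (assuming SciPy; the linear field and the initial sample are placeholders, and a Gaussian sample only approximates the compact-support assumption $(\mathbf A_5)$):

```python
import numpy as np
from scipy.integrate import solve_ivp

f = lambda t, x: -x                     # placeholder field f(x, u(t)) = -x
rng = np.random.default_rng(1)
theta = rng.normal(0.0, 1.0, 500)       # sample of the initial measure

def push(t, xs):
    """Empirical version of mu_t = (X_{0,t})_# theta."""
    return np.array([solve_ivp(f, (0.0, t), [x0], rtol=1e-9).y[0, -1]
                     for x0 in xs])

mu1 = push(1.0, theta)
# <l, mu_1> for l(x) = x**2: here X_{0,1}(x) = x * e^{-1}, so the exact
# value is e^{-2} * (second moment of theta) ~ 0.1353.
print(np.mean(mu1 ** 2))
```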
The setup $(\mathrm{RP})$ is called the ensemble control problem. It generalizes problem $(\mathrm{P})$ to the case of an uncertain initial state (more realistic from the standpoint of applications). Although the variational analysis of the original problem on the basis of exact formulae for the increment can be performed directly, the approach proposed in our work can almost literally be carried over to the generalized model, and it is reasonable (and is even simpler in a certain sense) to present it in terms of the latter. Moreover, the main advantage of this approach — the absence of variation parameters — is most pronounced in problems of control for distributed systems, in which Pontryagin’s principle is formulated in terms of the conjugate partial differential equation (see [14], Theorem 2) and the ‘computational cost’ of the classical and feedback optimality condition is almost the same.
Note that under assumptions $(\mathbf A_1)$–$(\mathbf A_4)$ the minimum in $(\mathrm{RP})$ is attained in the class $\mathcal U$ of admissible controls: see [15], Theorem 3.2.
The reader familiar with geometric control theory can draw a parallel between the setup $(\mathrm{RP})$ and the formalism of chronological calculus [16], [17], in which the probability structure is replaced with the algebraic one.
§ 4. Duality
Let $\xi \in C^1(\mathbb{R}^n;\mathbb{R})$ and $u \in \mathcal U$ be fixed, and let $X=X[u]$ be the flow of the system (1.1) corresponding to the control $u$. Consider the function $p=p[u]$: ${[s,1] \times \mathbb{R}^n \to \mathbb{R}}$,
As mentioned above, this function is a solution of the Cauchy problem (3.5). It is easily seen that the action of the measure $\mu_t$ on $p_t$ does not depend on $t$:
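$$
\langle p_t, \mu_t\rangle=\int \xi\circ X_{t,s}\,d\bigl((X_{0,t})_\sharp \vartheta\bigr) =\int \xi\circ X_{t,s}\circ X_{0,t}\,d\vartheta =\int \xi\circ X_{0,s}\,d\vartheta
$$
(by the representation (3.9), the second equality is the change of variables under the pushforward, and the third one is the semigroup property $X_{t,s}\circ X_{0,t}=X_{0,s}$),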
which allows us to consider the trajectory $p$ as the conjugate of $\mu$. Using this we can get rid of the variable $\mu$ and reformulate problem $(\mathrm{RP})$ in terms of the variable $p$. Indeed, putting $s=1$ and $\xi=\ell$ we obtain
which is equivalent to the original (classical nonlinear) problem $(\mathrm{P})$. This result is closely related to Theorem 2.1 in [18] (see also [19]) and it can be looked upon as a generalization of relations of nonclassical duality (Remark 4 in § 5; see [20]).
§ 5. Exact formulae for the increment
5.1. The increment of the functional of the weakened problem
Problem $(\mathrm{RP})$ admits an exact representation of the increment of the functional which is similar to (2.7).
Proposition 1. Suppose that conditions $(\mathbf A_1)$–$(\mathbf A_3)$ hold. Let $u, \overline u \in \mathcal U$, $u\neq \overline u$, be arbitrary controls, $X=X[u]$ and $\overline X=X[\overline u]$ be the corresponding flows of the characteristic system (1.1) and $\mu=\mu[u]$ be the solution of equation (3.6) corresponding to the control $u$.
Then the increment $\Delta_{u} \mathcal J[\overline u] \doteq \mathcal J[u]-\mathcal J[\overline u]$ of the objective functional of problem $(\mathrm{RP})$ can be represented in the form
$H$ is the classical Pontryagin function and $\overline p$ is defined by condition (4.1) for $\xi= \ell$, $s=1$, and $u=\overline u$; the matrix $\overline J_t \doteq \overline J_{t,1} \doteq D_x \overline X_{t,1}$ is the solution of the Cauchy problem (3.4) at the instant $s=1$ for $u=\overline u$.
Proof. The proof of Proposition 1 is based on elementary facts from calculus and the theory of ordinary differential equations.
(1) Let us show that the function $s \mapsto \ell \circ \overline X_{s,1} \circ X_{0,s}(\mathrm x)$, $I \to \mathbb{R}$, is Lipschitz continuous for each $\mathrm x \in \mathbb{R}^n$. Consider the orbit $\mathcal O_{I}(\mathrm x)=\{X_{0,1}[\omega_s[u,\overline u]](\mathrm x)\mid s \in I\}$ of the point $\mathrm x$ under the mapping $s \mapsto X_{0,1}[\omega_s[u,\overline u]](\mathrm x) \doteq \overline X_{s,1} \circ X_{0,s}(\mathrm x)$, $I \to \mathbb{R}^n$, where
$$
\begin{equation}
\omega_s[u,v] \doteq \begin{cases} u & \text{on }[0,s), \\ v & \text{on }[s,1]. \end{cases}
\end{equation}
\tag{5.3}
$$
The standard arguments based on the Grönwall–Bellman inequality show that under assumptions $(\mathbf A_2)$ and $(\mathbf A_3)$ the set $\mathcal O_{I}(\mathrm x)$ is bounded. Then its closure $\operatorname{cl} \mathcal O_{I}(\mathrm x)$ is a compact subset of $\mathbb{R}^n$. It follows from assumption $(\mathbf A_1)$ that the function $\ell$ is locally Lipschitz continuous, hence (in view of the local compactness of $\mathbb{R}^n$) its restriction to $\operatorname{cl} \mathcal O_{I}(\mathrm x)$ is Lipschitz continuous. Now the required fact follows from the Lipschitz continuity of the functions $t \mapsto X_{0,t}(\mathrm x)$ and $t \mapsto \overline X_{t,1}(\mathrm x)$ with regard to the uniform (in $t$) local Lipschitz continuity of the function $\mathrm x \mapsto \overline X_{t,1}(\mathrm x)$ (Lemma 1):
where $L_1=\operatorname{Lip}(\ell; \operatorname{cl} \mathcal O_{I}(\mathrm x))$ is the Lipschitz constant of the objective function $\ell$ on $\operatorname{cl} \mathcal O_{I}(\mathrm x)$, $L_2= \operatorname{Lip}(\overline X_{t,1}(\,\cdot\,); \{ X_{0,t}(\mathrm x)\mid t \in I\})$ is the Lipschitz constant of $\mathrm x \mapsto \overline X_{t,1}(\mathrm x)$ on the phase portrait $\{ X_{0,t}(\mathrm x)\mid t \in I\}$, $L_3=\operatorname{Lip}(X_{0, \cdot}(\mathrm x); I)$ and $L_4=\max_{s \in I}\operatorname{Lip}(\overline X_{\cdot, 1}(X_{0,s}(\mathrm x)); I)$.
(2) With regard to the definition (3.10) and the equalities $X_{s,s}=\overline X_{s,s}=\mathrm{id}$ for all $ s$ we represent the increment of the functional in the form
As the mapping $t \mapsto \ell \circ \overline X_{t,1} \circ X_{0,t}(\mathrm x)$ is absolutely continuous, one can extend the chain of equalities and convert the last difference with the use of the Newton–Leibniz formula:
By the semigroup property of the flow $\overline X$ the quantity $\ell \circ \overline X_{t,1} \circ \overline X_{0,t}=\ell \circ \overline X_{0,1}$ does not depend on $t$. Consequently,
Introducing the notation $\overline J_t \doteq D_{\mathrm x} \overline X_{t,1}$ and taking the expression (3.3) (for $s=1$ and $u=\overline u$) into account gives
To finish the proof it remains to apply Fubini’s theorem and take representation (3.9) into account.
The proof of the proposition is complete.
5.2. A ‘direct’ formula for the increment in problem $(\mathrm{P})$
We refine Proposition 1 for the original setup $(\mathrm{P})$. Putting $\vartheta= \delta_\mathrm y$ (which yields the equality $\mu_t[u]=\delta_{x[u](t)}$ for each $t \in I$) we obtain $\mathcal J[u]=\mathcal I[u]$. Then (5.1) takes the form
Remark 4. It is easily seen that in problem $(\mathrm{P})$, which is linear in state, the composition $\nabla_{\mathrm x} \overline p \circ \overline x$ coincides on $I$ with the reference cotrajectory $\overline \psi \doteq \psi[\overline u]$. This follows from the representation
where $t \mapsto \overline J_{t,1}^*$ is the fundamental matrix solution in reverse time of the equation in (2.4) (here $\overline J_{t,1}$ does not depend on $\mathrm x$). In this case formula (5.1) reduces to (2.7).
In the nonlinear case the equality $\nabla_{\mathrm x} \overline p \circ \overline x=\overline \psi$ holds under an additional regularity assumption:
$(\mathbf A_6)$ the function $\ell$ is twice continuously differentiable, as also is the function $\mathrm x \mapsto f(\mathrm x, \upsilon)$ for each $\upsilon \in U$.
This fact is established by direct differentiation of the function $t \mapsto \nabla_{\mathrm x} \overline p_t(\overline x(t))$.
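For a sanity check of the direct formula one can compare both sides numerically. The sketch below (assuming SciPy; all data are placeholder choices) uses the integrand $\overline{\mathrm H}_t(x(t),u(t))-\overline{\mathrm H}_t(x(t),\overline u(t))$ with $\overline{\mathrm H}_t(\mathrm x,\upsilon)=\nabla_{\mathrm x}\overline p_t(\mathrm x)\cdot f(\mathrm x,\upsilon)$, the form of the integrand of (5.6) suggested by (6.1) and the transport equation (1.6); this reconstruction, like the data, is an assumption of the illustration:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Placeholder data: f(x, v) = v * x, l(x) = x**2, u_bar = 1, u = -1, y = 0.5.
f, df = (lambda x, v: v * x), (lambda x, v: v)     # dynamics and D_x f
ell, dell = (lambda x: x ** 2), (lambda x: 2 * x)  # cost and its gradient
y, u_bar, u = 0.5, 1.0, -1.0

def grad_p_bar(t, x):
    """nabla_x p_bar_t(x) = J* grad l(X_bar_{t,1}(x)); J solves (3.4)."""
    if t >= 1.0:
        return dell(x)
    z = solve_ivp(lambda s, w: [f(w[0], u_bar), df(w[0], u_bar) * w[1]],
                  (t, 1.0), [x, 1.0], rtol=1e-11).y[:, -1]
    return z[1] * dell(z[0])

sol = lambda v: solve_ivp(lambda t, w: [f(w[0], v)], (0.0, 1.0), [y],
                          dense_output=True, rtol=1e-11).sol
x_new, x_old = sol(u), sol(u_bar)

lhs = ell(x_new(1.0)[0]) - ell(x_old(1.0)[0])      # the true increment
ts = np.linspace(0.0, 1.0, 2001)                   # quadrature grid
vals = [grad_p_bar(t, x_new(t)[0]) * (f(x_new(t)[0], u) - f(x_new(t)[0], u_bar))
        for t in ts]
print(lhs, np.trapz(vals, ts))  # the two numbers agree to ~1e-6
```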
5.3. The dual formula for the increment
Renaming $\overline u \to u$, we obtain a ‘dual’ representation for the increment of the functional in problem $(\mathrm{RP})$:
where $\mathrm H_t(\mathrm x, \upsilon) \doteq H(\mathrm x, \xi_t(\mathrm x), \upsilon)$, $\xi_t=\nabla_{\mathrm x}p_t[u]\doteq J_{t}^*\nabla \ell \circ X_{t,1}$ and $p=p[u]$ is defined by condition (4.1) for $s=1$ and the control $u$ ($X=X[u]$ is the corresponding flow of (1.1) and $t \mapsto J_t \doteq D_{\mathrm x} X_{t,1}$ is a solution of (3.4)). Refining this representation for the original problem $(\mathrm{P})$ we obtain an exact formula for the increment:
Let $u$ and $\overline{u}$ be arbitrary admissible controls, and $X$ and $\overline{X}$ be the corresponding flows. For each $s\in [0,1]$ consider the intermediate control $\omega_s[u,\overline u] \in \mathcal U$ defined by equality (5.3) and the flow $X^s$ of the system (1.1) generated by this control. Note that $X^s_{0,1}=\overline X_{s,1}\circ X_{0,s}$. It is obvious that the function $\gamma\colon I \to \mathbb{R}^n$,
$$
\begin{equation}
\gamma(s)=X^s_{0,1}(\mathrm y), \qquad s \in I,
\end{equation}
\tag{5.9}
$$
specifies a parametrization of a curve on the attainability set $\mathcal D_1(\mathrm y)$ of the system (1.1), (1.2). This curve joins the points $\overline{X}_{0,1}(\mathrm y)\doteq \overline x(1) \doteq x[\overline u](1)$ and $X_{0,1}(\mathrm y)\doteq x(1)\doteq x[u](1)$ (Figure 1). Recall (formula (4.1)) that $\overline p_{s}=\ell\circ \overline{X}_{s,1}$. Hence $\overline p_s(\mathrm x)$ is the cost of the reference control $\overline u$ in problem $(\mathrm{P})$ restricted to the interval of time $[s, 1]$ with the initial condition $x(s)=\mathrm x$. Now assume that $u$ takes the initial state $x(0)=\mathrm y$ to a point $\mathrm x$ in time $s$, that is, $\mathrm x=X_{0,s}(\mathrm y)$. Then $\overline p_s(\mathrm x)$ is the cost of the intermediate control $\omega_s[u, \overline u]$ in $(\mathrm{P})$. A small variation $s \to s+\Delta t$ of the moment of ‘switching’ between the controls $u$ and $\overline u$ has the cost
It remains to recall that $\overline p_{s}(\mathrm x)=\ell(X^s_{0,1}(\mathrm y))$ and $\overline p_{s+\Delta t}(X_{s,s+\Delta t}(\mathrm x))=\ell(X^{s+\Delta t}_{0,1}(\mathrm y))$. Consequently, the rate of change of the cost along the curve $\gamma$ is
we obtain an exact formula for the increment (5.6).
For a rigorous proof of relation (5.10) we go in the opposite direction: we apply the formula for the increment (5.6) and take into account that the curve $\ell\circ\gamma$ is Lipschitz continuous (see the proof of Proposition 1). This, in particular, shows that (5.10) holds only for almost all $s\in [0,1]$.
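The curve $\gamma$ is also easy to trace numerically; a small sketch (assuming SciPy; the affine dynamics and the pair of controls are placeholder choices):

```python
import numpy as np
from scipy.integrate import solve_ivp

# gamma(s) = X^s_{0,1}(y): endpoint of the trajectory under the concatenated
# control omega_s = (u on [0, s), u_bar on [s, 1]), as in (5.3).
f = lambda x, v: v * (x + 1.0)          # placeholder u-affine dynamics
u, u_bar, y = -1.0, 1.0, 0.5

def gamma(s):
    ctrl = lambda t: u if t < s else u_bar
    z = solve_ivp(lambda t, w: [f(w[0], ctrl(t))], (0.0, 1.0), [y],
                  max_step=1e-3)        # small steps resolve the switch at s
    return z.y[0, -1]

print(gamma(0.0))   # = x_bar(1): the reference endpoint
print(gamma(1.0))   # = x(1): the endpoint under the 'new' control u
print([round(gamma(s), 4) for s in (0.25, 0.5, 0.75)])  # points of the curve
```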
The dual formula (5.8) has a similar representation involving another class of curves $\zeta$ on $\mathcal D_1(\mathrm y)$ generated by variations $\omega_s[\overline u, u]$ of the control of the form (5.3) with the arguments $\overline u$ and $u$ in the reverse order, and joining the points $\overline x(1)$ and $x(1)$ in the ‘opposite direction’: $\zeta(0)=x(1)$ and $\zeta(1)=\overline x(1)$. It is clear that there exist even more sophisticated parametrizations of the ‘motion’ between $\overline x(1)$ and $x(1)$ inside $\mathcal D_1(\mathrm y)$ — those akin to ‘packages of needles’ in the PMP theory — for example, parametrizations corresponding to the controls
$$
\begin{equation*}
\omega_s=\begin{cases} u & \displaystyle \text{on }\bigcup_{j=0}^{k-1}\biggl[\frac{j}{k},\frac{j}{k}+s\biggr), \\ \overline u & \text{otherwise}, \end{cases} \qquad s \in \biggl[0, \frac 1k\biggr], \qquad k=2, 3, \dots\,.
\end{equation*}
\notag
$$
Representation (5.10) suggests an obvious way to organize a monotone descent along the functional $\mathcal I$: one must construct a curve $\gamma$ along which the function $\ell$ does not increase, that is, $\nabla \ell\circ\gamma\cdot \dot \gamma \leqslant 0$ almost everywhere on $I$. For example, one can take a feedback control $w$ satisfying inclusion (1.7). Then the process $(x,u)$ with control $u(t)=w(t,x(t))$ (of course, if such a process is well defined) generates the required curve $\gamma$ and therefore is an improving process. A rigorous implementation of this idea is the subject of the rest of this paper.
§ 6. Necessary conditions of optimality
We turn back to the problem of the generation of descent directions for $(\mathrm{P})$. Throughout this section assumptions $(\mathbf A_1)$–$(\mathbf A_4)$ are supposed to hold.
6.1. The principle of optimality with feedback controls
As discussed in § 2, the sign $\Delta_{u}\mathcal I[\overline u] \leqslant 0$ is guaranteed by the choice of $u$ so as to minimize the integrand in (5.6). This turns us back to the problem of the solvability of the operator equation
$$
\begin{equation}
\overline{\mathrm H}_t(x[u](t), u(t))=\min_{\upsilon \in U}\overline{\mathrm H}_t(x[u](t), \upsilon) \quad\text{for a.e. } t \in I,
\end{equation}
\tag{6.1}
$$
similar to (2.8), in the class of admissible program controls $\mathcal U$.
Let us verify that for each $\overline u \in \mathcal U$ the set
(the set of such solutions is denoted by $\overline{\mathcal{KS}}$),
$\bullet$ $u \in \mathcal U$ is an arbitrary program control generating the function $x$ as a solution of system (1.1) (such a control exists by Proposition 6; clearly, if $w$ is Borel measurable, then the class of such controls includes the Borel equivalence class of $u$ such that $u(t) \doteq w(t, x(t))$ for almost all $t \in I$).
Proposition 2. Let $x \in \overline{\mathcal KS}$, $u \in \mathcal U$ and $x=x[u]$. Then the pair $\sigma \doteq (x,u)$ satisfies (6.1).
This fact follows from Proposition 4, a more general result for sliding modes.
It follows from the representation (5.6) that in problem $(\mathrm{P})$ any control $u$ satisfying (6.1) is ‘not worse’ than the reference one, that is, $\mathcal I[\overline u] \geqslant \mathcal I[u]$. If the process $\overline \sigma$ is optimal, then it is obvious that any process $\sigma \in \mathcal S[\overline u]$ is optimal as well. This observation can be considered a necessary optimality condition in the spirit of the feedback conditions [1]. It is clear that if this condition holds, then $\mathcal I[\overline u]=\mathcal I[u]$ for all $\sigma \in \mathcal S[\overline u]$.
Note that for any $u$ having the property (6.1) the integrand in (5.6) is nonpositive; then we can reformulate our necessary condition in a form close to the PMP.
Theorem 1 (minimum principle with feedback controls). Let $\overline \sigma=(\overline x, \overline u)$ be an optimal process in problem $(\mathrm{P})$. Then the condition
$$
\begin{equation}
\overline{\mathrm H}_t(x(t), \overline u(t)) =\min_{\upsilon \in U}\overline{\mathrm H}_t(x(t), \upsilon) \ (=\overline{\mathrm H}_t(x(t), u(t))) \quad\textit{for almost all } t \in I
\end{equation}
\tag{6.3}
$$
holds for each $\sigma=(x, u) \in \mathcal S[\overline u]$.
In fact, Theorem 1 contains a series of necessary conditions parametrized by the (nonempty) set $\mathcal S[\overline u]$, whose elements we still call comparison processes; in contrast to [1], it is only admissible processes that we admit to comparison. This theorem proposes a concept of ‘feedback extremal’ alternative to that of [3]: a feedback extremal is a pair $(\overline \sigma, \sigma)$ of processes satisfying condition (6.3) (in particular, the equality in parentheses). With regard to Remark 4, the claim that the process $\overline \sigma$ is extremal in the classical sense is equivalent to the inclusion $\overline \sigma \in \mathcal S[\overline u]$, which means that the pair $(\overline \sigma, \overline \sigma)$ is a feedback extremal. The relationship between the two types of extremality is discussed in § 8.1 below.
In conclusion we give a rigorous interpretation of the feedback condition in terms of the curve of monotone descent on the attainability set of the control system (§ 5.4).
Proposition 3. Let $\overline \sigma=(\overline x, \overline u)$ be an admissible process, $\sigma=(x, u)\in \mathcal S[\overline u]$ be some comparison process and the curve $\gamma \colon I \to \mathbb{R}^n$ be defined by condition (5.9). Then the objective function $\ell$ does not increase along $\gamma$.
6.2. Descent controls in the weakened problem. Co- and bifeedback optimality conditions
The method for the generation of comparison controls that was presented in § 6.1 remains unchanged in the weakened problem $(\mathrm{RP})$: the expression (5.1) gives the structure of a descent control in the form of a feedback with respect to the measure $\mu$ (which characterizes the state of the system):
This representation can be used to derive a feedback necessary condition and to construct nonlocal numerical algorithms in the problem of control for an ensemble of trajectories. Some results in this area were obtained in [21].
The dual representations (5.7) and (5.8) suggest a construction for a cofeedback descent control in the form of a functional feedback
and produce a series of ‘cofeedback’ necessary conditions, which can be combined with Theorem 1. The feedback strategies (6.4) can be implemented with the use of the Krasovskii–Subbotin scheme in reverse time, starting from the point $\overline x(1)=\zeta(1) \doteq X_{s, 1}\circ \overline X_{0,s}|_{s=1}$ (see § 5.4). We do not elaborate on this idea here and limit our considerations to the direct approach.
§ 7. General case. Sliding modes
Now we abandon assumption $(\mathbf A_4)$ and suppose that $\upsilon \mapsto f(\mathrm x, \upsilon)$ is an arbitrary mapping satisfying $(\mathbf A_3)$ and $U$ is an arbitrary (nonconvex) compact set. Although this case is much more general, from the technical point of view it is little different from the one discussed above, provided that we apply a classical trick originating from the theory of Young measures (see [22]): we relax the class of admissible controls $\mathcal U$ by identifying functions $u$ with elements $\nu$ of the set
whose marginals $t\mapsto \nu_t$ — families of measures obtained by disintegrating $\nu$ with respect to the Lebesgue measure $\mathcal L^1$ on $I$ — have the form $t \mapsto \delta_{u(t)}$ (in control theory the mappings $t \mapsto \nu_t$ are called controls of Gamkrelidze or Warga–Gamkrelidze type; in differential games they are called mixed strategies). This causes a relaxation of the original control system
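$$
\begin{equation}
\dot x(t)=\int_U f(x(t), \upsilon)\,d\nu_t(\upsilon) \quad\text{for a.e. } t \in I, \qquad x(0)=\mathrm y.
\end{equation}
\tag{7.1}
$$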
Let $x=x[\nu]$ be the solution of (7.1) that corresponds to $\nu \in \mathcal Y$. Processes $(x, \nu)$ are called sliding modes of the control. The mapping $\nu \mapsto x[\nu]$ is well known to be continuous as a function $\mathcal Y\to C(I;\mathbb R^n)$, where ${\mathcal Y}$ is endowed with the topology of weak convergence of probability measures. The flow $(s, t, \mathrm x) \mapsto X_{s,t}[\nu](\mathrm x)$ of system (7.1) and the corresponding mapping $(s, t, \mathrm x) \mapsto J_{s,t}[\nu](\mathrm x)$ can be defined in the same way as in § 3.1.
It is obvious that problem $(\operatorname{co} \mathrm{RP})$ of the minimization of the linear form $\langle \ell, \mu_1[\nu] \rangle$ over the curves in the family $t \mapsto \mu_t[\nu] \doteq (X_{0,t}[\nu])_\sharp \vartheta$, $\nu \in \mathcal Y$, is the convexification of problem $(\mathrm{RP})$. The convexified version $(\operatorname{co} \mathrm{P})$ of the original problem $(\mathrm{P})$ is a particular case of $(\operatorname{co} \mathrm{RP})$ that corresponds to the initial measure $\vartheta=\delta_\mathrm y$. Note that equation (7.1) is linear in the variable of generalized control and the corresponding (weakened) problem $(\operatorname{co}\mathrm{RP})$ is bilinear in the pair $(\mu, \nu)$.
Now all results of § 6 can be rewritten (almost word for word) in terms of the generalized control $\nu$ and the function
where $\operatorname{spt}$ denotes the support of the measure. Proposition 4 claims the existence of (at least one) comparison process in the class of sliding modes; under assumption $(\mathbf A_4)$ it reduces to Proposition 2. In turn, a direct generalization of Theorem 1 is as follows.
Theorem 2. Let $(\overline x, \overline \nu)$ be an optimal process of problem $({\operatorname{co} \mathrm{P}})$. Then the relation
$$
\begin{equation}
\overline{\mathbf H}_t(x(t), \overline \nu_t)=\min_{\varrho \in \mathcal P(U)}\overline{\mathbf H}_t(x(t),\varrho) \quad\textit{for a.e. } t \in I
\end{equation}
\tag{7.3}
$$
holds for any pair $(x=x[\nu],\nu)$ satisfying (7.2).
Note that for a ‘regular’ process $\overline \sigma=(\overline x, \overline u)$ condition (7.3) reduces to
As above, to check a process $\overline \sigma$ in the original problem $(\mathrm{P})$ for optimality, one can rewrite conditions (7.4) and (7.5) in the form
$$
\begin{equation*}
\overline{\mathrm H}_t(x(t), \overline u(t))=\min_{\varrho \in \mathcal P(U)}\overline{\mathbf H}_t(x(t),\varrho) \quad \text{for a.e. } t \in I
\end{equation*}
\notag
$$
respectively. Since one selector of the multivalued mapping $I \to \mathcal P(U)$ generated by the last inclusion is the function $t\mapsto \delta_{\overline u(t)}$, condition (7.4), in particular, contains Pontryagin’s principle. The same ‘additive’ inclusion of Pontryagin’s extremals in the class of comparison processes is assumed by the formulation of the feedback minimum principle [1]. However, the relationship between the conditions mentioned above and Theorems 1 and 2 is not that trivial. It is discussed in the next section.
§ 8. Discussion and examples
Here we discuss the status of Theorems 1 and 2 among close results.
8.1. The relation to Pontryagin’s principle
We turn back to problem $(\mathrm{P})$ under assumptions $(\mathbf A_1)$–$(\mathbf A_4)$ and $(\mathbf A_6)$. It follows from the equality $\nabla_{\mathrm x} \overline p\circ \overline x=\overline \psi$ (Remark 4) that the process $\overline \sigma$ satisfies the PMP once the comparison trajectory $x$ coincides with $\overline x$ on the whole interval $I$ (it is possible, however, that $u \neq \overline u$). This is definitely so, for example, in the case when for all $(t ,\mathrm x)\in I \times\mathbb{R}^n$ the extremal problem (6.2) has a unique solution. (If a problem is affine in control, then uniqueness can be achieved by means of a standard ‘regularization’, the addition of a convex integral term $\displaystyle\frac{\alpha}{2}\int_I |u(t)|^2 \,dt$ with a small positive weight $\alpha$ to the objective functional; this method is widely used in the numerical solution of control problems. However, such a perturbation of the problem can lead to a degeneration of the feedback optimality conditions (see Example 3 in § 8.3). Sometimes a more efficient method of perturbation is the concave ‘antiregularization’ proposed by Dykhta, namely, subtraction of the above integral from the objective functional.) We present several versions of such statements. We introduce a feedback analogue of the regularity property for an extremal, namely, the absence of so-called singular pieces of control components.
Definition 1. Let $(\overline\sigma, \sigma)$ be a pair of admissible processes. A component $\overline u_k$ of a control $\overline u=(\overline u_1, \dots, \overline u_m)$ is said to be regular if
(except, perhaps, at finitely many points $t \in I$). The pair $(\overline\sigma, \sigma)$ is said to be regular if inequality (8.1) holds for every $k=1,\dots, m$.
We can verify that in some typical situations all regular feedback extremals of problem $(\mathrm{P})$ which is affine in control are extremals of Pontryagin’s principle.
Theorem 4 (relation to PMP). Suppose that assumptions $(\mathbf A_1)$–$(\mathbf A_4)$ and $(\mathbf A_6)$ are satisfied, and let $(\overline \sigma, \sigma)$ be a feedback extremal. Suppose that one of the following conditions holds.
Then the process $\overline \sigma$ is an extremal in the sense of the PMP and $\sigma=\overline \sigma$.
Proof. (1) It is obvious that in the case under study the finite-dimensional extremal problem $\min_{\upsilon \in U} \overline{\mathrm H}$ is separable, which means that
It is also clear that the regularity of the component $\overline u_k$, in combination with condition (6.3), implies the equality $u_k(t)=\overline u_k(t)$ for almost all $t \in I$, and since all components of $\overline u$ are supposed to be regular, we have $u=\overline u$. Now the PMP-extremality of $\overline \sigma$ follows from the equality $\nabla_{\mathrm x} \overline p\circ \overline x=\overline \psi$.
(2) By (8.1), for almost all $t \in I$ the linear form $\upsilon \mapsto \upsilon \cdot \nabla_{\upsilon} \overline{\mathrm H}_t|_{x=x(t)}$ is nondegenerate and all of its minimum points lie on the boundary of the compact set $U$; if $U$ is strictly convex, then such a point is unique. Then, as above, condition (6.3) implies the equality $u=\overline u$.
The proof of the theorem is complete.
In the regular case the feedback condition is not weaker than the PMP (and Example 3 for $\alpha=0$ illustrates the phenomenon of strict strengthening). A natural question arises: what types of PMP-extremal processes are excluded/not excluded by Theorems 1 and 2? We have already seen that processes in the class $\mathcal S[\overline u]$ cannot be better than the PMP-extremal $\overline \sigma$ corresponding to a point of local minimum of the function $\ell$ on the attainability set of the (convexified) system, since the principle of the construction of $\gamma$ as a curve of monotone descent does not assume ‘ascending’ along level lines of $\ell$ in order to leave $\overline x(1)$. However, feedback extremals can correspond to other types of stationary points (Example 3).
8.2. The relation to the feedback minimum principle
The centerpiece of the theory of feedback necessary conditions is the so-called feedback minimum principle (FMP). This condition was originally formulated in terms of a linear majorant of the objective functional with a reference cotrajectory (that is, in the framework of the standard objects of the PMP; see [1]), while the most general result of this type — with a nonlinear majorant — was presented in [3]. Recall the original formulation of the FMP in problem $(\mathrm{P})$: the optimality of a pair $\overline \sigma=(\overline x, \overline u)$ implies the optimality of the curve $\overline x$ in the so-called adjoint problem
where $\mathcal X[\overline u]$ is the set of all Carathéodory and Krasovskii–Subbotin solutions corresponding to the selectors $w(t, \mathrm x) \in \arg\min_{\upsilon \in U} H(\mathrm x, \mathbf p(t, \mathrm x), \upsilon)$ and the function $\mathbf p$ is defined by the expression
Here the construction of feedback controls is the same as in (6.2) up to replacing $\nabla_{\mathrm x}\overline p$ by $\mathbf p$. In contrast to the conditions obtained above, the FMP has a variational form. In this setting it is assumed that the optimal process $\overline \sigma$
Here condition (a) yields directly (see [2]) the extremality of the process $\overline \sigma$, as the PMP is certainly ‘included’ in the FMP; this (slightly unnatural) inclusion is provided by the use of feedback Carathéodory solutions, which can be absent if $\overline \sigma$ is not extremal. As illustrated by examples below, the PMP by no means follows from the ‘essential’ part of the FMP, condition (b).
Let us compare the FMP with Theorems 1 and 2. To do this we reveal the relationship between the function $\mathbf p$ and the reference solution $\overline p$ of the transport equation. If $\ell$ is linear and the assumption $(\mathbf A_6)$ holds, then $\mathbf p=\overline \psi$ is obviously the gradient $\nabla_{\mathrm x}$ of the linear approximation of the solution $\overline p$ in a neighbourhood of the characteristic curve $\overline x$:
In particular, in the bilinear problem we have $\mathbf p=\overline \psi \doteq \nabla_{\mathrm x} \overline p_t$ (Remark 4). In the general case we have $\mathbf p=\nabla_{\mathrm x} \eta$, where
The function $\eta$ is a rather crude approximation of the ‘exact majorant’ $\overline p$, which combines (8.2) with the linear approximation of the objective function: $\ell(\mathrm x) \approx \ell(\overline x(t))+\nabla_{\mathrm x} \ell(\overline x(t)) \cdot (\mathrm x-\overline x(t))$.
If $(\mathrm{P})$ is linear in state, then Theorem 2 can be weakened by restricting the sliding modes to those generating curves in the set $\mathcal X[\overline u]$ (our formulation also admits other curves). Then the result obtained reduces to (8.2). Hence the proposed necessary condition is not weaker than the FMP. In the bilinear problem the FMP (in its essential part (b)) coincides with the statement of Theorem 1.
8.3. Examples
We begin by illustrating an application of Theorem 2 in comparison with the PMP and FMP.
in which the infimum is attained at the sliding mode $(\check x, \check\nu)$, $\check\nu_t \equiv \frac{1}{2}(\delta_{-1}+\delta_{1})$. We write out the auxiliary constructions of the PMP. The Pontryagin function has the form
Clearly, the process $\overline \sigma=(\overline x, \overline y, \overline z; \overline u)$ is a singular extremal and $\mathcal I[\overline u]=0$. We apply the FMP and Theorem 2 to this process. Here $\mathbf p \equiv (0, -\mathrm y, 1)$ and $\overline p_t=(1-t){\mathrm x^2}/{2}-{\mathrm y^2}/{2}+\mathrm z$. The FMP offers for comparison all ‘feedback’ controls
Both these classes include descent strategies generating, in particular, the optimal sliding mode by the Krasovskii–Subbotin scheme. However, only the second class provides important information, namely, the explicit structure of the feedback: since $y(0)=0$, the FMP gives back the original set of (all admissible) program controls, that is, it actually degenerates.
(2) The FMP: lack of improvement of a nonextremal process. The control $\overline u \equiv 1$ as the reference control gives
We are interested in the case when $\mathrm y \geqslant 0$ and $t \in [0,1]$. Note that the quantity $\gamma=\gamma(t, \mathrm y) \doteq (\mathrm y+1-t)$ is strictly positive for $t \in [0, 1)$ (the point $t=1$ can be ignored). Hence we have to minimize the concave function
Here the quantity $\beta=\beta(t) \doteq(1-t^2)/2$ is also strictly positive for $t \in [0, 1)$, and therefore the minimization over $\upsilon \in [-1,1]$ gives us the extremal mapping $\overline U \equiv \{-1\}$. In other words, all feedback controls of the FMP are exhausted by the single function $w \equiv -1$, which generates the unique (in any sense) solution
with the same cost as the reference process: $\mathcal I[u\equiv -1]=-1/3$. We arrive at the conclusion that the FMP does not improve the nonextremal process under consideration.
Thus, the sets of processes satisfying the PMP and the comparison condition from the FMP are not proper subsets of each other. This means that the PMP and FMP are in fact two independent necessary conditions of optimality.
(3) Theorem 2: improvement of the control $\overline u \equiv 1$. To apply Theorem 2 we find that
over $\upsilon \in [-1,1]$ for small values of $t$ gives us a single control $\delta_{-1}$, which is realized till the moment of time $t=1/3$. On the interval $[1/3, 1]$ the sliding mode $\nu=\frac{3}{4} \delta_{1}+\frac{1}{4} \delta_{-1}$ is applied, for which $x(t)= (t-1)/{2}$ and $\overline U=\{\pm 1\}$. As a result, the new process looks as follows:
and it has cost $\mathcal I[\nu]=-13/27 <-1/3$. We see that in the case under study Theorem 2 improves both the PMP and FMP.
As shown in [3], in this example even the second-order FMP ‘does not work’. The improvement is due to a certain strengthening of the FMP by passing to the extremal of the convexified problem on a series of controls which is equivalent to $\overline u$ (see [3], Proposition 1).
In Example 1 we see ‘looping’ iterations of the FMP; however, the feedback controls obtained there do not worsen the reference process in any case. Let us show that the use of a linear majorant in a nonlinear problem can cause a strict worsening of a nonextremal control.
The dynamics of the system is organized so that the strategy $u \equiv 1$ is optimal for large values of the parameter $\varepsilon$ and counteroptimal for small values. Consider a nonextremal control $\overline{u} \equiv 0$ and write out all objects of the PMP:
(this inequality becomes obvious as $\varepsilon$ tends to zero).
As in the previous problem, here the phase variables and control variable are separated and the functional is linear. Therefore, the control generated by the FMP turns out to be a program control. At the initial instant the FMP ‘determines’ correctly the direction of local descent, but the absence of feedback makes it impossible to adapt the strategy afterwards. As a result, the FMP ‘makes a mistake’, and this mistake is stable under small variations of the parameter $\varepsilon$.
In contrast to the FMP, Theorem 1 ‘introduces’ the lacking feedback and generates a locally optimal synthesis $w(t, \mathrm x)=-\operatorname{sign}(2-3\mathrm x)\mathrm x(1-t)$ (here $w(0,\varepsilon)=-1$ for small values of $\varepsilon$ and $w(0,\varepsilon)=1$ for large values); we leave out the calculations and only indicate that $\overline p_t(\mathrm x, \mathrm y)=(\mathrm x^2-\mathrm x^3)(1-t)+\mathrm y$.
The last example illustrates the phenomenon of the absence of feedback improvement of a PMP-extremal not corresponding to a point of local minimum of the objective function.
Here $H=(\mathrm p_x-\mathrm x)\upsilon+(\alpha/2)\upsilon^2$. For any $\alpha \geqslant 0$ the control $\overline u \equiv 0$ corresponds to an extremal (singular for $\alpha=0$ and strict otherwise) with cotrajectory $\overline \psi_x \equiv 0$, $\overline \psi_y=\overline \psi_z \equiv 1$. It is easily seen that for $\alpha \in (0,1)$ the vector $(\overline y(1), \overline z(1))$ is neither a point of local minimum, nor a point of local maximum of the function $\ell(\mathrm x, \mathrm y, \mathrm z)=\mathrm y+\mathrm z$ on the attainability set of the convexified system (for $\alpha=0$ the extremal under consideration is a point of global maximum): for example, for a constant $u \neq 0$ we have $\mathcal I[u] <0$, while for a sliding mode $\nu_{\lambda, \varepsilon}=(1-\lambda) \delta_{0}+({\lambda}/{2})(\delta_{\varepsilon}+\delta_{-\varepsilon})$, where $\varepsilon, \lambda \in (0,1]$, it turns out that $\mathcal I[\nu_{\lambda, \varepsilon}]>0$. Since the problem is linear in state, the FMP and Theorem 1 produce the same result ($\overline p_t \equiv \mathrm y+\mathrm z$). For $\alpha=0$ (when the problem is bilinear) this result is the feedback control $w(\mathrm x)=\operatorname{sign}\mathrm x$ generating, in particular, the global solutions $u \equiv \pm 1$. However, for $\alpha >0$ the only feedback strategy
admitted for comparison leaves the reference point unchanged.
In conclusion, we comment on the role of feedback conditions in the calculus of variations: in contrast to the PMP, which sometimes provides information about the structure and properties of an unknown optimal process, Theorems 1 and 2 (as well as the FMP) cannot be applied without knowledge of the reference approximation $\overline u$. From the analytic point of view it is therefore reasonable to regard these results as an additional test of the optimality of a PMP-extremal control that has already been constructed. The direct iterative application of these theorems produces an algorithm for the numerical solution of the optimization problems under study. The formulation and properties of this algorithm are discussed in the next section.
§ 9. Descent method
Let us return to the control-affine problem $(\mathrm{P})$. On the set $\mathcal U \times {\mathcal U}$ we introduce the nonnegative functional
Clearly, the equality $\mathcal E[u, v]=0$ is equivalent to the claim that $(x[v], v)$ is a comparison process for $(x[u], u)$, and $\mathcal E[u, u]=0$ means the PMP-extremality of $u$.
Let us describe an iteration of the conceptual descent algorithm based on Theorem 1.
Descent method. Put $u^0=\overline u$ and suppose that $u^k \in \mathcal U$ has already been calculated.
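The iteration can be summarized by the following conceptual sketch. This is a sketch only, not the paper's pseudocode: the helpers `cost`, `feedback_from` and `krasovskii_subbotin` are hypothetical placeholders for the functional $\mathcal I$, the feedback construction of Theorem 1 and the sampling scheme of § 11, and the halting test is the residual criterion $\mathcal E[u^{k}, u^{k+1}]=\mathcal I[u^{k}]-\mathcal I[u^{k+1}]<\varepsilon$ discussed below.

```python
# Conceptual sketch of the descent method (hypothetical helper names).
def descent(u0, cost, feedback_from, krasovskii_subbotin,
            eps=1e-6, max_iter=100):
    u, J = u0, cost(u0)                    # u^0 = u-bar and I[u^0]
    for _ in range(max_iter):
        w = feedback_from(u)               # feedback control built from u^k
        u_next = krasovskii_subbotin(w)    # program control realizing x[w]
        J_next = cost(u_next)
        if J - J_next < eps:               # E[u^k, u^{k+1}] < eps: stop
            return u_next
        u, J = u_next, J_next
    return u
```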
Remark 5. Finding the function $\overline p$ as a solution of the transport partial differential equation is a well-known computational problem, which can be solved by the classical grid methods of integration only in spaces of low dimension (in practice, for $n \leqslant 3$). However, with the use of the explicit representation $\overline p_t=\ell\circ\overline X_{t, 1}$ and the Krasovskii–Subbotin method this obstacle can be avoided: let $(t_i, \mathrm x_i)$ be the current node of the polygonal approximation of the synthesized trajectory (§ 11). In accordance with the method of descent, the computation of the $(i+1)$st node requires the knowledge of the gradient $\nabla_{\mathrm x} \overline p_{t}(\mathrm x) \doteq \overline J_t^*(\mathrm x) \nabla \ell(X_{t, 1}(\mathrm x))$ at the point $(t_{i}, \mathrm x_i)$. To find it, it suffices to solve the phase system (1.1) and the linearized system (3.4) with the Cauchy conditions $x(t_i)= \mathrm x_i$ and $J_{t_i, t_i}=E$, respectively.
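For instance, a minimal SciPy sketch of this computation might look as follows; the callables `f`, `Df` (the state Jacobian of $f$), `grad_ell` and the realization `u` of the current control are assumed to be supplied by the user, and all names are illustrative rather than the paper's notation.

```python
import numpy as np
from scipy.integrate import solve_ivp

def grad_p(t_i, x_i, f, Df, u, grad_ell, n):
    """nabla_x p_{t_i}(x_i) = J^T grad_ell(X_{t_i,1}(x_i)): integrate the
    phase system (1.1) coupled with the linearized system (3.4) from the
    Cauchy data x(t_i) = x_i, J_{t_i,t_i} = E up to t = 1."""
    def rhs(t, z):
        x, J = z[:n], z[n:].reshape(n, n)
        dx = np.asarray(f(x, u(t)), dtype=float)
        dJ = np.asarray(Df(x, u(t))) @ J      # variational equation along x
        return np.concatenate([dx, dJ.ravel()])

    z0 = np.concatenate([np.asarray(x_i, dtype=float), np.eye(n).ravel()])
    sol = solve_ivp(rhs, (t_i, 1.0), z0, rtol=1e-8, atol=1e-10)
    x1, J1 = sol.y[:n, -1], sol.y[n:, -1].reshape(n, n)
    return J1.T @ grad_ell(x1)                # adjoint J^* applied to grad l
```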
The convergence of the method of descent is established by the following result.
Proposition 5. Suppose that assumptions $(\mathbf A_1)$–$(\mathbf A_4)$ are fulfilled, and let $(u^{k})$ be a sequence of controls produced by the method. Then the following hold.
(1) There exists a subsequence $\omega^{k_j} \subseteq \omega^k \doteq \{(u^{2k}, u^{2k+1})\}$ that converges in the space $\mathcal U \times \mathcal U$ with the direct product topology, where $\mathcal U$ is equipped with the topology $\sigma(L_\infty, L_1)$.
(2) Let $(u, v)$ be a partial limit of the sequence $\omega^k$. Then $\mathcal E[u, v]=0$.
Proof. (1) Since $U$ is a convex compact set in $\mathbb{R}^m$, the family $\mathcal U$ is compact in the space $L_\infty$ with the topology $\sigma(L_\infty, L_1)$ by the Banach–Alaoglu theorem. Then by Tychonoff’s classical theorem the space $\mathcal U \times \mathcal U$ is compact in the direct product topology. Since $L_1$ is separable, the weak* topology $\sigma(L_\infty, L_1)$ on $\mathcal U$ is metrizable, and therefore compactness is equivalent to sequential compactness, that is, the existence of a convergent subsequence.
(2) Under the above assumptions the operators $u \mapsto X[u](\mathrm x)$ and $u \mapsto J[u](\mathrm x)$ are continuous as functions $\mathcal U \to C(I;\mathbb{R}^n)$ for any $\mathrm x \in \mathbb{R}^n$ (as systems (3.1) and (3.4) are affine in control). Hence for each $\mathrm x \in \mathbb{R}^n$ the mapping $u \mapsto \nabla_{\mathrm x}p[u](\mathrm x) \doteq \nabla\ell \circ X_{\cdot, 1}[u](\mathrm x) J[u](\mathrm x)$ is a continuous function $\mathcal U \to C(I;\mathbb{R}^n)$. Since the function $\mathrm x \mapsto \nabla_{\mathrm x}p[u](\mathrm x)$, $\mathbb{R}^n \to \mathbb{R}^n$, is also continuous, the mapping $(u, v) \mapsto \nabla_{\mathrm x}p[u](x[v])$, $\mathcal U \times \mathcal U\to C(I;\mathbb{R}^n)$, is continuous as a composition of continuous functions. Then the operator $(u,v) \mapsto \mathcal E[u,v]$, $\mathcal U \times \mathcal U \to \mathbb{R}$, is also continuous as a composition of continuous operators.
By the definition of the residual $\mathcal E$ the equality $\mathcal E[u^{k}, u^{k+1}]=\mathcal I[u^{k}]-\mathcal I[u^{k+1}]$ holds for every $k$. The sequence of numbers $\{\mathcal I[u^{k}]\}$ is nonincreasing and bounded, hence convergent. Consequently, $\lim_{k \to \infty} \mathcal E[u^{k}, u^{k+1}]=0$. Then the subsequence $\omega^{k_j} \doteq \{(u^{2k_j}, u^{2k_j+1})\} \subseteq \omega^k$ having the limit $(u,v)$ satisfies the equality $0=\lim_{j \to \infty}\mathcal E[u^{2k_j}, u^{2k_j+1}]=\mathcal{E}[u, v]$.
The proof of the proposition is complete.
The output of the method proposed above is a feedback (in a particular case, Pontryagin) extremal or a sequence of controls lying on the level set of the functional $\mathcal I$ corresponding to such an extremal (the case of ‘looping’). Its implementation with the use of the Krasovskii–Subbotin scheme produces an algorithm of dynamical optimization (a natural halting criterion for the iterative process is the condition $\mathcal E[u^{k}, u^{k+1}]=\mathcal I[u^k]-\mathcal I[u^{k+1}] < \varepsilon$ with a predetermined accuracy $\varepsilon>0$), which, in contrast to indirect algorithms based on the PMP [8], contains no parameters of ‘descent depth’ (one can say that the method of feedback variations incorporates the computation of an optimal step of gradient descent), and hence no internal procedures for line search. Of course, in practice this algorithm is applicable to the general problem $(\mathrm{P})$ (without assumption $(\mathbf A_4)$), as well as to the convexified problem $(\operatorname{co} \mathrm{P})$, without any modification. In the latter problem, to approximate sliding modes with property (7.5) one can use feedback controls of the form
where $w$ obeys inclusion (6.2). This is consistent with the informal idea of the ‘locally optimal’ synthesis of control (see [23]).
§ 10. Appendix: auxiliary assertions
Below $\operatorname{Lip}(\varphi; A)$ is the minimum Lipschitz constant of the function $\varphi\colon \mathbb{R}^n \to \mathbb{R}$ on the set $A \subseteq \mathbb{R}^n$ and $\operatorname{Lip}(\varphi)\doteq \operatorname{Lip}(\varphi; \mathbb{R}^n)$; $\mathbb B_{r}$ is the closed ball of radius $r$ centred at the origin.
Lemma 1. Suppose that assumptions $(\mathbf A_1)$–$(\mathbf A_3)$ are fulfilled. Then the following hold.
(1) For any compact set $K \subset \mathbb{R}^n$ the attainability set $\{X_{s,t}[\nu](\mathrm x)\mid s,t \in I,\,\mathrm x \in K,\nu \in \mathcal Y\}$ of system (7.1) from the set $K$ on the time interval $[s, t]$ is contained in a closed ball $\mathbb B_{R_{f}}$ whose radius $R_{f}={R_{f}}(K, U)>0$ depends only on $C_f$, $K$ and $U$ (and is the same for all $s$ and $t$).
(2) For any compact set $K \subset \mathbb{R}^n$ and any $s,t \in I$, $\mathrm x \in K$ and $\nu \in \mathcal Y$ the following estimate is valid:
(3) For any compact set $K \subset \mathbb{R}^n$ the functions $t \mapsto X_{s,t}[\nu](\mathrm x)$, $t \mapsto X_{t,s}[\nu](\mathrm x)$ and $t \mapsto J_{t,s}[\nu](\mathrm x)$ are Lipschitz continuous on $I$ with Lipschitz constants $L_{X}(K,U)$ (for the first two functions) and $L_{J}(K,U)$ (for the third) which are common to all $s \in I$, $\mathrm x \in K$ and $\nu \in \mathcal Y$.
(4) The functions $\mathrm x\mapsto X_{t,s}[\nu](\mathrm x)$ and $\mathrm x\mapsto J_{t,s}(\mathrm x)\doteq D_{\mathrm x}X_{t,s}[\nu](\mathrm x)$ are locally Lipschitz continuous on $\mathbb{R}^n$ uniformly in $s,t \in I$ and $\nu \in \mathcal Y$.
Proof. We restrict our consideration to the proof of the uniform local Lipschitz continuity of the family $J_{t,s}[\nu]$, $s,t\in I$, $\nu \in \mathcal Y$. All other facts are well known in the theory of ordinary differential equations.
Let $\mathrm x, \mathrm z\in K\subset \mathbb{R}^n$, where $K$ is a compact set; suppose for definiteness that $s > t$. The required bound is derived from (3.4).
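In outline, the argument is the standard one; the following hedged reconstruction (with $f_\theta \doteq \int_U f(\,\cdot\,,\upsilon)\,d\nu_\theta(\upsilon)$ denoting the averaged vector field, and with constants that need not coincide with those of the paper) assumes that $D_{\mathrm x}f(\,\cdot\,,\upsilon)$ is locally Lipschitz continuous uniformly in $\upsilon \in U$. Subtracting the integral forms of (3.4) for the initial points $\mathrm x$ and $\mathrm z$ gives, for $\tau \in [t,s]$,
$$
|J_{\tau,s}(\mathrm x)-J_{\tau,s}(\mathrm z)| \leqslant \int_{\tau}^{s}\Bigl(\bigl\|D_{\mathrm x}f_\theta(X_{\theta,s}(\mathrm x))-D_{\mathrm x}f_\theta(X_{\theta,s}(\mathrm z))\bigr\|\,\|J_{\theta,s}(\mathrm x)\| + \bigl\|D_{\mathrm x}f_\theta(X_{\theta,s}(\mathrm z))\bigr\|\,\|J_{\theta,s}(\mathrm x)-J_{\theta,s}(\mathrm z)\|\Bigr)\,d\theta;
$$
the first summand under the integral is bounded by a constant multiple of $|\mathrm x-\mathrm z|$ in view of the Lipschitz dependence of the flow on the initial point and the uniform bound $\|J_{\theta,s}\|\leqslant C_J(K,U)$, after which Gronwall's lemma yields $|J_{t,s}(\mathrm x)-J_{t,s}(\mathrm z)|\leqslant C(K,U)\,|\mathrm x-\mathrm z|$.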
Lemma 2. Suppose that conditions $(\mathbf A_1)$–$(\mathbf A_3)$ hold, $\overline u \in \mathcal U$ and $K \subset \mathbb{R}^n$ is a compact set. Then the restriction of the family of functions $(t,\mathrm x) \mapsto \overline{\mathrm H}_t(\mathrm x, \upsilon)$, $\upsilon \in U$, to the set $I\times K$ is a Lipschitz continuous function whose Lipschitz constant depends only on $C_f$, $\operatorname{Lip}(\ell;K)$, $L_X(K,U)$, $L_J(K,U)$, $C_J(K,U)$, ${R_{f}}(K, U)$ and $L_f(K,U)$, where $L_f(K,U)$ is a common Lipschitz constant for the mappings $\mathrm x \mapsto f(\mathrm x, \upsilon)$, $\upsilon \in U$, on the set $K$ and $L_X$, $L_J$, $C_J$ and ${R_{f}}$ were defined in Lemma 1.
Proof. In view of Lemma 1 the claim of the lemma follows from the representation $\nabla_{\mathrm x}\overline p_t=\overline J_t^*\nabla \ell\circ\overline X_{t,1}$ and the following estimates, which are valid for any $s,t \in I$, ${\mathrm x, z} \in K$ and $\upsilon \in U$:
§ 11. Appendix: feedback controls and sliding modes
This section contains a brief review of the main facts about feedback controls and their relation to sliding modes. A feedback control for system (1.1) can be an arbitrary function $(t, \mathrm x) \mapsto w(t, \mathrm x)$, $I \times \mathbb{R}^n \to U$. In general, this function need not even be measurable, which rules out its direct substitution (in place of $u$) into (1.1). The action of $w$ on the system can be defined by means of various ‘sampling’ schemes, the most widely known (and simplest) of which is the Krasovskii–Subbotin scheme.
Definition 2. Let $w\colon I \times \mathbb{R}^n \to U$ be a given function. Consider the sequence $\{\pi^N\} \subset I$ of partitions of the interval $I$ by points
with the properties $\pi^{N+1} \supset \pi^N$ (the sequence $\{\pi^N\}$ is nondecreasing by inclusion) and
$$
\lim_{N\to \infty}\max_{1\leqslant i \leqslant N}|t_i^N-t_{i-1}^{N}|=0.
$$
We define the sequence of ‘Euler polygons’ iteratively: $x^{N}(t)=x[w(t^N_{i}, x^{N}(t^N_{i}))](t)$ for $t \in [t^N_{i}, t^N_{i+1})$, $i=0,\dots,N-1$, and define $x^{N}$ at the point $t=1$ by continuity. Let $\{x^{N_s}\} \subseteq \{x^N\}$ be a subsequence of the sequence $N \mapsto x^N$ that has a uniform limit $x$ on $I$ as $s \to \infty$. All such partial limits $x=x[w]$ are called Krasovskii–Subbotin solutions of the closed equation (2.9).
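A minimal numerical sketch of this construction on a uniform partition might read as follows; the dynamics `f` and the feedback `w` are assumed to be user-supplied callables, and all names are illustrative.

```python
import numpy as np
from scipy.integrate import solve_ivp

def euler_polygon(w, f, x0, N):
    """Krasovskii--Subbotin polygon on the uniform partition t_i = i/N:
    on [t_i, t_{i+1}) the control is frozen at the value w(t_i, x(t_i))."""
    t = np.linspace(0.0, 1.0, N + 1)
    x = np.asarray(x0, dtype=float)
    nodes = [x.copy()]
    for i in range(N):
        u_i = w(t[i], x)                   # sampled feedback value
        sol = solve_ivp(lambda s, y: f(y, u_i),
                        (t[i], t[i + 1]), x, rtol=1e-8)
        x = sol.y[:, -1]                   # next node of the polygon
        nodes.append(x.copy())
    return t, np.vstack(nodes)
```

Uniform partial limits of such polygons as $N \to \infty$ are precisely the Krasovskii–Subbotin solutions just defined.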
By construction each of the indicated solutions is contained in the tube of trajectories of the convexified system (7.1), that is, it represents a sliding mode. In other words, $x[w]=x[\nu]$ for some $\nu \in\mathcal Y$.
Suppose that assumptions $(\mathbf A_2)$ and $(\mathbf A_3)$ hold, and let ${\mathcal KS}[w]$ be the set of all Krasovskii–Subbotin solutions corresponding to $w$. Any subset of ${\mathcal KS}[w]$ is relatively compact in $C(I; \mathbb{R}^n)$, since all the trajectories of sliding modes are uniformly bounded and Lipschitz continuous with a common constant, and therefore satisfy the hypotheses of the Arzelà–Ascoli theorem. This yields the existence of (at least one) solution $x[w]$ for each $w$. This solution is not unique even in the simplest cases (see [11]).
If the operator $u \mapsto x[u]$ is continuous on $\mathcal U$, then among the generalized controls $\nu$ that generate the curve $x[w]$ as a trajectory of a sliding mode there is a measure with the property $\nu_t=\delta_{u(t)}$ for almost all $t \in I$, for some $u \in \mathcal U$ (see [24], Corollary 1), which yields the following result.
Proposition 6. Suppose that assumptions $(\mathbf A_2)$–$(\mathbf A_4)$ hold, and let $w \colon I \times \mathbb{R}^n \to U$ be an arbitrary feedback control and $x=x[w]$ be one of the Krasovskii–Subbotin solutions of system (2.9) generated by it. Then there exists a program control $u \in \mathcal U$ such that $x[u]=x[w]$.
If the piecewise constant strategies in the Krasovskii–Subbotin scheme are chosen in accordance with some (sufficiently regular) rule, then it is natural to expect that the last assertion also holds for the limiting sliding mode.
Proposition 7. Let $g\colon I \times \mathbb{R}^n \times U \to \mathbb{R}$ be a given nonnegative function such that the mapping $(t,\mathrm x)\mapsto g(t, \mathrm x, \upsilon)$ is locally Lipschitz continuous uniformly in $\upsilon \in U$. Further, let $w\colon I \times \mathbb{R}^n \to U$ be an arbitrary function, $\{\pi^N\} \subset I$ be a sequence of partitions (11.1) of the interval $I$ which is nondecreasing by inclusion, and $\{u^N\}$ be a sequence of piecewise constant controls of the form
$$
u^N(t)\equiv u^{N,i} \in U \quad \text{for } t \in \Delta t^N_i \doteq (t_{i-1}^N, t^N_i), \qquad i=1,\dots, N, \quad N\geqslant 1,
$$
and $\{x^N\doteq x[u^N]\}$ be the corresponding sequence of Euler polygons. Finally, let $\nu$ be a partial limit of the sequence $(\nu^N)$ of Young measures for which $\nu_t^N=\delta_{u^N(t)}$ and $x=x[\nu]$ be the corresponding partial uniform limit of the sequence $\{x^N\}$. Suppose that the condition
Proof. For the sake of simplicity we restrict our consideration to uniform partitions of the interval $I$ by the points $t_i^N={i}/{N}$, $i=0,\dots,N$. It follows from the continuity of the function $g$, the disintegration theorem and the definition of a Young measure that
Let $K \subset \mathbb{R}^n$ be a compact set containing the tube of trajectories (1.1), (1.2), and $L_{g}(K)$ be the minimal common Lipschitz constant for the functions $(t,\mathrm x)\mapsto g(t, \mathrm x, \upsilon)$, $\upsilon \in U$, on $I \times K$. Each term in the last sum can be estimated as follows:
Here the constant $L_X(K,U)$ is the same as in Lemma 1. Note that $M$ does not depend on $N$. Now recalling that $g \geqslant 0$ and summing over $i=1,\dots,N$ we obtain the estimate
Finally, taking the limit as $N \to \infty$ completes the proof.
In conclusion we pay special attention to one detail concerned with the use of program controls as feedback ones. Each representative of the class $u \in \mathcal U$ can formally be interpreted as a feedback. However, different realizations of $u$ (functions differing from one another on a set of Lebesgue measure zero) generate, in general, different sets of Krasovskii–Subbotin solutions, since the set of functions $w$ is not subject to any factorization. This effect can be avoided by adopting a more accurate selection rule, for example, by restricting the set of values $u(t_k)$ at the points of partition (11.1) to elements of the closure (the suitable one-sided closure at the endpoints) of the function $u$ with respect to the Lebesgue measure [25].
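The effect just described can be reproduced in a toy setting; the sketch below is hypothetical (dynamics $\dot x=u$, $x(0)=0$, uniform partition with $N=10$) and only illustrates the phenomenon: the two realizations coincide outside the finite (hence Lebesgue-null) set of partition nodes, yet generate different polygons.

```python
from fractions import Fraction

N = 10
partition_nodes = {Fraction(i, N) for i in range(N)}

u1 = lambda t: 0                                  # u = 0 everywhere on I
u2 = lambda t: 1 if t in partition_nodes else 0   # = u1 off the nodes

def polygon_endpoint(u, N):
    """Euler polygon for dx/dt = u(t) with the control frozen at the
    node values u(t_i), t_i = i/N (sampled exactly, via Fraction)."""
    x = Fraction(0)
    for i in range(N):
        x += Fraction(1, N) * u(Fraction(i, N))
    return x

print(polygon_endpoint(u1, N), polygon_endpoint(u2, N))   # prints 0 and 1
```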
Another way consists in abandoning the Krasovskii–Subbotin scheme in favour of an alternative sampling algorithm in which piecewise constant strategies for constructing the polygons are replaced by ‘piecewise program’ strategies (see, for example, [26]).
§ 12. Conclusion
This work presents a new approach to the theory of local extremum in problems of optimal control, which is an alternative (though related) to Pontryagin's principle. Along with an improvement of the classical necessary condition and a natural algorithmization, the advantages of the approach developed include rather simple proofs of the main results. Further research will be devoted to aspects of the practical implementation of the descent method proposed here. The results obtained can, as we believe, easily be carried over to some classes of problems of optimal stochastic control and mean-field control.
Acknowledgments
The authors are grateful to V. A. Dykhta for discussions of this work and a series of valuable suggestions, and to the anonymous referees, whose constructive observations contributed to an essential improvement of the original version of this paper.
Bibliography
1. V. A. Dykhta, “Weakly monotone solutions of the Hamilton–Jacobi inequality and optimality conditions with feedback controls”, Autom. Remote Control, 75:5 (2014), 829–844
2. V. A. Dykhta, “Variational necessary optimality conditions with feedback descent controls for optimal control problems”, Dokl. Math., 91:3 (2015), 394–396
3. V. A. Dykhta, “Feedback minimum principle: variational strengthening of the concept of extremality in optimal control”, Izv. Irkutsk. Gos. Univ. Ser. Mat., 41 (2022), 19–39 (Russian)
4. V. F. Krotov, “Global methods to improve control and optimal control of resonance interaction of light and matter”, Modeling and control of systems in engineering, quantum mechanics, economics and biosciences (Sophia–Antipolis 1988), Lect. Notes Control Inf. Sci., 121, Springer, Berlin, 1989, 267–298
5. L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze and E. F. Mishchenko, The mathematical theory of optimal processes, Intersci. Publ. John Wiley & Sons, Inc., New York–London, 1962, viii+360 pp.
6. A. Ya. Dubovitskii and A. A. Milyutin, “Extremum problems in the presence of restrictions”, U.S.S.R. Comput. Math. Math. Phys., 5:3 (1965), 1–80
7. R. J. DiPerna and P. L. Lions, “Ordinary differential equations, transport theory and Sobolev spaces”, Invent. Math., 98:3 (1989), 511–547
8. V. A. Srochko, Iterative methods of solution of optimal control problems, Fizmatlit, Moscow, 2000, 160 pp. (Russian)
9. L. Ambrosio and G. Savaré, “Gradient flows of probability measures”, Handbook of differential equations: evolutionary equations, v. III, Handb. Differ. Equ., Elsevier/North-Holland, Amsterdam, 2007, 1–136
10. V. I. Bogachev, Weak convergence of measures, Math. Surveys Monogr., 234, Amer. Math. Soc., Providence, RI, 2018, xii+286 pp.
11. N. N. Krasovskiĭ and A. I. Subbotin, Game-theoretical control problems, Springer Ser. Soviet Math., Springer-Verlag, New York, 1988, xii+517 pp.
12. M. A. Lavrentyev and L. A. Lyusternik, A course of variational calculus, GONTI–NKTI, Moscow–Leningrad, 1938, 192 pp. (Russian)
13. A. Bressan and B. Piccoli, Introduction to the mathematical theory of control, AIMS Ser. Appl. Math., 2, Amer. Inst. Math. Sci. (AIMS), Springfield, MO, 2007, xiv+312 pp.
14. N. Pogodaev, “Program strategies for a dynamic game in the space of measures”, Optim. Lett., 13:8 (2019), 1913–1925
15. N. Pogodaev and M. Staritsyn, “Impulsive control of nonlocal transport equations”, J. Differential Equations, 269:4 (2020), 3585–3623
16. A. A. Agrachev and Yu. L. Sachkov, Control theory from the geometric viewpoint, Encyclopaedia Math. Sci., 87, Control Theory Optim., II, Springer-Verlag, Berlin, 2004, xiv+412 pp.
17. R. J. Kipka and Yu. S. Ledyaev, “Extension of chronological calculus for dynamical systems on manifolds”, J. Differential Equations, 258:5 (2015), 1765–1790
18. R. Vinter, “Convex duality and nonlinear optimal control”, SIAM J. Control Optim., 31:2 (1993), 518–538
19. F. H. Clarke and C. Nour, “Nonconvex duality in optimal control”, SIAM J. Control Optim., 43:6 (2005), 2036–2048
20. V. A. Dykhta, “Nonstandard duality and nonlocal necessary optimality conditions in nonconvex optimal control problems”, Autom. Remote Control, 75:11 (2014), 1906–1921
21. M. Staritsyn, N. Pogodaev, R. Chertovskih and F. Lobo Pereira, “Feedback maximum principle for ensemble control of local continuity equations: an application to supervised machine learning”, IEEE Control Syst. Lett., 6 (2022), 1046–1051
22. C. Castaing, P. Raynaud de Fitte and M. Valadier, Young measures on topological spaces. With applications in control theory and probability theory, Math. Appl., 571, Kluwer Acad. Publ., Dordrecht, 2004, xii+320 pp.
23. V. I. Gurman, The extension principle in control problems, 2nd revised and augmented ed., Fizmatlit, Moscow, 1997, 288 pp. (Russian)
24. N. Pogodaev, “Optimal control of continuity equations”, NoDEA Nonlinear Differential Equations Appl., 23:2 (2016), 21, 24 pp.
25. A. V. Arutyunov, D. Yu. Karamzin and F. L. Pereira, “Conditions for the absence of jumps of the solution to the adjoint system of the maximum principle for optimal control problems with state constraints”, Proc. Steklov Inst. Math. (Suppl.), 292, suppl. 1 (2016), 27–35
26. M. Staritsyn and S. Sorokin, “On feedback strengthening of the maximum principle for measure differential equations”, J. Global Optim., 76:3 (2020), 587–612