Model order reduction for optimality systems through empirical gramians

Mechelli, Luca; Rohleff, Jan; Volkwein, Stefan

doi:10.3389/fams.2023.1144142

ORIGINAL RESEARCH article

Front. Appl. Math. Stat., 11 December 2023

Sec. Dynamical Systems

Volume 9 - 2023 | https://doi.org/10.3389/fams.2023.1144142

This article is part of the Research TopicData-based Model Order Reduction and Reduced Order Modelling of Dynamical SystemsView all 5 articles

Model order reduction for optimality systems through empirical gramians

Luca Mechelli^†

Jan Rohleff^†

Stefan Volkwein^*^†

Department of Mathematics and Statistics, University of Konstanz, Konstanz, Germany

In the present article, optimal control problems for linear parabolic partial differential equations (PDEs) with time-dependent coefficient functions are considered. One of the common approach in literature is to derive the first-order sufficient optimality system and to apply a finite element (FE) discretization. This leads to a specific linear but high-dimensional time variant (LTV) dynamical system. To reduce the size of the LTV system, we apply a tailored reduced order modeling technique based on empirical gramians and derived directly from the first-order optimality system. For testing purpose, we focus on two specific examples: a multiobjective optimization and a closed-loop optimal control problem. Our proposed methodology results to be better performing than a standard proper orthogonal decomposition (POD) approach for the above mentioned examples.

1 Introduction

Optimization problems constrained by time-dependent partial differential equations (PDEs) arise in many fields of application in engineering and across all sciences. Examples of such problems include optimal (material) design or optimal control of processes and inverse problems, where parameters of a PDE model are unknown and need to be estimated from measurements. The numerical solution of such problems is very challenging as the underlying PDEs have to be solved repeatedly within outer optimization algorithms, and the dimension of the parameters that need to be optimized might be very high or even infinite dimensional, especially when one is interested in multiobjective optimization or closed-loop optimal control. In a classical approach, the underlying PDE (forward problem) is approximated by a high dimensional full order model (FOM) that results from discretization. For the spatial discretization, often a finite element (FE) method is used, leading to high dimensional dynamical systems. Hence, the complexity of the optimization problem directly depends on the numbers of degrees of freedom (DOF) of the FOM. Mesh adaptivity has been advised to minimize the number of DOFs (see, e.g., [1, 2]).

A more recent approach is the usage of model order reduction (MOR) methods in order to replace the FOM by a surrogate reduced order model (ROM) of possibly very low dimension. MOR is a very active research field that has shown tremendous development in recent years, both from a theoretical and application point of view. For an introduction and overview, we refer to the monographs and collections [3–6]. In the context of optimal control, ROM is utilized [7–11]. In multiobjective optimization (cf., e.g., [12, 13]) and model predictive control (cf., e.g., [14, 15]), the situation is even more complex because many optimization problems have to be solved for varying data functions.

An appropriate ROM for the optimal control problem governed by high dimensional dynamical systems has to guarantee not only a sufficiently accurate surrogate model for the dynamical system but also, in particular, provide an acceptable suboptimal control. For that reason, MOR should inheritate properties such as controllability and observability of the ROM for the dynamical system (see [9, 16–19]). In the context of linear time-invariant (LTI) systems, it is well known that balanced truncation is an MOR strategy that ensures controllability and observability of the reduced LTI system under reasonable conditions [7, 9]. However, for linear time-variant (LTV) systems, balanced truncation cannot be applied directly. Here, the empirical gramian approach offers a promising extension [20], where an empirical controllability gramian and an empirical observability gramian are computed by simulations of the dynamical system for different impulse controls and different initial conditions, respectively. Then, the MOR can be derived from a suitable singular value decomposition; cf. Section 3. Recalling that for LTI systems, this approach is equivalent to balanced truncation (see [20, 21] for more details). In this study, we consider optimal control problems governed by linear parabolic PDE constraints with transient inhomogeinities or coefficients. After an FE discretization, we derive an LTV system of the form as follows:

\begin{array}{l} ẏ (t) = A (t) y (t) + B (t) u (t) + f (t) for t \in (0, T], y (0) = y_{\circ}, & (1a) \end{array}

\begin{array}{l} z (t) = C (t) y (t) for t \in (0, T] . & (1b) \end{array}

Let us mention that the empirical gramian approach can be also used for general non-linear dynamical systems. However, to get a low-dimensional ROM for a non-linear dynamical system, an efficient realization of the non-linear term is required, for instance, by applying the (discrete) empirical interpolation method (cf. [22–24]) or missing point evaluation (cf. [25]). For this reason, in this study, we focus on LTV systems and quadratic cost functionals. The optimal solutions can be characterized by a linear first-order optimality system due to convexity. This first-order optimality system can be observed as a extended coupled linear LTV system in the state and the adjoint variable. The new contribution of the present study is the development of a ROM, which is tailored to the optimization problem by utilizing empirical gramians computed from solutions of the coupled LTV system that describes the first-order necessary optimality conditions. The parameters of the state equation (different convection functions) and the cost functional (different desired states) lead to different inputs for the simulations required to compute the empirical gramians. It turns out that the obtained ROM is more reliable and robust regarding changes in the parameters than a multiple snapshot proper orthogonal decomposition (POD) method [11]. Furthermore, our MOR approach is certified by a-posteriori error estimates for the controls.

The study is organized as follows: In Section 2 we recall the empirical gramian framework and explain how this framework can be utilized to get a ROM for (1). Our empirical gramian approach is stated in Section 3. Based on the first-order optimality system, an MOR is computed. In Section 4, we first recall the multiple snapshot POD method. Moreover, the multiobjective optimal control problem and the closed-loop control are discussed. We presented some conclusions in Section 5. Finally, the a-posteriori error estimate is briefly presented in the Appendix.

2 Empirical gramians

In this section, we explain the concepts of empirical gramians and how they are used to perform model order reduction for linear time-variant input—output systems of the form (1). We mainly follow the study by [26] and suppose that T = ∞ holds. Throughout this study, we make use of the following hypothesis.

Assumption 1. For given control $u \in L^{\infty} (0, \infty; ℝ^{n_{u}})$ , the state and the output of system (1) satisfy $y \in L^{\infty} (0, \infty; ℝ^{n_{y}})$ and $z \in L^{\infty} (0, \infty; ℝ^{n_{z}})$ , respectively.

Remark 1. Assumption 1 is a standard assumption in the context of dynamical systems. As also noted in the study mentioned in [20], this assumption is generally satisfied for stable linear systems and exponentially stable non-linear systems. ◇

First, gramian-based methods are introduced for linear time-invariant (LTI) systems when

\begin{array}{l} A (t) = A, B (t) = B, C (t) = C and f (t) = 0 for t > 0 & (2) \end{array}

hold with constant matrices $A \in ℝ^{n_{y} \times n_{y}}$ , $B \in ℝ^{n_{y} \times n_{u}}$ , and $C \in ℝ^{n_{z} \times n_{y}}$ (see [27]). We recall the following definition from the study mentioned in [18] (Definitions 3.1 and 3.4).

Definition 2. The LTI system

\begin{array}{l} \begin{matrix} ẏ (t) = A y (t) + B u (t) for t > 0, x (0) = y_{\circ}, \\ z (t) = C y (t) for t > 0 \end{matrix} & (3) \end{array}

is called controllable if, for any initial state $y_{\circ} \in ℝ^{n_{y}}$ , final time T > 0 and final state $y_{T} \in ℝ^{n_{y}}$ ; there exists a piecewise continuous input $u : [0, T] \to ℝ^{n_{u}}$ such that the solution of (3) satisfies y(T) = y_T. Otherwise, the LTI system is said to be uncontrollable. The LTI system (3) is said to be observable, if, for any T > 0, the initial condition y_° can be determined from the knowledge of the input u(t) and the output y(t) for t ∈ [0, T]. Otherwise, (3) is called unobservable.

Suppose that the LTI system is stable, i.e., all eigenvalues of A have strictly negative real part; cf. [18] (Definition 3.1). In that case, the linear controllability gramian is defined as the symmetric matrix

\begin{array}{l} L_{c} : = C C^{*} = \int_{0}^{\infty} e^{A t} B B^{⊤} e^{A^{⊤} t} d t = \int_{0}^{\infty} (e^{A t} B) (e^{A t} B)^{⊤} d t \in ℝ^{n_{y} \times n_{y}}, \end{array}

where $C : L^{2} (0, \infty) \to ℝ^{n_{y}}$ , $u \mapsto \int_{0}^{\infty} e^{A t} B u (t) d t$ , is the (linear) controllability operator. Throughout the study, the symbol “⊤” stands for the transpose of a vector or matrix. Since the LTI system is stable, the operator $C$ is bounded and its (Hilbert space) adjoint $C^{*} : ℝ^{n_{y}} \to L^{2} (0, \infty)$ satisfies

\begin{array}{l} {〈 C^{*} v, φ 〉}_{L^{2} (0, T)} = {〈 v, C φ 〉}_{ℝ^{n_{y}}} for v \in ℝ^{n_{y}} and φ \in L^{2} (0, T) . \end{array}

Moreover, the linear observability gramian is given as the symmetric matrix

\begin{array}{l} L_{o} : = O^{*} O = \int_{0}^{\infty} e^{A^{⊤} t} C^{⊤} C e^{A t} d t = \\ \int_{0}^{\infty} (e^{A^{⊤} t} C^{⊤}) (e^{A^{⊤} t} C^{⊤})^{⊤} d t \in ℝ^{n_{y} \times n_{y}}, \end{array}

where $O : ℝ^{n_{y}} \to L^{2} (0, \infty; ℝ^{n_{z}})$ , x ↦ Ce^Atx is the (linear and bounded) observability operator. Recalling that L_c and L_o have the following properties (cf., e.g., [18, 28]; Section 3.8):

Lemma 3. Let us consider the LTI system (3). We assume that all eigenvalues of A have strictly negative real part. Then, the controllability gramian L_c is a positive semidefinite solution to the Lyapunov equation

\begin{array}{l} A L_{c} + L_{c} A^{⊤} + B B^{⊤} = 0 . & (4) \end{array}

If $Im (C) = ℝ^{n_{y}}$ , (4) admits a unique solution L_c which has full rank n_y and is positive definite. Moreover, the observability gramian L_o is a positive semidefinite solution to the Lyapunov equation

\begin{array}{l} A^{⊤} L_{o} + L_{o} A + C^{⊤} C = 0 . & (5) \end{array}

If $ker (O) = {0}$ , system (5) admits a unique solution L_o. Moreover, L_o is positive definite.

The matrices L_c and L_o contain essential information which states that the dynamical system is controllable and observable, respectively. This is utilized in model order reduction approaches such as balanced truncation, where states which are not or only less controllable and observable are neglected; cf., e.g., [9, 29, 30]. To combine both information, the so-called cross gramian matrix has been introduced; cf., e.g., [31–33]. Here, we have to assume that the input and output dimensions are the same. Thus, we have n_u = n_z and define the cross gramian matrix

\begin{array}{l} L_{x} : = \int_{0}^{\infty} e^{A t} B C e^{A t} d t = \int_{0}^{\infty} (e^{A t} B) (e^{A^{⊤} t} C^{⊤})^{⊤} d t \in ℝ^{n_{y} \times n_{y}} . \end{array}

Unfortunately, the previous relationships cannot be used in the time variant case (or even in the more general non-linear case). In the case of linear time variant systems, both gramians are functions of two variables: the initial time moment and the final time moment. Unlike linear time-invariant systems, the difference in time moments alone does not uniquely characterize the gramian matrices. If one changes the initial time moment while maintaining the interval length for both gramians, the resulting solutions will also differ. This discrepancy arises because the system description evolves over time, leading to changes in system characteristics. Consequently, the shifted system will differ from the original one. Therefore, in the literature, various methods exist for computing gramians in the context of linear time variant systems. A general extension is a data-driven approach, where tentative candidates are constructed by subsequent simulations of the model. That is the reason why such candidates are called empirical gramians. This concept is introduced in the study by [30] and extended in the study by [20]. It turns out that we get a data-driven approach because the empirical gramians can be computed from measured or computed trajectories of the dynamical system.

Definition 4. For given k ∈ ℕ let $I_{k} \in ℝ^{k \times k}$ be the identity matrix, we denote by $T$ ^k an arbitrary set of n_$T$ orthogonal matrices, i.e.,

\begin{array}{l} T^{k} : = {T_{1}, \dots, T_{n_{T}} | T_{i} \in ℝ^{k \times k}, T_{i}^{⊤} T_{i} = I_{k}, i = 1, \dots, n_{T}} \\ \subset ℝ^{k \times k} \end{array}

with $M$ an arbitrary set of s positive constants, i.e.,

\begin{array}{l} M : = {c_{1}, \dots, c_{n_{M}} | c_{i} \in ℝ, c_{i} > 0 for i = 1, \dots, n_{M}} \subset ℝ, \end{array}

and with $E$ ^k ⊂ ℝ^k the set of standard unit vectors $e_{1}, \dots, e_{k} \in ℝ^{k}$ . Furthermore, given n ∈ ℕ and a function w ∈ L^∞(0, ∞; ℝⁿ), we define the mean $\bar{w} \in ℝ^{n}$ as

\begin{array}{l} \bar{w} : = lim_{T \to \infty} \frac{1}{T} \int_{0}^{T} w (t) d t \in ℝ^{n} \end{array}

provided the limit exists.

Definition 5 (see [20, Definition 4]). Suppose that the sets $T^{n_{u}}$ , $M$ , and $E^{n_{u}}$ are given as in Definition 4 for k = n_u. For (1) the empirical controllability gramian ${\hat{L}}_{c} \in ℝ^{n_{y} \times n_{y}}$ is defined as follows:

\begin{array}{l} {\hat{L}}_{c} : = \sum_{i = 1}^{n_{T}} \sum_{j = 1}^{n_{M}} \sum_{l = 1}^{n_{u}} \frac{1}{c_{j}^{2} n_{T} n_{M}} \int_{0}^{\infty} Y^{i j l} (t) d t with Y^{i j l} (t) = \\ (y^{i j l} (t) - ȳ^{i j l}) (y^{i j l} (t) - ȳ^{i j l})^{⊤}, \end{array}

where $y^{i j l} (t) \in ℝ^{n_{y}}$ solves (1) corresponding to the impulse input u(t) = c_jT_ie_lδ(t) and ȳ^ijl(t) stands for its mean.

Remark 6. In our numerical example, we utilize the following weighted inner product in the state space $ℝ^{n_{y}}$ induced by the symmetric positive definite matrix $W \in ℝ^{n_{y} \times n_{y}}$ :

\begin{array}{l} {〈 y, ỹ 〉}_{W} = y^{⊤} W ỹ, {‖ y ‖}_{W} = \sqrt{{〈 y, y 〉}_{W}} for y, ỹ \in ℝ^{n_{y}} . & (6) \end{array}

In Definiton 5, the matrices Y^ijl(t) have to be replaced by (y^ijl(t) − ȳ^ijl)(y^ijl(t) − ȳ^ijl)^⊤W. ◇

Definition 7 (see [20, Definition 6]). Let the sets $T^{n_{y}}$ , $M$ , and $E^{n_{y}}$ be defined as in Definition 4 for the choice k = n_u. For (1) the empirical observability gramian, ${\hat{L}}_{o} \in ℝ^{n_{y} \times n_{y}}$ is defined as

\begin{array}{l} {\hat{L}}_{o} : = \sum_{i = 1}^{n_{T}} \sum_{j = 1}^{n_{M}} \frac{1}{c_{j}^{2} n_{T} n_{M}} \int_{0}^{\infty} T_{i} Z^{i j} (t) T_{i}^{⊤} d t with Z_{ν μ}^{i j} (t) = \\ (z^{i j ν} (t) - {\bar{z}}^{i j ν})^{⊤} (z^{i j μ} (t) - {\bar{z}}^{i j μ}), \end{array}

where $z^{i j ν} (t) \in ℝ^{n_{z}}$ , ν = 1, …, n_y, is the output of (1) corresponding to the initial condition y_° = c_jT_ie_ν, and ${\bar{z}}^{i j ν}$ denotes its mean.

Remark 8. 1) In our numerical examples, we have n_y = n_z so that the output space is also supplied by the weighted inner product 〈·, ·〉_W introduced in (6). Then, we replace the matrix elements $Z_{ν μ}^{i j} (t)$ by ${(z^{i j ν} (t) - {\bar{z}}^{i j ν})}^{⊤} W (z^{i j μ} (t) - {\bar{z}}^{i j μ})$ .

2) Recalling that the empirical gramians reconstruct the true gramians in the case of LTI systems; cf. [20] (Lemmas 5 and 7). In all the other cases, they can be observed as an extension of the gramian notion.

3) In [26] (Section 3.1.3), also an empirical cross gramian is introduced. Since in our numerical examples, the results based on empirical cross gramians are not as good as the ones using a combination of empirical controllability and observability gramians, we skip the empirical cross gramian here. ◇

Given the symmetric empirical gramians ${\hat{L}}_{c}$ and ${\hat{L}}_{o}$ , it is possible to define a reduced order model through a balancing transformation which is based on the singular value decomposition (SVD). We follow the approach proposed in the study by [21]. First, we compute the SVD of the symmetric matrices ${\hat{L}}_{c}$ and ${\hat{L}}_{o}$ :

\begin{array}{l} {\hat{L}}_{c} = U_{c} Σ_{c} U_{c}^{⊤} \in ℝ^{n_{y} \times n_{y}} and {\hat{L}}_{o} = U_{o} Σ_{o} U_{o}^{⊤} \in ℝ^{n_{y} \times n_{y}} & (7a) \end{array}

with orthogonal matrices U_c, $U_{o} \in ℝ^{n_{y} \times n_{y}}$ and diagonal matrices Σ_c, $Σ_{o} \in ℝ^{n_{y} \times n_{y}}$ containing the non-negative singular values in descending order. Then, we can compute the matrices ${\hat{L}}_{c}^{1 / 2}$ and ${\hat{L}}_{o}^{1 / 2}$ as well as their product

\begin{array}{l} {\hat{L}}_{c o}^{1 / 2} : = {\hat{L}}_{c}^{1 / 2} {\hat{L}}_{o}^{1 / 2} = U_{c} Σ_{c}^{1 / 2} U_{c}^{⊤} U_{o} Σ_{o}^{1 / 2} U_{o}^{⊤} \in ℝ^{n_{y} \times n_{y}} . & (7b) \end{array}

Finally, we derive the SVD of ${\hat{L}}_{c o}^{1 / 2}$ :

\begin{array}{l} {\hat{L}}_{c o}^{1 / 2} = U Σ V^{⊤} & (7c) \end{array}

with orthogonal matrices U, $V \in ℝ^{n_{y} \times n_{y}}$ and a diagonal matrix $Σ \in ℝ^{n_{y} \times n_{y}}$ containing the non-negative singular values σ_i, i = 1, …, n_y, of ${\hat{L}}_{c o}^{1 / 2}$ in descending order.

Remark 9. Recalling that for LTI systems, the matrix L_co = L_cL_o possesses an eigenvalue decomposition (cf. [18, p. 77]):

\begin{array}{l} L_{c} L_{o} = Ũ [\begin{matrix} {\tilde{Σ}}^{2} & 0 \\ 0 & 0 \end{matrix}] Ũ^{- 1} \end{array}

which is equivalent to the fact that $L_{c o}^{1 / 2}$ has the following SVD

\begin{array}{l} L_{c}^{1 / 2} L_{o}^{1 / 2} = [Û_{1} | Û_{2}] [\begin{matrix} \tilde{Σ} & 0 \\ 0 & 0 \end{matrix}] [\begin{matrix} {\hat{V}}_{1}^{⊤} \\ {\hat{V}}_{2}^{⊤} \end{matrix}], \end{array}

see [21] (Theorem 1). Then, the LTI system can be transformed into a minimal realization, i.e., the transformed LTI system is balanced, and its controllability and observability gramians are both equal to Σ. ◇

Next, we turn to the reduced order modeling (ROM). Here, we utilize the approach from balanced truncation. To build a reduced-order model, we truncate the sufficiently small singular values, taking only ℓ ≪ n_y so that σ_i < ε holds for i = ℓ + 1, …, n_y. For that purpose, we partition the SVD as follows:

\begin{array}{l} {\hat{L}}_{c}^{1 / 2} {\hat{L}}_{o}^{1 / 2} = U Σ V^{⊤} = [U_{1} | U_{2}] [\begin{matrix} Σ_{1} & 0 \\ 0 & Σ_{2} \end{matrix}] [V_{1} | V_{2}]^{⊤} = \\ (\begin{matrix} U_{1} Σ_{1} V_{1}^{⊤} & 0 \\ 0 & U_{2} Σ_{2} V_{2}^{⊤} \end{matrix}), \end{array}

where $U_{1} \in ℝ^{n_{y} \times ℓ}$ , $U_{2} \in ℝ^{n_{y} \times (n_{y} - ℓ)}$ , and $Σ_{1} \in ℝ^{ℓ \times ℓ}$ , $Σ_{2} \in ℝ^{(n_{y} - ℓ) \times (n_{y} - ℓ)}$ . We suppose that the matrix $W^{ℓ} = V_{1}^{⊤} U_{1} \in ℝ^{ℓ \times ℓ}$ is invertible. Now the reduced-order model for (1) is derived in a standard way (cf., e.g., [9]). For given large terminal time T > 0 and t ∈ [0, T], we approximate $y (t) \in ℝ^{n_{y}}$ by $U_{1} y^{ℓ} (t)$ , where y^ℓ(t) ∈ ℝ^ℓ solves together with $z^{ℓ} (t) \in ℝ^{n_{z}}$

\begin{array}{l} W^{ℓ} ẏ^{ℓ} (t) = A^{ℓ} (t) y^{ℓ} (t) + B^{ℓ} (t) u (t) + \\ V_{1}^{⊤} f (t) for t \in (0, T], y^{ℓ} (0) = V_{1}^{⊤} y_{\circ}, & (8a) \end{array}

\begin{array}{l} z^{ℓ} (t) = C (t) U_{1} y^{ℓ} (t) for t \in [0, T] . & (8b) \end{array}

with $A^{ℓ} (t) = V_{1}^{⊤} A (t) U_{1} \in ℝ^{ℓ \times ℓ}$ , $B^{ℓ} (t) = V_{1}^{⊤} B (t) \in ℝ^{ℓ \times n_{u}}$ , $f^{ℓ} (t) = V_{1}^{⊤} f (t)$ and $y_{\circ}^{ℓ} = V_{1}^{⊤} y_{\circ}$ .

Remark 10. 1) Equation (8a) is an ℓ-dimensional nonlinear system of differential equation. Since we assume ℓ ≪ n_y, (8) is called a low-dimensional or reduced order model for (1).

2) Note that for an LTI system this approach is equivalent to balanced truncation; see, e.g., [20, 21] for more details.

3) In our numerical examples the matrices are given as $W^{ℓ} = V_{1}^{⊤} W U_{1}$ , $A^{ℓ} (t) = V_{1}^{⊤} W A (t) U_{1} \in ℝ^{ℓ \times ℓ}$ , $B^{ℓ} (t) = V_{1}^{⊤} W B (t) \in ℝ^{ℓ \times n_{u}}$ , $f^{ℓ} (t) = V_{1}^{⊤} W f (t)$ and $y_{\circ}^{ℓ} = V_{1}^{⊤} W y_{\circ}$ ; cf. Remarks 6 and 8. ◇

3 Model reduction for linear time-variant optimality systems

Notices that (1) is an LTV system and therefore to reduce it one cannot apply directly balanced truncation. Here we follow a different approach. We are interested in controlling (1) in an optimal way. Thus, we define the following objective function

\begin{array}{l} \begin{array}{l} J (y, u) = \frac{1}{2} \int_{0}^{T} {‖ C (t) y (t) - z_{d} (t) ‖}_{Q}^{2} d t + \frac{1}{2} \int_{0}^{T} {‖ u (t) ‖}_{R}^{2} d t \\ = \frac{1}{2} \int_{0}^{T} {‖ z (t) - z_{d} (t) ‖}_{Q}^{2} d t + \frac{1}{2} \int_{0}^{T} {‖ u (t) ‖}_{R}^{2} d t, \end{array} & (9) \end{array}

where $z_{d} \in L^{2} (0, T; ℝ^{n_{z}})$ is a desired state, $Q \in ℝ^{n_{z} \times n_{z}}$ and $R \in ℝ^{n_{u} \times n_{u}}$ are symmetric positive definite matrices, $‖ \cdot | |_{Q} = {〈 \cdot, \cdot 〉}_{Q}^{1 / 2}$ and $‖ \cdot | |_{R} = {〈 \cdot, \cdot 〉}_{R}^{1 / 2}$ hold. Now the optimal control problems reads

\begin{array}{l} min J (y, u) subject to (s.t.) (y, u) \in H^{1} (0, T; ℝ^{n_{y}}) \times (P) \\ L^{2} (0, T; ℝ^{n_{u}}) satisfies (1) . \end{array}

The Lagrange functional associated with P is given by

\begin{array}{r} ℒ (y, u, p, p_{°}) = J (y, u) + \int_{0}^{T} 〈 \dot{y} (t) - A (t) y (t) - B (t) u (t) - \\ f (t), p (t) 〉_{W} d t + {〈 y (0) - y_{\circ}, p_{\circ} 〉}_{W}, \end{array}

where the weighted inner product has been defined in (6). Due to convexity, a first-order sufficient optimality system can be derived from stationarity conditions of the Lagrangian (see, e.g., [34])

\begin{array}{l} ẏ (t) = A (t) y (t) + B (t) R^{- 1} B {(t)}^{⊤} W p (t) + f (t), t \in (0, T], y (0) = y_{\circ}, & (10a) \end{array}

\begin{array}{l} - ṗ (t) = W^{- 1} (A {(t)}^{⊤} W p (t) + \\ C {(t)}^{⊤} Q (z_{d} (t) - C (t) y (t))), t \in [0, T), p (T) = 0 & (10b) \end{array}

and

\begin{array}{l} u (t) = R^{- 1} B {(t)}^{⊤} W p (t) for t \in (0, T) . & (11) \end{array}

Note that utilizing (11) the control u is replaced in (10a). System (10) can be rewritten to get the form of a dynamical system, although formally it is not. We introduce the transformed variable q(t): = p(T − t). For brevity, we set

\begin{array}{l} {\tilde{z}}_{d} (t) = z_{d} (T - t) \in ℝ^{n_{z}}, Ã (t) = A (T - t) \in ℝ^{n_{y} \times n_{y}}, \\ \tilde{C} (t) = C (T - t) \in ℝ^{n_{z} \times n_{y}} \end{array}

for t ∈ [0, T]. This allows us to express (10) as

\begin{array}{l} \begin{matrix} ẏ (t) = A (t) y (t) + B (t) R^{- 1} B {(t)}^{⊤} W q (T - t) + f (t), t \in (0, T], y (0) = y_{\circ}, \\ \overset{\circ}{q} (t) = W^{- 1} (Ã {(t)}^{⊤} W q (t) + \tilde{C} {(t)}^{⊤} Q ({\tilde{z}}_{d} (t) - \\ \tilde{C} (t) y (T - t))), t \in [0, T), q (0) = 0 . \end{matrix} & (12) \end{array}

For t ∈ (0, T) let

\begin{array}{l} x (t) = [\begin{matrix} y (t) \\ q (t) \end{matrix}] \in ℝ^{2 n_{y}}, F (t) = [\begin{matrix} f (t) \\ \tilde{C} {(t)}^{⊤} Q {\tilde{z}}_{d} (t) \end{matrix}] \in ℝ^{2 n_{y}}, \\ x_{\circ} = [\begin{matrix} y_{\circ} \\ 0 \end{matrix}] \in ℝ^{2 n_{y}} . \end{array}

Then, (12) can be written as

\begin{array}{l} ẋ (t) = A (t) x (t) + \tilde{A} (t) x (T - t) + F (t) for t \in (0, T), x (0) = x_{\circ} & (13) \end{array}

with the two (2n_y) × (2n_y)-matrices

\begin{array}{l} A (t) = [\begin{matrix} A (t) & 0 \\ 0 & W^{- 1} Ã {(t)}^{⊤} W \end{matrix}], \tilde{A} (t) = \\ [\begin{matrix} 0 & B (t) R^{- 1} B {(t)}^{⊤} W \\ - W^{- 1} \tilde{C} {(t)}^{⊤} Q \tilde{C} (t) & 0 \end{matrix}] . \end{array}

Although (13) has not a canonical form, it can be seen as a dynamical system, because the solution x evolves from 0 to the final time T. Thus, we can apply a model-order reduction scheme based on empirical gramians as introduced in Section 2. For that purpose we interprete (13) as an input-output system, where we consider the desired states z_d as input and the control $ũ (t) = R^{- 1} B {(t)}^{⊤} W \hat{x} (T - t)$ as output with $\hat{x} (t) = {(x_{n_{y} + 1} (t), \dots, x_{2 n_{y}} (t))}^{⊤}$ . Note that $\hat{x} (t)$ is nothing else than the adjoint variable (modulus a time transformation) for P. Hence the output ũ is the optimal control sought. Applying the empirical gramian approach explained in Section 2, we can construct a reduced-order model for (13) which should achieve satisfactory approximation performances while the target z_d varies. What we aim in fact is not to construct a reduced order model for (1), but a way to cheaply compute the optimal solution of problem P for different values of its underlying parameters, such as the right-hand side f or the target z_d. This technique is particularly useful in the context of multiobjective optimization or model predictive control (MPC), where many optimization problems with changing parameters must be solved; cf. Section 4.

Remark 11. Let us define the right-hand side

\begin{array}{l} F (t, x) = (\begin{matrix} A (t) & B (t) R^{- 1} B {(t)}^{⊤} W \\ W^{- 1} C {(t)}^{⊤} Q C (t) & - W^{- 1} A {(t)}^{⊤} W \end{matrix}) x + \\ (\begin{matrix} f (t) \\ - W^{- 1} C {(t)}^{⊤} Q z_{d} (t) \end{matrix}) \end{array}

for t ≥ 0 and $x = (y, p) \in ℝ^{n_{y}} \times ℝ^{n_{y}}$ . Then we can write the differential equations (10) as follows

\begin{array}{l} ẋ (t) = F (t, x (t)) for t > 0 . & (14) \end{array}

However, we cannot pose an initial condition for the second component p of x (due to the terminal condition for the dual variable). Suppose that Φ_t(x_°) denotes the solution to (14) at time t ≥ 0 for an arbitrarily chosen initial condition $x_{\circ} \in ℝ^{2 n_{y}}$ . Then, Φ_t is a symplectic function [35, Lemma 1], and we can introduce symplectic Koopman operators to get accurate approximations of certain optimal nonlinear feedback laws; cf. [35] for more details. Here we follow a different approach. ◇

4 Two applications

In this section we illustrate our proposed strategy for two different examples, which are highlighting the power of the empirical gramians in comparison to a standard approach based on Proper Orthogonal Decomposition (POD). In both examples we consider optimization problems with parabolic PDE constraints. In Section 4.1 we introduce the POD method, which will serve as comparison for our results. In Section 4.2 we apply our proposed methodology to a multiobjective optimization framework, while in Section 4.3 we test it in a feedback control context.

From now on, let T > 0 and Ω ⊂ ℝ^d (d ∈ {1, 2, 3}) with boundary Γ = ∂Ω. We set Q = (0, T) × Ω and Σ = (0, T) × ∂Ω. Furthermore, we define the function spaces V = H¹(Ω), H = L²(Ω), $Y$ = W(0, T) = L²(0, T; V) ∩ H¹(0, T; V′) and $H$ = $U$ = L²(0, T; H). For more details on Lebesgue and Sobolev spaces we refer the reader to [36], for instance.

4.1 The POD method

The (discrete) POD method is based on constructing a low-dimensional subspace that can resemble the information carried out by a given set of vectors ${y_{j}^{k}}_{j = 1}^{n}$ , 1 ≤ k ≤ ℘, (the so-called snapshots) belonging to the Euclidean space ℝ^m; cf., e.g., [37] and [11] (Section 2.1). Let

\begin{array}{l} V = span {y_{j}^{k} | 1 \leq j \leq n and 1 \leq k \leq ℘} \subset ℝ^{m} \end{array}

be the space spanned by the snapshots with dimension n_$V$ = dim $V$ ≤ min{n℘, m}. To avoid trivial cases we assume n_$V$ ≥ 0. For ℓ ≤ n_$V$ the POD method generates pairwise orthonormal functions ${ψ_{i}}_{i = 1}^{ℓ}$ such that all $y_{j}^{k}$ , 1 ≤ j ≤ n and 1 ≤ k ≤ ℘, can be represented with sufficient accuracy by a linear combination of the ψ_i's. This is done by a minimization of the mean square error between the $y_{j}^{k}$ 's and their corresponding ℓ-th partial Fourier sum:

\begin{array}{l} {\begin{array}{l} min \sum_{k = 1}^{℘} \sum_{j = 1}^{n} α_{j} ‖ y_{j}^{k} - \sum_{i = 1}^{ℓ} {〈 y_{j}^{k}, ψ_{i} 〉}_{\tilde{W}} ψ_{i} ‖_{\tilde{W}}^{2} \\ s.t. {ψ_{i}}_{i = 1}^{ℓ} \subset ℝ^{m} and {〈 ψ_{i}, ψ_{j} 〉}_{\tilde{W}} = δ_{i j}, 1 \leq i, j \leq ℓ, \end{array} & (15) \end{array}

where the α_j's are positive weighting parameters for j = 1, …, n. In fact, the weights α_j are chosen to resemble a trapezoidal rule for the temporal integration. Moreover, $\tilde{W} \in ℝ^{m \times m}$ defines the weighted inner product ${〈 \cdot, \cdot 〉}_{\tilde{W}}$ and associated norm $‖ \cdot | |_{\tilde{W}}$ ; cf. (6). The symbol δ_ij denotes the Kronecker symbol satisfying δ_ii = 1 and δ_ij = 0 for i ≠ j. An optimal solution to (15) is denoted as a POD basis of rank ℓ. It can be proven that such a solution is characterized by the eigenvalue problem

\begin{array}{l} R ψ_{i} = λ_{i} ψ_{i} for 1 \leq i \leq ℓ, & (16) \end{array}

where λ₁ ≥ … ≥ λ_ℓ ≥ … ≥ λ_{n_$V$} > 0 denote the eigenvalues of the linear, compact, nonnegative and self-adjoint operator

R : ℝ^{m} \to ℝ^{m}, R ψ = \sum_{k = 1}^{℘} \sum_{j = 1}^{n} α_{j} {〈 ψ, y_{j}^{k} 〉}_{\tilde{W}} y_{j}^{k} for ψ \in ℝ^{m};

cf., e.g., [11] (Lemma 2.2). We refer to [11] (Remark 2.11) for an explicit form of the operator $R$ in the case ℘ = 2. Recall that for a solution ${ψ_{i}}_{i = 1}^{ℓ}$ to (16) the following approximation error formula holds true:

\sum_{k = 1}^{℘} \sum_{j = 1}^{n} α_{j} ‖ y_{j}^{k} - \sum_{i = 1}^{ℓ} {〈 y_{j}^{k}, ψ_{i} 〉}_{\tilde{W}} ψ_{i} ‖_{\tilde{W}}^{2} = \sum_{i = ℓ + 1}^{n_{V}} λ_{i};

cf. [11] (Theorem 2.7). In our numerical tests the (temporal) snapshots ${y_{j}^{k}}_{j = 1}^{n}$ will generally come from simulations of (13) for ℘ different choices of the inputs. The index n is related to the discrete time steps 0 = t₁ < … < t_n = T and ℘ to the number of inputs used. We define the POD matrix $Ψ : = [ψ_{1} | \dots | ψ_{ℓ}] \in ℝ^{m \times ℓ}$ and derive – analogous to (8a) – a POD-based reduced order model for (13) by choosing U₁ = Ψ and $V_{1} = \tilde{W} Ψ$ which satisfy $U_{1}^{⊤} V_{1} = Ψ^{⊤} \tilde{W} Ψ = I_{ℓ}$ .

4.2 Multiobjective optimization problem

There is hardly ever a situation, where only one goal is of interest at a time. When carrying out a purchase for example, we want to pay a low price while getting a high quality product. In the same manner, multiple objectives are present in most technical applications such as fast and energy efficient driving or designing light and stable constructions. This dilemma leads to the field of multiobjective optimization (cf., e.g., [12]), where the aim is to minimize all relevant objectives J = (J₁, …, J_k) simultaneously.

While we are usually satisfied with one (global or even local) optimal solution in the single-objective setting, there generally exists an infinite number of optimal compromises in the situation where multiple objectives are present since the different objectives are in conflict with each other. Let us recall the concept of Pareto optimality:

1) A point ū dominates a point u, if J(ū) ≤ J(u) componentwise and J_i(ū) < J_i(u) for at least one i ∈ {1, …, k};

2) A feasible point ū is called Pareto optimal if there exists no feasible u dominating ū;

3) The set of Pareto optimal points is called the Pareto set, its image under J the Pareto front.

In multiobjective optimization, the goal can be therefore to compute the Pareto set, i.e., the set of points ū for which no u exists that is superior in all objective components, cf. Figure 1.

FIGURE 1

Figure 1. Sketch (red lines) of a Pareto set (A) and Pareto front (B) for min{J(u)|u ∈ ℝ²} with J:ℝ² → ℝ², $J_{1} (u) = {(u_{1} + 1)}^{2} + {(u_{2} + 1)}^{2}$ and $J_{2} (u) = {(u_{1} - 1)}^{2} + {(u_{2} - 1)}^{2}$ . The goal is to simultaneously minimize J₁ and J₂ represented by the blue and the orange paraboloid, respectively.

4.2.1 Problem formulation

Now we turn to our PDE-constrained multiobjective optimization problem (MOP). Given the canonical embedding $C : Y \to H$ and two desired states $𝔷_{d}^{1}$ , $𝔷_{d}^{2} \in H$ we consider the following strictly convex MOP:

\begin{array}{l} \begin{array}{l} min J (𝔲) = [\begin{matrix} J_{1} (𝔲) \\ J_{2} (𝔲) \end{matrix}] = \frac{1}{2} [\begin{array}{l} {‖ C 𝔶_{𝔲} - 𝔷_{d}^{1} ‖}_{H}^{2} + σ {‖ 𝔲 ‖}_{U}^{2} \\ {‖ C 𝔶_{𝔲} - 𝔷_{d}^{2} ‖}_{H}^{2} + σ {‖ 𝔲 ‖}_{U}^{2} \end{array}] \\ s.t. 𝔲 \in U and 𝔶 = 𝔶_{𝔲} \in Y solves the linear parabolic problem \\ \begin{array}{l} \frac{\partial 𝔶_{𝔲}}{\partial t} (t, x) - Δ 𝔶_{𝔲} (t, x) + 𝔳 (x) \cdot \nabla 𝔶_{𝔲} (t, x) = 𝔲 (t, x) + \\ 𝔣 (t, x) for (t, x) \in Q, \\ \frac{\partial 𝔶_{𝔲}}{\partial n} (t, s) = 0 for (t, s) \in Σ, 𝔶_{𝔲} (0, x) = 𝔶_{\circ} (x) for x \in Ω, \end{array} \end{array} \\ (\hat{P}) \end{array}

where σ > 0, 𝔳 ∈ C(Ω; ℝ^d), 𝔣 ∈ $H$ and 𝔶_° ∈ H hold. Recall that a unique (weak) solution 𝔶_𝔲 ∈ $Y$ follows from standard results for any control 𝔲 ∈ $U$ ; cf., e.g., [38, 39].

Due to convexity it can be proven that a sufficient first-order optimality condition for $\hat{P}$ is given as follows: If there exist a weight $\bar{α} \in [0, 1]$ satisfying together with an element $\bar{𝔲} \in U$ the conditions

\begin{array}{l} \bar{α} \nabla J_{1} (\bar{𝔲}) + (1 - \bar{α}) \nabla J_{2} (\bar{𝔲}) = 0 & (17) \end{array}

then $\bar{𝔲}$ is a Pareto optimal point for $\hat{P}$ ; see [12] (Theorem 3.27) and [13] (Section 2.1), for instance.

Remark 12. For fixed weight $\bar{α} \in [0, 1]$ we observe that (17) is a sufficient first-order optimality condition for the (strictly) convex problem

\begin{array}{l} min J^{α} (𝔲) = α J_{1} (𝔲) + (1 - α) J_{2} (𝔲) s.t. 𝔲 \in U . & (18) \end{array}

with the choice $α = \bar{α}$ . This motivates the well-known weighted sum method; cf. [40]. By varying the weight α ∈ [0, 1] the strictly convex problem (18) is solved. This requires many optimization solves, and model order reduction offers the possibility to speed-up the solution process. ◇

For arbitrarily chosen α ∈ [0, 1] we set

\begin{array}{l} C^{α} = (\begin{matrix} α C \\ (1 - α) C \end{matrix}) : Y \to H \times H and 𝔷_{d}^{α} = (\begin{matrix} α 𝔷_{d}^{1} \\ (1 - α) 𝔷_{d}^{2} \end{matrix}) \\ \in H \times H . \end{array}

Following Remark 12, we consider the (strictly) convex scalar optimization problem

\begin{array}{l} \begin{array}{l} min J_{α} (𝔲) = \frac{1}{2} {‖ C^{α} 𝔶_{𝔲} - 𝔷_{d}^{α} ‖}_{H \times H}^{2} + \frac{σ}{2} {‖ 𝔲 ‖}_{U}^{2} \\ s.t. u \in U and 𝔶_{𝔲} \in Y solves the linear parabolic problem \\ \frac{\partial 𝔶_{𝔲}}{\partial t} (t, x) - Δ 𝔶_{𝔲} (t, x) + \\ 𝔳 (x) \cdot \nabla 𝔶_{𝔲} (t, x) = 𝔲 (t, x) + 𝔣 (t, x) for (t, x) \in Q, \\ \frac{\partial 𝔶_{𝔲}}{\partial n} (t, s) = 0 for (t, s) \in Σ, 𝔶_{𝔲} (0, x) = 𝔶_{\circ} (x) for x \in Ω . \end{array} \\ ({\hat{P}}^{α}) \end{array}

Next, we discretize ${\hat{P}}^{α}$ in space using piecewise linear finite elements (FE). Let m = n_y = n_u, n_z = 2m be valid and φ₁, …, φ_m ∈ V denote the FE ansatz functions. The FE subspace is $V^{fe} = span {φ_{1}, \dots, φ_{m}} \subset V$ . For (t, x) ∈ Ω_T we approximate the state and control as

\begin{array}{l} 𝔶 (t, x) \approx 𝔶^{fe} (t, x) = \sum_{i = 1}^{m} y_{i} (t) φ_{i} (x), 𝔲 (t, x) \approx 𝔲^{fe} (t, x) = \\ \sum_{i = 1}^{m} u_{i} (t) φ_{i} (x), \end{array}

respectively. Moreover, let

\begin{array}{l} 𝔷_{d}^{j} (t, x) \approx \sum_{i = 1}^{m} z_{d i}^{j} (t) φ_{i} (x) (j = 1, 2), 𝔣 (t, x) \\ \approx \sum_{i = 1}^{m} f_{i} (t) φ_{i} (x), 𝔶_{\circ} (x) \approx \sum_{i = 1}^{m} y_{\circ i} φ_{i} (x) \end{array}

for (t, x) ∈ Q. Finally, we introduce the m × m matrices

\begin{array}{l} M = (({〈 φ_{j}, φ_{i} 〉}_{H})), D = (({〈 φ_{j}, v \cdot \nabla φ_{i} 〉}_{H})), \\ S = (({〈 \nabla φ_{j}, \nabla φ_{i} 〉}_{H^{d}})), A = S + D, B = I_{m}, R = σ M, W = \tilde{W} = M \\ C = (\begin{matrix} α I_{m} \\ (1 - α) I_{m} \end{matrix}), z_{d}^{α} (t) = (\begin{matrix} α z_{d}^{1} (t) \\ (1 - α) z_{d}^{2} (t) \end{matrix}), Q^{α} = (\begin{matrix} α M & 0 \\ 0 & (1 - α) M \end{matrix}) . \end{array}

Now problem ${\hat{P}}_{fe}^{α}$ is approximated by the following FE optimization problem

\begin{array}{l} \begin{array}{l} min J_{fe}^{α} (u) = \int_{0}^{T} \frac{1}{2} {‖ C^{α} y (t) - z_{d}^{α} (t) ‖}_{Q^{α}}^{2} + \frac{1}{2} {‖ u (t) ‖}_{R}^{2} d t \\ s.t. u \in U^{fe} and y \in Y^{fe} solves the initial value problem \\ M ẏ (t) = A y (t) + M u (t) + M f (t) for t \in (0, T], y (0) = y_{\circ}, \end{array} \\ ({\hat{P}}_{fe}^{α}) \end{array}

where $U$ ^fe = L²(0, T; ℝ^m) and $Y$ ^fe = H¹(0, T; ℝ^m). Notice that in this semidiscrete setting, our problem is equivalent to (1a) of Section 3. For (10) we derive

\begin{array}{l} ẏ (t) = M^{- 1} A y (t) + \frac{1}{σ} p (t) + f (t), t \in (0, T], y (0) = y_{\circ}, & (19a) \end{array}

\begin{array}{l} - ṗ (t) = M^{- 1} A^{⊤} p (t) + α z_{d}^{1} (t) + \\ (1 - α) z_{d}^{2} (t) - y (t), t \in [0, T), p (T) = 0, & (19b) \end{array}

where we have used u(t) = R⁻¹B(t)^⊤Wp(t) = p(t)/σ.

In particular, a solution to ${\hat{P}}_{fe}^{α}$ can be computed by solving the sufficient first-order optimality system (13). For the specific case of ${\hat{P}}_{fe}^{α}$ we have

\begin{array}{l} ẋ (t) = A x (t) + \tilde{A} x (T - t) + F^{α} (t) for t \in (0, T), x (0) = x_{\circ} & (20) \end{array}

with

\begin{array}{l} A = [\begin{matrix} M^{- 1} A & 0 \\ 0 & M^{- 1} A^{⊤} \end{matrix}], \tilde{A} = [\begin{matrix} 0 & \frac{1}{σ} I_{m} \\ - I_{m} & 0 \end{matrix}], \\ F^{α} (t) = [\begin{matrix} f (t) \\ {\tilde{z}}_{d}^{α} (t) \end{matrix}], x_{\circ} = [\begin{matrix} y_{\circ} \\ 0 \end{matrix}] \end{array}

and ${\tilde{z}}_{d}^{α} (t) = α z_{d}^{1} (T - t) + (1 - α) z_{d}^{2} (T - t)$ .

In conclusion, to compute the Pareto front of the multiobjective optimization problem ${\hat{P}}_{fe}^{α}$ with a weighted sum approach we have to solve several optimality systems of the form (20) with changing parameter α. For the gramian based MOR approach we proceed as described in Algorithm 1.

ALGORITHM 1

Algorithm 1. Gramian-based MOP using the weighted sum approach.

Remark 13.(1) Steps 1–4 of Algorithm 1 we call training or offline phase. For the gramian orthogonal matrices set $T$ ^2m, respectively the positive constant set $M$ we choose n_$T$ = n_$M$ = 3, where $I_{2 m} \in T^{2 n_{y}}$ and 1 ∈ $E$ . The other two orthogonal matrices and scalars were chosen randomly. Furthermore, we restrict the set $E$ ^2m to ten random unit vectors to save computational effort. Altogether, we compute 3 × 3 × 10 = 90 trajectories to (20) to get the empirical controllability gramian. For ẑ_d we choose a random input instead of δ-impulse inputs; cf., e.g., [26].

(2) Note that our gramian-based approach ensures sufficiently good approximation qualities for varying desired state ${\tilde{z}}_{d}^{α}$ for α ∈ [0, 1]. This is required in the weighted sum method.

(3) For solving the dynamical systems (20) and (21) over a fixed time horizon [0, T] we use the implicit Euler scheme with the equidistant time grid t_j = (j − 1)Δt and Δt = T/(n − 1) for j = 1, …, n with step size Δt > 0. One can show that the resulting linear system is uniquely solvable.

(4) Since the empirical gramians are trained with respect to the desired state, the resulting basis functions are flexible while using different targets. Therefore, they are also more robust for a varying parameter α, which is pretty important while computing the Pareto front. ◇

In our numerical experiments we compare the results that we got from Algorithm 1 with the ones obtained from a POD reduced-order modeling approach. The snapshots used to generate the POD basis are the solutions of (20) while computing the controllabilty and observability gramians in Algorithm 1. Clearly, this is done only for fair comparison purposes among the techniques. In general, building a POD model capable of approximating the solution of problems with different targets would require to run many simulations varying targets and controls. Unfortunately this is possible only if the targets are known a-priori. Otherwise the only option remaining is choosing random quantities, which would require exactly the same effort of the gramian approach. As we can see, the big difference is that the gramian approach is capable to exploit the randomness to construct a good approximation of the full order model, while the POD seriously struggles if the generated snapshots are not close to the optimal solution of the problem. Furthermore, notice that if we train our POD model by computing snapshots for varying targets and controls we would get a huge amount of snapshots, which will lead to a costly SVD, while the gramian have the advantage to lead to perform SVDs on matrix which have the same size of the problem. The costly SVD for the POD model is directly noticeable at the CPU time of the offline phase; cf. Table 2.

4.2.2 Numerical results

In the following numerical test the chosen parameter and problem required functions are summarized in Table 1.

TABLE 1

Table 1. Domain, optimization and PDE data for the numerical implementation of $\hat{P}$ .

First of all, already in Figure 2 one can notice the better accuracy of the proposed approach based on empirical gramians for two different number of reduced order models. To better quantify the accuracy of the reduced-order solutions we compare the absolute errors

\begin{array}{l} t \mapsto er r_{abs}^{J} = ‖ (\begin{matrix} J_{1} (u_{α}^{fe}) \\ J_{2} (u_{α}^{fe}) \end{matrix}) - (\begin{matrix} J_{1} (u_{α}^{ℓ}) \\ J_{2} (u_{α}^{ℓ}) \end{matrix}) ‖_{2}, \\ t \mapsto er r_{abs}^{y} = {‖ y_{α}^{fe} (t) - y_{α}^{ℓ} (t) ‖}_{M}, t \mapsto er r_{abs}^{u} = {‖ u_{α}^{fe} (t) - u_{α}^{ℓ} (t) ‖}_{M}, \end{array}

and also the relative errors

\begin{array}{l} t \mapsto er r_{rel}^{J} = \frac{er r_{abs}^{J}}{‖ (\begin{matrix} J_{1} (u_{α}^{fe}) \\ J_{2} (u_{α}^{fe}) \end{matrix}) ‖_{2}}, t \mapsto er r_{rel}^{y} = \frac{er r_{abs}^{y}}{‖ y_{α}^{fe} (t) ‖_{M}}, \\ t \mapsto er r_{rel}^{u} = \frac{er r_{abs}^{u}}{‖ u_{α}^{fe} (t) | |_{M}} . \end{array}

FIGURE 2

Figure 2. Pareto front for the full-order model, for the empirical gramian-based model and the classical POD model with ℓ = 30 (left plot) and ℓ = 50 (right plot) basis functions.

These errors are reported in Figure 3 for the gramian and POD approaches. As one can see, the empirical gramians perform better than the POD method. Since the desired states $𝔷_{d}^{1}$ and $𝔷_{d}^{2}$ are not part of the training set, it is way harder for the POD-based reduced order model to reconstruct the optimal solution. Also switching the desired states $𝔷_{d}^{1}$ and $𝔷_{d}^{2}$ afterwards one get similar approximation results. This shows the advantage of flexibility of the gramian-based approach. With significant reduction of time (due to cheaper SVDs), one can prepare a basis which is more suitable for varying targets and thus different optimal control problem. This is a large advantage if one considers that often the POD basis must be tailored for each problem at a time. In Table 2, one can see the significant speed-up of the offline phase with the empirical gramians method with respect to the POD one. We picked here the cases ℓ = 30 and ℓ = 50. As said, for the gramian approach we can add all snapshots up in two matrices, whose sizes coincide with the dimension of the problem. For the POD basis, instead, we need a way larger matrix, which depends on the number of snapshots. If one generates many snapshots, as in this case, the SVD of such a matrix gets numerical costly. The reason is that for multiple snapshots the matrix that has to be decomposed by SVD gets larger; cf. [11] (Remark 2.12).

FIGURE 3

Figure 3. Errors for the reduced order solutions with empirical gramians (solid line) compared to POD approximimation (dashed line). The parameter ℓ represents the number of basis functions.

TABLE 2

Table 2. CPU times to solve the multiobjective optimization problem $\hat{P}$ and corresponding speed-ups for the respective MOR techniques for ℓ basis functions.

In the brackets there are the respective values including the training time. Notice that the offline time for the gramian-based MOR is significantly lower than for the POD method which leeds to better speed-up factors. The reason for this is that the required SVD for the POD basis computation has to be computed for a much larger data matrix including all snapshots.

4.3 Model predictive control

In the second example we show the potential of the empirical gramians in a model predictive control (MPC) framework. MPC is an optimization based method for the feedback control of dynamical systems where the time horizon usually tends to infinity; see, e.g., [14] for a general introduction and [41] for parabolic PDEs. For large terminal time T we are considering problems of the form

\begin{array}{l} \begin{array}{l} min_{(y, u)} J (y, u) = \frac{1}{2} \int_{0}^{T} {‖ C (t) y (t) - z_{d} (t) ‖}_{Q}^{2} d t + \frac{1}{2} \int_{0}^{T} {‖ u (t) ‖}_{R}^{2} d t \\ = \frac{1}{2} \int_{0}^{T} {‖ z (t) - z_{d} (t) ‖}_{Q}^{2} d t + \frac{1}{2} \int_{0}^{T} {‖ u (t) ‖}_{R}^{2} d t, \\ s.t. (y, u) \in H^{1} (0, T; ℝ^{n_{y}}) \times L^{2} (0, T; ℝ^{n_{u}}) solves \\ ẏ (t) = A (t) y (t) + B (t) u (t) + f (t) for t \in (0, T], y (0) = y_{\circ} . \end{array} \\ (MPC) \end{array}

MPC is based on iterative, finite-horizon optimization of dynamical control problems. Let first fix a time discretization with initial time step t₀ and t_k = t₀ + kΔt for Δt > 0 and k ∈ ℕ. Fixed a time shift T_pred = NΔt for N > 0, N ∈ ℕ, we compute an optimal control u_k such that u_k(t_i) is minimizing the cost function for t_i in the discrete time interval [t_k, t_k + T_pred], i.e., i = k, …, k + N. Now, only the optimal control over the discrete time interval [t_k, t_k + T_f] (T_f = MΔt with 0 < M ≤ N) is stored and the previous procedure is repeated starting from the new resulting current state at time t_k + T_f. This yields to a new optimal control u_k+1 which will contribute to another piece of the final reconstructed suboptimal control for the whole infinite horizon. In conclusion we are solving iteratively optimal control problems, where the finite prediction horizon keeps being shifted forward. The parameter T_pred is called MPC prediction horizon and T_f is the MPC feedback time step. The reason why we are building a feedback control relies on the fact that each new computed optimal control for a given time window depends on the current state of the system. We present our MPC method in Algorithm 2.

ALGORITHM 2

Algorithm 2. Model predictive control (MPC).

4.3.1 Problem formulation

In our numerical example we study the following problem:

\begin{array}{l} \begin{array}{l} min J (u) = \frac{1}{2} {‖ C 𝔶_{u} - 𝔷_{d} ‖}_{H}^{2} + \frac{σ}{2} {‖ u ‖}_{U}^{2} \\ s.t. 𝔲 \in U and 𝔶_{𝔲} \in Y solves the linear parabolic problem \\ \frac{\partial 𝔶_{𝔲}}{\partial t} (t, x) - Δ 𝔶_{𝔲} (t, x) + α (t) 𝔳 (x) \cdot \nabla 𝔶_{𝔲} (t, x) = χ_{ω} (x) 𝔲 (t, x) + \\ 𝔣 (t, x) for (t, x) \in Q, \\ \frac{\partial 𝔶_{𝔲}}{\partial n} (t, s) = 0 for (t, s) \in Σ, 𝔶_{𝔲} (0, x) = 𝔶_{\circ} (x) for x \in Ω, \end{array} \\ (\tilde{P}) \end{array}

where the spaces $Y$ , $U$ and $H$ , the linear embedding operator $C$ and the data σ, 𝔶_°, 𝔳 and 𝔣 have been introduced at the beginning of Sections 4, 4.2.1. Moreover, let α ∈ L^∞(0, T) be scalar-valued parameter function andχ_ω the characteristic function of a nonzero subset ω ⊂ Ω. Recall that a unique (weak) solution 𝔶_𝔲 ∈ $Y$ for any control 𝔲 ∈ $U$ follows from standard results; see, e.g., [38, 39].

Next, we proceed analog to Section 4.2 to derive the semidiscrete problem with weighted norms. The semidiscrete first-order necessary and sufficient optimality system of $\tilde{P}$ is equivalent to solving the system

\begin{array}{l} ẋ (t) = A^{α} (t) x (t) + \tilde{A} x (T - t) + F (t) for t \in (0, T), x (0) = x_{\circ} \in ℝ^{2 m}, & (22) \end{array}

for x = (y, q), q(t) = p(T − t), with y and p solution of state and adjoint equation of $\tilde{P}$ respectively, and

\begin{array}{l} A^{α} (t) = [\begin{matrix} M^{- 1} A^{α} (t) & 0 \\ 0 & M^{- 1} Ã^{α} {(t)}^{⊤} \end{matrix}], {\tilde{A}}^{α} (t) = [\begin{matrix} 0 & \frac{1}{σ} M^{- 1} B M^{- 1} B^{⊤} \\ - I_{m} & 0 \end{matrix}], \\ B = (({〈 χ_{ω} φ_{j}, φ_{i} 〉}_{H})), F (t) = [\begin{matrix} f (t) \\ {\tilde{z}}_{d} (t) \end{matrix}], x_{\circ} = [\begin{matrix} y_{\circ} \\ 0 \end{matrix}], \\ A^{α} (t) = S + α (t) D, Ã^{α} (t) = S + α (T - t) D . \end{array}

Moreover, we have u(t) = M⁻¹B^⊤p(t)/σ.

Now, the possible approach for applying model-order reduction to $\tilde{P}$ would be to apply the POD method; cf., e.g., [11]. One would run simulations of state and adjoint equations varying the control u in order to construct a POD basis. The more the inputs u for the training are closer to the unknown optimal control, the better the POD approximation will be. Clearly this shows the limitation of this approach, since it is not really possible to guess a-priorly what the optimal control will be. Anyway, since we are in an MPC framework, one could think of paying the price of solving occasionally FE models to generate (and update) the POD basis and perform the rest of the runs with the resulting reduced-order model. This strategy has also the advantage that can be controlled by the use of an a-posteriori error estimator to decide when trigger the update. For this MPC-POD method we refer to [15, 42]. With respect to this method, the gramian-based approach offers additional advantages:

1) The possibility of training the controllability gramian (and thus the reduced-order model) choosing as input the advection coefficient α(t). In such a way, the resulting basis functions will be less sensitive to perturbations of this advection coefficient.

2) Similarly, we can train the observability gramian by choosing as output z(t) = M⁻¹B^⊤p(t)/σ, which nothing else that the first-order optimality condition of the perturbed optimal control problem that has (22) as optimality system. In such a way, the reduced-order model will be capable to reconstruct the optimal control with high accuracy.

As also pointed out in [15], in fact, the MPC-POD method suffers in presence of advection already for a small Peclet number and any strong deviation from the original POD snapshots can also lead to inaccuracy. Although the a-posteriori error estimator mitigates these problems, the continue triggering of the update of the basis has also a negative impact in the overall time performances of the algorithm. The gramian based MPC can overcome this by preparing a reduced-order model slightly sensible to perturbations of advection and optimal control. The basis functions are capable to keep track of the changes in this case, thanks to the randomized training performed at the begin. For a more detailed description of the gramian based MPC we refer to Algorithm 3, for the MPC-POD algorithm we refer to [15] (Algorithm 11).

ALGORITHM 3

Algorithm 3. Gramian-based MPC.

4.3.2 Numerical results

For the following numerical tests we choose the same domain and PDE parameter than in Section 4.1.1 apart from the end time point, which is here set to T = 250. The additional advection coefficient function α ∈ L^∞(0, T; ℝ) is chosen randomly in a specified range, which will be varied to defined different test cases. We report the other chosen parameters in Table 3.

TABLE 3

Table 3. Optimization and MPC data for the numerical implementation of $\tilde{P}$ .

The numerical results are reported in Tables 4, 5. The quantity $er r_{abs}^{y}$ , $er r_{abs}^{u}$ , $er r_{rel}^{y}$ and $er r_{rel}^{u}$ are the absolute errors and the relative errors in approximating full-order optimal state ȳ and control ū, respectively. More precisely, being ȳ^ℓ and ū^ℓ a reduced-order model solution, we define

\begin{array}{l} e r r_{a b s}^{y} = {(\int_{0}^{T} ‖ \bar{y} (t) - {\bar{y}}^{ℓ} (t) ‖_{M}^{2} d t)}^{1 / 2}, e r r_{r e l}^{y} = \frac{e r r_{a b s}^{y}}{{(\int_{0}^{T} ‖ \bar{y} (t) ‖_{M}^{2} d t)}^{1 / 2}}, \\ e r r_{a b s}^{u} = {(\int_{0}^{T} ‖ \bar{u} (t) - {\bar{u}}^{ℓ} (t) ‖_{M}^{2} d t)}^{1 / 2}, e r r_{r e l}^{u} = \frac{e r r_{a b s}^{u}}{{(\int_{0}^{T} ‖ \bar{u} (t) ‖_{M}^{2} d t)}^{1 / 2}}, \end{array}

TABLE 4

Table 4. Error in approximating the optimal state ȳ and control ū solution of the discretized full order model $\tilde{P}$ with randomly chosen advection coefficient function α in the reported range and ℓ ∈ {10, 30, 50} basis functions.

TABLE 5

Table 5. Average CPU times to solve the MPC framework for $\tilde{P}$ and corresponding speed-ups for the respective MOR techniques after 1,000 simulations with randomly chosen advection coefficients function α.

where the time integrals are realized numerical by applying a standard trapezoidal approximation. As said, since we train the gramians with respect to the advection coefficient α, we obtain basis functions which are robust with respect to changes of the advection term. This can be seen in Table 4, where the relative error increases significantly only for the largest range of [0, 10].

The MPC-POD method, instead, starts loosing accuracy already at the stage [0, 5]. Furthermore, the relative error for the control resulting from Algorithm 3 is from two to four order of magnitude smaller with respect to the MPC-POD one. In particular, in the last case of α(t) ∈ [0, 10] the MPC-POD method is not practically usable. Therefore, we can say that Algorithm 3 is able to recover the optimal solution with a better approximation than the MPC-POD method. Let us mention, that for a fair comparison, we used the same snapshots for the two approaches. The problem for POD is that the resulting basis has no capability to track the information carried by perturbing the advection, while the gramian can. Anyway, the deterioration of accuracy for increasing advection emerges also in the context of the gramian basis, although does not affect its usability in our numerical tests. From Table 5, one can also see another disadvantage of the POD method.

Since we are using the same number of snapshots for the two techniques, the POD matrix has the dimension of N_x × L, where N_x is the of FE nodes and L the number of snapshots. Clearly, if L ≫ N_x, the time to compute a POD basis, i.e., performing a SVD on the matrix N_x × L is significantly larger than the two SVDs required for the N_x × N_x gramians. This reflects the difference on the overall time speed-up between MPC-POD (1.7) and Algorithm 3 (4.8). We can then conclude that our proposed gramian based approach has the double advantage of robustness with respect to variation of the advection field and overall time speed-up in comparison to the MPC-POD method.

4.3.3 A-Posteriori error analysis

To validate our numerical approach we control the MOR error by an a-posteriori error estimate. The idea is that the error of the difference between the (unknown) optimal control and its suboptimal approximation computed by the MOR-based discretization can be estimated without knowing the optimal solution (𝔶_α, 𝔲_α) to $\tilde{P}$ . The techniques are based on well-known Galerkin-type estimates. In the Appendix we have summarized the results. For more details we refer to [8, 10, 11], for instance.

From Figure 4 it can be seen that in each step of the MPC algorithm the error in the computed suboptimal controls are smaller than 0.1. The impact of the truncation value ℓ on the error is thoroughly examined in [43].

FIGURE 4

Figure 4. Reduced MPC with one reduced gramain (left) and POD (right) basis with reduced basis rank ℓ = 50 and range(α) = [0.1]. One every column are the absolute L²-spatial-difference in state and control over the time, i.e., $t \mapsto ‖ ȳ (t) - ȳ^{ℓ} (t) | |_{M}$ and $t \mapsto ‖ ū (t) - ū^{ℓ} (t) | |_{M}$ (top) and the particular error estimates for every MPC step.

5 Conclusions

In the present paper empirical gramian have been used to derive reduced order models tailored for the first-order optimality system which consists of the coupled state and dual equation. Here, data of the optimization problem serve as inputs for the computation of the gramian matrices. The validation by a-posteriori error analysis shows that we get reliable reduced order models which turn out to perform better than reduced order models based on a standard POD, which utilizes exactly the same snapshots for the ROM. The reason of such performances is the possibility of including controllability and observability knowledge through the empirical gramian approach. This leads to the fact that the gramian-based ROM is more appropriate for the optimization purposes. Finally, as future perspective, the proposed approach can be applied to nonlinear dynamical systems as well. Here, the accuracy could be checked by hierarchical a-posteriori error analysis; see, e.g., [44], which extends the work in [45]. To get an efficient numerical reduced order realization for the nonlinear dynamical system, one would need to apply the (discrete) empirical interpolation method (cf. [22–24]) or missing point evaluation (cf. [25]).

Data availability statement

The original contributions presented in the study are included in the article/Supplementary material, further inquiries can be directed to the corresponding author.

Author contributions

All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.

Funding

The authors acknowledge funding by the Independent Research Grant Efficient model order reduction for model predictive control of non-linear input-output dynamical systems of the Zukunftskolleg at the University of Konstanz and by the Deutsche Forschungsgemeinschaft (DFG) for the project Localized Reduced Basis Methods for PDE-Constrained Parameter Optimization under contract VO 1658/6-1.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fams.2023.1144142/full#supplementary-material

References

1. Becker R, Kapp H, Rannacher R. Adaptive finite element methods for optimal control of partial differential equations: basic concept. SIAM J Control Optimiz. (2000) 39:113–32. doi: 10.1137/S0363012999351097

CrossRef Full Text | Google Scholar

2. Liu W, Yan N. A posteriori error estimates for distributed convex optimal control problems. Adv Comp Math. (2001) 15:285–309. doi: 10.1023/A:1014239012739

CrossRef Full Text | Google Scholar

3. Holmes P, Lumley J, Berkooz G, Rowley C. Turbulence, Coherent Structures, Dynamical Systems and Symmetry. 2nd ed. Cambridge: Cambridge University Press (2013).

Google Scholar

4. Hesthaven J, Rozza G, Stamm B. Certified Reduced Basis Methods for Parametrized Partial Differential Equations. SpringerBriefs in Mathematics. Cham: Springer (2016).

Google Scholar

5. Quarteroni A, Manzoni A, Negri F. Reduced Basis Methods for Partial Differential Equations: An Introduction. UNITEXT-La Matematica per il 3+2. Cham: Springer (2016).

Google Scholar

6. Benner P, Ohlberger M, Cohen A, Willcox K. Model Reduction and Approximation. Philadelphia, PA: Society for Industrial and Applied Mathematics (2017).

Google Scholar

7. Benner P, Mehrmann V, Sorensen D. Dimension reduction of large-scale systems: Proceedings of a Workshop held in Oberwolfach, Germany, October 19–25, 2003. In: Lecture Notes in Computational Science and Engineering. Heidelberg: Springer (2006).

Google Scholar

8. Hinze M, Volkwein S. Error estimates for abstract linear-quadratic optimal control problems using proper orthogonal decomposition. Comput Optim Appl. (2008) 39:319–45. doi: 10.1007/s10589-007-9058-4

CrossRef Full Text | Google Scholar

9. Antoulas AC. Approximation of Large-Scale Dynamical Systems. Advances in Design and Control. Philadelphia, PA: SIAM (2009).

Google Scholar

10. Tröltzsch F, Volkwein S. Pod a-posteriori error estimates for linear-quadratic optimal control problems. Comput Optim Appl. (2009) 44:83–115. doi: 10.1007/s10589-008-9224-3

CrossRef Full Text | Google Scholar

11. Gubisch M, Volkwein S. Proper Orthogonal Decomposition for Linear-Quadratic Optimal Control, Chapter 1. Philadelphia, PA: SIAM (2017). p. 3–63.

Google Scholar

12. Ehrgott M. Multicriteria Optimization. 2nd ed. Berlin: Springer (2005).

Google Scholar

13. Banholzer S. ROM-Based Multiobjective Optimization with PDE Constraints (Ph.D. thesis). University of Konstanz (2021). Available online at: http://nbnresolving.de/urn:nbn:de:bsz:352-2-1g98y1ic7inp29 (accessed November 28, 2023).

Google Scholar

14. Grüune L, Pannek J. Nonlinear Model Predictive Control: Theory and Algorithms. 2nd ed. London: Springer (2016).

Google Scholar

15. Mechelli L. POD-based State-Constrained Economic Model Predictive Control of Convection-Diffusion Phenomena (Ph.D. thesis), University of Konstanz (2019). Available online at: http://nbn-resolving.de/urn:nbn:de:bsz:352-2-2zoi8n9sxknm1 (accessed November 28, 2023).

Google Scholar

16. Bellman R. Introduction to the Mathematical Theory of Control Processes. Vol. II: Nonlinear Processes. New York, NY: Academic Press (1971).

Google Scholar

17. Isidori A. Nonlinear Control Systems: An Introduction. 2nd ed. Berlin; Heidelberg: Springer-Verlag (1989).

Google Scholar

18. Zhou K, Doyle J, Glover K. Robust and Optimal Control. Upper Saddle River, NJ: Prentice-Hall (1996).

Google Scholar

19. Hernández-Lerma O, Laura-Guarachi L, Mendoza-Palacios S, González-Sánchez D. An Introduction to Optimal Control Theory. The Dynamic Programming Approach. Texts in Applied Mathematics. Cham: Springer (2023).

Google Scholar

20. Lall S, Marsden JE, Glavaški S. Empirical model reduction of controlled nonlinear systems. In: IFAC Proceedings Volumes, 32:2598–2603. 14th IFAC World Congress 1999, Beijing, Chia, 5–9 July. Oxford: Pergamon (1999).

Google Scholar

21. Garcia J, Basilio J. Computation of reduced-order models of multivariable systems by balanced truncation. Int J Syst Sci. (2002) 33:847–54. doi: 10.1080/0020772021000017308

CrossRef Full Text | Google Scholar

22. Barrault M, Maday Y, Nguyen N, Patera A. An empirical interpolation method: application to efficient reduced-basis discretization of partial differential equations. Comp Rendus Math. (2004) 339:667–72. doi: 10.1016/j.crma.2004.08.006

CrossRef Full Text | Google Scholar

23. Chaturantabut S, Sorensen D. Nonlinear model reduction via discrete empirical interpolation. SIAM J Sci Comp. (2010) 32:2737–64. doi: 10.1137/090766498

CrossRef Full Text | Google Scholar

24. Chaturantabut S, Sorensen D. A state space estimate for POD-DEIM nonlinear model reduction. SIAM J Numer Anal. (2012) 50:46–63. doi: 10.1137/110822724

CrossRef Full Text | Google Scholar

25. Astrid P, Weiland S, Willcox K, Backx T. Missing point estimation in models described by proper orthogonal decomposition. IEEE Trans Automat Contr. (2008) 53:2237–51. doi: 10.1109/TAC.2008.2006102

CrossRef Full Text | Google Scholar

26. Himpe C. emgr-The empirical gramian framework. Algorithms. (2018) 11:91. doi: 10.3390/a11070091

CrossRef Full Text | Google Scholar

27. Kalman R. Mathematical description of linear dynamical systems. SIAM J. Control Optim. (1963) 1:182–92. doi: 10.1137/0301010

CrossRef Full Text | Google Scholar

28. Kailath T. Linear Systems. Englewood Cliffs, NJ: Prentice-Hall (1980).

Google Scholar

29. Baur U, Benner P. Cross-gramian based model reduction for datasparse systems. Electron Transact Numer Anal. (2008) 31:256–70. doi: 10.1137/070711578

CrossRef Full Text | Google Scholar

30. Moore B. Principal component analysis in linear systems: controllability, observability, and model reduction. IEEE Trans Automat Contr. (1981) 26:17–32. doi: 10.1109/TAC.1981.1102568

CrossRef Full Text | Google Scholar

31. Aldhaheri R. Model order reduction via real schur-formdecomposition. Int J Control. (1991) 53:709–16. doi: 10.1080/00207179108953642

CrossRef Full Text | Google Scholar

32. Gallivan K, Vandendorpe A, Van Dooren P. Sylvester equations and projection-based model reduction. J Comput Appl Math. (2004) 162:213–29. doi: 10.1016/j.cam.2003.08.026

CrossRef Full Text | Google Scholar

33. Sorensen D, Antoulas A. The sylvester equation and approximate balanced reduction. Linear Algebra Appl. (2002) 351/2:671–700. doi: 10.1016/S0024-3795(02)00283-5

CrossRef Full Text | Google Scholar

34. Betts JT. Practical methods for optimal control and estimation using nonlinear programming. In: Advances in Design and Control. 2nd, ed. Philadelphia, PA: SIAM (2010).

Google Scholar

35. Villanueva M, Jones C, Houska B. Towards global optimal control via koopman lifts. Automatica. (2021) 132:109610. doi: 10.1016/j.automatica.2021.109610

CrossRef Full Text | Google Scholar

36. Evans L. Partial Differential Equations. Graduate Studies in Mathematics. American Mathematical Society. Providence, RI: American Mathematical Society (2010).

Google Scholar

37. Kunisch K, Volkwein S. Galerkin proper orthogonal decomposition methods for parabolic problems. Numer Math. (2001) 90:117–48. doi: 10.1007/s002110100282

CrossRef Full Text | Google Scholar

38. Dautray R, Lions J-L. Mathematical Analysis and Numerical Methods for Science and Technology. Vol. 5: Evolution Problems I. Berlin: Springer (2000).

Google Scholar

39. Pazy A. Semigoups of Linear Operators and Applications to Partial Differential Equation. Berlin: Springer (1983).

Google Scholar

40. Zadeh L. Optimality and non-scalar-valued performance criteria. IEEE Trans Automat Contr. (1963) 8:59–60. doi: 10.1109/TAC.1963.1105511

CrossRef Full Text | Google Scholar

41. Azmi B, Kunisch K. A hybrid finite-dimensional rhc for stabilization of time-varying parabolic equations. SIAM J Control Optimiz. (2019) 57:3496–526. doi: 10.1137/19M1239787

CrossRef Full Text | Google Scholar

42. Mechelli L, Volkwein S. POD-based economic model predictive control for heat-convection phenomena. In: Radu FA, Kumar K, Berre I, Nordbotten JM, Pop IS, , editors. Numerical Mathematics and Advanced Applications ENUMATH 2017. Cham: Springer International Publishing (2019). p. 663–671.

Google Scholar

43. Rohleff J. Data-Driven Model-Order Reduction for Model Predictive Control (Master thesis). University of Konstanz (2023).

Google Scholar

44. Azmi B, Petrocchi A, Volkwein S. Adaptive parameter optimization for an elliptic-parabolic system using the reduced-basis method with hierarchical a-posteriori error analysis. arXiv. (2023)

Google Scholar

45. Hain S, Ohlberger M, Radic M, Urban K. A hierarchical a posteriori error estimator for the reduced basis method. Adv Comp Math. (2019) 45:2191–214. doi: 10.1007/s10444-019-09675-z

CrossRef Full Text | Google Scholar

Keywords: model order reduction, empirical gramians, proper orthogonal decomposition, parabolic partial differential equations, multiobjective optimization, model predictive control, a-posteriori error analysis

Citation: Mechelli L, Rohleff J and Volkwein S (2023) Model order reduction for optimality systems through empirical gramians. Front. Appl. Math. Stat. 9:1144142. doi: 10.3389/fams.2023.1144142

Received: 13 January 2023; Accepted: 14 November 2023;
Published: 11 December 2023.

Edited by:

Jan Heiland, Max Planck Society, Germany

Reviewed by:

Abdulsattar Abdullah Hamad, University of Samarra, Iraq
Zhihua Xiao, Yangtze University, China

Copyright © 2023 Mechelli, Rohleff and Volkwein. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Stefan Volkwein, U3RlZmFuLlZvbGt3ZWluQHVuaS1rb25zdGFuei5kZQ==

^†These authors have contributed equally to this work and share first authorship

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.