- 1Department of Mathematics, University of Bologna, Bologna, Italy
- 2Department of Electrical, Electronic, and Information Engineering (DEI) and WiLab-National Laboratory for Wireless Communications, National Inter-University Consortium for Telecommunications (CNIT), University of Bologna, Bologna, Italy
- 3Department of Mathematics, Royal Institute of Technology (KTH), Stockholm, Sweden
In this article, we propose a topological model to encode partial equivariance in neural networks. To this end, we introduce a class of operators, called P-GENEOs, that change data expressed by measurements, respecting the action of certain sets of transformations, in a non-expansive way. If the set of transformations acting is a group, we obtain the so-called GENEOs. We then study the spaces of measurements, whose domains are subjected to the action of certain self-maps and the space of P-GENEOs between these spaces. We define pseudo-metrics on them and show some properties of the resulting spaces. In particular, we show how such spaces have convenient approximation and convexity properties.
1 Introduction
Over the past decade, several geometric techniques have been incorporated into Deep Learning (DL), giving rise to the new field of Geometric Deep Learning (GDL) (Cohen and Welling, 2016; Masci et al., 2016; Bronstein et al., 2017). This geometric approach to deep learning is exploited with a dual purpose. On one hand, geometry provides a common mathematical framework to study neural network architectures. On the other hand, a geometric bias, based on prior knowledge of the data set, can be incorporated into DL models. In this second case, GDL models take advantage of the symmetries imposed by an observer who encodes and elaborates the data. The general blueprint of many deep learning architectures is modeled by group equivariance to encode such properties. If we consider measurements on a data set and a group encoding their symmetries, i.e., transformations taking admissible measurements to admissible measurements (for example, rotations or translations of an image), group equivariance is the property guaranteeing that such symmetries are preserved after applying an operator (e.g., a layer in a neural network) to the observed data. In particular, let us assume that the input measurements Φ, the output measurements Ψ and, respectively, their symmetry groups G and H are given. Then the agent F: Φ → Ψ is T-equivariant if F(φg) = F(φ)T(g), for any φ in Φ and any g in G, where T is a group homomorphism from G to H. In the theory of Group Equivariant Non-Expansive Operators (GENEOs) (Camporesi et al., 2018; Bergomi et al., 2019; Cascarano et al., 2021; Bocchi et al., 2022, 2023; Conti et al., 2022; Frosini et al., 2023; Micheletti, 2023), as in many other GDL models, the collection of all symmetries is represented by a group, but in some applications, the group axioms do not necessarily hold, since real-world data rarely follow strict mathematical symmetries due to noise, incompleteness, or symmetry-breaking features. As an example, we can consider a data set that contains images of digits and the group of rotations as the group acting on it. Rotating an image of the digit “6” by a straight angle returns an image that the user would most likely interpret as “9”. At the same time, we may want to be able to rotate the digit “6” by small angles while preserving its meaning (see Figure 1).
Figure 1. Example of a symmetry-breaking feature. Applying a rotation g of π/4, the digit “6” preserves its meaning (left). The rotation g⁴ of π is, instead, not admissible, since it transforms the digit “6” into the digit “9” (right).
It is then desirable to extend the theory of GENEOs by relaxing the hypotheses on the sets of transformations. The main aim of this article is to give a generalization of the results obtained for GENEOs to a new mathematical framework, where the property of equivariance is maintained only for some transformations of the measurements, encoding a partial equivariance with respect to the action of the group of all transformations. To this end, we introduce the concept of Partial Group Equivariant Non-Expansive Operator (P-GENEO).
In this new model, there are some substantial differences with respect to the theory of GENEOs:
1. The user chooses two sets of measurements in input: the one containing the original measurements and another set that encloses the admissible variations of such measurements, defined in the same domain. For example, in the case where the function that represents the digit “6” is being observed, we define an initial space that contains this function and another space that contains certain small rotations of “6” but excludes all the others.
2. Instead of considering a group of transformations, we consider a set containing only those that do not change the meaning of our data, i.e., only those associating with each original measurement another one inside the set of its admissible variations. Therefore, by choosing the initial spaces, the user defines also which transformations of the data set, given by right composition, are admissible and which ones are not.
3. We define partial GENEOs, or P-GENEOs, as a generalization of GENEOs. P-GENEOs are operators that respect the two sets of measurements in input and the set of transformations relating them. The term “partial” refers to the fact that the set of transformations does not necessarily need to be a group.
With these assumptions in mind, we will extend the results proven in the study by Bergomi et al. (2019) and Quercioli (2021a) for GENEOs. We will define suitable pseudo-metrics on the spaces of measurements, the set of transformations, and the set of non-expansive operators. Building on their induced topological structures, we prove compactness and convexity of the space of P-GENEOs under the assumption that the function spaces are compact and convex. These are useful properties from a computational point of view. For example, compactness guarantees that the space can be approximated by a finite set. Moreover, convexity allows us to take the convex combination of P-GENEOs in order to generate new ones.
2 Related work
The main motivation for our study is that observed data rarely follow strict mathematical symmetries. This may be due, for example, to the presence of noise in data measurements. The idea of relaxing the hypothesis of equivariance in GDL and data analysis is not novel, as shown by the recent increase in the number of publications in this area (see, for example, Weiler and Cesa, 2019; Finzi et al., 2021; Romero and Lohit, 2022; van der Ouderaa et al., 2022; Wang et al., 2022; Chachólski et al., 2023).
We identify two main ways to transform data via operators that are not strictly equivariant due to the lack of strict symmetries of the measurements. On one hand, one could define approximately equivariant operators. These are operators for which equivariance holds up to a small perturbation. In this case, given two groups G and H acting on the spaces of measurements Φ and Ψ, respectively, and a homomorphism between them, T: G → H, we say that F: Φ → Ψ is ε-equivariant if, for any g ∈ G and φ ∈ Φ, ||F(φg) − F(φ)T(g)||∞ ≤ ε. Alternatively, when defining operators transforming the measurements of certain data sets, equivariance may be substituted by partial equivariance. In this case, equivariance is guaranteed for a subset of the groups acting on the space of measurements, with no guarantees for this subset to be a subgroup. Among the previously cited articles about relaxing the property of equivariance in DL, the approach by Finzi et al. (2021) is closer to an approximate equivariance model. Here, the authors use a Bayesian approach to introduce an inductive bias in their network that is sensitive to approximate symmetry. The authors of Romero and Lohit (2022) utilize a partial equivariance approach, where a probability distribution is defined and associated with each group convolutional layer of the architecture, and the parameters defining it are either learnt, to achieve equivariance, or partially learnt, to achieve partial equivariance. The importance of choosing equivariance with respect to different acting groups on each layer of the CNN was actually first observed in the study by Weiler and Cesa (2019) for the group of Euclidean isometries in ℝ².
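To make the two notions concrete, the following minimal sketch (our own toy illustration in Python/NumPy, not taken from the cited works) measures the equivariance defect ‖F(φg) − F(φ)T(g)‖∞ for two operators on 1D cyclic signals, with G = H the group of cyclic shifts and T the identity homomorphism: a circular moving average is exactly equivariant, while a variant that zeroes the signal near the borders has a strictly positive defect and is therefore only approximately (ε-)equivariant.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
phi = rng.random(n)                      # a sampled measurement phi: X -> R, with X = Z/nZ

def shift(f, k):                         # right action of a cyclic shift g: (phi g)(x) = phi(x + k)
    return np.roll(f, -k)

def moving_average(f, w=5):              # F: circular moving average, exactly shift-equivariant
    kernel = np.ones(w) / w
    return np.convolve(np.concatenate([f, f[:w - 1]]), kernel, mode="valid")

def masked_average(f, w=5):              # F_eps: same operator, but the borders are zeroed out
    out = moving_average(f, w)
    out[:w] = 0.0
    out[-w:] = 0.0
    return out

k = 7                                    # the shift g; T is the identity homomorphism here
for name, F in [("moving_average", moving_average), ("masked_average", masked_average)]:
    # equivariance defect ||F(phi g) - F(phi) T(g)||_inf
    defect = np.max(np.abs(F(shift(phi, k)) - shift(F(phi), k)))
    print(f"{name}: equivariance defect = {defect:.4f}")
```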
The point of view of this article is closer to the latter. Our P-GENEOs are indeed operators that preserve the action of certain sets ruling the admissibility of the transformations of the measurements of our data sets. Moreover, non-expansiveness plays a crucial role in our model. This is, in fact, the feature allowing us to obtain compactness and approximability in the space of operators, distinguishing our model from the existing literature on equivariant machine learning.
3 Mathematical setting
3.1 Data sets and operations
Consider a set X and the normed vector space (ℝ^X_b, ∥·∥∞), where ℝ^X_b is the space of all bounded real-valued functions on X and ∥·∥∞ is the usual uniform norm, i.e., ∥f∥∞ := sup_{x∈X}|f(x)| for any f ∈ ℝ^X_b. On the set X, the space of transformations is given by elements of Aut(X), i.e., the group of bijections from X to itself. Then, we can consider the right group action of Aut(X) on ℝ^X_b defined as follows (we represent composition as a juxtaposition of functions):

ℝ^X_b × Aut(X) → ℝ^X_b,   (φ, s) ↦ φs.
Remark 3.1. For every s ∈ Aut(X), the map φ ↦ φs from ℝ^X_b to itself preserves the distances. In fact, for any φ1, φ2 ∈ ℝ^X_b, by bijectivity of s, we have that

∥φ1s − φ2s∥∞ = sup_{x∈X}|φ1(s(x)) − φ2(s(x))| = sup_{y∈X}|φ1(y) − φ2(y)| = ∥φ1 − φ2∥∞.
In our model, our data sets are represented as two sets Φ and Φ′ of bounded real-valued measurements on X. In particular, X represents the space where the measurements can be made, Φ is the space of permissible measurements, and Φ′ is a space which Φ can be transformed into, without changing the interpretation of its measurements after a transformation is applied. In other words, we want to be able to apply some admissible transformations on the space X so that the resulting changes in the measurements in Φ are contained in the space Φ′. Thus, in our model, we consider operations on X in the following way:
Definition 3.2. A (Φ, Φ′)-operation is an element s of Aut(X) such that, for any measurement φ ∈ Φ, the composition φs belongs to Φ′. The set of all (Φ, Φ′)-operations is denoted by AutΦ,Φ′(X).
Remark 3.3. We can observe that the identity function idX is an element of AutΦ,Φ′(X) if and only if Φ ⊆ Φ′.
For any s ∈ AutΦ,Φ′(X), the restriction to Φ of the map φ ↦ φs takes values in Φ′, since φs ∈ Φ′ for any φ ∈ Φ. We can then consider the restriction of the action map defined above (for simplicity, we will continue to denote this restriction in the same way):

Φ × AutΦ,Φ′(X) → Φ′,

where (φ, s) ↦ φs, for every s ∈ AutΦ,Φ′(X) and every φ ∈ Φ.
Definition 3.4. Let X be a set. A perception triple is a triple (Φ, Φ′, S) with Φ, Φ′ ⊆ ℝ^X_b and S ⊆ AutΦ,Φ′(X). The set X is called the domain of the perception triple and is denoted by dom(Φ, Φ′, S).
Example 3.5. Given X = ℝ², consider two rectangles R and R′ in X. Assume Φ: = {φ: X → [0, 1]: supp(φ) ⊆ R} and Φ′: = {φ′:X → [0, 1]: supp(φ′) ⊆ R′}. We recall that, if we consider a function f: X → ℝ, the support of f is the set of points in the domain, where the function does not vanish, i.e., supp(f) = {x ∈ X | f(x) ≠ 0}. Consider S as the set of translations that bring R into R′. The triple (Φ, Φ′, S) is a perception triple. If Φ represents a set of gray level images, S determines which translations can be applied to our pictures.
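In a discretized setting, Definition 3.2 can at least be spot-checked numerically. The sketch below is a small finite stand-in for Example 3.5 (the grid size, the rectangles, and the use of cyclic translations as bijections are our own assumptions): it verifies that a suitable translation sends every sampled measurement of Φ into Φ′, while a different translation does not.

```python
import numpy as np

# Minimal finite stand-in for Example 3.5: X is a 16x16 cyclic grid,
# Phi are [0,1]-valued images supported in R, Phi' are those supported in R'.
N = 16
R  = (slice(2, 6),  slice(2, 6))     # support rectangle for Phi
Rp = (slice(8, 12), slice(8, 12))    # support rectangle for Phi'

def random_measurement(rect, rng):
    """A random element of the function space supported in `rect`."""
    phi = np.zeros((N, N))
    phi[rect] = rng.random((rect[0].stop - rect[0].start, rect[1].stop - rect[1].start))
    return phi

def compose(phi, di, dj):
    """phi∘s for the cyclic translation s(i, j) = ((i+di) % N, (j+dj) % N)."""
    return np.roll(phi, shift=(-di, -dj), axis=(0, 1))

def supported_in(phi, rect):
    mask = np.zeros((N, N), dtype=bool)
    mask[rect] = True
    return np.all(phi[~mask] == 0)

def is_operation(di, dj, samples):
    """Spot-check of Definition 3.2 on finitely many measurements (not a proof)."""
    return all(supported_in(compose(phi, di, dj), Rp) for phi in samples)

rng = np.random.default_rng(1)
samples = [random_measurement(R, rng) for _ in range(5)]
print(is_operation(-6, -6, samples))   # True: this translation sends R-supported images into R'
print(is_operation(1, 0, samples))     # False: this one is not a (Phi, Phi')-operation
```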
3.2 Pseudo-metrics on data sets
In our model, considering a generic set X, data are represented by a space Ω ⊆ ℝ^X_b of bounded real-valued functions. We endow the real line ℝ with the usual Euclidean metric and the space X with the extended pseudo-metric D_X^Ω induced by Ω:

D_X^Ω(x1, x2) := sup_{ω∈Ω}|ω(x1) − ω(x2)|
for every x1, x2 ∈ X. The choice of this pseudo-metric over X means that two points can only be distinguished if they assume different values for some measurements. For example, if Φ contains only a constant function and X contains at least two points, the distance between any two points of X is always null.
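Once Ω is replaced by a finite sample of measurements, the pseudo-metric above can be evaluated directly. The following sketch (our own illustration; the chosen measurements are arbitrary) computes D_X^Ω on a pair of points and shows that a single constant measurement cannot separate any two points.

```python
import numpy as np

def pseudo_metric(Omega, x1, x2):
    """D_X^Omega(x1, x2) = max over the sampled measurements of |omega(x1) - omega(x2)|."""
    return max(abs(omega(x1) - omega(x2)) for omega in Omega)

# A few measurements on X = R^2 (a finite sample standing in for Omega).
Omega = [lambda x: np.sin(x[0]), lambda x: x[0] * x[1], lambda x: np.exp(-x[1] ** 2)]
print(pseudo_metric(Omega, (0.0, 1.0), (0.5, 1.0)))

# With a single constant measurement, every pair of points is at distance zero.
print(pseudo_metric([lambda x: 1.0], (0.0, 1.0), (3.0, -2.0)))   # 0.0
```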
The pseudo-metric space XΩ := (X, D_X^Ω) can be considered as a topological space with the basis

BΩ := {B(x, r) : x ∈ X, r > 0},   where B(x, r) := {y ∈ X : D_X^Ω(x, y) < r},
and the induced topology is denoted by τΩ. The reason for considering a topological space X, rather than just a set, follows from the need of formalizing the assumption that data are stable under small perturbations.
Remark 3.6. In our case, there are two collections of functions Φ and Φ′ in ℝ^X_b representing our data, both of which induce a topology on X. Hence, in the model, we consider two pseudo-metric spaces XΦ and XΦ′ with the same underlying set X. If Φ ⊆ Φ′, the topologies τΦ and τΦ′ are comparable and, in particular, τΦ′ is finer than τΦ.
Now, given a set Ω ⊆ ℝ^X_b, we will prove a result about the compactness of the pseudo-metric space XΩ. Before proceeding, let us recall the following lemma (e.g., see Gaal, 1964):
Lemma 3.7. Let (P,d) be a pseudo-metric space. The following conditions are equivalent:
1. P is totally bounded;
2. Every sequence in P admits a Cauchy subsequence.
Theorem 3.8. If Ω is totally bounded, XΩ is totally bounded.
Proof: By Lemma 3.7, it will suffice to prove that every sequence in X admits a Cauchy subsequence with respect to the pseudo-metric D_X^Ω. Consider a sequence (xi)i∈ℕ in XΩ and a real number ε > 0. Since Ω is totally bounded, we can find a finite subset Ωε = {ω1, …, ωn} such that for every ω ∈ Ω there exists ωr ∈ Ωε for which ||ω−ωr||∞ < ε. We can now consider the real sequence (ω1(xi))i∈ℕ, which is bounded since ω1 ∈ ℝ^X_b. From the Bolzano-Weierstrass Theorem, it follows that we can extract a convergent subsequence (ω1(xih))h∈ℕ. Again, we can extract from (ω2(xih))h∈ℕ another convergent subsequence (ω2(xiht))t∈ℕ. Repeating the process, we are able to extract a subsequence of (xi)i∈ℕ, which for simplicity of notation we denote by (xij)j∈ℕ, such that (ωk(xij))j∈ℕ is a convergent, and hence Cauchy, sequence in ℝ for every k ∈ {1, …, n}. Since Ωε is finite, we can find an index j0 ∈ ℕ such that, for any k ∈ {1, …, n} and any ℓ, m ≥ j0,

|ωk(xiℓ) − ωk(xim)| < ε.

Furthermore, we have that, for any ω ∈ Ω, any ωk ∈ Ωε, and any ℓ, m ∈ ℕ,

|ω(xiℓ) − ω(xim)| ≤ |ω(xiℓ) − ωk(xiℓ)| + |ωk(xiℓ) − ωk(xim)| + |ωk(xim) − ω(xim)| ≤ 2∥ω − ωk∥∞ + |ωk(xiℓ) − ωk(xim)|.

We observe that the choice of j0 depends only on ε and Ωε, not on k. Then, choosing an ωk ∈ Ωε such that ||ωk−ω||∞ < ε, we get |ω(xiℓ)−ω(xim)| < 3ε for every ω ∈ Ω and every ℓ, m ≥ j0. Then,

D_X^Ω(xiℓ, xim) = sup_{ω∈Ω}|ω(xiℓ) − ω(xim)| ≤ 3ε   for every ℓ, m ≥ j0.

Then (xij)j∈ℕ is a Cauchy sequence in XΩ. By Lemma 3.7, the statement holds.
Corollary 3.9. If Ω is totally bounded and XΩ is complete, XΩ is compact.
Proof: From Theorem 3.8, we have that XΩ is totally bounded, and since by hypothesis it is also complete, it is compact.
Now, we will prove that the choice of the pseudo-metric on X makes the functions in Ω non-expansive.
Definition 3.10. Consider two pseudo-metric spaces (P, dP) and (Q, dQ). A non-expansive function from (P, dP) to (Q, dQ) is a function f: P → Q such that dQ(f(p1), f(p2)) ≤ dP(p1, p2) for any p1, p2 ∈ P.
We denote as NE(P, Q) the space of all non-expansive functions from (P, dP) to (Q, dQ).
Proposition 3.11. Ω ⊆ NE(XΩ, ℝ).
Proof: For any x1, x2 ∈ X and any ω ∈ Ω, we have that

|ω(x1) − ω(x2)| ≤ sup_{ω′∈Ω}|ω′(x1) − ω′(x2)| = D_X^Ω(x1, x2),

i.e., ω is non-expansive from XΩ to ℝ.
Then, the topology on X induced by D_X^Ω naturally makes the measurements in Ω continuous. In particular, since the previous results hold for a generic Ω ⊆ ℝ^X_b, they are also true for Φ and Φ′ in our model.
Remark 3.12. Assume that (Φ, Φ′, S) is a perception triple. A function φ′ ∈ Φ′ may not be continuous from XΦ to ℝ, and a function φ ∈ Φ may not be continuous from XΦ′ to ℝ. In other words, the topology on X induced by the pseudo-metric of one of the function spaces does not necessarily make the functions in the other continuous.
Example 3.13. Assume X = ℝ and, for every a, b ∈ ℝ, define the functions φa: X → ℝ and φ′b: X → ℝ by setting

φa(x) := 1 if x ≤ a, φa(x) := 0 otherwise,    φ′b(x) := 1 if x ≥ b, φ′b(x) := 0 otherwise.

Suppose Φ := {φa : a ≥ 0} and Φ′ := {φ′b : b ≤ 0}. Consider the symmetry with respect to the y-axis, i.e., the map s(x) = −x. Surely, s ∈ AutΦ,Φ′(X), since φas = φ′−a for every a ≥ 0. We can observe that the function φ1 ∈ Φ is not continuous from XΦ′ to ℝ; indeed D_X^Φ′(0, 2) = 0, but |φ1(0)−φ1(2)| = 1.
However, if Φ ⊆ Φ′, we have that the functions in Φ are also continuous on XΦ′, indeed:
Corollary 3.14. If Φ ⊆ Φ′, then Φ ⊆ NE(XΦ′, ℝ).
Proof: By Proposition 3.11, the statement trivially holds since Φ ⊆ Φ′ ⊆ NE(XΦ′, ℝ).
3.3 Pseudo-metrics on the space of operations
Proposition 3.15. Every element of AutΦ,Φ′(X) is non-expansive from XΦ′ to XΦ.
Proof: Considering a bijection s ∈ AutΦ,Φ′(X), we have that

D_X^Φ(s(x1), s(x2)) = sup_{φ∈Φ}|φ(s(x1)) − φ(s(x2))| = sup_{ψ∈Φs}|ψ(x1) − ψ(x2)| ≤ sup_{φ′∈Φ′}|φ′(x1) − φ′(x2)| = D_X^Φ′(x1, x2)

for every x1, x2 ∈ X, where Φs = {φs : φ ∈ Φ}. Then, s ∈ NE(XΦ′, XΦ) and the statement is proved.
Now, we are ready to put more structure on AutΦ,Φ′(X). Considering a set Ω ⊆ ℝ^X_b of bounded real-valued functions, we can endow the set Aut(X) with a pseudo-metric D_Aut^Ω inherited from Ω:

D_Aut^Ω(s1, s2) := sup_{ω∈Ω}∥ωs1 − ωs2∥∞
for any s1, s2 in Aut(X).
Remark 3.16. Analogously to what happens in Remark 3.6 for X, the sets Φ and Φ′ can endow Aut(X) with two possibly different pseudo-metrics D_Aut^Φ and D_Aut^Φ′. In particular, we can consider AutΦ,Φ′(X) as a pseudo-metric subspace of Aut(X) with the induced pseudo-metrics.
Remark 3.17. We observe that, for any s1, s2 in Aut(X),

D_Aut^Ω(s1, s2) = sup_{ω∈Ω} sup_{x∈X}|ω(s1(x)) − ω(s2(x))| = sup_{x∈X} sup_{ω∈Ω}|ω(s1(x)) − ω(s2(x))| = sup_{x∈X} D_X^Ω(s1(x), s2(x)).
In other words, the pseudo-metric D_Aut^Ω, which is based on the action of the elements of Aut(X) on the set Ω, is exactly the usual uniform pseudo-metric between self-maps of the space XΩ.
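On a finite domain, both sides of the identity in Remark 3.17 can be computed exactly. The sketch below (our own finite toy example) represents measurements as arrays and elements of Aut(X) as permutations, and checks that sup_{ω∈Ω}∥ωs1 − ωs2∥∞ coincides with sup_{x∈X} D_X^Ω(s1(x), s2(x)).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
X = np.arange(n)
Omega = [rng.random(n) for _ in range(4)]        # measurements on a finite X, stored as arrays
s1, s2 = rng.permutation(n), rng.permutation(n)  # two elements of Aut(X)

def D_X(Omega, x, y):                            # D_X^Omega(x, y)
    return max(abs(om[x] - om[y]) for om in Omega)

# D_Aut^Omega(s1, s2) = sup_omega || omega s1 - omega s2 ||_inf   (omega s = omega composed with s)
d_aut = max(np.max(np.abs(om[s1] - om[s2])) for om in Omega)

# the same quantity written as the uniform pseudo-metric between s1 and s2 as maps into X_Omega
d_unif = max(D_X(Omega, s1[x], s2[x]) for x in X)

print(np.isclose(d_aut, d_unif))                 # True
```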
3.4 The space of operations
Since we are only interested in transformations of functions in Φ, it would be natural to just endow AutΦ,Φ′(X) with the pseudo-metric D_Aut^Φ. However, it is sometimes necessary to consider the pseudo-metric D_Aut^Φ′ in order to guarantee the continuity of the composition of elements in AutΦ,Φ′(X), whenever it is admissible. Consider two elements s, t in AutΦ,Φ′(X) such that st is still an element of AutΦ,Φ′(X), i.e., for every function φ ∈ Φ, we have that φst ∈ Φ′. Then, for any φ ∈ Φ we have that

φ(st) = (φs)t, with φs ∈ Φ′.

Therefore, t is also an element of AutΦs,Φ′(X). By definition, Φs is contained in Φ′ for every s ∈ AutΦ,Φ′(X), and this justifies the choice of considering on AutΦ,Φ′(X) also the pseudo-metric D_Aut^Φ′. We have shown, in particular, that if s, t are elements of AutΦ,Φ′(X) such that st is still an element of AutΦ,Φ′(X), then t is an element of AutΦs,Φ′(X), which is an implication of the following proposition:
Proposition 3.18. Let s ∈ AutΦ,Φ′(X) and t ∈ Aut(X). Then, st ∈ AutΦ,Φ′(X) if and only if t ∈ AutΦs,Φ′(X).
Proof: If the composition st belongs to AutΦ,Φ′(X), we have already proven that t ∈ AutΦs,Φ′(X). On the other hand, if t ∈ AutΦs,Φ′(X), we have that (φs)t ∈ Φ′ for every φ ∈ Φ. Since φ(st) = (φs)t, it follows that φ(st) ∈ Φ′ for every φ ∈ Φ. Therefore, st ∈ AutΦ,Φ′(X) and the statement is proved.
Remark 3.19. Let s, t ∈ AutΦ,Φ′(X). We can observe that if s ∈ AutΦ(X), then Φs ⊆ Φ and hence, by Proposition 3.18, st ∈ AutΦ,Φ′(X).
Lemma 3.20. Consider r, s, t ∈ Aut(X). For any Ω ⊆ ℝ^X_b, it holds that

D_Aut^Ω(sr, tr) = D_Aut^Ω(s, t).

Proof: Since the right action of r preserves the distances (Remark 3.1), we have that:

D_Aut^Ω(sr, tr) = sup_{ω∈Ω}∥(ωs)r − (ωt)r∥∞ = sup_{ω∈Ω}∥ωs − ωt∥∞ = D_Aut^Ω(s, t).
Lemma 3.21. Consider r, s ∈ Aut(X) and t ∈ AutΦ,Φ′(X). It holds that

D_Aut^Φ(tr, ts) ≤ D_Aut^Φ′(r, s).

Proof: Since Φt ⊆ Φ′, we have that:

D_Aut^Φ(tr, ts) = sup_{φ∈Φ}∥(φt)r − (φt)s∥∞ ≤ sup_{φ′∈Φ′}∥φ′r − φ′s∥∞ = D_Aut^Φ′(r, s).
Let Π be the set of all pairs (s, t) with s, t ∈ AutΦ,Φ′(X) such that st ∈ AutΦ,Φ′(X). We endow Π with the pseudo-metric

D_Π((s1, t1), (s2, t2)) := D_Aut^Φ(s1, s2) + D_Aut^Φ′(t1, t2)
and the corresponding topology.
Proposition 3.22. The function from Π to AutΦ,Φ′(X) that maps (s, t) to st is non-expansive (with respect to D_Π and D_Aut^Φ), and hence continuous.
Proof: Consider two elements (s1, t1), (s2, t2) of Π. By Lemma 3.20 and Lemma 3.21,

D_Aut^Φ(s1t1, s2t2) ≤ D_Aut^Φ(s1t1, s2t1) + D_Aut^Φ(s2t1, s2t2) = D_Aut^Φ(s1, s2) + D_Aut^Φ(s2t1, s2t2) ≤ D_Aut^Φ(s1, s2) + D_Aut^Φ′(t1, t2) = D_Π((s1, t1), (s2, t2)).
Therefore, the statement is proved.
Let Υ be the set of all s ∈ AutΦ,Φ′(X) with s−1 ∈ AutΦ,Φ′(X).
Proposition 3.23. The function from Υ to Υ that maps s to s−1 is non-expansive (with respect to D_Aut^Φ′ on the domain and D_Aut^Φ on the codomain), and hence continuous.
Proof: Consider two bijections s1, s2 ∈ Υ. Because of Lemma 3.20 and Lemma 3.21, we obtain that

D_Aut^Φ(s1−1, s2−1) = D_Aut^Φ(s1−1s1, s2−1s1) = D_Aut^Φ(s2−1s2, s2−1s1) ≤ D_Aut^Φ′(s2, s1) = D_Aut^Φ′(s1, s2).
We have previously defined the map

Φ × AutΦ,Φ′(X) → Φ′,

where (φ, s) ↦ φs, for every s ∈ AutΦ,Φ′(X) and every φ ∈ Φ.
Proposition 3.24. The function Φ × AutΦ,Φ′(X) → Φ′, (φ, s) ↦ φs, is continuous by choosing the pseudo-metric D_Aut^Φ on AutΦ,Φ′(X).
Proof: We have that

∥φ1s1 − φ2s2∥∞ ≤ ∥φ1s1 − φ2s1∥∞ + ∥φ2s1 − φ2s2∥∞ = ∥φ1 − φ2∥∞ + ∥φ2s1 − φ2s2∥∞ ≤ ∥φ1 − φ2∥∞ + D_Aut^Φ(s1, s2)

for any φ1, φ2 ∈ Φ and any s1, s2 ∈ AutΦ,Φ′(X). This proves that the map is continuous.
Now, we can give a result about the compactness of AutΦ,Φ′(X) under suitable assumptions.
Proposition 3.25. If Φ and Φ′ are totally bounded, AutΦ,Φ′(X) is totally bounded with respect to the pseudo-metric D_Aut^Φ.
Proof: Consider a sequence (si)i∈ℕ in AutΦ,Φ′(X) and a real number ε > 0. Since Φ is totally bounded, we can find a finite subset Φε = {φ1, …, φn} such that for every φ ∈ Φ there exists φr ∈ Φε for which ||φ−φr||∞ < ε. Now, consider the sequence (φ1si)i∈ℕ in Φ′. Since Φ′ is also totally bounded, from Lemma 3.7 it follows that we can extract a Cauchy subsequence (φ1sih)h∈ℕ. Again, we can extract from (φ2sih)h∈ℕ another Cauchy subsequence (φ2siht)t∈ℕ. Repeating the process for every k ∈ {1, …, n}, we are able to extract a subsequence of (si)i∈ℕ, which for simplicity of notation we denote by (sij)j∈ℕ, such that (φksij)j∈ℕ is a Cauchy sequence in Φ′ for every k ∈ {1, …, n}.
Since Φε is finite, we can find an index j0 ∈ ℕ such that, for any k ∈ {1, …, n} and any ℓ, m ≥ j0,

∥φksiℓ − φksim∥∞ < ε.   (3.4.1)
Furthermore, we have that for any φ ∈ Φ, any φk ∈ Φε, and any ℓ, m ∈ ℕ,

∥φsiℓ − φsim∥∞ ≤ ∥φsiℓ − φksiℓ∥∞ + ∥φksiℓ − φksim∥∞ + ∥φksim − φsim∥∞ = 2∥φ − φk∥∞ + ∥φksiℓ − φksim∥∞.
We observe that the choice of j0 in (3.4.1) depends only on ε and Φε, not on φ. Then, choosing a φk ∈ Φε such that ||φk−φ||∞ < ε, we get ||φsiℓ−φsim||∞ < 3ε for every φ ∈ Φ and every ℓ, m ≥ j0. Hence, for every ℓ, m ≥ j0,

D_Aut^Φ(siℓ, sim) = sup_{φ∈Φ}∥φsiℓ − φsim∥∞ ≤ 3ε.
Therefore, (sij)j∈ℕ is a Cauchy sequence in AutΦ,Φ′(X) with respect to D_Aut^Φ. By Lemma 3.7, the statement holds.
Corollary 3.26. Assume that S ⊆ AutΦ,Φ′(X). If Φ and Φ′ are totally bounded and S is complete, then S is also compact.
Proof: From Proposition 3.25, AutΦ,Φ′(X) is totally bounded, and hence so is its subset S. Since by hypothesis S is also complete, the statement holds.
4 The space of P-GENEOs
In this section, we introduce the concept of Partial Group Equivariant Non-Expansive Operator (P-GENEO). P-GENEOs allow us to transform data sets, preserving symmetries and distances and maintaining the acceptability conditions of the transformations. We will also describe some topological results about the structure of the space of P-GENEOs and some techniques used for defining new P-GENEOs in order to populate the space of P-GENEOs.
Definition 4.1. Let X, Y be sets and (Φ, Φ′, S), (Ψ, Ψ′, Q) be perception triples with domains X and Y, respectively. Consider a triple of functions (F, F′, T) with the following properties:
• F: Φ → Ψ, F′:Φ′ → Ψ′, T: S → Q;
• For any s, t ∈ S such that st ∈ S it holds that T(st) = T(s)T(t);
• For any s ∈ S such that s−1 ∈ S it holds that T(s−1) = T(s)−1;
• (F, F′, T) is equivariant, i.e., F′(φs) = F(φ)T(s) for every φ ∈ Φ, s ∈ S.
The triple (F, F′, T) is called a perception map or a Partial Group Equivariant Operator (P-GEO) from (Φ, Φ′, S) to (Ψ, Ψ′, Q).
In Remark 3.3, we observed that idX ∈ AutΦ,Φ′(X) if Φ ⊆ Φ′. Then, we can consider a perception triple (Φ, Φ′, S) with Φ ⊆ Φ′ and idX ∈ S. Now, we will show how a P-GEO from this perception triple behaves.
Lemma 4.2. Consider two perception triples (Φ, Φ′, S) and (Ψ, Ψ′, Q) with domains X and Y, respectively, and with Φ ⊆ Φ′ and idX ∈ S. Let (F, F′, T) be a P-GEO from (Φ, Φ′, S) to (Ψ, Ψ′, Q). Then, Ψ ⊆ Ψ′ and T(idX) = idY.
Proof: Since (F, F′, T) is a P-GEO, by definition, we have that, for any s, t ∈ S such that st ∈ S, T(st) = T(s)T(t). Since idX ∈ S, then

T(idX) = T(idX idX) = T(idX)T(idX),

and hence, composing with T(idX)−1 ∈ Aut(Y), T(idX) = idY. Since idY = T(idX) ∈ Q ⊆ AutΨ,Ψ′(Y), by Remark 3.3, we have that Ψ ⊆ Ψ′.
Proposition 4.3. Consider two perception triples (Φ, Φ′, S) and (Ψ, Ψ′, Q) with domains X and Y, respectively, and with Φ ⊆ Φ′ and idX ∈ S. Let (F, F′, T) be a P-GEO from (Φ, Φ′, S) to (Ψ, Ψ′, Q). Then F′(φ) = F(φ) for every φ ∈ Φ.
Proof: Since (F, F′, T) is a P-GEO, it is equivariant, and by Lemma 4.2, we have that

F′(φ) = F′(φ idX) = F(φ)T(idX) = F(φ) idY = F(φ)
for every φ ∈ Φ.
Definition 4.4. Assume that (Φ, Φ′, S) and (Ψ, Ψ′, Q) are perception triples. If (F, F′, T) is a perception map from (Φ, Φ′, S) to (Ψ, Ψ′, Q) and F, F′ are non-expansive, i.e.,

∥F(φ1) − F(φ2)∥∞ ≤ ∥φ1 − φ2∥∞   and   ∥F′(φ′1) − F′(φ′2)∥∞ ≤ ∥φ′1 − φ′2∥∞

for every φ1, φ2 ∈ Φ and every φ′1, φ′2 ∈ Φ′, then (F, F′, T) is called a Partial Group Equivariant Non-Expansive Operator (P-GENEO).
In other words, a P-GENEO is a triple (F, F′, T) such that F, F′ are non-expansive and the following diagram commutes for every s ∈ S:

Φ --·s--> Φ′
|          |
F          F′
↓          ↓
Ψ --·T(s)--> Ψ′

i.e., F′(φs) = F(φ)T(s) for every φ ∈ Φ.
Remark 4.5. We can observe that a GENEO (see Bergomi et al., 2019) can be represented as a special case of P-GENEO, considering two perception triples (Φ, Φ′, S), (Ψ, Ψ′, Q) such that Φ = Φ′, Ψ = Ψ′, and the subsets containing the invariant transformations S and Q are groups (and then the map T: S → Q is a homomorphism). In this setting, a P-GENEO (F, F′, T) is a triple where the operators F, F′ are equal to each other (because of Proposition 4.3), and the map T is a homomorphism. Hence, instead of the triple, we can simply write the pair (F, T) that is a GENEO.
Considering two perception triples, we typically want to study the space of all P-GENEOs between them with the map T fixed. Therefore, when the map T is fixed and specified, we will simply consider pairs of operators (F, F′) instead of triples (F, F′, T), and we say that (F, F′) is a P-GENEO associated with or with respect to the map T. Moreover, in this case, we indicate the property of equivariance of the triple (F, F′, T) writing that the pair (F, F′) is T-equivariant.
Example 4.6. Let X = ℝ². Take a real number ℓ > 0. In X, consider the square Q1: = [0, ℓ] × [0, ℓ] and its translation Q′1 := Q1 + a by a vector a = (a1, a2) ∈ ℝ². Analogously, let us consider a real number 0 < ε < ℓ/2 and two squares inside Q1 and Q′1, namely Q2: = [ε, ℓ−ε] × [ε, ℓ−ε] and Q′2 := Q2 + a, as shown in Figure 2.
Consider the following function spaces in ℝ^X_b:

Φ := {φ: X → [0, 1] : supp(φ) ⊆ Q1},   Φ′ := {φ′: X → [0, 1] : supp(φ′) ⊆ Q′1},
Ψ := {ψ: X → [0, 1] : supp(ψ) ⊆ Q2},   Ψ′ := {ψ′: X → [0, 1] : supp(ψ′) ⊆ Q′2}.
Let S := {sa}, where sa ∈ Aut(X) is defined by sa(x) := x − a, so that precomposition with sa translates the support of an image by the vector a = (a1, a2). The triples (Φ, Φ′, S) and (Ψ, Ψ′, S) are perception triples. This example could model the translation of two nested gray-scale images. We now want to build an operator between these images in order to obtain a transformation that commutes with the selected translation. We can consider the triple of functions (F, F′, T) defined as follows: F: Φ → Ψ is the operator that maintains the output of functions in Φ at points of Q2 and sets them to zero outside it; analogously, F′:Φ′ → Ψ′ is the operator that maintains the output of functions in Φ′ at points of Q′2 and sets them to zero outside it; and T := idS. Therefore, the triple (F, F′, T) is a P-GENEO from (Φ, Φ′, S) to (Ψ, Ψ′, S). It turns out that the maps are non-expansive, and the equivariance holds:

F′(φsa) = F(φ)T(sa) = F(φ)sa
for any φ ∈ Φ. From the point of view of application, we are considering two square images and their translations, and we apply an operator that “cuts” the images, taking into account only the part of the image that interests the observer. This example justifies the definition of P-GENEO as a triple of operators (F, F′, T), without requiring F and F′ to be equal in the possibly non-empty intersection of their domains. In fact, if φ is a function contained in Φ ∩ Φ′, its image via F and F′ may be different.
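The following sketch discretizes Example 4.6 (the grid size, the concrete translation vector, and the use of a cyclic shift of a finite grid in place of a translation of the plane are our own assumptions). The operator F simply multiplies an image by the indicator function of the inner square, and both the equivariance F′(φsa) = F(φ)sa and the non-expansiveness of F are verified numerically.

```python
import numpy as np

# Discretized sketch of Example 4.6 (grid size, translation vector and the cyclic shift
# used in place of a translation of the plane are our own assumptions).
N, L, eps = 32, 20, 4
a = (6, 9)                                          # translation vector a = (a1, a2)

def indicator(rows, cols):
    m = np.zeros((N, N))
    m[rows[0]:rows[1], cols[0]:cols[1]] = 1.0
    return m

chi_Q1  = indicator((0, L), (0, L))                 # Q1 = [0, L] x [0, L]
chi_Q2  = indicator((eps, L - eps), (eps, L - eps)) # Q2, the inner square
chi_Q2p = np.roll(chi_Q2, a, axis=(0, 1))           # Q2' = Q2 + a

def compose_sa(phi):
    """phi ∘ sa, where sa(x) = x - a, i.e. the image phi translated by the vector a."""
    return np.roll(phi, a, axis=(0, 1))

F  = lambda phi: phi * chi_Q2    # keep the values on Q2, set them to zero outside
Fp = lambda phi: phi * chi_Q2p   # same operation with respect to Q2'

rng = np.random.default_rng(3)
phi = rng.random((N, N)) * chi_Q1                   # a random element of Phi
psi = rng.random((N, N)) * chi_Q1

# equivariance: F'(phi sa) = F(phi) sa
print(np.allclose(Fp(compose_sa(phi)), compose_sa(F(phi))))          # True
# non-expansiveness of F on a pair of measurements
print(np.max(np.abs(F(phi) - F(psi))) <= np.max(np.abs(phi - psi)))  # True
```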
4.1 Methods to construct P-GENEOs
Starting from a finite number of P-GENEOs, we will illustrate some methods to construct new P-GENEOs. First of all, the composition of two P-GENEOs is still a P-GENEO.
Proposition 4.7. Given two composable P-GENEOs, (F1, F′1, T1) from (Φ, Φ′, S) to (Ψ, Ψ′, Q) and (F2, F′2, T2) from (Ψ, Ψ′, Q) to (Ω, Ω′, K), their composition defined as

(F, F′, T) := (F2 ◦ F1, F′2 ◦ F′1, T2 ◦ T1)

is a P-GENEO from (Φ, Φ′, S) to (Ω, Ω′, K).
Proof: First, one could easily check that the map T = T2 ◦ T1 respects the second and the third property of Definition 4.1. Therefore, it remains to verify that F(Φ) ⊆ Ω, F′(Φ′) ⊆ Ω′ and the properties of equivariance and non-expansiveness are maintained.
1. Since F1(Φ) ⊆ Ψ and F2(Ψ) ⊆ Ω, we have that F(Φ) = (F2 ◦ F1)(Φ) = F2(F1(Φ)) ⊆ F2(Ψ) ⊆ Ω. Analogously, F′(Φ′) ⊆ Ω′.
2. Since (F1, F′1, T1) and (F2, F′2, T2) are equivariant, (F, F′, T) is equivariant. Indeed, for every φ ∈ Φ and every s ∈ S, we have that

F′(φs) = F′2(F′1(φs)) = F′2(F1(φ)T1(s)) = F2(F1(φ))T2(T1(s)) = F(φ)T(s).
3. Since F1 and F2 are non-expansive, F is non-expansive; indeed, for every φ1, φ2 ∈ Φ, we have that

∥F(φ1) − F(φ2)∥∞ = ∥F2(F1(φ1)) − F2(F1(φ2))∥∞ ≤ ∥F1(φ1) − F1(φ2)∥∞ ≤ ∥φ1 − φ2∥∞.
Analogously, F′ is non-expansive.
Given a finite number of P-GENEOs with respect to the same map T, we illustrate a general method to construct a new operator as a combination of them. Given two sets X and Y, consider a set Ω ⊆ ℝ^X_b, a finite set {H1, …, Hn} of functions from Ω to ℝ^Y_b and a map L: ℝn → ℝ, where ℝn is endowed with the norm ∥(x1, …, xn)∥∞ := max{|x1|, …, |xn|}. We define L*(H1, …, Hn): Ω → ℝ^Y_b as

L*(H1, …, Hn)(ω) := L ∘ (H1(ω), …, Hn(ω))

for any ω ∈ Ω, where the function L ∘ (H1(ω), …, Hn(ω)): Y → ℝ is defined by setting

L ∘ (H1(ω), …, Hn(ω))(y) := L(H1(ω)(y), …, Hn(ω)(y))
for any y ∈ Y. Now, consider two perception triples (Φ, Φ′, S) and (Ψ, Ψ′, Q) with domains X and Y, respectively, and a finite set {(F1, F′1), …, (Fn, F′n)} of P-GENEOs between them associated with the map T: S → Q. We can consider the functions L*(F1, …, Fn) and L*(F′1, …, F′n), defined as before, and state the following result.
Proposition 4.8. Assume that L is non-expansive. If L*(F1, …, Fn)(Φ) ⊆ Ψ and L*(F′1, …, F′n)(Φ′) ⊆ Ψ′, then (L*(F1, …, Fn), L*(F′1, …, F′n)) is a P-GENEO from (Φ, Φ′, S) to (Ψ, Ψ′, Q) with respect to T.
Proof: By hypothesis, L*(F1, …, Fn)(Φ) ⊆ Ψ and L*(F′1, …, F′n)(Φ′) ⊆ Ψ′, so we just need to verify the properties of equivariance and non-expansiveness.
1. Since (F1, F′1), …, (Fn, F′n) are T-equivariant, for any φ ∈ Φ and any s ∈ S, we have that:

L*(F′1, …, F′n)(φs) = L ∘ (F′1(φs), …, F′n(φs)) = L ∘ (F1(φ)T(s), …, Fn(φ)T(s)) = (L ∘ (F1(φ), …, Fn(φ)))T(s) = L*(F1, …, Fn)(φ)T(s).
Therefore, (L*(F1, …, Fn), L*(F′1, …, F′n)) is T-equivariant.
2. Since F1, …, Fn and L are non-expansive, for any φ1, φ2 ∈ Φ, we have that:

∥L*(F1, …, Fn)(φ1) − L*(F1, …, Fn)(φ2)∥∞ = sup_{y∈Y}|L(F1(φ1)(y), …, Fn(φ1)(y)) − L(F1(φ2)(y), …, Fn(φ2)(y))| ≤ sup_{y∈Y} max{|F1(φ1)(y) − F1(φ2)(y)|, …, |Fn(φ1)(y) − Fn(φ2)(y)|} ≤ max{∥F1(φ1) − F1(φ2)∥∞, …, ∥Fn(φ1) − Fn(φ2)∥∞} ≤ ∥φ1 − φ2∥∞.
Hence, L*(F1, …, Fn) is non-expansive. Analogously, since F′1, …, F′n and L are non-expansive, L*(F′1, …, F′n) is non-expansive.
Therefore, (L*(F1, …, Fn), L*(F′1, …, F′n)) is a P-GENEO from (Φ, Φ′, S) to (Ψ, Ψ′, Q) with respect to T.
Remark 4.9. The above result describes a general method to build new P-GENEOs, starting from a finite number of known P-GENEOs via non-expansive maps. Some examples of such non-expansive maps are the maximum function, the power mean, and the convex combination (for further details, see Frosini and Quercioli, 2017; Quercioli, 2021a,b).
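As a numerical sanity check of Proposition 4.8, the sketch below (a toy setting of our own in which Φ = Φ′ and Ψ = Ψ′ consist of all bounded cyclic signals, i.e., the GENEO case of Remark 4.5) combines two shift-equivariant non-expansive operators through the maximum function, which is non-expansive on ℝn with the max-norm, and verifies equivariance and non-expansiveness of the combined operator on random inputs.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 64

def shift(f, k):                       # the action of a cyclic shift on a signal
    return np.roll(f, -k)

def F1(f):                             # circular moving average of width 3 (non-expansive, shift-equivariant)
    return (f + np.roll(f, 1) + np.roll(f, -1)) / 3.0

def F2(f):                             # circular "erosion": pointwise min over a width-3 window
    return np.minimum(f, np.minimum(np.roll(f, 1), np.roll(f, -1)))

def F_max(f):                          # combination through L = max, as in Remark 4.9
    return np.maximum(F1(f), F2(f))

phi, psi = rng.random(n), rng.random(n)
k = 11

# equivariance of the combined operator (here T is the identity on the group of shifts)
print(np.allclose(F_max(shift(phi, k)), shift(F_max(phi), k)))                 # True
# non-expansiveness: ||F_max(phi) - F_max(psi)||_inf <= ||phi - psi||_inf
print(np.max(np.abs(F_max(phi) - F_max(psi))) <= np.max(np.abs(phi - psi)))    # True
```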
4.2 Compactness and convexity of the space of P-GENEOs
Given two perception triples, under some assumptions on the data sets, it is possible to show two useful features in applications: compactness and convexity. These two properties guarantee, on the one hand, that the space of P-GENEOs can be approximated by a finite subset of them, and, on the other hand, that a convex combination of P-GENEOs is again a P-GENEO.
First, we define a metric on the space of P-GENEOs. Let X, Y be sets and consider two sets Ω ⊆ ℝ^X_b and Δ ⊆ ℝ^Y_b; we can define the distance

D_NE(F1, F2) := sup_{ω∈Ω}∥F1(ω) − F2(ω)∥∞
for every F1, F2 ∈ NE(Ω, Δ).
The metric DP-GENEO on the space ℱT of all the P-GENEOs between the perception triples (Φ, Φ′, S) and (Ψ, Ψ′, Q) associated with the map T is defined as

DP-GENEO((F1, F′1), (F2, F′2)) := max{D_NE(F1, F2), D_NE(F′1, F′2)}

for every (F1, F′1), (F2, F′2) ∈ ℱT.
4.2.1 Compactness
Before proceeding, we need to prove that the following result holds:
Lemma 4.10. If (P, dP), (Q, dQ) are compact metric spaces, NE(P, Q) is compact.
Proof: Theorem 5 in the study by Li et al. (2012) implies that NE(P, Q) is relatively compact, since it is an equicontinuous space of maps. Hence, it will suffice to show that NE(P, Q) is closed. Considering a sequence (Fi)i∈ℕ in NE(P, Q) uniformly converging to a map F: P → Q, we have that

dQ(F(p1), F(p2)) = lim_{i→∞} dQ(Fi(p1), Fi(p2)) ≤ dP(p1, p2)
for every p1, p2 ∈ P. Therefore, F ∈ NE(P, Q). It follows that NE(P, Q) is closed.
Consider two perception triples (Φ, Φ′, S) and (Ψ, Ψ′, Q), with domains X and Y, respectively, and the space ℱT of P-GENEOs between them associated with the map T: S → Q. The following result holds:
Theorem 4.11. If Φ, Φ′, Ψ and Ψ′ are compact, ℱT is compact with respect to the metric DP-GENEO.
Proof: By definition, ℱT ⊆ NE(Φ, Ψ) × NE(Φ′, Ψ′). Since Φ, Φ′, Ψ and Ψ′ are compact, by Lemma 4.10, the spaces NE(Φ, Ψ) and NE(Φ′, Ψ′) are also compact, and then, by Tychonoff's Theorem, the product NE(Φ, Ψ) × NE(Φ′, Ψ′) is also compact with respect to the product topology. Hence, to prove our statement, it suffices to show that ℱT is closed. Let us consider a sequence ((Fi, F′i))i∈ℕ of P-GENEOs in ℱT, converging to a pair (F, F′) ∈ NE(Φ, Ψ) × NE(Φ′, Ψ′). Since (Fi, F′i) is T-equivariant for every i ∈ ℕ and the action of Q on Ψ is continuous (see Proposition 3.24), (F, F′) belongs to ℱT. Indeed, we have that

F′(φs) = lim_{i→∞} F′i(φs) = lim_{i→∞} Fi(φ)T(s) = F(φ)T(s)

for every s ∈ S and every φ ∈ Φ. Hence, ℱT is a closed subset of a compact set, and then, it is also compact.
4.2.2 Convexity
Assume that Ψ, Ψ′ are convex. Let (F1, F′1), …, (Fn, F′n) ∈ ℱT and consider an n-tuple (a1, …, an) of real numbers with ai ≥ 0 for every i ∈ {1, …, n} and a1 + ⋯ + an = 1. We can define two operators FΣ: Φ → Ψ and F′Σ: Φ′ → Ψ′ as

FΣ(φ) := a1F1(φ) + ⋯ + anFn(φ),   F′Σ(φ′) := a1F′1(φ′) + ⋯ + anF′n(φ′)

for every φ ∈ Φ, φ′ ∈ Φ′. We notice that the convexity of Ψ and Ψ′ guarantees that FΣ and F′Σ are well defined.
Proposition 4.12. (FΣ, F′Σ) belongs to ℱT.
Proof: By hypothesis, for every i ∈ {1, …, n}, (Fi, F′i, T) is a perception map, and then:

F′Σ(φs) = a1F′1(φs) + ⋯ + anF′n(φs) = a1F1(φ)T(s) + ⋯ + anFn(φ)T(s) = (a1F1(φ) + ⋯ + anFn(φ))T(s) = FΣ(φ)T(s)

for every φ ∈ Φ and every s ∈ S. Furthermore, since Fi(Φ) ⊆ Ψ for every i ∈ {1, …, n} and Ψ is convex, also FΣ(Φ) ⊆ Ψ. Analogously, the convexity of Ψ′ implies that F′Σ(Φ′) ⊆ Ψ′. Therefore (FΣ, F′Σ, T) is a P-GEO. It remains to show the non-expansiveness of FΣ and F′Σ. Since Fi is non-expansive for any i, for every φ1, φ2 ∈ Φ, we have that

∥FΣ(φ1) − FΣ(φ2)∥∞ ≤ a1∥F1(φ1) − F1(φ2)∥∞ + ⋯ + an∥Fn(φ1) − Fn(φ2)∥∞ ≤ (a1 + ⋯ + an)∥φ1 − φ2∥∞ = ∥φ1 − φ2∥∞.
Analogously, since every F′i is non-expansive, for every φ′1, φ′2 ∈ Φ′, we have that

∥F′Σ(φ′1) − F′Σ(φ′2)∥∞ ≤ ∥φ′1 − φ′2∥∞.
Therefore, we have proven that (FΣ, F′Σ, T) is a P-GEO with FΣ and F′Σ non-expansive. Hence, it is a P-GENEO.
Then, the following result holds:
Corollary 4.13. If Ψ, Ψ′ are convex, the set ℱT is convex.
Proof: It is sufficient to apply Proposition 4.12 for n = 2 by setting a1 = t, a2 = 1−t for 0 ≤ t ≤ 1.
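The convex combination of Proposition 4.12 can be checked in the same spirit. In the toy setting below (again our own, with Φ = Φ′ and Ψ = Ψ′ given by all bounded cyclic signals), the operator FΣ = a1F1 + a2F2 remains shift-equivariant and non-expansive.

```python
import numpy as np

rng = np.random.default_rng(6)
n, k, t = 64, 11, 0.3                    # signal length, a cyclic shift, a convex weight

shift = lambda f: np.roll(f, -k)
F1 = lambda f: (f + np.roll(f, 1) + np.roll(f, -1)) / 3.0   # a shift-equivariant, non-expansive operator
F2 = lambda f: np.minimum(f, np.roll(f, 1))                 # another one
F_sum = lambda f: t * F1(f) + (1 - t) * F2(f)               # the convex combination F_Sigma

phi, psi = rng.random(n), rng.random(n)

# F_Sigma is still equivariant ...
print(np.allclose(F_sum(shift(phi)), shift(F_sum(phi))))                       # True
# ... and still non-expansive
print(np.max(np.abs(F_sum(phi) - F_sum(psi))) <= np.max(np.abs(phi - psi)))    # True
```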
5 P-GENEOs in applications
The importance of equivariance with respect to a group is becoming clear and widespread in many machine learning applications used for drug design, traffic forecasting, object recognition, and detection (see, e.g., Bronstein et al., 2021; Gerken et al., 2023). In some situations, however, requiring equivariance with respect to a whole group could even become an obstacle to the correct learning process of an equivariant neural network. In the following, we describe a possible application to optical character recognition (OCR), in which partial equivariance might be better suited than equivariance. Consider a planar transformation that deforms characters. One may notice that if such a transformation is performed too many times, the letter may lose or change its meaning, as shown in Figure 3. Another example is given by a reparameterization of the domain of a sound message. While a limited contraction or dilation of the domain can preserve the meaning attributed to the sound, an iterated application of the same transformation can radically change the perceived message.
Furthermore, experiments performed in the study by Weiler and Cesa (2019) have shown that tuning the level of equivariance in each layer of a neural network may increase the performance of the model. This tuning is, however, performed manually. The next step, taken in the study by Romero and Lohit (2022), is to learn the level of equivariance of each layer directly from data, possibly restricting to certain subsets whenever full equivariance prevents a good classification performance. The authors of Romero and Lohit (2022) test their result on MNIST. In applications of this type, the use of P-GENEOs could allow partial equivariance to be framed within a precise mathematical model.
6 Conclusion
In this article, we proposed a generalization of some known results in the theory of GENEOs to a new mathematical framework, where the collection of all symmetries is represented by a subset of a group of transformations. We introduced P-GENEOs and showed that they are a generalization of GENEOs. We defined pseudo-metrics on the space of measurements and on the space of P-GENEOs and studied their induced topological structures. Under the assumption that the function spaces are compact and convex, we showed compactness and convexity of the space of P-GENEOs. In particular, compactness guarantees that any operator can be approximated by a finite number of operators belonging to the same space, while convexity allows us to build new P-GENEOs by taking convex combinations of P-GENEOs. Compactness and convexity together ensure that every strictly convex loss function on the space of P-GENEOs admits a unique global minimum. Given a collection of P-GENEOs, we presented a general method to construct new P-GENEOs as combinations of the initial ones.
Data availability statement
The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.
Author contributions
LF: Writing – original draft. PF: Writing – original draft, Writing – review & editing. NQ: Writing – original draft, Writing – review & editing. FT: Writing – original draft, Writing – review & editing.
Funding
The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This research has been partially supported by INdAM-GNSAGA. FT was supported by the Wallenberg AI, Autonomous System and Software Program (WASP) funded by Knut and Alice Wallenberg Foundation. PF and NQ carried out this work in the framework of the CNIT National Laboratory WiLab and the WiLab-Huawei Joint Innovation Center.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Bergomi, M. G., Frosini, P., Giorgi, D., and Quercioli, N. (2019). Towards a topological-geometrical theory of group equivariant non-expansive operators for data analysis and machine learning. Nat. Mach. Intellig. 1, 423–433. doi: 10.1038/s42256-019-0087-3
Bocchi, G., Botteghi, S., Brasini, M., Frosini, P., and Quercioli, N. (2023). On the finite representation of linear group equivariant operators via permutant measures. Ann. Mathem. Artif. Intellig. 91, 465–487. doi: 10.1007/s10472-022-09830-1
Bocchi, G., Frosini, P., Micheletti, A., Pedretti, A., Gratteri, C., Lunghini, F., et al. (2022). GENEOnet: a new machine learning paradigm based on Group Equivariant Non-Expansive Operators. An application to protein pocket detection. arXiv.
Bronstein, M. M., Bruna, J., LeCun, Y., Szlam, A., and Vandergheynst, P. (2017). Geometric deep learning: going beyond euclidean data. IEEE Signal Process. Mag. 34, 18–42. doi: 10.1109/MSP.2017.2693418
Bronstein, M. M., Bruna, J., Cohen, T., and Veličković, P. (2021). Geometric deep learning: grids, groups, graphs, geodesics, and gauges. arXiv.
Camporesi, F., Frosini, P., and Quercioli, N. (2018). “On a new method to build group equivariant operators by means of permutants,” in Machine Learning and Knowledge Extraction: Second IFIP TC 5, TC 8/WG 8.4, 8.9, TC 12/WG 12.9 International Cross-Domain Conference, CD-MAKE 2018. Hamburg, Germany: Springer, 265–272.
Cascarano, P., Frosini, P., Quercioli, N., and Saki, A. (2021). On the geometric and Riemannian structure of the spaces of group equivariant non-expansive operators. arXiv.
Chachólski, W., De Gregorio, A., Quercioli, N., and Tombari, F. (2023). Symmetries of data sets and functoriality of persistent homology. Theory Appl. Categor. 39, 667–686.
Cohen, T., and Welling, M. (2016). “Group equivariant convolutional networks,” in International Conference on Machine Learning. PMLR, 2990–2999. Available online at: jmlr.org
Conti, F., Frosini, P., and Quercioli, N. (2022). On the construction of group equivariant non-expansive operators via permutants and symmetric functions. Front. Artif. Intellig. 5, 16. doi: 10.3389/frai.2022.786091
Finzi, M., Benton, G., and Wilson, A. G. (2021). “Residual pathway priors for soft equivariance constraints,” in Advances in Neural Information Processing Systems (Cambridge, MA: The MIT Press), 30037–30049.
Frosini, P., Gridelli, I., and Pascucci, A. (2023). A probabilistic result on impulsive noise reduction in topological data analysis through group equivariant non-expansive operators. Entropy 25, 1150. doi: 10.3390/e25081150
Frosini, P., and Quercioli, N. (2017). “Some remarks on the algebraic properties of group invariant operators in persistent homology,” in Machine Learning and Knowledge Extraction, eds. A. Holzinger, P. Kieseberg, A. M. Tjoa, and E. Weippl. Cham. Springer International Publishing, 14–24.
Gerken, J., Aronsson, J., Carlsson, O., Linander, H., Ohlsson, F., Petersson, C., et al. (2023). Geometric deep learning and equivariant neural networks. Artif. Intell. Rev. 56, 1–58. doi: 10.1007/s10462-023-10502-7
Li, R., Zhong, S., and Swartz, C. (2012). An improvement of the Arzelà-Ascoli theorem. Topol. Appl. 159, 2058–2061. doi: 10.1016/j.topol.2012.01.014
Masci, J., Rodolà, E., Boscaini, D., Bronstein, M. M., and Li, H. (2016). “Geometric deep learning,” in SIGGRAPH ASIA 2016 Courses. New York, NY: Association for Computing Machinery, 1–50.
Micheletti, A. (2023). A new paradigm for artificial intelligence based on group equivariant non-expansive operators. Eur. Math. Soc. Mag. 128, 4–12. doi: 10.4171/mag/133
Quercioli, N. (2021a). On the Topological Theory of Group Equivariant Non-Expansive Operators (PhD thesis). Bologna: Alma Mater Studiorum - Università di Bologna. Available online at: http://amsdottorato.unibo.it/9770/. (accessed July 12, 2023).
Quercioli, N. (2021b). “Some new methods to build group equivariant non-expansive operators in TDA,” in Topological Dynamics and Topological Data Analysis, 229–238.
Romero, D. W., and Lohit, S. (2022). Learning partial equivariances from data. Adv. Neural Inform. Proc. Syst. 35, 36466–36478.
van der Ouderaa, T., Romero, D. W., and van der Wilk, M. (2022). Relaxing equivariance constraints with non-stationary continuous filters. Adv. Neural Inform. Proc. Syst. 35, 33818–33830.
Wang, R., Walters, R., and Yu, R. (2022). “Approximately equivariant networks for imperfectly symmetric dynamics,” in Proceedings of the 39th International Conference on Machine Learning. PMLR, 23078–23091. Available online at: jmlr.org
Keywords: partial-equivariant neural network, P-GENEO, pseudo-metric space, compactness, convexity
Citation: Ferrari L, Frosini P, Quercioli N and Tombari F (2023) A topological model for partial equivariance in deep learning and data analysis. Front. Artif. Intell. 6:1272619. doi: 10.3389/frai.2023.1272619
Received: 04 August 2023; Accepted: 27 November 2023;
Published: 21 December 2023.
Edited by:
Fabio Anselmi, University of Trieste, Italy
Reviewed by:
Maurizio Parton, G. d'Annunzio University of Chieti and Pescara, Italy
Filippo Maggioli, Sapienza University of Rome, Italy
Copyright © 2023 Ferrari, Frosini, Quercioli and Tombari. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Nicola Quercioli, nicola.quercioli@gmail.com