Riemannian Geometry of Functional Connectivity Matrices for Multi-Site Attention-Deficit/Hyperactivity Disorder Data Harmonization

Simeon, Guillem; Piella, Gemma; Camara, Oscar; Pareto, Deborah

doi:10.3389/fninf.2022.769274

ORIGINAL RESEARCH article

Front. Neuroinform., 23 May 2022

Volume 16 - 2022 | https://doi.org/10.3389/fninf.2022.769274

Riemannian Geometry of Functional Connectivity Matrices for Multi-Site Attention-Deficit/Hyperactivity Disorder Data Harmonization

$\nGuillem Simeon$ Guillem Simeon¹^*

Gemma Piella²

Oscar Camara³

Deborah Pareto⁴

¹Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), Barcelona, Spain
²SimBioSys Group, BCN MedTech, Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
³PhySense Group, BCN MedTech, Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
⁴Neuroradiology Section, Department of Radiology (Institut de Diagnòstic per la Imatge), Vall d'Hebron Hospital Universitari, Barcelona, Spain

The use of multi-site datasets in neuroimaging provides neuroscientists with more statistical power to perform their analyses. However, it has been shown that the imaging-site introduces variability in the data that cannot be attributed to biological sources. In this work, we show that functional connectivity matrices derived from resting-state multi-site data contain a significant imaging-site bias. To this aim, we exploited the fact that functional connectivity matrices belong to the manifold of symmetric positive-definite (SPD) matrices, making it possible to operate on them with Riemannian geometry. We hereby propose a geometry-aware harmonization approach, Rigid Log-Euclidean Translation, that accounts for this site bias. Moreover, we adapted other Riemannian-geometric methods designed for other domain adaptation tasks and compared them to our proposal. Based on our results, Rigid Log-Euclidean Translation of multi-site functional connectivity matrices seems to be among the studied methods the most suitable in a clinical setting. This represents an advance with respect to previous functional connectivity data harmonization approaches, which do not respect the geometric constraints imposed by the underlying structure of the manifold. In particular, when applying our proposed method to data from the ADHD-200 dataset, a multi-site dataset built for the study of attention-deficit/hyperactivity disorder, we obtained results that display a remarkable correlation with established pathophysiological findings and, therefore, represent a substantial improvement when compared to the non-harmonization analysis. Thus, we present evidence supporting that harmonization should be extended to other functional neuroimaging datasets and provide a simple geometric method to address it.

1. Introduction

Functional magnetic resonance imaging (fMRI) has become one of the leading methods to conduct research on human brain mapping. fMRI acquisitions allow the discovery of brain activation patterns, which help to understand the brain processes behind cognition or task performance. A relevant paradigm of acquisition is called resting-state fMRI, where the subject's brain is studied without performing any specific task. Resting-state fMRI made possible to identify intrinsic functional connectivity and resting-state networks in the brain (van den Heuvel and Pol, 2010; Smitha et al., 2017), that is, distant brain regions that exhibit a temporal correlation in their blood-oxygen-level-dependent (BOLD) signals and are thought to activate in a synchronous way. Moreover, resting-state functional connectomics has provided new insights into brain organization in disease states, since specific changes in connectivity patterns have been directly correlated to multiple disorders (Greicius, 2008; Du et al., 2018). These alterations in connectivity are useful to identify biomarkers or to gain more knowledge on these neurological or psychiatric disorders. A very informative approach to summarizing whole-brain functional connectivity consists of the construction of functional connectivity matrices. It is based on the parcellation of the brain into some predefined regions of interest (ROIs) and the comparison of the BOLD time series associated with these ROIs. Usually, this comparison is made in terms of a pair-wise Pearson's correlation coefficient between ROI signals (even though partial correlations can also be used Kim et al., 2015). After the construction of functional connectivity matrices from a collection of subjects, one can identify entry-wise statistically significant differences between subsamples of these subjects (for example, patients and healthy controls), usually by performing univariate statistical tests on each entry separately. Therefore, one is able to spot key functional connections that display differences according to the condition of the subject and, hopefully, playing an important role in the disease state.

The fact that functional connectivity matrices are constructed using correlation coefficients makes them belong to a particular subset of matrices called symmetric positive-definite (SPD) matrices, which are symmetric matrices with all eigenvalues strictly greater than zero. This set of matrices does not form a vector space. This property implies that some conventional Euclidean operations on SPD matrices, such as subtraction of two SPD matrices, do not yield another element of the set of SPD matrices. Even though one can perform Euclidean operations with them, resulting matrices do not take into account the underlying geometry of the space of SPD matrices. Mathematically, the space of SPD is what is known as a manifold. Roughly speaking, a manifold is a curved space that locally looks like a flat (Euclidean) space. Although one cannot perform Euclidean operations with elements (points) of the space, there exists an extremely powerful mathematical formalism, known as Riemannian geometry, that deals with manifolds and allows computations respecting their underlying geometry. Since functional connectivity matrices are SPD matrices, it is preferable to perform operations on them that take into account their underlying geometry, rather than applying standard Euclidean methods (You and Park, 2021). Geometry-aware approaches allow, for example, substantial increases in classification performance of functional connectivity matrices and accurate comparison of connectomes (Varoquaux et al., 2010; Ng et al., 2014; Dodero et al., 2015; Slavakis et al., 2017). Nevertheless, the potential of Riemannian geometric methods has not been fully recognized yet, though their consideration in functional studies is increasing. One remarkable contribution to the spread of these methods has been recently made by You and Park (You and Park, 2021), where they introduced SPDtoolbox, a MATLAB-based toolbox that offers the possibility of exploiting the geometry of functional connectivity matrices in a straightforward manner. One very relevant contribution put forward by You and Park (2021) is the proposal of a geometry-aware permutation testing framework that allows the identification of statistically significant differences in functional connectivity between groups.

Given the need for large cohorts to carry out statistical studies, together with the advent of big data and machine learning, neuroimaging datasets have increased their size usually by collecting data acquired at different sites. It is known that the use of different scanners, acquisition protocols, or processing pipelines introduces variability in the data that cannot be attributed to the biological variability of the subjects. To overcome this issue, the standardization of acquisition and processing protocols plays a key role. Nevertheless, it has been shown that some imaging-site bias remains in the data even when standardizing protocols and pipelines (David et al., 2013). Furthermore, from a machine learning perspective, classification might be more difficult because unwanted variability prevents the algorithm from learning the adequate hypothesis function and also because the learned biased hypothesis might not have the necessary generalization power to unseen examples, affecting classification performance (Brain and Webb, 2002). One well-known approach to harmonizing neuroimaging data is ComBat (Johnson et al., 2006). Essentially, ComBat involves subtracting batch variability by modeling it as a deviation from the estimated influence of known covariates of the model. By taking into account the effect of covariates, one can control for the other sources of variability other than the batch (site) effect. In the field of neuroimaging, the ComBat method was shown to remove imaging-site effects in diffusion tensor imaging acquisitions and cortical thickness studies (Fortin et al., 2017, 2018; Beer et al., 2020). Later, ComBat was adapted to functional acquisitions and their derived functional connectivity matrices (Yu et al., 2018).

Nevertheless, this approach does not take into account the geometry of the space of SPD matrices and, therefore, it cannot be considered to be operating in a geometry-constrained manner. In fact, the proposed adaptation of ComBat to functional connectivity matrices (Yu et al., 2018) only considers that the resulting harmonized matrices need to be symmetric. Nevertheless, positive-definiteness is not enforced and, therefore, resulting matrices do not belong in general to the original space of SPD matrices. Positive-definiteness is a geometric constraint and precisely defines the manifold of SPD matrices as a subspace of the space of symmetric matrices. Any method that operates on SPD matrices and respects their symmetry and positive-definiteness can be regarded as a geometry-constrained method. Although site harmonization using Riemannian methods has not been directly addressed, some proposals have been put forward to deal with related procedures. In particular, there have been previous works on the use of manifold-constrained operations to develop domain adaptation procedures specifically for functional data (Ng et al., 2014; Yair et al., 2019). However, in their case, domain adaptation was performed targeting inter-session variability (that is, differences arising from acquiring data from the same subject in different sessions, possibly with intervention in-between).

Our intention in this work is to use the rich formalism of Riemannian geometry to geometrically characterize imaging-site bias and to propose methods to remove it (at least partially) from multi-site functional connectivity matrices. To this aim, we will adapt two previous contributions to domain adaptation of functional data and we will also propose a method from our own, called Rigid Log-Euclidean Translation. These operations will be particularized to ADHD-200 (ADHD-200 Consortium, 2017), a multi-site dataset of resting-state acquisitions from patients with attention-deficit/hyperactivity (ADHD) and healthy controls (HC). Furthermore, using the geometry-aware permutation testing algorithm proposed in You and Park (2021), implemented in SPDtoolbox, we will analyze the impact of imaging-site bias on the results that can be distilled from functional connectivity matrices, and we will correlate them with established pathophysiological findings in ADHD. In short, we intend to provide a simple but powerful geometrically-grounded framework for multi-site functional connectivity matrices harmonization.

2. Materials and Methods

2.1. Functional Connectivity Matrices From ADHD-200

The ADHD-200 sample (ADHD-200 Consortium, 2017) is the result of a joint international effort that provides researchers worldwide with openly-shared resting-state and structural acquisitions from 8 different sites, comprising 362 children and adolescents diagnosed with ADHD and 585 typically developing controls, along with their phenotypic data (ADHD subtype, ADHD score, IQ, medication status, etc). In an effort to standardize the analyses carried out with ADHD-200, three different preprocessing pipelines for resting-state acquisitions were proposed, being also preprocessed data made publicly available (Bellec et al., 2017). One of this preprocessing strategy, the Athena pipeline, was performed by Craddock (Craddock, 2011). The relevant derived data for our present work are the extracted ROI time courses from resting-state acquisitions. As previously mentioned, the computation of the Pearson's correlation coefficient between time courses allows the construction of functional connectivity matrices. Craddock provided researchers with time series extracted using different parcellation schemes. In particular, Craddock derived specific functional parcellations for ADHD-200 data from their resting-state acquisitions, giving rise to a parcellation into 190 ROIs (CC200).

For our present work, we have used a subset of functional connectivity matrices constructed using the CC200 parcellation scheme. Specifically, our dataset has consisted of functional connectivity matrices from 80 patients with ADHD and 80 HC. Their acquisitions have been randomly pooled (avoiding any kind of matching in order to maximize intrinsic variability) from four different sources: 20 ADHD/20 HC from Kennedy Krieger Institute (KKI) (Johns Hopkins University), 20 ADHD/20 HC from the NeuroIMAGE sample (NIM), 20 ADHD/20 HC from New York University Child Study Center (NYU), and again 20 ADHD/20 HC from Peking University (PKG). Therefore, we have worked in total with 160 functional connectivity matrices, with dimensions of 190 × 190.

2.2. Riemannian Geometry of SPD Matrices

Symmetric positive definite (SPD) matrices are the set of symmetric matrices with all eigenvalues strictly greater than zero. This property is equivalent to the requirement that, for an n × n symmetric matrix Σ to be positive definite, x^T Σ x > 0 for all non-zero x ∈ ℝⁿ. The set of SPD matrices forms a manifold, a topological space that locally looks like Euclidean space. At each point p of some manifold $M$ , one can construct a tangent space $T_{p} M$ , generated by the tangent vectors of curves crossing p, which is a vector space and approximates the manifold in the neighborhood of p. In our present case, $M$ is the manifold of n × n SPD matrices, which we will denote by ${S y m}_{n}^{+}$ , where each point is an SPD matrix, and one can show that at every point $p \in {S y m}_{n}^{+}$ , $T_{p} {S y m}_{n}^{+}$ is the space of symmetric n × n matrices, Sym_n.

Riemannian geometry provides a framework that allows one to regard the manifold as a metric space. Specifically, a Riemannian manifold is a smooth manifold equipped with a SPD tensor field called metric tensor. The metric tensor g at every point $p \in M$ is a map

\begin{array}{l} g_{p} : T_{p} M \times T_{p} M \to ℝ, & (1) \end{array}

constructed as a generalization of the canonical dot product in Euclidean space and that can be used to characterize the geometry of the manifold. Two other important maps in Riemannian geometry are the exponential map exp_p and the logarithmic map log_p at some point $p \in M$ ,

\begin{array}{l} {exp}_{p} : T_{p} M \to M, {log}_{p} : M \to T_{p} M . & (2) \end{array}

Given a tangent vector $v \in T_{p} M$ , exp_p(v) results in a projection of v onto the neighborhood of p in the manifold $M$ by defining a geodesic (i.e., the shortest length curve on the manifold) in the direction of v. The logarithmic map performs the inverse operation, mapping points in the manifold to vectors in the tangent space of p.

Particularizing previous notions to our case, Pennec et al. (2006) introduced one of the most used geometric Riemannian structures on the manifold of SPD matrices, ${S y m}_{n}^{+}$ , the affine-invariant Riemannian metric (AIRM). They proposed the following metric tensor at some point $Σ \in S y m_{n}^{+}$ (a SPD matrix) applied on two tangent vectors X, Y ∈ Sym_n (two symmetric matrices):

\begin{array}{l} g_{Σ} (X, Y) : = Tr (Σ^{- 1} X Σ^{- 1} Y), & (3) \end{array}

where Σ⁻¹ is the matrix inverse and Tr(·) is the usual trace operator. Associated to this metric, the exponential map at Σ acting on a tangent vector V ∈ Sym_n and the logarithmic map at Σ acting on a point $Π \in {S y m}_{n}^{+}$ read

\begin{array}{l} {exp}_{Σ} (V) = Σ^{\frac{1}{2}} Exp (Σ^{- \frac{1}{2}} V Σ^{- \frac{1}{2}}) Σ^{\frac{1}{2}} \in {S y m}_{n}^{+}, & (4) \end{array}

\begin{array}{l} {log}_{Σ} (Π) = Σ^{\frac{1}{2}} Log (Σ^{- \frac{1}{2}} Π Σ^{- \frac{1}{2}}) Σ^{\frac{1}{2}} \in {S y m}_{n}, & (5) \end{array}

being Exp(A) and Log(A) the exponential and logarithm of matrix A, respectively, which can be computed after eigenvalue decomposition A = UDU^T as Exp(A) = U exp(D) U^T, Log(A) = U log(D) U^T, where U is the eigenvector matrix and D the diagonal eigenvalue matrix, and exp(D) (log(D)) denotes the application of the exponential (logarithmic) function to the eigenvalues. $A^{\frac{1}{2}} = Exp (\frac{1}{2} Log (A))$ is the unique SPD square root of the SPD matrix A. Finally, under the AIRM framework, the geodesic distance between two SPD matrices Σ and Π is

\begin{array}{l} d^{2} (Σ, Π) = {| | Log (Σ^{- 1 / 2} Π Σ^{- 1 / 2}) | |}_{F}^{2} & (6) \end{array}

with $| | A | |_{F}^{2} = \sum_{i} \sum_{j} | A_{i j} |^{2}$ , the (squared) Frobenius norm of a matrix A. Arsigny et al. (2005) proposed another framework to account for the Riemannian structure of the manifold of SPD matrices. This framework, known as Log-Euclidean Riemannian Metric (LERM), consists basically of embedding points of the manifold in a Euclidean space by using the matrix logarithm Log(·). Standard Euclidean computations can be performed on SPD matrices' logarithms since they have been mapped to Euclidean space. Under LERM, the geodesic distance between two SPD matrices Σ, Π is simply

\begin{array}{l} d^{2} (Σ, Π) = {| | Log (Σ) - Log (Π) | |}_{F}^{2}, & (7) \end{array}

that is, the Euclidean distance of matrix logarithms. The computation of these distances is directly implemented in SPDtoolbox.

A useful object in the manifold that can be constructed given a collection of SPD matrices Σ₁, Σ₂, …, Σ_N is their Fréchet mean. The Fréchet mean is the generalization of the concept of centroid or center of mass to more general metric spaces other than Euclidean space, giving a sense of centrality measure. Mathematically, the Fréchet mean of the previous set of matrices is defined as

\begin{array}{l} \bar{Σ} = \underset{Σ \in M}{argmin} \sum_{i = 1}^{N} d^{2} (Σ, Σ_{i}), & (8) \end{array}

that is, the SPD matrix $\bar{Σ}$ that minimizes the sum of the squared distances from itself to the collection of matrices. The Fréchet mean depends on the chosen distance function in the manifold and, therefore, the Fréchet mean under LERM and AIRM, in general, do not coincide. When considering AIRM, the Fréchet mean has to be computed by an optimization procedure. In Pennec (2006), the author proposed a Newton gradient descent algorithm which, after some mean initialization, consists of iterating the following update rule to get successive estimates ${\bar{Σ}}_{t} \to {\bar{Σ}}_{t + 1}$ until convergence:

\begin{array}{l} {\bar{Σ}}_{t + 1} = ex p_{{\bar{Σ}}_{t}} (\frac{1}{N} \sum_{i = 1}^{N} {log}_{{\bar{Σ}}_{t}} (Σ_{i})) . & (9) \end{array}

The algorithm involves the computation of several matrix logarithms, square roots, inverses and exponentials at each iteration, resulting in a computationally expensive process, especially when dealing with a large number of samples. On the contrary, under LERM, the Fréchet mean has a closed form that enables a very fast computation:

\begin{array}{l} \bar{Σ} = Exp (\frac{1}{N} \sum_{i = 1}^{N} Log (Σ_{i})), & (10) \end{array}

which reduces to the obtention of the Euclidean mean of the matrices' logarithms (vectors) and its projection back to the manifold via an exponential map. Again, both Fréchet mean computation methods are directly implemented in SPDtoolbox.

2.3. Geometry-Aware Permutation Testing

Permutation tests are non-parametric statistical tests that rely on the randomization of the observed data to assess the statistical significance of group differences (Nichols and Holmes, 2001). In our case, we have collected two sets of functional connectivity matrices, one with N₊ patients with ADHD ${Σ_{i}^{+} | i = 1, . . ., N_{+}}$ and the other one with N₋ healthy controls ${Σ_{i}^{-} | i = 1, . . ., N_{-}}$ . The philosophy behind permutation testing is to make a very large number of random permutations of the N subjects into the two subgroups and to sample the distribution of values of some statistic under the null hypothesis, that is, under the assumption that the allocation of a subject into one of the two groups is arbitrary. This procedure allows the obtention of the approximate null distribution of the statistic and the assessment of how extreme this observation is with respect to the null distribution of values. Specifically, one can use the definition of p-value to reject the null hypothesis, i.e., the probability to obtain a value for the statistic as or more extreme than the one observed, under the assumption that the null hypothesis is correct. Therefore, under the permutation testing approach, the p-value is the proportion of sampled permutations where the statistic is greater than or equal to the originally observed statistic.

The geometry-aware algorithm proposed in You and Park (2021) and implemented in SPDtoolbox has the following steps:

1. Compute Fréchet mean ${\bar{Σ}}^{+}$ and ${\bar{Σ}}^{-}$ for patients with ADHD and healthy controls, respectively, and take $| {\bar{Σ}}^{+} - {\bar{Σ}}^{-} | \equiv {| {\bar{Σ}}^{+} - {\bar{Σ}}^{-} |}_{o b s}$ (the entry-wise absolute value of the difference matrix, the observed statistic of the n × n entries).

2. For T iterations, perform permutations of the subjects of sizes N₊ and N₋ as previously described, recompute group Fréchet mean, and record the value of the statistic at iteration t, ${| {\bar{Σ}}^{+} - {\bar{Σ}}^{-} |}_{t}$ .

3. For each entry ij, compute its corresponding p-value by taking the ratio between the number of iterations giving a statistic greater than or equal to ${({| {\bar{Σ}}^{+} - {\bar{Σ}}^{-} |}_{o b s})}_{i j}$ and the total number of iterations T.

At this point, assuming some confidence level α, one can reject the null hypothesis for entries with p-values lower than α. It is worth noting that the algorithm tests for statistical significance in absolute difference and, therefore, cannot provide information on the positive or negative character of the difference. The output of the algorithm in our case will be a 190 × 190 matrix with p-values as entries. Nevertheless, we will not rely on a single permutation test on the 80 ADHD/80 HC subjects to perform our analyses. Instead, we have decided to perform 100 separate permutation tests on 10 ADHD/10 HC randomly sampled subjects in each experiment. The reasons for following this approach are first, taking into account the philosophy behind permutation testing, performing a reliable permutation test on 80 ADHD and 80 HC is not computationally tractable, since the number of ways of dividing 160 subjects into two separate groups of 80 are 160!/(80!80!) ~ 10⁴⁷ and, therefore, by no means one can sample a large enough fraction of permutations to rely on the obtained distribution of statistic values; second, by randomly pooling 10 ADHD/10 HC 100 times, regardless of imaging site, one can probe different combinations of site members and site influence on the variability of the results; and third, one can assess the relevance of salient connections and nodes by checking for coincidences between independent permutation tests. Taken together, this treatment allows evaluating the power of harmonization methods, in the sense that when data is harmonized, one expects more sensitivity (coherence, coincidences) in statistically significant results when using randomly sampled subjects from different sites. Given these aforementioned points, we have operated in the following manner: we have performed 100 separate permutation tests, each one of 1,000 iterations, on 10 ADHD/10 HC randomly sampled subjects; after each test, we have identified the entries with p < 0.001 and declared them as significant; at this point, we have been left with 100 binary matrices of dimensions 190 × 190 (one for each separate test) with ones in the location of significant entries and zeros elsewhere; and, finally, we have summed these binary matrices to obtain a symmetric 190 × 190 matrix, which we call F, with F_ij being the number of times the connection ij has been declared significant, that could be regarded as a 2-dimensional histogram.

When working at a significance level of α = 0.001 for each individual permutation test, we are assuming that, at most, the probability of mistakenly declaring an entry as statistically significant by chance is 0.001 (incorrect rejection of the null hypothesis). However, when performing 100 separate experiments and considering jointly their results (as in the frequency histograms), this effect can be accumulated. As a simplification, we can model each separate experiment as a random Bernoulli experiment: the probability of mistakenly declaring each entry as significant by chance in a separate experiment is 0.001. The concatenation of several independent Bernoulli experiments follows the binomial distribution. Therefore, we can use the binomial distribution to compute the probabilities for each separate entry to be declared significant by chance once, twice, 3 times, etc. In our case, considering a probability of 0.001 and 100 trials, the binomial distribution predicts that in the final frequency matrix, the probability for each entry to appear once is 0.09. That is, each entry has a probability of 9% of appearing once in the frequency histogram by chance, which indeed means that at this frequency level (F_ij = 1), significance is compromised. However, for entries F_ij ≥ 2, the (complementary cumulative) probability is ≈ 0.005, and for entries F_ij ≥ 3, the resulting (complementary cumulative) probability is ≈ 0.00015. As a consequence, one can threshold the histogram matrix F_ij at different frequencies depending on the confidence that one wants to assume.

Finally, using these notions and taking into account that the previously described procedure will be applied to different imaging-site harmonization methods introduced in Section 2.4, we define a simple quantity that measures the power of a method to specifically point to some entries by accumulating counts across the 100 separate experiments. We call this measure the sensitivity of the method. At a given frequency n, the sensitivity S(n) was defined as the ratio between the number of entries such that F_ij ≥ n and the number of entries such that F_ij ≥ 1.

2.4. Geometry-Aware Site Harmonization

As mentioned in the Introduction, imaging site has an impact on the acquisition of functional data and their analyses. We show that this is indeed the case for ADHD-200, even when using the same preprocessing pipeline (Wang et al., 2017). We also prove that, in our case, the principal difference stems from the biased and site-clustered distribution of connectivity matrices in the SPD manifold. Some geometry-aware domain adaptation approaches have been proposed, even though not targeting imaging-site variability. We can distinguish two different harmonization methods that will be analyzed separately: matrix whitening at identity and parallel transport. When being described, they will be adapted to our dataset of patients with ADHD ${Σ_{i}^{+} | i = 1, . . ., N_{+}}$ and healthy controls ${Σ_{i}^{-} | i = 1, . . ., N_{-}}$ and to our imaging-site harmonization aim.

2.4.1. Matrix Whitening at Identity

Suppose our dataset can also be divided into collections of matrices ${Σ_{i}^{(k)}}$ , with k labeling the imaging site, where resting-state data from subject i was acquired. If we denote by ${\bar{Σ}}^{(k)}$ , the AIRM Fréchet mean of matrices obtained from site k, the matrix whitening approach reduces to applying the following transformation to all $Σ_{i}^{(k)}$ (Ng et al., 2014; Yair et al., 2019):

\begin{array}{l} Σ_{i}^{(k)} \to {({\bar{Σ}}^{(k)})}^{- \frac{1}{2}} Σ_{i}^{(k)} {({\bar{Σ}}^{(k)})}^{- \frac{1}{2}} . & (11) \end{array}

One can show that this transformation is equivalent to a displacement of the matrices such that their Fréchet mean is the identity matrix. To this aim, consider C ∈ GL(n), that is, an invertible n × n matrix. The following properties hold when considering the AIRM framework (Yair et al., 2019):

1. The geodesic distance (6) between two SPD matrices Σ and Π is invariant under Σ → C Σ C^T and Π → C Π C^T, i.e., d(C Σ C^T, C Π C^T) = d(Σ, Π).

2. Given some Fréchet mean $\bar{Σ} (Σ_{1}, . . ., Σ_{N})$ , then $\bar{Σ} (C Σ_{1} C^{T}, . . ., C Σ_{N} C^{T})$ is equivalent to $C \bar{Σ} (Σ_{1}, . . ., Σ_{N}) C^{T}$ .

Transformation (11) can be expressed as $Σ_{i}^{(k)} \to C Σ_{i}^{(k)} C^{T}$ with $C = {({\bar{Σ}}^{(k)})}^{- \frac{1}{2}} = C^{T}$ , following the last equality from the fact that, if ${\bar{Σ}}^{(k)}$ is SPD, ${({\bar{Σ}}^{(k)})}^{- \frac{1}{2}}$ is also SPD and, therefore, ${({({\bar{Σ}}^{(k)})}^{- \frac{1}{2}})}^{T} = {({\bar{Σ}}^{(k)})}^{- \frac{1}{2}}$ . By virtue of the first property, intra-site geodesic distances are preserved, and the Fréchet mean becomes ${\bar{Σ}}^{(k)} \to {({\bar{Σ}}^{(k)})}^{- \frac{1}{2}} {\bar{Σ}}^{(k)} {({\bar{Σ}}^{(k)})}^{- \frac{1}{2}} = I$ , following from the second property.

In Ng et al. (2014), authors followed this approach and applied the logarithmic map Log(·) to project transformed SPD matrices to the common tangent space at identity before performing classification. To our understanding, invariance of intra-site distances is a desirable property for harmonization since it is essential to preserve intra-site variability, presumably coming from biological sources (that is, disease, age, gender, medication status, etc). However, we think that, in our setting, it might be a priori more cautious not to remove information about the original location of matrices in the manifold by displacing them to the neighborhood of the identity matrix, since it could also remove relevant clinical information.

One could also consider the use of two different ${\bar{Σ}}^{(k)}$ , one for patients and another one for healthy controls since one expects their Fréchet mean to be different. However, the use of a single site Fréchet mean to perform the transformation is based on two considerations. On the one hand, the use of different transformations for patients with ADHD and HC would violate the preservation of all pair-wise distances within the site, since the matrix C would be different depending on the condition of the subjects. Distances among HC on one side and distances among patients with ADHD on the other side would be preserved, while distances between HC and patients with ADHD would not. This fact could be regarded as the introduction of an uncontrolled source of variability between conditions. On the other hand, it would be desirable to have a harmonization approach that does not depend on the availability of knowledge about subjects' conditions, which would be required when using different transformations. For example, as stated in the Introduction, domain adaptation has been proven useful when training classifiers. Ideally, when the classifier is reliable enough, one would like to assign the condition to a subject without knowing it a priori. Therefore, the use of a single transformation would allow the harmonization of the data prior to the use of the classifier.

2.4.2. Parallel Transport

Elements from two different tangent spaces cannot be directly compared. To compare them in an appropriate manner, one needs to perform what is called parallel transport. In short, parallel transport refers to transporting a tangent space at a point along a geodesic distorting vectors of this tangent space as less as possible. In Yair et al. (2019), the authors provide an excellent and rigorous mathematical description of parallel transport in the manifold of SPD matrices. For harmonization and comparison of SPD matrices, we are interested in transporting and projecting them to a common tangent space.

The main line of reasoning applied to our case is as follows:

1. Project matrices from a given site k to the tangent space of their site Fréchet mean ${\bar{Σ}}^{(k)}$ via logarithmic map $lo g_{{\bar{Σ}}^{(k)}} (Σ_{i}^{(k)})$ .

2. Parallel transport them to the tangent space of a common reference point Σ₀.

3. Reproject the parallel transported matrices back to the manifold via exponential map exp_Σ₀(·).

One of the main results of Yair et al. (2019), is that all this process can be performed by the transformation $Σ_{i}^{(k)} \to C Σ_{i}^{(k)} C^{T}$ with $C = {(Σ_{0} {({\bar{Σ}}^{(k)})}^{- 1})}^{1 / 2}$ . Therefore, properties 1 and 2 presented in the case of matrix whitening also hold here: intra-site distances are preserved and the Fréchet mean becomes $C {\bar{Σ}}^{(k)} C^{T}$ .

Then, one has to decide which is the appropriate reference point Σ₀. Ng et al. (2014) used Σ₀ = I to compare the classification performance between the present method and matrix whitening at identity, although they performed parallel transport numerically (Schild's ladder). In this case, $C = C^{T} = {({\bar{Σ}}^{(k)})}^{- 1 / 2}$ and the Fréchet mean reduces to I (notice that the transformation is exactly equivalent to matrix whitening), arising the same previously mentioned concerns about the original location in the manifold. On the other hand, Yair et al. (2019) suggest using as the reference point $Σ_{0} = \bar{Σ}$ , the Fréchet mean of site means ${{\bar{Σ}}^{(k)}}$ . This approach takes into account the original location of SPD matrices in the manifold through the global mean $\bar{Σ}$ . Right after this step, Yair et al. perform matrix whitening using $\bar{Σ}$ . We will refer to this whole process proposed by Yair et al. as simply parallel transport (even though proper parallel transport is the step performed after projecting to the site mean's tangent space and before reprojecting back to the manifold at the reference point). Notice that the use of a single reference point Σ₀ in our setting, both for patients with ADHD and HC, can be justified by taking into account the same considerations from the previous subsection.

A priori, matrix whitening and parallel transport are different transformations. Nevertheless, one can show that under the condition $[\bar{Σ}, {\bar{Σ}}^{(k)}] \equiv \bar{Σ} {\bar{Σ}}^{(k)} - {\bar{Σ}}^{(k)} \bar{Σ} = 0$ , ∀k, that is, when site means commute with the global mean, parallel transport and matrix whitening are equivalent frameworks. This property is proven in the Annex, and it will have an impact on the results we obtain.

2.4.3. Rigid Log-Euclidean Translation

We hereby propose a method for site harmonization, which we call Rigid Log-Euclidean Translation (RLET), that has the two a priori desirable properties stated above: preservation of intra-site geodesic distances, thus retaining biological variability, and consideration of the original location of correlation matrices in the manifold. Site bias could also impact the distribution of matrices around the site mean, but this would require modeling site effects on the distribution of matrices around their mean. The proposed method is therefore a geometrically-motivated first order approximation. We will work under the LERM framework, which allows a purely Euclidean treatment of the logarithm of the matrices. The steps of Rigid Log-Euclidean Translation are as follows:

1. Compute the logarithm of site Fréchet mean for all k

\begin{array}{l} Log ({\bar{Σ}}^{(k)}) = \frac{1}{N^{(k)}} \sum_{i \in (k)} Log (Σ_{i}^{(k)}), & (12) \end{array}

where the sum runs through all subjects in site k, and N^(k) is the total number of subjects in site k.

2. Compute the logarithm of the global mean $\bar{Σ}$ defined as

\begin{array}{l} Log (\bar{Σ}) = \frac{1}{K} \sum_{(k)} Log ({\bar{Σ}}^{(k)}), & (13) \end{array}

with K being the number of sites.

3. Apply the following transformation to the entire collection of matrices ${Σ_{i}^{(k)}}$ given their k

\begin{array}{l} Log (Σ_{i}^{(k)}) \to Log ({\tilde{Σ}}_{i}^{(k)}) = Log (\bar{Σ}) + (Log (Σ_{i}^{(k)}) - Log ({\bar{Σ}}^{(k)})) . & (14) \end{array}

4. Use the exponential map to obtain the modified (site-harmonized) SPD matrices $Exp (Log ({\tilde{Σ}}_{i}^{(k)})) = {\tilde{Σ}}_{i}^{(k)}$ .

When taking into account that after applying the logarithmic map to the original matrices, we obtain vectors of the Euclidean space ℝ^n×n, the above transformation is merely a translation of the original matrices such that transformed site means ${\tilde{\bar{Σ}}}^{(k)}$ are identically the global mean $\bar{Σ}$ . In fact, it is a rigid translation of matrices logarithms, representing an isometry of the Euclidean metric, and therefore preserving Euclidean distances in the Euclidean space ℝ^n×n. These properties are formally proven in the Annex.

Since geodesic distances under the LERM framework are these Euclidean distances, intra-site geodesic distances are invariant under the proposed transformation, and the Rigid Log-Euclidean Translation approach preserves intra-site variability. The transformation also allows retaining information about the original location in the manifold through the global mean. On the other hand, one can straightforwardly modify this approach to rigidly transport matrices to I, by removing the term $Log (\bar{Σ})$ from the transformation rule. In this case, $Log ({\tilde{\bar{Σ}}}^{(k)}) = 0 = Log (I), \forall k$ , (see Annex) and resulting relative positions between all matrices in Euclidean space are the same as when translated to $\bar{Σ}$ . We will refer to these two approaches as RLET( $\bar{Σ}$ ) and RLET(I). RLET(I) can be useful to compare different harmonization strategies at identity and to probe differences coming uniquely from manifold location.

2.5. Pair-Wise Distances and Low-Dimensional Embedding

Even though all the concepts presented above are geometrically motivated, the major difficulty for bias characterization stems from the fact that we are dealing with a very high-dimensional space (in our case, d = n(n + 1)/2 = 190(190 + 1)/2 = 18, 145). Therefore, the extraction of insights from this space is a challenging task. However, since our manifold is equipped with a metric and a distance measure (i.e., the manifold is Riemannian), we can construct a useful object that can summarize important geometric information: the pair-wise distance matrix D_ij = d(Σ_i, Σ_j) between functional connectivity matrices (subjects). In fact, this approach of using pair-wise distances or pair-wise similarities is common in unsupervised techniques of dimensionality reduction for visualization purposes, such as multidimensional scaling (MDS) or t-distributed stochastic neighbor embedding (tSNE).

We will focus on t-distributed stochastic neighbor embedding (tSNE), since it was the method of choice in the previous works we have adapted for site harmonization. tSNE (van der Maaten and Hinton, 2008) is a nonlinear dimensionality reduction technique that maps a data point in high-dimensional space to a point in 2-dimensional or 3-dimensional Euclidean space. Different from other nonlinear dimensionality reduction approaches, tSNE has a probabilistic nature. Originally, the similarity metric of the algorithm was taken to be the pair-wise Euclidean distance between high-dimensional points, although currently different metrics can be used depending on the task at hand. However, the usual Euclidean distance is a good choice in our case: by using as high-dimensional inputs the logarithms of the matrices, the similarity measure between two different matrices will turn out to be the LERM geodesic distance. As a consequence, the low-dimensional embedding and its visualization will display similarity given by the pair-wise distribution of matrices in the manifold (under the LERM framework) with very high probability. At this point, one is able to assess the distribution of matrices in the manifold and, therefore, to characterize original geometric imaging-site bias and evaluate the effect of harmonization methods.

2.6. Functional Analysis

One can perform comparative functional analyses between methods in terms of resting-state networks instead of specific ROI-to-ROI connections. To this aim, we will use the Yeo's 17-networks atlas (Yeo et al., 2011), where the brain is parcellated into 17 different resting-state networks. We have further reduced the number of networks by using the following proposed correspondence: 1 and 2 correspond to the visual network (VIS); 3, 4, and 14 to the sensorimotor network (MOT); 5 and 6 to dorsal attention (DA); 7 to ventral attention (VA); 8, 11, 12, and 13 to frontoparietal network (FP); 9 and 10 to limbic (LIM); and 15, 16, and 17 to default mode network (DMN). However, Yeo's parcellation does not include important subcortical regions that are included in our ROIs (CC200 parcellation, see Introduction). Therefore, and taking into account the results we have obtained, we have introduced the brain-stem (BS), the cerebellum (CRB), and basal ganglia (BG). In total, we are left with 10 different functional components.

Once we have the functional parcellation of the brain, we will assign each ROI to its corresponding functional component by using the ROI's coordinates. This will allow us to initially focus our attention on inter-network and intra-network interactions since the individual study of salient connections would be intractable when the number of detected significant differences in ADHD and HC connectomes is large. When thresholding frequency matrices F_ij at higher frequency levels, the number of samples will be reduced and individual treatment of these differences will be possible.

3. Results

3.1. Pair-Wise Distances and Distribution of Matrices

We have constructed the aforementioned pair-wise distance matrices D_ij = d(Σ_i, Σ_j) that retain information on the high-dimensional distribution of the matrices in the manifold. Specifically, we have computed pair-wise distances for original (non-modified) matrices ${D_{i j}}^{(0)}$ both using AIRM and LERM distances, harmonized matrices under Rigid Log-Euclidean Translation (RLET( $\bar{Σ}$ ) and RLET(I)), ${D_{i j}}^{(RLET (\bar{Σ}))}$ and ${D_{i j}}^{(RLET (I))}$ , using LERM distance, and matrix-whitening-harmonized (MW), and parallel-transport-harmonized matrices (PT), ${D_{i j}}^{(MW)}$ and ${D_{i j}}^{(PT)}$ , employing AIRM distance (Figure 1).

FIGURE 1

Figure 1. Pair-wise distances between functional connectivity matrices from different sites (KKI: Kennedy Krieger Institute, Johns Hopkins University; NYU: New York University Child Study Center; PKG: Peking University; NIM: NeuroIMAGE sample). (A) Pair-wise affine-invariant Riemannian metric (AIRM) distances for non-transformed matrices. (B) Pair-wise Log-Euclidean Riemannian metric (LERM) distances for non-transformed matrices. (C) Pair-wise AIRM distances after matrix whitening. (D) Pair-wise LERM distances after Rigid Log-Euclidean Translation (RLET) to the global mean $\bar{Σ}$ . (E) Pair-wise AIRM distances after parallel transport. (F) Pair-wise LERM distances after RLET to the identity matrix.

${D_{i j}}^{(MW)}$ and ${D_{i j}}^{(PT)}$ look identical (Figures 1C,F). This approximate equivalence can be explained by nearly vanishing commutators $[\bar{Σ}, {\bar{Σ}}^{(k)}]$ , as previously proved. We have checked the values of these commutators and, indeed, they almost vanish, as can be seen in Figure 2. The consequence is that, in practical terms, in the following, we can regard PT and MW as two equivalent frameworks.

FIGURE 2

Figure 2. Commutators of the global mean $\bar{Σ}$ and original site Fréchet mean ${{\bar{Σ}}^{(k)}}$ . (A) $[\bar{Σ}, {\bar{Σ}}^{(KKI)}]$ ; (B) $[\bar{Σ}, {\bar{Σ}}^{(NYU)}]$ ; (C) $[\bar{Σ}, {\bar{Σ}}^{(PKG)}]$ ; (D) $[\bar{Σ}, {\bar{Σ}}^{(NIM)}]$ . (KKI, Kennedy Krieger Institute, Johns Hopkins University; NYU, New York University Child Study Center; PKG, Peking University; NIM, NeuroIMAGE sample).

On the other hand, we have applied tSNE to the logarithms of original (non-harmonized) matrices, RLET( $\bar{Σ}$ )-harmonized matrices, and MW-harmonized matrices, with results shown in Figures 3, 4. Therefore, we have obtained a low-dimensional representation of ${D_{i j}}^{(0)}$ , ${D_{i j}}^{(RLET (\bar{Σ}))}$ , and ${D_{i j}}^{(MW)}$ , all of them computed by means of LERM distances. RLET(I)-harmonized matrices have not been used since, by construction, their pair-wise distances are identical to the ones for RLET( $\bar{Σ}$ ), and, regarding PT-harmonized matrices, they are almost exactly equivalent to the ones obtained using MW. The differences that could be displayed in the visualization of RLET(I) matrices compared to RLET( $\bar{Σ}$ )-harmonized ones and PT compared to MW would be due to the stochastic nature of the tSNE algorithm.

FIGURE 3

Figure 3. Two-dimensional embedding obtained after applying t-stochastic neighbor embedding to the logarithms of non-harmonized matrices. KKI, Kennedy Krieger Institute, Johns Hopkins University; NYU, New York University Child Study Center; PKG, Peking University; NIM, NeuroIMAGE sample.

FIGURE 4

Figure 4. (A) Two-dimensional embedding obtained after applying t-stochastic neighbor embedding to the logarithms of matrices transformed under matrix whitening. (B) Two-dimensional embedding obtained after applying t-stochastic neighbor embedding to the logarithms of matrices transformed under Rigid Log-Euclidean Translation to the global mean $\bar{Σ}$ . KKI, Kennedy Krieger Institute, Johns Hopkins University; NYU, New York University Child Study Center; PKG, Peking University; NIM, NeuroIMAGE sample.

3.2. Numerical Assessment of Harmonization Properties

Two harmonization properties were required from the beginning: preservation of intra-site geodesic distances, thus preserving biological variability (and some residual site bias), and the imposition of a common Fréchet mean. In the case of RLET( $\bar{Σ}$ ), this Fréchet mean is the global mean $\bar{Σ}$ , and in the case of RLET(I) and MW, the identity matrix I. To check that the first property is fulfilled, we have computed pair-wise distance difference matrices ${D_{i j}}^{(RLET (\bar{Σ}))} - {D_{i j}}^{(0)}$ (with RLET(I) giving the same results by construction) and ${D_{i j}}^{(MW)} - {D_{i j}}^{(0)}$ (in this latter case, using ${D_{i j}}^{(PT)}$ would give visually the same result, since ${D_{i j}}^{(PT)} \approx {D_{i j}}^{(MW)}$ , as proved before). We have found that diagonal blocks (differences in intra-site distances) vanish identically (Figures 5A,B). Regarding the second property, we have computed on one side, pair-wise LERM distances between site means and between the global mean and these site means, before RLET(I) and RLET( $\bar{Σ}$ ), and after RLET( $\bar{Σ}$ ) harmonization; and on the other side, pair-wise AIRM distances between site means and global mean before MW harmonization, pair-wise AIRM distances between site means and the identity matrix after MW harmonization, and pair-wise LERM distances between site means and the identity matrix after RLET(I) (Figures 5C–F). After harmonization, these pair-wise distances vanish identically.

FIGURE 5

Figure 5. (A) Difference in pair-wise AIRM distances between original matrices and matrices transformed under matrix whitening. (B) Difference in pair-wise LERM distances between original matrices and matrices transformed under Rigid Log-Euclidean Translation (RLET). (C) Pair-wise AIRM distances between site Fréchet means ${{\bar{Σ}}^{(k)}}$ and the global mean $\bar{Σ}$ . (D) Pair-wise LERM distances between site Fréchet means ${\bar{Σ}}^{(k)}$ and the global mean $\bar{Σ}$ . (E) Pair-wise AIRM distances between site Fréchet mean computed after matrix whitening and the identity matrix I. (F) Pair-wise LERM distances between site Fréchet means computed after RLET( $\bar{Σ}$ ) and the global mean $\bar{Σ}$ . KKI, Kennedy Krieger Institute, Johns Hopkins University; NYU, New York University Child Study Center; PKG, Peking University; NIM, NeuroIMAGE sample.

3.3. Entry-Wise ADHD/HC Differences

The process of performing 100 separate experiments, described in Section 2.3, to detect entry-wise differences, has been applied, with the same randomly pooled subjects, to the original, RLET( $\bar{Σ}$ )-harmonized, RLET(I)-harmonized, and MW-harmonized matrices (PT has been disregarded because of its approximate equivalence to MW). Therefore, we have obtained four frequency matrices ${F_{i j}}^{(0)}$ , ${F_{i j}}^{(RLET (\bar{Σ}))}$ , ${F_{i j}}^{(RLET (I))}$ , and ${F_{i j}}^{(MW)}$ , respectively, which are shown in Figure 6. Results on the sensitivity (defined also in subsection 2.3) we have obtained for ${F_{i j}}^{(0)}$ , ${F_{i j}}^{(RLET (\bar{Σ}))}$ , ${F_{i j}}^{(RLET (I))}$ , and ${F_{i j}}^{(MW)}$ are shown in Table 1 between parentheses, together with the number of detected significant differences depending on their frequency of appearance in the frequency matrices.

FIGURE 6

Figure 6. Frequency matrices F_ij obtained by performing 100 permutation experiments with the same randomly pooled subjects, binarizing each experiment's p-value matrix at p < 0.001 and summing these resulting matrices for (A) original functional connectivity (FC) matrices; (B) FC matrices transformed under matrix whitening; (C) FC matrices transformed under Rigid Log-Euclidean Translation (RLET) to the global mean $\bar{Σ}$ ; (D) FC matrices transformed under RLET to the identity matrix.

TABLE 1

Table 1. The number of detected anomalies using original matrices, harmonized matrices under Rigid Log-Euclidean Translation (RLET) to the global mean $\bar{Σ}$ , harmonized matrices under RLET to the identity matrix I, and harmonized matrices under matrix whitening, according to the chosen frequency threshold.

3.4. Intra-Network and Inter-network Anomalies Distribution

In the following, we will refer to significant differences in connection strength between ADHD and HC subjects as “anomalies.” To have a global view of the distribution of anomalies in terms of inter- and intra-network interaction, we have constructed a 2-dimensional plot that reflects the relative involvement of the different resting-state networks in these interactions.

This 2-dimensional plot is a 10 × 10 matrix (one row and one column per functional component) constructed as follows: first, we have filled the matrix by assigning to each entry ij, the number of anomalies connecting functional component i with functional component j (therefore, diagonal elements represent intra-network anomalies), with the matrix being symmetric since we do not have a sense of directed connection; afterward, we have normalized each row (say, for example, DMN), excluding diagonal elements, by using the total number of inter-network anomalies that involve the component represented by that row (DMN); finally, in the last step, we have normalized the diagonal by dividing each diagonal entry by the sum of all diagonal entries (that is, the total number of intra-network anomalies). At this point, we are left with a matrix where each row, excluding diagonal entries, adds up to 1, and in turn, diagonal entries add up to 1. One has to interpret this matrix in the following manner: when taking, for example, the row corresponding to DMN, the entry DMN-FP is the percentage of inter-network anomalies involving DMN that end up connected to FP; when considering DMN-DMN, it is the percentage of intra-network anomalies that are found within DMN. Therefore, we have summarized in a plot, the relative contribution of every functional component to the inter-network anomalies involving another specific functional component. One has to bear in mind that DMN-FP (row-column) does not have the same interpretation as FP-DMN and, indeed, their values are different in general (after normalizing per row, the matrix is no longer symmetric): the number of anomalous connections between DMN and FP is fixed, but their relative contribution to the total number of inter-network anomalies involving DMN and to the total number of inter-network anomalies involving FP need not be the same.

We have plotted these matrices for the anomalies obtained using the original, the RLET( $\bar{Σ}$ )-harmonized, the RLET(I)-harmonized, and the MW-harmonized functional connectomes (Figure 7). They have been computed with the anomalous connections identified by thresholding their corresponding frequency matrices at F_ij ≥ 2 to have a large enough number of samples (Table 1). Furthermore, we have computed differences between the plots corresponding to original and RLET( $\bar{Σ}$ ), between RLET(I) and MW, and between RLET( $\bar{Σ}$ ) and RLET(I); afterward, we have binarized the resulting difference plots at an absolute entry-wise difference ≥ 15%, thus allowing the detection of important changes in the distributions and assessing harmonization and manifold location effects. These results are shown in Figure 8.

FIGURE 7

Figure 7. Anomaly distribution plots according to the involved functional components obtained as described in subsection 3.4 using (A) original matrices; (B) matrices transformed under Rigid Log-Euclidean Translation (RLET) to the global mean $\bar{Σ}$ ; (C) matrices transformed under matrix whitening; (D) matrices transformed under RLET to the identity matrix. VIS, visual network; MOT, sensorimotor network; DA, dorsal attention network; VA, ventral attention network; FP, frontoparietal network; LIM, limbic network; DMN, default mode network; BS, brain stem; CRB, cerebellum; BG, basal ganglia.

FIGURE 8

Figure 8. Binarized entry-wise absolute differences greater than 15% between plots in Figure 6 for (A) original (non-harmonized) matrices and matrices harmonized under Rigid Log-Euclidean Translation (RLET) to the global mean $\bar{Σ}$ ; (B) RLET to $\bar{Σ}$ and RLET to the identity matrix I harmonization approaches; (C) matrix whitening and RLET to I harmonized matrices. VIS, visual network; MOT, sensorimotor network; DA, dorsal attention network; VA, ventral attention network; FP, frontoparietal network; LIM, limbic network; DMN, default mode network; BS, brain stem; CRB, cerebellum; BG, basal ganglia.

Even though a more involved interpretation of these results will be made in the Discussion section, we highlight that discrepancy between RLET( $\bar{Σ}$ ) and RLET(I) signal distortions induced by relocating matrices to the neighborhood of I, since relative positions between matrices in Log-Euclidean space are exactly the same by construction. Therefore, in the following, we will focus on the comparison between non-harmonized and RLET( $\bar{Σ}$ )-harmonized matrices.

3.5. RLET( $\bar{Σ}$ )-Harmonization Results

At this point, we focus on results at a frequency F_ij ≥ 3 to allow a more precise comparison between the results obtained from non-harmonized and RLET( $\bar{Σ}$ )-harmonized matrices. In this case, we have obtained 21 significant inter-network connections using non-harmonized matrices; and 32 inter-network and 1 intra-network anomalies after transforming original matrices under the RLET( $\bar{Σ}$ ) framework. We have represented the resulting anomalous connectomes in Figures 9, 10.

FIGURE 9

Figure 9. Anomalous connections obtained by thresholding the frequency matrix for non-harmonized matrices at F_ij ≥ 3. In total, the 21 anomalous connections are displayed. Notice the lack of anomalies involving the posterior parietal cortex and the cerebellum.

FIGURE 10

Figure 10. Anomalous connections obtained by thresholding the frequency matrix for matrices harmonized under Rigid Log-Euclidean Translation to the global mean $\bar{Σ}$ at F_ij ≥ 3. The 33 detected anomalies are displayed.

Moreover, we have plotted two histograms, one for non-harmonized and the other for RLET( $\bar{Σ}$ )-harmonized matrices, displaying how the distribution of abnormal connections involving particular functional components varies according to the frequency threshold (Figure 11). When looking at the histograms, one can see that the distribution of anomalies according to the involved functional component is better preserved when changing the considered frequency level in the case of RLET( $\bar{Σ}$ ) than in the non-harmonization approach.

FIGURE 11

Figure 11. The relative contribution of functional components to the total number of detected anomalies by thresholding frequency matrices at F_ij ≥ 2 and F_ij ≥ 3 corresponding to (A) the non-harmonization approach; (B) harmonization under Rigid Log-Euclidean Translation to the global mean $\bar{Σ}$ . VIS, visual network; MOT, sensorimotor network; DA, dorsal attention network; VA, ventral attention network; FP, frontoparietal network; LIM, limbic network; DMN, default mode network; BS, brain stem; CRB, cerebellum; BG, basal ganglia.

4. Discussion

4.1. Imaging-Site Bias

Results from pair-wise distances computation (Figure 1) and low-dimensional embedding (Figures 3, 4) of functional connectivity matrices show that there is an imaging-site-biased distribution of these matrices in the manifold. In particular, one can distinguish between an intra-site bias effect and an inter-site bias effect. Intra-site bias is clearly visible when looking at diagonal blocks of pair-wise distances plots: intra-site distances are significantly different between KKI and NIM matrices [interestingly, the NIM sample comes in its turn from 3 different imaging sites (ADHD-200 Consortium, 2017), what could explain its large intra-site variability]. Instead, one would expect the intrinsic biological variability to be approximately the same across sites when the number of subjects is large enough. Inter-site bias is reflected in the clustered distribution of matrices in the manifold that makes it possible to clearly identify visually the imaging site in the tSNE low-dimensional embedding (recall that this low-dimensional embedding is, with high probability, displaying LERM distance similarities). When harmonizing using the three different methods, inter-site distances were generally reduced (Figure 1), which is the expected effect of removing site-clustering behavior.

The clustered distribution in the manifold (Figure 3) points directly to the fact that entry-wise statistical comparisons could not be completely reliable unless the site effect is previously subtracted (Figure 4), since what is usually targeted in these statistical tests are entry-wise differences. Because of the nature of the permutation testing approach targeting these differences, large entry-wise differences coming from the original clustered and distant distribution of the matrices might be masking true and more subtle biological differences, or even giving rise to wrong salient entries. Geometry-grounded frameworks transform the matrices' entries in such a way that they can still be regarded as covariances (resulting matrices belong to the SPD manifold, as opposed to ComBat outputs, for example), that removes (partially) multi-site effects on entry-wise differences, and therefore that allows a more unbiased study of the significance of these individual differences or anomalies.

Regarding the detection of significant differences, in terms of sensitivity, RLET(I) is the harmonization method that performs the worst (Table 1) at all frequency thresholds. Nevertheless, only by using harmonization approaches have we reached the F_ij = 5 level. On the other hand, site means transform as expected, becoming the identity matrix I for RLET(I) and MW, and the global mean $\bar{Σ}$ in the case of RLET( $\bar{Σ}$ ), since pair-wise distances vanish. Intra-site distances are also preserved as intended but, as a consequence, intra-site bias is also retained. However, as a first approximation to imaging-site harmonization, our aim was to operate in the least distortive manner. One cannot know to which extent true biological variability would be artificially altered by modifying intra-site distances, even though it is clear that there is an imaging-site contribution to the observed dispersion. Notice however that when looking at Figure 4A, it seems that MW has been able to correct differences in dispersions, as opposed to RLET (Figure 4B). This impression is an artifact coming from the fact that the low-dimensional embedding has been obtained by using LERM distances, both for MW and for RLET. MW is based on the AIRM framework and, therefore, the low-dimensional embedding, using LERM distances as the metric, has not been able to capture the expected intra-site AIRM distance preservation behavior. In fact, if one considers Figure 1, one can see that different dispersions around the site mean have not been corrected for MW (Figure 1C), since diagonal blocks exhibit different average intra-site distances in the same way as RLET-transformed ones (Figure 1D).

Figure 8 provides evidence favoring harmonization, but also evidence for rejecting harmonization at the identity matrix. Testing for the difference between these distribution matrices and requiring entry-wise coincidence within ±15% is strict, bearing in mind that we are dealing with relative contributions of connections among and within 10 different components, and the number of samples is considerably small. Therefore, at this point, looking at the differences between RLET(I) and MW (only 3/100 entries display absolute differences larger than 15%), we can claim that harmonization is indeed working in its task of homogenizing results. However, taking into account that the only difference between RLET(I) and RLET( $\bar{Σ}$ ) is the location of the Fréchet mean (relative positions between functional connectivity matrices in the Log-Euclidean space are exactly preserved by construction), and considering the dissimilarities between distributions, we can affirm that location in the manifold is important (as we had presumed in the Methods section and motivating our construction of RLET( $\bar{Σ}$ )) and that transporting matrices to the identity matrix is too distortive in our setting. As a consequence, and given those discrepancies in anomaly distributions between original and RLET( $\bar{Σ}$ ) are 3-fold (9/100) the ones observed by harmonizing with two different approaches to identity (3/100), we have focused our attention in the following subsection on the effect of RLET( $\bar{Σ}$ )-harmonization on the original functional connectivity matrices, since these (possibly biologically relevant) discrepancies are, at least partially, due to harmonization.

Finally, given the results pointing to a considerably large intra-site bias, we can propose a modification to RLET( $\bar{Σ}$ )

\begin{array}{l} Log ({Σ_{i}}^{(k)}) \to Log ({\tilde{Σ}}_{i}^{(k)}) = Log (\bar{Σ}) + λ^{(k)} (Log (Σ_{i}^{(k)}) - Log ({\bar{Σ}}^{(k)})), & (15) \end{array}

consisting on the introduction of a site-dependent rescaling parameter λ^(k) > 0, which has the effect of rescaling intra-site distances by a factor of λ^(k) while still enforcing all site Fréchet mean to be the global mean $\bar{Σ}$ . Tuning λ^(k) would allow, for example, to empirically adjust mean intra-site distances (and, thus, intra-site variability) so that different sites display approximately the same dispersion around $\bar{Σ}$ . A first approximation choice for λ^(k) could be, therefore,

\begin{array}{l} λ^{(k)} = \frac{\frac{1}{K} \sum_{(k^{'})} \frac{1}{2 N^{(k^{'})}} \sum_{i \in (k^{'})} \sum_{j \in (k^{'})} d (Σ_{i}, Σ_{j})}{\frac{1}{2 N^{(k)}} \sum_{i \in (k)} \sum_{j \in (k)} d (Σ_{i}, Σ_{j})}, & (16) \end{array}

that is, the ratio between the mean of all intra-site pair-wise distances (regardless of k) and the mean of intra-site-k distances.

4.2. Abnormal Functional Connectivity Findings

4.2.1. Large-Scale Networks in ADHD

By thresholding frequency matrices at F_ij ≥ 2, one can obtain a large enough number of anomalies to construct the distributions shown in Figure 7. This approach enables to obtain a coarse-grained picture of anomalous interactions and, therefore, to take into account anomalous engagement between large-scale resting-state networks.

Although there is a large amount of pair-wise interactions, the most descriptive findings are obtained by considering relevant functional components that do not have anomalous inter- or intra-network interactions. Using non-harmonized matrices, we have found that at F_ij ≥ 2 there is an absence of intra-DA and intra-VA anomalous connections. In the case of RLET( $\bar{Σ}$ )-harmonized matrices, we see an absence of intra-VIS and intra-VA anomalies, and an absence of DA-VA and VIS-VA inter-network abnormal interaction. In both cases, BG engages abnormally selectively with DMN, FP, and VIS networks. In resting-state functional connectivity, in general, findings do not support an involvement of VA in ADHD, even though authors do not find this evidence to be conclusive (Castellanos and Proal, 2012). On the contrary, DA involvement in ADHD is well-established (Dickstein et al., 2006; Rubia, 2011). The absence of intra-DA alterations under the non-harmonization framework could be therefore interpreted as a deficit of the framework. Intra-VIS alterations, both hyper- and hypoactivations, have been found in task-based settings, depending on the task (Vance et al., 2007; Dillo et al., 2010; Schneider et al., 2010). One study found altered intra-VIS resting-state functional connectivity by using ADHD-200 resting-state data (Kessler et al., 2014) (correlating with results from the analysis of non-harmonized matrices, where intra-VIS contributions are found). Authors highlight the importance of examining carefully the impact of the visual component in ADHD and, in particular, its relation to attention (Castellanos and Proal, 2012). To our opinion, given the importance (in terms of contribution to anomalies) that RLET( $\bar{Σ}$ )-harmonization gives to VIS, the absence of intra-network findings is descriptive, in the sense that it is difficult to understand that the algorithm did not discover intra-network anomalies because of sensitivity regarding the visual component.

4.2.2. Pathophysiology of ADHD

Although there is a growing consensus that large-scale brain networks are involved in ADHD, the number of anomalous connections that remain as significant when thresholding their frequency at F_ij ≥ 3 does not allow an interpretation in terms of network interactions, but rather an interpretation in terms of pathways, (parts of) circuits, or pair-wise relevant connections. This interpretation makes it possible to probe the fine-grained pathophysiology of ADHD from a functional point of view and to correlate our findings with established results.

4.2.2.1. Prefrontal Cortex

It has been hypothesized for a long time that ADHD is a disorder of the prefrontal cortex (PFC). The motivation underlying the hypothesis of an involvement of PFC is that one of its most important functions is behavioral control, the principal impairment presented by patients with ADHD. The circuitry that has been found to be critical in the neurobiology of ADHD is the cortico-striatal-thalamo-cortical (CSTC) loops. These closed loops transmit cortical inputs to the thalamus via the striatum, to be reprojected back to the cortical region. Two of these loops play a key role in ADHD: the dorsolateral prefrontal circuit, with the dorsolateral prefrontal cortex (DLPFC) and the (dorsolateral) caudate nucleus forming the cortico-striatal pathway, and the orbitofrontal circuit, composed of the orbitofrontal cortex (OFC) and the (ventromedial) caudate nucleus pathway. The dorsolateral prefrontal circuit is a central element in cognitive control, and the orbitofrontal circuit is heavily implicated in reward processing, whose impairment is thought to provoke impulsive behavior in ADHD. On top of these ones, another component that has been added more recently to ADHD PFC anomalous connections is the fronto-cerebellar circuitry. The cerebellum is thought to be involved in diverse functional impairments of patients with ADHD, such as working memory, attention, or the construction of temporal expectations. For a detailed description of the involvement of these circuits in ADHD, we refer the reader to Durston et al. (2011) and references therein.

Remarkably, the analysis of RLET( $\bar{Σ}$ )-harmonized matrices points directly to these circuits: we have found anomalous connectivity between the right caudate nucleus and the left OFC; between right cerebellum (VIIb) and the right frontal pole; together with a connected chain comprising the left caudate nucleus, the right inferior frontal gyrus pars opercularis, left cerebellum (VIIb) and the right angular gyrus, with both right inferior frontal gyrus pars opercularis and right angular gyrus lying on DLPFC. Therefore, we have found evidence for 1) anomalies in frontostriatal paths, both in the OFC and DLPFC circuits; 2) an abnormal connectivity in a fronto-cerebellar path (right cerebellum—right frontal pole); 3) a simultaneous anomalous engagement of fronto-cerebellar and frontostriatal circuits with the same cortical region, DLPFC (left cerebellum—right angular gyrus, right inferior frontal gyrus pars opercularis—left caudate nucleus); and 4) more specifically, a simultaneous anomalous engagement of a fronto-cerebellar circuit and a frontostriatal circuit involving exactly the same cortical node (right inferior frontal gyrus pars opercularis, DLPFC), and, thus, a precise anomalous overlap of different circuitry components. Results of non-harmonized matrices also feature the anomalous connection between the left caudate nucleus and the right inferior frontal gyrus pars opercularis. In this case, we would only have evidence for the involvement of the DLPFC circuit. Furthermore, it is important to mention that non-harmonized matrices have not signaled any anomaly involving cerebellar nodes, which is undesirable given the pervasive evidence of their implication in ADHD. In contrast, RLET( $\bar{Σ}$ )-harmonization provides evidence of abnormal functional connectivity of the cerebellum with somatomotor, default mode network, and visual nodes, apart from the PFC ones, pointing to the multi-dimensional association to ADHD pathophysiology.

4.2.2.2. Visual Nodes

It has been established that patients with ADHD have worse performances when tested for visual processing speed and visual short-term memory in comparison to matched healthy controls (Low et al., 2018). These abnormal performances signal a perceptual deficit and, more concretely, impairments in early visual information processing (Alqahtani et al., 2019; Papp et al., 2020).

One of the most remarkable differences between the anomalies spotted using the original matrices and the RLET( $\bar{Σ}$ )-harmonized ones is the importance that the analysis of the latter matrices gives to visual network nodes, in terms of contribution to the total number of detected anomalies. Specifically, there is an interesting and insisting presence of anomalous connections between visual network nodes and nodes found in the posterior parietal cortex (PPC), belonging to the dorsal attention network (DA). Precisely, these connections fill the void found in Figure 9. It has been found that the interaction between the occipital cortex and DA is involved in maintaining attention and suppressing attention to irrelevant stimuli (Capotosto et al., 2009; Shulman et al., 2009). Our findings signal specifically the anomalous connection of nodes lying on the most posterior part of DA and nodes of the visual cortex. One important pathway for visual information processing connects these brain regions: the dorsal stream. The dorsal stream and the ventral stream were proposed as two routes with differentiated functions in the early processing of visual information (Goodale and Milner, 1992): the dorsal stream, running from the visual cortex to PPC, is typically thought to be involved in the construction of a detailed map of the visual field, spatial awareness, the detection and analysis of motion and the guidance of actions and coordination in space; on the other hand, the ventral stream, originating in the visual cortex and leading to the temporal lobe, is proposed to be related to visual identification and recognition.

The dorsal stream, which is tentatively being featured in our findings, has been demonstrated with clinical and experimental evidence to play an essential role in the ability of shifting spatial attention, that is, to disengage attention from a location and to engage attention to another location (Sciberras-Lim and Lambert, 2017). Although in patients with ADHD there is a disregulation in attentional resources allocation, it is still not clear which dimensions of attention are affected by the disorder. The affectation of visuospatial attention is a controversial topic (Roberts et al., 2017), with findings pointing to opposite ways. Many authors point to the fact that given the wide spectrum of impairments that patients exhibit, their phenotyping by a particular dimension of attention might not be possible (Roberts et al., 2017). Nevertheless, what is widely accepted is that patients with ADHD display a deficit in visuospatial working memory (van Ewijk et al., 2013), the capacity to maintain a representation of visuospatial information for a brief period of time (Vecera and Rizzo, 2003). In particular, visuospatial working memory is involved, for example, in the retrieval and manipulation of recent images for orientation in space and for keeping track of moving objects. As we have explained, the dorsal stream plays a critical role in the integration of visual information to produce the egocentric spatial map on which the subject relies onto detect motion or to guide his/her actions in space. Therefore, there is coherence between the impairment of visuospatial working memory in ADHD and the tentative appearance of the dorsal streams in our findings.

4.2.2.3. Other Relevant Anomalies

Given the coherence between the results obtained after RLET( $\bar{Σ}$ )-harmonization and established ADHD pathophysiology, we have reasons to think that default mode network (DMN), sensorimotor (MOT), visual (VIS), dorsal attention (DA), and frontoparietal (FP) nodes play very important roles in the disorder and that the right lateral occipital cortex is a region that needs to be examined more carefully in subsequent analyses. When thresholding the frequency matrix at F_ij ≥ 4, we are left with 8 anomalous connections. Within these 8 anomalies, we can find the aforementioned OFC-striatum and fronto-cerebellar pathways, and 5 of them involve DMN. If we take the maximum threshold F_ij = 5, 4 of these previous 8 anomalies remain: left superior frontal gyrus (MOT) with right superior frontal gyrus (DMN), right postcentral gyrus (MOT) with right lateral occipital cortex superior division (DMN), right occipital pole (VIS) with right lateral occipital cortex superior division (FP), and right lateral occipital cortex superior division (DMN) with right supramarginal gyrus anterior division (DA). Interestingly, 3 out of 4 feature the same brain regions, the right lateral occipital cortex superior division, being associated with DMN and FP functional networks, depending on the particular node of the region.

5. Conclusion

In conclusion, we have collected evidence that supports further research on multi-site dataset harmonization when using functional connectivity matrices. Specifically, we have shown that the distribution of matrices in the SPD matrices manifold is site-biased. Moreover, we have been able to discard two adapted domain adaptation approaches and prove that our proposal (RLET( $\bar{Σ}$ )) is the most reliable one in a clinical setting. Remarkably, functional analyses of ADHD-200 data after RLET( $\bar{Σ}$ )-harmonization present a high correlation with neurobiological findings in ADHD. However, intra-site bias was not removed and, to our point of view, this issue needs to be addressed, possibly by using our modification of RLET as a first approximation. It would also be valuable to use a more extended dataset for further research and to rely on more permutations in each permutation test to increase statistical power. Nevertheless, given our results, we think there are reasons for reanalyzing functional data from multi-site datasets corresponding to other disorders, given their pervasive use in functional neuroscience.

Data Availability Statement

The datasets presented in this study can be found in online repositories. The names of the repository and accession number can be found below: Figshare, https://doi.org/10.6084/m9.figshare.16437534.v1.

Author Contributions

GS performed the study conception and design, data analysis, code implementation, and material and figures preparation. The first draft of the manuscript was written by GS. DP, GP, and OC provided methodological guidance according to their role of master thesis' supervisors and senior authorship, and revised the manuscript. All authors contributed to the article and approved the submitted version.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Acknowledgments

We thank the reviewers for their useful and constructive comments and suggestions. We also thank the publisher Frontiers for the partial publication fee waiver.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fninf.2022.769274/full#supplementary-material

References

ADHD-200 Consortium (2017). The Adhd-200 Sample. Available online at: https://fcon_1000.projects.nitrc.org/indi/adhd200/.

Alqahtani, Y., McGuire, M., Chakraborty, J., and Feng, J. H. (2019). “Understanding how ADHD affects visual information processing,” in Universal Access in Human-Computer Interaction. Multimodality and Assistive Environments (Orlando, FL: Springer International Publishing), 23–31.

Google Scholar

Arsigny, V., Fillard, P., Pennec, X., and Ayache, N. (2005). “Fast and simple calculus on tensors in the log-euclidean framework,” in Lecture Notes in Computer Science (Berlin; Heidelberg: Springer Berlin Heidelberg), 115–122.

Google Scholar

Beer, J. C., Tustison, N. J., Cook, P. A., Davatzikos, C., Sheline, Y. I., Shinohara, R. T., et al. (2020). Longitudinal ComBat: a method for harmonizing longitudinal multi-scanner imaging data. Neuroimage 220, 117129. doi: 10.1016/j.neuroimage.2020.117129

PubMed Abstract | CrossRef Full Text | Google Scholar

Bellec, P., Chu, C., Chouinard-Decorte, F., Benhajali, Y., Margulies, D. S., and Craddock, R. C. (2017). The neuro bureau ADHD-200 preprocessed repository. Neuroimage 144, 275–286. doi: 10.1016/j.neuroimage.2016.06.034

PubMed Abstract | CrossRef Full Text | Google Scholar

Brain, D., and Webb, G. I. (2002). “The need for low bias algorithms in classification learning from large data sets,” in Principles of Data Mining and Knowledge Discovery, eds T. Elomaa, H. Mannila, and H. Toivonen (Helsinki).

Google Scholar

Capotosto, P., Babiloni, C., Romani, G. L., and Corbetta, M. (2009). Frontoparietal cortex controls spatial attention through modulation of anticipatory alpha rhythms. J. Neurosci. 29, 5863–5872. doi: 10.1523/JNEUROSCI.0539-09.2009

PubMed Abstract | CrossRef Full Text | Google Scholar

Castellanos, F. X., and Proal, E. (2012). Large-scale brain systems in ADHD: beyond the prefrontal-striatal model. Trends Cogn. Sci. 16, 17–26. doi: 10.1016/j.tics.2011.11.007

PubMed Abstract | CrossRef Full Text | Google Scholar

Craddock, R. C. (2011). Athena Pipeline. Available online at: https://www.nitrc.org/plugins/mwiki/index.php/neurobureau:AthenaPipeline.

David, S. P., Ware, J. J., Chu, I. M., Loftus, P. D., Fusar-Poli, P., Radua, J., et al. (2013). Potential reporting bias in fMRI studies of the brain. PLoS ONE 8, e70104. doi: 10.1371/journal.pone.0070104

PubMed Abstract | CrossRef Full Text | Google Scholar

Dickstein, S. G., Bannon, K., Castellanos, F. X., and Milham, M. P. (2006). The neural correlates of attention deficit hyperactivity disorder: an ALE meta-analysis. J. Child Psychol. Psychiatry 47, 1051–1062. doi: 10.1111/j.1469-7610.2006.01671.x

PubMed Abstract | CrossRef Full Text | Google Scholar

Dillo, W., Göke, A., Prox-Vagedes, V., Szycik, G. R., Roy, M., Donnerstag, F., et al. (2010). Neuronal correlates of adhd in adults with evidence for compensation strategies – a functional mri study with a go/no-go paradigm. Ger. Med. Sci. 8, Doc09. doi: 10.3205/000098

PubMed Abstract | CrossRef Full Text | Google Scholar

Dodero, L., Minh, H. Q., Biagio, M. S., Murino, V., and Sona, D. (2015). “Kernel-based classification for brain connectivity graphs on the riemannian manifold of positive definite matrices,” in 2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI) (Brooklyn, NY: IEEE).

Google Scholar

Du, Y., Fu, Z., and Calhoun, V. D. (2018). Classification and prediction of brain disorders using functional connectivity: promising but challenging. Front. Neurosci. 12, 525. doi: 10.3389/fnins.2018.00525

PubMed Abstract | CrossRef Full Text | Google Scholar

Durston, S., van Belle, J., and de Zeeuw, P. (2011). Differentiating frontostriatal and fronto-cerebellar circuits in attention-deficit/hyperactivity disorder. Biol. Psychiatry 69, 1178–1184. doi: 10.1016/j.biopsych.2010.07.037

PubMed Abstract | CrossRef Full Text | Google Scholar

Fortin, J.-P., Cullen, N., Sheline, Y. I., Taylor, W. D., Aselcioglu, I., Cook, P. A., et al. (2018). Harmonization of cortical thickness measurements across scanners and sites. Neuroimage 167, 104–120. doi: 10.1016/j.neuroimage.2017.11.024

PubMed Abstract | CrossRef Full Text | Google Scholar

Fortin, J.-P., Parker, D., Tunç, B., Watanabe, T., Elliott, M. A., Ruparel, K., et al. (2017). Harmonization of multi-site diffusion tensor imaging data. Neuroimage 161, 149–170. doi: 10.1016/j.neuroimage.2017.08.047

PubMed Abstract | CrossRef Full Text | Google Scholar

Goodale, M. A., and Milner, A. (1992). Separate visual pathways for perception and action. Trends Neurosci. 15, 20–25. doi: 10.1016/0166-2236(92)90344-8

PubMed Abstract | CrossRef Full Text | Google Scholar

Greicius, M. (2008). Resting-state functional connectivity in neuropsychiatric disorders. Curr. Opin. Neurol. 21, 424–430. doi: 10.1097/WCO.0b013e328306f2c5

PubMed Abstract | CrossRef Full Text | Google Scholar

Johnson, W. E., Li, C., and Rabinovic, A. (2006). Adjusting batch effects in microarray expression data using empirical bayes methods. Biostatistics 8, 118–127. doi: 10.1093/biostatistics/kxj037

PubMed Abstract | CrossRef Full Text | Google Scholar

Kessler, D., Angstadt, M., Welsh, R. C., and Sripada, C. (2014). Modality-spanning deficits in attention-deficit/hyperactivity disorder in functional networks, gray matter, and white matter. J. Neurosci. 34, 16555–16566. doi: 10.1523/JNEUROSCI.3156-14.2014

PubMed Abstract | CrossRef Full Text | Google Scholar

Kim, J., Wozniak, J. R., Mueller, B. A., and Pan, W. (2015). Testing group differences in brain functional connectivity: Using correlations or partial correlations? Brain Connect. 5, 214–231. doi: 10.1089/brain.2014.0319

PubMed Abstract | CrossRef Full Text | Google Scholar

Low, A.-M., Vangkilde, S., le Sommer, J., Fagerlund, B., Glenthøj, B., Jepsen, J. R. M., et al. (2018). Visual attention in adults with attention-deficit/hyperactivity disorder before and after stimulant treatment. Psychol. Med. 49, 2617–2625. doi: 10.1017/S0033291718003628

PubMed Abstract | CrossRef Full Text | Google Scholar

Ng, B., Dressler, M., Varoquaux, G., Poline, J. B., Greicius, M., and Thirion, B. (2014). “Transport on riemannian manifold for functional connectivity-based classification,” in Medical Image Computing and Computer-Assisted Intervention –MICCAI 2014 (Boston, MA: Springer International Publishing), 405–412.

Google Scholar

Nichols, T. E., and Holmes, A. P. (2001). Nonparametric permutation tests for functional neuroimaging: a primer with examples. Hum. Brain Mapp. 15, 1–25. doi: 10.1002/hbm.1058

PubMed Abstract | CrossRef Full Text | Google Scholar

Papp, S., Tombor, L., Kakuszi, B., Balogh, L., Réthelyi, J. M., Bitter, I., et al. (2020). Impaired early information processing in adult ADHD: a high-density ERP study. BMC Psychiatry 20, 292. doi: 10.1186/s12888-020-02706-w

PubMed Abstract | CrossRef Full Text | Google Scholar

Pennec, X. (2006). Intrinsic statistics on riemannian manifolds: basic tools for geometric measurements. J. Math. Imaging Vis. 25, 127–154. doi: 10.1007/s10851-006-6228-4

CrossRef Full Text | Google Scholar

Pennec, X., Fillard, P., and Ayache, N. (2006). A riemannian framework for tensor computing. Int. J. Comput. Vis. 66, 41–66. doi: 10.1007/s11263-005-3222-z

CrossRef Full Text | Google Scholar

Roberts, M., Ashinoff, B. K., Castellanos, F. X., and Carrasco, M. (2017). When attention is intact in adults with ADHD. Psychonomic Bull. Rev. 25, 1423–1434. doi: 10.3758/s13423-017-1407-4

PubMed Abstract | CrossRef Full Text | Google Scholar

Rubia, K. (2011). “cool” inferior frontostriatal dysfunction in attention-deficit/hyperactivity disorder versus “hot” ventromedial orbitofrontal-limbic dysfunction in conduct disorder: a review. Biol. Psychiatry 69, e69–e87. doi: 10.1016/j.biopsych.2010.09.023

PubMed Abstract | CrossRef Full Text | Google Scholar

Schneider, M. F., Krick, C. M., Retz, W., Hengesch, G., Retz-Junginger, P., Reith, W., et al. (2010). Impairment of fronto-striatal and parietal cerebral networks correlates with attention deficit hyperactivity disorder (ADHD) psychopathology in adults —a functional magnetic resonance imaging (fMRI) study. Psychiatry Res. 183, 75–84. doi: 10.1016/j.pscychresns.2010.04.005

PubMed Abstract | CrossRef Full Text | Google Scholar

Sciberras-Lim, E. T., and Lambert, A. J. (2017). Attentional orienting and dorsal visual stream decline: review of behavioral and EEG studies. Front. Aging Neurosci. 9, 246. doi: 10.3389/fnagi.2017.00246

PubMed Abstract | CrossRef Full Text | Google Scholar

Shulman, G. L., Astafiev, S. V., Franke, D., Pope, D. L. W., Snyder, A. Z., McAvoy, M. P., et al. (2009). Interaction of stimulus-driven reorienting and expectation in ventral and dorsal frontoparietal and basal ganglia-cortical networks. J. Neurosci. 29, 4392–4407. doi: 10.1523/JNEUROSCI.5609-08.2009

PubMed Abstract | CrossRef Full Text | Google Scholar

Slavakis, K., Salsabilian, S., Wack, D., Muldoon, S., Baidoo-Williams, H., Vettel, J., et al. (2017). Riemannian-geometry-based modeling and clustering of network-wide non-stationary time series: the brain-network case. ArXiv [preprint] abs/1701.07767. doi: 10.1117/12.2274405

CrossRef Full Text | Google Scholar

Smitha, K., Raja, K. A., Arun, K., Rajesh, P., Thomas, B., Kapilamoorthy, T., et al. (2017). Resting state fMRI: a review on methods in resting state connectivity analysis and resting state networks. Neuroradiol. J. 30, 305–317. doi: 10.1177/1971400917697342

PubMed Abstract | CrossRef Full Text | Google Scholar

van den Heuvel, M. P., and Pol, H. E. H. (2010). Exploring the brain network: a review on resting-state fMRI functional connectivity. Eur. Neuropsychopharmacol. 20, 519–534. doi: 10.1016/j.euroneuro.2010.03.008

PubMed Abstract | CrossRef Full Text | Google Scholar

van der Maaten, L., and Hinton, G. (2008). Visualizing data using t-sne. J. Mach. Learn. Res. 9, 2579–2605. Available online at: https://www.jmlr.org/papers/v9/vandermaaten08a.html

Google Scholar

van Ewijk, H., Heslenfeld, D. J., Luman, M., Rommelse, N. N., Hartman, C. A., Hoekstra, P., et al. (2013). Visuospatial working memory in ADHD patients, unaffected siblings, and healthy controls. J. Atten. Disord. 18, 369–378. doi: 10.1177/1087054713482582

PubMed Abstract | CrossRef Full Text | Google Scholar

Vance, A., Silk, T. J., Casey, M., Rinehart, N. J., Bradshaw, J. L., Bellgrove, M. A., et al. (2007). Right parietal dysfunction in children with attention deficit hyperactivity disorder, combined type: a functional MRI study. Mol. Psychiatry 12, 826–832. doi: 10.1038/sj.mp.4001999

PubMed Abstract | CrossRef Full Text | Google Scholar

Varoquaux, G., Baronnet, F., Kleinschmidt, A., Fillard, P., and Thirion, B. (2010). “Detection of brain functional-connectivity difference in post-stroke patients using group-level covariance modeling,” in Medical Image Computing and Computer-Assisted Intervention –MICCAI 2010 (Berlin; Heidelberg: Springer Berlin Heidelberg), 200–208.

Google Scholar

Vecera, S. P., and Rizzo, M. (2003). Spatial attention: normal processes and their breakdown. Neurol Clin. 21, 575–607. doi: 10.1016/S0733-8619(02)00103-2

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, J.-B., Zheng, L.-J., Cao, Q.-J., Wang, Y.-F., Sun, L., Zang, Y.-F., et al. (2017). Inconsistency in abnormal brain activity across cohorts of ADHD-200 in children with attention deficit hyperactivity disorder. Front. Neurosci. 11, 320. doi: 10.3389/fnins.2017.00320

PubMed Abstract | CrossRef Full Text | Google Scholar

Yair, O., Ben-Chen, M., and Talmon, R. (2019). Parallel transport on the cone manifold of SPD matrices for domain adaptation. IEEE Trans. Signal Process. 67, 1797–1811. doi: 10.1109/TSP.2019.2894801

CrossRef Full Text | Google Scholar

Yeo, B. T. T., Krienen, F. M., Sepulcre, J., Sabuncu, M. R., Lashkari, D., Hollinshead, M., et al. (2011). The organization of the human cerebral cortex estimated by intrinsic functional connectivity. J. Neurophysiol. 106, 1125–1165. doi: 10.1152/jn.00338.2011

PubMed Abstract | CrossRef Full Text | Google Scholar

You, K., and Park, H.-J. (2021). Re-visiting riemannian geometry of symmetric positive definite matrices for the analysis of functional connectivity. Neuroimage 225, 117464. doi: 10.1016/j.neuroimage.2020.117464

PubMed Abstract | CrossRef Full Text | Google Scholar

Yu, M., Linn, K. A., Cook, P. A., Phillips, M. L., McInnis, M., Fava, M., et al. (2018). Statistical harmonization corrects site effects in functional connectivity measurements from multi-site fMRI data. Hum. Brain Mapp. 39, 4213–4227. doi: 10.1002/hbm.24241

PubMed Abstract | CrossRef Full Text | Google Scholar

Keywords: multi-site dataset, resting-state, functional connectivity, harmonization, Riemannian geometry, attention-deficit/hyperactivity disorder

Citation: Simeon G, Piella G, Camara O and Pareto D (2022) Riemannian Geometry of Functional Connectivity Matrices for Multi-Site Attention-Deficit/Hyperactivity Disorder Data Harmonization. Front. Neuroinform. 16:769274. doi: 10.3389/fninf.2022.769274

Received: 01 September 2021; Accepted: 25 April 2022;
Published: 23 May 2022.

Edited by:

Pedro Antonio Valdes-Sosa, University of Electronic Science and Technology of China, China

Reviewed by:

Lipeng Ning, Brigham and Women's Hospital and Harvard Medical School, United States
Ronaldo García Reyes, Cuban Neuroscience Center, Cuba

Copyright © 2022 Simeon, Piella, Camara and Pareto. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Guillem Simeon, Z3VpbGxlbS5zaW1lb24mI3gwMDA0MDt1cGYuZWR1

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

ORIGINAL RESEARCH article

Riemannian Geometry of Functional Connectivity Matrices for Multi-Site Attention-Deficit/Hyperactivity Disorder Data Harmonization

1. Introduction

2. Materials and Methods

2.1. Functional Connectivity Matrices From ADHD-200

2.2. Riemannian Geometry of SPD Matrices

2.3. Geometry-Aware Permutation Testing

2.4. Geometry-Aware Site Harmonization

2.4.1. Matrix Whitening at Identity

2.4.2. Parallel Transport

2.4.3. Rigid Log-Euclidean Translation

2.5. Pair-Wise Distances and Low-Dimensional Embedding

2.6. Functional Analysis

3. Results

3.1. Pair-Wise Distances and Distribution of Matrices

3.2. Numerical Assessment of Harmonization Properties

3.3. Entry-Wise ADHD/HC Differences

3.4. Intra-Network and Inter-network Anomalies Distribution

3.5. RLET(Σ¯)-Harmonization Results

4. Discussion

4.1. Imaging-Site Bias

4.2. Abnormal Functional Connectivity Findings

4.2.1. Large-Scale Networks in ADHD

4.2.2. Pathophysiology of ADHD

4.2.2.1. Prefrontal Cortex

4.2.2.2. Visual Nodes

4.2.2.3. Other Relevant Anomalies

5. Conclusion

Data Availability Statement

Author Contributions

Conflict of Interest

Publisher's Note

Acknowledgments

Supplementary Material

References

3.5. RLET( $\bar{Σ}$ )-Harmonization Results