Abstract
The manifold under consideration consists of the faithful normal states on a sigma-finite von Neumann algebra in standard form. Tangent planes and approximate tangent planes are discussed. A relative entropy/divergence function is assumed to be given. It is used to generalize the notion of an exponential arc connecting one state to another. The generator of the exponential arc is shown to be unique up to an additive constant. In the case of Araki’s relative entropy, every self-adjoint element of the von Neumann algebra generates an exponential arc. The generators of the composed exponential arcs are shown to add up. The metric derived from Araki’s relative entropy is shown to reproduce the Kubo–Mori metric. The latter is the metric used in linear response theory. The e- and m-connections describe a dual pair of geometries. Any finite number of linearly independent generators determines a submanifold of states connected to a given reference state by an exponential arc. Such a submanifold is a quantum generalization of a dually flat statistical manifold.
1 Introduction
The goal of the present paper is to show that the theory of quantum statistical manifolds can be formulated without reference to density matrices. It is traditional to describe the statistical state of a quantum model by a density matrix. In many cases this suffices, in particular when the Hilbert space of wave functions is finite-dimensional. However, even simple models such as the quantum harmonic oscillator or the hydrogen atom require an infinite-dimensional Hilbert space. This involves handling unbounded operators, which causes considerable technical complications. These complications are avoided in the present work.
A one-to-one correspondence between density matrices and quantum states is usually accepted. The quantum states form the sample space of the statistical description. An alternative description emerged in the past century, which introduced the notion of a mathematical state on an algebra of observables which can be realized as an algebra of bounded operators on Hilbert space. See for instance [1–5].
Equilibrium states of quantum statistical mechanics are described by the quantum analogue of the probability distribution of Gibbs, which is a density matrix ρ of the form
$$\rho = \frac{1}{Z}\,e^{-\beta H},$$
with H a Hermitian matrix, β the inverse temperature, and Z a function of β used to normalize the density matrix ρ so that its trace equals 1. Models described in this way can belong to a quantum exponential family. They possess an intriguing property called the Kubo–Martin–Schwinger (KMS) condition [6]. The KMS condition describes a symmetry property of the time evolution of quantum states. This symmetry coincides with the symmetry between left and right multiplication of operators, which is studied in the Tomita–Takesaki theory [7]. Reference [5] can be used as a text for this theory.
The notion of a statistical manifold is studied in information geometry ([8–12]). It is a manifold of probability distributions. The quantum analogue is described in Chapter 7 of [11] as a manifold of k by k density matrices. The book of Petz [13] reviews several aspects of quantum statistics, including the basics of quantum information and quantum information geometry.
The generalization of Amari’s dually flat geometry from statistical models with a finite number of parameters to Banach manifolds of mutually equivalent probability measures started with the work of [14]. Non-commutative versions were formulated by [15–19].
The convex set of faithful normal states on a σ-finite von Neumann algebra is in general not a Banach manifold. The point of view taken in the present work is that the set should, by definition, be a quantum statistical manifold. This raises the question of how to transfer common notions of differential geometry and of Banach manifolds to this quantum setting. The present work contributes to this effort.
The relative entropy of Umegaki [20] is the starting point to implement Amari’s dually flat geometry on the quantum manifold. It should be noted that relative entropy is called a divergence function in mathematical literature. Araki [21–23] generalizes Umegaki’s relative entropy to the context of mathematical states on an algebra of bounded operators on a Hilbert space. The use of Araki’s relative entropy replacing that of Umegaki’s is the core of the present work.
Exponential arcs were introduced in [24, 25] and used in [26]. These arcs can be considered one-parameter exponential families embedded in the manifold. The maximal exponential model centered at a given probability distribution p equals the set of all probability distributions connected to p by an open exponential arc. Exponential arcs were studied in the quantum setting by [27]. Here, the definition is generalized. The exponential arcs are used to define quantum statistical manifolds as submanifolds of the manifold of all quantum states.
The Radon–Nikodym Theorem plays an important role in probability theory. For each measure absolutely continuous with respect to the reference measure, there exists an essentially unique probability distribution function. The problem that arises in the non-commutative context is the non-uniqueness of the Radon–Nikodym derivative. This leads to different definitions of the relative entropy and of the exponential arcs. First attempts to reformulate the theory of the quantum statistical manifold in terms of states on a C*-algebra are found in [28,29] and in [27]. These two approaches differ in the choice of the Radon–Nikodym derivative. In the present work, the definition of an exponential arc is generalized so that it depends explicitly on the choice of relative entropy and in that way on the choice of the Radon–Nikodym derivative.
The alternative approach of [30] relies on the Lie Theory for the group of bounded operators with bounded inverse. The state space is partitioned into the disjoint union of the orbits of an action of the Lie group. Under mild conditions, it is shown that the orbits are Banach manifolds. The restriction to bounded operators implies that the orbits do not connect quasi-equivalent states when the Radon–Nikodym derivatives are unbounded operators.
Sections 2–4 give a short introduction on KMS states, on the theory of the modular operator, and on positive cones. Section 5 gives a definition of the manifold under study as the convex set of faithful normal states on a sigma-finite von Neumann algebra. The tangent space consists of linear functionals on the algebra. Its extent depends on the chosen topology, and it is not obvious how to find a good compromise. Therefore, the notion of approximate tangent vectors is considered in Section 6.
A dense subset of the manifold consists of states majorized by a multiple of the reference state. This subset of states is mentioned in Section 7 because it is easier to handle.
Section 8 gives a new definition of exponential arcs. It generalizes existing concepts and is broad enough to cover different approaches. The definition depends on the choice of a relative entropy/divergence function. Such an exponential arc can be seen as a one-dimensional sub-manifold and as a straightforward example of a quantum statistical manifold. Duality properties well-known for models of information geometry are elaborated in Section 9.
The important example of the algebra of n-by-n matrices is considered in Section 10.
Starting with Section 11, the paper specializes to the case of Araki’s relative entropy. It is shown in Section 13 that each self-adjoint element h of the von Neumann algebra generates an exponential arc, relative to Araki’s relative entropy, starting at the reference state ω. The initial derivative of the arc exists as a Fréchet derivative and belongs to the tangent plane at ω. The inner product between two such tangent vectors reproduces the metric which is used in the Kubo–Mori theory of linear response. This is shown in Section 14. The exponential arcs are shown to be geodesics for the e-connection which is, by definition, the dual of the m-connection.
Section 16 applies the results obtained so far to show that manifolds generated by a finite number of exponential arcs have the properties one expects from a quantum statistical manifold.
A few points of concern are discussed in the final Section 17.
2 KMS states
Equilibrium states of quantum statistical mechanics satisfy the KMS condition. In the GNS representation, an equilibrium state becomes a faithful state on a σ-finite von Neumann algebra of operators on a complex Hilbert space. The state is defined by a normalized cyclic and separating vector in the Hilbert space.
The state of a model of statistical physics can be described by a mathematical state on a C*-algebra. It can be represented by a normalized vector Ω (a wave function) in a Hilbert space. This is known as the GNS (Gelfand–Naimark–Segal) representation theorem. Observable quantities are represented by self-adjoint operators on this Hilbert space. The quantum expectation ⟨x⟩ of an operator x is then given by
$$\langle x\rangle = (x\Omega,\Omega),$$
where the right-hand side is the scalar product of the two vectors xΩ and Ω. It should be noted that the mathematical convention is followed that the scalar product (inner product) is linear in its first argument and conjugate-linear in the second argument. In Dirac’s bra-ket notation, it reads
$$\langle x\rangle = \langle\Omega|x|\Omega\rangle.$$
For convenience, one works with a von Neumann algebra of bounded operators on the Hilbert space. Observables of interest, when unbounded, are represented by operators affiliated with this algebra. The state ω on the C*-algebra extends to a vector state on the von Neumann algebra, again denoted ω. It is given by
$$\omega(x) = (x\Omega,\Omega).$$
The vector Ω is cyclic for the von Neumann algebra, which means that the subspace of vectors xΩ, with x in the algebra, is dense in the Hilbert space. It is also assumed in what follows that the state ω is faithful, i.e., ω(x*x) = 0 implies x = 0. This implies that Ω is a separating vector for the von Neumann algebra, i.e., xΩ = 0 implies x = 0 for any x in the algebra, and hence it is a cyclic vector for the commutant, the algebra of all bounded operators commuting with all elements of the von Neumann algebra.
Equilibrium states of statistical mechanics are characterized by the KMS (Kubo–Martin–Schwinger) condition [6]. Roughly speaking, this condition states that the quantum time evolution of the model has an analytic extension into the complex plane. This is made more precise in what follows.
The time evolution is described by a strongly continuous one-parameter group of unitary operators ut which leave the algebra unchanged, i.e., if x belongs to the algebra, then the time-evolved operator xt belongs to it for all t. The operators ut are determined by a self-adjoint operator H, which is the generator of the time evolution in the GNS representation. The time derivative of xt satisfies
This equation has the same form as Heisenberg’s equation of motion.
The KMS condition requires that for any pair x, y of operators in , there exists a complex function F(w), defined and continuous on the strip −β ≤ Im w ≤ 0 and analytical inside with boundary values
In the mathematics literature, the parameter β, which is the inverse temperature of the model, is usually taken equal to 1 or -1.
An immediate consequence of the KMS condition is that the state ω is invariant. Indeed, take y equal to the identity operator. Then, one has F(t − iβ) = F(t) for all real t. If, in addition, x is self-adjoint, then F(t) is a real function. From the Schwarz reflection principle, one then concludes that F(w) is a constant function. This implies ω(xt) = ω(x) for all self-adjoint x and hence for all x. The GNS theorem then guarantees that the vector Ω can be taken to be invariant, i.e., utΩ = Ω for all t.
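The KMS boundary condition can be checked numerically in a finite-dimensional model. The sketch below is an illustration under stated assumptions, not the construction of the paper: it assumes a diagonal Hamiltonian with energies E_i, the Gibbs state ω(a) = Tr ρa, the Heisenberg convention (x_t)_ij = exp(it(E_i − E_j)) x_ij, and the reading F(w) = ω(x_w y) with boundary value F(t − iβ) = ω(y x_t). All function names are hypothetical.

```python
import cmath
import math

def gibbs_weights(energies, beta):
    """Diagonal entries of rho = exp(-beta*H)/Z for a diagonal H."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def kms_function(w, x, y, energies, beta):
    """F(w) = omega(x_w y), continued to complex w (assumed convention)."""
    p = gibbs_weights(energies, beta)
    n = len(energies)
    return sum(
        p[i] * cmath.exp(1j * w * (energies[i] - energies[j])) * x[i][j] * y[j][i]
        for i in range(n) for j in range(n)
    )

def omega_y_xt(t, x, y, energies, beta):
    """omega(y x_t) for real t, with the same convention for x_t."""
    p = gibbs_weights(energies, beta)
    n = len(energies)
    return sum(
        p[j] * y[j][i] * cmath.exp(1j * t * (energies[i] - energies[j])) * x[i][j]
        for i in range(n) for j in range(n)
    )

energies = [0.0, 1.3, 2.1]   # arbitrary three-level spectrum
beta = 0.7
x = [[1, 2 - 1j, 0.5], [0.3j, -1, 1], [2, 1j, 0.25]]
y = [[0.5, 1j, 1], [1, 2, -0.5j], [0, 1, 1]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t = 0.4

# Boundary value on the lower edge of the strip.
kms_gap = abs(kms_function(t - 1j * beta, x, y, energies, beta)
              - omega_y_xt(t, x, y, energies, beta))
# Invariance of the state: with y the identity, F(t) does not depend on t.
invariance_gap = abs(kms_function(t, x, identity, energies, beta)
                     - kms_function(0.0, x, identity, energies, beta))
print(kms_gap, invariance_gap)
```

Both printed gaps should vanish up to rounding, which reproduces the invariance argument of the preceding paragraph in this toy model.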
3 The modular operator
The quantum-mechanical time evolution coincides with the modular automorphism group of Tomita–Takesaki theory.
The KMS condition, when satisfied, expresses a symmetry which is present in the context of non-commuting operators. The symmetry is the inversion of the order of multiplication of operators. In non-commutative groups, the modular function links left and right Haar measures. The analogue in functional analysis is studied in the theory of the modular operator, also called the Tomita–Takesaki theory [7].
The operator $e^{-\beta H}$, with H the generator of the quantum time evolution, is traditionally denoted as $\Delta_\Omega$. It is the modular operator of the Tomita–Takesaki theory. It is in general an unbounded operator, and any vector xΩ with x in the von Neumann algebra is in the domain of the square root of $\Delta_\Omega$. Hence, the expression for F(w) is well-defined for 0 ≥ Im w ≥ −β/2. The other half of the strip 0 ≥ Im w ≥ −β is covered by the Schwarz reflection principle. Indeed, if x and y are self-adjoint, then one can show with the Tomita–Takesaki theory that the map t ↦ F(t − iβ/2) is a real function. Hence, the principle can be applied to extend F to the full strip.
The unitary time evolution operator ut can be written asThe time evolution of an operator x in the Heisenberg picture is then given by
The action of the group is called the modular automorphism group.
The modular conjugation operator J of the Tomita–Takesaki theory represents the symmetry which is at the basis of the theory. It is a conjugate-linear operator satisfying J = J* and J² = 1. An operator x belongs to the von Neumann algebra if and only if JxJ belongs to the commutant algebra. The latter is the space of operators commuting with all operators in the von Neumann algebra. The product $J\Delta_\Omega^{1/2}$ is denoted as $S_\Omega$ and has the property $S_\Omega\,x\Omega = x^*\Omega$ for all x in the von Neumann algebra.
4 Dual cones
The natural positive cone is needed in subsequent sections. One reason for making use of it is that there exists a one-to-one correspondence between normal states on the von Neumann algebra and normalized vectors in the cone.
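Section 4 of [22] introduces the cones $V_\Omega^\alpha$, 0 ≤ α ≤ 1/2, of vectors in the Hilbert space. The self-dual cone $V_\Omega^{1/4}$ is called the natural positive cone.
By definition, $V_\Omega^\alpha$ is the closure of the cone
$$\{\Delta_\Omega^{\alpha}\,x\Omega : x \text{ a positive element of the von Neumann algebra}\}.$$
The cone $V_\Omega^{1/2}$ is used in [27] to introduce exponential arcs. It is equal to the closure of the set
$$\{y y^*\Omega : y \text{ in the commutant}\}.$$
To see this, note that
$$\Delta_\Omega^{1/2}\,x^*x\,\Omega = J\,x^*x\,\Omega = y y^*\Omega$$
with y = Jx*J. The latter is an arbitrary element of the commutant.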
The following characterization of the natural positive cone is found in Section 2.5 of [5].
Proposition 1: The cone equals the closure of the set of vectors
This result can be understood as follows. Take Φ in of the form (3), i.e., Φ = xJxΩ with x in . Let
This expression can be inverted toso that
Assume now that one could prove that the operator y defined by (4) belongs to ; then, the above calculation would show that Φ is of the form with a = yy* a positive element of . The actual proof of the proposition uses that belongs to .
The cone is independent [22] of the choice of the cyclic and separating vector Ω in , and the isometry J is the same for all these choices. For this reason, it is said to be universal.
From (3), it is easy to see that each vector in the natural positive cone is an eigenvector with eigenvalue 1 of the modular conjugation operator J. Indeed, one has
$$J\,(xJx\Omega) = JxJ\,x\Omega = x\,JxJ\,\Omega = xJx\Omega.$$
Here, use is made of JΩ = Ω and the fact that the operators x and JxJ commute with each other.
5 A manifold of quantum states
A manifold of vector states on the von Neumann algebra is defined. Tangent vector fields are Fréchet derivatives of paths in the manifold.
Introduce the notation ωΦ for the vector state defined by the normalized vector Φ in the Hilbert space. It is given by
$$\omega_\Phi(x) = (x\Phi,\Phi)$$
for all x in the von Neumann algebra.
A manifold of states on the von Neumann algebra is defined by
The equilibrium state ω = ωΩ is taken as a reference point in . The subset of is the natural positive cone introduced in the previous section.
The topology on the manifold is that of the operator norm. One has
Several topologies can be defined on the algebra . Particularly relevant is the σ-weak topology. For what follows, it is important to know that in the present context, a state ω on is said to be normal if and only if it is σ-weakly continuous and if and only if it is a vector state. See for instance, Theorems 2.4.21 and 2.5.31 of [5].
Any tangent vector is a σ-weakly continuous linear functional on the von Neumann algebra. Let t ↦ γt be a Fréchet-differentiable map defined on an open interval of the real line with values in the manifold. The derivative $\dot\gamma_t$ is required to exist as a Fréchet derivative, i.e., it satisfies
$$\lim_{\epsilon\to 0}\frac{1}{|\epsilon|}\,\big\|\gamma_{t+\epsilon}-\gamma_t-\epsilon\,\dot\gamma_t\big\| = 0.$$
From the normalization γt(1) = 1 for all t in the domain of the map, one obtains $\dot\gamma_t(1) = 0$. From $\gamma_t(x^*) = \overline{\gamma_t(x)}$ one obtains $\dot\gamma_t(x^*) = \overline{\dot\gamma_t(x)}$. Hence, the linear functional $\dot\gamma_t$ is Hermitian.
There are several ways to define the tangent space at the point ω of the manifold. Intuitively, a tangent vector is a derivative, defined in some sense, of a path t ↦ γt in the manifold passing through the point ω. The states of the manifold belong to the space of all σ-weakly continuous linear functionals on the algebra, i.e., its predual (see Proposition 2.4.18 of [5]). Hence, one expects that tangent vectors belong to the predual as well.
In this section, the requirement is made that the path t ↦ γt is Fréchet-differentiable. This may be too restrictive. In what follows, we adopt the definition that the tangent space consists of all Hermitian σ-weakly continuous linear functionals χ satisfying χ(1) = 0. It should be noted that it is quite possible that for certain elements χ of this space, there is no smooth curve passing through ω with the property that the derivative at ω equals χ.
6 Approximate tangents
Approximate tangent vectors can be defined in an intrinsic manner.
An alternative definition of the tangent space starts from the following observation.
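Proposition 2: The set of linear functionals of the form λ(ϕ − ψ), with λ a real number and ϕ, ψ states in the manifold satisfying ϕ + ψ = 2ω, is a linear subspace of the tangent space at ω.
Proof:
Let ϕ and ψ be two states in the manifold such that ϕ + ψ = 2ω. Construct a Fréchet-differentiable path γ by
$$\gamma_t = (1-t)\,\psi + t\,\phi,\qquad 0<t<1.$$
The state γt belongs to the manifold because the latter is a convex set. In particular, one has ω = γ1/2 and $\dot\gamma_{1/2} = \phi-\psi$ is a tangent vector. This shows that ϕ − ψ and hence also λ(ϕ − ψ) belongs to the tangent space at ω. One concludes that the set is contained in the tangent space.
Assume now that λ(ϕ − ψ) and λ′(ϕ′ − ψ′) both belong to the set. We have to show that the sum λ(ϕ − ψ) + λ′(ϕ′ − ψ′) belongs to the set as well. If λ = 0 or λ′ = 0, then the claim is clearly satisfied. Without restriction, assume λ = 1.
If λ′ > 0, then choose
$$\phi'' = \frac{\phi+\lambda'\phi'}{1+\lambda'},\qquad \psi'' = \frac{\psi+\lambda'\psi'}{1+\lambda'}.$$
Both ϕ″ and ψ″ belong to the manifold because the latter is a convex set. One verifies that ϕ″ + ψ″ = 2ω and
$$(1+\lambda')\,(\phi''-\psi'') = (\phi-\psi)+\lambda'(\phi'-\psi').$$
This shows that the latter sum belongs to the set.
In the case that λ′ < 0, one chooses
$$\phi'' = \frac{\phi-\lambda'\psi'}{1-\lambda'},\qquad \psi'' = \frac{\psi-\lambda'\phi'}{1-\lambda'}$$
to reach the same conclusion. This finishes the proof that the set is a linear subspace of the tangent space.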
We introduce the notationsand
The construction of is analogous to the construction of the approximate tangent space in Chapter 3 of [31]. Clearly, . Further properties are derived below.
Proposition 3: If γ is a Fréchet-differentiable path in , then belongs to with ω = γt.
Proof:
Let γ be a Fréchet-differentiable path in . Without restriction of generality, assume that γ0 = ω. For any ϵ > 0 and δ > 0, there exists t ≠ 0 such thatand
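Then, ϕ defined by $\phi = (\gamma_t+\gamma_{-t})/2$ satisfies ‖ϕ − ω‖ < ϵ, and (γt − γ−t)/2t belongs to the subspace of Proposition 2 constructed at the point ϕ. Hence, (5) shows that the tangent vector belongs to the closure of this subspace. Because ϵ > 0 is arbitrary, it also belongs to the intersection over all ϵ > 0, which is the approximate tangent space.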
Lemma 1: is a linear subspace of .
Proof:
Take χ and ξ in . There exist ϕ and ψ in such that and with ‖ϕ—ω‖ < ϵ and ‖ψ—ω‖ < ϵ. Therefore, there exist real λ, μ and states ϕ1, Φ2, Ψ1, Ψ2 in such thatand
If λ = 0 or μ = 0, then χ + ξ belongs to without further argument. Assume, therefore, that λ ≠ 0 and μ ≠ 0. If λμ > 0, then χ + ξ belongs to , with π = (1 − α)ϕ + αψ and α given by
Indeed, let
Then both π1 and π2 belong to and satisfyand
In addition,
One concludes that in this case, χ + ξ belongs to .
The case that λμ < 0 is similar. That implies is straightforward. One can conclude that is a linear space. It clearly is a subspace of .
Proposition 4: is a closed linear subspace of .
Proof:
The lemma shows that is a linear subspace of , which is a space closed in norm. Hence, also the norm closure of is a subset of this space and therefore also of .
7 Majorized states
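The subset of states majorized by a multiple of the reference state ω is considered.
Definition 1: A state ϕ on the von Neumann algebra is said to be majorized by a multiple of the state ω if there exists a positive constant λ such that
$$\phi(x)\le\lambda\,\omega(x)\qquad\text{for all positive } x \text{ in the algebra}.$$
Take a′ ≠ 0 in the commutant algebra and let
$$\Phi = \frac{a'\Omega}{\|a'\Omega\|}.$$
Then, the state ωΦ is majorized by a multiple of the state ω. Indeed, one has for any positive x in the von Neumann algebra
$$\omega_\Phi(x) = \frac{(x\,a'\Omega,a'\Omega)}{\|a'\Omega\|^2} = \frac{\|a'x^{1/2}\Omega\|^2}{\|a'\Omega\|^2}\le\frac{\|a'\|^2}{\|a'\Omega\|^2}\,\omega(x).$$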
It is well-known that all states majorized by a multiple of the state ω are obtained in this way. This is the content of the following proposition.
Proposition 5: If the vector state ωΦ is majorized by a multiple of the state ω, then there exists a unique element a′ of the commutant such that Φ = a′Ω.
Proof:
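An operator a′ is densely defined by
$$a'\,x\Omega = x\Phi,\qquad x \text{ in the von Neumann algebra}.$$
It satisfies a′Ω = Φ. It is well-defined because xΩ = 0 implies
$$\|x\Phi\|^2 = \omega_\Phi(x^*x)\le\lambda\,\omega(x^*x) = \lambda\,\|x\Omega\|^2 = 0$$
so that xΦ = 0.
The operator a′ is bounded because
$$\|a'x\Omega\|^2 = \|x\Phi\|^2\le\lambda\,\|x\Omega\|^2.$$
The operator a′ commutes with any x in the von Neumann algebra because
$$a'x\,y\Omega = xy\Phi = x\,a'\,y\Omega$$
for all y in the algebra, and Ω is cyclic for the algebra.
The operator a′ is unique. Indeed, assume that b′ in the commutant satisfies Φ = b′Ω. Then, one has for all x in the von Neumann algebra
$$(a'-b')\,x\Omega = x\,(a'-b')\,\Omega = x\,(\Phi-\Phi) = 0.$$
Hence, a′ − b′ vanishes on the subspace of vectors xΩ, which is dense in the Hilbert space because Ω is cyclic for the algebra. Because a′ − b′ is a bounded and hence continuous operator, it vanishes everywhere so that a′ = b′.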
Item (8) of Theorem 3 of [22] implies the following.
Proposition 6: If a vector state ωΦ, defined by a vector Φ in the natural positive cone , is dominated by a multiple of the state ω, then there exists a unique element a in the algebra such that Φ = aΩ and
Proof:
Proposition 5 shows that a′ in the commutant exists such that Φ = a′Ω. Because Φ and Ω both belong to , one has Φ = JΦ = Ja′JΩ.
Let a = Ja′J. Because conjugation with J maps the commutant onto the von Neumann algebra, it follows that a belongs to the von Neumann algebra. This shows the existence.
The element a is unique because the correspondence between vector states on and vectors in is one-to-one and Ω is a separating vector for .
If is a commutative algebra, then a*a is the Radon–Nikodym derivative of the state ωΦ with respect to the reference state ω.
The subset of states of majorized by a multiple of the state ω is dense in in the sense that for any state ϕ in , there exists a sequence of elements of with the property that anΩ is a Cauchy sequence and
See Propositions 1.5 and 2.5 of [32].
Proposition 7: A tangent vector χ belongs to the subspace of the tangent space introduced in Proposition 2 if and only if it is proportional to the difference of two states ϕ and ψ in the manifold, both majorized by a multiple of the state ω.
Proof:
If χ belongs to , then by definition, there exist states ϕ and ψ in such that χ = λ(ϕ − ψ) and ϕ + ψ = 2ω. The latter implies that both ϕ and ψ are majorized by 2ω.
Conversely, assume that ϕ and ψ in are both majorized by a multiple of the state ω and let χ = λ(ϕ − ψ). This implies the existence of μ ≥ 1 and ν ≥ 1 such that ϕ ≤ μω and ψ ≤ νω.
Without restriction, assume that λ > 0.
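Introduce
$$\phi' = \omega+\rho\chi,\qquad \psi' = \omega-\rho\chi,$$
with ρ still to be chosen. By construction, it holds that ϕ′ + ψ′ = 2ω and ϕ′ − ψ′ = 2ρχ. Hence, if ϕ′ and ψ′ are states in the manifold and ρ ≠ 0, then one can conclude that χ belongs to the subspace.
From
$$\chi\le\lambda\phi\le\lambda\mu\,\omega$$
and
$$\chi\ge-\lambda\psi\ge-\lambda\nu\,\omega$$
one obtains, for ρ > 0,
$$\phi'\ge(1-\rho\lambda\nu)\,\omega,\qquad \psi'\ge(1-\rho\lambda\mu)\,\omega.$$
Let ρ be equal to the inverse of the maximum of λμ and λν to prove the positivity of the functionals ϕ′ and ψ′. Normalization ϕ′(1) = ψ′(1) = 1 follows from χ(1) = 0. The functionals are σ-weakly continuous as well. Hence, they are states in the manifold. This ends the proof that χ belongs to the subspace.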
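The construction in this proof can be tried out in the commutative case, where states are probability vectors and majorization is an entrywise inequality. The following is a minimal sketch under that assumption; the function name `split_tangent` and the sample vectors are hypothetical and serve only as an illustration.

```python
def split_tangent(omega, phi, psi, lam):
    """Commutative sketch of the proof of Proposition 7.

    omega, phi, psi are probability vectors with omega strictly positive and
    lam > 0. With chi = lam*(phi - psi), return (phi1, psi1, rho) such that
    phi1 + psi1 = 2*omega and phi1 - psi1 = 2*rho*chi.
    """
    # Smallest constants with phi <= mu*omega and psi <= nu*omega (both >= 1).
    mu = max(f / w for f, w in zip(phi, omega))
    nu = max(g / w for g, w in zip(psi, omega))
    # rho is the inverse of the maximum of lam*mu and lam*nu.
    rho = 1.0 / max(lam * mu, lam * nu)
    chi = [lam * (f - g) for f, g in zip(phi, psi)]
    phi1 = [w + rho * c for w, c in zip(omega, chi)]
    psi1 = [w - rho * c for w, c in zip(omega, chi)]
    return phi1, psi1, rho

omega = [0.5, 0.3, 0.2]
phi = [0.2, 0.2, 0.6]
psi = [0.7, 0.2, 0.1]
phi1, psi1, rho = split_tangent(omega, phi, psi, lam=2.0)
print(phi1, psi1, rho)
```

With this choice of ρ both vectors have non-negative entries and sum to one, so χ is indeed proportional to a difference of two states adding up to 2ω.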
8 Exponential arcs
[27] introduces the notion of an exponential arc in the Hilbert space, inspired by the notion of exponential arcs in probability space as introduced by [24, 25]. Here, a definition is given which depends on the choice of a relative entropy.
In the present context, a divergence function D (ϕ‖ψ) is a real function of two states ϕ and ψ in the manifold . It cannot be negative, and it vanishes if and only if the two arguments are equal. A value of + ∞ is allowed. An energy function is an affine function defined on a convex subset of the set of normal states on the algebra .
The following definition of an exponential arc in the manifold assumes that a divergence function D (ϕ‖ψ) is given.
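Definition 2: An exponential arc γ is a path t ↦ γt, 0 ≤ t ≤ 1, in the manifold for which there exists an energy function $\mathfrak{h}$ such that
• γt is in the domain of $\mathfrak{h}$;
• The divergence D(γs‖γt) between any two points of the arc is finite;
• For any state ψ in the domain of $\mathfrak{h}$, one has
$$D(\psi\|\gamma_t) = D(\psi\|\gamma_0)+D(\gamma_0\|\gamma_t)-t\,\big[\mathfrak{h}(\psi)-\mathfrak{h}(\gamma_0)\big].\tag{6}$$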
The energy function is the generator of the exponential arc. The arc is said to connect the state γ1 to the state γ0.
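A quick numerical illustration of the defining relation is possible in the commutative case. The sketch below assumes that states are probability vectors, that the divergence is the Kullback–Leibler divergence, that the energy function is ψ ↦ Σ ψ_i h_i, and that the defining relation reads D(ψ‖γt) = D(ψ‖γ0) + D(γ0‖γt) − t[h(ψ) − h(γ0)]. The arc γt ∝ γ0·exp(t·h) and all names are assumptions made for this example.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D(p||q) of two probability vectors."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

def arc_point(p0, h, t):
    """Point gamma_t of the classical exponential arc, proportional to p0*exp(t*h)."""
    weights = [a * math.exp(t * e) for a, e in zip(p0, h)]
    z = sum(weights)
    return [w / z for w in weights]

def energy(p, h):
    """Value of the affine energy function p -> sum_i p_i h_i."""
    return sum(a * e for a, e in zip(p, h))

def arc_defect(psi, p0, h, t):
    """Left-hand side minus right-hand side of the assumed defining relation."""
    g0, gt = arc_point(p0, h, 0.0), arc_point(p0, h, t)
    lhs = kl(psi, gt)
    rhs = kl(psi, g0) + kl(g0, gt) - t * (energy(psi, h) - energy(g0, h))
    return lhs - rhs

p0 = [0.5, 0.3, 0.2]     # reference state gamma_0
h = [1.0, -0.5, 2.0]     # generator
psi = [0.1, 0.6, 0.3]    # arbitrary test state
defects = [arc_defect(psi, p0, h, t) for t in (0.0, 0.3, 0.7, 1.0)]
print(defects)
```

All defects should vanish up to rounding, for any choice of ψ with full support.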
A subclass of energy functions is formed by the functions for which there exists a self-adjoint operator h in the von Neumann algebra so that the energy of each state ψ equals ψ(h).
In such a case, h is called the generator as well. The exponential arcs defined in [27] agree with the above definition with a generator defined by an unbounded operator affiliated with the commutant algebra.
Proposition 8: Expression (6) impliesand
It should be noted that with s = 0, expression (9) reduces to (6).
Proof:
Take ψ = γs in (6) to find
In particular, with s = t, this implies (8).
To prove (9), use (10) to write the right-hand side as
Next, eliminate D (γ0‖γt) and D (ψ‖γs) with the help of (6). This gives
To obtain the last line, use (8).
Corollary 1: If t ↦ γt is an exponential arc that connects γ1 to γ0, then for any s, t in [0, 1], the map ϵ ↦ γ(1−ϵ)s+ϵt is an exponential arc that connects γt to γs; its generator is (t − s) times the generator of the original arc.
Corollary 2: If t ↦ γt is an exponential arc that connects γ1 to γ0, then t ↦ γ1−t is an exponential arc connecting the state γ0 to the state γ1; its generator is the negative of the generator of the original arc.
The following two propositions deal with the uniqueness of an exponential arc and of its generator.
Proposition 9: Let ω and ϕ be two states in . Fix an energy function . There is at most one exponential arc t↦γt with generator that connects ϕ to ω.
Proof:
Assume both t↦γt and t↦δt are exponential arcs connecting the state ϕ to the state ω. Subtract (6) from the same expression with γt replaced by δt and take s = 0. This gives
Take ψ equal to δt. Then, one obtains
On the other hand, with ψ = γt, one obtains
The two expressions together yield
This implies D (γt‖δt) = 0. By the basic property of a divergence, one concludes that γt = δt.
Proposition 10: If the exponential arc t↦γt has two generators and , then these generators differ by a constant on their common domain of definition.
Proof:
It follows from (6) thatfor all states ψ in the intersection of the domains of and . This implies that a constant c exists so thatfor all ψ in the common domain.
The requirement (6) is a stability condition. The generator is a perturbation which shifts the state γ0 to the state ψ. This interpretation will become clear further on. The effect on the relative entropy of the shift along the arc t↦γt is linear. In the standard case, the relative entropy is based on the logarithmic function. This justifies calling the path t↦γt an exponential arc.
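It should be noted that the Pythagorean relation [33, 34]
$$D(\psi\|\gamma_t) = D(\psi\|\gamma_s)+D(\gamma_s\|\gamma_t)$$
is satisfied for all ψ with the same energy as the state γs, i.e., for all ψ on which the generator of the arc takes the same value as on γs.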
If the divergence function is interpreted as the square of a pseudo-distance, then the aforementioned relation states that for an arbitrary state ψ, the point γs of the arc which has the same energy is the point with minimal distance.
9 The scalar potential
The exponential arc has a dual structure similar to that found in information geometry [10, 11].
Given an exponential arc t↦γt with generator , introduce the potential Φγ defined by
Its Legendre transform is given by
Proposition 11: For any exponential arc t ↦ γt and its generator, one has
(a) The value of the generator along the arc is a strictly increasing function of t;
(b) ;
(c) The line is tangent to the potential Φγ at the point t = s; this implies that the potential Φγ(s) is a strictly convex function, continuous on the open interval (0, 1);
(d) The following identity holds:
Proof:
(a) Take ψ = γt in (6). This gives
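Because divergences cannot be negative, this implies that the value of the generator along the arc is non-decreasing. Assume now that the generator takes the same value at γs and at γt. Then, it follows that
$$D(\gamma_s\|\gamma_t) = D(\gamma_t\|\gamma_s) = 0.$$
The latter implies that s = t. One concludes that s < t implies a strict inequality between the values of the generator at γs and at γt.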
(b) From the definition of the exponential arc, one obtains
Take ψ = γ0 in this expression to find
(c) From (b), one obtains
because D(γs‖γt) ≥ 0 with equality if and only if s = t. This implies that the line of item (c) is tangent to the potential Φγ at the point s. By (a), the slope of this line is a strictly increasing function of s. Hence, the potential Φγ(s) is a strictly convex function, continuous on the open interval (0, 1).
(d) (13) implies that
On the other hand, one can use (b) to obtain
The optimal choice t = s yields the lower bound .
A dual parameter η of the exponential arc γ, dual to the parameter t, is the value of the generator at the point γt. By item (a) of the proposition, it is a strictly increasing function of t. It is almost everywhere equal to the derivative of the potential along the path.
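The monotonicity of the dual parameter can be observed numerically in the commutative case. The sketch below assumes probability vectors, the Kullback–Leibler divergence and the classical arc γt ∝ γ0·exp(t·h); these choices and all names are illustrative assumptions. It also checks the identity D(γs‖γt) + D(γt‖γs) = (t − s)[η(t) − η(s)], which is how item (a) is obtained in this setting.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D(p||q) of two probability vectors."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

def arc_point(p0, h, t):
    """Point gamma_t of the classical exponential arc, proportional to p0*exp(t*h)."""
    weights = [a * math.exp(t * e) for a, e in zip(p0, h)]
    z = sum(weights)
    return [w / z for w in weights]

def dual_parameter(p0, h, t):
    """eta(t): value of the energy function at gamma_t."""
    return sum(a * e for a, e in zip(arc_point(p0, h, t), h))

p0 = [0.5, 0.3, 0.2]
h = [1.0, -0.5, 2.0]     # not constant, so the arc is not trivial

# eta(t) sampled on a grid of [0, 1]; item (a) predicts strict growth.
grid = [k / 10 for k in range(11)]
eta = [dual_parameter(p0, h, t) for t in grid]
is_increasing = all(a < b for a, b in zip(eta, eta[1:]))

# Symmetrized divergence between two points of the arc.
s, t = 0.2, 0.9
gs, gt = arc_point(p0, h, s), arc_point(p0, h, t)
identity_gap = (kl(gs, gt) + kl(gt, gs)
                - (t - s) * (dual_parameter(p0, h, t) - dual_parameter(p0, h, s)))
print(is_increasing, identity_gap)
```

Since both divergences are non-negative, the identity forces η(t) ≥ η(s) for s < t, with equality only when the two points coincide.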
10 The matrix case
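If ρ and σ are two density matrices, then the obvious definition of an exponential arc connecting σ to ρ is
$$\sigma_t = \exp\big((1-t)\log\rho+t\log\sigma-\zeta(t)\big)$$
with normalization ζ(t) given by
$$\zeta(t) = \log\operatorname{Tr}\exp\big((1-t)\log\rho+t\log\sigma\big).$$
It is shown below that the corresponding states A ↦ Tr σtA form an exponential arc for the relative entropy of Umegaki [20] in the GNS representation of the state σ0.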
Fix a non-degenerate density matrix ρ of size n-by-n. It is a positive-definite matrix with trace Tr ρ equal to 1.
Umegaki’s relative entropy for the pair of density matrices σ, τ is given by
$$D(\sigma\|\tau) = \operatorname{Tr}\,\sigma\,(\log\sigma-\log\tau).$$
Assume now a mapwith normalization ζ(t) and with h given byThis is the obvious definition of an exponential arc in terms of density matrices. The corresponding potential iswith
The map (14) is also an exponential arc in the sense of Definition 2. To see this, consider any density matrix τ and calculate
This is of the form (6) except that the relative entropy is expressed in terms of density matrices instead of vector states in the GNS representation of the state defined by the density matrix ρ.
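The calculation can be reproduced numerically for real symmetric 2-by-2 matrices, where ρ and h need not commute. The sketch below is an illustration under stated assumptions: the arc is taken as σt = exp(log ρ + t·h − ζ(t)) with ζ(t) = log Tr exp(log ρ + t·h), Umegaki’s relative entropy is taken as D(σ‖τ) = Tr σ(log σ − log τ), and the relation tested is D(τ‖σt) = D(τ‖ρ) + D(ρ‖σt) − t[Tr τh − Tr ρh]. The helper names are hypothetical; matrix functions are computed from the closed-form eigen-decomposition of a symmetric 2-by-2 matrix.

```python
import math

def sym_fun(f, m):
    """Apply f to the eigenvalues of the real symmetric 2x2 matrix m."""
    (a, b), (_, d) = m
    mean, diff = (a + d) / 2, (a - d) / 2
    r = math.hypot(diff, b)
    theta = 0.5 * math.atan2(b, diff)   # eigenvector of mean + r is (cos, sin)
    c, s = math.cos(theta), math.sin(theta)
    f1, f2 = f(mean + r), f(mean - r)
    off = (f1 - f2) * c * s
    return [[f1 * c * c + f2 * s * s, off], [off, f1 * s * s + f2 * c * c]]

def add(m, k, alpha=1.0):
    """Return m + alpha*k."""
    return [[m[i][j] + alpha * k[i][j] for j in range(2)] for i in range(2)]

def trace_product(m, k):
    """Tr(m k)."""
    return sum(m[i][j] * k[j][i] for i in range(2) for j in range(2))

def umegaki(sigma, tau):
    """D(sigma||tau) = Tr sigma (log sigma - log tau)."""
    log_diff = add(sym_fun(math.log, sigma), sym_fun(math.log, tau), -1.0)
    return trace_product(sigma, log_diff)

def arc_matrix(rho, h, t):
    """sigma_t = exp(log rho + t*h) divided by its trace."""
    e = sym_fun(math.exp, add(sym_fun(math.log, rho), h, t))
    z = e[0][0] + e[1][1]
    return [[v / z for v in row] for row in e]

rho = [[0.6, 0.1], [0.1, 0.4]]     # non-degenerate density matrix
h = [[0.3, -0.5], [-0.5, 1.0]]     # Hermitian generator, not commuting with rho
tau = [[0.3, -0.2], [-0.2, 0.7]]   # arbitrary test density matrix
t = 0.8
sigma_t = arc_matrix(rho, h, t)
defect = (umegaki(tau, sigma_t) - umegaki(tau, rho) - umegaki(rho, sigma_t)
          + t * (trace_product(tau, h) - trace_product(rho, h)))
print(defect)
```

The printed defect should vanish up to rounding, also for other admissible choices of ρ, h, τ and t.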
An explicit construction of the GNS representation is possible. See for instance, the appendix of [28]. Let ω = σ0 denote the state determined by the density matrix ρ,
$$\omega(A) = \operatorname{Tr}\rho A$$
for any n-by-n matrix A with complex entries. Such a matrix A is represented on the Hilbert space $\mathbb{C}^n\otimes\mathbb{C}^n$ by the operator $A\otimes I$, where I is the n-by-n identity matrix. The von Neumann algebra is the space of operators $A\otimes I$.
The matrix ρ can be diagonalized. This gives the spectral representation
$$\rho = \sum_i p_i\,|e_i\rangle\langle e_i|,$$
where (ei)i is an orthonormal basis in $\mathbb{C}^n$. Let
$$\Omega = \sum_i\sqrt{p_i}\;e_i\otimes e_i.$$
It is a normalized vector in $\mathbb{C}^n\otimes\mathbb{C}^n$. One readily verifies that
$$\big((A\otimes I)\,\Omega,\Omega\big) = \operatorname{Tr}\rho A$$
for any n-by-n matrix A. In this way, any density matrix ρ defines a vector Ω in $\mathbb{C}^n\otimes\mathbb{C}^n$. The vector Ω is cyclic and separating for the von Neumann algebra if ρ is non-degenerate. Hence, there is a one-to-one correspondence between non-degenerate density matrices and states in the manifold. It is then straightforward to replace the density matrices by states in the expressions obtained in the first part of this section.
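The identity between the vector state and the trace can be verified on a small example. The sketch below assumes n = 2, a real orthonormal eigenbasis obtained by a rotation, and the vector Ω = Σ √p_i e_i ⊗ e_i stored as a 2-by-2 array of coefficients; the names and sample values are hypothetical.

```python
import math

def rotation_basis(theta):
    """Real orthonormal basis (e_1, e_2), rotated by the angle theta."""
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]

def density_matrix(p, basis):
    """rho = sum_i p_i |e_i><e_i| as a nested list."""
    n = len(p)
    return [[sum(p[i] * basis[i][a] * basis[i][b] for i in range(n))
             for b in range(n)] for a in range(n)]

def gns_vector(p, basis):
    """Coefficients Omega[a][b] of Omega = sum_i sqrt(p_i) e_i (x) e_i."""
    n = len(p)
    return [[sum(math.sqrt(p[i]) * basis[i][a] * basis[i][b] for i in range(n))
             for b in range(n)] for a in range(n)]

def vector_state(a_matrix, omega_vec):
    """((A (x) 1) Omega, Omega); Omega is real here, so no conjugation is needed."""
    n = len(omega_vec)
    return sum(a_matrix[a][c] * omega_vec[c][b] * omega_vec[a][b]
               for a in range(n) for b in range(n) for c in range(n))

def trace_product(m, k):
    """Tr(m k)."""
    n = len(m)
    return sum(m[i][j] * k[j][i] for i in range(n) for j in range(n))

p = [0.7, 0.3]                      # eigenvalues of rho
basis = rotation_basis(0.4)
rho = density_matrix(p, basis)
omega_vec = gns_vector(p, basis)
A = [[1.0, 2.0 - 1.0j], [0.5j, -3.0]]   # arbitrary complex matrix
gns_value = vector_state(A, omega_vec)
trace_value = trace_product(rho, A)
norm_squared = sum(v * v for row in omega_vec for v in row)
print(abs(gns_value - trace_value), norm_squared)
```

In this representation the coefficient array of Ω is the square root of ρ, which explains why the vector is normalized when Tr ρ = 1.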
11 The relative modular operator
Araki [35] introduces the relative modular operator ΔΦ,Ψ for any pair of vectors Φ and Ψ in the natural positive cone.
Assume that Φ and Ψ are vectors in the natural positive cone which are separating for the von Neumann algebra. Then, a conjugate-linear operator is defined by
$$x\Psi\mapsto x^*\Phi,\qquad x \text{ in the von Neumann algebra}.$$
It is well-defined because, by assumption, xΨ = 0 implies that x = 0 so that also x*Φ = 0. It is a closable operator. Indeed, assume the sequence xnΨ converges to 0. Then, one has for any y in the commutant that
$$(x_n^*\Phi,y\Psi) = (\Phi,y\,x_n\Psi)$$
converges to 0. By assumption, Ψ is separating for the von Neumann algebra so that it is cyclic for the commutant. Hence, if the sequence xn*Φ converges, then it converges to 0. This shows the closability of the operator.
Let SΦ,Ψ denote the closure of this operator. It satisfies
$$S_{\Phi,\Psi}\,x\Psi = x^*\Phi.$$
Its inverse equals SΨ,Φ.
The relative modular operator ΔΦ,Ψ is defined by
$$\Delta_{\Phi,\Psi} = S_{\Phi,\Psi}^*\,S_{\Phi,\Psi}.$$
Important properties of the relative modular operator arewhere J is the modular conjugation operator for the vector Φ.
12 Araki’s relative entropy
Araki [22, 23] uses the relative modular operator ΔΦ,Ψ to define the relative entropy/divergence D (ϕ‖ψ) of the corresponding states ϕ = ωΦ and ψ = ωΨ by
Proposition 12: The divergence D (ϕ‖ψ) satisfies D (ϕ‖ψ) ≥ 0 with equality if and only if ϕ = ψ.
Proof:
Letdenote the spectral decomposition of the operator ΔΦ,Ψ. From the concavity of the logarithmic function, it follows that
This shows that the divergence cannot be negative.
If ϕ = ψ, then one hasbecause ΔϕΦ = Φ.
Finally, D (ϕ‖ψ) = 0 implies that Φ is in the domain of log ΔΦ,Ψ and that log ΔΦ,ΨΦ = 0. The latter implies thatThis shows that D (ϕ‖ψ) = 0 vanishes only when Φ = Ψ.
Theorem 2.4 of [35] shows that
Because Φ belongs, by assumption, to the natural positive cone , it satisfies Φ = JΦ. Hence, one has also
13 A theorem
Each self-adjoint element h of the von Neumann algebra defines an exponential arc with a generator equal to the energy function defined by h.
Araki [21] constructs for each self-adjoint operator h in the algebra a vector Φh in the natural positive cone and calls h the relative Hamiltonian. Inspection of the explicit expression used in [21] shows that
with operator X given by
The vector Φh defines a state ϕh by
Here, ξ(h) is the normalization
Theorem 3.10 of [35] implies that the state ϕh obtained in this way satisfies for all ψ in
Take ψ = ϕh and ψ = ω to find that the normalization ξ(h) is given by
Consider now the path γ defined by γt = ϕth. Then, (16) becomes
with
From this last expression, one obtains
From (15), we infer that γt converges to ω as t ↓ 0. Hence, D (γt‖ω) and D (ω‖γt) converge to 0 faster than t. This implies that the derivative exists and equals ω(h). This also implies that
Elimination of ζ(t) from (17) yields
This shows that γ is an exponential arc connecting γ1 to γ0 = ω.
Proposition 13: One has
with the operator TΩ given by
It should be noted that this operator TΩ was introduced in [36].
Proof:
From (15), one obtains
Write
and
The two contributions to (21) can now be taken together. One obtains
Take x = 1 to see that
so that (19) follows.
In summary, one can infer
Theorem 1: Let ω in the manifold be a vector state with cyclic and separating vector Ω. Choose the divergence function equal to the relative entropy of Araki as defined by (15). For each self-adjoint element h in the algebra, an energy function is defined by ψ ↦ ψ(h), and there exists an exponential arc γ with this energy function as generator, connecting some state γ1 of the manifold to the state γ0 = ω. For any state ψ in the manifold, the derivative of D (ψ‖γt) at t = 0 exists and is given by ω(h) − ψ(h). The derivative of the exponential arc at t = 0 satisfies (19).
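The content of the theorem can be checked by hand in the matrix case. The sketch below is an illustration under assumptions that go beyond the text: states are given by invertible density matrices, with ω(x) = Tr ρx and ψ(x) = Tr σx (the symbols ρ, σ and ρ_t are introduced here), the divergence is taken in the density-matrix form Tr σ(log σ − log ρ), and the arc is the normalized exponential of log ρ + th.

```latex
% Illustration only: \gamma_t(x) = \operatorname{Tr}\rho_t x, with \rho invertible and h self-adjoint.
% \zeta(t) is the logarithm of the normalization.
\begin{align*}
\rho_t &= \exp\bigl(\log\rho + t h - \zeta(t)\bigr), \qquad
\zeta(t) = \log\operatorname{Tr}\exp\bigl(\log\rho + t h\bigr), \\
D(\psi\|\gamma_t) &= \operatorname{Tr}\,\sigma\bigl(\log\sigma - \log\rho - t h\bigr) + \zeta(t)
  = D(\psi\|\omega) - t\,\psi(h) + \zeta(t), \\
\zeta(t) &= t\,\gamma_t(h) - D(\gamma_t\|\omega)
  = t\,\omega(h) + D(\omega\|\gamma_t), \\
\frac{\mathrm{d}\zeta}{\mathrm{d}t} &= \gamma_t(h), \qquad
\frac{\mathrm{d}}{\mathrm{d}t}D(\psi\|\gamma_t)\Big|_{t=0} = \omega(h) - \psi(h).
\end{align*}
```

The two expressions for ζ(t) follow from the second line by taking ψ = γt and ψ = ω, respectively. The last line reproduces the derivative stated in the theorem.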
Further properties hold for the exponential arc of the above theorem.
Proposition 14: For any exponential arc γ constructed in Theorem 1, the derivative is a Fréchet derivative.
Proof:
Let Ξ(h) denote the remainder of order h² in (15), i.e.,
Then one can use (19) for
This yields
Each of the terms in the right-hand side of this expression is of smaller order than t as t tends to 0. Hence, the derivative is a Fréchet derivative.
Proposition 15 (Additivity of generators): If the state ϕ is connected to the state ω by the exponential arc with generator h and ψ is connected to ϕ by the exponential arc with generator k, then ψ is connected to ω by the exponential arc with generator h + k and ω is connected to ϕ by the exponential arc with generator −h.
For the proof, see Proposition 4.5 of [21].
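In the matrix case, the additivity of generators is a one-line computation. The sketch below assumes invertible density matrices ρ_ω, ρ_ϕ and ρ_ψ for the three states and real normalization constants c_1 and c_2. All of these symbols are introduced here for illustration only. No commutativity of h and k is needed.

```latex
% Illustration only: h and k are self-adjoint matrices, c_1 and c_2 are real normalization constants.
\begin{align*}
\rho_\phi &= \exp\bigl(\log\rho_\omega + h - c_1\bigr), \qquad
\rho_\psi = \exp\bigl(\log\rho_\phi + k - c_2\bigr), \\
\rho_\psi &= \exp\bigl(\log\rho_\omega + (h + k) - (c_1 + c_2)\bigr), \\
\rho_\omega &= \exp\bigl(\log\rho_\phi - h + c_1\bigr).
\end{align*}
```

The second line exhibits h + k as the generator of the composed arc. The third line shows that the arc from ϕ back to ω has generator −h.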
14 The metric
Eguchi [37] introduced the technique of deriving the metric of the tangent space by taking two derivatives of the divergence. Application here yields the metric which is used in the Kubo–Mori theory of linear response [38, 39].
Consider two exponential arcs t↦γt and s↦ηs with respective generators h and k. They connect the states γ1 and η1 to the reference state ω. The tangent vectors at s = t = 0 are and . They belong to the tangent space . The scalar product between them is by definition given by
Assume now that these exponential arcs are those constructed in Theorem 1. Then, one has
with the operator TΩ defined by (20). It should be noted that in most applications, one assumes that the expectations ω(h) of the generator h and ω(k) of the generator k vanish. Then, the result obtained here coincides with that used in [36]. In what follows, a non-vanishing expectation of the generators is taken into account.
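For matrices, the two derivatives of the divergence can be carried out explicitly. The sketch below is an illustration under stated assumptions: the reference state is ω(x) = Tr ρx with ρ an invertible density matrix, the arcs are the normalized exponentials of log ρ + th and log ρ + sk, and the prescription of Eguchi is taken in the form of minus the mixed second derivative of D (γt‖ηs) at s = t = 0. The symbols ρ_t, σ_s, ζ_h and ζ_k are introduced here, and the sign and ordering conventions may differ from those of (22).

```latex
% Illustration only: \rho_t = \exp(\log\rho + t h - \zeta_h(t)), \sigma_s = \exp(\log\rho + s k - \zeta_k(s)).
% Assumed prescription: metric = -\partial_t\partial_s D(\gamma_t\|\eta_s) at s = t = 0.
\begin{align*}
D(\gamma_t\|\eta_s) &= \operatorname{Tr}\,\rho_t\bigl(\log\rho_t - \log\rho - s k\bigr) + \zeta_k(s), \\
\frac{\partial}{\partial s}D(\gamma_t\|\eta_s) &= -\operatorname{Tr}\,\rho_t k + \frac{\mathrm{d}\zeta_k}{\mathrm{d}s}, \\
\frac{\mathrm{d}\rho_t}{\mathrm{d}t}\Big|_{t=0} &= \int_0^1 \rho^{u}\,h\,\rho^{1-u}\,\mathrm{d}u - \omega(h)\,\rho, \\
-\frac{\partial^2}{\partial t\,\partial s}D(\gamma_t\|\eta_s)\Big|_{s=t=0}
 &= \int_0^1 \operatorname{Tr}\,\rho^{u}\,h\,\rho^{1-u}\,k\,\mathrm{d}u - \omega(h)\,\omega(k).
\end{align*}
```

The integral is the Kubo–Mori correlation of h and k. The subtracted product accounts for non-vanishing expectations of the generators.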
Let us now discuss some technical issues. The scalar product is well-defined by (22). This follows from
Lemma 2: If two exponential arcs with initial point ω with generators h, respectively k, both in , have the same initial tangent vector, then one has
Proof:
Let γ and η be two exponential arcs with generators h, respectively k in , such that γ0 = η0 = ω. Without restriction, assume that ω(h) = ω(k) = 0 and . Then, (19) implies that
Take x = h − k. Then, it follows that TΩ(h − k)Ω = 0.
This lemma shows that the map (23) is one-to-one and identifies the tangent vector with the vector TΩhΩ in the Hilbert space.
Expression (22) defines a bilinear form. This follows from
Lemma 3: The map (23) is linear.
Proof:
Let γ be an exponential arc with generator h in . Then, t↦γϵt is an exponential arc with generator ϵh for any ϵ in [−1, 1] and the tangent vector is . Hence, (23) maps onto ϵTΩhΩ.
Next, consider a pair of exponential arcs γ and η with generators k and h, respectively, in the algebra and with γ0 = η0 = ω. Let θ denote the exponential arc with generator h + k. It exists by Theorem 1. The state θt can then be written as
with Φth+tk being the unique element in the natural positive cone representing the state θt. Now, use (15) to write
This implies
Both observations together prove the linearity of map (23).
Proposition 16: Expression (22) defines a non-degenerate scalar product on the space of tangent vectors of the form with γ an exponential arc as constructed in Theorem 1.
Proof:
The two lemmas show that (22) is a well-defined bilinear form. Positivity of the form is clear. The symmetry follows from (22). It remains to be shown that it is non-degenerate.
Assume that . This implies
with h the generator of γ. The operator TΩ is invertible; see the proof of Lemma II.2 of [36]. Hence, it follows that
Because Ω is separating for the algebra, it follows that h is a multiple of the identity. The latter implies that .
15 Dual geometries
The geodesics of the e-connection are the exponential arcs. In the m-connection, the geodesics are made up of convex combinations of a pair of states. The m- and e-connections are each other's duals with respect to the metric of Section 14.
Consider two states ω and ϕ in the manifold and the path of convex combinations γt = (1 − t)ω + tϕ. The tangent vector ϕ − ω is independent of t. Hence, the path is a geodesic for the connection in which all parallel transport operators are taken equal to the identity operator. It should be noted that the tangent space coincides with the space of σ-weakly continuous linear functionals χ satisfying χ(1) = 0, and hence it is the same everywhere. This connection is by definition the m-connection.
For t in (0, 1), the tangent vector belongs to the subspace of the tangent space which is introduced in Section 6. Conversely, every vector χ in is the tangent vector of an m-geodesic passing through the point γt. However, this does not imply that through parallel transport Π(γt↦γs), the space maps onto the space .
The transport operators Π* of the dual geometry are defined by
In this expression, V and W are vector fields and (⋅,⋅)ω is the scalar product defined in the previous section and evaluated at the point ω of the manifold .
It can be shown that any exponential arc γ is a geodesic for this dual geometry. To do so, we have to show that
The tangent vector at t = 0 is given by (19). Its value for arbitrary t is given by the following proposition.
Proposition 17: Let γ denote an exponential arc with generator h belonging to the algebra. Let Φt be the normalized vector in the natural positive cone representing the state γt. The derivative is given by
Proof:
The state γ1 is connected to ω by the exponential arc with generator h and γt is connected to ω by the exponential arc with generator th. Let
It follows from Proposition 8 that s↦ψs is an exponential arc with generator (1 − t)h connecting γt to γ1. Application of (19) to the latter arc gives
with Ψ = Φt. This implies (24) because .
Theorem 2: Any exponential arc γ with generator h in is a geodesic for the dual of the m-connection with respect to the metric introduced in Section 14.
Proof:
Let t↦ϕt be an exponential arc with generator k in the algebra such that ϕ0 = γt. Fix t in [0, 1] and let Φt denote the normalized element of the natural positive cone representing the state γt. Let η be an exponential arc with generator k starting at γt, i.e., η0 = γt. Because Π(γs↦γt) is the identity, the definition of the dual transport operator yields
with l the generator of the arc s↦γ(1−s)t+s. It equals l = (1 − t)h. This last expression equals
By Proposition 16, the scalar product is non-degenerate. Therefore, one can conclude that
This shows that the exponential arc γ is a geodesic for the dual of the m-connection.
16 Finite-dimensional submanifolds
A finite set of linearly independent generators is shown to define a finite-dimensional submanifold in which all states are connected to the reference state by an exponential arc. The submanifold defined in this way is a dually flat quantum statistical manifold.
Let ω be the reference state of the manifold. It is a vector state with a cyclic and separating vector Ω. Choose an independent set of self-adjoint operators h1, … , hn in the algebra. By Theorem 1, there exists an exponential arc γ with generator $h = \theta^i h_i$, with summation over the repeated index i, connecting some state γ1 in the manifold to the state γ0 = ω. A parameterized family of states ωθ is now defined by putting ωθ = γ1. These states form a submanifold of the manifold of states.
From the definition of an exponential arc, one obtains immediately that for any ψ in
Take ψ = ωθ in this expression to find
Hence, the quantity θiω(hi) is maximal if and only if ωθ equals the reference state ω.
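Proposition 18: Dual coordinates ηi are defined by
$$\eta_i = \omega_\theta(h_i).$$
They satisfy
$$\frac{\partial\eta_j}{\partial\theta^i} = (\partial_i,\partial_j)_\theta$$
with (⋅,⋅)θ equal to the scalar product introduced in Section 14 and with basis vectors ∂i equal to ∂ωθ/∂θi.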
Proof:
Introduce the path γ(i) defined by
It satisfies
By definition, is the end point of the exponential arc with generator . From Proposition 15, it then follows that γ(i) is an exponential arc with generator hi connecting to ωθ. These arcs γ(i) are used in the calculation that follows.
The definition of the scalar product at the beginning of Section 14 gives
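Corollary 3: There exists a potential Φ(θ) such that
$$\eta_i = \frac{\partial\Phi}{\partial\theta^i}.$$
This follows because the scalar product is symmetric so that
$$\frac{\partial\eta_j}{\partial\theta^i} = \frac{\partial\eta_i}{\partial\theta^j}.$$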
This symmetry is a sufficient condition for the potential Φ(θ) to exist.
Consider the following generalization of the potential introduced in Section 9.
Apply (18) to the exponential arc γ(i) which connects to ωθ to find
This implies that Φ(θ) satisfies (28).
One can conclude that the selection of an independent set of self-adjoint operators h1, … , hn in the algebra defines a parameterized statistical model θ↦ωθ of states on the von Neumann algebra. An obvious basis in the tangent plane is formed by the derivative operators ∂i. The scalar product introduced in Section 14 starting from the relative entropy of Araki defines a Hessian metric on the tangent planes. Exponential arcs are geodesics for the e-connection, which is the dual of the m-connection.
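The dually flat structure becomes fully explicit for matrices. The following sketch is an illustration under assumptions not made in the text: the reference state is ω(x) = Tr ρx with ρ an invertible density matrix, the states ωθ are given by density matrices ρ_θ (notation introduced here), and the divergence is taken in the density-matrix form. The function Φ(θ) of the sketch is the logarithm of the normalization. It plays the role of the potential in this setting.

```latex
% Illustration only: \omega_\theta(x) = \operatorname{Tr}\rho_\theta x, with h_1, ..., h_n self-adjoint matrices.
\begin{align*}
\rho_\theta &= \exp\Bigl(\log\rho + \sum_i\theta^i h_i - \Phi(\theta)\Bigr), \qquad
\Phi(\theta) = \log\operatorname{Tr}\exp\Bigl(\log\rho + \sum_i\theta^i h_i\Bigr), \\
\Phi(\theta) &= D(\omega\|\omega_\theta) + \sum_i\theta^i\,\omega(h_i), \\
\frac{\partial\Phi}{\partial\theta^i} &= \operatorname{Tr}\,\rho_\theta h_i = \omega_\theta(h_i), \\
\frac{\partial^2\Phi}{\partial\theta^i\,\partial\theta^j}
 &= \int_0^1 \operatorname{Tr}\,\rho_\theta^{u}\,h_i\,\rho_\theta^{1-u}\,h_j\,\mathrm{d}u
   - \omega_\theta(h_i)\,\omega_\theta(h_j).
\end{align*}
```

The first derivatives of Φ are the expectations of the generators, and the matrix of second derivatives is the Kubo–Mori correlation matrix evaluated at ωθ. This matrix is symmetric in i and j, as required for a Hessian metric.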
17 Discussion
• The manifold under consideration consists of vector states on a sigma-finite von Neumann algebra in its standard representation. Such a manifold has nice properties described by the Tomita–Takesaki theory and hence is an obvious object of study when exploring quantum statistical manifolds in an infinite-dimensional setting. Particular attention is given in the present work to the definition of the tangent planes. This is also a point of concern in the commutative context of manifolds of probability measures. See, for instance, the approach of [14]. A convenient choice for the tangent space at the state ω in the manifold is to take it equal to the space of all σ-weakly continuous Hermitian linear functionals χ on the algebra vanishing on the identity operator. However, it is quite possible that the equivalence class of smooth curves through ω with initial tangent equal to a given χ is empty. Approximate tangent planes are considered as an alternative in Section 6. They form a subspace of the tangent space as defined previously. Nevertheless, the initial tangent vectors of Fréchet-differentiable paths starting at ω belong to the approximate tangent space. It is not clear whether the initial tangents of exponential arcs are dense in the approximate tangent space with respect to the inner product of Section 14. Further research is needed at this point.
• A new definition of exponential arcs is given. It depends on the choice of a divergence function/relative entropy defined on pairs of points in the manifold and on the choice of a generator which is a linear functional defined on a domain in the manifold. It is general enough to cover different approaches that one can follow to solve the non-uniqueness problem of the Radon–Nikodym derivative in the context of non-commutative probability. Nevertheless, one can prove in full generality nice properties such as uniqueness of the generator, existence of scalar potential, and Pythagorean relations. The additivity of generators when composing exponential arcs is shown in the specific context of Araki’s relative entropy. See Proposition 15.
• The second half of the paper focuses on the relative entropy of Araki. Only exponential arcs with bounded generators belonging to the von Neumann algebra are considered. This suffices to reach the goal of replacing the existing approach based on density matrices and Umegaki’s relative entropy. However, the solution of the problem mentioned previously regarding the extent of the tangent spaces most likely requires the handling of unbounded generators.
• The scalar product of Bogoliubov presented in Section 14 is used extensively in Linear Response Theory, also known as Kubo–Mori theory. Its link with the KMS condition of Section 2 is not highlighted in the present text. It is tradition in the Kubo–Mori theory and more generally in statistical mechanics to focus on a small number of variables. It is shown in Section 16 that the selection of a finite number of variables defines a quantum statistical manifold supporting Amari’s dually flat geometry.
Statements
Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material: further inquiries can be directed to the corresponding author.
Author contributions
The author confirms being the sole contributor of this work and has approved it for publication.
Funding
The publication charges are covered by the Universiteit Antwerpen.
Conflict of interest
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
1. Dixmier J. Les C*-algèbres et leurs représentations. Paris: Gauthier-Villars (1964).
2. Dixmier J. Les algèbres d’opérateurs dans l’espace Hilbertien. Paris: Gauthier-Villars (1969).
3. Ruelle D. Statistical mechanics, Rigorous results. New York: W.A. Benjamin, Inc. (1969).
4. Emch G. Algebraic methods in statistical mechanics and quantum field theory. New Jersey, United States: John Wiley & Sons (1972).
5. Bratteli O, Robinson D. Operator algebras and quantum statistical mechanics. New York, Berlin: Springer (1979).
6. Haag R, Hugenholtz NM, Winnink M. On the equilibrium states in quantum statistical mechanics. Commun Math Phys (1967) 5:215–36. doi:10.1007/bf01646342
7. Takesaki M. Tomita’s theory of modular Hilbert algebras and its applications. In: Lecture notes in mathematics. Berlin: Springer-Verlag (1970).
8. Chentsov NN. Statistical decision rules and optimal inference. In: Transl. Math. Monographs. Rhode Island, United States: American Mathematical Society (1972).
9. Efron B. Defining the curvature of a statistical problem. Ann Stat (1975) 3:1189–242.
10. Amari S. Differential-geometrical methods in statistics. In: Lecture notes in statistics. New York, Berlin: Springer (1985).
11. Amari S, Nagaoka H. Methods of information geometry. In: Translations of mathematical monographs. Oxford, United Kingdom: Oxford University Press (2000).
12. Ay N, Jost J, Vân Lê H, Schwachhöfer L. Information geometry. Berlin, Germany: Springer (2017).
13. Petz D. Quantum information theory and quantum statistics. Berlin: Springer-Verlag (2008).
14. Pistone G, Sempi C. An infinite-dimensional structure on the space of all the probability measures equivalent to a given one. Ann Stat (1995) 23:1543–61.
15. Grasselli MR, Streater RF. On the uniqueness of the Chentsov metric in quantum information geometry. Infin Dim Anal Quan Prob. Rel. Top. (2001) 4:173–82. doi:10.1142/s0219025701000462
16. Streater RF. Duality in quantum information geometry. Open Syst Inf Dyn (2004) 11:71–7. doi:10.1023/b:opsy.0000024757.25401.db
17. Streater RF. Quantum Orlicz spaces in information geometry. Open Syst Inf Dyn (2004) 11:359–75. doi:10.1007/s11080-004-6626-2
18. Jenčová A. A construction of a nonparametric quantum information manifold. J Funct Anal (2006) 239:1–20. doi:10.1016/j.jfa.2006.02.007
19. Grasselli MR. Dual connections in nonparametric classical information geometry. Ann Inst Stat Math (2010) 62:873–96. doi:10.1007/s10463-008-0191-3
20. Umegaki H. Conditional expectation in an operator algebra. IV. Entropy and information. Kodai Math Sem Rep (1962) 14:59–85. doi:10.2996/kmj/1138844604
21. Araki H. Relative Hamiltonian for faithful normal states of a von Neumann algebra. RIMS (1973) 9:165–209.
22. Araki H. Some properties of modular conjugation operator of von Neumann algebras and a non-commutative Radon–Nikodym theorem with a chain rule. Pac J Math (1974) 50:309–54. doi:10.2140/pjm.1974.50.309
23. Araki H. Relative entropy of states of von Neumann algebras. Publ RIMS Kyoto Univ (1976) 11:809–33.
24. Pistone G, Cena A. Exponential statistical manifold. AISM (2007) 59:27–56. doi:10.1007/s10463-006-0096-y
25. Pistone G. Nonparametric information geometry. In: Nielsen F, Barbaresco F, editors. Geometric science of information. Berlin, Germany: Springer (2013). p. 5–36.
26. Santacroce M, Siri P, Trivellato B. On mixture and exponential connection by open arcs. In: Nielsen F, Barbaresco F, editors. Geometric science of information. Berlin, Germany: Springer (2017). p. 577–84.
27. Naudts J. Exponential arcs in the manifold of vector states on a σ-finite von Neumann algebra. Inf Geom (2022) 5:1–30. doi:10.1007/s41884-021-00064-4
28. Naudts J. Quantum statistical manifolds. Entropy (2018) 20:472. doi:10.3390/e20060472
29. Naudts J. Correction: Naudts, J. Quantum statistical manifolds. Entropy 2018, 20, 472. Entropy (2018) 20:796. doi:10.3390/e20100796
30. Ciaglia FM, Ibort A, Jost J, Marmo G. Manifolds of classical probability distributions and quantum density operators in infinite dimensions. Inf Geom (2019) 2:231–71. doi:10.1007/s41884-019-00022-1
31. Simon L. Lectures on geometric measure theory. In: Proceedings of the centre for mathematical Analysis. Australia: Australian National University (1983).
32. Niestegge G. Absolute continuity for linear forms on B*-algebras and a Radon-Nikodym type theorem (quadratic version). Rend Circ Mat Palermo (1983) 32:358–76. doi:10.1007/bf02848539
33. Csiszár I. I-divergence geometry of probability distributions and minimization problems. Ann Probab (1975) 3:146–58.
34. Csiszár I, Matúš F. Generalized minimizers of convex integral functionals, Bregman distance, Pythagorean identities. Kybernetika (2012) 48:637–89.
35. Araki H. Relative entropy for states of von Neumann algebras II. Publ RIMS Kyoto Univ (1977) 13:173–92.
36. Naudts J, Verbeure A, Weder R. Linear response theory and the KMS condition. Comm Math Phys (1975) 44:87–99. doi:10.1007/bf01609060
37. Eguchi S. Information geometry and statistical pattern recognition. Sugaku Expositions (2006) 19:197–216.
38. Kubo R. Statistical-mechanical theory of irreversible processes. I General theory and simple applications to magnetic and conduction problems. J Phys Soc Jpn (1957) 12:570–86. doi:10.1143/jpsj.12.570
39. Mori H. Transport, collective motion, and Brownian motion. Progr Theor Phys (1965) 33:423–55. doi:10.1143/ptp.33.423
Summary
Keywords
exponential arcs, quantum statistical manifold, quantum divergence function, Araki’s relative entropy, dually flat geometry, Tomita–Takesaki theory, linear response theory, Kubo–Mori theory
Citation
Naudts J (2023) Exponential arcs in manifolds of quantum states. Front. Phys. 11:1042257. doi: 10.3389/fphy.2023.1042257
Received
12 September 2022
Accepted
06 January 2023
Published
07 February 2023
Volume
11 - 2023
Edited by
Florio M. Ciaglia, Universidad Carlos III de Madrid, Spain
Reviewed by
Sorin Dragomir, University of Basilicata, Italy
Fabio Di Cosmo, Universidad Carlos III de Madrid, Spain
Copyright
© 2023 Naudts.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Jan Naudts, jan.naudts@uantwerpen.be
This article was submitted to Statistical and Computational Physics, a section of the journal Frontiers in Physics