Application of Generalized (Hyper-) Dual Numbers in Equation of State Modeling

Rehner, Philipp; Bauer, Gernot

doi:10.3389/fceng.2021.758090

TECHNOLOGY AND CODE article

Front. Chem. Eng., 11 October 2021

Sec. Computational Methods in Chemical Engineering

Volume 3 - 2021 | https://doi.org/10.3389/fceng.2021.758090

Application of Generalized (Hyper-) Dual Numbers in Equation of State Modeling

Philipp Rehner*

Gernot Bauer

Institute of Thermodynamics and Thermal Process Engineering, University of Stuttgart, Stuttgart, Germany

The calculation of derivatives is ubiquitous in science and engineering. In thermodynamics, in particular, state properties can be expressed as derivatives of thermodynamic potentials. The manual differentiation of complex models can be tedious and error-prone. In this work, we revisit dual and hyper-dual numbers for the calculation of exact derivatives and show generalizations to higher order derivatives and derivatives with respect to vector quantities. The methods described in this paper are accompanied by an open source Rust implementation with Python bindings. Applications of the generalized (hyper-) dual numbers are given in the context of equation of state modeling and the calculation of critical points.

1 Introduction

The calculation of derivatives is required in many branches of physics and engineering. In particular, derivatives are required in solvers for nonlinear equations or optimization problems. Accurate calculations of derivatives are even more important in cases where the derivatives are part of the model itself, rather than just appearing in algorithms that often do not suffer much from appropriate approximations to the Jacobians. The ideal solution would be to not approximate at all and implement all required derivatives of the model by hand. This approach guarantees accurate results and efficiency. However, for complicated models, the probability of errors in the implementation becomes high. The process can be aided by symbolic math toolboxes and automatic code generation. An entirely different approach that requires no additional implementation within the model, is the use of numerical derivatives like forward or central differences, and higher order methods. The simplicity of the approach comes at the price, that the result is inherently an approximation and the achievable precision depends on the chosen step size. For small step sizes, the calculation can become unstable due to machine precision. For large step sizes, the calculation is stable but less accurate. To find the optimal step size for an arbitrary problem can be challenging.

Exact results without differentiation by hand can be obtained using automatic differentiation (Griewank and Walther, 2008). In this approach, the calculations in the model are represented by a calculation tree, that contains all intermediate results and their derivatives. The individual steps in this tree consist of algebraic operations or elementary functions for which the derivatives are known. Depending on the order with which the derivatives are calculated, the methods are referred to as forward or backward accumulation. The method of using (hyper-) dual numbers (Fike and Alonso, 2011) and the generalizations introduced in this work can be understood as an automatic differentiation using operator overloading and forward accumulation. Automatic differentiation with dual numbers is similar to the wider known complex step differentiation (Martins et al., 2003). However, aside from the availability of complex numbers in many programming languages, there is no advantage whatsoever to use complex step differentiation above dual numbers: complex step differentiation can be seen as an improved numerical derivative that alleviates the problems at small step sizes, whereas dual number differentiation always results in exact derivatives within machine precision. Automatic differentiation using (hyper-) dual numbers has been used in various fields of engineering and science such as solving optimal control problems (Agamawi and Rao, 2020) or formulating equations of motion for rigid- and multi-body systems (Cohen and Shoham, 2015; Cohen and Shoham, 2017). Yet, the concept of automatic differentiation with dual numbers is rarely used in chemical engineering and thermodynamics in particular, with the exception of the work of Diewald et al. (2018) and Rehner et al. (2019). With this review of the mathematical background and the openly available implementation in Rust and Python, we aim to change that.

This manuscript is structured as follows: in section 2 the properties of dual and hyper-dual numbers are briefly revisited and a generalization to higher derivatives and vector valued gradients and Hessians is demonstrated. Section 3 outlines the implementation in Rust and Python that is available on GitHub. Section 4 shows applications for generalized (hyper-) dual numbers within thermodynamics and equation of state modeling. This is a particular interesting application for automatic differentiation as state properties can be identified as partial derivatives of thermodynamic potentials.

2 Methods

This section aims to explain why dual and hyper-dual numbers are useful in the context of automatic differentiation by establishing an isomorphism between the number and derivative calculus. Isomorphism refers to the observation that all derivation rules have a counterpart in the arithmetic of the (hyper-) dual numbers. With this link established, it is shown how it can be reverted to obtain generalized (hyper-) dual numbers for the calculation of higher order and partial derivatives, gradients, Jacobians and Hessians.

2.1 Dual Numbers

Similar to complex numbers, dual numbers are formed by adjoining a new element ɛ with the property ɛ² = 0. A dual number x can thus be written in the form

x = x_{0} + x_{1} ε, x_{0}, x_{1} \in R (1)

The product of x with a dual number y = y₀ + y₁ɛ is

x y = x_{0} y_{0} + (x_{0} y_{1} + x_{1} y_{0}) ε . (2)

For the application in thermodynamic modeling and other parts of physics, the most interesting property, however, is the exact Taylor expansion

f (x) = f (x_{0}) + f^{'} (x_{0}) x_{1} ε . (3)

All higher order terms cancel since they contain the factor ɛ² = 0. Therefore, the derivative of a (real) function f (x) can be calculated exactly by evaluating the function using dual numbers and taking the ɛ-component f₁ of the result, as

f (x_{0} + ε) = \underset{f_{0}}{\underset{︸}{f (x_{0})}} + \underset{f_{1}}{\underset{︸}{f^{'} (x_{0})}} ε . (4)

As opposed to numerical differentiation methods (forward, backward, or central differences), calculating the derivative with dual numbers does not introduce an error by truncating the Taylor expansion (the truncation is exact for dual numbers). Therefore the necessity to find an optimal step size is eliminated and we can simply choose x₁ = 1.

The multiplication, Eq. 2, of two dual numbers follows the same rules as the product rule in calculus and the Taylor expansion, Eq. 3, is analogous to the chain rule. With the chain rule, the product rule and the elementwise addition, the derivatives of arbitrarily complex functions are known. Therefore, dual numbers form an isomorphism to the first derivative, and the derivatives of any model, that is written using analytical functions can be calculated exactly using dual numbers. As an example, the division can be written as a product of the numerator with the divisor raised to the power of minus one:

\frac{x}{y} = x y^{- 1} = (x_{0} + x_{1} ε) (y_{0}^{- 1} - y_{0}^{- 2} y_{1} ε) = \frac{x_{0}}{y_{0}} + \frac{x_{1} y_{0} - x_{0} y_{1}}{y_{0}^{2}} ε (5)

The result of this composition is the quotient rule in calculus. Additional functions like exponentials

e^{x} = e^{x_{0}} + x_{1} e^{x_{0}} ε (6)

or logarithms

\ln x = \ln x_{0} + \frac{x_{1}}{x_{0}} ε (7)

follow from Eq. 3.

2.2 Hyper-Dual Numbers

As an extension to dual numbers, hyper-dual numbers were introduced by Fike and Alonso (2011). Hyper-dual numbers contain two extra dimensions ɛ₁ and ɛ₂ with the property $ε_{1}^{2} = ε_{2}^{2} = {(ε_{1} ε_{2})}^{2} = 0$ . Thus a hyper dual number x can be written as

x = x_{0} + x_{1} ε_{1} + x_{2} ε_{2} + x_{12} ε_{1} ε_{2}, x_{0}, x_{1}, x_{2}, x_{12} \in R (8)

For a function f (x, y), the exact Taylor expansion using hyper dual numbers is

f (x_{0} + ε_{1}, y_{0} + ε_{2}) = f (x_{0}, y_{0}) + f_{x} (x_{0}, y_{0}) ε_{1} + f_{y} (x_{0}, y_{0}) ε_{2} + f_{x y} (x_{0}, y_{0}) ε_{1} ε_{2} . (9)

Therefore, hyper dual numbers can be used to calculate exact second partial derivatives f_xy (x, y) by setting x₁ = y₂ = 1 and all other derivative parts to zero. The first partial derivatives f_x (x, y) and f_y (x, y) are always calculated simultaneously. The product of two hyper dual numbers x and y

x y = x_{0} y_{0} + (x_{0} y_{1} + x_{1} y_{0}) ε_{1} + (x_{0} y_{2} + x_{2} y_{0}) ε_{2} + (x_{0} y_{12} + x_{1} y_{2} + x_{2} y_{1} + x_{12} y_{0}) ε_{1} ε_{2} (10)

again corresponds to the product rule for partial derivatives and the chain rule can be written as

f (x) = f (x_{0}) + f^{'} (x_{0}) x_{1} ε_{1} + f^{'} (x_{0}) x_{2} ε_{2} + (f^{″} (x_{0}) x_{1} x_{2} + f^{'} (x_{0}) x_{12}) ε_{1} ε_{2} . (11)

To calculate derivatives using dual or hyper-dual numbers, Eqs. 3, 11 are used to implement the required elementary functions (exponentials, powers, trigonometric functions, etc.) and Eqs. 2, 10 are used to overload the multiplication operator. Then, by the virtue of Eqs. 4, 9, the derivatives of arbitrary functions are available by setting the first derivative part(s) of the input variable(s) to 1.

2.3 Generalizations

A detailed mathematical investigation of the properties of dual and hyper dual numbers is important in a general context. However, from a science and engineering point of view, the isomorphism between the number and properties of derivatives is all that matters. Dual numbers are usually presented as numbers with certain properties. Then, by identifying the isomorphism to the chain and product rules, the exact calculation of derivatives follows as an application. There is nothing stopping us from reversing the process: start with a certain application, like the calculation of third or higher order derivatives and generate a number that is isomorphic to the chain and product rules. In other words, we think about the number as data structures, that contain a function value and its corresponding derivatives simultaneously.

2.3.1 Higher Order Ordinary Derivatives

For ordinary derivatives of an arbitrary order all derivatives up to the highest order are required for intermediate steps. Therefore, a generalized dual number of order K can be written as

x = x_{0} + \sum_{i = 1}^{K} x_{i} ν_{i}, \forall i : x_{i} \in R (12)

where the additional dimensions are referred to with the letter ν as opposed to ɛ to indicate, that they represent additional derivative orders instead of additional variables. The multiplication of two generalized dual numbers x and y can be written as

x y = x_{0} y_{0} + \sum_{i = 1}^{K} \sum_{j = 0}^{i} (\binom{i}{j}) x_{j} y_{i - j} ν_{i} (13)

The general expression of the chain rule requires the application of Faà di Bruno’s formula

f (x) = f (x_{0}) + \sum_{i = 1}^{K} \sum \frac{i!}{k_{1}! \cdot \dots \cdot k_{i}!} f^{(k_{1} + \dots + k_{i})} (x_{0}) \prod_{m = 1}^{i} {(\frac{x_{m}}{m!})}^{k_{m}} ν_{i} (14)

where the inner sum sums over all tuples of non-negative integers (k₁, …, k_i) with $\sum_{m = 1}^{i} m k_{m} = i$ . The general implementation is cumbersome and presumably slow. Therefore it is appropriate to implement these generalized dual numbers for a fixed value of K. For K = 3, the product rule

x y = x_{0} y_{0} + (x_{0} y_{1} + x_{1} y_{0}) ν_{1} + (x_{0} y_{2} + 2 x_{1} y_{1} + x_{2} y_{0}) ν_{2} + (x_{0} y_{3} + 3 x_{1} y_{2} + 3 x_{2} y_{1} + x_{3} y_{0}) ν_{3} (15)

and the chain rule

\begin{align} f (x) = f (x_{0}) + f^{'} (x_{0}) x_{1} ν_{1} + (f^{″} (x_{0}) x_{1}^{2} + f^{'} (x_{0}) x_{2})) ν_{2} \\ + (f^{‴} (x_{0}) x_{1}^{3} + 3 f^{″} (x_{0}) x_{1} x_{2} + f^{'} (x_{0}) x_{3})) ν_{3} \end{align} (16)

simplify accordingly. The i-th derivative (i ≤ K) of a real function f (x) can be identified as the ν_i component of f evaluated using generalized dual numbers.

f (x_{0} + ν_{1}) = f (x_{0}) + \sum_{i = 1}^{K} f^{(i)} (x_{0}) ν_{i} (17)

2.3.2 Gradients and Jacobians

With marginal changes to the formulation, a vector dual number can be defined that is able to calculate the gradient of a function with respect to an arbitrary amount of variables. To do so, the scalar dimension ɛ is replaced with a vector $ε = {(\begin{matrix} ε_{1} & \dots & ε_{N} \end{matrix})}^{⊺}$ with N the number of variables. The vector dual number x can then be written as

x = x_{0} + x_{1}^{⊺} ε, x_{0} \in R, x_{1} \in R^{N} . (18)

The product rule

x y = x_{0} y_{0} + {(x_{0} y_{1} + y_{0} x_{1})}^{⊺} ε (19)

and the chain rule

f (x) = f (x_{0}) + f^{'} (x_{0}) x_{1}^{⊺} ε (20)

are modified accordingly. The most relevant use case is that for a function f (x) of N variables $x = {(\begin{matrix} x_{1} & \dots & x_{N} \end{matrix})}^{⊺}$ , ɛ can be chosen as an array of N elements. Then the gradient ∇f can be determined from the dual part of f (x₀ + ɛ):

f (x_{0} + ε) = f (x_{0}) + \nabla f {(x_{0})}^{⊺} ε (21)

For array valued functions F, this generalizes to the Jacobian J, as

F (x_{0} + ε) = F (x_{0}) + J (x_{0}) ε (22)

2.3.3 Vectorized Second Partial Derivatives and Hessians

Vector hyper dual numbers can also be defined and allow the calculation of second partial derivatives with respect to an arbitrary number of variables. Both ɛ₁ and ɛ₂ are vectors and thus the vector hyper dual number x in general can be written as

x = x_{0} + x_{1}^{⊺} ε_{1} + x_{2}^{⊺} ε_{2} + ε_{1}^{⊺} x_{12} ε_{2}, x_{0} \in R, x_{1} \in R^{M}, x_{2} \in R^{N}, x_{12} \in R^{M \times N} (23)

with the product rule

x y = x_{0} y_{0} + {(x_{0} y_{1} + y_{0} x_{1})}^{⊺} ε_{1} + {(x_{0} y_{2} + y_{0} x_{2})}^{⊺} ε_{2} + ε_{1}^{⊺} (x_{0} y_{12} + x_{1} y_{2}^{⊺} + y_{1} x_{2}^{⊺} + y_{0} x_{12}) ε_{2}, (24)

and the chain rule

f (x) = f (x_{0}) + f^{'} (x_{0}) x_{1}^{⊺} ε_{1} + f^{'} (x_{0}) x_{2}^{⊺} ε_{2} + ε_{1}^{⊺} (f^{″} (x_{0}) x_{1} x_{2}^{⊺} + f^{'} (x_{0}) x_{12}) ε_{2} (25)

To save computation time for the case of simple Hessians, only one of the first derivatives needs to be accounted for as they are identical. The vector hyper dual number can then be written as

x = x_{0} + x_{1}^{⊺} ε + ε^{⊺} x_{12} ε, x_{0} \in R, x_{1} \in R^{N}, x_{12} \in R^{N \times N} (26)

instead. The Hessian H (and the gradient that comes for free) of a scalar function f(x) can then be calculated from

f (x_{0} + ε) = f (x_{0}) + \nabla f {(x_{0})}^{⊺} ε + ε^{⊺} H (x_{0}) ε . (27)

2.3.4 Higher Order Partial Derivatives

Higher order derivatives can be added by implementing new dual numbers with additional fields for the derivatives and extended sets of product and chain rules. However, by allowing the elements of any generalized dual number to be a generalized dual number themselves, higher order derivatives can be obtained without additional implementation work (Szirmay-Kalos, 2021). This becomes apparent, when a hyper dual number x is written as

x = x_{0} + x_{1} ε_{1} + x_{2} ε_{2} + x_{12} ε_{1} ε_{2} = (x_{0} + x_{1} ε_{1}) + (x_{2} + x_{12} ε_{1}) ε_{2} (28)

showing that it can simply be represented as a dual number for which the coefficients are themselves dual numbers. By avoiding redundancies in the calculation, a dedicated implementation for hyper dual numbers can be expected to be faster than the recursive version. However, the recursive implementation offers great flexibility when it comes to higher order derivatives for which a dedicated implementation becomes tedious. An example for which the usage of recursive dual numbers is useful is given in section 4.2.

3 Implementation

Our implementation is done in the Rust programming language, an open source language that was released in its first stable version in 2015 and has been increasingly adapted and used since then. It is an attractive choice for a numerical library because it is a strongly typed, compiled language that generates efficient machine code while offering high-level language constructs (e.g. pattern matching), elegant error handling, allows for generic programming and has automatic code creation features (macros). A very valuable feature of the Rust compiler is the so-called borrow checker that prevents operations that would lead to undefined behavior (null or dangling pointers) or data races at compile time.

Besides the compiler, Rust ships with Cargo, a package manager that resolves dependencies of external packages, compiles the code and its dependencies (by calling the compiler, rustc), runs tests and benchmarks and builds the documentation. The bindings to Python are written in pure Rust using PyO3, a Rust package that allows to build native Python extension modules.

Dual numbers as presented in this work are implemented in Rust as structured types (structs, similar to classes in other languages). Different from other object oriented languages, Rust provides inheritance and polymorphism in form of traits, i.e. a collection of methods that extend the data type’s functionality. Overloading of operators is realized by implementing the respective traits e.g. the Add trait for addition, Sub for subtraction, and so on.

Dual numbers are generic over their inner data types and the number of variables. The generic data type allows to specify the precision, but also to define recursive dual numbers. The number of variables is specified using Rusts min_const_generics feature. Therefore, the size of the arrays is known at compile time and scalar dual numbers can be obtained as a special case of the generic vector dual numbers without any overhead. Our library defines a trait, DualNum, which is implemented for all dual numbers as well as floating point numbers (f64 and f32), and contains a number of useful traits for numerical operations (binary arithmetic operations, trigonometric functions, exponential and logarithm as well as spherical Bessel functions, comparison operators and so on). This trait can be used to write generic functions whose arguments and results can be any dual number types, and—since the compiler generates a version for each dual number type at compile time (a concept called monomorphization)—no checks at runtime are necessary. Since the DualNum trait is build upon the more generic Num trait (which is part of the Rust num package), dual numbers work seamlessly with other Rust packages that offer functions or structures parameterized over Num. For example, it is possible to write a generic function that uses the fast Fourier transform (using the RustFFT package) for which (partial) derivatives can be computed by simply calling the function with a dual number as argument instead of a floating point number.

Data types with generic parameters cannot be directly exposed to Python, since these parameters (and consequently the size of the data type) have to be known at compile time. Therefore, we offer several specific parameter combinations as Python classes, e.g. scalar dual and hyper dual numbers with 64 bit floating point numbers as inner data types. The Python data types work seamlessly with most of numpy’s math functions so that existing code rarely has to be modified. The source code for the Rust package including the Python bindings is available on GitHub under the name num-dual. The code is also published on the Rust package repository crates. io and the python packages for Windows, Linux and macOS are available from the Python Package Index (PyPI).

4 Applications

The usage of the generalized (hyper-) dual numbers in the last section is by no means restricted to a specific field in science or engineering. However, in this work we want to focus on the application in equation of state modeling. Besides the documentation of the provided data types and functions, the github repository contains a comprehensive example which illustrates the methods and algorithms discussed below in form of a Jupyter notebook using the Peng-Robinson equation of state.

4.1 Calculation of Thermodynamic Properties

The calculation of state properties in thermodynamics is a particularly interesting application of generalized (hyper-) dual numbers because properties can be written as partial derivatives of a thermodynamic potential. In modern equations of state, this thermodynamic potential is usually the Helmholtz energy A with its characteristic variables temperature T, volume V, and number of particles of each species n. In this and the subsequent sections bold symbols indicate arrays over all components. With first order partial derivatives, the entropy S, the pressure p, and the chemical potentials μ can be obtained as

S = - {(\frac{\partial A}{\partial T})}_{V, n} p = - {(\frac{\partial A}{\partial V})}_{T, n} μ = {(\frac{\partial A}{\partial n})}_{T, V} (29)

Using dual numbers, the scalar properties can be calculated as

A (T + ε, V, n) = A - S ε and A (T, V + ε, n) = A - p ε (30)

With vector dual numbers, the chemical potential can be calculated in a single function evaluation, as

A (T, V, n + ε) = A + μ^{⊺} ε . (31)

With scalar dual numbers, the individual components have to be evaluated individually, as

A (T, V, n_{i} + ε, n_{j \neq i}) = A + μ_{i} ε (32)

For second partial derivatives hyper dual numbers are required. The derivatives with respect to scalar variables are obtained from

A (T + ε_{1}, V + ε_{2}, n) = A - S ε_{1} - p ε_{2} - {(\frac{\partial p}{\partial T})}_{V, n} ε_{1} ε_{2} (33)

which demonstrates how the first partial derivatives for the variables associated with ɛ₁ and ɛ₂ are intrinsically calculated together with the second partial derivatives. This characteristic can be used to avoid redundancies in the calculation for specific property evaluations where both first and second order partial derivatives are required or more generally by caching all results of the property evaluation for a given (T, V, n) state. Second partial derivatives with respect to the same variable can also be calculated using hyper dual numbers.

\begin{align} A (T + ε_{1} + ε_{2}, V, n) & = A - S ε_{1} - S ε_{2} - {(\frac{\partial S}{\partial T})}_{V, n} ε_{1} ε_{2} \end{align} (34)

\begin{align} A (T, V + ε_{1} + ε_{2}, n) & = A - p ε_{1} - p ε_{2} - {(\frac{\partial p}{\partial V})}_{T, n} ε_{1} ε_{2} . \end{align} (35)

However, without any loss of information, the performance can be slightly improved by switching to second order dual numbers.

\begin{align} A (T + ν_{1}, V, n) & = A - S ν_{1} - {(\frac{\partial S}{\partial T})}_{V, n} ν_{2} \end{align} (36)

\begin{align} A (T, V + ν_{1}, n) & = A - p ν_{1} - {(\frac{\partial p}{\partial V})}_{T, n} ν_{2} \end{align} (37)

by avoiding the duplicate calculation of the same first derivative.

The partial derivative of the chemical potential with respect to mole numbers can be computed efficiently using vector hyper dual numbers in the Hessian form, Eq. 26,

A (T, V, n + ε) = A + μ^{⊺} ε + ε^{⊺} {(\frac{\partial μ}{\partial n})}_{T, V} ε (38)

or in the general form, Eq. 23.

\begin{align} A (T + ε_{1}, V, n + ε_{2}) & = A - S ε_{1} + μ^{⊺} ε_{2} + ε_{1} {(\frac{\partial μ}{\partial T})}_{V, n} ε_{2} \end{align} (39)

\begin{align} A (T, V + ε_{1}, n + ε_{2}) & = A - p ε_{1} + μ^{⊺} ε_{2} + ε_{1} {(\frac{\partial μ}{\partial V})}_{T, n} ε_{2} \end{align} (40)

where M = 1 in Eq. 23 and thus ɛ₁ simplifies to a scalar.

4.2 Calculation of Critical Points

A commonly used algorithm for the calculation of critical points for a system with given mole fractions z was proposed by Heidemann and Khalil (1980) and refined by Michelsen and Mollerup (2004). The matrix M is defined as

M_{i j} = \sqrt{z_{i} z_{j}} {(\frac{\partial^{2} β A}{\partial n_{i} \partial n_{j}})}_{T, V} (41)

with z_i the mole fraction of component i, $β = \frac{1}{k_{B} T}$ the inverse temperature, and k_B the Boltzmann constant. Further, the variable s is introduced, that acts on the mole numbers n via

n_{i} = z_{i} + s u_{i} \sqrt{z_{i}} (42)

with u the eigenvector corresponding to the smallest eigenvalue λ₁ of M. Then, the two criticality conditions can be written as

{\frac{\partial^{2} β A}{\partial s^{2}}|}_{s = 0} = \sum_{i} \sum_{j} u_{i} u_{j} M_{i j} = λ_{1} = 0 (43)

and

{\frac{\partial^{3} β A}{\partial s^{3}}|}_{s = 0} = 0 . (44)

The second criticality condition is usually evaluated using a numerical derivation. This is particularly undesirable in this context, because the numerical error does appear in the residual function. Therefore it influences not only the convergence but the result itself.

The method can be enhanced using dual numbers. First the matrix M is evaluated using scalar or vector hyper-dual numbers as shown in section 4.1. The calculation of the smallest eigenvalue λ₁ and the corresponding eigenvector u is unaffected by the presence of dual numbers. The second criticality condition is calculated with one single evaluation of the Helmholtz energy using the third order dual numbers introduced in section 2.3.1. By setting

n_{i} = z_{i} + u_{i} \sqrt{z_{i}} ν_{1}, (45)

the ν₃ component of the Helmholtz energy A (T, V, n) is the second criticality condition.

The critical temperature and either density or volume are iterated using a Newton scheme. The Jacobian can be approximated numerically, however, if recursive dual numbers and the calculation of eigenvalues and eigenvectors for dual numbers are available, it can be evaluated exactly. The temperature and volume are set to

T^{dual} = T + (\begin{matrix} 1 & 0 \end{matrix}) ε, V^{dual} = V + (\begin{matrix} 0 & 1 \end{matrix}) ε . (46)

The entries of M (themselves dual numbers) are calculated by setting

n_{k}^{hd - dual} = z_{k} + δ_{i k} ε_{1} + δ_{j k} ε_{2} (47)

with the Kronecker delta δ_ij and evaluating the Helmholtz energy

M_{i j}^{dual} = \sqrt{z_{i} z_{j}} A {(T^{dual}, V^{dual}, n^{hd - dual})}_{12} (48)

The eigenvalue $λ_{1}^{dual}$ and the corresponding eigenvector u^dual are calculated as shown in Supplementary Material. Finally, the second criticality condition is determined by setting

n_{i}^{d 3 - dual} = z_{i} + u_{i}^{dual} \sqrt{z_{i}} ν_{1} (49)

and evaluating $A {(T^{dual}, V^{dual}, n^{d 3 - dual})}_{3}$ . The Jacobian is now available from the gradient parts of the two criticality conditions. Here it was assumed that operators are available for all combinations of dual and recursive hyper-dual numbers. In practice this poses difficulties with respect to the implementations and is therefore likely not the case. Then, T^dual and V^dual can simply be lifted to the higher order dual number by setting all derivative parts to zero.

4.3 Cross Association

In equations of state that are based on thermodynamic perturbation theory by Wertheim (1984a), Wertheim (1984b), like the statistical associating fluid theory (SAFT) family (Chapman et al., 1989; Gross and Sadowski, 2001; Lafitte et al., 2013) or the cubic plus association (CPA) equation of state (Kontogeorgis et al., 2006), association contributions are used to model highly directional short range interactions like hydrogen bonds. The Helmholtz energy contribution for the cross association is given by

β A^{assoc} = \sum_{i} n_{i} \sum_{A_{i}} N_{A_{i}} (\ln χ^{A_{i}} - \frac{χ^{A_{i}}}{2} + \frac{1}{2}) (50)

with $N_{A_{i}}$ the number of association sites of kind A on molecule i and $χ^{A_{i}}$ the corresponding fraction of non-bonded sites. What makes this contribution relevant in the context of generalized (hyper-) dual numbers is, that the fraction of non-bonded sites $χ^{A_{i}}$ is determined by the implicit set of equations

χ^{A_{i}} = {(1 + \sum_{j} ρ_{j} \sum_{B_{j}} N_{B_{j}} χ^{B_{j}} Δ^{A_{i} B_{j}})}^{- 1} (51)

The exact expression for the association strength $Δ^{A_{i} B_{j}}$ differs between different equations of state, but the equations shown here are valid for all variants. To be able to automatically determine partial derivatives of the Helmholtz energy, the partial derivatives of the monomer fractions $χ^{A_{i}}$ are also required. With an appropriate iterative method, these derivatives can be obtained with hardly any additional implementations using generalized (hyper-) dual numbers.

The state of the art iteration method for the monomer fraction was once again presented by Michelsen (2006). Applied to the notation used above, the problem is reformulated as a minimization of the property

Q = \sum_{i} ρ_{i} \sum_{A_{i}} N_{A_{i}} (\ln χ^{A_{i}} - χ^{A_{i}} + 1) - \frac{1}{2} \sum_{i} \sum_{j} ρ_{i} ρ_{j} \sum_{A_{i}} \sum_{B_{j}} N_{A_{i}} N_{B_{j}} χ^{A_{i}} χ^{B_{j}} Δ^{A_{i} B_{j}} (52)

with respect to the variables $χ^{A_{i}}$ . The minimum of Q is obtained when the gradients

g_{A_{i}} = \frac{\partial Q}{\partial χ^{A_{i}}} = ρ_{i} N_{A_{i}} (\frac{1}{χ^{A_{i}}} - 1 - \sum_{j} ρ_{j} \sum_{B_{j}} N_{B_{j}} χ^{B_{j}} Δ^{A_{i} B_{j}}) (53)

vanish. The stationary points of Eq. 53 coincide with the solutions of Eq. 51. The solution is found by using the Newton iteration scheme

χ^{(k + 1)} = χ^{(k)} + Δ χ, \hat{H} Δ χ + g = 0 (54)

with the modified Hessian $\hat{H}$ with entries

{\hat{H}}_{A_{i} B_{j}} = - \frac{ρ_{i} N_{A_{i}} δ_{A_{i} B_{j}}}{χ^{A_{i}}} (1 + \sum_{k} ρ_{k} \sum_{C_{k}} N_{C_{k}} χ^{C_{k}} Δ^{A_{i} C_{k}}) - ρ_{i} ρ_{j} N_{A_{i}} N_{B_{j}} Δ^{A_{i} B_{j}} (55)

The modification of the Hessian ensures that it is negative definite. As shown in Supplementary Material, when Eq. 54 is iterated from some initial value using generalized (hyper-) dual numbers, the derivatives are calculated automatically. To speed up the computation, it is advisable to first solve the nonlinear equation for the real part. Then, as the equations for the derivatives are linear, the resulting $χ^{A_{i}}$ can be lifted to the relevant dual number and as many steps of Eq. 54 need to be applied as there are derivatives, i.e. one for dual numbers, two for hyper-dual numbers and three for third order dual numbers. With the derivatives of the monomer fractions in place, the Helmholtz energy contribution and its derivatives are available from Eq. 50.

5 Discussion

Generalized (hyper-) dual numbers enable the calculation of exact derivatives of scalar and vector valued functions which is especially valuable when derivatives are part of the (physical) model. Derivatives no longer need to be implemented or approximated which leads to less error prone and faster development. Depending on the model, an implementation using dual numbers can be more costly to evaluate compared to hand-written derivatives. However, in the context of thermodynamic equations of state shown in this work, using the right type of dual number in combination with caching prior results, this disadvantage can be compensated.

Thermodynamic equations of state greatly benefit from generalized (hyper-) dual numbers because thermodynamic properties are computed from (partial) derivatives of a single function, the thermodynamic potential. We showed above that generalized (hyper-) dual numbers can be used to efficiently compute these properties even for complex models that contain implicit expressions like the association contribution in SAFT. In some cases, like the calculation of critical points, the algorithms themselves can be simplified using generalized (hyper-) dual numbers. Because only a single model equation has to be implemented (the Helmholtz energy), separation of model agnostic algorithms, like phase equilibria and stability analysis, and the model equation is simple and leads to more maintainable and extensible code. We believe that from a research point of view this enables faster and easier development of new models and algorithms, and consequently, it will enable faster adaption and transfer to industry.

Data Availability Statement

The source code developed for this study can be found on GitHub (https://github.com/itt-ustutt/num-dual) and crates.io (https://crates.io/crates/num-dual). Compiled Python packages for Windows, Linux and macOS are available from PyPI (https://pypi.org/project/num_dual).

Author Contributions

PR and GB have contributed equally to design and implementation of the num-dual Rust and Python libraries as well as the scientific research and the documentation process.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Acknowledgments

We thank Joachim Gross for his continued support and the opportunity to pursue this project, and Rolf Stierle and Elmar Sauer for their valuable input on the manuscript.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fceng.2021.758090/full#supplementary-material

References

Agamawi, Y. M., and Rao, A. V. (2020). Cgpops. ACM Trans. Math. Softw. 46, 1–38. doi:10.1145/3390463

CrossRef Full Text | Google Scholar

Bartholomew-Biggs, M. C. (1998). Using Forward Accumulation for Automatic Differentiation of Implicitly-Defined Functions. Comput. Optimization Appl. 9, 65–84. doi:10.1023/A:1018382103801

CrossRef Full Text | Google Scholar

Chapman, W. G., Gubbins, K. E., Jackson, G., and Radosz, M. (1989). Saft: Equation-Of-State Solution Model for Associating Fluids. Fluid Phase Equilibria 52, 31–38. doi:10.1016/0378-3812(89)80308-5

CrossRef Full Text | Google Scholar

Cohen, A., and Shoham, M. (2015). Application of Hyper-Dual Numbers to Multibody Kinematics. J. Mech. Robotics 8, 011015. doi:10.1115/1.4030588

CrossRef Full Text | Google Scholar

Cohen, A., and Shoham, M. (2017). Application of Hyper-Dual Numbers to Rigid Bodies Equations of Motion. Mechanism Machine Theor. 111, 76–84. doi:10.1016/j.mechmachtheory.2017.01.013

CrossRef Full Text | Google Scholar

Diewald, F., Heier, M., Horsch, M., Kuhn, C., Langenbach, K., Hasse, H., et al. (2018). Three-dimensional Phase Field Modeling of Inhomogeneous Gas-Liquid Systems Using the Pets Equation of State. J. Chem. Phys. 149, 064701. doi:10.1063/1.5035495

CrossRef Full Text | Google Scholar

Fike, J., and Alonso, J. (2011). “The Development of Hyper-Dual Numbers for Exact Second-Derivative Calculations,” in 49th AIAA Aerospace Sciences Meeting including the New Horizons Forum and Aerospace Exposition, Orlando, Fl, USA, January 7, 2011. (Reston, VA: AIAA), 886. doi:10.2514/6.2011-886

CrossRef Full Text | Google Scholar

Griewank, A., and Walther, A. (2008). Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation. Chennai, TN: SIAM.

Google Scholar

Gross, J., and Sadowski, G. (2001). Perturbed-chain Saft: An Equation of State Based on a Perturbation Theory for Chain Molecules. Ind. Eng. Chem. Res. 40, 1244–1260. doi:10.1021/ie0003887

CrossRef Full Text | Google Scholar

Heidemann, R. A., and Khalil, A. M. (1980). The Calculation of Critical Points. Aiche J. 26, 769–779. doi:10.1002/aic.690260510

CrossRef Full Text | Google Scholar

Kontogeorgis, G. M., Michelsen, M. L., Folas, G. K., Derawi, S., von Solms, N., and Stenby, E. H. (2006). Ten Years with the Cpa (Cubic-plus-association) Equation of State. Part 1. Pure Compounds and Self-Associating Systems. Ind. Eng. Chem. Res. 45, 4855–4868. doi:10.1021/ie051305v

CrossRef Full Text | Google Scholar

Lafitte, T., Apostolakou, A., Avendaño, C., Galindo, A., Adjiman, C. S., Müller, E. A., et al. (2013). Accurate Statistical Associating Fluid Theory for Chain Molecules Formed from Mie Segments. J. Chem. Phys. 139, 154504. doi:10.1063/1.4819786

CrossRef Full Text | Google Scholar

Martins, J. R. R. A., Sturdza, P., and Alonso, J. J. (2003). The Complex-step Derivative Approximation. ACM Trans. Math. Softw. 29, 245–262. doi:10.1145/838250.838251

CrossRef Full Text | Google Scholar

Michelsen, M. L. (2006). Robust and Efficient Solution Procedures for Association Models. Ind. Eng. Chem. Res. 45, 8449–8453. doi:10.1021/ie060029x

CrossRef Full Text | Google Scholar

Michelsen, M., and Mollerup, J. (2004). Thermodynamic Modelling: Fundamentals and Computational Aspects. Holte, Denmark: Tie-Line Publications.

Google Scholar

Rehner, P., Aasen, A., and Wilhelmsen, Ø. (2019). Tolman Lengths and Rigidity Constants from Free-Energy Functionals-General Expressions and Comparison of Theories. J. Chem. Phys. 151, 244710. doi:10.1063/1.5135288

PubMed Abstract | CrossRef Full Text | Google Scholar

Szirmay-Kalos, L. (2021). Higher Order Automatic Differentiation with Dual Numbers. Period. Polytech. Elec. Eng. Comp. Sci. 65, 1–10. doi:10.3311/PPee.16341

CrossRef Full Text | Google Scholar

Wertheim, M. S. (1984b). Fluids with Highly Directional Attractive Forces. Ii. Thermodynamic Perturbation Theory and Integral Equations. J. Stat. Phys. 35, 35–47. doi:10.1007/BF01017363

CrossRef Full Text | Google Scholar

Wertheim, M. S. (1984a). Fluids with Highly Directional Attractive Forces. I. Statistical Thermodynamics. J. Stat. Phys. 35, 19–34. doi:10.1007/BF01017362

CrossRef Full Text | Google Scholar

Keywords: automatic differentiation, equation of state, thermodynamic models, data structure, numerical methods

Citation: Rehner P and Bauer G (2021) Application of Generalized (Hyper-) Dual Numbers in Equation of State Modeling. Front. Chem. Eng. 3:758090. doi: 10.3389/fceng.2021.758090

Received: 13 August 2021; Accepted: 20 September 2021;
Published: 11 October 2021.

Edited by:

José María Ponce-Ortega, Michoacana University of San Nicolás de Hidalgo, Mexico

Reviewed by:

Mustafa Salti, Mersin University, Turkey
Javier Tovar Facio, National Institute of Ecology and Climate Change, Mexico
Fabricio Nápoles, Michoacana University of San Nicolás de Hidalgo, Mexico
Luis Fernando Lira-Barragán, Michoacana University of San Nicolás de Hidalgo, Mexico

Copyright © 2021 Rehner and Bauer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Philipp Rehner, cmVobmVyQGl0dC51bmktc3R1dHRnYXJ0LmRl

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.