An online learning algorithm for adapting leg stiffness and stride angle for efficient quadruped robot trotting

Aboufazeli, Mahtab; Samare Filsoofi, Ali; Gurney, Jason; Meek, Sanford G.; Mathews, V John

doi:10.3389/frobt.2023.1127898

ORIGINAL RESEARCH article

Front. Robot. AI, 06 April 2023

Sec. Bio-Inspired Robotics

Volume 10 - 2023 | https://doi.org/10.3389/frobt.2023.1127898

This article is part of the Research TopicBio-Inspired Soft LocomotionView all 7 articles

An online learning algorithm for adapting leg stiffness and stride angle for efficient quadruped robot trotting

Jason Gurney²

¹School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, United States
²Department of Mechanical Engineering and the Robotics Center, University of Utah, Salt LakeCity, UT, United States

Animals adjust their leg stiffness and stride angle in response to changing ground conditions and gait parameters, resulting in improved stability and reduced energy consumption. This paper presents an online learning algorithm that attempts to mimic such animal behavior by maximizing energy efficiency on the fly or equivalently, minimizing the cost of transport of legged robots by adaptively changing the leg stiffness and stride angle while the robot is traversing on grounds with unknown characteristics. The algorithm employs an approximate stochastic gradient method to change the parameters in real-time, and has the following advantages: (1) the algorithm is computationally efficient and suitable for real-time operation; (2) it does not require training; (3) it is model-free, implying that precise modeling of the robot is not required for good performance; and (4) the algorithm is generally applicable and can be easily incorporated into a variety of legged robots with adaptable parameters and gaits beyond those implemented in this paper. Results of exhaustive performance assessment through numerical simulations and experiments on an under-actuated quadruped robot with compliant legs are included in the paper. The robot platform used a pneumatic piston in each leg as a variable, passive compliant element. Performance evaluation using simulations and experiments indicated that the algorithm was capable of converging to near-optimal values of the cost of transport for given operating conditions, terrain properties, and gait characteristics with no prior knowledge of the terrain and gait conditions. The simplicity of the algorithm and its demonstrably improved performance make the approach of this paper an excellent candidate for adaptively controlling tunable parameters of compliant, legged robots.

1 Introduction

Mobile robots are employed for a wide range of applications and in various environments. Energy efficiency of mobile robots is a critical issue as they carry their power sources on them, and batteries must comply with space and weight constraints. Unlike wheeled robots, legged robots can mimic animal behavior and adapt to unstructured environments that are inaccessible to wheeled robots.

Animals adjust their leg compliance and gait parameters such as the stride length and stride frequency of their gaits to traverse more efficiently in response to environmental changes and/or their locomotion speeds. Inspired by biology, this paper presents strategies to adaptively reduce the energy consumption of legged robots during gait by continuously adjusting leg stiffness and stride angle. Although we consider only the adaptation of leg stiffness and stride angle, the algorithm can be easily generalized to other parameters of the robot design and of the gaits.

Passive compliance in distal legs of animals improves stability and reduces energy consumption (Alexander, 1984). Animals can store energy at the beginning of the stance phase from touchdown to midstance and release it for use in the rest of the stance phase (Dickinson et al., 2000). Geyer et al. (2006) have demonstrated that a compliant leg model is needed to characterize normal walking and running mechanisms in humans. Human runners increase their leg stiffness on soft grounds and decrease it on hard grounds (Ferris and Farley, 1997; Ferris et al., 1998). Humans also increase their equivalent leg stiffness with increasing running speeds (Kim and Park, 2011) and with increasing stride frequencies (Farley and Gonzalez, 1996). The stride length and stride rate of human runners increase with increasing forward velocity (Elliott and Blanksby, 1979). Animals also increase their stride length proportionately to their forward speeds (Jayes and Alexander, 1978; Barrey et al., 2002). Experimental analyses on humans and other bipedal and quadruped animals have shown that they vary their stride lengths while changing their forward velocities based on the anatomy of their bodies and their legs (Alexander, 2004). It has been argued, based on experimental results, that changing the stride length and stiffness of the legs in various gaits and speeds conserves energy (Elliott and Blanksby, 1979; Umberger and Martin, 2007; Shen and Seipel, 2015).

Inspired by the biological phenomena described above, this paper presents an online learning algorithm to continuously adapt the leg stiffness and the stride length of a quadruped robot in order to reduce its energy consumption during locomotion. These parameters were updated using a block gradient algorithm with the goal of reducing the cost of transport (CoT) (Kottege et al., 2015). CoT is a non-dimensional metric used to compare the energy efficiency of the transportation of different mobile robots and/or animals. CoT is defined as the consumed power per 1 kg weight of the robot/animal to traverse a distance of 1 m.

The leg stiffness of the quadruped robot was changed in real time by varying the pressure in a pneumatic piston located in each distal leg segment, and the stride length was adjusted by manipulating the stride angle. Our approach has many advantages over the state-of-the-art, including: (1) it is computationally efficient and can be easily implemented in real-time; (2) it does not need to train the robot in advance; (3) it is model-free, implying that precise modeling of the robot is not required for good performance; and (4) it can be applied to a variety of legged robots and gaits beyond those we implemented.

The performance of the online learning algorithm was evaluated using a large set of simulations and experiments performed on an under-actuated quadruped platform (Gurney et al., 2023). The results presented in the paper indicate that similar to biological phenomena, our approach adapts the leg stiffness and stride angle appropriately to achieve better energy efficiency in different operating conditions.

The rest of the paper is organized as follows. Section 2 provides a review of previous work related to design and control approaches for improving the performance of robots including those associated with energy efficiency, robustness to disturbances, and the ability to track desired trajectories. Section 3 describes the design of the robot and the online learning algorithm. Performance evaluation of the robot using simulations and experiments is presented in Section 4. Finally, our concluding remarks are provided in Section 5.

2 Related work

A number of approaches to improve the energy efficiency, accuracy of trajectory tracking, robustness to disturbances, and environmental changes of legged robots have been reported in the literature.

One class of approaches was inspired by biological phenomena and targets mechanical design principles. One example involves designing robot legs with compliant actuators such as hydraulic actuators as in HyQ (Boaventura et al., 2012) and SCalf (Yang et al., 2019), and pneumatic pistons used by Mirzana et al. (2018). Employing compliant actuators improves robustness to unstructured environments for a wide range of dynamic locomotion (Boaventura et al., 2012), but cannot reduce energy consumption because such systems do not have a mechanism to store and release energy as passive compliance systems do. Hua et al. (2019) designed a hydraulic servo actuator with passive compliance (HPCA) for SCalf and used a compliance control algorithm based on a virtual model of the robot to improve its energy efficiency. A number of systems employing passive compliance in robot legs have been reported in the literature. Examples include Cheetah-cub (Spröwitz et al., 2013), StarlETH (Hutter et al., 2012), iSprawl (Kim et al., 2006), MIT Cheetah (Seok et al., 2013) ATRIAS (Hubicki et al., 2018), ANYmal (Hutter et al., 2016) and COMAN Kormushev et al. (2019) robots. Passive compliance conserves energy and improves the robot’s robustness to disturbances caused by external forces. Analyses of passive compliance systems have shown that two configurations of springs and actuators, i.e., series elastic actuators (SEA) and parallel elastic actuators (PEA) are more energy efficient compared to rigid actuators (RA), and SEA is more efficient than PEA (Verstraten et al., 2016; Kashiri et al., 2018). The optimum stiffness of the springs in such systems was obtained empirically in many designs (Kim et al., 2006; Spröwitz et al., 2013; Kormushev et al., 2019). However, the optimum stiffness varies with operating conditions. Passive compliance with adjustable stiffness is a solution to conserve energy and tune the stiffness when the operating conditions vary. For example, Koco et al. (2016) designed a variable passive compliance system that enabled a quadruped robot to adjust to grounds with different stiffness assuming that the stiffness of the ground is known. Christie et al. (2019) verified the effect of adjusting the leg stiffness on reducing the CoT by designing a robot with magnetorheological-fluid-centric variable stiffness legs.

The second class of approaches for designing high-performance legged robots involves designing parameters of the robot or gait by employing optimization algorithms to minimize energy consumption, trajectory tracking error, etc., improve the smoothness of the motion and robustness to disturbances, or achieve another appropriate objective. Many of these optimization algorithms were implemented offline, and the resulting parameters of the algorithm were kept frozen during locomotion. Examples of such algorithms include variations of policy gradient methods such as the policy gradient reinforcement learning implemented on Sony Aibo (Kohl and Stone, 2004), the policy gradient learning combined with interpolation between the policies used in Cassie (Xie et al., 2018) and the guided constraint policy gradient method used in ANYmal (Gangapurwala et al., 2020). Other than gradient-based algorithms, methods such as Bayesian optimization approaches as in (Lizotte et al., 2007) and (Calandra et al., 2014) and an archive-based micro genetic algorithm implemented on the hexapod robot in (Zhai et al., 2020) were employed to optimize the desired objective functions. Jin et al. (2013) applied a non-linear quadratic optimization approach to minimize the energy cost of a compliant hexapod robot by finding the optimal velocity and duty factor. They also showed via simulations that the values of duty factor, velocity, and stride angle vary for various weights and payloads. Unlike these approaches, our algorithm was implemented online and the parameters can be adapted in real time.

Some of the recent optimization approaches for controlling robot locomotion were designed to respond adaptively to environmental changes, unknown initial conditions, and/or damage. Examples include stiffness control of a simulated model of a passive bipedal robot using a double-deep Q network presented in (Wu et al., 2019) and intelligent trial and error algorithm for robots to adapt to damage presented in (Cully et al., 2015). Both algorithms require offline training. Arena et al. (2021) implemented an adaptive controller based on the FitzHugh-Nagumo neuron to reduce the energy consumption of a model of Mini MIT cheetah while the robot walked on uneven grounds. They found the optimal values of the parameters of the trajectory for five different values of slopes and employed a state machine to update the parameters of the trajectory with the optimal parameters of the closest slope. Even though their method was implemented online, they only considered five distinct slopes for one specific forward velocity, and the optimal parameters were not computed for other values of forward velocities and slopes.

Various adaptive optimization methods have been implemented on robot arm manipulators to either follow desired trajectories or identify the plant models online. These methods include fuzzy neural networks (He and Dong, 2017), adaptive fuzzy full state feedback control (Yu et al., 2020), and adaptive impedance control (Li et al., 2016). Zhou et al. employed recurrent neural networks (RNN) to control the compliance of redundant robot manipulators (Zhou et al., 2020a) and an RNN combined with an adaptive online identifier to learn the kinematic parameters of the manipulator (Zhou et al., 2020b). However, the number of state variables in multi-legged robots is much greater than in a robot arm manipulator. This increases the computational complexity of the algorithms and makes them challenging to implement in real-time.

The third class of approaches to improve the performance of the robots involve creating efficient locomotion using appropriate trajectory planning algorithms. For example, Kormushev et al. (2019) employed a reinforcement learning approach that evolved policy parametrization dynamically. Yu et al. (2018) generated symmetric trajectories using a curriculum learning method. Ordonez-Apraez et al. (2022) presented a maximum entropy reinforcement learning architecture to derive the optimal policy that minimized the energy of locomotion of two robots, one bipedal and one quadruped, in a simulation environment. Liu et al. (2015) used a particle swarm optimization method and Koco et al. (2014) employed a genetic multi-objective optimization algorithm and a pattern search method used in SCalf (Yang et al., 2019) to determine the robot’s foot trajectory. All these algorithms were implemented offline. In SoftLegs (Gasparri et al., 2018), a numerical optimization algorithm was employed to generate a dataset of optimized trajectories offline, and new trajectories were generated online by synthesizing the trajectories using the dataset. Xin et al. (2020) optimized the torque trajectory of ANYmal in the presence of disturbances on an uphill ramp and on grounds with low friction. Their controller combined a cartesian impedance control algorithm with quadratic programming under certain stability constraints and torque limits. The resulting algorithm was computationally efficient and implemented online for static walking but there was no report of implementing the system for dynamic gaits such as trotting. A combination of model-based control and model-free reinforcement learning framework was introduced in (Da et al., 2021) to adaptively learn contact sequences. However, this framework requires training on a simulated model of the robot and the authors only presented the results of implementing their algorithm on a real robot at low speeds.

The algorithm of this paper updates the parameters of the robot design and angular trajectory of the hip/shoulder joints with the goal of reducing the CoT for dynamic gaits. Unlike a number of previous algorithms that employed offline optimization methods and/or required a large training data set and hours of training on either the real robot or a model of the robot, our algorithm can be easily implemented online, and in real-time. A drawback of training a robot and using the trained parameters without adjustments during normal operations is that the performance of the robot will, in general, degrade over time. Performance degradation can arise from changes in the operating conditions as well as changes in the robot during operation. Our approach mitigates all these problems.

3 Methodology

This Section describes the robot platform and the online learning algorithm designed to reduce energy consumption online and in real-time.

3.1 Robot platform

The online learning algorithm was implemented and tested on an under-actuated quadruped platform, UPed, designed and built at the University of Utah Gurney et al. (2023). The left panel of Figure 1 displays a picture of the robot. A schematic of one of the robot’s legs is shown separately in the right panel.

FIGURE 1

FIGURE 1. Left: A side view of UPed Gurney et al. (2023). Right: A schematic of one of the robot’s legs designed in solid works. Used with permission of ASME, from [UPed: A Quadruped Robot Platform to Study Directional Leg Compliance, Gurney and Filsoofi et al., vol 15, no 1, 2023]; permission conveyed through Copyright Clearance Center, Inc.

The leg design involved an elbow back and knee forward configuration, with passive compliance at the knee/elbow joints and the distal limb Lee and Meek (2005). Previous work Meek et al. (2008) has shown that this configuration reduces pitch rate compared to the elbow forward and/or knee backward designs. Lower pitch rate improves the inherent pitching stability of the robot’s gait. The robot was restricted to planar motion because each of its legs was actuated in the plane perpendicular to the main body by a DC motor located in the hip/shoulder joint. The proximal and the distal segments of the leg were connected together with a passive torsional spring. The proximal limb was a solid bar; however, the distal limb was composed of a linear translational spring in parallel with a pneumatic piston. Details of the robot design are provided in Gurney et al. (2023).

During the stance phase of each leg, a valve is closed preventing air flow in the cylinder, making it act like a passive spring. The air pressure, P_td, in the pneumatic piston adapts and controls the stiffness of the leg. Consequently, we can use the piston as a compliant element with variable stiffness, and adjust its stiffness in real time by varying the air pressure in the piston. Using the ideal gas law, the force exerted by the parallel combination of the pneumatic piston and the linear spring during stance phase is related to their displacement as

F = \frac{P_{t d} A L_{0}}{(L_{0} - x)} - P_{a t m} A + K_{l i n} x (1)

where L₀ and P_td represent the length and the pressure, respectively, of the pneumatic piston at the time of touchdown, the parameters A, P_atm and x denote the area of the piston, the ambient pressure and the displacement of the piston with respect to L₀, respectively, and K_lin represents the stiffness of the linear spring. The stiffness of the passive compliance is the derivative of the force F with respect to the displacement x, and is given by

K = \frac{d F}{d x} = \frac{P_{t d} A L_{0}}{{(L_{0} - x)}^{2}} + K_{l i n} (2)

The adaptive parameters of the robot are the stiffness of the legs and the stride angle of its gait. Based on the results in (1) and 2, we adapt the pressure of the pneumatic spring to control the stiffness of the passive compliance of the leg. The stride angle r is defined as the maximum deviation of the hip/shoulder angles from the static stance phase.

Our robot was restricted for planar motion in the longitudinal direction. As a result we can make inferences about the stability of the robot by analyzing its pitching motion. (Meek et al., 2008). demonstrated that the elbow-backward, knee-forward configuration results in lower pitch oscillations than three other configurations. In this work, we empirically swept through stride angle and touchdown pressure and selected a subset of all possible parameter ranges in which the robot’s pitching motion was bounded. During deployment, the adaptation of the parameters were restricted to this subset, allowing the robot to operate in a stable manner. Figure 2 displays the plot of pitch angle-pitch rate for a simulated model of the robot n MATLAB-Simulink using the Simscape Multibody library Simulink Documentation (2018) for stride angle r = 14° and touchdown pressure P_td = 3 × 10⁵ Pa which belonged to the parameter set that produced a stable gait. Since the trajectory of the pitch rate fits inside a circle with bounded radius, this periodic gait is asymptoticly stable. Furthermore, (Usherwood and Granatosky, 2020), demonstrated that the knee-forward and elbow-backward configuration reduces joint and limb work and pitch moment. This indicates a direct relationship between energy consumption and the pitch accelerations of biological systems. In addition, pitch rate and CoT had similar trends in the experiments performed on UPed (Gurney et al., 2023) for various conditions. Our algorithm adaptively reduces the energy consumption of the robot. This leads to reduced pitch acceleration and improved the stability of pitching motion of the robot.

FIGURE 2

FIGURE 2. Plot of pitch rate vs. pitch angle for r = 14°, P_td = 3 × 10⁵ Pa and angular velocity ω = 90 deg/s. The black triangle displays the beginning of the cycle.

3.2 Adaptation of leg stiffness and stride angle

Although easily generalized, we developed an online learning algorithm for adapting the stride angle of the robot’s gait and the stiffness of the passive compliance of its legs, with the objective of reducing the average value of the consumed energy at each time interval since our main target is to minimize the statistical expectation of a positive power of the CoT. For simplicity of implementation, the CoT is computed over an integer N_b number of strides. In the rest of the paper, we denote N_b strides as one block. Assuming that the robot has already traversed the (m − 1)^th block, our objective is to reduce a positive power of the CoT while the robot is traversing the mth block, i.e., reduce

E \{{CoT}^{d} (m; \underset{̲}{U} (m), \underset{̲}{θ})\} (3)

where E{.} is the statistical expectation of ⋅, $\underset{̲}{θ} = {[r, P_{t d}]}^{T}$ is the vector of optimization parameters consisting of the stride angle r and touchdown pressure P_td and $\underset{̲}{U} (m) = {[\underset{̲}{U} {[T (m)]}^{T}, \dots, \underset{̲}{U} {[T (m + 1) - 1]}^{T}]}^{T}$ represents the augmented vector of all control signals for block m. Let $\underset{̲}{U} [n] = {[I_{1} [n], I_{2} (n), I_{3} (n), I_{4} (n)]}^{T}$ be the vector of control signals at time n where I_j(n) is the current drawn by the jth motor at time n. Let block m start at time T(m). The CoT for block m is defined as Kottege et al. (2015).

CoT (m) = \frac{P_{a v g} (m)}{M g V (m)} (4)

where P_avg(m) is the sum of the average power consumed by all the motors of the robot while the robot is traversing the interval of block m, V(m) is the robot’s average linear forward velocity in the interval, g is the acceleration due to the gravity, and M is the total mass of the robot. The average power consumed by the motor that actuates the ith leg for block m may be estimated as

P_{i} (m) = \frac{1}{T (m + 1) - T (m)} \sum_{n = T (m)}^{T (m + 1) - 1} \{R I_{i} {(n)}^{2} + | ω_{i} (n) R_{G} k_{t} I_{i} (n) |\} Δ t (n) (5)

where R, k_t, and R_G represent the motor coil resistance, torque constant, and gearbox ratio, respectively, of the motors actuating the shoulder/hip angle movements, ω_i is the hip/shoulder angular velocity of the ith leg (after the gearbox), I_i is the current consumed by the ith motor, and Δt(n) is the sampling period for the measurements. In (5), it is assumed that all four motors have the same values of R, k_t, and R_G.

We employed a stochastic gradient adaptation method to update the optimization parameters online and in real-time. The adaptive algorithm does not assume knowledge of the functional relationship between the parameters of the robot and the CoT. Instead, approximations of the gradients of $CoT (m; \underset{̲}{θ})$ with respect to the adaptive parameters are calculated numerically. The basic online learning strategy, then, is to employ update equations of the form:

r (m) = r (m - m_{r}) - η_{r} CoT (m - m_{r}) f_{r} (\hat{\nabla_{r}} CoT (m - m_{r})) (6)

and

P_{t d} (m) = P_{t d} (m - m_{P}) - η_{P} CoT (m - m_{P}) f_{P} (\hat{\nabla_{P}} CoT (m - m_{P})) (7)

where f_r and f_P are non-decreasing functions that are bounded above and below by finite numbers to avoid large changes of the adaptive parameters that are updated. The parameters m_r and m_P represent the number of blocks between successive updates of r and P_td, respectively, and $\hat{\nabla_{r}} μ (m)$ and $\hat{\nabla_{P}} μ (m)$ denote the numerically calculated gradient of the mean of the CoT with respect to r and P_td, respectively at the mth block. For every successive block, the gradient is calculated only for the parameters that are updated. Let r be the parameter that updates at block m. The gradient at block m is estimated as the slope of the CoT with respect to r as

\hat{\nabla_{r}} CoT (m) = \frac{CoT (m) - CoT (m - 1)}{r (m) - r (m - 1)} (8)

Similarly, if P_td is updated at block m the gradient is estimated as

\hat{\nabla_{P}} CoT (m) = \frac{CoT (m) - CoT (m - 1)}{P_{t d} (m) - P_{t d} (m - 1)} (9)

Although other more accurate approximations for the gradients are possible, we chose the definitions in (8) and 9 for computational simplicity. In this paper, the adaptive parameters are updated sequentially i.e., only one parameter is updated during each block. Simultaneous update of the parameters causes a 100% correlation between the update directions of the parameters which constrains the changes of the parameters to be always in one direction. We avoid this problem by updating one of the two parameters during each update cycle and keeping the value of the other parameter the same as it was in the previous block. For the sequential block-wise stochastic gradient descent algorithm described above, we used hyperbolic tangent functions for f_r and f_P in (6) and 7. That is,

f_{r} (\hat{\nabla_{r}} CoT (m)) = \tanh (S_{r} \hat{\nabla_{r}} CoT (m)) (10)

and

f_{P} (\hat{\nabla_{P_{t d}}} CoT (m)) = \tanh (S_{P} \hat{\nabla_{P_{t d}}} CoT (m)) (11)

Here, S_r and S_P denote constant scaling factors of the numerically calculated gradients of r and P_td, respectively, and control how quickly the incremental update values saturate. The parameters η_r, η_P, S_r and S_P are constant values and are chosen empirically. The update equations are tabulated in Table 1 for the case when m_r = m_P = 2.

TABLE 1

TABLE 1. Block-wise stochastic gradient descent algorithm.

For the experiments and simulations in this paper, we chose d = 2 in (3). For this case, we can interpret the update strategy to be using a time-varying learning rate given by η_r(P)CoT (n − 2). Therefore, the equivalent learning rate compared to the vanilla stochastic gradient descent algorithm is η_r(P)CoT (n − 2).

4 Results

A large number of simulations and experiments were performed to assess the capabilities of the online learning algorithm. Representative results presented here will demonstrate that the algorithm is capable of adapting the parameters of the robot and tracking time-varying operating environments online and achieving near-optimal performance.

4.1 Simulations

4.1.1 Simulation setup

A model of the quadruped robot described in Section 3.1 was built in MATLAB-Simulink using the Simscape Multibody library Simulink Documentation (2018). The foot-ground contacts were modeled as point contacts. The ground at contact points was modeled as a parallel spring-damper system in each dimension Silva et al. (2013). Uneven grounds were built by perturbing the stiffness, damping coefficients, and the longitudinal slope of the ground by adding a zero-mean pseudo-Gaussian noise sequence to the nominal values of these parameters. The additive noise sequence was modeled as the output of a single-pole lowpass filter, computed recursively as

x_{η} (n + 1) = α x_{η} (n) + \sqrt{(1 - α^{2})} ζ (n) (12)

Here, ζ(n) and the initial value x_η(1) were chosen as independent and identically distributed (i.i.d) Gaussian variables with zero mean value and standard deviation equal to the standard deviation of the ground parameter. The parameter α in (12) belonged to the interval [0,1] and defined the spectrum of the added noise. Values of α closer to one resulted in smoother perturbations than otherwise.

For all the simulations described here, α was chosen to be 0.95 for all noise sequences. The standard deviation of the noise added to the ground stiffness and damping coefficients were 10% of their nominal values. The standard deviation of the longitudinal slope regardless of its mean value was equal to 0.5° and the transverse slope was equal to zero. A zero-mean i.i.d Gaussian noise sequence was added to the measured CoT with a standard deviation of 0.005 regardless of the value of the CoT which produced, for the simulation conditions, a signal-to-noise ratio in the range of 32 dB–38 dB.

The algorithm was implemented for various hip/shoulder angular velocities and nominal values of the ground parameters, tabulated in Table 2. In all simulation setups summarized in Table 2, the nominal value of the ground damping coefficient in all directions was chosen to be 100 N/m. The online learning algorithm updated optimization parameters r and P_td in the beginning of each block which is equal to N_b = 5 strides. Unless otherwise stated, all the simulations involved the robot trotting for 300 blocks. The actual duration of each simulation varied because the time of travel for each block depended on the hip/shoulder angular velocity ω and the stride angle r, and the stride angle was adapted in these simulations. The results presented are averages computed over 50 independent simulations, run for the same operating conditions. The initial choices of r and P_td were 12° and 4 × 10⁵ Pa, respectively, in all simulations involving online parameter learning. These choices resulted in the stable operation of the robot in all the simulation conditions explored here.

TABLE 2

TABLE 2. Simulation setup, optimum parameter values, and steady-state statistics of the online learning algorithm.

4.1.2 Performance benchmarks

To establish the performance benchmarks, we determined the optimal values of r and P_td that minimized the CoT for all the conditions simulated. For each operating condition considered, we simulated the robot for 3 minutes for each value of the stride angle between 10° and 25° in steps of 0.25^◦, and touchdown pressure between 2 × 10⁵ and 10 × 10⁵ Pa in steps of 0.25 × 10⁵ Pa. For each set of parameters, the average of the CoT values was calculated over the 3-min run. The optimal values of CoT and the two parameters are tabulated in Table 2. Similar ranges of stride angle and touchdown pressure were used in simulations involving online learning. Lower touchdown pressures increased the pitching motion of the robot excessively and resulted in unstable or close to unstable behavior. The reason is that at lower touchdown pressures, or equivalently, smaller values of leg stiffness, the force exerted by the legs may not be sufficient to provide the demanded momentum for the robot’s motion. Therefore, the minimum allowable touchdown pressure increases by increasing the forward velocity. For an angular velocity of 140 deg/s, pressure values below 2.5 × 10⁵ Pa were not simulated for the same reason. P_td greater than 10 × 10⁵ Pa was not feasible in the robot platform, and therefore not simulated.

A heat map of CoT values associated with the robot trotting on the ground with the slope of 0° ± 0.5° and stiffness of 100 ± 10 kN/m with ω = 105 deg/s is displayed as a function of touchdown pressure and stride angle in Figure 3. There were three local minima for this specific simulation setup corresponding to CoT values of 0.2513 when (r, P_td) = (15, 2.5 × 10⁵), 0.2593 when (r, P_td) = (20.75, 2.5 × 10⁵) and 0.3269 when (r, P_td) = (16.5, 6.25 × 10⁵), where the units of r and P_td are degree and Pascal, respectively. The online learning algorithm can converge to any of these local minima. Exhaustive analysis of simulation results has suggested that initialization as done in our simulations may result in convergence to parameter values close to the global optimum. However, additional work is still needed to guarantee convergence to the global minima of the performance surface.

FIGURE 3

FIGURE 3. Heat map of CoT values for the robot trotting with ω = 105 deg/s on the ground with the stiffness of 100 ± 10 kN/m and the slope of 0 ± 0.5°. CoT values greater than 0.6 are displayed in white color to provide sufficient contrast for values closer to the optimal values.

We can observe in Table 2 and Figure 4 that the optimal values of the CoT and the optimization parameters vary with the angular velocities of the hip/shoulder joint, ground stiffness, and longitudinal slope. Variations in the optimal values of the stride angle and leg stiffness have been reported for humans and animals for different ground conditions and forward speeds (Jayes and Alexander, 1978; Elliott and Blanksby, 1979; Farley and Gonzalez, 1996; Barrey et al., 2002; Umberger and Martin, 2007; Kim and Park, 2011; Spröwitz et al., 2013; Lussiana et al., 2015; Shen and Seipel, 2015). The trends in how changing the operating conditions such as forward velocity and ground stiffness vary the stride angle and the leg stiffness of biological models, generally comply with our observations for the simulated robot. For example, the optimum pressure is higher for faster trotting on flat ground because stronger force is needed to evoke higher velocities (Spröwitz et al., 2013). Similarly, a direct correlation between the forward speed of human runners and their leg stiffness was reported by Kim and Park (2011). The optimum stride angle of the simulated robot is higher for higher velocities similar to the results reported in (Jayes and Alexander, 1978; Elliott and Blanksby, 1979; Barrey et al., 2002) for humans and animals. The optimum touchdown pressure of the simulated robot’s leg increases on softer grounds in agreement with the biophysical data reported in (Ferris and Farley, 1997; Ferris et al., 1998). The authors of (Ferris et al., 1998) have argued that the equivalent stiffness of the leg and the ground should be constant in order to keep the same gait dynamics on various grounds. Similarly, the optimal values of touchdown pressure of the simulated robot were higher for softer grounds. There are conflicting reports of how the stride length and leg stiffness change in animals when the slope of the ground changes (Telhan et al., 2010; Lussiana et al., 2015; García-Pinillos et al., 2019). As a result, it is not possible to make a meaningful comparison of how the leg parameters of our robot and animals change on sloped terrains.

FIGURE 4

FIGURE 4. Top three plots display the optimum values of stride angle, touchdown pressure and CoT when the robot trots on uneven grounds with K_gnd = 100 kN/m and mean slope of zero with various hip/shoulder angular velocity. Bottom three plots display the optimum values of stride angle, touchdown pressure and CoT when the robot trots with ω = 70 deg/s on uneven grounds with various ground stiffness and mean slope of zero.

4.1.3 Online learning simulations with various operating conditions

The algorithm was simulated for different choices of ground conditions and trotting speeds as tabulated in the first three columns of Table 2. The hip/shoulder angular velocities of ω = 140 deg/s, ω = 105 deg/s and ω = 70 deg/s, respectively, corresponded to average forward velocities of 0.65 ± 0.01 m/s, 0.44 ± 0.06 m/s and 0.30 ± 0.04 m/s, respectively. The variation in forward velocities in a condition with constant angular velocity is caused by varying the stride angle and touchdown pressure. The positive and negative signs of the ground slopes represent uphill and downhill slopes, respectively. The nominal values of the ground stiffness were chosen to be 100 kN/m (tiled floor Bosworth et al. (2016)) or 20 kN/m (mulch layer Bosworth et al. (2016)), respectively.

Table 2 also summarizes the steady-state statistics (mean ± STD) of the CoT and the adaptive parameters in these simulations. The calculations assumed that the last 100 blocks represented the steady-state and the mean results reported are the averages over the last 100 blocks and 50 independent runs. The steady-state variance of each run was calculated separately over the last 100 blocks and the square-root of the mean value of the variances are reported as the standard deviation for each case. The values of hyper-parameters i.e., η_r, η_P, S_r and S_P were chosen empirically by searching over a wide range of parameters. The values of η_r and η_P were empirically selected to be $\frac{0.6}{C o T (1)}$ degree for the ω = 140 deg/s and $\frac{0.3}{C o T (1)} \times 1 0^{5}$ Pa, respectively and to be $\frac{0.3}{C o T (1)}$ deg and $\frac{0.15}{C o T (1)} \times 1 0^{5}$ Pa, respectively, for ω = 105 deg/s and ω = 70 deg/s regardless of ground parameters. S_r and S_P were chosen to be equal to 25 regardless of the values of the angular velocity and the ground parameters. The last column of Table 2 shows the ratio of the difference between the steady-state and the optimal values of the CoT to the optimal values of the CoT for each case. The maximum deviation from the optimal value of the CoT in any of these simulations was 8.7%.

The evolution of the average CoT computed by averaging over 50 independent simulations during each block for the simulations on uneven ground with a mean slope of 0, nominal ground stiffness of 100 kN/m, and all three angular velocities evaluated are displayed with solid lines in Figure 5. The dotted lines display the empirical optimum values of the CoT of the plots with the same colors. The shaded area surrounding each curve corresponds to the standard deviation of the CoT at each point.

FIGURE 5

FIGURE 5. CoT of the robot trotting on uneven grounds with the mean slope of zero and three different angular velocities. The dotted lines display the optimal values corresponding to the curve with the same color. The duration of simulations is different for the three cases because the angular velocities are different for the three different cases, and the stride angle of the robot evolves differently for each operating condition.

Similar to the results of the experiments on human subjects running on flat, uphill and downhill slopes reported in Kim and Park (2011); Farley and Gonzalez (1996); Elliott and Blanksby (1979); Barrey et al. (2002); Jayes and Alexander (1978); Shen and Seipel (2015); Umberger and Martin (2007); Lussiana et al. (2015); Spröwitz et al. (2013), our online learning algorithm converged to different values of stride angle and touchdown pressure on different longitudinal slopes and velocities. Similar to biological models that adapt their leg stiffness on grounds with different surface stiffness Ferris et al. (1998); Ferris and Farley (1997), our algorithm converged to different values of touchdown pressure for different ground types. Although some of the minima of the performance surface may not match the observations on human and animal behavior in prior research, when the robot parameters and the CoT values converged to the global minimum, the resulting evolution of the leg parameters was similar to how animals and humans change leg stiffness and stride angle during gait.

4.1.4 Robot behavior during abrupt changes in forward speed

This set of simulations were designed to investigate the effect of the abrupt changes in the angular velocity of the hip/shoulder joints while the robot was trotting and the ability of the algorithm to converge to near-optimal CoT values after the change. The ground stiffness for these simulations was chosen to be 100 ± 10 kN/m and the longitudinal slope of the ground was 0 ± 0.5°. The angular velocity changed from 105 deg/s to 70 deg/s halfway through each run for one set of simulations, and vice versa (70 deg/s to 105 deg/s) for another set. Each simulation contained 600 blocks. Figure 6 displays the average CoT values of 50 independent simulations with solid lines and their standard deviations with the shaded area for each case. The learning rates were set the same as in the simulation setups in section 4.1.3; Table 3 reports the statistics of the last 100 blocks before the abrupt change of the angular velocity (first row of each set) and the final 100 blocks for each case (second row of each set). According to Table 3, the algorithm could converge to the values within 6.67% and 3.5% of the optimal values after abruptly reducing the angular velocity (set 1) and increasing it (set 2), respectively.

FIGURE 6

FIGURE 6. CoT of the robot trotting on uneven ground with the slope of 0 ± 0.5° and ground stiffness of 100 ± 10 kN/m when ω changed from 105 deg/s to 70 deg/s and vice versa abruptly after 300 blocks. The dashed and dotted lines represent the optimal values for ω = 105 deg/s and ω = 70 deg/s, respectively.

TABLE 3

TABLE 3. Steady-state statistics of Figure 6.

4.2 Experiments on a robot platform

4.2.1 Setup

A number of experiments were conducted using the robot of Figure 1 trotting on a treadmill. These experiments were performed to show the effectiveness of the online learning algorithm on a real platform. The robot used for these experiments is the UPed Gurney et al. (2023), an under-actuated quadruped robot platform explained in Section 3.1. The robot used PID controllers to drive the motors in the hip/shoulder joints, and generate sufficient torque to track desired angular trajectories. The PID controller as well as the online learning algorithm were loaded onto a dSpace MicroLabbox (Paderborn, Germany), a control prototyping system, which provided all motor and pneumatic control to the robot, and set the gait and compliance parameter values.

4.2.2 Performance benchmarks

The CoT of the UPed was calculated for the robot trotting with the angular velocity of 105 deg/s and 70 deg/s on a treadmill with longitudinal slopes of 0 and −6° with the stride angle between 12° and 19.5° in steps of 1° and touchdown pressure between 2 × 10⁵ and 7 × 10⁵ Pa in steps of 0.5 × 10⁵ Pa. Figure 7 displays CoT values, as a function of r and P_td, of the robot trotting on a flat treadmill with ω = 105 deg/s. There are two local minima for this case corresponding to CoT values of 0.3090 when (r, P_td) = (14.5, 7) and 0.3274 when (r, P_td) = (16.5, 2). Depending on the initial conditions, the algorithm could converge to either of them. Similar measurements for the other 3 cases were performed and the corresponding global and local minima are reported in Table 4.

FIGURE 7

FIGURE 7. CoT values of the robot trotting on flat ground with ω = 105 deg/s as a function of P_td and r. The colored squares show the end point of each run. All runs started from the black triangle.

TABLE 4

TABLE 4. Experimental setups, and empirically measured values and parameters of the global and local minima.

As expected, and similar to the simulation results, Table 4 indicates that the optimal CoT values are different for different forward velocities and longitudinal slopes. Further, similar to biological models, our robot also adapts its stride angle and touchdown pressure to improve its energy efficiency in different ground conditions and operating environments. For example, similar to what has been observed in humans Kim and Park (2011); Elliott and Blanksby (1979), analysis of the experimental results indicate that the optimal values of the stride angle and the stiffness of the leg of the robot are also higher in higher velocities.

4.2.3 Online learning algorithm on the robot platform

Our online learning algorithm was implemented on the UPed in operating conditions tabulated in Table 4. For each experimental setup, 10 independent runs were performed. The initial conditions were chosen to be far from the optimal values. On flat ground, the initial conditions for ω = 70 deg/s and ω = 105 deg/s were equal to (r, P_td) = (17.5, 3) and (r, P_td) = (14, 4.5), respectively. On the downhill slope, the initial conditions for ω = 70 deg/s and ω = 105 deg/s were equal to (r, P_td) = (15, 4) and (r, P_td) = (18, 6.5), respectively. The learning rates for all cases were set to $μ_{r} = \frac{0.3}{C o T (1)}$ degrees and $μ_{P} = \frac{0.15}{C o T (1)} \times 1 0^{5}$ Pa. The length of each run was approximately 600 s, but the number of blocks varied from run to run because of the variations in the stride angle due to adaptation and various hip/shoulder angular velocities.

Because the performance surface is multi-modal, the algorithm is only guaranteed to converge to one of the local minima of the surface. Table 5 summarizes the average and standard deviation values of the measured CoT, stride angle and touchdown pressure, computed over the last 60 blocks of each run, and the number of runs N_r that was determined to have converged to each local minimum. For the case of flat terrain and ω = 105 deg/s, Figure 7 displays the end points of each of the ten runs (averaged over the last 60 blocks.) Six of the ten runs converged (or is close to convergence) to the global minimum, and three converged to the local minimum of the CoT surface. Analysis of the trends of the parameters of the remaining run suggested that, at the end of the run, the stride angle was reducing and the touchdown pressure was increasing, indicating that the system was traversing toward the global minimum, but may have needed a much longer run time to get there. The cost of transport associated with the global and local minima were relatively close to each other, and the online learning algorithm reduced the CoT from its initial value of 0.59 by more than 45% in all nine cases that converged to one of the two minimum locations. Even the single run that had not converged, resulted in a substantial reduction of the CoT value from 0.58 to 0.43, indicating that our approach is capable of achieving significant improvement in energy efficiency.

TABLE 5

TABLE 5. Experimental setups and steady-state statistics.

Figure 8 displays the evolution of the CoT for all individual runs for the experimental setup with ω = 105 deg/s and zero slope. Each run, depicted in a specific color, started from the black triangle in Figure 7 and ended at the square with the same color as the trajectory shown in Figure 8. As explained earlier, the run shown in yellow in the two figures had not converged at the end of the run, but still showed a substantial reduction in the cost of transport. The cost of transport of the run shown in red displayed somewhat erratic behavior in the interval between 200 and 400 s Of the run. During this time, the parameters wandered between two possible convergence locations. However, the system could escape this “unstable” region and converge to near-optimal value by the end of the run.

FIGURE 8

FIGURE 8. CoT of the robot trotting on a flat treadmill with the angular velocity of 105 deg/s for 10 independent runs.

According to Table 4, the CoT values at the local minimum of ω = 105 deg/s and zero slope was very close to the global minimum. Consequently, the CoT values are near-optimal values even when the online learning algorithm converges to the local minimum rather than the global minimum. In the angular velocity of 70 deg/s, the algorithm converged to the other local minima in some runs which have higher CoT values compared to the global minima. However, the algorithm could still reduce the energy consumption by as much as 50% compared to the initial conditions.

In most cases, the trends of the optimum values of adaptive parameters observed for both UPed and its simulated model were similar. For example, the optimum values of stride angle, touchdown pressure and CoT increase with increasing forward velocity. However, the CoT values of UPed were greater compared to its model in similar operating conditions and the optimal values of stride angle and the touchdown pressure were different for the actual robot and the simulated robot. One reason for the mismatch between the actual robot and the simulated model of it is that estimating the parameters of the robot precisely is not a trivial task. Even with precise estimation of system parameters, they vary over time. In spite of these differences, we can observe from Tables 2, 4, 5 that both simulation and the actual robot have the same order of magnitude for the CoT values. Therefore, we tuned the hyper-parameters of the simulation model by searching over a wide range of parameters and employed similar values of hyper-parameters for the experiments on UPed.

5 Concluding remarks

An online learning algorithm based on gradient descent was presented in this paper to adaptively reduce the energy consumption of a quadruped robot trotting on grounds with unknown characteristics. This algorithm updated the stride angle and leg stiffness in real-time while the robot was trotting. The online learning algorithm was implemented on a simulated model of an under-actuated quadruped robot and also experimentally evaluated using the UPed platform. The algorithm was computationally efficient for implementation in real-time. Different operating conditions associated with different hip/shoulder angular velocities, ground characteristics, and longitudinal slopes were explored. Performance evaluation using simulations and experiments suggested that the online learning algorithm presented in this paper was capable of converging to near-optimal values of the cost of transport for given operating conditions, terrain properties, and gait characteristics.

The approach presented in this paper is a general framework that works without training and can be implemented on any legged robots with adaptive parameters and for any gait, and can also be generalized to optimize more than two parameters. Examples of other such parameters include knee/elbow stiffness, knee/elbow rest angle and asymmetry in the stiffness of the forelegs and the rear legs. Analyses of a model of a horse galloping in a simulation environment (Herr and McMahon, 2001) and the data taken from actual horses (Heglund and Taylor, 1988) have showed that the optimum values of the stride angle and the stride frequency that minimized the CoT vary with forward velocity. Additionally, Wei et al. (2015) demonstrated with a simulated model of a galloping quadruped robot that the leg stiffness should be adjusted for different forward velocities to reduce the CoT. The online learning algorithm presented in this paper can be implemented on fast gaits such as galloping to reduce the CoT of legged robots by updating changeable parameters of their legs. The hyper-parameters of the update equations should be tuned appropriately when applied to other robots and gait conditions. The algorithm can also be modified by using different functions for f_r, f_P, and different values of d in (6) and 7. A simplified model using d = 1, and f_r and f_P as signum functions was presented in Aboufazeli (2018) on a model of a quadruped robot similar to the one presented in this paper.

The results of the online learning algorithm showed that adaptively updating the leg stiffness and the stride angle during locomotion reduces the CoT in various ground conditions and forward velocities. Several observations reported in Kim and Park (2011); Farley and Gonzalez (1996); Elliott and Blanksby (1979); Barrey et al. (2002); Jayes and Alexander (1978); Shen and Seipel (2015); Umberger and Martin (2007); Lussiana et al. (2015); García-Pinillos et al. (2019); Telhan et al. (2010) show that animals and humans adjust their leg stiffness and stride angle in various ground conditions and forward velocities. Our results in both simulations and experiments agreed with these observations and demonstrated the usefulness of real-time updates of the stride angle and leg stiffness in legged robots.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.

Author contributions

MA developed the algorithm, performed simulations, contributed to the writing of all sections of the paper. AS contributed to the modeling of the robot for simulations, assisted in developing the algorithm, and contributed to performing experiments, and writing parts of the paper. JG contributed in performing experiments and writing parts of the paper. SM oversaw the experiments and helped revise the paper. VM oversaw the algorithm and helped revise the paper.

Funding

This work was supported in part by the National Science Foundation under Grant Number 1427422.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Aboufazeli, M. (2018). Energy efficient quadruped robot trotting via adaptation of leg parameters over unknown ground. M.S. Thesis. Corvallis, Oregon, USA: School of Electrical Engineering and Computer Science, Oregon State University.

Google Scholar

Alexander, R. M. (1984). Elastic energy stores in running vertebrates. Am. Zoologist 24, 85–94. doi:10.1093/icb/24.1.85

CrossRef Full Text | Google Scholar

Alexander, R. M. (2004). Bipedal animals, and their differences from humans. J. Anat. 204, 321–330. doi:10.1111/j.0021-8782.2004.00289.x

PubMed Abstract | CrossRef Full Text | Google Scholar

Arena, P., Patanè, L., and Taffara, S. (2021). Energy efficiency of a quadruped robot with neuro-inspired control in complex environments. Energies 14, 433. doi:10.3390/en14020433

CrossRef Full Text | Google Scholar

Barrey, E., Desliens, F., Poirel, D., Biau, S., Lemaire, S., Rivero, J.-L., et al. (2002). Early evaluation of dressage ability in different breeds. Equine Veterinary J. 34, 319–324. doi:10.1111/j.2042-3306.2002.tb05440.x

PubMed Abstract | CrossRef Full Text | Google Scholar

Boaventura, T., Semini, C., Buchli, J., Frigerio, M., Focchi, M., and Caldwell, D. G. (2012). “Dynamic torque control of a hydraulic quadruped robot,” in 2012 IEEE International Conference on Robotics and Automation, Saint Paul, MN, USA, 14-18 May 2012, 1889–1894.

CrossRef Full Text | Google Scholar

Bosworth, W., Whitney, J., Kim, S., and Hogan, N. (2016). “Robot locomotion on hard and soft ground: Measuring stability and ground properties in-situ,” in 2016 IEEE international conference on Robotics and automation (ICRA), 3582–3589.

CrossRef Full Text | Google Scholar

Calandra, R., Gopalan, N., Seyfarth, A., Peters, J., and Deisenroth, M. P. (2014). “Bayesian gait optimization for bipedal locomotion,” in Learning and intelligent optimization: 8th international conference, Gainesville, FL, February 16–21, 2014 (Springer International Publishing), 274–290.

CrossRef Full Text | Google Scholar

Christie, M., Sun, S., Ning, D., Du, H., Zhang, S., and Li, W. (2019). A highly stiffness-adjustable robot leg for enhancing locomotive performance. Mech. Syst. Signal Process. 126, 458–468. doi:10.1016/j.ymssp.2019.02.043

CrossRef Full Text | Google Scholar

Cully, A., Clune, J., Tarapore, D., and Mouret, J.-B. (2015). Robots that can adapt like animals. Nature 521, 503–507. doi:10.1038/nature14422

PubMed Abstract | CrossRef Full Text | Google Scholar

Da, X., Xie, Z., Hoeller, D., Boots, B., Anandkumar, A., Zhu, Y., et al. (2021). “Learning a contact-adaptive controller for robust, efficient legged locomotion,” in Conference on robot learning (PMLR), 883–894.

Google Scholar

Dickinson, M. H., Farley, C. T., Full, R. J., Koehl, M., Kram, R., and Lehman, S. (2000). How animals move: An integrative view. Science 288, 100–106. doi:10.1126/science.288.5463.100

PubMed Abstract | CrossRef Full Text | Google Scholar

Elliott, B., and Blanksby, B. (1979). Optimal stride length considerations for male and female recreational runners. Br. J. Sports Med. 13, 15–18. doi:10.1136/bjsm.13.1.15

PubMed Abstract | CrossRef Full Text | Google Scholar

Farley, C. T., and Gonzalez, O. (1996). Leg stiffness and stride frequency in human running. J. Biomechanics 29, 181–186. doi:10.1016/0021-9290(95)00029-1

CrossRef Full Text | Google Scholar

Ferris, D. P., and Farley, C. T. (1997). Interaction of leg stiffness and surface stiffness during human hopping. J. Appl. Physiology 82, 15–22. doi:10.1152/jappl.1997.82.1.15

CrossRef Full Text | Google Scholar

Ferris, D. P., Louie, M., and Farley, C. T. (1998). Running in the real world: Adjusting leg stiffness for different surfaces. Proc. R. Soc. Lond. Ser. B Biol. Sci. 265, 989–994. doi:10.1098/rspb.1998.0388

CrossRef Full Text | Google Scholar

Gangapurwala, S., Mitchell, A., and Havoutis, I. (2020). Guided constrained policy optimization for dynamic quadrupedal robot locomotion. IEEE Robotics Automation Lett. 5, 3642–3649. doi:10.1109/lra.2020.2979656

CrossRef Full Text | Google Scholar

García-Pinillos, F., Latorre-Román, P. Á., Ramírez-Campillo, R., Párraga-Montilla, J. A., and Roche-Seruendo, L. E. (2019). How does the slope gradient affect spatiotemporal parameters during running? Influence of athletic level and vertical and leg stiffness. Gait posture 68, 72–77. doi:10.1016/j.gaitpost.2018.11.008

PubMed Abstract | CrossRef Full Text | Google Scholar

Gasparri, G. M., Manara, S., Caporale, D., Averta, G., Bonilla, M., Marino, H., et al. (2018). Efficient walking gait generation via principal component representation of optimal trajectories: Application to a planar biped robot with elastic joints. IEEE Robotics Automation Lett. 3, 2299–2306. doi:10.1109/lra.2018.2807578

CrossRef Full Text | Google Scholar

Geyer, H., Seyfarth, A., and Blickhan, R. (2006). Compliant leg behaviour explains basic dynamics of walking and running. Proc. R. Soc. B Biol. Sci. 273, 2861–2867. doi:10.1098/rspb.2006.3637

PubMed Abstract | CrossRef Full Text | Google Scholar

Gurney, J., Samare Filsoofi, A., McClain, E., Bolejack, C., Aboufazeli, M., Mathews, V. J., et al. (2023). Uped: A quadruped robot platform to study directional leg compliance. J. Mech. Robotics 15, 015001. doi:10.1115/1.4053995

CrossRef Full Text | Google Scholar

He, W., and Dong, Y. (2017). Adaptive fuzzy neural network control for a constrained robot using impedance learning. IEEE Trans. Neural Netw. Learn. Syst. 29, 1174–1186. doi:10.1109/tnnls.2017.2665581

PubMed Abstract | CrossRef Full Text | Google Scholar

Heglund, N. C., and Taylor, C. R. (1988). Speed, stride frequency and energy cost per stride: How do they change with body size and gait? J. Exp. Biol. 138, 301–318. doi:10.1242/jeb.138.1.301

PubMed Abstract | CrossRef Full Text | Google Scholar

Herr, H. M., and McMahon, T. A. (2001). A galloping horse model. Int. J. Robotics Res. 20, 26–37. doi:10.1177/02783640122067255

CrossRef Full Text | Google Scholar

Hua, Z., Rong, X., Li, Y., Chai, H., and Zhang, S. (2019). Active compliance control on the hydraulic quadruped robot with passive compliant servo actuator. IEEE Access 7, 163449–163460. doi:10.1109/access.2019.2951830

CrossRef Full Text | Google Scholar

Hubicki, C., Abate, A., Clary, P., Rezazadeh, S., Jones, M., Peekema, A., et al. (2018). Walking and running with passive compliance: Lessons from engineering: A live demonstration of the ATRIAS biped. IEEE Robotics Automation Mag. 25, 23–39. doi:10.1109/mra.2017.2783922

CrossRef Full Text | Google Scholar

Hutter, M., Gehring, C., Bloesch, M., Hoepflinger, M. A., Remy, C. D., and Siegwart, R. (2012). “StarlETH: A compliant quadrupedal robot for fast, efficient, and versatile locomotion,” in Adaptive mobile Robotics (Baltimore, USA: World Scientific), 483–490.

CrossRef Full Text | Google Scholar

Hutter, M., Gehring, C., Jud, D., Lauber, A., Bellicoso, C. D., Tsounis, V., et al. (2016). “ANYmal-a highly mobile and dynamic quadrupedal robot,” in 2016 IEEE/RSJ international conference on intelligent robots and systems (IROS), Daejeon, South Korea, 38–44.

CrossRef Full Text | Google Scholar

Jayes, A., and Alexander, R. M. (1978). Mechanics of locomotion of dogs (canis familiaris) and sheep (ovis aries). J. Zoology 185, 289–308. doi:10.1111/j.1469-7998.1978.tb03334.x

CrossRef Full Text | Google Scholar

Jin, B., Chen, C., and Li, W. (2013). Power consumption optimization for a hexapod walking robot. J. Intelligent Robotic Syst. 71, 195–209. doi:10.1007/s10846-012-9771-9

CrossRef Full Text | Google Scholar

Kashiri, N., Abate, A., Abram, S. J., Albu-Schaffer, A., Clary, P. J., Daley, M., et al. (2018). An overview on principles for energy efficient robot locomotion. Front. Robotics AI 5, 129. doi:10.3389/frobt.2018.00129

PubMed Abstract | CrossRef Full Text | Google Scholar

Kim, S., Clark, J. E., and Cutkosky, M. R. (2006). iSprawl: Design and tuning for high-speed autonomous open-loop running. Int. J. Robotics Res. 25, 903–912. doi:10.1177/0278364906069150

CrossRef Full Text | Google Scholar

Kim, S., and Park, S. (2011). Leg stiffness increases with speed to modulate gait frequency and propulsion energy. J. Biomechanics 44, 1253–1258. doi:10.1016/j.jbiomech.2011.02.072

PubMed Abstract | CrossRef Full Text | Google Scholar

Koco, E., Glumac, S., and Kovacic, Z. (2014). “Multiobjective optimization of a quadruped robot gait,” in Proc. IEEE 22nd mediterranean conference on control and automation (Palermo, Italy), 1520–1526.

CrossRef Full Text | Google Scholar

Koco, E., Mutka, A., and Kovacic, Z. (2016). New variable passive-compliant element design for quadruped adaptation to stiffness-varying terrain. Int. J. Adv. Robotic Syst. 13, 90. doi:10.5772/63893

CrossRef Full Text | Google Scholar

Kohl, N., and Stone, P. (2004). “Policy gradient reinforcement learning for fast quadrupedal locomotion,” in IEEE international conference on Robotics and automation, 2004, New Orleans, LA, 2619–2624.

CrossRef Full Text | Google Scholar

Kormushev, P., Ugurlu, B., Caldwell, D. G., and Tsagarakis, N. G., and (2019). Learning to exploit passive compliance for energy-efficient gait generation on a compliant humanoid. Aut. Robots 43, 79–95. doi:10.1007/s10514-018-9697-6

CrossRef Full Text | Google Scholar

Kottege, N., Parkinson, C., Moghadam, P., Elfes, A., and Singh, S. P. (2015). “Energetics-informed hexapod gait transitions across terrains,” in 2015 IEEE international conference on Robotics and automation (ICRA) (Seattle, WA: IEEE), 5140–5147.

CrossRef Full Text | Google Scholar

Lee, D. V., and Meek, S. G. (2005). Directionally compliant legs influence the intrinsic pitch behaviour of a trotting quadruped. Proc. R. Soc. B Biol. Sci. 272, 567–572. doi:10.1098/rspb.2004.3014

CrossRef Full Text | Google Scholar

Li, Z., Huang, Z., He, W., and Su, C.-Y. (2016). Adaptive impedance control for an upper limb robotic exoskeleton using biological signals. IEEE Trans. Industrial Electron. 64, 1664–1674. doi:10.1109/tie.2016.2538741

CrossRef Full Text | Google Scholar

Liu, Q., Zhao, J., Schütz, S., and Berns, K. (2015). “Adaptive motor patterns and reflexes for bipedal locomotion on rough terrain,” in Proc. IEEE/RSJ international conference on intelligent robots and systems (IROS), Hamburg, Germany, 3856–3861.

CrossRef Full Text | Google Scholar

Lizotte, D. J., Wang, T., Bowling, M. H., and Schuurmans, D. (2007). “Automatic gait optimization with Gaussian process regression,” in International joint conferences on artificial intelligence, Hyderabad, India, 7, 944–949.

Google Scholar

Lussiana, T., Hébert-Losier, K., and Mourot, L. (2015). Effect of minimal shoes and slope on vertical and leg stiffness during running. J. Sport Health Sci. 4, 195–202. doi:10.1016/j.jshs.2013.09.004

CrossRef Full Text | Google Scholar

Meek, S., Kim, J., and Anderson, M. (2008). Stability of a trotting quadruped robot with passive, underactuated legs, Pasadena, CA, 347–351.

Google Scholar

Mirzana, I. M., Muqeeth, M., Qureshi, S. A., and Jibran, M. A. M. (2018). Fabrication and analysis pneumatic quadruped robot. Int. J. Eng. Technol. Sci. Res. (IJETSR) 5.

Google Scholar

Ordonez-Apraez, D., Agudo, A., Moreno-Noguer, F., and Martin, M. (2022). “An adaptable approach to learn realistic legged locomotion without examples,” in 2022 international conference on Robotics and automation (ICRA), Philadelphia, PA, 4671–4678.

CrossRef Full Text | Google Scholar

Seok, S., Wang, A., Chuah, M. Y., Otten, D., Lang, J., and Kim, S. (2013). “Design principles for highly efficient quadrupeds and implementation on the MIT Cheetah robot,” in Proc. IEEE international conference on Robotics and automation, Karlsruhe, Germany, 3307–3312.

CrossRef Full Text | Google Scholar

Shen, Z., and Seipel, J. (2015). Animals prefer leg stiffness values that may reduce the energetic cost of locomotion. J. Theor. Biol. 364, 433–438. doi:10.1016/j.jtbi.2014.09.008

PubMed Abstract | CrossRef Full Text | Google Scholar

Silva, M., Barbosa, R., and Castro, T. (2013). “Multi-legged walking robot modelling in MATLAB/SimmechanicsTM and its simulation,” in 8th EUROSIM congress on modelling and simulation (Cardiff, UK): IEEE), 226–231.

CrossRef Full Text | Google Scholar

Simulink Documentation (2018). Simulation and model-based design. https://www.mathworks.com/products/simulink.html.

Google Scholar

Spröwitz, A., Tuleu, A., Vespignani, M., Ajallooeian, M., Badri, E., and Ijspeert, A. J. (2013). Towards dynamic trot gait locomotion: Design, control, and experiments with Cheetah-cub, a compliant quadruped robot. Int. J. Robotics Res. 32, 932–950. doi:10.1177/0278364913489205

CrossRef Full Text | Google Scholar

Telhan, G., Franz, J. R., Dicharry, J., Wilder, R. P., Riley, P. O., and Kerrigan, D. C. (2010). Lower limb joint kinetics during moderately sloped running. J. Athl. Train. 45, 16–21. doi:10.4085/1062-6050-45.1.16

PubMed Abstract | CrossRef Full Text | Google Scholar

Umberger, B. R., and Martin, P. E. (2007). Mechanical power and efficiency of level walking with different stride rates. J. Exp. Biol. 210, 3255–3265. doi:10.1242/jeb.000950

PubMed Abstract | CrossRef Full Text | Google Scholar

Usherwood, J. R., and Granatosky, M. C. (2020). Limb work and joint work minimization reveal an energetic benefit to the elbows-back, knees-forward limb design in parasagittal quadrupeds. Proc. R. Soc. B 287, 20201517. doi:10.1098/rspb.2020.1517

PubMed Abstract | CrossRef Full Text | Google Scholar

Verstraten, T., Beckerle, P., Furnémont, R., Mathijssen, G., Vanderborght, B., and Lefeber, D. (2016). Series and parallel elastic actuation: Impact of natural dynamics on power and energy consumption. Mech. Mach. Theory 102, 232–246. doi:10.1016/j.mechmachtheory.2016.04.004

CrossRef Full Text | Google Scholar

Wei, X., Long, Y., Wang, C., and Wang, S. (2015). A critical characteristic in the transverse galloping pattern. Appl. Bionics Biomechanics 2015, 1–16. doi:10.1155/2015/631354

CrossRef Full Text | Google Scholar

Wu, Y., Yao, D., Guo, Z., and Xiao, X. (2019). Adaptive stiffness control of passivity-based biped robot on compliant ground using double deep Q network. Proc. Institution Mech. Eng. Part C 233, 2177–2189. doi:10.1177/0954406218781402

CrossRef Full Text | Google Scholar

Xie, Z., Berseth, G., Clary, P., Hurst, J., and van de Panne, M. (2018). “Feedback control for Cassie with deep reinforcement learning,” in Proc. IEEE/RSJ international conference on intelligent robots and systems (IROS), 1241–1246.

CrossRef Full Text | Google Scholar

Xin, G., Wolfslag, W., Lin, H.-C., Tiseo, C., and Mistry, M. (2020). An optimization-based locomotion controller for quadruped robots leveraging cartesian impedance control. Front. Robotics AI 7, 48. doi:10.3389/frobt.2020.00048

PubMed Abstract | CrossRef Full Text | Google Scholar

Yang, K., Li, Y., Zhou, L., and Rong, X. (2019). Energy efficient foot trajectory of trot motion for hydraulic quadruped robot. Energies 12, 2514. doi:10.3390/en12132514

CrossRef Full Text | Google Scholar

Yu, W., Turk, G., and Liu, C. K. (2018). Learning symmetric and low-energy locomotion. ACM Trans. Graph. (TOG) 37, 1–12. doi:10.1145/3197517.3201397

CrossRef Full Text | Google Scholar

Yu, X., He, W., Li, H., and Sun, J. (2020). Adaptive fuzzy full-state and output-feedback control for uncertain robots with output constraint. IEEE Trans. Syst. Man, Cybern. Syst. 51, 6994–7007. doi:10.1109/tsmc.2019.2963072

CrossRef Full Text | Google Scholar

Zhai, S., Jin, B., and Cheng, Y. (2020). Mechanical design and gait optimization of hydraulic hexapod robot based on energy conservation. Appl. Sci. 10, 3884. doi:10.3390/app10113884

CrossRef Full Text | Google Scholar

Zhou, X., Xu, Z., Li, S., Wu, H., Cheng, T., and Lv, X. (2020a). “Optimization-based compliant control for manipulators under dynamic obstacle constraints,” in AI based robot safe learning and control (Berlin, Germany: Springer), 83–104.

CrossRef Full Text | Google Scholar

Zhou, X., Xu, Z., Li, S., Wu, H., Cheng, T., and Lv, X. (2020b). “RNN based trajectory control for manipulators with uncertain kinematic parameters,” in AI based robot safe learning and control (Berlin, Germany: Springer), 17–38.

CrossRef Full Text | Google Scholar

Keywords: adaptive control, bio-inspired robots, variable passive compliance, efficient legged robots, online learning algorithms

Citation: Aboufazeli M, Samare Filsoofi A, Gurney J, Meek SG and Mathews VJ (2023) An online learning algorithm for adapting leg stiffness and stride angle for efficient quadruped robot trotting. Front. Robot. AI 10:1127898. doi: 10.3389/frobt.2023.1127898

Received: 20 December 2022; Accepted: 13 March 2023;
Published: 06 April 2023.

Edited by:

Onur Ozcan, Bilkent University, Türkiye

Reviewed by:

Mohammad Askari, Swiss Federal Institute of Technology Lausanne, Switzerland
Alper Yeldan, Singapore University of Technology and Design, Singapore

Copyright © 2023 Aboufazeli, Samare Filsoofi, Gurney, Meek and Mathews. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Mahtab Aboufazeli, YWJvdWZhem1Ab3JlZ29uc3RhdGUuZWR1

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.