Negotiation-based cooperative planning of local trajectories

Schneider, Julian; Rothfuß, Simon; Hohmann, Sören

doi:10.3389/fcteg.2022.1058980

TECHNOLOGY REPORT article

Front. Control Eng., 29 November 2022

Sec. Networked Control

Volume 3 - 2022 | https://doi.org/10.3389/fcteg.2022.1058980

This article is part of the Research TopicModel-based Methods for Human-Machine Cooperative and Shared Control SystemsView all 6 articles

Negotiation-based cooperative planning of local trajectories

Julian Schneider*

Simon Rothfuß

Sören Hohmann

Institute of Control Systems (IRS), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany

In this work, a cooperative local trajectory planner based on negotiation theory for human-robot interaction is developed. It is implemented on a robot, which accompanies patients to examination rooms as part of the HoLLiECares project. For this purpose, an existing human–machine cooperation model for decision-making in one-time conflict cases is applied to a time-repeated negotiation of motion primitives. In negotiation theory, time pressure in the form of deadlines is classically used to achieve agreements. Since deadlines do not naturally exist in all technical applications and their artificial insertion would create an unintuitive system behavior for an involved human, a reciprocal tit-for-tat strategy for the automation is applied in the present work to achieve agreements. This leads to a system behavior that is able to dynamically change between human-in-the-lead behavior or automation-in-the-lead behavior and everything in between depending on the concession of the human and thus on human’s desire. The cooperative negotiation-based local trajectory planner is tested simulatively.

1 Introduction

As part of the HoLLiECares project, the existing humanoid robot HoLLiE is to be modified and further developed as a nursing robot. One core task of the robot will be to accompany patients to examination rooms. As a lot of patients need walking support, especially after surgeries in the hip and leg area, patients can lean on a special forearm rest, which is mounted on the robot arm, while they are accompanied to treating rooms. Figure 1 shows the robot and a human in the accompanying setting.

FIGURE 1

FIGURE 1. HoLLiE robot in hospital accompanying a patient.

To the knowledge of the authors, there are no existing systems that haptically guide people while providing walking support at the same time. Existing systems can be divided into human guiding applications without haptic interaction like a robot tour guide in Fiore et al. (2015), e.g., for museums in Bueno et al. (2011). On the other side, there are systems that provide walking support, e.g., rehabilitation applications like the WalkTrainer (Stauffer et al., 2009) or an omnidirectional walking platform proposed by Aguirre-Ollinger and Yu (2021). Other systems that provide walking support are exoskeletons, e.g., Quintero et al. (2012), Lu et al. (2014), Bai et al. (2017), and Li et al. (2020, 2021). All these systems have in common that they implement a static leader–follower approach. Either the robot leads the way and the human must follow like in the studies by Bueno et al. (2011), Stauffer et al. (2009), Quintero et al. (2012), Lu et al. (2014), and Bai et al. (2017) or the human is the leader in providing the path and velocity as a reference and the robot follows as in Aguirre-Ollinger and Yu (2021).

Instead of a static leader–follower implementation, we want to apply a cooperative approach to dynamically negotiate about the leadership role. This approach is motivated by the fact that even though the robot knows the way to the examination room, at the same time, the human locally wants to go a different way than the robot specifies. This may happen in cases where the human wants to cut a path around the bend, while the robot, for example, always wants to walk on the right side of the corridor in order to comply with the right-hand driving rule comparable to road traffic. However, after the curve, it could be that the human again matches the robot’s motion request. We expect higher acceptance of the accompanying service if we respect the patient’s local wishes for a special path. Furthermore, we consider the ethical aspect of the right of self-determination of the patient according to the MEESTAR model (Weber, 2015) already in the design of the accompanying service. Even in the situation of accompanying a patient to an examination room, it is a respectful manner of good care to respect and react to the patients’ path wishes.

1.1 Related work

1.1.1 Emancipated change of leadership roles

An attempt at required dynamical change of leader roles in a human–robot accompanying scenario is presented in Fiore et al. (2015). They have investigated whether human speed can be influenced by having the robot suggest a speed. However, they have not taken into account feedback from the human, i.e., the robot’s reaction to the human’s reaction to the presented offer.

In order to implement a method to dynamically change between the leadership roles, some method is needed to find an agreement. In this process, neither agent should lead preferentially; there should instead be a form of negotiation with equal rights for both agents. A human–machine cooperation model that provides equal rights and authority for both the human and the machine is presented in Rothfuß (2022), with one model instantiation using negotiation theory and another instantiation using game theory. To the knowledge of the author, the presented human–machine cooperation model is the only existing formal method so far, which describes decision-making processes between automation and humans in an emancipated manner while also considering the cognitive limitations ofthe human. For a deeper insight into the state of the art on this topic, please refer Rothfuß (2022).

Although the method presented in Rothfuß (2022) is suitable for decision-making in one-time conflict cases and discrete agreements, such as deciding which turn to take at a crossroad or which direction to take in an evasive maneuver, the presented application is not intended to negotiate the leadership role once and discretely but to decide dynamically which trajectory to follow even with the possible result on a trajectory compromise at some point between the human and the robot. The method must, therefore, be further developed.

Inspired by the scenario of making a joint decision at a crossroad, one could model the process of finding a common trajectory in a very simple way as in Figure 2. Instead of negotiating about different turns at a crossroad, one could negotiate on different trajectory candidates like the three candidates in Figure 2. The trajectory $T_{1}$ would cut the corner, whereas $T_{3}$ would follow the rule to go on the right side. Two partners could find a common trajectory by negotiating on the three candidates and finding an agreement. However, this approach requires the human to formulate their whole trajectory wish in advance. As the depicted corridor in Figure 2 is a continuous space, the human cannot easily formulate their desired trajectory around a corner. One would need, for example, different trajectories suggested by the robot on a display or via a projection on the floor. As this suggestion would be a rather undesirable restriction for a human, the approach of finding a common trajectory by negotiation could be preserved by the negotiation on motion primitives. These motion primitives specify the trajectory only for a limited distance in advance, and the desired motion primitives of the human could be estimated via a haptic connection, i.e., the human does not need to formulate it like an entire trajectory to the target. The combination of the negotiated motion primitives then leads to a common trajectory toward the target. This approach converts the problem of finding a common trajectory into the repeated negotiation on motion primitives.

FIGURE 2

FIGURE 2. Basic idea of negotiating trajectories.

1.1.2 Negotiation theory

The idea of the present work is negotiation over motion primitives. The classical flow of a negotiation process is shown in Figure 3. For example, agent 1 gets an offer from agent 2 and evaluates it by determining the utility of this offer. If the offer’s utility fulfills the acceptance strategy of agent 1, an agreement is found. Otherwise, a counteroffer is presented to agent 2. There exist different negotiation protocols that specify in detail the exchange of offers, e.g., synchronous or asynchronous modes. In order to determine the utility of an offer and a potential counteroffer, a utility function is required. The acceptance strategy determines the condition under which an agent accepts the other agent’s offer. The bidding strategy describes the behavior of an agent to find a suitable offer with respect to the current situation. This strategy can be based on the behavior of the cooperation partner (behavior-based strategy) like a tit-for-tat strategy or depending on time (time-based strategy). As behavior-based strategies cannot model the urgency of decision-making in dynamic systems, time-based strategies are preferred to find an agreement in time critical decision problems under pressure (Rothfuß et al., 2019).

FIGURE 3

FIGURE 3. Flowchart of a negotiation, source: Rothfuß et al. (2019)

In order to motivate the agents involved in a negotiation to reach an agreement, time pressure is classically used in negotiation theory (Stuhlmacher et al., 1998), which Rothfuß (2022) implemented in the form of a deadline. After the deadline has passed, an agreement must be reached. In the study by Rothfuß (2022), agreements are achieved by a time-based bidding strategy. The closer the deadline is, the lower the own value of the presented offer becomes and the more likely the partner is to accept this offer. In the application of cooperative trajectory planning presented, no natural deadlines arise from the application, as they do, for example, in the case of cooperative agreement on an evasive maneuver when approaching an obstacle. Here, the natural deadline would be the crash with the obstacle. In contrast, the motion primitives in the present case do not have a defined end but can be executed as long as they lead both partners along the global path or in the direction of the partner’s motion request. For the application of time-based bidding strategies to guarantee agreements, a deadline would have to be artificially inserted for the conflict case of different movement desires. Different human users would probably want different deadlines for this, which would require a parameter to be identified for the automation design and would therefore provide more effort. To avoid the insertion of artificial deadlines, automation could instead choose its bidding strategy depending on human behavior. In contrast, the tit-for-tat behavioral strategies commonly found in the literature (Baarslag, 2016, p. 28) are of little help here in ensuring that agreements are reached. Instead, automation could observe the human’s offers over a period of time and choose a reciprocal tit-for-tat strategy: If it is a very concessive human, the automation insists on its desire to move. If it is a very unyielding person, the automation gives in more frequently.

1.1.3 Motion planning

In this work, it is assumed that the resulting joint motion results from a compromise of the individual motion desires of the human and the robot. The single motion desires result from individual motion planning. This can be divided into global and local navigation for the robot (Zhou et al., (2022); Gul et al., (2019)). In global navigation, prior knowledge of the environment is used, and offline planning of a path or trajectory is possible. In order to respond to dynamic changes and uncertainties in the environment, local navigation is additionally required, which is performed online.

In the field of motion planning, a distinction is also made between path planning and trajectory planning (Gasparetto et al. (2015)). A path is a purely geometric description of a path in space. A trajectory assigns a time dependency to this geometric path. This is achieved by linking the path to a speed profile.

1.2 Contribution

In this work, an existing method for emancipated cooperative decision-making between a human and a machine in one-time conflict cases is extended to time-repeated decision-making processes. Compromises are found without inserting an artificial deadline. This is achieved by the application of a new behavior-based strategy within negotiation theory, namely, a reciprocal tit-for-tat strategy. The developed method is applied to cooperative trajectory planning.

Before introducing the exact procedure of negotiating motion primitives and the resulting necessary adjustments, there is a presentation of the problem first and the system model in Section 2. Section 3 gives a brief introduction into negotiation theory at its beginning and presents the cooperative local trajectory planner. The corresponding results are shown in Section 4. Section 5 concludes the article.

2 Problem formulation, system structure, and system model

In the present application, the accompaniment of a patient by a humanoid robot shown in Figure 1 is to be implemented. Patients can lean on a forearm rest while walking together with the robot. In addition, the robot projects the direction toward the examination room on the ground. The problem is that both agents have to plan and execute a trajectory under the constraint that the distance between both agents has to remain within a certain range. In our model, the movement desire results from a globally planned path existing for both agents that contains the individual movement preferences (e.g., shortest way, adherence to the ‘right-hand driving commandment,’ etc.). Since the two agents do not know each other’s global paths, the compromise must be found at the level below: the local planning level. Here, depending on one’s own motion desire and the partner’s motion desire, motion primitives are determined based on negotiation (see Section 3.2 for a more detailed explanation). The motion desire of the partner is estimated from the interaction force F_Int between the human and robot (see Section 3.1 for a more detailed explanation). This interaction force is measured in the shoulder of the robot by using a force–torque sensor. The motion primitives are subsequently adjusted by underlying control loops of the robot and by the human’s locomotor system. The interaction force F_Int comes from the execution of the motion primitives by the human and the robot which—as just described—serves as an input variable for local planning. Figure 4 shows the described system structure as a block diagram. The global planning for the human and robot is assumed to be given and collision-free in the context of this work. In addition, for real-world applications, dynamic obstacles and cooperative measures for collision avoidance would need to be considered in the local planning layer. Since the focus of this work is on the methodology for emancipated compromise finding in the presence of different global paths, dynamic obstacles are not considered in this work for simplicity. Collision avoidance control strategies in multi-agent systems can be found, for example, in the study proposed by Huang et al. (2019).

FIGURE 4

FIGURE 4. System structure of the accompanying problem.

The calculation of the local trajectory depends on the course of the global path. Therefore, the two agents must not move too far away from their global paths. With corridors in mind, we assume that the human and robot are always in sufficiently close distance to their global paths. Nevertheless, if it happens that an agent moves farther than a certain threshold value from its global path, a replanning of the global path can be started from the current position. Before the functionality of the negotiation-based local path planning is explained, the system model of the coupled system is presented in the following.

2.1 System model

Since no models exist in the literature for the presented application, this work proposes a new modeling of the scenario. The walking motion of humans can be described for goal-directed movements using the non-holonomic unicycle model (Arechavaleta et al., 2006; Mombaur et al., 2008). Here, the human always stays tangentially along their trajectory with a velocity greater than or equal to zero. In the presented case, no dynamic obstacles are considered, so no sudden orthogonal sideway movements of the human are necessary. Eq. 1 describes the non-linear unicycle model with state variables x₁, x₂, and x₃ and control variables v and ω. The states x₁ and x₂ describe the coordinates in the plane, and $x_{3} = \arctan (\frac{x_{2}}{x_{1}})$ describes the current angle around the vertical axis.

\dot{x} = [\begin{matrix} v \cdot \cos (x_{3}) \\ v \cdot \sin (x_{3}) \\ ω \end{matrix}] . (1)

To produce the human-expectable behavior, such as that of an attendant caregiver by means of automation, non-holonomic unicycle kinematics is also chosen for the robot platform motion kinematics. Eq. 2 shows the overall state equations with the state variables for the human (index H) and the state variables of the robot (index R).

\dot{x} = [\begin{matrix} v_{H} \cdot \cos (x_{3}) \\ v_{H} \cdot \sin (x_{3}) \\ ω_{H} \\ v_{R} \cdot \cos (x_{6}) \\ v_{R} \cdot \sin (x_{6}) \\ ω_{R} \end{matrix}] . (2)

Eq. 3 shows the outputs:

y = {[\begin{matrix} x_{4} & x_{5} & x_{6} & F_{Int,x} & F_{Int,y} \end{matrix}]}^{T} . (3)

The measurable positions x₄ and x₅ of the robot and the robot angle x₆ are the output variables. The position, velocity, and angular velocity of the human are not measurable. In addition, the magnitude and direction of the interaction force are presented, which is described by the planar, two-dimensional vector F_Int. This has the entries F_Int,x and F_Int,y in the plane parallel to the ground. The measured interaction force results on the one hand from the application of force by the human hand on the handle. Second, it results from different directions of movement between the human and the robot. It should be noted that the arm of the robot remains stationary for safety reasons.

2.2 Motion primitives

As described previously, the negotiation of motion primitives p takes place in the local motion planning. A motion primitive can be described by the parameter velocity v and angular velocity ω. For the velocity v = 1 ms⁻¹ and a set of discrete angular velocities, the set of motion primitives shown in Figure 5 is defined for a simulation step time of t = 1 s. Hence, the motion primitives, i.e., output variable of the local motion planning, are conveniently described by the same parameters (velocity and angular velocity) as the control variables of the coupled system model. Thus, a conversion between motion primitives and input variables of the system model is not necessary.

FIGURE 5

FIGURE 5. Set of sampled motion primitives.

3 Negotiation-based local trajectory planner

The starting point of the negotiation-based cooperative trajectory planner is the individual movement desire of each agent. This corresponds to each agent’s own local trajectory planning, which is independent of the partner’s behavior. The local trajectory planning for the robot was implemented using the base local planner algorithm (Marder-Eppstein and Perko, 2022). This algorithm samples a discrete set of motion primitives $p_{i} = {[v_{i}, ω_{i}]}^{T}$ from a discrete set of velocity values and angular velocity values. All non-colliding motion primitives are then evaluated using a cost function J_MP (index MP: motion primitives), and the best evaluated motion primitive is selected. The cost function J_MP evaluates the distance d_i of each motion primitive p_i to the global path and the control effort of the respective motion primitive $v_{i}^{2}$ and $w_{i}^{2}$ by means of the following expression:

J_{MP} = w_{1} \cdot d + w_{2} \cdot v_{2} + w_{3} \cdot w_{2}, (4)

where the parameters $w_{1}$ , $w_{2}$ , and $w_{3}$ are weighting factors. The best-valued motion primitive p* is then determined by means of

p^{*} = \min_{v, ω} \{J_{MP}\} . (5)

Figure 6 shows an example set of sampled motion primitives with the best rated trajectory. This trajectory describes in the following the ‘personal’ motion desire of the robot in each step. This personal motion desire must then be compared to the human motion desire. Figure 7 shows the subsystem of the robot’s local trajectory planner. In order to compare the motion desire p* with the human’s motion desire, a motion primitive estimation of the human is first required (highlighted in yellow in Figure 7). This estimates from the measured interaction force F_Int a motion primitive p that the human requires from the robot in order to comply with the human’s motion request. This is denoted by the index R (motion primitive for the robot) and the exponent H (demand of the human):

p_{R}^{H}

FIGURE 6

FIGURE 6. Set of sampled motion primitives with the best rated trajectory p*.

FIGURE 7

FIGURE 7. System architecture of the accompanying problem with a detailed view into the local planning of the robot.

3.1 Motion primitive estimation

The basic idea behind the motion primitive estimation is that one agent exerts an interaction force on its partner (agent 2) to make the overall system follow the motion request of agent 1. With the interaction force F_Int, agent 1 communicates to agent 2 what agent 2 should do to follow agent 1’s trajectory desire. Thus, a mapping $f : F_{Int} \to p_{A 2}^{A 1}, R^{2} \to R^{2}$ is sought, which determines a motion primitive $p_{A 2}^{A 1}$ (control input of agent 2 demanded by agent 1) from the measured interaction force F_Int (system output of the coupled system) in order to follow the motion request of agent 1. This is a classical control engineering problem, namely, the search for a control law u(y) with an output feedback. Due to the nonlinearities of the coupled system, this control law is determined heuristically in the following.

Figure 8 shows an example interaction force F_Int at the contact point C of the human with the robot on the forearm rest in the local robot coordinate system {r_x, r_y}. Here, r_x always points in the direction of motion of the robot. The interaction force F_Int drawn in this local coordinate system has the components ${[F_{Int, x}, F_{Int, y}]}^{T}$ . The interaction force provided by the human is interpreted by the robot like the input of a joystick. The example force F_Int drawn in Figure 8 with F_Int,x > 0 and F_Int,y > 0 (quadrant I) means that the robot should move to the left and accelerate. An interaction force in quadrant II would correspondingly require a rightward motion and acceleration. Quadrant III also means a rightward motion but deceleration, and quadrant IV correspondingly means a decelerating leftward motion. The force component F_Int,x can thus be considered in a heuristic view as proportional to an acceleration a, and the component F_Int,y becomes correspondingly proportional to an angular acceleration $\ddot{α} = \dot{ω}$ . Since in the present application discrete velocity and angular velocity values are considered to form a set of motion primitives, in the following, the velocity changes are written as Δv and Δω. Thus, from the interaction force, the change in the motion primitive Δp = [Δv,Δω]^T is interpreted with respect to the current executed motion primitive.

FIGURE 8

FIGURE 8. Interaction force F_Int in local robot coordinates.

The changes Δv and Δω are calculated as follows: a maximum allowable interaction force F_{Int, max} is introduced. Larger forces imposed by humans are not considered. In the range 0 < F_Int,x < F_{Int, max} or 0 < F_Int,y < F_{Int, max}, the proportional conversion into a velocity change Δv or angular velocity change Δω takes place by means of the following formulas:

Δ v = sgn (F_{Int, x}) \cdot \frac{\min (| F_{Int, x} |, F_{Int, max})}{F_{Int, max}} \cdot (v_{ub} - v_{lb}) \cdot f_{v} (6)

Δ ω = sgn (F_{Int, y}) \cdot \frac{\min (| F_{Int, y} |, F_{Int, max})}{F_{Int, max}} \cdot ω_{ub} \cdot f_{ω} . (7)

The fraction in Eq. 6, 7 calculates a percentage of force that describes how much the acceleration or angular acceleration change should be. If F_Int,x = F_Int,max is applied, this fraction becomes one and represents a maximum change request. This is multiplied by the maximum possible range of the control input (v_ub − v_lb), where the indices ub and lb mean upper bound and lower bound, respectively. If v_R = v_lb holds and F_Int,x = + F_{Int, max} is applied, v_R (at f_v = 1) changes to the largest possible velocity v_ub. The sign function determines the sign of the acceleration function. The factor f_v ≥ 1 represents a parameter which increases the sensitivity of F_Int. If f_v > 1 is chosen, Δv increases faster and thus reacts more sensitively to values F_Int,x < F_Int,max. The calculation rule for Δω results analogously. Since the discrete set of angular velocities is symmetric about the origin (ω_ub = −ω_lb), the fraction is multiplied only by the maximum positive angular velocity ω_ub. The sign of the angular velocity change also results from the sign of the applied force. In case |Δv| > v_ub − v_lb resp. |Δω| > ω_ub, this is internally limited to v_ub − v_lb resp. ω_ub.

Equations 6, 7 together form the function $f : F_{Int} \to p_{A 2}^{A 1}, R^{2} \to R^{2}$ , which estimates the motion primitive $p_{A 2}^{A 1}$ from the applied interaction force F_Int. From this, together with the actual motion desire $p_{A 2}^{*}$ of agent A2, a motion primitive p_A2 must be determined based on negotiation (see Figure 7). For this purpose, the components of the negotiation model are explained as follows.

3.2 Negotiation of motion primitives

The human machine cooperation model by Rothfuß (2022) that allows for emancipated decision-making processes consists of several components. The model’s agent set P is defined by P = {H, R} that is composed of the two agents, the human (H) and robot (R). The other components are defined in this work as follows:

1. Negotiation protocol: The negotiation starts in an event-based manner when the interaction force exceeds a threshold value F_thresh. If this case occurs, it is obvious that the robot has a different motion request than the human or that the human wants to communicate a motion request to the robot. Since the interaction force is a vector in the plane, the Euclidean norm or two-norm of the interaction force F_Int is compared with the threshold value:

‖ F_{Int} ‖_{2} > F_{thresh} . (8)

1. The use of the two-norm also ensures that a conflict is detected in both directions: first, when both agents are moving away from each other, and second, when the two agents are moving toward each other. After the negotiation has started, offers are exchanged simultaneously since the interaction force acts on both agents simultaneously too.

2. Negotiation deadline: As discussed in the previous section, it does not make sense to consider a negotiation deadline in the presented application. Therefore, this model component is not considered here. Agreements are reached through a behavior-based offer strategy with a reciprocal tit-for-tat strategy (see as follows).

3. Offers: In the presented application, the exchanged offers o between the agents are the exchange of motion primitives p. However, these are not explicitly communicated but are passed on to the other agent in the form of the interaction force F_Int, which can be seen as a function g that transforms the offers o via the coupled system: F_Int = g(o). The motion primitives demanded from the interaction force must then be decoded by the other agent (o = g⁻¹(F_Int)), (see Section 3.1 on motion primitive estimation). The decision set D is the discrete set of which global path is followed: the path of the human or the path of the robot.

4. Utility function: The utility function U for evaluating the opponent’s offer is a function $f : o \to U, O \to [0,1]$ that maps a utility U ∈ [0, 1] to every offer o, where offers in this case are motion primitives p. This utility describes how much the offer is useful for one agent to follow its desire to move. Therefore, the cost function J_MP (Eq. 4) builds the starting point together with the actual motion desire p* of the robot and the required motion primitive of the human $p_{R}^{H}$ . These two motion primitives form upper and lower bounds for the robot with respect to the possible velocity and angular velocity values. The motion primitives between these upper and lower bounds form a window of possible motion primitives, which is considered in the following. The utility function U within these bounds is normalized to the range between zero and one, where the motion primitive p_min with the lowest utility in the windowed utility range is assigned the value zero. The motion primitive p* is assigned with the utility value one. To do this, p_min must first be determined:

p_{min} = \min_{v, ω} \{J_{MP}\}, (9)

s . t . v_{lwb} \leq v \leq v_{uwb}, (10)

ω_{lwb} \leq ω \leq ω_{uwb}, (11)

where the indices lwb and uwb stand for lower window bound and upper window bound, respectively. These bounds result from the two motion primitives p* and $p_{R}^{H}$ that form the boundary of the window.

Then, the utility function in the windowed region is normalized, and U (d, v, ω) ∈ [0, 1] denotes the normalized utility function with J_MP(p_min) ≥ J_MP(d, v, ω) ≥ J_MP(p*) > 0.

U (d, v, ω) = \frac{J_{MP} (p_{min}) - J_{MP} (d, v, ω)}{J_{MP} (p_{min}) - J_{MP} (p^{*})}, v \in [v_{lwb}, v_{uwb}], ω \in [ω_{lwb}, ω_{uwb}] . (12)

5. Bidding Strategy: The bidding strategy specifies the dependency, in which the target utility value U_target ≤ 1 of the robot decreases per negotiation round in order to accommodate the human. A behavioral reciprocal tit-for-tat strategy is implemented here as the bidding strategy. The classical tit-for-tat strategy observes the behavior of the partner over the last k steps and shows the same behavior in percentage or absolute terms. In the presented application, a reciprocal tit-for-tat strategy is to be implemented for the robot, which yields more the more unyielding the human behaves. In the reciprocal tit-for-tat strategy applied here, the target utility U_target value of the robot should change depending on the concession of the human. For this purpose, the utility loss U_L is introduced with U_target = U (p*) − U_L = 1 − U_L. The utility loss U_L describes the extent to which an agent is willing to deviate from its desired movement primitive per round. For the reciprocal tit-for-tat strategy to be implemented, a function is now sought that maps the partner’s behavior, i.e., its concession to a utility loss U_L.

Since the concession of the human cannot be measured directly, it must be estimated indirectly via the robot’s own behavior. If both agents move away from each other because of their individual desire to move, the interaction force increases and both agents are unyielding. If they move in parallel because the global paths are parallel, the interaction force remains approximately within a threshold range, and there is also no concession. If the interaction force becomes smaller even though the robot did not move toward the human, there is concession on the part of the human. If the interaction force is smaller because the robot moved toward the human, the robot is yielding. Yielding of the robot toward the human is accompanied by a loss of utility $U_{L} = U (p^{*}) - U (p), U_{L} \in [0,1]$ because it has to deviate from the best-valued motion primitive p*. If the robot sticks to its motion path with p*, there is no utility loss. To estimate the concession of the human, value pairs (U_L|ΔI) (ΔI: effect on interaction force) are formed over the last l steps using a heuristic considering

\frac{d}{d t} F_{Int} = {\dot{F}}_{Int} \approx F_{Int} (t = k \cdot T) - F_{Int} (t = (k - 1) \cdot T) . (13)

The smaller the value ΔI, the higher the concession of the human is:

• U_L = 0 and $‖ {\dot{F}}_{Int} ‖ < 0$ : the interaction force became smaller, although the robot did not approach, ΔI = −2 and (U_L| − 2).

• U_L > 0 and $‖ {\dot{F}}_{Int} ‖ < 0$ : the interaction force became smaller, although the robot approached, ΔI = −1 and (U_L| − 1).

• U_L = 0 and $‖ {\dot{F}}_{Int} ‖ \approx 0$ : the interaction force remained approximately the same with the robot not approaching, ΔI = 0 and (U_L|0).

• U_L = 1 and $‖ {\dot{F}}_{Int} ‖ > 0$ : the interaction force increased, with the robot not approaching, ΔI = +1 and (U_L| + 1).

• U_L < 1 and $‖ {\dot{F}}_{Int} ‖ > 0$ : the interaction force became larger despite the robot approaching, ΔI = +2 and (U_L| + 2).

Summing up ΔI over the last l steps gives the sum

M ≔ \sum_{j = 1}^{l} I_{j} . (14)

If M = −2l, it is a very yielding human, and the robot can remain compliant (U_L = 0). If M = 2l, it is a very unyielding human and the robot shall become all the more yielding U_L. Based on M, this reciprocal tit-for-tat behavior for U_L is now calculated by means of the function

U_{L} = \{\begin{cases} 0, M < 0 \\ f_{M} \cdot \frac{1}{{(2 l)}^{2}} \cdot M^{2}, M \geq 0 \end{cases} . (15)

For M = 2l, it follows U_L = 1 and U_L = 0 results for M ≤ 0. The parameter 0 < f_M ≤ 1 is introduced to increase the sensitivity of U_L. The function U_L(M) is shown in Figure 9 for f_M = 1.The next offer of the agent o (k + 1) = p is then calculated by

p = \min_{v, ω} \{| U (d, v, ω) - U_{L} |\} . (16)

6. Acceptance strategy: A motion primitive o^k proposed by the partner at time step k is accepted if the utility U^k(o) of the motion primitive is higher than the utility U^k+1(o) of the motion primitive that the agent would propose in the next step.

FIGURE 9

FIGURE 9. Quadratic function for U_L to behave with reciprocal tit-for-tat.

3.3 Existence of a solution

For the developed cooperative trajectory planner, it can be said that an agreement for a joint movement between the human and robot is always found. The problem formulation in Section 2 shows that conflicting direction requests between the human and robot (e.g., humans want to go left and robots want to go right) do not have to be considered because both agents are aiming for the same goal and thus for the same general direction. Under this condition, it must now be shown that the human and robot always come to an agreement. This can be shown by considering the conflict case. In this case, there would be an interaction force ‖F_Int‖₂ > F_thresh. According to (8), this triggers negotiation. Now, it must be ensured that an agreement is reached in finite time, which is generally achieved by a mutual loss of utility. The extreme case corresponds to a completely unyielding human not allowing any loss of utility. In this case, the robot has to adapt to the movement request $p_{R}^{H}$ of the human in finite time. Since the boundaries of the set $O$ of the utility function U(o) are formed by one’s own preferred movement request p* and the partner’s movement request $p_{R}^{H}$ , the movement primitive $p_{R}^{H}$ is definitely in the set $O$ of considered motion primitives with $J_{MP} (p_{R}^{H}) \geq 0$ . The bidding strategy presented in Section 3.2, with the reciprocal tit-for-tat strategy, then ensures that $p_{R}^{H}$ is executed after l steps at the latest. This happens until the robot has approached the human sufficiently close and ‖F_Int‖₂ ≤ F_thresh holds.

4 Simulative evaluation

In the following, the scenario shown in Figure 10 is used for the analysis of three differently concessive patient types. In all three cases, human behavior is simulated in the same way as robot behavior: local path planning using the global path shown in Figure 10 and the base local planner algorithm and negotiation-based selection between own motion primitive and the motion primitive requested by the robot. The human bidding strategy is assumed to be constant in contrast to the reciprocal tit-for-tat behavior of the robot. A constant U_L,H is specified in each case for the human. The robot applies the reciprocal tit-for-tat strategy.

FIGURE 10

FIGURE 10. Simulation scenario with a starting point in the bottom left and target at the top.

Figure 11A shows the cooperative scenario for a totally unyielding patient who always follows his personal motion request. This can be seen in the only small deviation between the human’s global path and the executed trajectory. In the case of the robot’s trajectory, it can be seen that it initially follows its global path but then moves toward the human’s path and subsequently maintains an approximately constant distance from the human. On average over all simulation steps, the mean value of U_L,R is U_{L,R, mean} = 0.543. This parameter can be understood as a measure of how much the robot deviates from its own desired movement and accommodates the human on average over the entire simulation time. Its range of values is 0 ≤ U_L,R,mean ≤ 1, where U_L,R,mean = 0 means that the robot always executes the motion primitive that is demanded by the human, and U_L,R,mean = 1 means the constant selection of its own desired motion primitive. Table 1 shows the mean value of the distance between the human and robot d_H,R,mean, the mean value of the distance between the human and his global reference d_H,ref,mean, and the mean value of the distance between the robot and its global reference d_R,ref,mean for each simulation scenario. For the first simulation scenario, the first line shows that on average a distance of 0.573 m is maintained between the robot and the human. The human deviates on average 0.098 m from its global reference, whereas the robot deviates 0.345 m from its global reference.

FIGURE 11

FIGURE 11. Simulation results for three simulation scenarios. (A) Totally unyielding patient. (B) Yielding patient. (C) Totally yielding patient.

TABLE 1

TABLE 1. Mean value of distance between the human and robot d_H,R,mean, the mean value of distance between the human and his global reference d_H,ref,mean, and the mean value of distance between the robot and its global reference d_R,ref,mean for each simulation scenario.

The second simulation is performed with an approximately balanced concession between the human and robot and is shown in Figure 11B. The constant utility loss of the human in this scenario is set to U_L,H = 0.12. Both agents go in the middle of their global wishes. It can be seen that there is a closer following of the human’s global path at the beginning. After that, both actors move toward the robot’s path before following the human’s path more closely again toward the end. The reciprocal tit-for-tat-strategy yields an average of utility loss of the robot of U_{L,R, mean} = 0.428. The second row in Table 1 shows the second simulation scenario that on average, a distance of 0.506 m is maintained between the robot and the human. The human deviates on average 0.235 m from its global reference, whereas the robot deviates 0.277 m from its global reference. These two deviations show an approximately equal deviation of the human and robot from their global references.

In the third simulation scenario in Figures 11A,C, significant yielding behavior of the human is set with U_L,H = 0.6. It can be seen that both agents follow the path and motion request of the robot (Figure 11C). The average utility loss of the robot here is U_{L,R, mean} = 0.284. The third row in Table 1 shows the third simulation scenario that on average, a distance of 0.464 m is maintained between the robot and the human. The human deviates on average 0.578 m from its global reference, whereas the robot deviates 0.083 m from its global reference.

The simulation results in Figure 11 show an adaptive adjustment of the robot to the human’s motion request for different degrees of human’s concession. If the human is very unyielding, the robot adapts to the human’s trajectory. If the human is very compliant, the overall system follows the robot’s lead. If, on the other hand, the human is only compliant to a certain degree, both find a compromise between their movement wishes. These results confirm that a dynamic leadership role allocation is possible in the context of accompanying a human.

5 Conclusion

In the present work, a negotiation-based local trajectory planner that adapts to human behavior was implemented. For this purpose, a conventional local trajectory planner was used together with a negotiation-based human–machine interaction model from the literature for repeated negotiation over motion primitives. For this purpose, a different bidding strategy had to be introduced in the human–machine interaction model to achieve agreement even in applications without natural deadlines. In the presented application, a novel reciprocal tit-for-tat strategy was implemented that adapts to human behavior: If the human is very unyielding, the robot acts even more yielding. If the human insists very little on his own desire to move, the robot dictates the movement. This behavior was successfully demonstrated in the simulative evaluation. As a result, there is an automation behavior that adapts to human-in-the-lead behavior if the human wishes to do so or it adapts to automation-in-the-lead behavior if the human acts very yielding.

The work has only been tested simulatively so far and will be evaluated practically in the next step. In the course of this, it is particularly important to investigate how well the dynamic adaptation of the leadership role performs with real people. For this purpose, it must be investigated in particular to what extent the human input signal can be considered a reliable expression of the human’s desire to move or whether additional uncertainties in the interpretation of the input signal must be taken into account. In addition, dynamic obstacles must be considered for real-world applications, and a cooperative collision avoidance strategy must be implemented.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

Author contributions

All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.

Funding

This work was supported by the German Federal Ministry of Education and Research (BMBF) under the grant no. 16SV8401.

Acknowledgments

The authors acknowledge support from the KIT-Publication Fund of the Karlsruhe Institute of Technology.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Aguirre-Ollinger, G., and Yu, H. (2021). Omnidirectional platforms for gait training: Admittance-shaping control for enhanced mobility. J. Intelligent Robotic Syst. 101. doi:10.1007/s10846-021-01335-z

CrossRef Full Text | Google Scholar

Arechavaleta, G., Laumond, J.-P., Hicheur, H., and Berthoz, A. (2006). The nonholonomic nature of human locomotion: a modeling study. IEEE, 158–163. doi:10.1109/BIOROB.2006.1639077

CrossRef Full Text | Google Scholar

Baarslag, T. (2016). Exploring the strategy space of negotiating agents: A framework for bidding, learning and accepting in automated negotiation. Cham: Springer International Publishing.

Google Scholar

Bai, J., Song, A., Xu, B., Nie, J., and Li, H. (2017). A novel human-robot cooperative method for upper extremity rehabilitation. Int. J. Soc. Robot. 9, 265–275. doi:10.1007/s12369-016-0393-4

CrossRef Full Text | Google Scholar

Bueno, D. R., Viruete, E., and Montano, L. (2011). An autonomous tour guide robot in a next generation smart museum”. in The 5th international symposium on ubiquitous computing and ambient intelligence

Google Scholar

Fiore, M., Khambhaita, H., Milliez, G., and Alami, R. (2015). “An adaptive and proactive human-aware robot guide,” in Social robotics (Cham: Springer International Publishing), 194–203. doi:10.1007/978-3-319-25554-5_20

CrossRef Full Text | Google Scholar

Gasparetto, A., Boscariol, P., Lanzutti, A., and Vidoni, R. (2015). “Path planning and trajectory planning algorithms: A general overview,” in Motion and operation planning of robotic systems: Background and practical approaches. Editors G. Carbone, and F. Gomez-Bravo (Cham: Springer International Publishing), 3–27.

CrossRef Full Text | Google Scholar

Gul, F., Rahiman, W., and Nazli Alhady, S. S. (2019). A comprehensive study for robot navigation techniques. Cogent Eng. 6, 1632046. doi:10.1080/23311916.2019.1632046

CrossRef Full Text | Google Scholar

Huang, S., Teo, R. S. H., and Tan, K. K. (2019). Collision avoidance of multi unmanned aerial vehicles: A review. Annu. Rev. Control 48, 147–164. doi:10.1016/j.arcontrol.2019.10.001

CrossRef Full Text | Google Scholar

Li, W.-Z., Cao, G.-Z., and Zhu, A.-B. (2021). Review on control strategies for lower limb rehabilitation exoskeletons. IEEE Access 9, 123040–123060. doi:10.1109/access.2021.3110595

CrossRef Full Text | Google Scholar

Li, Z., Ren, Z., Zhao, K., Deng, C., and Feng, Y. (2020). Human-cooperative control design of a walking exoskeleton for body weight support. IEEE Trans. Ind. Inf. 16, 2985–2996. doi:10.1109/tii.2019.2900121

CrossRef Full Text | Google Scholar

Lu, R., Li, Z., Su, C.-Y., and Xue, A. (2014). Development and learning control of a human limb with a rehabilitation exoskeleton. IEEE Trans. Ind. Electron. 61, 3776–3785. doi:10.1109/TIE.2013.2275903

CrossRef Full Text | Google Scholar

Marder-Eppstein, E., and Perko, E. (2022). base_local_planner - ROS Wiki (2022). Available at: http://wiki.ros.org/base_local_planner (Accessed September 30, 2022).

Google Scholar

Mombaur, K., Laumond, J.-P., and Yoshida, E. (2008). An optimal control model unifying holonomic and nonholonomic walking, Humanoids 2008 - 8th IEEE-RAS International Conference on Humanoid Robots. (IEEE), 646–653. doi:10.1109/ICHR.2008.4756020

CrossRef Full Text | Google Scholar

Quintero, H. A., Farris, R. J., Ha, K., and Goldfarb, M. (2012). “Preliminary assessment of the efficacy of supplementing knee extension capability in a lower limb exoskeleton with FES,” in 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, San Diego, CA, USA, 28 August 2012 - 01 September 2012, 3360–3363. doi:10.1109/EMBC.2012.6346685

PubMed Abstract | CrossRef Full Text | Google Scholar

Rothfuß, S. (2022). Human-machine cooperative decision making. PhD Thesis. Karlsruhe, (Germany): Karlsruher Institut für Technologie. doi:10.5445/IR/1000143873

CrossRef Full Text | Google Scholar

Rothfuß, S., Schmidt, R., Flad, M., and Hohmann, S. (2019). “A concept for human-machine negotiation in advanced driving assistance systems,” in 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), Bari, Italy, 06-09 October 2019, 3116–3123. doi:10.1109/SMC.2019.8914282

CrossRef Full Text | Google Scholar

Stauffer, Y., Allemand, Y., Bouri, M., Fournier, J., Clavel, R., Metrailler, P., et al. (2009). The WalkTrainer—a new generation of walking reeducation device combining orthoses and muscle stimulation. IEEE Trans. Neural Syst. Rehabil. Eng. 17, 38–45. doi:10.1109/TNSRE.2008.2008288

PubMed Abstract | CrossRef Full Text | Google Scholar

Stuhlmacher, A. F., Gillespie, T. L., and Champagne, M. V. (1998). The impact of time pressure in negotiation: A meta-analysis. Int. J. Confl. Manag. 9, 97–116. doi:10.1108/eb022805

CrossRef Full Text | Google Scholar

Weber, K. (2015). Meestar: Ein Modell zur ethischen Evaluierung sozio-technischer Arrangements in der Pflege- und Gesundheitsversorgung. Tech. Alltags - Beitrag für gutes Leben?, 247–262.

Google Scholar

Zhou, C., Huang, B., and Fränti, P. (2022). A review of motion planning algorithms for intelligent robots. J. Intell. Manuf. 33, 387–424. doi:10.1007/s10845-021-01867-z

CrossRef Full Text | Google Scholar

Keywords: negotiation theory, human–robot interaction, human–machine cooperation, local trajectory planning, reciprocal tit-for-tat

Citation: Schneider J, Rothfuß S and Hohmann S (2022) Negotiation-based cooperative planning of local trajectories. Front. Control. Eng. 3:1058980. doi: 10.3389/fcteg.2022.1058980

Received: 30 September 2022; Accepted: 09 November 2022;
Published: 29 November 2022.

Edited by:

Yong Xu, Guangdong University of Technology, China

Reviewed by:

Ahmad Fakharian, Qazvin Islamic Azad University, Iran
Sunan Huang, National University of Singapore, Singapore

Copyright © 2022 Schneider, Rothfuß and Hohmann. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Julian Schneider, anVsaWFuLnNjaG5laWRlckBraXQuZWR1

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Negotiation-based cooperative planning of local trajectories

1 Introduction

1.1 Related work

1.1.1 Emancipated change of leadership roles

1.1.2 Negotiation theory

1.1.3 Motion planning

1.2 Contribution

2 Problem formulation, system structure, and system model

2.1 System model

2.2 Motion primitives

3 Negotiation-based local trajectory planner

3.1 Motion primitive estimation

3.2 Negotiation of motion primitives

3.3 Existence of a solution

4 Simulative evaluation

5 Conclusion

Data availability statement

Ethics statement

Author contributions

Funding

Acknowledgments

Conflict of interest

Publisher’s note

References

94% of researchers rate our articles as excellent or good

94% of researchers rate our articles as excellent or good