Robotics and Autonomous Systems 91 (2017) 226–233

Contents lists available at ScienceDirect
Robotics and Autonomous Systems
journal homepage: www.elsevier.com/locate/robot

Planning of goal-oriented motion from stochastic motion primitives and optimal controlling of joint torques in whole-body

Wataru Takano *, Yoshihiko Nakamura
Mechano-Informatics, University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan

* Corresponding author. Fax: +81 3818 0835. E-mail addresses: takano@ynl.t.u-tokyo.ac.jp (W. Takano), nakamura@ynl.t.u-tokyo.ac.jp (Y. Nakamura).

Highlights
• This paper proposes the synthesis of whole-body motions from stochastic motion primitives
• The joint torques are computed while preserving the profile of the synthesized motion and controlling the reaction forces
• Simulations demonstrate the validity of the motion synthesis and force control

Article history: Received 15 September 2016; Received in revised form January 2017; Accepted 28 January 2017; Available online February 2017.
Keywords: Motion primitive; Motion synthesis; Stochastic model

Abstract

Humanoid robots are expected to be integrated into daily life. This requires the robots to perform human-like actions that are easily understandable by humans. Learning by imitation is an effective framework that enables robots to generate the same motions that humans do. However, it is generally not useful for the robots to generate motions that are precisely the same as the learned motions, because the current environment is likely to differ from the environment where the motions were learned. The humanoid robot should synthesize motions that are adaptive to the current environment by modifying the learned motions. Previous research encoded captured human whole-body motions into hidden Markov models, which are hereafter referred to as motion primitives, and generated human-like motions based on the acquired motion primitives. The contact between the body and the environment also needs to be controlled, so that the humanoid robot's whole-body motion can be realized in its current environment. This paper proposes a novel approach to synthesizing kinematic data using the motion primitive and controlling the torques of all the joints in the humanoid robot to achieve the desired whole-body motions and contact forces. The experiments demonstrate the validity of the proposed approach to synthesizing and controlling whole-body motions by humanoid robots.

© 2017 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

1. Introduction

Humanoid robots need to understand human actions and communicate with humans using human-like gestures to be integrated into daily life. Approaches to representing human motions have been developed as a typical framework for imitation learning [1–3]. Representations of motion can be grouped into two categories. One approach encodes motions into a set of parameters of a dynamical system, such as systems of differential equations or neural networks [4–7]. The other approach uses stochastic models, typified by hidden Markov models (HMMs) [8–11]. The dynamical systems or stochastic models represent motions in abstract forms as their parameters. These abstract representations are called motion primitives. Robots learn motions through observation or demonstration as motion primitives, and subsequently use the motion primitives for motion recognition and motion generation.

One popular approach is the Dynamic Movement Primitive (DMP), which encodes a demonstration into the parameters of a differential equation.
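To make the DMP idea concrete, the following is a deliberately simplified sketch of encoding a single demonstrated degree of freedom into the parameters of a differential equation and replaying it toward a new goal. The gains, the time-indexed forcing term, and the omission of the canonical system and basis-function regression are simplifications of ours; this is illustrative only and is not the formulation used in [12–14].

```python
import numpy as np

# Simplified discrete DMP transformation system (unit time constant):
#   x'' = K*(g - x) - D*x' + (g - x0)*f(t),
# where the forcing term f is extracted from one demonstration.

def learn_forcing(demo, dt, K=150.0, D=25.0):
    """Invert the transformation system along a demonstrated 1-D trajectory."""
    v = np.gradient(demo, dt)                 # demonstrated velocity
    a = np.gradient(v, dt)                    # demonstrated acceleration
    g, x0 = demo[-1], demo[0]
    f = (a + D * v - K * (g - demo)) / (g - x0 + 1e-8)
    return f, g, x0

def rollout(f, g, x0, dt, K=150.0, D=25.0):
    """Integrate forward: converges to g while keeping the demonstrated shape."""
    x, v, traj = x0, 0.0, []
    for ft in f:
        a = K * (g - x) - D * v + (g - x0) * ft
        v += a * dt
        x += v * dt
        traj.append(x)
    return np.array(traj)

dt = 0.01
demo = np.sin(np.linspace(0.0, np.pi / 2.0, 200))  # demonstrated reach, 0 -> 1
f, g, x0 = learn_forcing(demo, dt)
adapted = rollout(f, g=1.5, x0=0.0, dt=dt)         # same shape, new goal
```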
Petric et al. created a virtual force field that attracts a trajectory to a target trajectory [12]. This approach generates a trajectory that satisfies a desired motion while preserving the natural posture of the training motions. Wang et al. also extended the DMP to modify the training trajectories while preserving their properties, creating a new trajectory based on kernel functions [13]. These approaches improve the reusability of the DMP, but they do not handle contact forces. Gams et al. developed a method to control forces using the DMP [14]: the DMP generates the kinematic motion in a feedforward manner, and the contact is controlled to maintain a predefined force in a feedback manner. This concept of synthesizing a trajectory in a feedforward manner and controlling the reaction force from the environment in a feedback manner is very similar to that of this paper.

Another approach uses stochastic models. Research based on stochastic models has focused on developing efficient algorithms for motion recognition, and has been applied to intelligent humanoid robots that can understand human actions [15–17]. In addition to motion recognition, several methods have been proposed to generate motions by using motion primitives [8,18–21]. Inamura et al. developed a sampling-based method to generate a sequence of joint angles of the entire robot body from HMMs [8]. In the first step, a node is sampled according to the probabilities of node transitions in the HMM, and a sequence of nodes is created. In the second step, vectors of joint angles are sampled according to the distribution of joint angles at each node in the node sequence derived in the first step. These two steps result in a motion trajectory represented by a sequence of joint angles. Calinon et al. proposed an approach to encoding joint angles and joint angular velocities into HMMs [21]. These HMMs can calculate the distribution of the joint angular velocities given the joint angles at the previous frame, sample the joint angular velocities from the conditional distribution, and update the joint angles at the current frame. This framework includes a feedback loop and is adaptive to the current state. Because the synthesized motions are likely to become the average of the training motions, they cannot be applied to environments considerably different from the one where the motions were originally learned. Moreover, these methods do not handle targeted motions, and as a consequence, the synthesized motion does not necessarily achieve a specific task. For example, if a robot learns a motion for grasping an object on a desk, then the robot can reach an object located at the same position by the learned motion primitive. However, the robot is not able to reach an object located at a position on a shelf different from the learned position on the desk. Robots should be able to modify the learned motions and perform flexible motions to adapt to their environment. Herzog et al. propose an approach to recognize and synthesize motions of manipulating objects at different locations using a parametric hidden Markov model (PHMM) [18]. PHMMs represent typical motions and compensate for different motions of a particular type.
Multiple motions are encoded into an HMM as a representative motion, and each motion variant is encoded into its own HMM. The parameters of the HMM of each motion variant are represented as a linear function of the parameters of the representative HMM and the position of the object. Consequently, individual motions are parameterized by the position of the object. Calinon et al. have developed a method for synthesizing motions by using descriptors of both the motion and the manipulated object [22]. These approaches do not control the contact forces that are important for motions interacting with environments or for manipulation motions.

As described above, several studies have tackled motion generation from motion primitives and have extended the generation to adapt to the environment for achieving reaching tasks. However, these approaches do not take into account the dynamics; more specifically, the contact force between the body and the environment, or the equation of motion for the robot body. The motion of the robot is created by the joint torques and the interaction force from the environment. Kinematics-based motion generation therefore needs to be coupled with force control such that the robot can perform kinematically and dynamically suitable motions. This paper proposes a novel approach to synthesizing whole-body motion that satisfies configurations for goal or subgoal postures from HMMs of motion primitives, and to subsequently controlling the torques of all the joints in a humanoid robot's body such that the robot can perform human-like motions and maintain the desired contact with the environment.

2. Motion synthesizer and torque controller

We propose a novel approach to synthesizing motions by using motion primitives and to subsequently controlling the joint torques in the humanoid robot's whole body, as shown in Fig. 1.

Fig. 1. A whole-body motion is encoded into an HMM that is referred to as a motion primitive. The motion synthesizer generates a sequence of joint angles that not only maintains the profile of motions learned as the motion primitive but also reaches the goal postures. The force controller computes the joint torques such that the humanoid robot can perform the synthesized motion and the reaction forces can be controlled as desired.

2.1. Representation of motion primitives

A configuration is defined as a complete specification of the pose of a human performer or a humanoid robot. We choose the angles of all the joints in the whole body as the configuration in this study. A whole-body motion is represented by a vector of joint angles x_t at time t. A motion segment is expressed by a motion trajectory {x_1, x_2, ..., x_l}. The motion segment is encoded into a set of parameters of an HMM; the HMM is hereinafter referred to as a "motion primitive". An HMM is a stochastic model defined by the compact notation λ = {Π, A, B}, where Π = {π_1, π_2, ..., π_m} is the set of initial node probabilities, A = {a_ij} is the matrix whose element a_ij is the probability of transition from the ith node to the jth node, and B is the set of output probabilities at the nodes. Encoding the motion segment optimizes these parameters of the HMM such that the likelihood of the motion segment being generated by the HMM is maximized. The HMM can be utilized for motion recognition and motion generation. Motion recognition finds the HMM, from among a set of HMMs, that is the most likely to generate the observed motion. Motion generation outputs a sequence of joint angles according to the distribution of the motion trajectories learned by the HMM. Thus, a humanoid robot learns motion segments in the form of motion primitives, recognizes observations, and generates its own motions by using the motion primitives.
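As a concrete illustration of this encoding, the sketch below fits a Gaussian-output HMM to a motion segment using the third-party hmmlearn package. The 30-node count matches the experiments in Section 3, but the random stand-in data, the diagonal covariances, and the omission of an explicit left-to-right transition constraint are simplifying assumptions of this sketch, not details taken from the paper.

```python
import numpy as np
from hmmlearn import hmm  # one possible off-the-shelf HMM implementation

d, l, m = 20, 150, 30                # joint dimensions, frames, HMM nodes
motion = np.random.randn(l, d)       # stand-in for a captured joint-angle sequence

# Encoding: Baum-Welch maximizes the likelihood of the segment under the HMM.
primitive = hmm.GaussianHMM(n_components=m, covariance_type="diag", n_iter=20)
primitive.fit(motion)

# Recognition: the forward algorithm scores an observation against each primitive.
log_likelihood = primitive.score(motion)

# Naive generation: sample a node sequence and joint angles from the model.
sampled_motion, node_sequence = primitive.sample(l)
```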
2.2. Synthesis of joint angle sequence

This subsection describes a method to synthesize a sequence of joint angles from a motion primitive in a way that enables a humanoid robot not only to perform human-like motions but also to satisfy given (sub)goal postures. The method introduces an objective function that consists of three terms. The first calculates the likelihood that the HMM generates a motion trajectory. The second calculates the distance between the motion trajectory and the goal postures. The third calculates the smoothness of the motion trajectory. The motion synthesis finds the motion trajectory that maximizes the objective function.

The first function, the log likelihood that motion trajectory {x_1, x_2, ..., x_l} is generated by the motion primitive λ, is denoted by

$$\psi_p = \ln P(x_1, x_2, \ldots, x_l \mid \lambda). \qquad (1)$$

This likelihood can be efficiently computed by the forward algorithm [23].

The variable δ_t is assumed to take a binary value: δ_t = 1 if a goal posture is given to the motion at time t, and δ_t = 0 otherwise. The second function in the objective function is derived from the difference between the configuration x_t and the goal posture x_{t,g} at time t. This function is written as a weighted quadratic form of these differences:

$$\psi_g = -\frac{w_g}{2} \sum_{t=1}^{l} \delta_t \left(x_t - x_{t,g}\right)^{\mathsf T} \left(x_t - x_{t,g}\right), \qquad (2)$$

where the positive weight parameter w_g is manually specified.

The joint-angular jerks are chosen as the measure of smoothness of the motion trajectory [24]. The third function is written as the weighted quadratic form

$$\psi_d = -\frac{w_d}{2} \sum_{t=4}^{l} \dddot{x}_t^{\mathsf T} \dddot{x}_t, \qquad \dddot{x}_t = x_t - 3x_{t-1} + 3x_{t-2} - x_{t-3}, \qquad (3)$$

where the positive weight parameter w_d is manually specified. These three terms lead to the resultant objective function

$$\Psi = \psi_p + \psi_g + \psi_d. \qquad (4)$$

The motion trajectory {x_1, x_2, ..., x_l} is rewritten as the vector

$$z = \left[x_1^{\mathsf T}, \ldots, x_l^{\mathsf T}\right]^{\mathsf T}, \qquad (5)$$

where the column vectors of joint angles x_t at time t are concatenated into a single vector z, and x^T denotes the transpose of x. The objective function in Eq. (4) can subsequently be rewritten in the simple form

$$\Psi(z) = \ln P(z \mid \lambda) - \frac{w_g}{2}\left(z - z_g\right)^{\mathsf T} \Delta_g \left(z - z_g\right) - \frac{w_d}{2}\, z^{\mathsf T} L\, z, \qquad (6)$$

where Δ_g and z_g are given as follows:

$$\Delta_g = \begin{bmatrix} \delta_1 I_d & & O \\ & \ddots & \\ O & & \delta_l I_d \end{bmatrix}, \qquad (7)$$

$$z_g = \left[x_{1,g}^{\mathsf T}, \ldots, x_{l,g}^{\mathsf T}\right]^{\mathsf T}. \qquad (8)$$

Here, I_d and O_d are the d × d identity matrix and zero matrix respectively, where d is the number of dimensions of x_t. Additionally, the matrix L is written as the dl × dl Laplacian matrix

$$L = \begin{bmatrix} & O_{3d,\,dl} & \\ A_{d,4d} & \cdots & O_d \\ O_d \;\; A_{d,4d} & \cdots & O_d \\ & \ddots & \\ O_d & \cdots & A_{d,4d} \end{bmatrix}, \qquad (9)$$

$$A_{d,4d} = \begin{bmatrix} -I_d & 3I_d & -3I_d & I_d \end{bmatrix}, \qquad (10)$$

whose block row t (t ≥ 4) applies the third-order difference of Eq. (3) to (x_{t−3}, x_{t−2}, x_{t−1}, x_t).

An optimal motion trajectory z that maximizes the objective function Ψ(z) has to be computed. In this study, the optimal motion trajectory is derived by using an iterative method.
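The quadratic terms of Eq. (6) are straightforward to assemble numerically. The sketch below builds Δ_g, z_g, and the third-difference blocks of Eq. (10) with toy dimensions. It realizes the jerk penalty through a stacked difference operator D, so that ψ_d = −(w_d/2)‖Dz‖²; treating the quadratic form this way is an implementation choice of this sketch rather than a detail spelled out in the paper.

```python
import numpy as np

d, l = 2, 6                                    # toy joint dimension and length
delta = np.zeros(l); delta[-1] = 1.0           # goal posture at the final frame only

Delta_g = np.kron(np.diag(delta), np.eye(d))   # Eq. (7)

# Third-difference operator: block row t (t >= 4, 1-based) applies
# A_{d,4d} = [-I 3I -3I I] to (x_{t-3}, x_{t-2}, x_{t-1}, x_t), the jerk of Eq. (3).
D = np.zeros((d * l, d * l))
for t in range(3, l):
    for j, c in enumerate([-1.0, 3.0, -3.0, 1.0]):
        D[d * t:d * (t + 1), d * (t - 3 + j):d * (t - 2 + j)] += c * np.eye(d)

z = np.random.randn(d * l)                     # stacked trajectory, Eq. (5)
z_g = np.tile(np.random.randn(d), l)           # stacked goal postures, Eq. (8)
wg, wd = 100.0, 100.0                          # weights used in the experiments
psi_g = -0.5 * wg * (z - z_g) @ Delta_g @ (z - z_g)   # Eq. (2)
psi_d = -0.5 * wd * (D @ z) @ (D @ z)                 # Eq. (3)
```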
The motion trajectory z^(k) is obtained after k iterations. The difference between the objective functions of z^(k+1) and z^(k) is expressed by

$$\begin{aligned}
\Psi(z^{(k+1)}) - \Psi(z^{(k)}) ={}& \ln P(z^{(k+1)} \mid \lambda) - \ln P(z^{(k)} \mid \lambda) \\
&- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^{\mathsf T} \Delta_g \left(z^{(k+1)} - z_g\right) + \frac{w_g}{2}\left(z^{(k)} - z_g\right)^{\mathsf T} \Delta_g \left(z^{(k)} - z_g\right) \\
&- \frac{w_d}{2}\, z^{(k+1)\mathsf T} L\, z^{(k+1)} + \frac{w_d}{2}\, z^{(k)\mathsf T} L\, z^{(k)}. \qquad (11)
\end{aligned}$$

The likelihood P(z|λ) that the motion trajectory z is generated by the HMM λ can be rewritten as

$$\ln P(z \mid \lambda) = \ln \sum_{\forall q} P(z, q \mid \lambda) = \ln \sum_{\forall q} P(q \mid z)\, \frac{P(z, q \mid \lambda)}{P(q \mid z)} = \ln \mathbb{E}_{P(q \mid z)}\!\left[\frac{P(z, q \mid \lambda)}{P(q \mid z)}\right]. \qquad (12)$$

Here q = {q_1, q_2, ..., q_l} is a sequence of nodes in the HMM λ, where q_t is the node at time t. P(z, q|λ) represents the likelihood that motion trajectory z is generated along the node sequence q by the HMM λ. Additionally, E_{P(q|z)}[R] denotes the expected value of R under P(q|z), the distribution of the sequence of nodes q. According to Jensen's inequality, Eq. (12) satisfies the inequality

$$\ln \mathbb{E}_{P(q \mid z)}\!\left[\frac{P(z, q \mid \lambda)}{P(q \mid z)}\right] \geq \mathbb{E}_{P(q \mid z)}\!\left[\ln \frac{P(z, q \mid \lambda)}{P(q \mid z)}\right]. \qquad (13)$$

By using Eq. (13) and P(z, q|λ) = P(q|z, λ)P(z|λ), the following equality can be derived:

$$\ln P(z \mid \lambda) - \mathbb{E}_{P(q \mid z)}\!\left[\ln \frac{P(z, q \mid \lambda)}{P(q \mid z)}\right] = \mathbb{E}_{P(q \mid z)}\!\left[\ln \frac{P(q \mid z)}{P(q \mid z, \lambda)}\right] = \mathrm{KL}\left(P(q \mid z) \,\|\, P(q \mid z, \lambda)\right). \qquad (14)$$

Eq. (14) is the Kullback–Leibler divergence, which measures the dissimilarity between the distribution P(q|z) and the distribution P(q|z, λ); it becomes zero only when these two distributions are the same and is positive otherwise. Eq. (11) can be rewritten in this notation as

$$\begin{aligned}
\Psi(z^{(k+1)}) - \Psi(z^{(k)}) ={}& \mathbb{E}_{P(q \mid z^{(k+1)})}\!\left[\ln \frac{P(z^{(k+1)}, q \mid \lambda)}{P(q \mid z^{(k+1)})}\right] + \mathrm{KL}\left(P(q \mid z^{(k+1)}) \,\|\, P(q \mid z^{(k+1)}, \lambda)\right) \\
&- \mathbb{E}_{P(q \mid z^{(k)})}\!\left[\ln \frac{P(z^{(k)}, q \mid \lambda)}{P(q \mid z^{(k)})}\right] - \mathrm{KL}\left(P(q \mid z^{(k)}) \,\|\, P(q \mid z^{(k)}, \lambda)\right) \\
&- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^{\mathsf T} \Delta_g \left(z^{(k+1)} - z_g\right) + \frac{w_g}{2}\left(z^{(k)} - z_g\right)^{\mathsf T} \Delta_g \left(z^{(k)} - z_g\right) \\
&- \frac{w_d}{2}\, z^{(k+1)\mathsf T} L\, z^{(k+1)} + \frac{w_d}{2}\, z^{(k)\mathsf T} L\, z^{(k)}. \qquad (15)
\end{aligned}$$

We approximate the distributions of the node sequence q given the motion trajectory. More specifically, we set both P(q|z^(k+1)) and P(q|z^(k)) to the distribution P(q|z^(k), λ):

$$P(q \mid z^{(k+1)}) = P(q \mid z^{(k)}, \lambda), \qquad P(q \mid z^{(k)}) = P(q \mid z^{(k)}, \lambda). \qquad (16)$$

These approximations make the Kullback–Leibler term in the first line of Eq. (15) greater than or equal to zero, and the Kullback–Leibler term in the second line equal to zero. Therefore, the (k+1)th iteration only has to find a new motion trajectory z^(k+1) that gives a positive value of Eq. (15) with these two Kullback–Leibler terms removed; Ψ(z^(k+1)) then becomes greater than Ψ(z^(k)), so the trajectory z^(k+1) increases the value of the objective function, and the iteration leads to a gradual increase of the objective. Synthesis of the motion trajectory z given goal postures according to motion primitive λ can thus be formulated as

$$\begin{aligned}
z^{(k+1)} = \arg\max_{z^{(k+1)}} \bigg\{& \mathbb{E}_{P(q \mid z^{(k)}, \lambda)}\!\left[\ln \frac{P(z^{(k+1)}, q \mid \lambda)}{P(q \mid z^{(k)}, \lambda)}\right] - \mathbb{E}_{P(q \mid z^{(k)}, \lambda)}\!\left[\ln \frac{P(z^{(k)}, q \mid \lambda)}{P(q \mid z^{(k)}, \lambda)}\right] \\
&- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^{\mathsf T} \Delta_g \left(z^{(k+1)} - z_g\right) + \frac{w_g}{2}\left(z^{(k)} - z_g\right)^{\mathsf T} \Delta_g \left(z^{(k)} - z_g\right) \\
&- \frac{w_d}{2}\, z^{(k+1)\mathsf T} L\, z^{(k+1)} + \frac{w_d}{2}\, z^{(k)\mathsf T} L\, z^{(k)} \bigg\}. \qquad (17)
\end{aligned}$$

By eliminating the terms that do not include the motion trajectory z^(k+1), and using P(z, q|λ) = P(z|q, λ)P(q|λ), Eq. (17) can be rewritten as

$$z^{(k+1)} = \arg\max_{z^{(k+1)}} \left\{ \mathbb{E}_{P(q \mid z^{(k)}, \lambda)}\!\left[\ln P(z^{(k+1)} \mid q, \lambda)\right] - \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^{\mathsf T} \Delta_g \left(z^{(k+1)} - z_g\right) - \frac{w_d}{2}\, z^{(k+1)\mathsf T} L\, z^{(k+1)} \right\}. \qquad (18)$$

The first term in Eq. (18) is expressed by the summation of the probability P(q|z^(k), λ) over all node sequences q, so Eq. (18) is rewritten as

$$z^{(k+1)} = \arg\max_{z^{(k+1)}} \left\{ \sum_{\forall q} P(q \mid z^{(k)}, \lambda) \ln P(z^{(k+1)} \mid q, \lambda) - \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^{\mathsf T} \Delta_g \left(z^{(k+1)} - z_g\right) - \frac{w_d}{2}\, z^{(k+1)\mathsf T} L\, z^{(k+1)} \right\}. \qquad (19)$$

P(z|q, λ) is the likelihood that the motion trajectory z is generated along the sequence of nodes q from HMM λ, and it can be expressed in closed form using the parameters of the HMM:

$$\ln P(z^{(k+1)} \mid q, \lambda) = \ln \pi_{q_1} + \sum_{t=1}^{l-1} \ln a_{q_t, q_{t+1}} + \sum_{t=1}^{l} \ln b_{q_t}(x_t^{(k+1)}), \qquad (20)$$

where b_i(x) is the probability of the vector of joint angles x being generated from the ith node. By substituting Eq. (20) into Eq. (19) and dropping the terms in π and a, which do not depend on z^(k+1), Eq. (19) can be rewritten as

$$z^{(k+1)} = \arg\max_{z^{(k+1)}} \left\{ \sum_{i=1}^{m} \sum_{t=1}^{l} \ln b_i(x_t^{(k+1)})\, P(q_t = i \mid z^{(k)}, \lambda) - \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^{\mathsf T} \Delta_g \left(z^{(k+1)} - z_g\right) - \frac{w_d}{2}\, z^{(k+1)\mathsf T} L\, z^{(k+1)} \right\}, \qquad (21)$$

where P(q_t = i|z^(k), λ) is the probability of staying at the ith node at time t given the motion trajectory z^(k).
The output probability b_i(x) follows a Gaussian distribution, and the logarithm of this probability can be written as

$$\ln b_i(x) = -\frac{1}{2}\left[ d \ln(2\pi) + \ln |\Sigma_i| + \left(x - \mu_i\right)^{\mathsf T} \Sigma_i^{-1} \left(x - \mu_i\right) \right], \qquad (22)$$

where d is the number of dimensions of the vector x. By substituting Eq. (22) into Eq. (21) and removing the terms that are independent of the motion trajectory z^(k+1), Eq. (21) can be deformed into

$$z^{(k+1)} = \arg\max_{z^{(k+1)}} \left\{ -\frac{1}{2}\sum_{i=1}^{m} \sum_{t=1}^{l} \left(x_t^{(k+1)} - \mu_i\right)^{\mathsf T} \Sigma_i^{-1} \left(x_t^{(k+1)} - \mu_i\right) P(q_t = i \mid z^{(k)}, \lambda) - \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^{\mathsf T} \Delta_g \left(z^{(k+1)} - z_g\right) - \frac{w_d}{2}\, z^{(k+1)\mathsf T} L\, z^{(k+1)} \right\}. \qquad (23)$$

We define the dl × dl matrix V_i and the dl-dimensional vector μ̃_i as follows:

$$V_i = \begin{bmatrix} \Sigma_i^{-1} P(q_1 = i \mid z^{(k)}, \lambda) & & O \\ & \ddots & \\ O & & \Sigma_i^{-1} P(q_l = i \mid z^{(k)}, \lambda) \end{bmatrix}, \qquad (24)$$

$$\tilde\mu_i = \left[\mu_i^{\mathsf T}, \ldots, \mu_i^{\mathsf T}\right]^{\mathsf T}. \qquad (25)$$

Therefore, Eq. (23) can be rewritten as

$$z^{(k+1)} = \arg\max_{z^{(k+1)}} \left\{ -\frac{1}{2}\sum_{i=1}^{m} \left(z^{(k+1)} - \tilde\mu_i\right)^{\mathsf T} V_i \left(z^{(k+1)} - \tilde\mu_i\right) - \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^{\mathsf T} \Delta_g \left(z^{(k+1)} - z_g\right) - \frac{w_d}{2}\, z^{(k+1)\mathsf T} L\, z^{(k+1)} \right\}. \qquad (26)$$

The optimal motion trajectory z^(k+1) at the (k+1)th iteration can then be derived in closed form:

$$z^{(k+1)} = \left(\sum_{i=1}^{m} V_i + w_d L + w_g \Delta_g\right)^{-1} \left(\sum_{i=1}^{m} V_i \tilde\mu_i + w_g \Delta_g z_g\right). \qquad (27)$$

Note that the initial motion trajectory z^(0) is set randomly and that the iterative optimization is terminated after a predefined number of iterations.
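A direct transcription of the update in Eq. (27) is compact. In the sketch below, the node posteriors gamma[t, i] = P(q_t = i | z^(k), λ), which would come from the forward-backward algorithm on the current trajectory, and the Gaussian parameters μ_i and Σ_i⁻¹ are assumed to be given; the function name and data layout are ours, not the authors'.

```python
import numpy as np

def synthesis_step(gamma, mu, Sigma_inv, L, Delta_g, z_g, wg, wd):
    """One iteration of Eq. (27): returns z^(k+1) from the current posteriors.

    gamma:      (l, m) node posteriors P(q_t = i | z^(k), lambda)
    mu:         (m, d) Gaussian means of the HMM output distributions
    Sigma_inv:  (m, d, d) inverse covariances
    L, Delta_g: (dl, dl) jerk and goal-weight matrices; z_g: (dl,) goal vector
    """
    l, m = gamma.shape
    A = wd * L + wg * Delta_g               # accumulates sum_i V_i + wd L + wg Delta_g
    b = wg * Delta_g @ z_g                  # accumulates sum_i V_i mu_tilde_i + wg Delta_g z_g
    for i in range(m):
        # V_i of Eq. (24): block diagonal with blocks gamma[t, i] * Sigma_i^{-1}
        V_i = np.kron(np.diag(gamma[:, i]), Sigma_inv[i])
        A += V_i
        b += V_i @ np.tile(mu[i], l)        # mu_tilde_i of Eq. (25)
    return np.linalg.solve(A, b)            # Eq. (27)
```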
2.3. Control of reaction forces

This subsection describes a method to calculate the torques of all the joints in the humanoid robot's whole body in order to control not only the synthesized motion trajectory but also the reaction forces from the contacting environment. The equation of motion of the humanoid robot is formulated as follows [25,26]:

$$\begin{bmatrix} M_{bb} & M_{bj} \\ M_{jb} & M_{jj} \end{bmatrix}\begin{bmatrix} \ddot q_b \\ \ddot q_j \end{bmatrix} + \begin{bmatrix} C_b \\ C_j \end{bmatrix} + \begin{bmatrix} g_b \\ g_j \end{bmatrix} = \begin{bmatrix} 0 \\ \tau \end{bmatrix} + \sum_{i=1}^{N_c}\begin{bmatrix} J_{bi}^{\mathsf T} \\ J_{ji}^{\mathsf T} \end{bmatrix} F_i, \qquad (28)$$

where M_bb ∈ R^{6×6} is the inertia matrix of the base link, M_bj ∈ R^{6×n} is the coupling inertia matrix, M_jb is the transpose of M_bj, M_jj ∈ R^{n×n} is the inertia matrix of the links, n denotes the number of joints, q_b ∈ R^6 and q_j ∈ R^n are the generalized coordinates of the base link and the joint angle vector, C_b ∈ R^6 and C_j ∈ R^n are the centrifugal and Coriolis terms, g_b ∈ R^6 and g_j ∈ R^n are the gravitational terms, τ ∈ R^n is the vector of joint torques, N_c is the number of contact points, J_bi ∈ R^{6×6} is the Jacobian matrix that relates the base link velocity to the velocity of the ith contact point, J_ji ∈ R^{6×n} is the Jacobian matrix that relates the joint angular velocity to the velocity of the ith contact point, and F_i ∈ R^6 is the vector of reaction forces and moments at the ith contact point.

We choose the desired contact points, each of which is an element of the set F^(d), and set the desired contact forces and moments for each chosen contact point; F_i^(d) is the vector of desired contact forces and moments at the ith contact point of the set F^(d). Additionally, we define the set of all the contact points except for the desired contact points as F. The objective function Γ of the vector of joint torques τ and the vectors of reaction forces and moments F_i (i = 1, 2, ..., N_c) is defined as follows:

$$\Gamma = \frac{w_{F^{(d)}}}{2}\sum_{i \in F^{(d)}} \left\| F_i - F_i^{(d)} \right\|^2 + \frac{w_F}{2}\sum_{i \in F} \left\| F_i \right\|^2 + \frac{w_\tau}{2}\left\|\tau\right\|^2, \qquad (29)$$

where w_{F^(d)}, w_F, and w_τ are positive weight parameters. The vector u is defined as

$$u = \left[\tau^{\mathsf T},\; F_1^{\mathsf T}, \ldots, F_{|F^{(d)}|}^{\mathsf T},\; F_{|F^{(d)}|+1}^{\mathsf T}, \ldots, F_{|F^{(d)}+F|}^{\mathsf T}\right]^{\mathsf T}, \qquad (30)$$

where F_i (i = 1, 2, ..., |F^(d)|) are the vectors of reaction forces and moments at the desired contact points, and F_i (i = |F^(d)|+1, ..., |F^(d)+F|) are those at the other contact points. Subsequently, Eq. (29) can be rewritten, up to a constant term independent of u, as

$$\Gamma(u) = \frac{1}{2}\, u^{\mathsf T} \tilde A\, u + \tilde b^{\mathsf T} u, \qquad (31)$$

where Ã and b̃ are given as follows:

$$\tilde A = \begin{bmatrix} w_\tau I_n & & \\ & w_{F^{(d)}} I_{6|F^{(d)}|} & \\ & & w_F I_{6|F|} \end{bmatrix}, \qquad (32)$$

$$\tilde b = \left[ 0_n^{\mathsf T},\; -w_{F^{(d)}} F_1^{(d)\mathsf T}, \ldots, -w_{F^{(d)}} F_{|F^{(d)}|}^{(d)\mathsf T},\; 0_{6|F|}^{\mathsf T} \right]^{\mathsf T}, \qquad (33)$$

where 0_n denotes the n-dimensional zero vector.

The synthesized motion trajectory gives a sequence of configurations {q_1^(s), q_2^(s), ..., q_l^(s)}, where configuration q_t^(s) = [q_b^(s)T, q_j^(s)T]^T consists of the generalized coordinates of the base link and the joint angle vector. The velocity q̇_t^(s) and acceleration q̈_t^(s) of q_t^(s) at time t are given by the finite differences q̇_t^(s) = (q_t^(s) − q_{t−1}^(s))/Δt and q̈_t^(s) = (q̇_t^(s) − q̇_{t−1}^(s))/Δt, where Δt is the sampling interval. The vectors q_t^(s) and q̇_t^(s) are set to the desired configuration q_t^(d) and the desired velocity q̇_t^(d), and the acceleration q̈_t^(s) is modified into the following desired acceleration:

$$\ddot q_t^{(d)} = \ddot q_t^{(s)} + K_D\left(\dot q_t^{(s)} - \dot q_t\right) + K_P\left(q_t^{(s)} - q_t\right), \qquad (34)$$

where q_t is the measured configuration. These desired configuration q_t^(d), velocity q̇_t^(d), and acceleration q̈_t^(d) are substituted into Eq. (28), and the following constraints can be yielded:

$$M_b\, \ddot q^{(d)} + C_b + g_b = \sum_{i \in F^{(d)}} J_{bi}^{\mathsf T} F_i + \sum_{i \in F} J_{bi}^{\mathsf T} F_i, \qquad (35)$$

$$M_j\, \ddot q^{(d)} + C_j + g_j = \tau + \sum_{i \in F^{(d)}} J_{ji}^{\mathsf T} F_i + \sum_{i \in F} J_{ji}^{\mathsf T} F_i, \qquad (36)$$

where the matrices M_b and M_j are given as M_b = [M_bb M_bj] and M_j = [M_jb M_jj] respectively. By denoting the left-hand sides of Eqs. (35) and (36) by the vectors d_1 and d_2 respectively, and using Eq. (30), Eqs. (35) and (36) can be reduced as follows:

$$\begin{bmatrix} O_{6,n} & J_{bF^{(d)}}^{\mathsf T} & J_{bF}^{\mathsf T} \\ I_n & J_{jF^{(d)}}^{\mathsf T} & J_{jF}^{\mathsf T} \end{bmatrix} \begin{bmatrix} \tau \\ F_{F^{(d)}} \\ F_F \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}, \qquad (37)$$

where

$$J_{bF^{(d)}}^{\mathsf T} = \left[ J_{b1}^{\mathsf T} \cdots J_{b|F^{(d)}|}^{\mathsf T} \right], \qquad (38)$$

$$J_{bF}^{\mathsf T} = \left[ J_{b|F^{(d)}|+1}^{\mathsf T} \cdots J_{b|F^{(d)}+F|}^{\mathsf T} \right], \qquad (39)$$

$$J_{jF^{(d)}}^{\mathsf T} = \left[ J_{j1}^{\mathsf T} \cdots J_{j|F^{(d)}|}^{\mathsf T} \right], \qquad (40)$$

$$J_{jF}^{\mathsf T} = \left[ J_{j|F^{(d)}|+1}^{\mathsf T} \cdots J_{j|F^{(d)}+F|}^{\mathsf T} \right]. \qquad (41)$$

Eq. (37) consequently results in the linear constraints

$$\tilde C u = \tilde d, \qquad (42)$$

where

$$\tilde C = \begin{bmatrix} O_{6,n} & J_{bF^{(d)}}^{\mathsf T} & J_{bF}^{\mathsf T} \\ I_n & J_{jF^{(d)}}^{\mathsf T} & J_{jF}^{\mathsf T} \end{bmatrix}, \qquad (43)$$

$$\tilde d = \left[ d_1^{\mathsf T}\; d_2^{\mathsf T} \right]^{\mathsf T}. \qquad (44)$$

The optimal u that minimizes the objective function represented by Eq. (31) subject to the constraint represented by Eq. (42) can be calculated using quadratic programming. The joint torques that are elements of the optimal u are utilized to control the humanoid robot's whole body, so that the robot not only maintains the profile of the synthesized motion trajectory but also controls the reaction forces from the contacting environment as desired.
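The paper states only that quadratic programming is used. Because Eq. (31) is a convex quadratic and Eq. (42) is a linear equality constraint, one possible realization is to solve the KKT conditions directly as a single linear system, as sketched below with all problem matrices (the weights, the stacked Jacobians, and the dynamics terms d_1 and d_2 of Eqs. (32)-(44)) treated as given data.

```python
import numpy as np

def solve_torques(A_tilde, b_tilde, C_tilde, d_tilde):
    """Minimize 1/2 u^T A u + b^T u subject to C u = d (Eqs. (31) and (42)).

    KKT conditions:  [A  C^T] [u ]   [-b]
                     [C   0 ] [nu] = [ d]
    Returns the optimal u = [tau; F_{F^(d)}; F_F].
    """
    n = A_tilde.shape[0]
    k = C_tilde.shape[0]
    KKT = np.block([[A_tilde, C_tilde.T],
                    [C_tilde, np.zeros((k, k))]])
    rhs = np.concatenate([-b_tilde, d_tilde])
    return np.linalg.solve(KKT, rhs)[:n]
```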
3. Experimental verification

We tested the synthesis of whole-body motion of a humanoid robot and the control of reaction forces from the contact environment in simulation. To create motion primitives with HMMs, we measured human whole-body motions with an optical motion capture system. The positions of 34 markers attached to a human performer were recorded. The sequence of position data was converted to angles of all the joints in the humanoid robot by inverse kinematic computation, and a sequence of joint angles was consequently obtained. In this study, we recorded three motions of "touching an object with the right hand" and created the corresponding motion primitive using a 30-node left-to-right HMM.

The first test was conducted on synthesizing the whole-body motion of the robot from the motion primitive "touching an object with the right hand". We set goal postures at the final frame such that the humanoid robot could touch an object at different positions with the right hand. More specifically, two different goal postures were set for reaching to touch objects at high and low positions. Fig. 2 shows the motions generated from the motion primitive: in the left panel, no goal posture is given; in the middle and right panels, the goal postures of touching an object at low and high positions are given, respectively.

Fig. 2. The motions generated from the motion primitive "touching an object with the right hand". The left panel shows the motion generated given no goal posture. The middle and right panels show the motions generated given goal postures of touching an object at low and high positions, respectively.

Fig. 3 shows the trajectories of the right and left shoulder joint angles for the given goal postures. The distribution can be estimated by using the Gaussian distribution at the node that is the most likely to generate the joint angle at each frame. Here, we manually set both of the positive weight parameters w_g and w_d to 100. For the right shoulder, the synthesized motions for reaching with the right hand to high and low positions were similar to the motion generated from the motion primitive given no goal posture. The synthesized motions for reaching to low and high positions converged to −0.9 rad and −1.5 rad respectively at the final frame. The three corresponding motions for the left shoulder are exactly the same as one another. The synthesized motions are very smooth, without abrupt acceleration, and come close to the goal postures while maintaining the profile of joint angles learned as the motion primitive.

Fig. 3. Profiles of right and left shoulder joint angles. The red solid line shows the joint angle trajectory generated from the motion primitive given no goal posture. The red-shaded graph shows the distribution of joint trajectories for the motion primitive. The green and blue solid lines show the synthesized motions when different goal postures were given. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

The second test was conducted on control of both the synthesized motion and the reaction forces from the contact environment. We used the synthesized motion of touching an object at a low position with the right hand. A wall was created in front of the humanoid robot in the simulation, and the robot touched the wall with the right hand. The desired reaction force on the right hand was set to (10, 0, 0) kN, where the first component is the force perpendicular to the wall surface, and the second and third components are forces horizontal to the wall. The desired vertical reaction forces acting on both feet and both hips of the robot were set to one-fourth of its total weight.
Additionally, we set the positive weight parameters w_{F^(d)}, w_F, and w_τ, with w_{F^(d)} set to 10. The bottom panel in Fig. 4 shows the humanoid robot's whole-body motion controlled by the proposed method. The robot raised its right hand, touched the wall, and bent the right wrist outward to control the reaction force from the contacting wall.

Fig. 4. The motion "touching an object with the right hand" given the desired reaction force perpendicular to the wall. The top panel shows the whole-body motion controlled by the position controller, and the bottom panel shows the whole-body motion controlled by the force controller.

The bottom panel in Fig. 5 shows the profile of reaction forces on the right hand during application of our proposed controller. The reaction force perpendicular to the wall was maintained around 10 kN, and the reaction forces horizontal to the wall were maintained at a constant level of zero. Large reaction forces were measured when the right hand first came into contact with the wall, and so a limitation of this controller is its performance at the transition to contact. However, as shown in Table 1, the average and standard deviation of the reaction force perpendicular to the wall were −10.7 kN and 14.1 kN, and the averages of the reaction forces horizontal to the wall were −1.12 kN and 0.02 kN, with standard deviations of 2.34 kN and 1.62 kN, respectively.

Fig. 5. Profiles of reaction forces on the right hand during the motion of touching the wall with the right hand. The red solid line shows the reaction force perpendicular to the wall. The green and blue solid lines show the reaction forces horizontal to the wall. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Table 1
Averages and standard deviations of the reaction forces. F̄x is the average of the reaction force perpendicular to the wall, and σ̄x is its standard deviation. F̄y and F̄z are the averages of the reaction forces horizontal to the wall, and σ̄y and σ̄z are their standard deviations. The top and bottom panels show the statistics during reaching for an object at low and high positions, respectively.

Low position:        Position control    Force control
  F̄x (kN)            −29.9               −10.7
  F̄y (kN)            3.27                −1.12
  F̄z (kN)            0.39                0.02
  σ̄x (kN)            107.5               14.1
  σ̄y (kN)            20.4                2.34
  σ̄z (kN)            20.0                1.62

High position:       Position control    Force control
  F̄x (kN)            −32.2               −11.2
  F̄y (kN)            3.50                −1.15
  F̄z (kN)            0.44                0.07
  σ̄x (kN)            119.1               14.1
  σ̄y (kN)            20.8                2.32
  σ̄z (kN)            20.1                1.69

We additionally tested the force controller on the motion of touching an object at a high position with the right hand. The bottom panel of Table 1 shows the statistics of the reaction forces; the reaction force perpendicular to the wall surface was again controlled to be maintained at about 10 kN.

We compared our proposed method with a position controller (proportional–derivative control) for whole-body motion. This controller computed the torques of all the joints in the humanoid robot as follows:

$$\tau = K_p\left(q_j^{(d)} - q_j\right) + K_d\left(\dot q_j^{(d)} - \dot q_j\right), \qquad (45)$$

where K_p and K_d are the control parameters.
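For completeness, a minimal sketch of this baseline controller; the gain values are placeholders, not those used in the experiments.

```python
import numpy as np

def pd_torques(q, dq, q_des, dq_des, Kp=200.0, Kd=20.0):
    """Joint torques from PD feedback on the synthesized trajectory, Eq. (45).
    No force feedback: contact forces are left uncontrolled."""
    return Kp * (np.asarray(q_des) - np.asarray(q)) \
         + Kd * (np.asarray(dq_des) - np.asarray(dq))
```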
The top panel in Fig. 4 shows the whole-body motion where the joint angles were controlled according to Eq. (45) and the reaction forces were not controlled. The robot did not bend the right wrist even after the right hand contacted the wall, because only the learned kinematic profile was maintained. The top panel in Fig. 5 shows the profile of the reaction forces on the right hand when the position controller was used. The reaction force perpendicular to the wall did not converge. As shown in Table 1, the errors between the measured reaction forces and the desired reaction forces were larger during application of the position controller than during application of our proposed force controller.

In the simulation experiments, the reaction force could thus be controlled while maintaining a kinematic profile similar to the training motion. However, several problems remain that could become critical, especially when a real humanoid robot is controlled. The first is how to control the reaction force at the first contact with the environment: a large contact force was observed and was not suitably controlled when the robot's hand touched the environment, so an additional feedback controller needs to be developed. Another problem is the scalability of our proposed method: the motions were limited to reaching a target object. After the reaching phase, the robot is required to grasp or manipulate the object, so our proposed method needs to be extended to controlling the robot's hands.

4. Conclusion

In this paper we have described a method for synthesizing whole-body motion for a humanoid robot from a motion primitive and controlling reaction forces while maintaining the synthesized motion. The motions are represented by sequences of the angles of all joints in the robot, and these sequences are encoded into an HMM called a motion primitive. The objective function takes into consideration three evaluations: the likelihood that the synthesized motion is generated by the HMM, the distance between the synthesized motion and given goal postures, and the total jerk along the synthesized motion. We proposed an algorithm to find an optimal sequence of joint angles that maximizes the objective function, and this optimal sequence results in the synthesized whole-body motion. We also proposed a method for computing the input torques of all the joints in the robot, so that it can perform the synthesized motion, and for controlling the reaction forces from the contact environment as desired. This objective function consists of the error between the actual reaction force and the desired one, the total joint torque, and the total reaction force. The constraint for the objective function is given by the equation of motion for the synthesized motion. The optimal joint torques can be found by minimizing the objective function subject to this constraint by using quadratic programming.

Our proposed methods for motion synthesis and force control were tested by using the motion of touching an object with the right hand. The experiments on motion synthesis showed that the proposed method could generate adaptive whole-body motions of touching an object at a high position and touching an object at a low position. Additionally, experimental results on force control clearly showed the method's validity. The error of the reaction force perpendicular to the wall was 0.7 kN, far smaller than the 19.9 kN error obtained by position control.

Acknowledgments

This research was partially supported by a Grant-in-Aid for Young Scientists (A) (No. 26700021) from the Japan Society for the Promotion of Science and by the Strategic Information and Communications R&D Promotion Program (No. 142103011) of the Ministry of Internal Affairs and Communications.

References

[1] C. Breazeal, B. Scassellati, Robots that imitate humans, Trends Cogn. Sci. 6 (11) (2002) 481–487.
[2] S. Schaal, Is imitation learning the route to humanoid robots? Trends Cogn. Sci. 3 (6) (1999) 233–242.
[3] M.J. Mataric, Getting humanoids to move and imitate, IEEE Intell. Syst. 15 (4) (2000) 18–24.
[4] M. Okada, K. Tatani, Y. Nakamura, Polynomial design of the nonlinear dynamics for the brain-like information processing of whole body motion, in: Proceedings of the IEEE International Conference on Robotics and Automation, 2002, pp. 1410–1415.
[5] A.J. Ijspeert, J. Nakanishi, S. Schaal, Learning control policies for movement imitation and movement recognition, Neural Inf. Process. Syst. 15 (2003) 1547–1554.
[6] H. Kadone, Y. Nakamura, Symbolic memory for humanoid robots using hierarchical bifurcations of attractors in nonmonotonic neural networks, in: Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005, pp. 2900–2905.
[7] M. Ito, K. Noda, Y. Hoshino, J. Tani, Dynamic and interactive generation of object handling behaviors by a small humanoid robot using a dynamic neural network model, Neural Netw. 19 (3) (2006) 323–337.
[8] T. Inamura, I. Toshima, H. Tanie, Y. Nakamura, Embodied symbol emergence based on mimesis theory, Int. J. Robot. Res. 23 (4) (2004) 363–377.
[9] T. Asfour, F. Gyarfas, P. Azad, R. Dillmann, Imitation learning of dual-arm manipulation tasks in humanoid robots, in: Proceedings of the IEEE-RAS International Conference on Humanoid Robots, 2006, pp. 40–47.
[10] A. Billard, S. Calinon, F. Guenter, Discriminative and adaptive imitation in uni-manual and bi-manual tasks, Robot. Auton. Syst. 54 (2006) 370–384.
[11] D. Kulic, W. Takano, Y. Nakamura, Incremental learning, clustering and hierarchy formation of whole body motion patterns using adaptive hidden Markov chains, Int. J. Robot. Res. 27 (7) (2008) 761–784.
[12] T. Petric, A. Gams, L. Zlajpah, A. Ude, J. Morimoto, in: Proceedings of the IEEE International Conference on Robotics and Automation.
[13] R. Wang, Y. Wu, W.L. Chan, K.P. Tee, in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems.
[14] A. Gams, T. Petric, M. Do, B. Nemec, J. Morimoto, T. Asfour, A. Ude, Adaptation and coaching of periodic motion primitives through physical and visual interaction, Robot. Auton. Syst. 75 (B) (2016) 340–351.
[15] W. Takano, Y. Nakamura, Symbolically structured database for human whole body motions based on association between motion symbols and motion words, Robot. Auton. Syst. 66 (2015) 75–85.
[16] W. Takano, Y. Nakamura, Integrating whole body motion primitives and natural language for humanoid robots, in: Proceedings of the IEEE-RAS International Conference on Humanoid Robots, 2008, pp. 708–713.
[17] W. Takano, Y. Nakamura, Statistical mutual conversion between whole body motion primitives and linguistic sentences for human motions, Int. J. Robot. Res. (2015) http://dx.doi.org/10.1177/0278364915587923.
[18] D. Herzog, V. Krueger, D. Grest, Parametric hidden Markov models for recognition and synthesis of movements, in: Proceedings of the British Machine Vision Conference, 2008, pp. 163–172.
[19] K. Sugiura, N. Iwahashi, Motion recognition and generation by combining reference-point-dependent probabilistic models, in: Proceedings of the 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2008, pp. 852–857.
[20] H. Kunori, D. Lee, Y. Nakamura, Associating and reshaping of whole body motions for object manipulation, in: Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009, pp. 5240–5247.
[21] S. Calinon, F. D'halluin, E.L. Sauser, D.G. Caldwell, A. Billard, Learning and reproduction of gestures by imitation, IEEE Robot. Autom. Mag. 17 (2) (2010) 44–54.
[22] S. Calinon, T. Alizadeh, D.G. Caldwell, On improving the extrapolation capability of task-parameterized movement models, in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013, pp. 610–617.
[23] L. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, in: Proceedings of the IEEE, vol. 77, 1989, pp. 257–286.
[24] T. Flash, N. Hogan, The coordination of arm movements: an experimentally confirmed mathematical model, J. Neurosci. 5 (1985) 1688–1703.
[25] H. Mayeda, K. Yoshida, K. Osuka, Base parameters of manipulator dynamic models, IEEE Trans. Robot. Autom. 6 (3) (1990) 312–321.
[26] K. Yoshida, D.N. Nenchev, M. Uchiyama, Moving base robotics and reaction management control, in: Proceedings of the 7th International Symposium on Robotics Research, 1995, pp. 101–108.

Wataru Takano is an Associate Professor at the Department of Mechano-Informatics, School of Information Science and Technology, University of Tokyo. He was born in Kyoto, Japan, in 1976. He received the B.S. and M.S. degrees from Kyoto University, Japan, in precision engineering in 1999 and 2001, and the Ph.D. degree from Mechano-Informatics, the University of Tokyo, Japan, in 2006. He was a Project Assistant Professor at the University of Tokyo from 2006 to 2007, and a Researcher on the Project of Information Environment and Humans, PRESTO, Japan Science and Technology Agency, from 2010. His field of research includes kinematics, dynamics, artificial intelligence of humanoid robots, and intelligent vehicles. He is a member of the IEEE, the Robotics Society of Japan, and the Information Processing Society of Japan. He has been the chair of the Technical Committee on Robot Learning, IEEE RAS.

Yoshihiko Nakamura is a Professor at the Department of Mechano-Informatics, School of Information Science and Technology, University of Tokyo. He was born in Osaka, Japan, in 1954. He received the B.S., M.S., and Ph.D. degrees from Kyoto University, Japan, in precision engineering in 1977, 1978, and 1985, respectively. He was an Assistant Professor at the Automation Research Laboratory, Kyoto University, from 1982 to 1987. He joined the Department of Mechanical and Environmental Engineering, University of California, Santa Barbara, in 1987 as an Assistant Professor, and became an Associate Professor in 1990. He was also a co-director of the Center for Robotic Systems and Manufacturing at UCSB. He moved to the University of Tokyo as an Associate Professor in the Department of Mechano-Informatics, University of Tokyo, Japan, in 1991. His fields of research include the kinematics, dynamics, control, and intelligence of robots, particularly robots with non-holonomic constraints, computational brain information processing, humanoid robots, human-figure kinetics, and surgical robots. He is a member of the IEEE, ASME, SICE, the Robotics Society of Japan, the Institute of Systems, Control, and Information Engineers, and the Japan Society of Computer Aided Surgery. He was honored with a fellowship from the Japan Society of Mechanical Engineers. Since 2005, he has been the president of the Japan IFToMM Congress. He is a foreign member of the Academy of Engineering in Serbia and Montenegro.