Optimization of the Robust Limit Cycle - A DIRECT COLLOCATION METHOD WITH STEP-TIME SAMPLING FO

3. A DIRECT COLLOCATION METHOD WITH STEP-TIME SAMPLING FOR

3.2 Optimization of the Robust Limit Cycle

Bipedal walking/running robots in general consist of both continuous (e.g. movement in the single support phase) and discrete (e.g. impact due to foot strike) behaviors, therefore can be modeled as hybrid systems. To find a limit cycle (the nominal periodic gait) of a given hybrid system, one common approach is to cast this problem as an open-loop optimal control problem and solve it using nonlinear optimization (i.e. trajectory optimization). Although this approach can generate walking gait for underactuated robots with complex dynamic behaviors, its open-loop nature make it challenging to impose cost or constraint to improve the robustness of the optimal solution, as the robustness is usually associated with the closed-loop controller design. In this section, the state of the art to find the robust limit cycle using trajectory optimization [32, 33] will be briefly introduced. 3.2.1 Problem Formulation for the Nominal Limit Cycle

A simple bipedal robot can be described as a hybrid systemH (assuming it has only one mode (e.g. walking phase) and one mode transition):

H =        ˙x = f (x, u), if φ(x, u) > hF x+ _{= ∆(x}−_), if φ(x, u) = hF

where x = [q, ˙q]T _{is the state of the system, f (·) is the state space equation of the robot.}

φ(·) is a guard function to trigger the mode transition (where the foot-strike happens), e.g., the function that determines the foot height where hF is the nominal step height. ∆(·) is a

function defined at the mode transition to map the pre-impact states x−to the post-impact states x+based on momentum conservation.

The method for finding a limit cycle is widely used and extended from the studies of passive dynamic walking [52, 53, 54], just to name a few. By running the simulation with

f (·) with the sequence of the states x(·) and control u(·), the optimization to find limit cycle can be formulated as:

min

x(·),u(·) J0(x, u)

s.t. x+ = ∆(x−) φ(x−) = hF

ceq(x, u) = 0, ciq(x, u) ≥ 0

where x(·) can be usually reduced to x0or a coefficient set α(·) of the trajectory to express

x(·). J0(x, u) is the objective function (e.g. uTu or cost of transport). An usual optimiza-

tion incorporated with simulation is also referred to as direct single shooting method. The major downsides of this method is that when the optimization complexity increases, it is often hard to converge, or can be sensitive to the initial condition. To make the nonlinear program more well-posed, direct collocation method discretizes the states and control into a set of free variables. Instead of running a simulation, the equations of the state continuity and the system dynamics are added into the equality constraints ceq(x). With

the discretization, a large but sparse nonlinear program can be formulated and solved effi- ciently.

3.2.2 Robustness of the Limit Cycle under Step Height Variation

Even though the robustness of the bipedal locomotion on uneven terrains could be challenging to quantify without considering the controller, there is an important insight we can get by checking the nominal limit cycle and its behavior under step height uncertainties [32]. The schematic of a nominal limit cycle and the mode transition of foot-strike is shown in Fig. 3.1(a). When a robot is walking on an uneven terrain, the pre-impact states will deviated from the x− (the circle markers in Fig. 3.1(b) and (c)). Intuitively, if the

(a) Nominal limit cycle

(b) Bad limit cycle _{(c) Good limit cycle}

Figure 3.1: The schematic of a nominal limit cycle, and the examples of ‘Bad’ and ‘Good’ limit cycles in terms of how far the post-impact states (due to step height uncertainties) deviated from the nominal limit cycle. The shaded region indicates the region of attraction of the trajectory.

post-impact states for all possible terrain heights are farther away from the nominal limit cycle (Fig. 3.1(b)), it is more likely that the robot will fall down because more post-impact states are out of the region of attraction (provided by the controller and trajectory). On the other hand, if all the post-impact states are closer to the limit cycle (Fig. 3.1(c)), then the system can be stabilized by the controller more easily. Therefore, this intuition is used to design the robust cost function for the robust trajectory optimization.

3.2.3 Robust Cost Function via Step Height Sampling

Assuming the profile of an uneven terrain is known and H = {hj|1 ≤ j ≤ K} is

a set of sampled heights with the fixed sampling increment dh, then the cost function which seeks the balance between the original objective J0(x, u) and the robust cost can be

generally expressed as: J0(x, u) + ω K X j=1 Jj(x−j, uj) s.t. φ(x−_j ) = hj, hj ∈ H

where Ji(x−j , uj) is the robust cost to measure the trajectory difference between the nom-

inal limit cycle and the perturbed trajectory with step height hj. By combining the opti-

mization with time-varying LQR, Dai [32] proposed the following robust cost:

Jj = (x+j − x +

)TS(tj)(x+j − x +

), S = ST, S > 0 (3.1)

where Ji indicates the cost-to-go for the given initial condition (x+j − x0) to the infinite

horizon with the time-varying LQR control (S(·) can be derived by integrating the periodic Riccati equation backward for the step-time tj), x+j is the post-impact state for the step

height hj. Although this method provides a nice robust cost associated with the LQR

controller design and it is using direct collocation for the nominal trajectory, the imposed constraints for the periodic Riccati equation, and the different sampled step heights with different time steps largely complicate the optimization. On the other hand, Griffin [33] proposed another robust cost function as:

Jj = 1 tj Z tj t=0 (||δxj(τ )||2+ ||δuj(τ )||2) τ tj dτ

where the square of the Euclidean distance between the normalized trajectory with the step height hj and the nominal limit cycle is measured. This proposed method was validated

with the experiment results using the planar bipedal robot MARLO. However, since it is based on the direct shooting method, it could be much more difficult for it to handle

more complicated systems. These potential issues motivate us to propose the robust cost function using step-time sampling which takes advantage of the structure provided by the direct collation framework.

In document Bipedal Walking Analysis, Control, and Applications Towards Human-Like Behavior (Page 56-60)