Robot Path Planning:

An Object-Oriented Approach

Morten Strandberg

TRITA–S3–REG–0401
ISSN 1404–2150
ISBN 91-7283-868-X

Automatic Control
Department of Signals, Sensors and Systems
Royal Institute of Technology (KTH)
Stockholm, Sweden, 2004

Submitted to the School of Electrical Engineering, Royal Institute of
Technology, in partial fulfillment of the requirements for the degree of
Doctor of Philosophy.

Copyright © 2004 by Morten Strandberg

Robot Path Planning: An Object-Oriented Approach

Automatic Control
Department of Signals, Sensors and Systems
Royal Institute of Technology (KTH)
SE-100 44 Stockholm, Sweden

Tel. +46 8 790 6000

Fax. +46 8 790 7324
http://www.s3.kth.se

Abstract
Path planning has important applications in many areas, for exam-
ple industrial robotics, autonomous systems, virtual prototyping, and
computer-aided drug design. This thesis presents a new framework for
developing and evaluating path planning algorithms. The framework is
named CoPP (Components for Path Planning). It consists of loosely cou-
pled and reusable components that are useful for building path planning
applications. The framework is especially designed to make it easy to do
fair comparisons between different path planning algorithms.
CoPP is also designed to allow almost any user-defined moving sys-
tem. The default type of moving system is a robot class, which is capable
of describing tree-like kinematic chains. Additional features of this robot
class are: joint couplings, numerical or closed-form inverse kinematics,
and hierarchical robot representations. The last feature is useful when
planning for complex systems like a mobile platform equipped with an
arm and a hand.
During the last six years, Rapidly-exploring Random Trees (RRTs)
have become a popular framework for developing randomized path plan-
ning algorithms. This thesis presents a method for augmenting bidirec-
tional RRT-planners with local trees. For problems where the solution
trajectory has to pass through several narrow passages, local trees help
to reduce the required planning time.
To reduce the work needed for programming of industrial robots, it
is desirable to allow task specifications at a very high level, leaving it up
to the robot system to figure out what to do. Here we present a fast and
flexible pick-and-place planner. Given an object that has to be moved to
another position, the planner chooses a suitable grasp of the object and
finds motions that bring the object to the desired position. The planner
can also handle constraints on, e.g., the orientation of the manipulated
object.
For planning of pick-and-place tasks it is necessary to choose a grasp
suitable to the task. Unless the grasp is given, some sort of grasp planning
has to be performed. This thesis presents a fast grasp planner for a three-
fingered robot hand. The grasp planner could be used in an industrial
setting, where a robot is to pick up irregularly shaped objects from a
conveyor belt. In conjunction with grasp planning, a new method for
evaluating grasp stability is presented.

Acknowledgments
First I would like to thank my advisors, Professor Bo Wahlberg and Pro-
fessor Henrik Christensen, for giving me the opportunity to be a graduate
student at the Centre for Autonomous Systems. As my toughest critics,
they have also been of invaluable help during the writing of this thesis.
I am especially indebted to Frank Lingelbach for many fruitful
discussions on path planning, for proofreading, and for being a brave CoPP
test pilot.
I am thankful to the following persons for suggestions and correc-
tions to the manuscript: Fredrik Almqvist, Fredrik Niemelä, and Paul
Sundvall.
Finally, I thank my wife Firozeh and our children Nadja, Jonatan and
Peter for their love and support during these years; I promise that you
will see more of me from now on!
This research has been sponsored by the Swedish Foundation for
Strategic Research through the Centre for Autonomous Systems at KTH.
The support is gratefully acknowledged.
Contents

1 Introduction
1.1 Path Planning Applications
1.2 Notation and Terminology
1.3 Probabilistic Roadmap Methods
1.4 Outline and Contributions of the Thesis
1.5 Publications

2 A Framework for Path Planning
2.1 Related Work
2.2 Framework Overview
2.3 Dealing with Geometry
2.3.1 Requirements on Geometric Types
2.3.2 Moving Objects Around
2.3.3 Concrete Geometric Types
2.4 Collision Detection
2.4.1 Convex Polyhedra and Collision Detection
2.4.2 Proximity Queries on Pairs of Objects
2.4.3 Classes for Dealing with Sets of Objects
2.5 Configuration Space Interpolation
2.5.1 Interpolation of Revolute Joints
2.5.2 Interpolation of Rigid Body Orientations
2.5.3 Car-Like Robots
2.5.4 Interpolation Objects
2.6 Metrics
2.6.1 Useful Metrics for Path Planning
2.6.2 Implemented Classes for Metrics
2.7 Configuration Space Sampling
2.7.1 Narrow Passage Sampling
2.7.2 Constraint Based Sampling
2.7.3 Uniformly Distributed Rotations
2.7.4 Deterministic Sampling
2.7.5 Sampling Strategy Classes
2.8 Local Planners
2.8.1 Checking Path Segments
2.8.2 Local Planner Interface
2.8.3 Example of a Flexible Local Planner
2.9 Chapter Summary

3 A General Robot Model for Path Planning
3.1 Notation for Articulated Structures
3.1.1 Denavit-Hartenberg Notation
3.1.2 Kleinfinger-Khalil Notation
3.2 Modeling Joint Couplings
3.3 Inverse Kinematics
3.3.1 The Jacobian
3.3.2 A Numerical Solver Based on the Pseudo-Inverse
3.4 Robot Composition
3.5 Design and Implementation
3.5.1 Self-Collision Table
3.5.2 Multiple Inverse Kinematics Solvers
3.6 Chapter Summary

4 Augmenting RRT-Planners with Local Trees
4.1 The RRT-ConCon Algorithm
4.2 Local Trees
4.2.1 The RRT-LocTrees Algorithm
4.3 Experiments
4.4 Chapter Summary

5 Planning of Pick-and-Place Tasks
5.1 Related Work
5.2 Overview of the Planner
5.3 Grasp Generators
5.3.1 Algorithms Suitable as Grasp Generators
5.3.2 The Grasp Generator Interface
5.3.3 Feedback to Grasp Generator Clients
5.4 Retract Planners
5.5 Planning of Arm and Hand Motions
5.5.1 Multiple Goals
5.5.2 RRT-Planners and Retract Trees
5.5.3 Backtracking
5.5.4 Efficient use of Robot Composition
5.6 Task Constraints
5.7 Path Smoothing
5.7.1 Path Optimization
5.7.2 Spline Paths
5.8 Examples
5.9 Chapter Summary

6 Grasp Planning for a Three-Fingered Hand
6.1 Related Work
6.2 Problem Description
6.3 The Grasp Planner
6.3.1 An Overview
6.3.2 Global Contour Characteristics
6.3.3 Choosing Good Thumb Positions
6.3.4 Use of Precomputed Trajectories
6.3.5 Grasp Quality Evaluation
6.4 Examples of Planned Grasps
6.5 Chapter Summary

7 Grasp Stability Evaluation
7.1 Grasp Analysis Introduction
7.2 Related Work
7.3 Grasp Evaluation Procedure
7.3.1 An Algorithm for Solving the Min-Max Problem
7.3.2 Improvement from Sorting the Hyperplanes
7.3.3 Disturbance Forces and Object Vertices
7.4 Examples
7.5 Chapter Summary

8 Summary and Suggestions for Future Work

A Class Diagram Notation

B Rigid Body Transformations
B.1 The Homogeneous Transformation Matrix
B.1.1 A Class for Homogeneous Transformations
B.2 Representing Rotations
B.2.1 Rotation Matrices
B.2.2 Euler Angles
B.2.3 Quaternions
B.3 Summary

C Geometric Representations
C.1 Indexed Face Sets
C.2 Triangle Sets
C.3 Convex Polyhedra
C.4 Non-Convex Objects and Groups of Convex Objects
C.4.1 Three Possible Designs
C.4.2 An Example of a Hierarchical Geometric Object

D Pairwise Collision Detection with PQP and GJK
D.1 Encapsulating PQP
D.2 Encapsulating Enhanced GJK
D.3 Mixing Different Algorithms

E Framework Details
E.1 Configuration Space Path
E.2 Binary Constraints
E.3 Problem Class
E.4 Visualization
E.5 Robot Description Files

F The Boost Libraries
F.1 Boost Graph Library
F.2 Spirit

References
Chapter 1

Introduction

In its most basic form, robot path planning is about finding a collision-
free motion from one position to another. Efficient algorithms for solving
problems of this type have important applications in areas such as: indus-
trial robotics, computer animation, drug design, and automated surveil-
lance. It is therefore not surprising that the research activity in this field
has been steadily increasing over the last two decades.
In the first part of this thesis, we study how object-oriented design
methods can be of use in the context of path planning. The result is an
object-oriented framework that makes it easy to develop and compare
path planning algorithms. Using this framework, two new algorithms
have been developed: one for the basic form of the path planning problem,
and one aimed at pick-and-place tasks.
The last chapters of the thesis are about grasp planning, which is a
related but slightly different type of planning: whereas path planning is
about finding motions, grasp planning is about finding a single
configuration, i.e., the grasp, that satisfies certain specifications.
The next section gives examples of application areas for path planning
algorithms. Section 1.2 introduces the terminology and notation that
is used in the path planning community. Section 1.3 presents the type
of randomized path planning algorithm that is most common today,
namely probabilistic roadmap methods (PRMs). The framework to be
presented in the thesis is not constrained to a particular planning method,
but will focus on PRM-like methods. Hence, Section 1.3 is useful for
understanding the motivation behind the framework.
The final section gives an overview of the thesis and its contributions.

1.1 Path Planning Applications

Apart from the obvious application areas in industrial robotics and au-
tonomous systems in general, there are also other useful application areas
for path planning. In this section we list some interesting examples to-
gether with relevant references.

Industrial and Service Robotics. In industrial applications, the
robot motion has to be carefully programmed for each new task. As
this programming can be both laborious and time consuming, there is
much to gain if this process could be made semi-autonomous. That is, a
path planning algorithm could be used to give suggestions on collision-
free motions. The robot programmer could then choose to accept or to
modify the generated motions, before making them part of the robot
program.
Unlike industrial robots, service robots have to operate in unpre-
dictable and unstructured environments. Such robots are constantly
faced with new situations for which there are no preprogrammed motions.
Thus, these robots have to plan their own motions. Path planning
for service robots is much more difficult, for several reasons. First,
the planning has to be sensor-based, implying incomplete and inaccurate
world models. Second, real-time constraints mean limited resources for
planning. Third, due to incomplete models of the environment, planning
could involve secondary objectives, with the goal to reduce the uncer-
tainty about the environment.
Navigation for mobile robots is closely related to sensor-based path
planning in 2D, and can be considered as a mature area of research [21,
30]. Sensor-based planning for manipulators, on the other hand, is still a
very open research problem. One of the first systems capable of sensor-
based planning for manipulation was the Handey system [93, 62]. Based
on laser range data, this system could plan pick-and-place tasks for a ma-
nipulator equipped with a parallel-jaw gripper. For two recent examples,
see Ahuactzin and Portilla [2], and Suppa et al. [146].

Computer Animation. There is a growing integration of computer
animation with artificial intelligence to create virtual actors [40, 71].
Autonomous or semi-autonomous virtual actors lead to higher productivity
as animators can now use high-level commands instead of specifying long
sequences of key-frames. Furthermore, with physics-based simulation, a
higher degree of realism can be achieved [40]. In his thesis, Kuffner [71]
used path planning techniques to create autonomous characters for
real-time animation. Pettré et al. [118] were able to plan walking motions
for a virtual character with 57 degrees of freedom. The type of motion
used could be determined from motion capture data, allowing for realistic
animations. The techniques used in [71] and [118] can also be used in
computer games to create more convincing and more skilled computer-
controlled characters.

Virtual Prototyping. Virtual prototyping involves modeling and simulation
of all important aspects of a prototype, e.g., mechanical design,
kinematics, and dynamics, accompanied by realistic visualization. For
a mechanical assembly like an engine, one important aspect of the final
product is maintainability: How easy is it to reach and replace a cer-
tain part in the assembly? Without a physical prototype, such questions
can be difficult to answer, and path planning algorithms could thus be
a useful tool. For efficient manufacturing, we are interested in assem-
bly strategies that are as efficient as possible. Guibas et al. [51] and
Sundaram et al. [145] studied algorithms for the automatic generation of
efficient assembly/disassembly strategies. Another related application is
the problem of automated carton folding, see Lu and Akella [94].

Surveillance Planning. For automated inspection or surveillance of
indoor environments it is important to find a short path that covers the
entire environment. This problem, also known as the watchman route
problem, was considered by Danner and Kavraki [37]. Their approach
is general in that different visibility constraints can be imposed on the
sensor model and it is also applicable to three-dimensional regions. In a
variation of the watchman route problem, there are one or more intruders
present that have to be detected. This problem is harder because intruders
are assumed to move arbitrarily fast and can play “hide-and-seek”
by sneaking back into areas already covered by the surveillance agent.
A solution, if it exists, will cover the environment and detect all intrud-
ers, independent of their number and maximum speed. LaValle et al. [83]
dealt with this problem in the case of one or more robots with omnidirec-
tional vision and a polygonal environment. LaValle and Hinrichsen [80]
later extended this work to include curved environments. In the case of
a beam-like sensor and a polygonal environment, Simov et al. [138]
presented a complete $O(n^3)$ algorithm, where $n$ is the number of vertices
in the environment.

Computational Biology. A somewhat unexpected application area for
path planning algorithms is within the fields of computational biology
and chemistry. In these fields path planning algorithms have been used
to study flexible ligand docking [14], and protein folding pathways [7].
Ligand docking is important in the area of computer-aided drug de-
sign, where the goal is to find a small molecule (the ligand) that is able to
dock with a target protein (the receptor). A potential docking configura-
tion must not only correspond to a configuration of low potential energy;
it must also be accessible to the ligand from an outside location. Bayazit
et al. [14] used path planning algorithms to find docking configurations
for ligand-protein pairs. The found docking configurations were close to
the real ones.
Protein molecules are long chains of amino acids. These molecule
chains will, under normal circumstances, fold themselves into a close-
packed, low-energy configuration. This folding process is important to
understand for several reasons: The protein’s function is determined by
its three-dimensional structure, and disturbed protein folding is related
to diseases, such as cystic fibrosis and Alzheimer’s disease [75]. The
folding process is hard to capture experimentally because it happens so
quickly [7], and therefore simulation is a necessary tool. Amato and
Song [7] studied protein folding as a path planning problem toward a
low-energy configuration. Because protein molecules can have several
thousands of degrees of freedom, these problems are very hard to solve
even for small proteins.

1.2 Notation and Terminology

Here we give a brief introduction to common path-planning terminology.
A path planning problem involves one or more robots¹ moving in
$\mathcal{W} = \mathbb{R}^3$. (For a two-dimensional problem we have $\mathcal{W} = \mathbb{R}^2$.) A robot is
denoted by $\mathcal{A}$, and a robot’s configuration is usually denoted by $q$.
If the robot is a kinematic chain, then the configuration $q$ is typically a
vector containing the joint angles. If the robot is a free-flying rigid body,
then the configuration consists of a translational part and a rotational
part. Depending on how we represent rotations, the rotational part could
be three Euler angles or it could be a quaternion (see Appendix B). The
closed subset of the world, $\mathcal{W}$, that is occupied by the robot in configuration
$q$ is denoted by $\mathcal{A}(q)$. Thus, $\mathcal{A}(q) \subset \mathcal{W}$.

¹ Note that we use the word robot in a very broad sense here: It can be used to
mean a kinematic chain, a rigid body, or even an aircraft.

Obstacle Region. The obstacle region, $\mathcal{O}$, is defined as the set of all
points in $\mathcal{W}$ that belong to one or more obstacles. If the obstacles are
represented by closed subsets of $\mathcal{W}$, then the obstacle region is also a
closed subset, $\mathcal{O} \subset \mathcal{W}$.

Configuration Space. The space of all possible robot configurations is
denoted by $\mathcal{C}$, and is called the configuration space of the robot. The
configuration space concept is very important in path planning. For a robot
arm with $n$ joints, where each joint has a limited range, the configuration
space is simply $\mathbb{R}^n$. Other problems can have a configuration space
with a completely different topology: Consider a two-dimensional world
with a rigid body that can translate and rotate. Due to the unbounded
rotational degree of freedom, the configuration space for this problem is
$\mathbb{R}^2 \times S^1$, where $S^1$ denotes the unit circle.

Configuration Space Obstacles. The set of all robot configurations
for which the robot intersects an obstacle is denoted by $\mathcal{C}_{obs}$. This set is
defined as:

$$\mathcal{C}_{obs} = \{\, q \in \mathcal{C} \mid \mathcal{A}(q) \cap \mathcal{O} \neq \emptyset \,\} \tag{1.1}$$

Equation (1.1) can be seen as a mapping of the obstacles in $\mathcal{W}$ to obstacles
in the configuration space $\mathcal{C}$. The resulting regions are often called
configuration space obstacles, or C-space obstacles for short. Note that
for robots consisting of several moving objects, the definition of $\mathcal{C}_{obs}$ in
Equation (1.1) has to be extended to include self-collisions.
The remaining portion of the configuration space is called the free
space. The free space is defined and denoted as $\mathcal{C}_{free} = \mathcal{C} \setminus \mathcal{C}_{obs}$.
Since the seminal work of Lozano-Pérez [92], most planning algo-
rithms work in the configuration space. Transforming the problem from
W to the C-space using Equation (1.1) is a useful abstraction: Path plan-
ning problems that look very different in W still look similar in a C-space
formulation, allowing the same type of algorithm to solve widely different
problems. Note, however, that the transformation in Equation (1.1)
is carried out explicitly only for very simple problems of low dimension.
Most algorithms instead work by probing $\mathcal{C}$ to find out whether a
configuration belongs to $\mathcal{C}_{free}$.
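
As a minimal sketch of such probing, assume a disc-shaped robot that can only translate in the plane, so that a configuration is q = (x, y), and assume circular obstacles. The names `Disc`, `in_c_obs`, and `in_c_free` are hypothetical and chosen for illustration only; they are not taken from CoPP. Because all sets are closed, touching counts as an intersection.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Disc:
    """A closed disc in the world W = R^2 (hypothetical obstacle shape)."""
    x: float
    y: float
    radius: float


def in_c_obs(q, robot_radius, obstacles):
    """Return True if A(q) intersects the obstacle region O, as in Equation (1.1).

    q is the configuration (x, y); A(q) is a closed disc of radius
    robot_radius centred at q.
    """
    qx, qy = q
    for obs in obstacles:
        # Two closed discs intersect when the distance between their
        # centres does not exceed the sum of their radii.
        if math.hypot(qx - obs.x, qy - obs.y) <= robot_radius + obs.radius:
            return True
    return False


def in_c_free(q, robot_radius, obstacles):
    """C_free is the complement of C_obs in C."""
    return not in_c_obs(q, robot_radius, obstacles)


obstacles = [Disc(0.0, 0.0, 1.0), Disc(4.0, 0.0, 0.5)]
print(in_c_free((2.0, 0.0), 0.4, obstacles))  # True
print(in_c_free((1.2, 0.0), 0.4, obstacles))  # False
```

Note that the sketch never builds the set of C-space obstacles; it only answers membership queries for individual configurations, which is the kind of probing described above.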

Piano Mover’s Problem. The goal of a path planning problem is thus
to find a path in $\mathcal{C}_{free}$ that connects an initial configuration, $q_i$, with a
goal configuration, $q_g$. This particular type of problem is often referred
to as the Piano Mover’s Problem in the literature. In the formulation of
the Piano Mover’s Problem there is no concept of time; it is enough to
find a sequence of translations and rotations that takes the robot from
the initial configuration to the goal configuration, and the velocities and
accelerations are of no concern. If we want to solve the problem, and at
the same time respect the dynamic constraints of the particular system,
then we have an instance of a kinodynamic motion planning problem. In
this thesis we only briefly touch upon kinodynamic planning. For work
on kinodynamic motion planning, see, e.g., LaValle and Kuffner [81] and
Lamiraux et al. [74].
For a more formal definition of the concepts in this section, see the
classic textbook by Latombe [77]. A more recent alternative is the book
by LaValle [79].

1.3 Probabilistic Roadmap Methods

A path planning algorithm that is guaranteed to find a solution if one
exists, and report failure if there is no solution, is said to be complete.
Reif [125] showed that the Piano Mover’s Problem is NP-hard. This
result implies that complete algorithms are only practical for problems
with a low-dimensional configuration space. The complexity of complete
path planning methods led researchers to seek heuristic methods with
weaker notions of completeness, such as probabilistic completeness [64].
An algorithm that is probabilistically complete will find a solution to a
solvable problem with a probability that approaches one as the running
time increases [147].
In a series of groundbreaking papers, Kavraki and Latombe [63],
Švestka and Overmars [147], and Kavraki et al. [64] laid the ground
for probabilistic roadmap methods (PRMs). PRM methods work in two
phases: a learning phase, and a query phase. In the learning phase (also
called construction phase or preprocessing phase in the literature), the
configuration space is sampled for collision-free configurations. These
configurations form the vertices in a graph called a roadmap. A simple
local planner is used to look for connections between nearby vertices. If
a connection is found, an edge is added to the graph, connecting the
corresponding vertices. The idea is that the roadmap will eventually give a
sufficient representation of $\mathcal{C}_{free}$.
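
The learning phase can be sketched as follows. This is an illustration under simplifying assumptions, not the implementation from any of the cited papers: the robot is a point in the unit square, obstacles are discs, the local planner checks evenly spaced points on a straight segment, and each vertex is tried against its k nearest neighbours. All function names and parameters (`is_free`, `local_planner`, `build_roadmap`, `step`, `k`) are hypothetical.

```python
import math
import random


def is_free(q, obstacles):
    """Point robot: q is collision-free if it lies outside every disc obstacle."""
    return all(math.dist(q, (cx, cy)) > r for cx, cy, r in obstacles)


def local_planner(a, b, obstacles, step=0.05):
    """Straight-line local planner: test evenly spaced configurations from a to b."""
    n = max(1, math.ceil(math.dist(a, b) / step))
    return all(
        is_free((a[0] + (b[0] - a[0]) * i / n, a[1] + (b[1] - a[1]) * i / n), obstacles)
        for i in range(n + 1)
    )


def build_roadmap(n_vertices, k, obstacles, rng):
    """Learning phase: sample free configurations, then connect nearby vertices."""
    vertices = []
    while len(vertices) < n_vertices:
        q = (rng.random(), rng.random())   # uniform sample in the unit square
        if is_free(q, obstacles):          # keep only collision-free samples
            vertices.append(q)
    edges = {i: set() for i in range(n_vertices)}
    for i, q in enumerate(vertices):
        # Candidate neighbours: the k other vertices closest to q.
        others = (j for j in range(n_vertices) if j != i)
        nearest = sorted(others, key=lambda j: math.dist(q, vertices[j]))[:k]
        for j in nearest:
            if local_planner(q, vertices[j], obstacles):
                edges[i].add(j)            # undirected edge, stored both ways
                edges[j].add(i)
    return vertices, edges


obstacles = [(0.5, 0.5, 0.2)]              # one disc: (centre x, centre y, radius)
vertices, edges = build_roadmap(50, 5, obstacles, random.Random(0))
print(len(vertices), "vertices,", sum(len(n) for n in edges.values()) // 2, "edges")
```

The fixed step size means that very thin obstacles could be missed between two checked points; the sketch accepts this for brevity.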

Figure 1.1: (a) An example of a roadmap for a two-dimensional
configuration space. (b) An example of a roadmap query. The resulting path
is shown by the thicker lines.

In the query phase, the initial configuration, $q_i$, and the goal
configuration, $q_g$, are connected to the graph using the same local planner
that was used for the learning phase. If this connection succeeds, then
the path planning problem has been reduced to a graph search problem,
which can be answered in fractions of a second. Figure 1.1 (a) shows
an example of a roadmap for a two-dimensional configuration space. An
example of a query to the same roadmap is shown in Figure 1.1 (b).
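
A query can be sketched as below, assuming the roadmap is given as a list of vertices and an adjacency dictionary. The function names, the rule of attaching each query configuration to its nearest connectable vertex, and the toy local planner `near` are all assumptions made for this illustration.

```python
import heapq
import math


def dijkstra(vertices, edges, source, target):
    """Shortest path between two vertex indices; edge cost is Euclidean length."""
    dist = {source: 0.0}
    parent = {}
    queue = [(0.0, source)]
    done = set()
    while queue:
        d, u = heapq.heappop(queue)
        if u in done:
            continue
        done.add(u)
        if u == target:
            path = [u]
            while path[-1] != source:
                path.append(parent[path[-1]])
            return path[::-1]
        for v in edges[u]:
            nd = d + math.dist(vertices[u], vertices[v])
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(queue, (nd, v))
    return None


def query(vertices, edges, q_init, q_goal, can_connect):
    """Query phase: attach q_init and q_goal to the roadmap, then search the graph."""
    n = len(vertices)
    vertices = list(vertices) + [q_init, q_goal]         # new indices n and n + 1
    edges = {u: set(nbrs) for u, nbrs in edges.items()}  # do not modify the input
    for new in (n, n + 1):
        q = vertices[new]
        by_distance = sorted(range(n), key=lambda u: math.dist(q, vertices[u]))
        link = next((u for u in by_distance if can_connect(q, vertices[u])), None)
        if link is None:
            return None                                  # could not reach the roadmap
        edges[new] = {link}
        edges[link].add(new)
    path = dijkstra(vertices, edges, n, n + 1)
    return None if path is None else [vertices[u] for u in path]


# A tiny hand-made roadmap: the four corners of the unit square.
vertices = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
edges = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}


def near(a, b):
    """Stand-in local planner: accept only short connections."""
    return math.dist(a, b) < 0.5


print(query(vertices, edges, (0.1, -0.1), (0.9, 1.2), near))
```

Once both configurations are attached, the remaining work is an ordinary shortest-path search, which is why a query is cheap compared with the learning phase.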
PRM methods are particularly useful if repeated queries are expected
for the same environment; then the cost for constructing the roadmap
is amortized over each query. Thus, given a static environment, the
overall efficiency of a PRM-planner usually increases with the number of
queries. Due to this behavior, there is often a distinction between single
shot planners and multiple query planners. PRM-planners have for a
long time been thought of as multiple query planners due to the costly
learning phase, but recent contributions by Bohlin and Kavraki [17, 18]
have changed that; they showed that the costly operation of verifying
whether path segments are collision free could be postponed until the query
phase. Thus, the constructed roadmap contains many infeasible edges,
but these are detected and deleted during the query phase. With this
Lazy PRM approach, the number of required collision detections was
significantly reduced, making the approach competitive as a single shot
planner as well.
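
The lazy idea can be illustrated with the sketch below. It is a simplified reading of the description above, not the algorithm as published in [17, 18]: the roadmap is a weighted adjacency dictionary whose edges have not been collision checked, and `edge_is_free` is a hypothetical callback that performs the expensive check.

```python
import heapq


def shortest_path(edges, source, target):
    """Dijkstra on a weighted graph given as {u: {v: cost}}."""
    best = {source: 0.0}
    parent = {}
    queue = [(0.0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == target:
            path = [u]
            while path[-1] != source:
                path.append(parent[path[-1]])
            return path[::-1]
        if d > best[u]:
            continue                        # stale queue entry
        for v, cost in edges[u].items():
            if d + cost < best.get(v, float("inf")):
                best[v] = d + cost
                parent[v] = u
                heapq.heappush(queue, (d + cost, v))
    return None


def lazy_query(edges, source, target, edge_is_free):
    """Search first, collision check afterwards, delete infeasible edges, repeat."""
    edges = {u: dict(nbrs) for u, nbrs in edges.items()}  # work on a copy
    verified = set()                                      # edges known to be free
    checks = 0
    while True:
        path = shortest_path(edges, source, target)
        if path is None:
            return None, checks
        feasible = True
        for u, v in zip(path, path[1:]):
            key = frozenset((u, v))
            if key in verified:
                continue
            checks += 1
            if edge_is_free(u, v):
                verified.add(key)
            else:
                del edges[u][v]             # remove the infeasible edge
                del edges[v][u]
                feasible = False
                break
        if feasible:
            return path, checks


# Five unchecked edges; the edge between vertices 1 and 3 is in collision.
roadmap = {
    0: {1: 1.0, 2: 2.0},
    1: {0: 1.0, 3: 1.0, 2: 5.0},
    2: {0: 2.0, 3: 2.0, 1: 5.0},
    3: {1: 1.0, 2: 2.0},
}
blocked = {frozenset((1, 3))}
print(lazy_query(roadmap, 0, 3, lambda u, v: frozenset((u, v)) not in blocked))
# ([0, 2, 3], 4)
```

In this small example, four of the five edges are checked; the edge between vertices 1 and 2 never lies on a candidate path and is therefore never tested.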

Variations. Since the original work in [63, 147, 64], numerous variations,
improvements, and extensions have been suggested. Already in [64]
it was pointed out that using uniform sampling of the C-space would not
be efficient in case the problem involves a narrow passage. To get through
the passage, the roadmap must have at least one vertex in that area.
However, as the critical region usually covers an extremely small portion
of the configuration space, the chance of getting a configuration there at
random is very small. Therefore many researchers have suggested sam-
pling schemes that bias the distribution towards the narrow passages of
the configuration space.
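
To put a number on this, assume for illustration that the critical region occupies a fraction $p$ of the sampled space and that the $N$ samples are drawn independently and uniformly (the symbols $p$ and $N$ are introduced here and are not notation from the thesis). Then:

```latex
P(\text{at least one of the } N \text{ samples lies in the region}) = 1 - (1 - p)^N ,
\qquad
\mathbb{E}[\text{number of samples until the first hit}] = \frac{1}{p} .
```

With $p = 10^{-6}$, for example, 1000 uniform samples reach the region with a probability of only about 0.1%, and on average a million samples are needed before the first one lands there.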
Other variations explore different heuristics for determining between
which vertices connection attempts should be made. These heuristics
often involve the choice of a metric, according to which the distance
between two configurations is determined. Throughout the thesis we will
use the words vertices and nodes interchangeably.
In Chapter 2 we will describe many of these PRM methods in more
detail. In that chapter we will also present an object-oriented framework
that offers a systematic way to deal with all these variations.
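
As a small illustration of how the choice of metric affects which vertices are considered near, the sketch below uses a weighted distance for a planar rigid body with configuration (x, y, theta). The weighting scheme and the names `angle_difference`, `se2_distance`, and `k_nearest` are assumptions made for this example; the metrics actually provided by the framework are described in Chapter 2.

```python
import math


def angle_difference(a, b):
    """Difference a - b between two angles, wrapped so its magnitude is at most pi."""
    return (a - b + math.pi) % (2.0 * math.pi) - math.pi


def se2_distance(q1, q2, rotation_weight=1.0):
    """Weighted distance between two configurations (x, y, theta)."""
    dx = q1[0] - q2[0]
    dy = q1[1] - q2[1]
    dtheta = angle_difference(q1[2], q2[2])
    return math.sqrt(dx * dx + dy * dy + (rotation_weight * dtheta) ** 2)


def k_nearest(q, vertices, k, metric):
    """The k vertices closest to q according to the given metric."""
    return sorted(vertices, key=lambda v: metric(q, v))[:k]


# The rotational part wraps around: these two orientations are 0.2 rad apart.
print(se2_distance((0.0, 0.0, 0.1), (0.0, 0.0, 2.0 * math.pi - 0.1)))

# The nearest vertex depends on how much weight rotation is given.
vertices = [(1.0, 0.0, 0.0), (0.1, 0.0, 3.0), (0.2, 0.0, 0.1)]
q = (0.0, 0.0, 0.0)
print(k_nearest(q, vertices, 1, lambda a, b: se2_distance(a, b, 1.0)))
print(k_nearest(q, vertices, 1, lambda a, b: se2_distance(a, b, 0.0)))
```

With the rotation weight set to one, the vertex with a similar orientation is preferred; with the weight set to zero, only the translational distance matters and a different vertex is selected for the connection attempt.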

1.4 Outline and Contributions of the Thesis

Chapter 2
Chapter 2 describes an object-oriented framework for developing and
comparing path planning algorithms. The framework consists of reusable
components that accelerate the development of path planning algorithms.
There exist many other path planning frameworks whose purpose is to
ease and speed up the development of new algorithms, but the framework
in this thesis is specifically designed to make it easy to compare different
algorithms. This feature is an important contribution as it addresses
the lack of fair comparisons between the many similar path planning
algorithms that exist. The following list summarizes the key features of
the framework:
• The framework makes it easy to do fair comparisons between
variations of a concept, such as different collision detection algorithms.
• There are few restrictions on the type of planning algorithms that
can be implemented with the framework.
• The framework is platform independent.
• Adaptive planners that change between different algorithms
depending on the problem are easily implemented.

Chapter 2 can also be read as an introduction to several important topics
such as metrics, sampling, and collision detection.

Chapter 3
Chapter 3 describes a general class for representing robots. The class
can describe robots with tree-like kinematic chains and joint couplings.
The possibility to add joint couplings makes it easy to model robots
with coupled degrees of freedom, such as the Barrett hand.² Joint
couplings also allow us to model certain types of robots with closed
kinematic chains; an important feature as many industrial robots have closed
kinematic chains to increase the stiffness of the structure.
The robot class also provides features like general and specific inverse
kinematics and robot composition. These two features have proven
to be useful in high-level path planning applications like pick-and-place
planning, see Chapter 5.

Chapter 4
Rapidly-exploring Random Trees (RRTs) [78] have become a popular
framework for solving path planning problems. In Chapter 4 we address
some of the weaknesses of bidirectional RRT-planners and present an im-
proved algorithm. The improved algorithm uses local trees, in addition to
the ones rooted at the start and goal configuration. Several experiments
show that the new algorithm performs much better than the original one
in case the solution has to pass several difficult regions.

Chapter 5
In Chapter 5 we present a pick-and-place planner that builds upon the
work by Kuffner [71]. The proposed planner is fast; it can solve everyday
pick-and-place tasks in a few seconds. Some improvements over the
planner in [71] allow the new planner to handle difficult problems more
efficiently. Other contributions in this chapter are the use of multiple
goals, and a useful model for the separation of the grasp generation
process from the trajectory planning.

² The Barrett hand is a three-fingered robot hand from Barrett Technology, Inc.

The second part of the thesis focuses on grasp planning and grasp sta-
bility. To be of real use, a robot must be able to interact with its envi-
ronment by picking up and manipulating objects. Thus, for robots that
autonomously interact with the environment, grasp planning and path
planning are needed in combination.

Chapter 6
In Chapter 6 we present a fast grasp planner for a three-fingered hand.
The planner uses a two-dimensional contour of the object as input and
the result is an ordered set of stable grasps. This contribution could be
useful for example in industrial applications where a robot is to pick up
objects of unknown shape from a conveyor belt.

Chapter 7
Grasp planning is often formulated as an optimization problem, where
the goal is to find the best grasp according to some criterion. In Chapter 7
we present a novel approach to grasp stability evaluation. The approach
is based on the grasp’s ability to withstand disturbance forces that act
on the surface of the grasped object. Previous approaches have often
treated the grasp stability analysis as independent of the object geometry,
whereas we claim that the stability also depends on the geometry. Our
approach takes the geometry into account, and it can also incorporate
task information to derive a task-directed quality measure for a grasp.

Appendices
Much of the framework in Chapter 2 is illustrated using a variant of the
class diagram notation introduced by Rumbaugh et al. [126]. Appendix A
gives a brief introduction to class diagrams. Here we also make clear the
additional conventions that are used in this thesis.
Appendix B gives a short introduction to rigid body
transformations and various ways of representing rotations.
The framework presented in Chapter 2 is designed to support different
geometric representations. Users of the framework can define their own
geometric types, or they can use one of the three types that the frame-
work provides: triangle sets, convex polyhedra, and hierarchies of convex
polyhedra. These three geometric types are covered in Appendix C.
In a similar manner, the framework is not bound to a specific collision
detection algorithm. The collision detection algorithms that have been
tested so far are PQP [76] and Enhanced GJK [27]. These algorithms are
covered in Appendix D.
Appendix E covers additional details of the framework, such as visu-
alization, task constraints, and path representation.
Appendix F presents parts of the Boost libraries that have been found
useful for implementing path planning algorithms. Examples are: generic
graph functions, random number generation, matrix classes, and parsing.

1.5 Publications
Some of the work presented in this thesis has been published elsewhere.
The material in Chapters 4, 6, and 7 is presented in part in the following
papers:

• M. Strandberg and B. Wahlberg. A method for grasp stability
evaluation based on disturbance force rejection. Submitted in Aug.
2004 to IEEE Transactions on Robotics.

• M. Strandberg. Augmenting RRT-planners with local trees. In
IEEE International Conference on Robotics and Automation,
volume 4, pages 3258–3262, New Orleans, LA, USA, Apr. 2004.

• M. Strandberg. A fast grasp planner for a three-fingered hand based
on 2D contours. In 33rd International Symposium on Robotics,
Stockholm, Sweden, Oct. 2002. IFR, Swira.

• M. Strandberg. A grasp evaluation procedure based on disturbance
forces. In IEEE/RSJ International Conference on Intelligent
Robots and Systems, volume 2, pages 1699–1704, EPFL, Lausanne,
Switzerland, Oct. 2002.

The following publication contains results that are relevant, but not
covered in the thesis.

• L. Petersson, P. Jensfelt, D. Tell, M. Strandberg, D. Kragic and
H. I. Christensen. Systems integration for real-world manipulation
tasks. In IEEE International Conference on Robotics and Automation,
volume 3, pages 2500–2505, Washington, DC, USA, May 2002.

Chapter 2

A Framework for Path Planning

Since the original work in [63, 147, 64], many variations of the basic
PRM algorithm have been published. Examples are: different sampling
schemes, different metrics, and various local planners. However, it is
difficult to compare these contributions objectively as “they were tested
on different types of scenes, using different underlying libraries, imple-
mented by different people on different machines” [46]. This problem
underlines the importance of a homogeneous software platform, where
changing, e.g., the collision detection algorithm is as simple as changing
one line of code or a directive in a configuration file. Only then can we
truly compare two versions of the same concept.
To address this problem, we will in this chapter present an object-
oriented framework for building path planners. The design rationale be-
hind the framework is plug-and-play capability between variations of all
the major concepts in path planning. Apart from the framework itself,
the contributions in this chapter are the identification of a set of or-
thogonal concepts, and the definition of coherent and efficient interfaces
to these concepts. The resulting framework is a set of loosely coupled
components that can be combined and extended to build path planners.
To emphasize the component-based approach, the framework is named
CoPP, which stands for Components for Path Planning.
As this is not the first attempt to create a framework for path planning,
the next section will describe other approaches and we will see how
they compare to CoPP. In Section 2.2 we give an overview of the framework
and we also group the different path planning concepts into four
high-level categories. These categories make it easier to get an overview
of the different concepts that are modeled in the framework, but they can
also serve as a model for how path planning applications are built using
the framework.1 For each of the four categories in Section 2.2, we identify
a number of concepts, such as metrics, sampling, and collision detection.
The most important of these concepts are covered in Sections 2.6–2.8.
The object-oriented design of each concept is visualized using a variant
of the class diagram notation introduced by Rumbaugh et al. [126]. A
brief introduction to the class diagram notation is given in Appendix A.
Throughout this chapter we try to focus on the high-level aspects of the
design, but readers who are interested in more details can find additional
material in the appendices.

2.1 Related Work

There are some recent papers where the need for fair comparisons between
different path planning algorithms has been addressed. Amato et
al. [6] studied the effect of using different metrics and local planners for
rigid body problems. Reggiani et al. [124] used pluggable distance
computation functions to perform an extensive study of existing collision
detection packages. Geraerts and Overmars [46] studied existing sampling
techniques and node adding techniques using a framework called
SAMPLE (System for Advanced Motion PLanning Experiments). Studies
like these are important for correctly assessing the merits and
drawbacks of different algorithms, and they should be carried out
each time a new algorithm has been developed. It is therefore highly
desirable to have an extendable framework in which we can do comparisons
between variations on different concepts. The CoPP framework is
designed such that doing experiments like those in [6, 124, 46] becomes
very easy.
There are a number of path planning frameworks that have been presented
in the literature, some of which are also publicly available. Among
the more popular is the Motion Strategy Library (MSL) developed at
the University of Illinois.2 MSL provides planners based on
Rapidly-exploring Random Trees (RRTs), PRMs and forward dynamic
programming. A nice feature of MSL is that its design allows holonomic and
dynamic systems to be treated in a uniform manner. Thus, from a planner’s
point of view it does not matter whether the system is dynamic or
not. However, there are also some drawbacks with the design of MSL: In
some areas the design does not have a good separation of concerns and
the granularity of the modeled concepts is too coarse, making it hard to
vary concepts individually. Another problem is that due to the chosen
method for visualization, MSL does not work under Windows. In CoPP,
we have avoided most of the portability issues by using the platform
independent VRML [8] language for visualization.

1 Note that we do not specify a specific architecture for path planners. We focus
instead on building blocks that can be used to build a path planner.
2 MSL is available at msl.cs.uiuc.edu/msl

Another framework is OxSim, developed at Oxford University, see
Qin et al. [122], Cameron [28], and Cameron and Pitt-Francis [29]. The
design of OxSim focuses on the separation of distance computations and
geometric issues from the planner; distance computations are provided by
a module based on the Enhanced GJK algorithm [27], also developed
at Oxford. Because the GJK algorithm only works on convex objects,
OxSim is currently limited to objects that can be described as the union of
convex polyhedra. The framework also requires problems to be formulated
in terms of a kinematic chain, i.e., a serial manipulator. Both PRM
planners and potential field planners have been tested in OxSim, see [122,
29].
In our opinion, the most mature and successful framework for
path planning so far is the Move3D framework developed by Siméon
et al. [135, 136]. This opinion is partly based on the number of new
PRM techniques that have been developed and tested with Move3D,
see [35, 50, 134, 118]. Move3D has also resulted in a commercial spin-off
product named KineoWorks,3 which might be the reason why Move3D is
not publicly available. Move3D is implemented in pure C, and the manual
[110] reveals that the set of modeled concepts is roughly the same
as in CoPP. Another feature common to both frameworks is the support
for several collision detection algorithms. However, because of the strong
encapsulation offered by object-oriented C++, we believe that new concepts
can be added more easily to CoPP than to Move3D.4

3 www.kineocam.com
4 Object-oriented programming is of course possible in C as well, but it usually
requires much more work and discipline from the programmer.

[Figure 2.1 (diagram). Blocks: Models of Motion (rigid body, kinematic
chain); Geometric Representations (triangle sets, convex polyhedra,
octrees); Collision Detection/Distance Computation; Planner (metrics,
interpolation, sampling). Arrow labels: change configuration; move
objects; position and orientation; collisions/distances.]

Figure 2.1: Illustration of the main components that are involved when
solving a path planning problem. The arrows show the flow of infor-
mation for a typical implementation. The dashed line indicates a more
passive relationship; the geometric objects that need to be checked for
collisions are usually registered once at startup to the object responsible
for collision detection.

2.2 Framework Overview

As mentioned in the introduction to this chapter, contributions in the
path planning field tend to be variations of concepts. Thus, the concepts
are stable, whereas their implementation is not. The CoPP framework
can be seen as a collection of largely orthogonal concepts
that have proven useful in the field of path planning. To get an overview
of the framework, we have grouped the concepts into four categories,
as shown in Figure 2.1. The categories are: geometric representation,
collision detection, motion generation, and planner concepts. These cat-
egories can also be seen as modules in a path planning application, where
the arrows in Figure 2.1 show the typical flow of information between the
modules.

Geometric Representation As path planning involves moving geometric
objects, one of the most basic issues is how we should represent
the geometry of the world and the objects that move in it. Typically,
the chosen representation is tightly coupled to the method used for
collision detection, but here we want to see the geometric representation as
a concept that can vary as freely as possible. At the highest level, as
described by Figure 2.1, we only require a geometric object to have
a few basic properties such as a position, an orientation, and a bounding
box, determining the object’s extent in space. We make no distinction
between stationary and moving objects; all objects are seen as moveable.
More details on geometric representations can be found in Section 2.3.

Models of Motion In addition to a description of the environment, a
path planning problem also involves one or more moving agents, denoted
by A in the case of a single agent. An agent can be any moving system,
ranging from a single rigid body or a robot manipulator, to a full-blown
dynamic model of an aircraft. The point here is that with the proposed
model for geometric representation, an agent is totally decoupled from
the method we use for representing geometric objects; the agent simply
forwards its current position and orientation to the geometric objects that
model its physical parts, see Figure 2.1. From this perspective, agents
are simply rules for generating motion. In this thesis, we focus on two
types of agents, rigid bodies and robots that can be described by tree-like
kinematic chains. The robot model is covered in Chapter 3.

Collision Detection Path planning algorithms depend heavily on
methods for collision detection, and the efficiency of the collision detection
algorithm that is used can greatly affect the total planning time. To de-
couple the planner from the collision detection algorithm we use concepts
like distance engines, which are objects that provide a uniform interface
for various proximity queries. Geometric objects that we want to check
for collisions are typically registered to a distance engine at initialization.
The distance engine can, depending on its type, use the geometric objects
right away, or it may create its own internal representation of them. This
weak relationship between the distance engine and the geometric objects
is indicated with the dashed line in Figure 2.1. We point out that, due to
being totally decoupled from the path planner, the distance engine con-
cept is very useful even in applications unrelated to path planning, e.g.,
in applications for physics simulation, or in computer games. Distance
engines and related concepts are covered in Section 2.4.
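
The distance engine idea can be sketched as follows. This is only an illustration under stated assumptions, not the CoPP interface: the names `Sphere`, `DistanceEngine`, `Register`, `MinDistance`, and `BruteForceSphereEngine` are hypothetical, and spheres are used as geometric objects solely to keep the example short.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical geometric object: a sphere with a center and a radius.
struct Sphere {
    double x, y, z, radius;
};

// Hypothetical uniform interface for proximity queries. A planner that
// only uses this interface does not depend on a specific algorithm.
class DistanceEngine {
public:
    virtual ~DistanceEngine() = default;

    // Objects are registered once, typically at initialization.
    virtual void Register(const Sphere* object) = 0;

    // Smallest signed distance over all registered pairs
    // (negative when two objects overlap).
    virtual double MinDistance() const = 0;

    bool InCollision() const { return MinDistance() <= 0.0; }
};

// One possible engine: it keeps pointers to the registered objects and
// reads their current positions at query time.
class BruteForceSphereEngine : public DistanceEngine {
public:
    void Register(const Sphere* object) override { objects_.push_back(object); }

    double MinDistance() const override {
        double best = std::numeric_limits<double>::infinity();
        for (std::size_t i = 0; i < objects_.size(); ++i) {
            for (std::size_t j = i + 1; j < objects_.size(); ++j) {
                const Sphere& a = *objects_[i];
                const Sphere& b = *objects_[j];
                const double dx = a.x - b.x;
                const double dy = a.y - b.y;
                const double dz = a.z - b.z;
                const double d =
                    std::sqrt(dx * dx + dy * dy + dz * dz) - a.radius - b.radius;
                if (d < best) best = d;
            }
        }
        return best;
    }

private:
    std::vector<const Sphere*> objects_;
};
```

An engine of another type could instead copy the registered objects into its own internal representation, which corresponds to the weak relationship indicated by the dashed line in Figure 2.1.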

Planner Concepts The last category in Figure 2.1, planner concepts,
is more directly related to path planning than the other categories.
Examples of concepts in this category are:

• Interpolation methods, Section 2.5

• Metrics, Section 2.6

• Sampling strategies, Section 2.7

• Local planning methods, Section 2.8

• Path smoothing, Section 5.7

Design Rationale Throughout the design of CoPP, we have aimed to
follow sound object-oriented techniques to achieve a framework that is:

• Easy to use

• Easy to extend

• Loosely coupled

• Portable

• Efficient

Among the techniques and principles we have used are the design pat-
terns in [45]. We have for example used the Template Method pattern to
automate (from the point of view of the derived classes) the gathering of
useful statistics, such as the number of collision checks.
For each concept mentioned in this section there is a corresponding
class hierarchy. The result is a set of small, loosely coupled class hier-
archies, where each hierarchy ideally has one well-defined responsibility,
e.g., interpolation between two configurations. Each class hierarchy orig-
inates from an abstract class, whose only purpose is to define a uniform
interface to the modeled concept. The idea is that planners that only refer
to the abstract interface of a concept can be reconfigured with any
variation of that concept. This makes it easy to perform fair comparisons
between variations of a concept, which was one of the main motivations
behind the framework. An additional advantage is that we have in effect
got a multitude of planners at the price of one; we just reconfigure the
planner with another combination of concepts.
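
The combination of an abstract interface and the Template Method pattern can be sketched as follows. This is a minimal illustration, not the actual CoPP code: the names `CollisionChecker`, `DoInCollision`, `NumChecks`, `SphereObstacleChecker`, and `SegmentIsFree` are hypothetical, and the obstacle model is assumed purely for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical abstract interface for a collision-checking concept.
// The public, non-virtual function is the "template method": it updates
// the statistics and then delegates to the derived class.
class CollisionChecker {
public:
    virtual ~CollisionChecker() = default;

    bool InCollision(const std::vector<double>& config) {
        ++num_checks_;                 // statistics gathered automatically
        return DoInCollision(config);  // variation point
    }

    std::size_t NumChecks() const { return num_checks_; }

private:
    virtual bool DoInCollision(const std::vector<double>& config) const = 0;
    std::size_t num_checks_ = 0;
};

// One variation of the concept: a point robot and a spherical obstacle
// centered at the origin (assumed here only for illustration).
class SphereObstacleChecker : public CollisionChecker {
public:
    explicit SphereObstacleChecker(double radius) : radius_(radius) {}

private:
    bool DoInCollision(const std::vector<double>& config) const override {
        double sq = 0.0;
        for (double x : config) sq += x * x;
        return sq < radius_ * radius_;
    }
    double radius_;
};

// A planner-side routine that refers only to the abstract interface.
// It checks steps + 1 evenly spaced configurations between a and b and
// stops at the first configuration that is in collision.
bool SegmentIsFree(CollisionChecker& checker,
                   const std::vector<double>& a,
                   const std::vector<double>& b, int steps) {
    std::vector<double> q(a.size());
    for (int i = 0; i <= steps; ++i) {
        const double t = static_cast<double>(i) / steps;
        for (std::size_t k = 0; k < a.size(); ++k)
            q[k] = (1.0 - t) * a[k] + t * b[k];
        if (checker.InCollision(q)) return false;
    }
    return true;
}
```

Because `SegmentIsFree` only sees `CollisionChecker`, any other derived class can be substituted without changing the routine, and the number of checks is counted without any effort from the derived classes.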

CoPP is implemented in C++, and in addition to general object-oriented
techniques, CoPP also uses some C++ specific features such
as templates and overloading. These two features in combination have
helped to provide generic solutions in areas where new concepts are likely
to be added. These generic solutions make the framework easy to extend.
Experience has shown that the portability problems for existing
frameworks, such as MSL, are mostly due to the choice of a platform
specific graphics API. To minimize such platform dependencies, we have
chosen to use the platform independent VRML [8] language for visual-
ization (see Appendix E.4).
To visualize the design, we use a variant of the class diagram nota-
tion introduced by Rumbaugh et al. [126]. A brief introduction to the
class diagram notation, together with some added conventions, is given
in Appendix A.

External Dependencies Any software project of moderate size will,
unless we are willing to reinvent the wheel over and over again, depend on
a number of external libraries. It is desirable, however, to keep the
number of external dependencies as small as possible. The Boost libraries5
are free peer-reviewed C++ libraries of high quality. We have found that
Boost contains much functionality that is useful in the context of
path planning. Examples are: a wide variety of random number
generators, a parser generator framework, and a library of generic graph
functions [133]. Thus, using Boost we can localize a lot of functionality
to a single external dependency. The usage of Boost within CoPP is
described in Appendix F.
Other external dependencies are the collision detection algorithms
PQP [76] and Enhanced GJK [27], and the Qhull program [12] for com-
puting convex hulls.

2.3 Dealing with Geometry

A study of proposed planners in the literature and path planning
software shows that the way geometry is represented is often tightly
coupled to the collision-detection algorithm that is used. Indeed, much
work has focused on finding suitable data structures for fast collision
detection algorithms. To be able to visualize the planner results, the
data structure must also be suitable for rendering. It turns out that
these two requirements often are incompatible, so many path planners use
two sets of data structures for each geometric object: one for collision
detection and one for visualization. For some planners, the planning
method itself requires adding “embellishments” to an otherwise standard
geometry representation. Examples of this are workspace potential field
methods, which require each geometric object to have a set of control
points attached to it, see, e.g., Tsai et al. [149] and Barraquand et
al. [13]. At each configuration, the forces resulting from the potential
field are summed over all the control points to give a resultant force
and torque acting on the object. Other examples are the method of
Holleman and Kavraki [56] to sample the medial axis of the workspace and
the visibility tetrahedron concept of Hernando and Gambao [54, 55],
which also need special points attached to each geometry. In the
planning method suggested by Baginski [11], each link of the robot is
covered with several protective layers of increasing thickness. These
layers are used both to speed up collision checks along path segments
and to provide an estimate of the penetration depth in case of a
collision.

5 www.boost.org

It is clear from the above that testing different collision detection
algorithms and developing new planning methods is difficult if the path
planner framework is restricted to one single geometric representation.
Therefore it was decided that CoPP should support several geometric
representations through an extensible class hierarchy, where all geometric
objects inherit from a single base class.

2.3.1 Requirements on Geometric Types

Deciding which properties should be common to all geometric types is an
important design decision, as these properties form the base class
interface. However, as each added property in general imposes a constraint,
too many of them will lead to a constrained hierarchy that only allows
certain types of geometric representations. We have found that once a
path planner is initialized, i.e., geometries are loaded and collision
detection is set up, geometries are mostly moved around and queried for their
current position and orientation, which is in agreement with the flow of
information in Figure 2.1. Based on this observation, we require that
all geometric objects must be moveable and that they know about their
position and orientation relative to a world frame. Furthermore, we have
also found that it is often useful to know about the spatial extent of a
given geometry. Therefore the base class interface will support queries
about the (oriented) bounding box of the geometry. The bounding box
can be used to speed up collision detection, since we can quickly discard
pairs of geometries whose bounding boxes do not overlap. It is also used
to estimate the displacement of a robot link as the robot moves from one
configuration to another (see Section 2.6 on metrics).

[Figure 2.2 (class diagram). Class Geom with operations GetPose( ),
Move(Transform t), Move(Vector args), SetMoveFormat(format),
GetBoundingBox(…), AttachToFrame(…), and Accept(GeomVisitor v).
Geom has an association "material" to class Material and an
association "pose" to class Transform.]

Figure 2.2: Class diagram for the geometry base class.

We have found that these requirements form a base class interface that
is rich enough to be useful, yet imposes hardly any constraints on which
geometric representations we can use. The base class of all geometric
types is called Geom and it is shown in a class diagram in Figure 2.2.
From Figure 2.2 we can see that the base class declares a virtual
function Accept that takes a single argument of type GeomVisitor. This
is a prerequisite for the Visitor pattern [45], whose intent can be described
as adding new virtual methods to a class hierarchy without modifying
the hierarchy. The methods we want to add are put in a class that
derives from GeomVisitor. An example of the Visitor pattern is given
in Appendix E, where it is used to determine how to draw a geometric
object.
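
A minimal sketch of the Visitor pattern applied to a geometry hierarchy is given below. It is an illustration under stated assumptions, not the code in Appendix E: the concrete classes `TriangleSet` and `ConvexPolyhedron`, the visitor `TypeNameVisitor`, and the helper `DescribeGeom` are hypothetical, and `Accept` is assumed here to take the visitor by reference.

```cpp
#include <string>

class TriangleSet;
class ConvexPolyhedron;

// Hypothetical visitor interface: one overload per concrete geometric type.
class GeomVisitor {
public:
    virtual ~GeomVisitor() = default;
    virtual void Visit(const TriangleSet& g) = 0;
    virtual void Visit(const ConvexPolyhedron& g) = 0;
};

// Reduced stand-in for the geometry base class; only Accept is shown.
class Geom {
public:
    virtual ~Geom() = default;
    virtual void Accept(GeomVisitor& v) const = 0;
};

class TriangleSet : public Geom {
public:
    // Dispatches to the overload that matches the concrete type.
    void Accept(GeomVisitor& v) const override { v.Visit(*this); }
};

class ConvexPolyhedron : public Geom {
public:
    void Accept(GeomVisitor& v) const override { v.Visit(*this); }
};

// A "new virtual method" added without modifying the Geom hierarchy:
// it reports a name for the concrete type of the visited object.
class TypeNameVisitor : public GeomVisitor {
public:
    void Visit(const TriangleSet&) override { name_ = "triangle set"; }
    void Visit(const ConvexPolyhedron&) override { name_ = "convex polyhedron"; }
    const std::string& Name() const { return name_; }

private:
    std::string name_;
};

// Client code only needs a reference to the base class.
std::string DescribeGeom(const Geom& g) {
    TypeNameVisitor v;
    g.Accept(v);
    return v.Name();
}
```

A drawing visitor would follow the same structure, with each `Visit` overload issuing the rendering commands appropriate for that geometric type.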
For visualization purposes, geometric objects also have material prop-
erties. The class Material in Figure 2.2 has the same type of data as the
material descriptions used in, e.g., VRML [8] and Open Inventor [154].
It can be used to model a wide range of effects such as shininess, specu-
lar color and transparency. Even though only a subset of these settings
are used most of the time, we have found that the transparency setting
is particularly useful for visualizing trajectories or hierarchically com-
posed geometric objects. See Figure 5.14 in Chapter 5 or Figure C.5 in
Appendix C for examples.

2.3.2 Moving Objects Around

The most important property of the geometric types used in CoPP is
that they can be moved around. Moving an object means that we specify
a position and an orientation for it. This operation is straightforward,
but it is not so clear what the interface for the operation should look like.
This difficulty is mainly because there are so many ways to represent
rotations. We could, for example, use rotation matrices, quaternions, or
one of the many Euler angle set conventions, see Appendix B. Which
representation is most effective depends both on the representation used
internally by the path planner and on the type of path planning problem:
As pointed out by Kuffner [70], the choice of rotation representation can
have a big impact on the required planning time for rigid body problems.
For these reasons we do not want to restrict ourselves to one specific
format for moving objects around.
In Figure 2.2 we can see that the base class interface has two functions
named Move. In the first function, the argument is a Transform object,
see Appendix B.1, thus the orientation is specified by a rotation matrix.
The format used by the second function is less clear as its only argument
is a vector. How the contents of this vector are interpreted depends on an
internal parameter of the geometric object. Depending on the value of
this parameter, which can be changed through SetMoveFormat, the
vector can be interpreted as Euler angles, as a rotation axis and an angle,
or as a quaternion. Any additional elements in the vector are interpreted
as translations.
With this approach, we get a uniform interface for all but one of the
formats, resulting in several benefits. First, without any constraints ge-
ometric objects behave as free-flying rigid bodies. Thus, for rigid body
problems, it is possible to use geometric objects right away, without inter-
vention of an object containing the equations of motion, as in Figure 2.1.
Second, it is easy to experiment with different formats to see which one is
most efficient. Finally, the second Move function has the same signature
as that of the robot class in Chapter 3. This similarity in the interfaces
opens the possibility to write C++ class templates that accept any type
of moving agent, as long as it has a function named Move with a match-
ing signature. Examples of this approach are given in Figure 2.11 and
Figure E.2.
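
The idea of a class template that accepts any moving agent can be sketched as follows. This is not the code from Figure 2.11 or Figure E.2; the types `RigidBody` and `Robot` and the function `MoveAlongSegment` are hypothetical stand-ins, and plain linear interpolation is used only to keep the example short (it is not a suitable interpolation for all rotation formats).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for two kinds of moving agents. Both expose a
// function named Move that takes a vector of numbers.
struct RigidBody {
    std::vector<double> pose;  // e.g. rotation parameters, then translation
    void Move(const std::vector<double>& args) { pose = args; }
};

struct Robot {
    std::vector<double> joints;
    void Move(const std::vector<double>& joint_values) { joints = joint_values; }
};

// Works with any type that has a matching Move function; no common base
// class is required. Moves the agent through steps + 1 linearly
// interpolated configurations from a to b and returns how many
// configurations were visited.
template <class Agent>
int MoveAlongSegment(Agent& agent, const std::vector<double>& a,
                     const std::vector<double>& b, int steps) {
    std::vector<double> q(a.size());
    int visited = 0;
    for (int i = 0; i <= steps; ++i) {
        const double t = static_cast<double>(i) / steps;
        for (std::size_t k = 0; k < a.size(); ++k)
            q[k] = (1.0 - t) * a[k] + t * b[k];
        agent.Move(q);
        ++visited;
    }
    return visited;
}
```

The template is resolved at compile time, so a type without a suitable `Move` function is rejected by the compiler rather than at run time.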
So far we have only seen direct methods for moving geometric objects;
direct in the sense that the client directly communicates with the object
to be moved. We have found that moving objects this way can be both
inefficient and limiting for some types of planners. Therefore there is also
an indirect method for moving objects. This method uses the concept
of moving coordinate frames, to which we can attach any number of
geometric objects. Once an object is attached to a frame, it will move
along with the frame. As an example, we can think of a robot manipulator
with a moving coordinate frame at each joint. To make the robot’s links
move we just attach them to the corresponding frame. When we ask
a link about its current pose, it will return the pose of the frame it is
connected to.
The member function AttachToFrame takes a single Transform and
attaches the object to it. An overloaded version of this function takes two
Transform arguments, where the second argument specifies a constant
offset between the moving frame and the object. This second version is
especially convenient in pick-and-place planners, where a grasped object
suddenly moves along with the robot. When an object is grasped, the
following code makes sure it moves correctly together with the robot:

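    // Pose of the grasped object relative to the end effector.
    Transform offset = robot.GetEndEffectorPose().Inverse() *
                       grasped_object.GetPose();

    // From now on the object follows the end effector frame.
    grasped_object.AttachToFrame(robot.GetEndEffectorPose(), offset);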

The second argument specifies the pose of the grasped object relative
to the end effector, and is obtained from the inverse of Equation (B.6) in
Appendix B.
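
The offset computation can be written out as a short derivation. The symbols are notation assumed for this illustration, and Equation (B.6) itself is not reproduced here. Let $T_E$ and $T_O$ denote the poses of the end effector and the grasped object relative to the world frame at the moment of grasping, and let $T_{\mathrm{off}}$ be the constant pose of the object relative to the end effector:

```latex
\begin{align*}
T_O &= T_E \, T_{\mathrm{off}} \\
T_{\mathrm{off}} &= T_E^{-1} \, T_O
\end{align*}
```

As long as the object is held, $T_{\mathrm{off}}$ stays constant, so the object's pose at any later time is the current end effector pose multiplied by the same offset.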

2.3.3 Concrete Geometric Types

With the base class in place, we can start defining concrete geometric
types with representations like polygon meshes, Bézier patches, construc-
tive solid geometry, octrees, or any other representation that serves our
purposes. Currently there are three concrete geometric types in CoPP:
triangle sets, convex polyhedra, and objects that are hierarchies of con-
vex polyhedra. The reason for choosing these geometric types is that
most available collision detection algorithms can be divided in two major
classes: those that work on triangle sets, and those that work on convex
polyhedra. Thus, to support common collision detection algorithms,
triangle sets and convex polyhedra were a natural first choice to experiment
with. As the class of convex polyhedra is rather restricted, we also added
a type that can handle non-convex objects that can be decomposed into
a finite set of convex polyhedra. Details about these classes can be found
in Appendix C.
To make it easy to construct rather complex environments and ob-
jects, we have developed a VRML-like file format in which geometric
objects can be defined. The parser for this format was developed with
the Spirit parser generator framework. Spirit is part of the Boost libraries
and is briefly described in Appendix F.

2.4 Collision Detection

Collision detection is a vital part of any path planner. Furthermore,
because path planners spend most of their time on collision or distance
queries, the efficiency of the collision detection algorithm will greatly af-
fect the overall efficiency of the planner. In this section, we will review
some proposed collision-detection and distance-computation algorithms.
We will also introduce classes whose intent is to provide a uniform interface
for several types of proximity queries. As these classes are totally
decoupled from planner concepts, they could be used in other applications
as well, for example in physics simulations or in computer games.
The basic collision detection problem is about determining whether two
geometric objects share a common region.6 This formulation leads to
algorithms that work on pairs of objects. However, in simulations containing
hundreds or even thousands of objects, a pairwise approach is often
impractical due to the O(n²) cost. Therefore collision detection algorithms
for large scale environments are often divided into two parts: the broad
phase, which identifies pairs of objects that have to be considered for possible
collisions, and the narrow phase, which performs exact collision checking
on the pairs that were not removed in the broad phase.
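
The broad phase can be illustrated with a simple bounding-box filter. The sketch below rests on simplifying assumptions: it uses axis-aligned boxes rather than oriented ones, and the names `Aabb`, `Overlap`, and `BroadPhase` are hypothetical. It still loops over all pairs, so it only replaces an expensive exact test with a cheap box test; practical broad-phase methods avoid the quadratic loop as well, for example by sorting the boxes along an axis.

```cpp
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

// Axis-aligned bounding box given by its minimum and maximum corners.
struct Aabb {
    std::array<double, 3> min;
    std::array<double, 3> max;
};

// Two boxes overlap only if their intervals overlap on every axis.
bool Overlap(const Aabb& a, const Aabb& b) {
    for (int k = 0; k < 3; ++k)
        if (a.max[k] < b.min[k] || b.max[k] < a.min[k]) return false;
    return true;
}

// Broad phase: returns the index pairs whose boxes overlap. Only these
// candidate pairs need to be passed on to the exact narrow-phase test.
std::vector<std::pair<std::size_t, std::size_t>>
BroadPhase(const std::vector<Aabb>& boxes) {
    std::vector<std::pair<std::size_t, std::size_t>> candidates;
    for (std::size_t i = 0; i < boxes.size(); ++i)
        for (std::size_t j = i + 1; j < boxes.size(); ++j)
            if (Overlap(boxes[i], boxes[j]))
                candidates.push_back(std::make_pair(i, j));
    return candidates;
}
```

Pairs that are rejected here can never collide, since each object is fully contained in its box; pairs that pass may or may not collide and must be checked exactly.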
The efficiency of collision detection algorithms is closely coupled to the
data structures used to model the geometric objects. Over the years,
many different representations for geometric objects have been proposed.
To reduce the spatial complexity, many algorithms use hierarchical
representations, see, e.g., Larsen et al. [76]. Other algorithms are efficient
due to special assumptions about the objects, such as convexity. In the
next section we will discuss algorithms that work on convex polyhedra.
6 As pointed out by Cameron in [28], we should distinguish between interference
detection and collision detection; the former checks whether two static objects overlap
in space, whereas the latter checks whether two objects interfere over a whole range of
motions. In this thesis we follow the mainstream and use the term collision detection
for both problems.

2.4.1 Convex Polyhedra and Collision Detection

Convex polyhedra have been studied extensively in the context of the
minimum distance problem. The reason is that the minimum distance
problem in this case can be cast as a linear programming problem, al-
lowing us to use well-known results from convex optimization. In fact,
it can be shown that the collision detection problem for convex polyhedra
can be solved in linear time (in terms of the total number of vertices)
in the worst case, see Lin and Canny [89]. However, as we soon will
see, there are several algorithms that, under the assumption of tempo-
ral coherence, exhibit almost-constant time complexity. The algorithms
for convex polyhedra fall into two main categories: simplex-based and
feature-based. The simplex-based algorithms treat the polyhedron as the
convex hull of a point set and perform operations on simplices defined
by subsets of these points. Feature-based algorithms, on the other hand,
treat the polyhedron as a set of features, where the different features are
the vertices, edges and faces of the polyhedron.

Simplex-Based Algorithms
One of the most well-known simplex-based algorithms is the
Gilbert-Johnson-Keerthi (GJK) algorithm [48]. Conceptually, the GJK algorithm
works with the Minkowski difference of the two convex polyhedra. The
Minkowski difference is also a convex polyhedron and the minimum distance
problem is reduced to finding the point in that polyhedron that is
closest to the origin; if the polyhedron includes the origin, then the two
polyhedra intersect. However, forming the Minkowski difference explicitly
would be a costly approach. Instead, GJK works iteratively with small
subsets, simplices, of the Minkowski difference that quickly converge to a
subset that contains the point closest to the origin. For problems involving
continuous motion, the temporal coherence between two time instants
can be exploited by initializing the algorithm with the final simplex from
the previous distance computation. Experiments in [48] showed that the
algorithm has a linear complexity in terms of the total number of vertices,
except for a few degenerate cases.
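As a rough illustration of why the Minkowski difference never has to be formed explicitly, GJK-style algorithms usually probe it through a support mapping: the point of A − B farthest in a direction d is the farthest point of A in direction d minus the farthest point of B in direction −d. The sketch below shows only this building block, not the GJK iteration itself. The names Vec3, Support, and SupportMinkowskiDiff are hypothetical and are not taken from CoPP or from the implementation in [48].

```cpp
#include <cstddef>
#include <vector>

struct Vec3 {
    double x, y, z;
};

double Dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Support mapping of a convex polyhedron given by its vertex list: the
// vertex lying farthest in direction d. This is a plain linear scan over
// a non-empty list; a hill-climbing variant would walk adjacent vertices.
Vec3 Support(const std::vector<Vec3>& verts, const Vec3& d) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < verts.size(); ++i) {
        if (Dot(verts[i], d) > Dot(verts[best], d)) best = i;
    }
    return verts[best];
}

// Support point of the Minkowski difference A - B in direction d,
// computed as s_A(d) - s_B(-d) without ever building A - B.
Vec3 SupportMinkowskiDiff(const std::vector<Vec3>& a,
                          const std::vector<Vec3>& b, const Vec3& d) {
    const Vec3 pa = Support(a, d);
    const Vec3 pb = Support(b, Vec3{-d.x, -d.y, -d.z});
    return Vec3{pa.x - pb.x, pa.y - pb.y, pa.z - pb.z};
}
```

Each call costs time linear in the number of vertices here, whereas the explicit Minkowski difference of two polyhedra with n and m vertices can have on the order of n·m candidate vertices.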
Since the original paper [48], several improvements on GJK have been
published. Cameron [26, 27] suggested the use of adjacency information
for each vertex to speed up the algorithm. With this adjacency information,
the algorithm uses hill-climbing to achieve faster convergence.
This variant of GJK is often referred to as Enhanced GJK. Experiments
in [26, 27] involving tracking pairs of convex polyhedra showed
26 2 A Framework for Path Planning

that the computational time was close to constant when the total number
of vertices was varied between 10 and 500. This almost-constant
time complexity has been confirmed by Ong and Gilbert [111, 112]. Both
Cameron [26, 27] and Ong and Gilbert [111, 112] point out that this
behavior only holds for situations with high temporal coherence. Nagle
(footnote 7) pointed out that there are cases where the GJK algorithm
experiences problems with numerical errors; in situations where two faces,
very different in size, are almost parallel, the algorithm might end up in
an infinite loop. This was in fact experienced during tests of the
pick-and-place planner in Chapter 5; at certain stages of a task, the robot
hand was moving close to parallel to the table top, causing Enhanced GJK
(footnote 8) to cycle when computing the distance between the finger links
and the table. In a later version of his implementation of Enhanced GJK
(footnote 9), Cameron addressed this problem by changing the termination
condition and also returning an error parameter. This version has not yet
been tested in the CoPP framework.
The robustness of GJK was also addressed by van den Bergen [150].
Furthermore, efficient caching of repeated dot products tends to make his
implementation the fastest GJK implementation currently available.
To summarize:

• GJK is a fast algorithm for solving the minimum distance problem
  for convex polyhedra.

• In applications with strong temporal coherence, repeated distance
  computations have an almost-constant time complexity, independent
  of the total number of vertices involved.

• There are only two numerical tolerances and the algorithm is easy
  to implement.

• If care is taken to detect cycling, the GJK algorithm is very robust.

It is also interesting to note that R^3 is just a special case of the original
formulation of GJK; it remains valid in higher dimensions as well.
7 J. Nagle. GJK collision detection algorithm wanted. Posted on the
comp.graphics.algorithms newsgroup, Apr. 1998.
8 The Oxford version of GJK, version 2.1.
Available at web.comlab.ox.ac.uk/oucl/work/stephen.cameron/distances/
9 Version 2.4.

Figure 2.3: Illustration of the Voronoi regions for a simple geometry.
Panels (a), (b), and (c) show, from left to right, the Voronoi regions of
a vertex, an edge, and a face.

Feature-Based Algorithms
A polyhedron can be seen as a set of features, where the different features
are vertices, edges and faces. The Lin-Canny closest features algorithm [89]
traverses the surfaces of two convex polyhedra to find the
closest pair of features. This search is made very efficient because of the
use of precomputed Voronoi regions for each feature. The Voronoi regions
divide the exterior of each polyhedron into semi-infinite
regions, where each region belongs to a specific feature. Figure 2.3 shows
examples of the three types of Voronoi regions for a simple geometry.
Once the closest features are found and cached, subsequent queries will
run in expected constant time. Even though Lin-Canny is considered to
be among the fastest algorithms for this problem, it has several drawbacks,
as pointed out by Mirtich [103]. First, it cannot handle the case of
intersecting polyhedra, in which case the algorithm cycles forever. Second,
other geometrically degenerate situations can cause the algorithm
to cycle as well. Third, the implementation depends on over six numerical
tolerances that have to be tweaked. It should be mentioned, though, that
the implementation of Cohen et al. [32] is able to handle the intersecting
case also, but at the cost of extra pre-processing.
Addressing the shortcomings of Lin-Canny, Mirtich [103] developed
the V-Clip algorithm. Experiments in [103] show that V-Clip is both
more robust and faster than both Lin-Canny and Enhanced GJK. However,
as pointed out by Levey et al. [86], one often overlooked drawback
of both Lin-Canny and V-Clip is the huge memory footprint of their data
structures. They report that for each edge added to a polyhedron, the
V-Clip data structure requires 168 bytes of storage.

2.4.2 Proximity Queries on Pairs of Objects

At the lowest level, proximity queries deal with pairs of objects. Thus,
although the algorithm for solving a query varies, the concept of a pair
of geometric objects remains invariant. This invariant can be expressed
with an abstract class that specifies an interface for proximity queries on
a pair. Concrete classes implement specific algorithms, and each instance
of such a class holds references to two geometric objects. Additional
information that can be stored in such objects is the result of the previous
query; several algorithms utilize such information to obtain almost-constant
time complexity in situations with strong temporal coherence.
The most important issue when designing this abstract class is its
interface; after all, its whole purpose is to define an interface to which
all derived classes must conform. As proximity queries come in several
flavors, we could easily think of many functions that should go into the
interface, e.g., Collides for a binary answer, Distance for the minimum
distance, and ClosestPoints for the two points that realize the
minimum distance between the two objects. However, too fat an interface
would severely restrict the types of algorithms we could use and also make
the class hierarchy hard to extend. Some collision detection algorithms,
for example the separating vector algorithm in [31], do not compute
distances; they just give a binary answer depending on whether the objects
collide. With a base class that requires the implementation of distance
queries, such algorithms would be ruled out. Thus, the problem is how
to define a uniform interface for different proximity queries, without
imposing any constraints on the type of algorithms the class hierarchy can
support. If we rank algorithms according to the amount of information
they provide in the following way:

1. Binary answer queries
2. Minimum distance queries
3. Closest points queries

then it is clear that each rank extends the capabilities of the previous
one. Thus, we have in effect found an inheritance hierarchy. We have
chosen the names of the corresponding classes to be CollisionPair,
DistancePair, and WitnessPair. The name WitnessPair was chosen
because the two points that realize the minimum distance are often called
witness points in the literature [27]. A class diagram for the resulting
hierarchy is shown in Figure 2.4.

Figure 2.4: Classes that encapsulate proximity queries for pairs of
geometric objects.
[Class diagram. CollisionPair: + Collides( ), + GetNumCollisions( ),
- DoCollides( ); attribute num_collisions; note attached to Collides:
if (DoCollides()) { ++num_collisions; return true; } return false;
DistancePair: + Collides(tolerance), + Distance( ), + DistanceSqrd( ),
- DoCollides(tolerance). WitnessPair: + GetClosestPoints(p1, p2).]
Note that none of the abstract classes have any members that refer-
ence a pair of geometric objects; it is up to the derived classes to decide
which specific type of geometric objects they accept. Still, clients can
access the pair through the base class interface using the pure virtual
functions GetFirst and GetSecond, which all derived classes have to
implement.
The class CollisionPair supports only the binary collision query
through the member function Collides. This function is implemented
using the Template Method pattern [45] to automate the computation of
statistics like the number of collisions detected for a particular pair. The
derived class DistancePair provides an overloaded version of Collides
that takes a tolerance parameter. If the distance between the two objects
is smaller than the tolerance, then the function returns true. This is useful
in planning applications that use safety margins. Also, tolerance queries
are often faster and easier to answer than distance queries, see, e.g.,
Larsen et al. [76].
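To make the hierarchy concrete, the following is a minimal sketch of how the three ranks and the Template Method in Collides could be written in C++. It follows the member names in Figure 2.4, but it is an illustration under assumptions and not the CoPP source: the Point type, the SphereDistancePair class, the exact signatures, and the choice that a distance no larger than the tolerance counts as a collision are all hypothetical.

```cpp
#include <algorithm>
#include <cmath>

// Rank 1: binary answer only.
class CollisionPair {
public:
    virtual ~CollisionPair() = default;

    // Template Method: the bookkeeping is fixed here, the actual query
    // is supplied by derived classes through DoCollides().
    bool Collides() {
        if (DoCollides()) {
            ++num_collisions_;
            return true;
        }
        return false;
    }
    int GetNumCollisions() const { return num_collisions_; }

private:
    virtual bool DoCollides() = 0;
    int num_collisions_ = 0;
};

// Rank 2: adds distance queries and a tolerance version of Collides.
class DistancePair : public CollisionPair {
public:
    using CollisionPair::Collides;  // keep the zero-argument overload visible
    bool Collides(double tolerance) { return DoCollides(tolerance); }
    double Distance() { return std::sqrt(DistanceSqrd()); }
    virtual double DistanceSqrd() = 0;

private:
    virtual bool DoCollides(double tolerance) = 0;
};

// Rank 3: adds the witness points (left abstract in this sketch).
struct Point {
    double x, y, z;
};
class WitnessPair : public DistancePair {
public:
    virtual void GetClosestPoints(Point& p1, Point& p2) = 0;
};

// Hypothetical concrete class for two spheres; not part of CoPP.
class SphereDistancePair : public DistancePair {
public:
    SphereDistancePair(Point c1, double r1, Point c2, double r2)
        : c1_(c1), r1_(r1), c2_(c2), r2_(r2) {}
    double DistanceSqrd() override {
        const double g = Gap();
        return g * g;
    }

private:
    // Surface-to-surface distance, clamped to zero when the spheres overlap.
    double Gap() const {
        const double dx = c2_.x - c1_.x, dy = c2_.y - c1_.y, dz = c2_.z - c1_.z;
        return std::max(0.0, std::sqrt(dx * dx + dy * dy + dz * dz) - r1_ - r2_);
    }
    bool DoCollides() override { return Gap() <= 0.0; }
    bool DoCollides(double tolerance) override { return Gap() <= tolerance; }

    Point c1_;
    double r1_;
    Point c2_;
    double r2_;
};
```

A binary-only algorithm would derive directly from CollisionPair and implement just DoCollides(), which is the point of ranking the queries: it is never forced to provide a distance.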
In its current version, the CoPP framework has been tested with
Enhanced GJK [27] for convex polyhedra and PQP [76] for triangle sets.
The classes that encapsulate these two algorithms (footnote 10) are shown
in the class diagram in Figure 2.4. Details about the implementation of
these classes can be found in Appendix D.
A useful extension of the class hierarchy in Figure 2.4 is to add classes
that can handle callback functions, i.e., a user-defined action that should
be activated on some event. A callback function could for example com-
pute new velocities for two dynamic objects that collide. Another exam-
ple relates to planning based on potential fields; each time the distance
between a robot link and an obstacle is computed, a callback function
could be called to compute the repulsive force and accumulate it to the
robot link. After all distance computations have been made, each robot
link contains the accumulated forces due to the obstacles.

2.4.3 Classes for Dealing with Sets of Objects

So far we have only discussed how to deal with pairs of geometric objects.
However, at a higher level we must also be concerned about how to deal
with many moving and stationary objects. If the number of objects is
small, then we can use the straightforward approach to do a collision
check on every pair. As the number of objects grows larger, this approach
would quickly become infeasible due to its O(n^2) complexity in terms
of the number of objects. Other methods rely on dividing the workspace
into regions, where each object is assigned to one or more regions. Only
objects that reside in the same region need to be checked against each
other. The collision detection package I-Collide [32] uses a
sweep-and-prune method to quickly rule out pairs that clearly do not
intersect. This method uses the fact that two axis-aligned bounding boxes
intersect if and only if their projections on each coordinate axis also
intersect. The complexity is thereby reduced from O(n^2) to O(n + m),
where m is the number of axis-aligned bounding boxes that do intersect.
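The projection property can be sketched in a few lines. The code below tests two axis-aligned boxes by comparing their intervals on each axis, and then uses a simplified sort-and-sweep along the x-axis to collect candidate pairs. This is an assumption-laden illustration, not the I-Collide implementation: the real sweep-and-prune method keeps sorted endpoint lists on all three axes and updates them incrementally between frames. The names Aabb, BoxesOverlap, and CandidatePairs are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

struct Aabb {
    double lo[3];  // minimum corner
    double hi[3];  // maximum corner
};

// Two closed intervals overlap iff each starts before the other ends.
bool IntervalsOverlap(double lo1, double hi1, double lo2, double hi2) {
    return lo1 <= hi2 && lo2 <= hi1;
}

// Boxes overlap iff their projections overlap on every coordinate axis.
bool BoxesOverlap(const Aabb& a, const Aabb& b) {
    for (int axis = 0; axis < 3; ++axis) {
        if (!IntervalsOverlap(a.lo[axis], a.hi[axis], b.lo[axis], b.hi[axis])) {
            return false;
        }
    }
    return true;
}

// Simplified broad phase: sort boxes by their lower x bound, then sweep.
// A box is only compared with later boxes that start before it ends on x.
std::vector<std::pair<std::size_t, std::size_t>> CandidatePairs(
    const std::vector<Aabb>& boxes) {
    std::vector<std::size_t> order(boxes.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(), [&](std::size_t i, std::size_t j) {
        return boxes[i].lo[0] < boxes[j].lo[0];
    });

    std::vector<std::pair<std::size_t, std::size_t>> pairs;
    for (std::size_t s = 0; s < order.size(); ++s) {
        const Aabb& a = boxes[order[s]];
        for (std::size_t t = s + 1; t < order.size(); ++t) {
            const Aabb& b = boxes[order[t]];
            if (b.lo[0] > a.hi[0]) break;  // no later box can reach a on x
            if (BoxesOverlap(a, b)) {
                pairs.emplace_back(std::min(order[s], order[t]),
                                   std::max(order[s], order[t]));
            }
        }
    }
    return pairs;
}
```

Only the pairs returned by CandidatePairs would be handed to the narrow phase for an exact query.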
10 In case of PQP, the word algorithm is misleading, as it actually is a package
providing different types of proximity queries.

It turns out that most broad-phase collision detection algorithms
make use of temporal coherence and/or information about the behavior
of the moving objects. If the goal is to simulate moving bodies according
to the laws of physics, then valuable information is available in terms of
the velocity and acceleration of each moving object. The problem is that
in path planning applications that only use kinematics, such information
is not available. Furthermore, most path planners tend to have short
periods of strong temporal coherence, i.e., when checking a path between
two samples, followed by bursts of discontinuous jumps, i.e., when checking
random samples. This behavior makes it hard to use many of the proposed
broad-phase algorithms. That, together with the relatively small
size of most considered problems, is probably the reason why most path
planners still use the simple O(n^2) approach.

Object Sets and Distance Engines

For simple binary collision checking on a set of objects, we introduce the
abstract class ObjectSet, see Figure 2.5. Its implementation is very
similar to that of CollisionPair; the main method is IsCollisionFree,
which is also implemented using the Template Method pattern to automate
the computation of statistics.
A more advanced type of object set is one that supports various distance
queries as well. Such an object set is represented by the abstract
class DistEngine, see Figure 2.5. The method MinDist returns the minimum
distance over all active pairs in the set. The concept of an active
pair is left to the subclasses to define; typically the active pairs would
involve checking all moving objects against each other and all moving
objects against all stationary objects.
At this stage we have not put any constraints on how to register a pair
or an object to these sets. We have not even mentioned the name of any
geometric type yet. So these two classes serve to introduce
two useful concepts and to establish an interface for these concepts.
As in the case with the pair-based classes, there are also concrete
distance engine classes for specific algorithms. The concrete classes are
DistEnginePQP and DistEngineGJK, which are both very similar to their
pair-based versions, see Appendix D. Both classes use straightforward
pairwise collision checking, so no broad-phase techniques are implemented
yet. The interface for registering geometric objects for collision detection
is similar; both classes have a method AddPair, but the types of the
arguments differ. In the case of DistEnginePQP, the method is a template,
using the same technique described in Appendix D.1 for the class
DistPairPQP; here reference counting techniques are used to avoid
duplicate PQP_Model objects. In the case of DistEngineGJK, we would like
to use it transparently with any combination of the convex types
described in Appendix C. Therefore, AddPair is overloaded on the possible
combinations, see Figure 2.5.

Figure 2.5: Classes that encapsulate collision detection for a set of
geometric objects.
[Class diagram. ObjectSet: + IsCollisionFree( ), + GetCollidingPair(...),
# DoIsCollisionFree( ); attributes num_calls, time_used; note attached to
IsCollisionFree: ++num_calls; free = DoIsCollisionFree();
time_used += ElapsedTime(); return free;
DistEngine: + IsCollisionFree(tolerance), + MinDist( ),
# DoIsCollisionFree(tolerance), # DoMinDist( ).
DistEnginePQP: AddPair(GeomT1 a, GeomT2 b).
DistEngineGJK: AddPair(GeomConvex a, GeomConvex b),
AddPair(GeomConvexGrp a, GeomConvexGrp b),
AddPair(GeomConvex a, GeomConvexGrp b), SetTrackingMode(…).
PairBasedDistEngine: Add(DistancePair p); association "pairs" to DistancePair.]
If we want to mix different distance computation algorithms, then we
can make use of the DistancePair class hierarchy from the previous section.
The class PairBasedDistEngine also has a method AddPair, but the
argument of this method is instead a reference to a DistancePair object. In
the light of this class, the two previous classes might seem unnecessary,
because this general class can handle several algorithms. It is true that
this class reduces couplings and is good as an abstraction. However, if
users intend to use only one distance algorithm, then they have to pay for
the overhead of a virtual function call for each pair. Although no timing
experiments have been made yet, this cost is probably not negligible,
considering the thousands of distance queries usually made for solving a
path planning problem.
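The trade-off can be seen in a small sketch of a pair-based engine. The code is a guess at the general shape and not the CoPP implementation: the pair interface is reduced to a single Distance query, ownership is handled with std::shared_ptr, IsCollisionFree is answered through the minimum distance, and FixedPair is a hypothetical stand-in used only to make the example self-contained.

```cpp
#include <algorithm>
#include <limits>
#include <memory>
#include <utility>
#include <vector>

// Reduced stand-in for the pair interface (one query only).
class DistancePair {
public:
    virtual ~DistancePair() = default;
    virtual double Distance() = 0;
};

class PairBasedDistEngine {
public:
    void AddPair(std::shared_ptr<DistancePair> p) {
        pairs_.push_back(std::move(p));
    }

    // Template Method as for ObjectSet: count the call, delegate the work.
    bool IsCollisionFree(double tolerance = 0.0) {
        ++num_calls_;
        return DoMinDist() > tolerance;
    }
    double MinDist() { return DoMinDist(); }
    int GetNumCalls() const { return num_calls_; }

private:
    // Minimum over all registered pairs; infinity if no pair is registered.
    double DoMinDist() {
        double best = std::numeric_limits<double>::infinity();
        for (auto& p : pairs_) {
            best = std::min(best, p->Distance());  // one virtual call per pair
        }
        return best;
    }

    std::vector<std::shared_ptr<DistancePair>> pairs_;
    int num_calls_ = 0;
};

// Hypothetical pair that reports a fixed distance; for illustration only.
class FixedPair : public DistancePair {
public:
    explicit FixedPair(double d) : d_(d) {}
    double Distance() override { return d_; }

private:
    double d_;
};
```

Because the engine only sees the abstract interface, pairs backed by different algorithms can be mixed freely; the price is the indirect call inside the loop, which an engine specialized for a single algorithm can avoid.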

2.5 Configuration Space Interpolation

The vertices of a roadmap graph represent collision-free configurations,
and the graph edges represent trajectories between neighboring vertices.
The trajectory represented by an edge is, for holonomic problems,
implicitly given by interpolation of the two configurations that the edge
connects. Thus, an edge is associated with an interpolation method
(footnote 11). It turns out that the interpolation method that should be
used is problem dependent, so simple linear interpolation is not always
appropriate. In this section we will discuss some of the more common
interpolation methods, and see how they can be incorporated into the
framework.

2.5.1 Interpolation of Revolute Joints

Linear interpolation works fine for many cases, but is obviously unsuitable
for revolute joints that do not have any joint limits. The joint angle of
such a joint is naturally associated with the unit circle, denoted by
S^1, and an interval of length 2π, say [−π, +π]. Thus, for joints whose
topology is given by S^1, linear interpolation is clearly inappropriate; it
cannot handle the wrap-around that occurs when the joint angle passes
one of the limits, and therefore problems that require more than one full
rotation cannot be solved. Another problem is that linear interpolation
will not always take the shortest path around the circle.

11 For problems involving dynamic systems, an edge has to be associated with a
system input instead.

Figure 2.6: (a) A maze on the surface of a torus. (b) The same maze,
stretched out on a planar surface. The start and goal positions are also
drawn.
Revolute joints are not the only joints with a topology homeomorphic
to S^1: Consider the maze in Figure 2.6 (a), which is drawn on the surface
of a torus. If the maze is cut loose and stretched out on a planar surface,
it will look like Figure 2.6 (b). The figure also shows the start
and goal positions of a robot moving in the maze. The topology of this
planar maze is such that if the robot crosses one of the borders, it will
suddenly appear on the opposite side of the maze. Joints with this type
of behavior will hereafter be denoted ring joints. This example might
seem to be of only theoretical interest, but there are applications where
it could be useful: Computer games often use this topology to let, e.g.,
space ships cross one side of the screen and appear on the opposite side.
Planning algorithms that are capable of handling this topology could be
used to guide the computer-controlled opponents in the game.
A general ring joint can take values in the interval [θmin, θmax], where
the ends of the interval are declared to be equivalent, forming a closed
loop. When interpolating between two values θ1 and θ2, we want to make
sure that we take the shortest path around the circle. Pseudo-code for
achieving this is shown in Figures 2.7 and 2.8.
RingDiff(θ1, θ2, θmin, θmax)
    δθ = θ2 − θ1;
    if (δθ < −(θmax − θmin)/2) then
        δθ = δθ + (θmax − θmin);
    else if (δθ > (θmax − θmin)/2) then
        δθ = δθ − (θmax − θmin);
    return δθ;

Figure 2.7: A function for computing the “shortest path” difference
between two ring joint configurations θ1 and θ2.

RingInterp(θstart, θend, t, θmin, θmax)
    θt = θstart + t ∗ RingDiff(θstart, θend, θmin, θmax);
    if (θt < θmin) then
        θt = θt + (θmax − θmin);
    else if (θt > θmax) then
        θt = θt − (θmax − θmin);
    return θt;

Figure 2.8: Interpolation between two ring joint values by the fraction
t ∈ [0, 1].
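The two pseudo-code functions translate almost line by line into C++. The version below is a direct transcription for illustration; like the pseudo-code, it assumes that both input values already lie in [θmin, θmax]. The parameter names are chosen here and are not taken from the CoPP source.

```cpp
// Signed "shortest path" difference from th1 to th2 on a ring joint
// whose values live in [th_min, th_max] with the two ends identified.
double RingDiff(double th1, double th2, double th_min, double th_max) {
    const double range = th_max - th_min;
    double d = th2 - th1;
    if (d < -range / 2) {
        d += range;
    } else if (d > range / 2) {
        d -= range;
    }
    return d;
}

// Interpolation by the fraction t in [0, 1] along the shortest path,
// wrapping the result back into [th_min, th_max] if it leaves the interval.
double RingInterp(double th_start, double th_end, double t,
                  double th_min, double th_max) {
    const double range = th_max - th_min;
    double th = th_start + t * RingDiff(th_start, th_end, th_min, th_max);
    if (th < th_min) {
        th += range;
    } else if (th > th_max) {
        th -= range;
    }
    return th;
}
```

For a joint measured in degrees on [−180, 180], going from 170 to −170 gives a difference of +20 rather than −340, so the interpolated values pass through ±180 instead of sweeping back through zero.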

2.5.2 Interpolation of Rigid Body Orientations

Rigid body orientations are represented by the special orthogonal group,
SO(3), but it is not straightforward how we should interpolate between
two rotations R0, R1 ∈ SO(3). A common approach is to parameterize
SO(3) with one of the many Euler angle set conventions and see
interpolation between two orientations as an interpolation on
S^1 × S^1 × S^1. Thus, each Euler angle is seen as an individual revolute
joint and is interpolated according to the algorithm in Figure 2.8.
It turns out, however, that interpolating between two orientations
using Euler angles has several drawbacks: The computer graphics
community has long recognized that interpolation of rotations
using Euler angles results in unnatural and jerky motions of rigid bodies.
Kuffner [70] pointed out that Euler angle interpolation causes the volume
swept out by the rigid body to be unnecessarily large. This is
disadvantageous to path planning as a large swept volume increases the
probability of a collision. Craig [36] showed that there are 24 possible
Euler angle set conventions that we can choose from. Even though a couple
of these angle sets have become more or less standard, the interpolated
motion will depend on which particular angle set is used.
Another way to represent rotations is using unit quaternions. Quaternions
are a generalization of the complex numbers, and they can be used
to represent three-dimensional rotations just as unit complex numbers
can represent rotations in the plane. As the set of all unit quaternions
forms a 4D unit sphere, the problem of smooth interpolation can be seen as
the problem of finding the great-circle arc between two points on the 4D
sphere. The resulting interpolation scheme is called spherical linear
interpolation (SLERP) and is given by Equation (B.10). See Appendix B.2.3
for more material on quaternions.
It can be shown that SLERP corresponds to rotation around a fixed
axis with constant angular velocity. This behavior corresponds to our
intuition about how interpolation between two orientations should behave.
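As a concrete reference, the sketch below implements the textbook SLERP formula, q(t) = [sin((1 − t)Ω) q0 + sin(tΩ) q1] / sin Ω with cos Ω = q0 · q1. It is assumed, but not verified here, that this matches Equation (B.10). Two details are common practice rather than something stated in the text: q1 is negated when the inner product is negative, so that the shorter of the two arcs is used, and the weights fall back to linear ones when the quaternions are nearly identical. The Quat type and function names are hypothetical.

```cpp
#include <cmath>

struct Quat {
    double w, x, y, z;
};

double Dot(const Quat& a, const Quat& b) {
    return a.w * b.w + a.x * b.x + a.y * b.y + a.z * b.z;
}

// Spherical linear interpolation between unit quaternions q0 and q1.
Quat Slerp(const Quat& q0, Quat q1, double t) {
    double c = Dot(q0, q1);
    if (c < 0.0) {
        // q and -q describe the same rotation; pick the shorter arc.
        q1 = {-q1.w, -q1.x, -q1.y, -q1.z};
        c = -c;
    }
    // Linear weights are used as a fallback when the arc is tiny and
    // dividing by sin(omega) would be numerically unsafe.
    double w0 = 1.0 - t;
    double w1 = t;
    if (c < 1.0 - 1e-9) {
        const double omega = std::acos(c);
        const double s = std::sin(omega);
        w0 = std::sin((1.0 - t) * omega) / s;
        w1 = std::sin(t * omega) / s;
    }
    const Quat r{w0 * q0.w + w1 * q1.w, w0 * q0.x + w1 * q1.x,
                 w0 * q0.y + w1 * q1.y, w0 * q0.z + w1 * q1.z};
    // Renormalize to guard against rounding drift.
    const double n = std::sqrt(Dot(r, r));
    return {r.w / n, r.x / n, r.y / n, r.z / n};
}
```

Interpolating halfway between the identity and a 90 degree rotation about the z-axis yields a 45 degree rotation about the same axis, which is the fixed-axis, constant-velocity behavior described above.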
Figure 2.9 shows some examples illustrating the differences between
Euler angle interpolation and SLERP. The H-shape is constrained to
move on the surface of a sphere. This can be seen as if the shape is rigidly
attached to a spherical joint at the center of the sphere. The left column
shows interpolation between two orientations using Euler angles. The
right column shows interpolation between the same orientations using
SLERP. From these examples we see that the motions resulting from
Euler angle interpolation are less intuitive and in general result in larger
swept volumes. It is also seen that SLERP moves the shape the same
amount in each step, which is due to its property of constant rotational
velocity.

2.5.3 Car-Like Robots

Car-like robots involve nonholonomic constraints that at each configuration
limit the available velocities. If we neglect the dynamics, the robot
configuration can be described by three variables, (x, y, θ). For car-like
robots, Reeds and Shepp [123] showed that the shortest path between
two configurations consists of at most five segments. The segments are
either straight lines or circular arcs with a radius equal to the minimum
turning radius of the robot. Furthermore, the optimal path contains at most
two cusps, corresponding to points where the robot changes from forward
motion to backward motion, or vice versa. The possible combinations of
segments constitute a family of 48 curves, known as Reeds and Shepp
curves.

Figure 2.9: The H-shaped object is constrained to move on the surface of
the sphere. The left column shows interpolation between two orientations
using Euler angles. The right column shows interpolation between the
same orientations using SLERP.
This optimal path is of course only possible in the absence of obstacles,
but the Reeds and Shepp curves can still be used for solving path planning
problems involving car-like robots. In a PRM approach, we could let
the graph edges be defined by the optimal Reeds and Shepp curve for
each pair of neighboring vertices. This approach shows the need for
an interpolation method that interpolates along the optimal Reeds and
Shepp curve between two configurations. For path planning with car-like
robots, see, e.g., Bicchi et al. [15], Švestka and Overmars [147], Vendittelli
et al. [152], Song and Amato [141].

2.5.4 Interpolation Objects

To separate path planners from the particularities of interpolation, the
framework must provide an abstraction for interpolation. Such an ab-
straction can be achieved either with function pointers to various inter-
polation methods, or with interpolation objects. Here we have chosen the
latter approach because of the added flexibility: Interpolation objects can
contain state variables that affect the interpolation. This is not possible
with function pointers.
Figure 2.10 shows a class diagram for different interpolation objects.
In addition to the Interpolate method, the base class also provides a
Clone method so that interpolation objects can always be copied, even
if the exact type of the object is unknown. The ability to clone
objects this way has been shown to simplify the usage of the framework in
terms of ownership issues and memory handling. This pattern is therefore
used for all lightweight objects that are meant to be used as configuration
arguments for a planner.
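A minimal sketch of this arrangement is given below. The method names Interpolate and Clone follow Figure 2.10, but everything else is an assumption made for illustration: the Config alias, the use of std::unique_ptr as the return type of Clone, and the LocalPlanner class, which only demonstrates how a client can keep its own copy of an interpolation object without knowing its concrete type.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

using Config = std::vector<double>;

class Interpolator {
public:
    virtual ~Interpolator() = default;
    // Writes the configuration at fraction t in [0, 1] between q1 and q2 to qt.
    virtual void Interpolate(const Config& q1, const Config& q2, double t,
                             Config& qt) const = 0;
    // Polymorphic copy: callers get an object of the same dynamic type.
    virtual std::unique_ptr<Interpolator> Clone() const = 0;
};

class LinearInterp : public Interpolator {
public:
    void Interpolate(const Config& q1, const Config& q2, double t,
                     Config& qt) const override {
        qt.resize(q1.size());
        for (std::size_t i = 0; i < q1.size(); ++i) {
            qt[i] = q1[i] + t * (q2[i] - q1[i]);
        }
    }
    std::unique_ptr<Interpolator> Clone() const override {
        return std::make_unique<LinearInterp>(*this);
    }
};

// Hypothetical client: stores a private clone, so the caller's object may
// go out of scope or be reconfigured without affecting the planner.
class LocalPlanner {
public:
    explicit LocalPlanner(const Interpolator& interp)
        : interp_(interp.Clone()) {}

    Config Midpoint(const Config& a, const Config& b) const {
        Config q;
        interp_->Interpolate(a, b, 0.5, q);
        return q;
    }

private:
    std::unique_ptr<Interpolator> interp_;
};
```

Because the planner owns its clone, there is no question of who deletes the interpolation object, which is the ownership simplification the text refers to. An interpolation object with state, such as per-joint limits, is copied together with that state.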
The QuatInterp class uses spherical linear interpolation [131] between
two quaternions, as described by Equation (B.10). Linear interpolation
is used for any additional degrees of freedom. The RingInterp class is
used for problems where some degrees of freedom are ring joints, i.e.,
homeomorphic to S^1. It uses the algorithms shown in Figures 2.7 and 2.8
to interpolate between the ring joints, and linear interpolation for the
remaining joints.
To be able to compare the tradeoffs involved in using SLERP instead
of Euler angle interpolation, CoPP also provides a class EulerInterp.
This class is actually a special case of the RingInterp class and could
have been implemented using inheritance. As the ranges of the rotational
joints are known in advance, we instead chose a more efficient,
specialized implementation rather than reusing that of RingInterp.

Figure 2.10: Interpolation classes.
[Class diagram. Interpolator: Interpolate(q1, q2, t, qt), Clone( ).
Concrete classes: LinearInterp, EulerInterp, QuatInterp, and RingInterp
(attributes min_vals, max_vals, ring_joints).]
The interpolation objects are mostly used inside local planners that
try to connect two configurations with a simple path, see Figure 2.13.
The actual path is determined by the interpolation object. Interpolation
objects are also used together with the Path class, where they are used
to compute intermediate configurations for a sequence of via-points, see
Appendix E.1.

2.6 Metrics
The definition of a metric is of fundamental importance to any
sampling-based path planner, but it is far from clear which metric one
should choose. Using the straightforward Euclidean distance measure is in
most cases a poor choice, because the topology of the configuration space
is often very different from that of Euclidean space. In this section we
will present some of the most commonly used metrics for path planning.
A metric space is a topological space X together with a real-valued
function ρ : X × X → R (called a metric) such that, for every x, y, z ∈ X,

\[ \rho(x, y) \geq 0, \tag{2.1} \]
\[ \rho(x, y) = 0 \Leftrightarrow x = y, \tag{2.2} \]
\[ \rho(x, y) = \rho(y, x), \tag{2.3} \]
\[ \rho(x, y) + \rho(y, z) \geq \rho(x, z). \tag{2.4} \]

Thus, a metric ρ should be nonnegative, be symmetric, and be zero if and
only if the two points are equal. The last equation states that a metric must
also fulfill the triangle inequality. As an example, if x = (x_1, x_2, ..., x_n)
and y = (y_1, y_2, ..., y_n) are two points of R^n, then the metric

\[ \rho(x, y) = \left[ (x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2 \right]^{1/2} \tag{2.5} \]

is the well-known Euclidean distance in R^n.

For nonholonomic path planning it is common to use pseudo-metrics,
i.e., functions that do not fulfill all of the metric properties [79]. As
these pseudo-metrics try to estimate the cost-to-go, they often fail to
satisfy the symmetry property. As an example, consider planning for a
sailboat: Due to the wind and other factors, the cost of travelling a
distance in one direction will not be equal to the cost of going back in
the opposite direction. In this section we will not discuss pseudo-metrics,
but we note that they are useful for nonholonomic planning.
As will be shown below, several metrics have been suggested and tested
in the context of path planning, but still it is far from clear which metric
to choose. Most important is of course that the metric reflects the topology
of the configuration space. Still, many metrics involve the choice of one or
more weights, whose value can affect the planner performance. Another
important guideline is to consider the purpose of the operation involving
the metric: When moving from one configuration to another, one must
make sure that no collisions occur on the path between them. In this case
one would like to use a metric that relates to the maximum displacement
of any point on the robot. If this displacement is smaller than the distance
to the nearest obstacle, then the path is clear. In PRMs, distance metrics
are used to determine which nodes one should try to connect to, using
a local planner. Nodes that are close should be more likely candidates
than those that are far away. In this case, the ideal metric should be the
optimal cost-to-go, the cost for moving along the optimal path between
the two given configurations. However, finding this cost is at least as
difficult as solving the original problem. Therefore simpler metrics must
be used that hopefully reflect the true cost-to-go. Thus, the metric is
used as a heuristic for guiding the path planner.

2.6.1 Useful Metrics for Path Planning

Manhattan Metrics One of the simplest configuration space metrics
is the so-called Manhattan distance or L_1-distance, which is defined as

\[ \rho(x, y) = \sum_{i=1}^{n} |x_i - y_i|, \tag{2.6} \]

where n is the dimension of the configuration space. This metric has been
widely used, especially in path planners using a discretized representation
of the configuration space, see, e.g., Autere and Lehtinen [10] and Van
Geem [151]. The reason for its popularity is partly its simplicity, but
also that it is better than the Euclidean distance in the case of
grid-based planners. This is due to the fact that grid-based planners rarely
allow diagonal motions, causing the Manhattan distance to be a better
estimate of the distance. Another advantage of the Manhattan metric
is its low cost, compared to, e.g., the Euclidean distance that requires a
square root. As the metric is computed many thousands of times, the
cost of computing the metric can be an important factor.
Studying a serial manipulator, it is seen that moving one joint will
move all links subsequent to that joint. Thus, moving a joint closer to
the base of the robot will in most cases cause a larger gross motion of the
robot than moving a joint closer to the end-effector. The more links that
are in motion, the higher the probability for a collision. This observation
suggests that the cost for moving a joint should be higher the closer to the
base it is. The simplest way to achieve this is to extend Equation (2.6)
to a weighted Manhattan distance:

\[ \rho(x, y) = \sum_{i=1}^{n} c_i |x_i - y_i|, \quad c_i > 0, \tag{2.7} \]

where c_i is the cost associated with joint i. An example of a path planner
using the weighted Manhattan distance is that proposed by Isto [60]. The
planner used several heuristics to guide its search, where each heuristic
had its own set of weights for the Manhattan distance.
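Equation (2.7) is a one-line loop in code. The sketch below is a plain illustration; the function name and the use of std::vector for configurations and weights are choices made here, and the three vectors are assumed to have the same length.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Weighted Manhattan distance of Equation (2.7): sum_i c_i * |x_i - y_i|.
// x, y, and c are assumed to have equal length, with all c_i > 0.
double WeightedManhattan(const std::vector<double>& x,
                         const std::vector<double>& y,
                         const std::vector<double>& c) {
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum += c[i] * std::fabs(x[i] - y[i]);
    }
    return sum;
}
```

With weights (3, 2, 1) for a three-joint arm, a unit motion of the base joint costs 3 while the same motion of the last joint costs 1, which expresses the idea that motions near the base should be penalized more. Setting all weights to 1 recovers Equation (2.6).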

Rigid Body Metrics Amato et al. [6] performed an extensive study
on the effect the choice of metric had on the performance of PRMs in the
context of rigid body problems. They found that the weighted Euclidean
distance had the best efficiency on the tested problems. They also found
that the relative importance of the translational distance between two
configurations increased as the environment became more cluttered. A
weighted Euclidean metric is, however, not the most appropriate for rigid
body problems, as it does not capture the topology of the configuration
space. It would be more natural to use a metric that is based on SE(3),
the group of rigid body transformations. The problem is that the defini-
tion of a metric on SE(3) must involve the choice of a length scale, see,
e.g., Murray et al. [107], which affects how translational motion and ro-
tational motion are weighted in the metric. As discussed in Section 2.5.2,
interpolation between two orientations using SLERP is equivalent to mov-
ing along a great-circle arc on a 4D sphere. The end-points of the arc
are equivalent to the unit quaternions that correspond to the two ori-
entations. As the travelled distance, the arc length, is proportional to
the angle between the two quaternions, a metric for rotations could be
based on the inner product of the quaternions, see Equation (B.12). As a
metric for SE(3) we could then use the weighted sum of the translational
distance and the quaternion distance.
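The idea can be sketched as follows. Equation (B.12) is not reproduced in this section, so the sketch assumes one common choice: the rotational distance is the angle arccos(|q1 · q2|) between the two unit quaternions, where the absolute value accounts for q and -q representing the same orientation. The type names, the Euclidean translational part, and the single rotational weight are likewise assumptions made for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

using Quat = std::array<double, 4>;  // unit quaternion (w, x, y, z)
using Vec3 = std::array<double, 3>;

// Angle between two unit quaternions on the 4D unit sphere, in [0, pi/2].
// The absolute value treats q and -q as the same orientation.
double QuatAngle(const Quat& a, const Quat& b)
{
    double dot = a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
    dot = std::min(1.0, std::fabs(dot));  // guard against round-off
    return std::acos(dot);
}

// Weighted sum of the translational (Euclidean) and rotational distance.
// rot_weight plays the role of the length scale discussed in the text.
double RigidBodyDistance(const Vec3& p1, const Quat& q1,
                         const Vec3& p2, const Quat& q2,
                         double rot_weight)
{
    const double dx = p1[0] - p2[0], dy = p1[1] - p2[1], dz = p1[2] - p2[2];
    return std::sqrt(dx * dx + dy * dy + dz * dz)
         + rot_weight * QuatAngle(q1, q2);
}
```

The quaternion angle is half the rotation angle between the two orientations, so a factor of two can be absorbed into rot_weight if the rotation angle itself is preferred.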

Displacement Metrics Because a robot consists of one or more moving
rigid bodies, other approaches to defining a metric could be based on
the distance between point sets in $\mathbb{R}^3$. A common metric of this type is
the maximum distance any point on the robot has displaced between two
configurations [77]:

\[ \rho(x, y) = \max_{a \in A} \| a(x) - a(y) \|, \tag{2.8} \]

where A denotes the robot. Because this metric is expensive to compute,
one often approximates the robot links with their bounding boxes instead.
Then the maximum displacement of the vertices of the bounding boxes gives
an upper bound for the distance given by Equation (2.8). This metric
is particularly useful when verifying whether a straight-line path between
two configurations is collision free, see, e.g., Baginski [11]. Its usefulness
stems from the fact that in most cases it gives a conservative estimate of
the movement between two configurations.
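The bounding-box approximation can be sketched for a single link as follows. The Pose struct, the function names, and the representation of a link pose as a rotation matrix plus translation are assumptions for illustration; for a robot with several links one would take the maximum over all links.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Rigid transform: rotation matrix R (row-major) and translation t.
struct Pose {
    double R[3][3];
    double t[3];
};

// Transforms point p, given in the link frame, by the pose T.
Vec3 Apply(const Pose& T, const Vec3& p)
{
    Vec3 r;
    for (int i = 0; i < 3; ++i)
        r[i] = T.R[i][0] * p[0] + T.R[i][1] * p[1] + T.R[i][2] * p[2] + T.t[i];
    return r;
}

// Largest displacement of the eight corners of the box [lo, hi]
// (given in the link frame) when the link moves from pose A to pose B.
double MaxBoxDisplacement(const Vec3& lo, const Vec3& hi,
                          const Pose& A, const Pose& B)
{
    double max_d = 0.0;
    for (int k = 0; k < 8; ++k) {
        Vec3 corner;
        for (int i = 0; i < 3; ++i)
            corner[i] = ((k >> i) & 1) ? hi[i] : lo[i];
        const Vec3 a = Apply(A, corner), b = Apply(B, corner);
        const double dx = a[0] - b[0], dy = a[1] - b[1], dz = a[2] - b[2];
        max_d = std::max(max_d, std::sqrt(dx * dx + dy * dy + dz * dz));
    }
    return max_d;
}
```

For two fixed poses the displacement of a point is an affine function of the point, so its norm attains its maximum over the box at a corner; this is why checking the eight vertices is sufficient for the box.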

2.6.2 Implemented Classes for Metrics

CoPP provides several predefined metric classes. They all inherit from
the abstract class Metric, which defines the pure virtual functions
Distance and Clone. Figure 2.11 shows the base class Metric together
with the most commonly used metric classes in CoPP.
The class BoundingBoxDispl is a bounding-box displacement metric:
For a set of geometric objects, either the maximal displacement, or the
summed displacement between two configurations is computed. The abil-
ity to add and remove geometric objects is useful for problems where the
number of moving objects is not constant. A typical example is a pick-
and-place task, where the grasped object suddenly moves along with the
robot. Note that BoundingBoxDispl is a template, with the type of the
moving system as template parameter. To conform with the template,
the moving system must have a function Move, as shown in Figure 2.11.
If the template is used together with the robot class described in Chapter 3,
the generated metric objects will have a reference to a robot; each
call to Distance will move the robot to the two configurations and find
the maximum (or summed) bounding box displacement. This metric has
proven very useful when checking whether a path segment is collision
free or not: when the displacement between two successive samples
on the segment is below a user-defined threshold, it is assumed that
no collisions occur between these two samples.
The class RingMetric is used for problems where one or more de-
grees of freedom are ring joints, described in Section 2.5.1. Such degrees
of freedom arise, for example, in two-dimensional rigid body problems,
where the rotation of the body is not limited. A ring joint can take values
in $[\theta_{\min}, \theta_{\max}]$, and the “shortest path” difference between the
values $\theta_1$ and $\theta_2$, defined by the function in Figure 2.7, is here used
to define a metric for the ring joints. For a single ring joint, we define
the distance as:

\[ \rho(\theta_1, \theta_2) = |\mathrm{RingDiff}(\theta_1, \theta_2, \theta_{\min}, \theta_{\max})|. \tag{2.9} \]

The metric RingMetric is a weighted Manhattan metric, where the contribution
from the ring joints is computed according to Equation (2.9).
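Figure 2.7 is not reproduced here, so the following is only a plausible stand-in for RingDiff, assuming that it returns the signed shortest difference on a ring with period θmax − θmin. The function bodies and the helper RingDistance are illustrative, not the thesis code.

```cpp
#include <cmath>

// Signed shortest difference from a to b on a ring with period (max - min).
// Hypothetical stand-in for the RingDiff function of Figure 2.7.
double RingDiff(double a, double b, double min, double max)
{
    const double period = max - min;
    double d = std::fmod(b - a, period);    // d lies in (-period, period)
    if (d > 0.5 * period)  d -= period;     // wrap the long way round
    if (d < -0.5 * period) d += period;
    return d;
}

// Distance for a single ring joint, Equation (2.9).
double RingDistance(double a, double b, double min, double max)
{
    return std::fabs(RingDiff(a, b, min, max));
}
```

With a joint range of [0, 360] degrees, the values 350 and 10 are then 20 apart rather than 340.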
The class QuatDist is a metric for rigid body problems. For the
translational part, the Manhattan distance is used. For the rotational
part, the distance is related to the inner product of two quaternions,
see Equation (B.12). The total distance is a weighted sum of these two
contributions. It could be argued that it would be more appropriate to
use the Euclidean distance for the translational part, but for the prob-
lems tested so far, involving many thousands of distance computations,
the speed of the Manhattan metric has had a larger effect on the total
planning time than the exactness of the Euclidean metric.
[Figure 2.11, class diagram. Abstract base class Metric: Distance(q1, q2),
Clone( ). Derived classes: Euclidean; WeightedManhattan (weights);
BoundingBoxDispl (AddGeom(geom), RemoveGeom(geom), use_summed_displ;
holds geoms of type Geom and an agent of template type AgentT, which
provides Move(config)); QuatMetric (rot_weight); RingMetric (min_vals,
max_vals, ring_joints, weights).]

Figure 2.11: Class diagram for the metric classes.

2.7 Configuration Space Sampling

For sampling-based planning methods, an important issue is how the samples
are generated. The most common approach is to draw samples with
a uniform probability over the entire configuration space. For problems
involving narrow passages, a better strategy is to bias the distribution
to the narrow passages. The next section presents several techniques for
doing this. For planning problems involving constraints, uniform ran-
dom sampling is not effective because most samples fail to satisfy the
constraint. Section 2.7.2 presents some approaches where random sam-
ples are generated on the constraint surface. For rigid body problems,
care must be taken not to introduce unwanted bias in the generated ro-
tations. This issue is discussed in Section 2.7.3. Section 2.7.4 mentions
deterministic sampling and Section 2.7.5 presents classes for various sam-
pling strategies.

2.7.1 Narrow Passage Sampling

It was early noted that planners using uniform sampling will get into trou-
ble as soon as the solution requires passing through a narrow passage in
the configuration space [64]; because such passages occupy a very small
subset of the configuration space, the probability of randomly guessing
a configuration in the passage is prohibitively small. Sampling with a
uniform distribution will essentially require covering all of $C_{\mathrm{free}}$ before
finding the necessary configurations in the narrow passage. To overcome this
problem, several sampling strategies have been proposed with the goal
of increasing the likelihood of sampling ’difficult’ areas of the configuration
space.
Boor et al. [20] proposed the Gaussian sampling strategy to concen-
trate samples near the boundaries of the configuration space obstacles.
Two samples are drawn each time such that the distance between them is
a stochastic variable with a normal distribution. The standard deviation
of the distribution is a parameter of the algorithm; a smaller standard de-
viation will generate configurations that are closer to the obstacle bound-
aries. The important step in the algorithm is to return a collision free
sample only if the other sample is not collision free. Thus, if both samples
are collision free, they are rejected. In [20] the method was successfully
applied to two-dimensional problems involving a moving polygon. How-
ever, it is not clear how to extend this method to robots with kinematic
chains because the distance metric and the standard deviation for the
distribution become difficult to choose. Too small a standard deviation
will cause the strategy to throw away most of the generated configuration
pairs due to both configurations being collision free or both causing a
collision. Too large a standard deviation, on the other hand, will just turn
the strategy into a more expensive version of the uniform distribution
sampler.
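One attempt of the Gaussian strategy, as described above, can be sketched as follows. The function name, the use of a callback for the collision test, and the way the second sample is placed (a uniformly random direction and a normally distributed signed distance) are assumptions made for illustration; the second sample is not clamped to the configuration space limits.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

using Config = std::vector<double>;
using FreeTest = std::function<bool(const Config&)>;

// One attempt of the Gaussian strategy. Returns true and fills 'sample'
// only if exactly one configuration of the drawn pair is collision free.
bool GaussianSample(const Config& lo, const Config& hi, double sigma,
                    const FreeTest& is_free, std::mt19937& rng, Config& sample)
{
    const std::size_t n = lo.size();
    std::normal_distribution<double> gauss(0.0, 1.0);
    Config c1(n), c2(n), dir(n);
    double norm = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        std::uniform_real_distribution<double> uni(lo[i], hi[i]);
        c1[i] = uni(rng);          // first sample: uniform in the box
        dir[i] = gauss(rng);       // Gaussian vector gives a uniform direction
        norm += dir[i] * dir[i];
    }
    norm = std::sqrt(norm);
    if (norm == 0.0) return false;
    const double d = sigma * gauss(rng);   // normally distributed distance
    for (std::size_t i = 0; i < n; ++i)
        c2[i] = c1[i] + d * dir[i] / norm;

    const bool free1 = is_free(c1), free2 = is_free(c2);
    if (free1 == free2) return false;      // both free or both colliding
    sample = free1 ? c1 : c2;              // keep the collision free one
    return true;
}
```

A caller would simply repeat the attempt until enough samples have been accepted; the rejection rate then shows directly how the choice of sigma affects the cost of the strategy.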
Amato et al. [5] propose that, for samples that are not collision free,
binary search along random directions can be used for quickly
generating configurations near the boundary of $C_{\mathrm{obs}}$. Thus, colliding
configurations act as seeds for configurations near the C-space obstacles.
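The bisection step of such an obstacle-based strategy can be sketched as below. The sketch assumes that a collision free configuration along the chosen direction has already been found (for example by stepping outwards from the colliding seed); the function name and the fixed iteration count are illustrative assumptions.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Config = std::vector<double>;

// Bisects the straight line between a colliding and a collision free
// configuration. After 'iterations' halvings, the returned configuration
// is collision free and lies close to the obstacle boundary.
Config BoundarySearch(Config colliding, Config free_cfg, int iterations,
                      const std::function<bool(const Config&)>& is_free)
{
    for (int k = 0; k < iterations; ++k) {
        Config mid(colliding.size());
        for (std::size_t i = 0; i < mid.size(); ++i)
            mid[i] = 0.5 * (colliding[i] + free_cfg[i]);
        if (is_free(mid))
            free_cfg = mid;     // boundary lies between colliding and mid
        else
            colliding = mid;    // boundary lies between mid and free_cfg
    }
    return free_cfg;
}
```

Each iteration costs one collision check and halves the remaining interval, so about twenty checks place the sample within roughly a millionth of the initial separation from the boundary.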
The medial axis of the free configuration space, also referred to as
the generalized Voronoi diagram, is a useful tool in path planning. This
stems from the fact that it has a lower dimension than the free space
but is still a complete representation for path planning purposes. Furthermore,
the medial axis represents paths with maximal clearance to
the obstacles. However, generating the medial axis for the free space
is an expensive operation, making it intractable for all but the simplest
problems. But Wilmarth et al. [155] show a relatively simple method for
projecting random configurations onto the medial axis without having an
explicit representation of it. The projection method is also applied to con-
figurations that are inside configuration space obstacles. It was shown
that this extension greatly increases the number of samples in narrow
passages if the surrounding obstacles are ’thick’. An example was shown
for a single rigid body moving through a maze in 3D, but extending the
method to kinematic chains seems to be difficult.
Holleman and Kavraki [56] also proposed a projection method, but
they use the medial axis of the workspace instead. Their method can
be extended to kinematic chains by assigning handle points to each link,
which is unsatisfying as long as these have to be chosen manually.
Yang and LaValle [157] proposed a randomized perturbation scheme for
enhancing samples which can be seen as an approximation of the sampling
onto the medial axis. The method works well for kinematic chains.
Kazemi and Mehrandezh [65] used the theory of potential flow and
harmonic functions to bias the sampling towards the narrow passages.
They noted that regions of high fluid velocity often corresponded to nar-
row passages in the robot’s configuration space. Thus, by biasing the
sampling distribution toward high velocity regions, they achieved bet-
ter coverage of the narrow regions. In a similar approach, Aarno et
al. [1] used a harmonic potential in W to bias the sampling; the potential
reaches high levels near obstacle boundaries. As the dimension of W is at
most 3, the cost of computing the potential in W is usually much lower
than computing a potential in C. The drawback is that there need not
be a direct correspondence between the narrow passages in W and those
in C.

2.7.2 Constraint Based Sampling

Path planning problems may involve other constraints in addition to the
collision-free constraint. For such problems, drawing samples with a uni-
form distribution could be inefficient if few samples satisfy the constraint.
A better approach would be to use a constraint-based sampling method
that only generates samples that satisfy the constraint.
For robots with closed-loop kinematic chains, there is a closure
constraint that all configurations must satisfy: any loops must
remain closed. The chance of generating a configuration at random that
satisfies the closure constraints is very small, so other methods are needed.
In the method of Cortés et al. [35], loops in a kinematic chain are cut.
Each cut defines an active chain and a passive chain. Random sampling
and forward kinematics are used for the active chains. The passive chains
are then forced to close the open loops using inverse kinematics. To be
really effective, this method assumes that closed-form inverse kinematics
exist for the passive chains.
Oriolo et al. [113] considered the problem of a redundant robot that
has to move among obstacles along a given end-effector path. As the
robot is constrained to follow the end-effector path, it has to use its re-
dundancy to avoid the obstacles. The sampling method in [113] generated
samples that satisfied the end-effector constraint.

2.7.3 Uniformly Distributed Rotations

Euler angles are often used to parameterize the rotation of rigid bodies.
Kuffner [70] pointed out that drawing each Euler angle from a uniform
distribution will not generate uniformly distributed orientations; if the
orientations are visualized on a 3D sphere, as in Figure 2.9, the distri-
bution will have a clear bias towards the poles of the sphere. This bias
can have a negative effect on path planners, as excessive sampling might
be needed to achieve a crucial orientation. Thus, special care must be
taken to generate orientations that are truly uniformly distributed. The
same care must be taken if the rotations are represented by quaternions.
Shoemake [132] presented a simple algorithm for generating uniformly
distributed quaternions.
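A sketch of this kind of generator is given below, following the well-known construction from three uniform random numbers. The function name, the quaternion type, and the use of std::mt19937 are assumptions for illustration; the order in which the four components are assigned does not affect uniformity.

```cpp
#include <array>
#include <cmath>
#include <random>

using Quat = std::array<double, 4>;

// Uniformly distributed unit quaternion from three uniform numbers in [0, 1).
Quat UniformQuat(std::mt19937& rng)
{
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    const double two_pi = 2.0 * std::acos(-1.0);
    const double u1 = uni(rng), u2 = uni(rng), u3 = uni(rng);
    const double a = std::sqrt(1.0 - u1);   // a^2 + b^2 = 1, so |q| = 1
    const double b = std::sqrt(u1);
    return Quat{{a * std::sin(two_pi * u2), a * std::cos(two_pi * u2),
                 b * std::sin(two_pi * u3), b * std::cos(two_pi * u3)}};
}
```

Because the result has unit length by construction, no normalization step is needed.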

2.7.4 Deterministic Sampling

Recently, the need for random sampling has been questioned, see, e.g.,
Branicky et al. [23]. First of all, unless some physical process is used, the
random numbers generated by a computer are actually pseudo-random.
That is, they are generated by a deterministic algorithm. As these numbers
are in fact deterministic, it should be possible to use other
deterministic sequences that are better suited. Pseudo-random number
generators are designed to meet performance criteria that are based on
uniform probability densities. These criteria might not be the best for
sampling-based path planning.
In [23] good results were obtained with so-called Hammersley points
and Halton points. These are deterministic point sets that are specifically
designed to have properties such as low dispersion and low discrepancy.

2.7.5 Sampling Strategy Classes

Clearly, there are many suggestions for how to sample the configuration
space, each with its own merits and drawbacks. Despite the many vari-
ations, they all do the same thing, namely produce configuration space
samples. This observation, combined with the fact that configuration
space sampling is an essential part of many planners, suggests that the
sampling strategy should be encapsulated in an object. This is an ex-
ample of the Strategy pattern described in [45], where the interface to a
family of related algorithms is defined in an abstract base class. Letting
all sampling strategies inherit from an abstract class has several advan-
tages: First, changing sampling strategy is as simple as changing one line
of code, instantiating a different strategy object, which makes compari-
son of strategies and testing new ones extremely easy. Second, advanced
planners might adapt to the problem at hand by switching strategies at
runtime.
The class ConfigSpaceSampler is the base class for sampling strate-
gies. Clients ask for new samples through the function GetSample and
they should ideally not know or depend on how this sample was pro-
duced; it could have been drawn from a uniform distribution or it could
have been produced using a more elaborate scheme. In fact, samples
need not be random at all. They might as well be deterministic! Be-
cause some sampling approaches might fail to produce a new sample, a
boolean return value from GetSample is used to indicate if a new sample
was produced or not.
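The interface described above might look roughly as follows. Only the names ConfigSpaceSampler, GetSample, Clone, and UniformSampler come from the text; the exact signatures, the smart-pointer return type of Clone, the Config type, and the ListSampler class are assumptions of this C++11 sketch.

```cpp
#include <cstddef>
#include <memory>
#include <random>
#include <utility>
#include <vector>

using Config = std::vector<double>;

// Abstract strategy: clients only see GetSample and Clone.
class ConfigSpaceSampler {
public:
    virtual ~ConfigSpaceSampler() = default;
    // Returns false if no new sample could be produced.
    virtual bool GetSample(Config& sample) = 0;
    virtual std::unique_ptr<ConfigSpaceSampler> Clone() const = 0;
};

// Uniform sampling in the box [min_vals, min_vals + range_vals].
class UniformSampler : public ConfigSpaceSampler {
public:
    UniformSampler(const Config& min_vals, const Config& range_vals,
                   unsigned seed = 0)
        : min_vals_(min_vals), range_vals_(range_vals), rng_(seed) {}
    bool GetSample(Config& sample) override {
        std::uniform_real_distribution<double> uni(0.0, 1.0);
        sample.resize(min_vals_.size());
        for (std::size_t i = 0; i < sample.size(); ++i)
            sample[i] = min_vals_[i] + range_vals_[i] * uni(rng_);
        return true;    // a uniform sampler never runs out of samples
    }
    std::unique_ptr<ConfigSpaceSampler> Clone() const override {
        return std::unique_ptr<ConfigSpaceSampler>(new UniformSampler(*this));
    }
private:
    Config min_vals_, range_vals_;
    std::mt19937 rng_;
};

// Hypothetical finite, deterministic sampler that hands out stored samples
// and reports failure through the return value when they are used up.
class ListSampler : public ConfigSpaceSampler {
public:
    explicit ListSampler(std::vector<Config> samples)
        : samples_(std::move(samples)), next_(0) {}
    bool GetSample(Config& sample) override {
        if (next_ >= samples_.size()) return false;
        sample = samples_[next_++];
        return true;
    }
    std::unique_ptr<ConfigSpaceSampler> Clone() const override {
        return std::unique_ptr<ConfigSpaceSampler>(new ListSampler(*this));
    }
private:
    std::vector<Config> samples_;
    std::size_t next_;
};
```

A planner that holds only a pointer to ConfigSpaceSampler can then be switched from random to deterministic sampling by constructing a different object, without any change to the planner itself.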
Even the simplest sampling strategy will need some information about
the configuration space it is supposed to sample. In most cases, it is
enough to know the lower and upper limits of the configuration space.
More elaborate strategies, like obstacle based sampling [5], will need more
information. Because the base class interface does not allow such infor-
mation to be passed to a sampling strategy, all the needed information
has to be provided at the creation of a sampling strategy. So the price to
pay for a simple and uniform query interface is that the creation becomes
non-uniform and that sampling strategies sometimes share objects with
the planner.
As seen in Figure 2.12, there are a number of sampling strategies in
CoPP, of which UniformSampler hardly needs any detailed exposition.
The class ObstBasedSampler implements the obstacle based sampling
strategy presented in [5]. This is an example of a strategy that has to
share a lot of information, i.e., the robot and the distance engine, with
the planner. For rigid body problems, the class UniformQuats should
be used. The translational part is drawn with a uniform distribution
from a box in $\mathbb{R}^n$, while the rotational part is generated from uniformly
distributed quaternions, as described in [132].

[Figure 2.12, class diagram. Abstract base class ConfigSpaceSampler:
GetSample(sample), Clone( ). Derived classes: UniformSampler (min_vals,
range_vals); HaltonSampler (min_vals, range_vals, counter);
HammersleySampler (min_vals, range_vals, counter, max_samples);
UniformQuats (min_vals, range_vals); ObstBasedSampler
(AllowCollFreeSamples(...), NumDrawnSamples( ), NumCollidingSamples( ),
SetNumDirections(n); holds a robot of type Robot, with Move( ), and a
dist_engine of type DistEngine, with IsCollisionFree( ) and MinDist( )).]

Figure 2.12: Classes that encapsulate different sampling strategies.

The HaltonSampler and the HammersleySampler both use deterministic
point sets. To keep track of which point to generate next, both
strategies use an internal counter that keeps track of how many times
GetSample has been called. Note that the Hammersley point set is finite;
its size is a user-specified parameter. This class is an example where the
return value of GetSample has to be used to indicate that the sampling
strategy has run out of samples.

2.8 Local Planners

The concept of local planners is important to planning methods like PRM
and its relatives. In PRM-methods, local planners are used to find con-
nections between pairs of nearby configurations, and in RRTs they are
used to extend one state towards another. As the number of attempted
connections can exceed several thousands, the overall efficiency is closely
related to what local planner is used.
An elaborate local planner that is able to “turn around corners” could
find more connections than a simple one that only moves along a straight
line. However, because it uses more time per query, it is not clear whether
the more elaborate planner would be more efficient than the simple one.
Thus, there is an important tradeoff for local planners between success
rate and average query time. There is a clear trend towards very simple
local planners that only check whether a given path segment (often determined
by an interpolation method) is collision free or not. The low
success rate of such simple planners is compensated by their simplicity
and efficiency; in case the given path segment is not collision free they re-
turn very quickly, allowing a large number of path segments to be tested
in short time.
Other local planner methods could be based on potential fields or on
some heuristic rule, such as the rotate-at-s planner proposed by Amato
et al. [6]. The rotate-at-s planner is a family of planners, parameterized
by the parameter 0 ≤ s ≤ 1.

2.8.1 Checking Path Segments

The simplest local planner just tries to verify whether a path segment between
two given configurations is collision free or not. The path segment
is often implicitly given in that it is determined by the two configurations
and a provided interpolation method. Even though this planner is conceptually
simple, the problem it tries to solve is by no means a simple
one: efficiently determining whether a given path is collision free is an
extremely difficult problem. An exact method that never fails to detect
a collision along a path will take too long to be practical. Therefore
approximate methods are used, at the risk of missing a collision.
Difficulties arise because collision detection algorithms only check
whether a particular configuration is collision free or not, when the problem
is to check a whole range of motion. In practice, the most common
method to verify collision free path segments is to do static collision detection
at a finite number of points along each segment. The question is,
how densely do we have to sample a given segment? To keep the probability
of missing a collision low, it is important that the relative motion
between each collision check is small. Thus, sampling a segment until,
e.g., the bounding box displacement between all successive samples is less
than some threshold seems like a good idea. The sampling of the segment could be
done in an incremental fashion, from one end towards the other. It is in
general, however, more efficient to recursively divide the path segment into
smaller and smaller segments and do a collision check on the mid-point of
each segment. For a collision free path segment, there is no difference between
the two approaches. If there is a collision, however, then the second
approach will find this quicker, see, e.g., Geraerts and Overmars [46].
For rigid body problems, it is possible to find exact upper bounds
on the displacement between two configurations. These bounds depend
on the object geometry and the center of rotation, and assume that the
motion taken between the two configurations is given by interpolation
on SE(3) (or SE(2) for 2D problems). Thus, given the distance to the
nearest obstacle at two configurations, and the upper bounds on the
displacement, then a path segment is either classified as collision free
with certainty, or more samples are needed.
There are also other methods, which are not based on sampling of the
path segment. These methods have the advantage of being more exact
than the sampling based techniques, but are, on the other hand, also
more costly. Gilbert and Hong [47] formulated the collision detection
problem as a root finding problem, where the root (if it exists) is the first
collision on the given path of motion. Another technique is to check for
collision between the volumes that are swept out by the geometric objects
as they move, see Xavier [156]. A drawback with methods that use swept-
volumes is that they are conservative; two objects might occupy the same
space, but at different times. Cameron [25] suggested that the sweeping
operation instead should be done in both space and time, generating
four dimensional objects. Unless the motions are very simple, these four-
dimensional objects are difficult to construct.

2.8.2 Local Planner Interface

The purpose of a local planner is to find a connection between two con-
figurations. As mentioned in Section 2.8, most local planners are deter-
ministic and only check a segment that is determined by an interpolation
method. Thus, if a connection exists, clients would know what the connecting
path segment looks like, and a boolean result is all the information
needed. More elaborate local planners could, however, find a non-trivial
path connecting the two configurations. In such a case, the client must
be given the path also. A uniform interface to all kinds of local planners
must hence return a path also. The resulting interface is shown in
Figure 2.13.
The general interface of the abstract class ConfigConnector makes
it easy to encapsulate almost any planning algorithm and use as a local
planner. One could, for example, use a PRM planner with RRT as local
planner, making it easy to compare the tradeoff between the speed and
connection rate of the local planner.

2.8.3 Example of a Flexible Local Planner

So far we have mostly discussed concepts in isolation from each other.
Here we will see an example of how several concepts can be combined to
form a flexible local planner.
We want to make a local planner that uses the recursive sampling
technique, described in Section 2.8.1, to verify whether path segments are
collision free or not. The recursion proceeds until the distance according
to a given metric is less than a specified threshold. The actual path that
is checked is determined by a given interpolation method.
Most path planning problems only involve finding a collision free path.
Thus, the local planner should have an object for collision detection as
well. However, path planning problems could involve other constraints
as well, as discussed in Section 5.6. A more general approach is therefore
to use an object that represents a binary constraint and answers whether
a given configuration satisfies it. Note how this abstraction also removes
the need to explicitly reference the moving system; it is done inside the
constraint object.
Thus, the local planner should have access to: a metric, an interpo-
lation method, and a binary constraint. The resulting class is named
BinaryConnector, and is shown in Figure 2.13. As the internal compo-
nents can be changed, the planner is very flexible, and can be used for a
wide range of path planning problems.
If BinaryConnector is used in a PRM planner, the Connect method
should return false and an empty path as soon as a collision is found. If
it is used in an RRT planner, then we would like it to find the path that
moves as far as possible along the segment towards the end configuration,
see Figure 4.1 (b). The local planner can switch to this behavior with
the SetToPersistent method. To summarize, we have a flexible local
planner that can be used in both PRM and RRT planners.

[Figure 2.13, class diagram. Abstract base class ConfigConnector:
bool Connect(from, to, path). Derived class BinaryConnector:
SetMaxStepSize(…), SetToPersistent(…), SetMetric(…), SetInterpolator(…),
SetConstraint(…); holds a metric of type Metric, an interp of type
Interpolator, and a constr of type BinaryConstraint.]

Figure 2.13: A class diagram for local planners. Note how
BinaryConnector uses composition to gain flexibility.
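The composition idea can be illustrated with the sketch below. It is not the BinaryConnector class itself: the components are modelled as std::function callbacks instead of the Metric, Interpolator, and BinaryConstraint classes, the segment is checked by incremental stepping rather than by the recursive technique (which makes the persistent behavior easy to show), and the meaning of the return value in persistent mode is an assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

using Config = std::vector<double>;
using Path = std::vector<Config>;

// Illustrative local planner composed of three exchangeable components.
struct BinaryConnectorSketch {
    std::function<double(const Config&, const Config&)> metric;
    std::function<Config(const Config&, const Config&, double)> interp;  // t in [0, 1]
    std::function<bool(const Config&)> constraint;   // true if satisfied
    double max_step = 0.1;
    bool persistent = false;

    // Returns true if 'to' was reached. In persistent mode 'path' keeps the
    // valid part of the segment even when the result is false. The start
    // configuration 'from' is assumed to satisfy the constraint.
    bool Connect(const Config& from, const Config& to, Path& path) const
    {
        if (max_step <= 0.0)
            throw std::invalid_argument("max_step must be positive");
        path.clear();
        const std::size_t steps = static_cast<std::size_t>(
            std::ceil(metric(from, to) / max_step));
        path.push_back(from);
        for (std::size_t k = 1; k <= steps; ++k) {
            const Config q = interp(from, to, static_cast<double>(k) / steps);
            if (!constraint(q)) {
                if (!persistent) path.clear();   // PRM use: all or nothing
                return false;
            }
            path.push_back(q);
        }
        return true;
    }
};
```

Exchanging the metric, the interpolation method, or the constraint only requires assigning a different callback, which mirrors how the setter methods in Figure 2.13 are meant to be used.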

2.9 Chapter Summary

In this chapter we have presented the design and implementation of a new
framework for path planning named CoPP. Compared to other frame-
works, CoPP models a richer, yet more decoupled, set of concepts. The
strong decoupling allows implementations of concepts to vary more independently
of each other. Furthermore, it allows path planning applications
to include only the exact subset of needed concepts. In this respect, the
framework can be seen as a set of LEGO blocks for building path planners,
where blocks that represent the same concept can be switched seamlessly.
The following list summarizes the key features of the framework:
• The framework is easy to extend, either with variations on existing
concepts, or with completely new concepts.
• The framework makes it easy to do fair comparisons between variations
of a concept, such as different collision detection algorithms.
• There are few restrictions on the type of planning algorithms that
can be implemented with the framework.
• The framework is system independent.

• Adaptive planners that change between different algorithms depending
on the problem are easily implemented.

In this thesis we have only considered holonomic systems. The framework
is, however, easily extended to handle dynamic systems as well.
The major change is a completely new agent interface, with which the
path planner communicates. Following the design of the MSL framework,
this interface would have methods for applying an input to the dynamic
system over a certain time interval. Integration methods internal to the
agent would take care of computing the resulting state. Admittedly, ex-
isting path planners that want to handle dynamic systems would now
have to change to this new interface. However, as CoPP provides path
planning components and not path planner classes, there is nothing in
CoPP that has to change due to the added agent interface.
The CoPP framework has been tested to compile both with Microsoft
Visual C++ 6.0 and with gcc 3.2.2. The framework has been used to
implement a wide range of RRT planners, a PCD [91] planner, and the
pick-and-place planner in Chapter 5. The framework is also being tested
on a real robot platform in a project about sensor-based path planning
and fault detection.¹²

¹² This project is carried out at the Centre for Autonomous Systems by Daniel
Aarno, Frank Lingelbach, and Paul Sundvall.

Chapter 3

A General Robot Model for Path Planning

Path planning deals with the problem of finding motion strategies for
movable objects or articulated structures. An articulated structure can
be used to model, e.g., a computer-animated character, a robotic
manipulator, or a complex protein molecule. Hereafter
we will call an articulated structure a robot. The goal in this
chapter is to develop a general robot class that is able to represent a
wide range of different robot types. The main purpose of this class is to
maintain the kinematic structure of the robot and to provide functions
for both the forward and the inverse kinematics.
It is important to have a systematic method for describing the kine-
matic structure of a robot. The next section will go through some of the
proposed notations. It will be pointed out that the standard Denavit-
Hartenberg notation [39] is incapable of describing robots with tree-like
kinematic structures. We use instead the notation of Kleinfinger and
Khalil [68], which in the case of a serial kinematic chain reduces to that
of Denavit and Hartenberg.
To reduce the number of actuators in, e.g., robot hands and hyper-
redundant robots, it is a common design strategy to mechanically couple
joints together. To allow modeling of such robots, the proposed robot
model incorporates the concept of joint couplings, which is discussed in
Section 3.2. Another important idea of the proposed robot model is
that simple robots can be combined to form a more complex robot. For
example, a model of a humanoid robot can be seen as composed of
several other robots: two arm robots, two leg robots, a trunk robot and
so on. Composition of robots is discussed in Section 3.4.
Section 3.3 deals with the important, but often difficult, problem of
inverse kinematics. The main idea is the separation of inverse kinematics
solvers from the robot itself. This way, users can switch between general
solvers and specific closed form solutions at runtime.
In Section 3.5, the design and implementation of the robot class is
described. In particular, we describe how new inverse kinematic solvers
can be added to the framework with a minimum amount of work. A robot
can be associated with a particular solver based on a directive in a robot
description file. An example of a robot description file is given in Ap-
pendix E.5.
The resulting robot class presented in this chapter is useful for many
types of path planning problems, but we point out that the CoPP frame-
work does not hinge upon it. Other models of motion, e.g., dynamic
systems, are easily added to the framework.

3.1 Notation for Articulated Structures

To be able to instantiate a specific robot, we must have a precise notation
for describing it. A robot is said to be a collection of rigid links and
joints connecting the links with each other. The purpose of a joint is
to constrain the relative motion between two connected links such that
the number of degrees of freedom is somewhere between one and five.
(Allowing the relative motion to have six degrees of freedom, or reducing
it to zero, is hardly interesting). The term lower pair is used to describe
the connection between a pair of bodies when the relative motion is char-
acterized by two surfaces sliding over one another. There are in fact six
possible lower pair joints [36]. When it comes to actuated joints, the
revolute and prismatic joints are by far the most usual. Passive joints,
common in, e.g., parallel robots, are often of the spherical type (ball in
socket). Another common passive joint is the universal joint, or Cardan joint.
In the following, we will restrict ourselves to revolute and prismatic joints.
This restriction is not severe as any other joints can be formed from a
suitable combination of revolute and prismatic joints.

3.1.1 Denavit-Hartenberg Notation

In order to deal with general articulated structures we must have a sys-
tematic way of describing them. Several methods have been proposed
and among them the Denavit and Hartenberg (D-H) notation [39], and
its variants [36, 115], is the most popular. Below we will give a brief
description of the D-H notation and an augmentation of it to handle
tree-like structures as well.
The D-H notation has become the de facto standard for describing
serial manipulators. The main assumption is that the robot can be de-
scribed as a single kinematic chain and that the joints connecting its links
are either prismatic (translational DOF) or revolute (rotational DOF).
The latter is no real restriction since the other basic joint types can be
modelled as a concatenation of degenerate (i.e., zero link length) revolute
and prismatic joints. A serial robot consists of n + 1 links, where link
0 is the fixed base and link n is the terminal link. The numbering of
joints and links is such that joint i connects links i − 1 and i. To describe
the pose of link i, a frame Fi is attached to it. The frame is defined
such that the axis Zi is coincident with the axis of movement for joint
i, hereafter simply called the joint axis. The axis Xi is defined as the
mutual perpendicular between joint axes i and i + 1. If the joint axes are
not parallel, this will also uniquely define the origin of frame Fi since the
closest points on the joint axes are also unique. The axis Yi is formed by
the right-hand rule to completely specify the frame. Defining the frames
of each link in this manner, it turns out that only four parameters are
needed to describe frame i relative to frame i − 1. These parameters are:

• αi : the angle between Zi−1 and Zi about Xi−1

• ai : the distance from Zi−1 to Zi along Xi−1

• θi : the angle between Xi−1 and Xi about Zi

• di : the distance from Xi−1 to Xi along Zi

See Figure 3.1 for a picture of two adjacent frames with the corresponding
parameters. In the literature, these parameters are known as the link
twist, the link length, the joint angle and the joint offset, respectively.
From Figure 3.1 we see that we can derive the transform ${}^{i-1}_{i}T$, describing
the pose of link i relative to link i − 1, as a concatenation of two
rotations and two translations:

\[
{}^{i-1}_{i}T = R_X(\alpha_i)\, D_X(a_i)\, R_Z(\theta_i)\, D_Z(d_i), \tag{3.1}
\]

where RX stands for rotation about the X-axis and DX stands for displacement
along the X-axis. See, e.g., the book by Craig [36] for details.
Evaluating Equation (3.1) gives:

\[
{}^{i-1}_{i}T =
\begin{bmatrix}
\cos\theta_i & -\sin\theta_i & 0 & a_i \\
\cos\alpha_i \sin\theta_i & \cos\alpha_i \cos\theta_i & -\sin\alpha_i & -d_i \sin\alpha_i \\
\sin\alpha_i \sin\theta_i & \sin\alpha_i \cos\theta_i & \cos\alpha_i & d_i \cos\alpha_i \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{3.2}
\]

For a revolute joint, θi will be the joint variable, while all the other pa-
rameters are constant. For a prismatic joint, di will be the joint variable.
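
As a small illustration, Equation (3.2) can be written directly as a function of the four parameters. This is only a sketch in plain Python with nested lists; the function name `dh_transform` and the numeric values are assumptions made for the example and are not part of CoPP.

```python
import math

def dh_transform(alpha, a, theta, d):
    """Return the 4x4 link transform of Equation (3.2) as nested lists."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    ct, st = math.cos(theta), math.sin(theta)
    return [
        [ct,      -st,      0.0,  a],
        [ca * st,  ca * ct, -sa, -d * sa],
        [sa * st,  sa * ct,  ca,  d * ca],
        [0.0,      0.0,      0.0, 1.0],
    ]

# Revolute joint: theta is the joint variable; alpha, a and d stay constant.
# The numbers below are arbitrary illustration values.
T = dh_transform(alpha=math.pi / 2, a=0.5, theta=0.0, d=0.2)
```

For a prismatic joint the same function would be called with a varying `d` and a fixed `theta`.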
The pose of the last link relative to the base of the robot can be found
by applying all the transforms from the base leading up to the last link:

\[
{}^{0}_{n}T = {}^{0}_{1}T\; {}^{1}_{2}T \cdots {}^{n-1}_{n}T \tag{3.3}
\]

Note that this equation also gives the pose of all the intermediate links
relative to the base.
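
The chain product of Equation (3.3) can be sketched as a loop that accumulates the link transforms and stores every intermediate pose. The row format `(alpha, a, theta_offset, d_offset, kind)` and the function names are assumptions chosen for this example; the joint value is added to θ for a revolute joint and to d for a prismatic joint.

```python
import math

def dh_transform(alpha, a, theta, d):
    """4x4 link transform of Equation (3.2)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    ct, st = math.cos(theta), math.sin(theta)
    return [
        [ct,      -st,      0.0,  a],
        [ca * st,  ca * ct, -sa, -d * sa],
        [sa * st,  sa * ct,  ca,  d * ca],
        [0.0,      0.0,      0.0, 1.0],
    ]

def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def forward_kinematics(dh_rows, q):
    """Return the poses of links 1..n relative to the base (Equation 3.3)."""
    T = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    poses = []
    for (alpha, a, theta0, d0, kind), qi in zip(dh_rows, q):
        theta = theta0 + qi if kind == "revolute" else theta0
        d = d0 + qi if kind == "prismatic" else d0
        T = matmul(T, dh_transform(alpha, a, theta, d))
        poses.append(T)  # pose of the current link relative to the base
    return poses

# Hypothetical planar arm: two parallel revolute axes, 1.0 apart.
arm = [(0.0, 0.0, 0.0, 0.0, "revolute"), (0.0, 1.0, 0.0, 0.0, "revolute")]
poses = forward_kinematics(arm, [math.pi / 2, 0.3])
```

With the first joint at 90 degrees, the origin of frame 2 ends up one unit along the base Y-axis, regardless of the second joint value.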
One of the main reasons for the popularity of the D-H notation is its
simplicity and compactness; for any link we only need to specify three
parameters, while the fourth one is the joint variable. This compactness
comes from the clever choice of the frames fixed to each link, reducing
the number of parameters to a minimum. The main drawback of the D-H
notation is that it runs into trouble when trying to represent kinematic
chains with more than one branch, i.e., a link has more than two joints.
If a link i connected to joint i, also is connected to m other joints, the
definition of frame i fixed to this link becomes ambiguous. This is because
Xi is defined from the relationship between joint axis i and the next joint
axis. In this case we have m choices for the next joint axis. If we choose
one of the m joints, say joint j, as the next joint axis, then the frame for
link i and j can be defined as usual, but the frames for the other m − 1
links cannot be defined using the standard D-H parameters because
Xi does not point towards the right joint axis.

Figure 3.1: Illustration of the Denavit-Hartenberg notation. As joint i
is a revolute joint, θi is the joint variable.

To overcome this problem, Sheth and Uicker [130] (S-U) proposed a
new notation for robots with tree-like or closed loop kinematic chains.
In their method, two frames are assigned to each link, and the resulting
transformation matrix can be seen as composed of two parts: one
constant transform related to the shape of the link, called the shape
matrix, and one variable transform representing the joint motion, called the
joint matrix. In the S-U notation, six parameters are needed to define
the shape matrix. The joint type and the joint variable define the joint
matrix. The large number of frames (two for every link) and parameters
makes the S-U notation much more complicated than its D-H counterpart.
Because of this, the D-H notation has always been the choice unless
studying robots with loop kinematics. Thus, for a unified notation to be
used for both serial robots and robots with loop kinematics, it should
reduce to the simple D-H notation in the case of a serial manipulator. This
is exactly the case for the notation proposed by Kleinfinger and Khalil [68]
(K-K). In the case of a serial manipulator, their notation coincides with
the D-H notation described above, but for a branching link, two extra
parameters, together with the usual D-H parameters, are needed for
each extra frame. Next we will describe the K-K notation in detail for a
tree-structure robot. In the case of a serial manipulator the notation is
identical to the D-H notation described above.

3.1.2 Kleinfinger-Khalil Notation

It is assumed that the robot is composed of n + 1 links and n joints.
Due to the tree-like structure of the robot, it can have m end-effectors,
where an end-effector is a leaf-node of the tree representing the robot’s
kinematic structure. Each link and joint will have a number identifying
it. The numbering is done in the following way:
• The base link will always be link 0.
• The joint and link numbering are both increasing when traversing
the tree from the base to an end-effector.
• Joint i connects link a(i) and link i, where a(i) is the number of
the link preceding link i when coming from the base. Link i moves
with joint i.
• Frame i is considered fixed with respect to link i.
This defines how to number links, joints and frames. The next steps
tell how to define the frame associated with each link. As long as a link
has only two joints, the frames are defined just as in the D-H case above.
In the case that the link i has more than two joints, find the mutual
perpendicular between Zi and each of the succeeding axes Zj on the same
link, where i = a(j) and j = k, l, . . . . Let one of these perpendiculars
define the Xi axis. It was suggested in [68] that one should choose the
perpendicular corresponding to the joint on which the longest branch was
articulated. Together with Zi , the other perpendiculars, let us call them
Xi′ , Xi′′ , . . . , form a set of intermediate frames Fi′ , Fi′′ , . . . that are fixed
with respect to link i. The key idea of the K-K notation is that only
two parameters, ε and γ, are needed to describe a transform that takes
the frame Fi to one of the frames Fi′ , Fi′′ , . . . . Once there, the usual
D-H parameters can be used. Thus the K-K notation can be seen as
augmenting the D-H notation. The two extra parameters are defined as

• εi : the distance from Xi to Xi′ along Zi

• γi : the angle between Xi and Xi′ about Zi

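See Figure 3.2 for an example. If εi ≠ 0 and γi ≠ 0, then Equation (3.2)
changes to

\[
{}^{i-1}_{i}T =
\begin{bmatrix}
C\gamma_i C\theta_i - S\gamma_i C\alpha_i S\theta_i & -C\gamma_i S\theta_i - S\gamma_i C\alpha_i C\theta_i & S\gamma_i S\alpha_i & a_i C\gamma_i + d_i S\gamma_i S\alpha_i \\
S\gamma_i C\theta_i + C\gamma_i C\alpha_i S\theta_i & -S\gamma_i S\theta_i + C\gamma_i C\alpha_i C\theta_i & -C\gamma_i S\alpha_i & a_i S\gamma_i - d_i C\gamma_i S\alpha_i \\
S\alpha_i S\theta_i & S\alpha_i C\theta_i & C\alpha_i & d_i C\alpha_i + \epsilon_i \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{3.4}
\]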
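
One way to read Equation (3.4), with C and S taken as cosine and sine, is as the D-H transform of Equation (3.1) preceded by a rotation γ about Z and a displacement ε along Z. The sketch below builds the transform both ways and compares them numerically; the helper names and the test values are assumptions for illustration only.

```python
import math

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def rot_trans_z(angle, dist):
    """Rotation about Z combined with a displacement along Z."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, dist], [0.0, 0.0, 0.0, 1.0]]

def rot_trans_x(angle, dist):
    """Rotation about X combined with a displacement along X."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0, dist], [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0], [0.0, 0.0, 0.0, 1.0]]

def kk_by_composition(gamma, eps, alpha, a, theta, d):
    """R_Z(gamma) D_Z(eps) followed by the D-H product of Equation (3.1)."""
    return matmul(matmul(rot_trans_z(gamma, eps), rot_trans_x(alpha, a)),
                  rot_trans_z(theta, d))

def kk_closed_form(gamma, eps, alpha, a, theta, d):
    """Entries of Equation (3.4), written out term by term."""
    cg, sg = math.cos(gamma), math.sin(gamma)
    ca, sa = math.cos(alpha), math.sin(alpha)
    ct, st = math.cos(theta), math.sin(theta)
    return [
        [cg * ct - sg * ca * st, -cg * st - sg * ca * ct,  sg * sa, a * cg + d * sg * sa],
        [sg * ct + cg * ca * st, -sg * st + cg * ca * ct, -cg * sa, a * sg - d * cg * sa],
        [sa * st,                 sa * ct,                 ca,      d * ca + eps],
        [0.0, 0.0, 0.0, 1.0],
    ]

params = (0.3, 0.1, 0.7, 0.4, -0.5, 0.2)  # arbitrary illustration values
A = kk_by_composition(*params)
B = kk_closed_form(*params)
max_diff = max(abs(A[r][c] - B[r][c]) for r in range(4) for c in range(4))
```

Setting `gamma` and `eps` to zero in either function gives back the plain D-H matrix of Equation (3.2).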
As pointed out by Craig [36], the D-H notation is ambiguous. That
is also the case of the K-K notation. Below some guidelines from [36]
are given for how to choose the frames in case a parameter is not well-
defined. To begin with, the choice of Zi is ambiguous, since there are
two directions in which we can point Zi when making it coincident with
joint axis i. In the case of joint axis a(j) and j being parallel, the choice
of origin location for frame a(j) is arbitrary. Here Craig [36] suggests
choosing the origin such that da(j) becomes zero, which is possible only if
the joint is revolute. If the two joint axes intersect, e.g., in the case of a
spherical wrist, then the mutual perpendicular is not defined. In such a
case, the Xa(j) axis is chosen to be perpendicular to the plane containing
both joint axes, leaving two choices for the direction of Xa(j) . The origin
of frame a(j) is located at the intersection of the two joint axes. The
choice of the first frame of the robot is arbitrary, but by convention [36]
it is chosen such that F0 = F1 when the first joint variable is zero. This
will make sure that a1 = 0 and α1 = 0. Additionally, d1 = 0 for a
revolute joint and θ1 = 0 for a prismatic joint.
It was shown in [68] that the K-K notation can also be applied in
the case of a robot with loop kinematics. By cutting each loop at an
arbitrary joint, such a robot can be transformed to an equivalent tree-
structure robot if a loop-closure constraint is added for each loop that is
cut.

3.2 Modeling Joint Couplings

To reduce the number of actuators it is common to introduce couplings
between joints, such that a single actuator can drive two or more joints.
An example of this is the BarrettHand, where four actuators are used to
actuate eight joints. This design made it possible to mount all actuators
and control electronics inside the wrist, avoiding troublesome solutions
like tendon actuation with the actuators far away from the hand. Modeling
a simple parallel-jaw gripper also becomes easier if we allow joint
couplings; with a joint coupling we can force the two jaws to always
move an equal amount, but in opposite directions.

Figure 3.2: Illustration of the Kleinfinger-Khalil notation. The figure
is from [44], with kind permission from Lorenzo Flückiger.

Here joint couplings are modelled using the concepts of active and
passive joints, where an active joint can control any number of passive
joints. We consider only linear couplings, meaning that the input to a
passive joint is determined by the input to the active joint and a gear
factor. It follows that the number of degrees of freedom for a robot is
equal to the number of active joints.
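
The linear coupling model can be sketched as a function that expands the values of the active joints into a full joint vector. The data layout (a dictionary mapping each passive joint index to its active joint and gear factor) is an assumption made for this example, not the representation used in CoPP.

```python
def expand_configuration(q_active, active_joints, couplings, num_joints):
    """Compute all joint values from the active ones.

    couplings maps passive joint index -> (active joint index, gear factor).
    """
    q = [0.0] * num_joints
    for value, joint in zip(q_active, active_joints):
        q[joint] = value
    for passive, (active, gear) in couplings.items():
        q[passive] = gear * q[active]  # linear coupling
    return q

# Hypothetical parallel-jaw gripper: jaw 1 mirrors jaw 0 (gear factor -1).
gripper = expand_configuration(
    q_active=[0.02], active_joints=[0], couplings={1: (0, -1.0)}, num_joints=2
)
```

The gripper has two joints but a single degree of freedom, which matches the rule that the number of degrees of freedom equals the number of active joints.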
There are examples of tentacle-like robots in the literature, see, e.g.,
Hannan and Walker [52] and Immega and Antonelli [59]. Such robots often
have very low stiffness, making them suitable for interactions with humans.
Moreover, their extreme flexibility allows them to wrap around objects,
i.e., whole arm manipulation. The robot of Hannan and Walker [52]
consists of a flexible spine, which has 16 two-DOF joints. The spine is
divided into four segments, and each segment is controlled by two pairs of
tendons. When the robot is actuated, the curvature along each segment
is approximately constant. Using the proposed joint coupling model, we
can easily build a robot model that has the same properties; each segment
consists of eight revolute joints, where the first two control the others.
With all gear factors equal to one, we achieve constant curvature along
each segment. The resulting ’elephant trunk’, see Figure 3.3, has eight
degrees of freedom.

Figure 3.3: An approximation of a continuum robot using coupled
joints. The robot is divided into four segments, with eight revolute joints
in each segment. The first two joints in each segment control the others
such that the curvature is constant along each segment. In total the
robot has eight degrees of freedom.

Modern industrial manipulators often have loops in the kinematic
structure to increase the stiffness and the payload capacity of the manipulator.
Solving the kinematics of manipulators with closed loop kinematic
chains is not so straightforward in that it requires satisfying the loop
constraints. Often iterative solution techniques are required. However, there
is an important sub-class of such manipulators that we can handle without
iterative techniques, again using joint couplings. Robots like the Acma SR400
robot, see [67], have a loop in the shape of a parallelogram. When such a
parallelogram is deformed, all the joint angles change by the same amount.
Thus, we can model a parallelogram structure using one active joint that
controls three passive joints, see Figure 3.4.
In summary, joint couplings are useful to model:

• Robots with physical joint couplings

• Snake-like continuum robots

• Robots with a parallelogram structure


Figure 3.4: A closed loop kinematic structure where the loop constraint
is satisfied using joint couplings. Due to the parallelogram structure, the
relationships between the joint angles are linear.

3.3 Inverse Kinematics

The problem of finding the configuration that takes the robot end effec-
tor to a given position and orientation is called the inverse kinematics
problem. This is generally a much harder problem to solve than the
forward kinematics problem, which was just a concatenation of all the
transformations leading to the end effector. The inverse kinematics problem
may have zero, multiple, or infinitely many solutions, depending
on the wanted pose and the structure of the robot.
Most path planning examples presented in the literature are of the
type “find a path from qa to qb ”, where qa and qb are two points in the
robot’s configuration space. However, when we specify robot tasks we
hardly want to deal with details such as joint angles. Instead we want to
give specifications like “robot, go to the kitchen” or “robot, pick up the
bottle”. This leaves it up to the robot (planner) to figure out the joint
angles needed to reach the locations necessary for the task.
Clearly, for high-level task specifications we need methods for solving
the inverse kinematics problem. It would therefore be useful if the robot
class could provide a general solver. However, numerical solvers tend to
be slow and they often only find one solution even if there are more.
Allowing only one solver would therefore be very inefficient for those
robots where we have a closed form solution. The method chosen here
is to have a repository of various solvers, from which users can choose at
runtime.
In the following, we will discuss a straightforward method to numeri-
cally solve the inverse kinematics problem. We will test it on two robots
and also compare its efficiency to a closed form solution. Finally we will
see how different solvers coexist in CoPP.

3.3.1 The Jacobian

Any numerical method for solving the inverse kinematics for a general
robot will most likely need the manipulator Jacobian. For robots living
in $\mathbb{R}^3$, the Jacobian will be a 6 × m matrix, relating the m joint speeds
to the instantaneous velocity of the end-effector frame:

\[
\dot{X} = J(q)\,\dot{q}, \tag{3.5}
\]

where the end-effector velocity is composed of both the linear and the
rotational velocities, $\dot{X} = [V^T \; \Omega^T]^T$. If not stated otherwise, the
end-effector velocities are measured relative to the base frame of the robot.
Column i of the Jacobian can be interpreted as the velocity of the end
effector when joint i has unit velocity and all other joints are locked. This
suggests that J can be computed column by column, looking at one joint
at a time. Below we will look at the velocity contributions to the
end-effector frame from a prismatic joint and a revolute joint, respectively.
The velocities will be expressed relative to the end-effector frame itself, but
a simple transformation can express them relative to another frame, e.g.,
the base frame.
Joint k produces the linear velocity ${}^{n}V_k$ and the angular velocity ${}^{n}\Omega_k$
with respect to the frame n. If joint k is prismatic, then we have [67]:

\[
{}^{n}V_k = {}^{n}Z_k\,\dot{q}_k \tag{3.6}
\]
\[
{}^{n}\Omega_k = 0 \tag{3.7}
\]

where ${}^{n}Z_k$ is the z-axis of frame k expressed in frame n.

If joint k instead is a revolute joint we get [67]:

\[
{}^{n}V_k = ({}^{n}Z_k \times {}^{n}P_k)\,\dot{q}_k \tag{3.8}
\]
\[
{}^{n}\Omega_k = {}^{n}Z_k\,\dot{q}_k \tag{3.9}
\]

where ${}^{n}P_k$ is the vector connecting frame k to frame n, expressed relative
to frame n. Introducing the auxiliary variable σk , which is defined as

\[
\sigma_k =
\begin{cases}
1 & \text{if joint } k \text{ is prismatic,} \\
0 & \text{if joint } k \text{ is revolute,}
\end{cases}
\tag{3.10}
\]

the velocity contribution from joint k can be written in the general form

\[
{}^{n}V_k = \left[\sigma_k\, {}^{n}Z_k + \bar{\sigma}_k\, ({}^{n}Z_k \times {}^{n}P_k)\right] \dot{q}_k, \tag{3.11}
\]
\[
{}^{n}\Omega_k = \bar{\sigma}_k\, {}^{n}Z_k\, \dot{q}_k, \tag{3.12}
\]

where $\bar{\sigma}_k = 1 - \sigma_k$. Summing the contributions from all joints from the
base to the end-effector, we get the Jacobian as

\[
J =
\begin{bmatrix}
\sigma_1 Z_1 + \bar{\sigma}_1 (Z_1 \times P_1) & \cdots & \sigma_n Z_n + \bar{\sigma}_n (Z_n \times P_n) \\
\bar{\sigma}_1 Z_1 & \cdots & \bar{\sigma}_n Z_n
\end{bmatrix},
\tag{3.13}
\]

where the leading superscript n has been omitted.
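
The column-by-column construction of Equation (3.13) can be sketched as follows. For simplicity the sketch assumes that every joint axis, joint origin and the end-effector position are already expressed in one common frame (any frame works, as long as it is the same for all quantities); the function names and the example arm are illustrative assumptions.

```python
def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def jacobian(joints, p_end):
    """Build the 6 x m Jacobian of Equation (3.13), one column per joint.

    joints is a list of (z_axis, origin, is_prismatic); all vectors and
    p_end are expressed in the same frame.
    """
    columns = []
    for z, origin, is_prismatic in joints:
        sigma = 1 if is_prismatic else 0
        p = [pe - o for pe, o in zip(p_end, origin)]  # joint frame -> end effector
        z_cross_p = cross(z, p)
        linear = [sigma * zi + (1 - sigma) * ci for zi, ci in zip(z, z_cross_p)]
        angular = [(1 - sigma) * zi for zi in z]
        columns.append(linear + angular)
    # Transpose the list of columns into a list of six rows.
    return [[col[r] for col in columns] for r in range(6)]

# Hypothetical arm stretched along X: two revolute joints about Z with unit
# links, followed by a prismatic joint sliding along X at the tip.
joints = [
    ([0.0, 0.0, 1.0], [0.0, 0.0, 0.0], False),
    ([0.0, 0.0, 1.0], [1.0, 0.0, 0.0], False),
    ([1.0, 0.0, 0.0], [2.0, 0.0, 0.0], True),
]
J = jacobian(joints, p_end=[2.0, 0.0, 0.0])
```

In this stretched configuration the two revolute joints only produce velocity along Y, while the prismatic joint produces velocity along X and no rotation.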
The Jacobian for a system with coupled joints will of course be different
from that of the non-coupled system, since one or several joint
velocities can be expressed in terms of the velocity of other joints. If we
have a coupling between joint i and j such that $\dot{\theta}_i = k_{ij}\,\dot{\theta}_j$, where $k_{ij}$ is
the ’gear factor’, then it is seen that the jth column is updated as

\[
c_j \leftarrow c_j + k_{ij}\, c_i. \tag{3.14}
\]

The ith column is discarded, so each coupling will reduce the number of
columns in the Jacobian by one.
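
The column update of Equation (3.14) is easy to express on a Jacobian stored as a list of rows. The sketch below handles a single coupling; the function name and the example matrix are assumptions for illustration.

```python
def apply_coupling(J, i, j, k_ij):
    """Fold passive column i into active column j and drop column i.

    Implements c_j <- c_j + k_ij * c_i from Equation (3.14) for a Jacobian
    stored as a list of rows. Returns a new matrix with one column less.
    """
    reduced = []
    for row in J:
        new_row = list(row)
        new_row[j] += k_ij * new_row[i]
        del new_row[i]
        reduced.append(new_row)
    return reduced

# Arbitrary 2 x 3 example: joint 2 is passive, driven by joint 0 with gear 0.5.
J = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
J_coupled = apply_coupling(J, i=2, j=0, k_ij=0.5)
```

When several couplings are applied one after another, the column indices shift after each deletion, so they would have to be tracked accordingly.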
Note that the Jacobian in Equation (3.13) gives the end-effector velocities
relative to the end-effector frame itself. To express the velocities
relative to another frame Fi , we can use the following transformation:

\[
{}^{i}J =
\begin{bmatrix}
{}^{i}_{n}R & 0_{3\times 3} \\
0_{3\times 3} & {}^{i}_{n}R
\end{bmatrix}
{}^{n}J,
\tag{3.15}
\]

where ${}^{i}_{n}R$ is the rotation matrix that specifies the orientation of the
end-effector frame relative to frame Fi and ${}^{n}J$ is from Equation (3.13).
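
Because the transformation matrix in Equation (3.15) is block diagonal, the rotation can be applied separately to the linear and the angular rows. A minimal sketch, with a hypothetical function name and a one-column example:

```python
def change_jacobian_frame(R, J_n):
    """Apply Equation (3.15): rotate the linear and angular blocks by R.

    R is the 3x3 rotation of the end-effector frame relative to the target
    frame, J_n is the 6 x m Jacobian expressed in the end-effector frame.
    """
    m = len(J_n[0])
    linear = [[sum(R[r][k] * J_n[k][c] for k in range(3)) for c in range(m)]
              for r in range(3)]
    angular = [[sum(R[r][k] * J_n[3 + k][c] for k in range(3)) for c in range(m)]
               for r in range(3)]
    return linear + angular

# Rotation by 90 degrees about Z, applied to a single-column Jacobian with
# unit linear velocity along X and unit angular velocity about Z.
R = [[0, -1, 0],
     [1,  0, 0],
     [0,  0, 1]]
J_n = [[1], [0], [0], [0], [0], [1]]
J_i = change_jacobian_frame(R, J_n)
```

The linear part is turned from X to Y, while the angular part about Z is unchanged by a rotation about Z.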

3.3.2 A Numerical Solver Based on the Pseudo-Inverse

Equation (3.5) relates joint velocities to end-effector velocities. It can
also provide us with a differential model of the manipulator:

\[
\Delta X = J(q)\,\Delta q. \tag{3.16}
\]

This differential model will remain valid as long as ∆X and ∆q are
sufficiently small. The main idea when solving the inverse kinematics
numerically is to let ∆X be the pose error and use Equation (3.16) to
solve for ∆q. However, just inverting J will in most cases not work
because the Jacobian may not be a square matrix. Instead we use the
pseudo-inverse J+ of J. The pseudo-inverse of a matrix always exists
and can easily be computed from the singular-value decomposition of the
matrix, see Golub and Van Loan [49]. Furthermore, it can be shown that
the solution given by

\[
\Delta q = J^{+}\,\Delta X \tag{3.17}
\]

minimizes the residual $\|J\Delta q - \Delta X\|_2$.
Let ${}^{0}_{n}T_d$ denote the desired end-effector pose and ${}^{0}_{n}T_c$ the current
end-effector pose. The following algorithm can be used to iteratively solve
the inverse kinematics problem [67]:

• Initialize qc by the current configuration.

• Compute ${}^{0}_{n}T_c$ using the forward kinematics.

• Compute the position error dXp and the rotation error dXr , representing
the difference between ${}^{0}_{n}T_d$ and ${}^{0}_{n}T_c$.

• If dXp and dXr are sufficiently small, then qd = qc and the iteration
terminates.

• To remain in the validity domain of the differential model we must
introduce the thresholds Sp and Sr on dXp and dXr respectively
such that:
– If ‖dXp‖ > Sp , then dXp = Sp dXp / ‖dXp‖
– If ‖dXr‖ > Sr , then dXr = Sr dXr / ‖dXr‖

• Compute the Jacobian ${}^{0}J(q_c)$, denoted as J.

• Compute the joint variation dq = J+ dX, where $dX = [dX_p^T, dX_r^T]^T$.

• Update the current joint configuration: qc = qc + dq.

• Return to the second step.
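
The steps above can be sketched for a small planar example. The sketch makes several simplifying assumptions that are not part of the algorithm in [67]: the arm is a hypothetical planar three-joint arm, only the position error is considered (no rotation error), and the pseudo-inverse is computed as Jᵀ(JJᵀ)⁻¹, which is only valid when J has full row rank (an SVD-based pseudo-inverse, as described in the text, does not have this restriction).

```python
import math

LINKS = [1.0, 0.8, 0.5]  # hypothetical link lengths

def joint_points(q):
    """Positions of the three joint origins followed by the end-effector."""
    pts, angle = [(0.0, 0.0)], 0.0
    for length, qi in zip(LINKS, q):
        angle += qi
        x, y = pts[-1]
        pts.append((x + length * math.cos(angle), y + length * math.sin(angle)))
    return pts

def forward(q):
    return joint_points(q)[-1]

def jacobian(q):
    """2 x 3 position Jacobian; column k is Z x P_k for a planar revolute joint."""
    pts = joint_points(q)
    ex, ey = pts[-1]
    return [[-(ey - py) for _, py in pts[:-1]],
            [ex - px for px, _ in pts[:-1]]]

def pinv_times(J, dx):
    """dq = J^T (J J^T)^-1 dx, assuming J has full row rank."""
    a = sum(v * v for v in J[0])
    b = sum(u * v for u, v in zip(J[0], J[1]))
    c = sum(v * v for v in J[1])
    det = a * c - b * b
    w0 = (c * dx[0] - b * dx[1]) / det
    w1 = (-b * dx[0] + a * dx[1]) / det
    return [J[0][k] * w0 + J[1][k] * w1 for k in range(len(J[0]))]

def solve_ik(q_start, target, s_p=0.2, tol=1e-9, max_iter=1000):
    q = list(q_start)
    for _ in range(max_iter):
        x, y = forward(q)
        dx = [target[0] - x, target[1] - y]
        norm = math.hypot(dx[0], dx[1])
        if norm < tol:
            return q                                  # converged
        if norm > s_p:
            dx = [s_p * v / norm for v in dx]         # threshold on the error
        dq = pinv_times(jacobian(q), dx)
        q = [qi + dqi for qi, dqi in zip(q, dq)]
    return None                                       # iteration limit reached

solution = solve_ik([0.3, 0.3, 0.3], target=(1.0, 1.5))
```

Like the algorithm it illustrates, this sketch ignores joint limits, so the returned configuration still has to be checked against them.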

If it were not for the two thresholds Sp and Sr , this algorithm would
run efficiently for robots of any size. According to [67], the values 0.2
meter and 0.2 radians are acceptable for industrial robots of typical size.
For considerably larger robots, these small thresholds would make the
algorithm take small and inefficient steps. Tests by the author have also
shown that too large values of Sp and Sr can make the iterations vary
erratically for a long time before converging.
Another, more severe drawback of this simple algorithm is that it takes
no account of joint limits; the solution provided by the algorithm can
have joint angles that are outside the valid range for the robot. As long
as the robot only contains revolute joints with 2π range this is not a
problem, but as the joint ranges decrease, more and more solutions fall
outside the joint limits.
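
A solver of this kind therefore needs a separate check after convergence. The sketch below shows one possible check, under the assumption that all joints are revolute: each angle is first wrapped into [−π, π) and then compared with its limits. The function names and limits are hypothetical; a prismatic joint would be compared without wrapping.

```python
import math

def wrap_angle(angle):
    """Map an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

def accept_solution(q, limits):
    """Return the wrapped configuration if it respects all limits, else None.

    Assumes every joint is revolute; limits is a list of (min, max) pairs.
    """
    wrapped = [wrap_angle(qi) for qi in q]
    if all(lo <= qi <= hi for qi, (lo, hi) in zip(wrapped, limits)):
        return wrapped
    return None

limits = [(-math.pi, math.pi), (-1.0, 1.0)]
accepted = accept_solution([7.0, 0.5], limits)   # 7.0 rad wraps to 7.0 - 2*pi
rejected = accept_solution([0.0, 2.0], limits)   # second joint out of range
```

A joint with a full 2π range never causes a rejection after wrapping, which corresponds to the remark above about revolute joints with 2π range.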

A Comparison Between two Solvers

In the following we will describe an experimental setup to compare the
general solver against a closed form solution for the Puma 560 robot.
The closed form solution is based on [116]. Each solver was called 10,000
times, each time with a different desired end-effector pose. The determin-
istic pose-sequence was determined using the Halton class described in
Section 2.7; applying the forward kinematics on each configuration from
the Halton sequence gave the corresponding end-effector pose. Before
the call to the solver, the robot was moved back to the previous config-
uration. This had of course no effect on the closed form solution, but
it makes the test repeatable as the behavior of the general solver is af-
fected by the initial configuration. After each call to the solver, the total
number of solutions was incremented by the number of solutions found
at the current call. If no solution at all was found, this was reported as
a failure; as each end-effector pose was generated from the forward kinematics,
there should be at least one solution to each query. The results,
together with the time needed for all queries, are reported in Table 3.1.
Not only is the closed form solution much faster (by a factor of 34), but it is
also more reliable; it never failed to find a solution, and it reported on
average 4.8 solutions to each query. The general solver, on the other hand,
did not look for multiple solutions and found a solution to only about
half the queries. The reason for the high failure rate is the algorithm’s
ignorance of joint limits; only once did it not converge to the desired
end-effector pose within the maximum number of iterations, which was
set to 1,000, but the found solution often violated the joint limits of one
or more joints. It was found that repeating a failed query with a different
initial configuration made the solver converge to a valid configuration.

                     closed form solution   general solver
time [s]             0.641                  21.7
success rate         100%                   52.7%
solutions / query    4.8                    0.52

Table 3.1: A comparison between the closed form solution and the general
solver for the Puma inverse kinematics. The table shows the outcome
of 10,000 calls to each solver. Only once did the general solver fail because
it exceeded the maximum number of iterations. The other times it
failed because the generated configuration violated the joint limits.

Generating Constrained Motions

As seen from the previous example, the iterative solver has great prob-
lems with joint limits. Considering that the joints of the Puma have
large ranges, we might expect that the problem becomes more severe for
robots with more joints and smaller joint ranges. Therefore we might
ask ourselves if our simple solver can be of any use at all if we
have robots with joint limits. Experiments have shown that if the initial
configuration is close to a solution configuration, then the iterative solver
converges very quickly. This suggests that the solver should be suitable
for iterative generation of constrained end-effector motions; if we know
the initial configuration and move along the end-effector trajectory with
small steps, then the solver should have a better chance of converging to
each successive configuration.
To test this idea, the solver was used to generate different constrained
trajectories for the hyper-redundant trunk robot shown in Figure 3.3.
With the base fixed, this robot has eight degrees of freedom and the average
range of each joint is only 29 degrees. The constrained motions corresponded
to: linear end-effector paths, fixed orientation and combinations
of them. In all cases, the solver had no problem with the joint limits
until the robot reached really awkward configurations. Also, the motions
were generated very quickly as the number of iterations for solving each
step was usually very small. An example is shown in Figure 3.5, where
the end-effector was forced to follow a linear path with constant orientation.
This shows that the general solver can be of use in path planning
problems where we have constraints on, e.g., the orientation.

Figure 3.5: The end-effector is constrained to move along a straight
line with constant orientation. The motion was incrementally generated
using the general solver for the inverse kinematics.

3.4 Robot Composition

Consider the mobile manipulator in Figure 3.6. If we consider the mo-
bile base, the arm and the hand together, the system has a total of 13
degrees of freedom. Sending a model of this system to a grasp planner
would seem very inappropriate, as there are many degrees of freedom
available in the model that are not necessary for the task. Likewise, for
a navigation task we would only want to consider the degrees of freedom
corresponding to the base. Thus, we would like to have a mechanism that
easily separates out the parts of the robot that are relevant to the task at
hand. A natural approach would be to see the mobile manipulator as a
complex robot composed of several sub-robots, where the sub-robots are
the mobile platform, the arm and the hand, respectively. An even finer
granularity can be achieved if we also see the fingers of the hand as sub-
robots. It would thus be nice if the robot class would support hierarchical
compositions of robots into more complex robots. The robot would then
have methods for retrieving a particular sub-robot. To a client, this sub-
robot can be viewed as a complete robot on its own. The only difference,
not visible from the outside though, is that movements of this sub-robot
cause changes that are propagated to all nodes in the kinematic graph
that appear below the sub-robot. As an example, the arm and hand
are not sub-robots of the platform, but moving the platform will change
the location of them. When propagating the movements to the attached
robots one can take advantage of the fact that no joint has moved, so the
resulting movement is just a translation and a rotation of each sub-robot
as a whole.
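
The hierarchical composition can be illustrated with a very small tree of robots. This is only a sketch: the class below is not the CoPP Robot class, and the grouping of the sub-robots (Obelix made of XR4000 and ArmHand, ArmHand made of Puma and BarrettHand) is an assumption based on the DOF counts shown in Figure 3.6.

```python
class Robot:
    """Minimal stand-in for a hierarchical robot (not the CoPP class)."""

    def __init__(self, name, own_dofs=0, subrobots=()):
        self.name = name
        self.own_dofs = own_dofs          # DOFs not belonging to a sub-robot
        self.subrobots = list(subrobots)

    def num_dof(self):
        return self.own_dofs + sum(r.num_dof() for r in self.subrobots)

    def find(self, name):
        """Return the (sub-)robot with the given name, or None."""
        if self.name == name:
            return self
        for sub in self.subrobots:
            found = sub.find(name)
            if found is not None:
                return found
        return None

arm_hand = Robot("ArmHand", subrobots=[Robot("Puma", 6), Robot("BarrettHand", 4)])
obelix = Robot("Obelix", subrobots=[Robot("XR4000", 3), arm_hand])

grasp_model = obelix.find("ArmHand")      # what a grasp planner might receive
navigation_model = obelix.find("XR4000")  # what a navigation planner might receive
```

A planner that receives `grasp_model` sees a 10-DOF robot and does not need to know that it is part of a larger 13-DOF system.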
Extracting a sub-robot from a complex robot can be seen as a way of
temporarily locking some joints of the complex robot, thereby reducing
the effective number of degrees of freedom. This could be very useful
when planning, e.g., a pick-and-place task; when the hand has grasped
the object we only consider the sub-robot that corresponds to the arm.
The hierarchical robot model has several other advantages as well. Let us say
we want to model a humanoid. Now, applying InverseKinematics to the
whole humanoid would not make much sense. If we instead apply the
same function on the sub-robot “Head”, we would get the joint angles
for making the robot look in the specified direction. Likewise, applying
InverseKinematics to “LeftArm” would give the joint angles necessary for
the arm to reach the specified position. Thus, by extracting a particular
sub-robot we can, as a useful side effect, in an easy way express our
intentions. If the hierarchical composition of robots is built into the
syntax for generating robots from text files we also gain the benefits of
modularity. Returning to our humanoid again: If the left and right leg
are identical apart from the way they are attached to the hip, then it
would be both error-prone and an unnecessary amount of work to define
them both in the same file. Instead the legs could be defined in a single
file “leg.rob”, which is then included twice in “humanoid.rob”. The only
thing differing between the two include statements would be the name of
the robots and their transformation with respect to the frame of the hip.
Another example: Suppose we want to change the hand on our model
of the mobile manipulator in Figure 3.6 to a parallel-jaw gripper. If the
hand and the parallel-jaw gripper are defined in two separate files, then
all we have to do is change a file name in one include statement in the
file defining the mobile manipulator.

Figure 3.6: The left-hand figure shows Obelix, a mobile robot that
consists of a Nomadics XR4000 platform, a Puma manipulator and a
Barrett hand. The right-hand figure shows how this complex robot can
be seen as a composition of simpler robots. The arrowheads show how
sub-robots are attached to each other. Moving a sub-robot will cause any
attached robot to follow the motion as a single rigid body.
Diagram labels: Obelix (13 DOFs), ArmHand (10 DOFs), XR4000 (3 DOFs),
Puma (6 DOFs), BarrettHand (4 DOFs).

3.5 Design and Implementation

In this section we describe the design of the class Robot and some imple-
mentation details.
Robots are modeled as a set of moving coordinate frames. The relationships
between the frames can be described by a graph that has no
loops. This graph is implemented in terms of RobNode objects, which can
be either joints or coordinate frames that are fixed with respect to their
parent frame, see Figure 3.7. To each frame, we can attach any number
of geometric objects, the robot’s geometric links.
The Robot class itself maintains a graph of RobNode objects and
possibly sub-robots. A robot object also has an object representing a
solver for the inverse kinematics. This solver object can either be a
general numerical solver, or a solver dedicated for a particular robot.
The structure of a robot can be hardcoded in the program, but it is
more flexible to read a robot-description file and let the program build
the corresponding robot. For an example of a simple robot-description
file, see Appendix E.5.
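
The idea of a loop-free graph of frames, where each node stores its pose relative to its parent, can be sketched as follows. The sketch borrows the names RobNode, RobJoint and pose_rel_parent from the class diagram, but it is a simplified planar stand-in and not the actual CoPP implementation: poses are (x, y, angle) triples instead of 3D transforms, and only revolute joints are shown.

```python
import math

class RobNode:
    """A frame fixed relative to its parent (planar stand-in)."""

    def __init__(self, pose_rel_parent, parent=None):
        self.pose_rel_parent = pose_rel_parent   # (x, y, angle)
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def world_frame(self):
        """Compose the parent's world frame with pose_rel_parent."""
        if self.parent is None:
            return self.pose_rel_parent
        px, py, pa = self.parent.world_frame()
        x, y, a = self.pose_rel_parent
        return (px + x * math.cos(pa) - y * math.sin(pa),
                py + x * math.sin(pa) + y * math.cos(pa),
                pa + a)

class RobJoint(RobNode):
    """A revolute joint: its pose relative to the parent depends on a value."""

    def __init__(self, offset, parent=None):
        super().__init__(offset, parent)
        self.offset = offset

    def set_joint_value(self, value):
        x, y, a = self.offset
        self.pose_rel_parent = (x, y, a + value)

base = RobNode((0.0, 0.0, 0.0))
shoulder = RobJoint((0.0, 0.0, 0.0), parent=base)
tip = RobNode((1.0, 0.0, 0.0), parent=shoulder)   # fixed frame on the link

shoulder.set_joint_value(math.pi / 2)
tip_x, tip_y, tip_angle = tip.world_frame()
```

Moving the joint changes the world frame of every node below it, without touching the nodes themselves.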

3.5.1 Self-Collision Table

When planning for kinematic chains, not only do we have to check for
collisions between the links and the environment, but also between the links
themselves. That is, we have to look for robot self-collisions. For a robot
with n physical links, there are n(n − 1)/2 link pairs that can collide with
each other. However, adjacent links that are connected at a joint can be
considered to be in contact all the time, and do not have to be checked.
Furthermore, joint limits and the design of robots tend to eliminate many of
the remaining link pairs. Which link pairs should be considered for
self-collision checks is therefore very dependent on the particular robot. For
this reason, a robot specification can contain a user-defined self-collision
table. As each link geometry is required to have a unique identifier, this
table is simply a list of string pairs. An example of a self-collision table is
shown in Figure 3.8. If robot definitions from several files are combined into
a complex robot, their self-collision tables are automatically merged as
well.
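
The bookkeeping behind such a table can be illustrated with a minimal sketch. The names CollisionTable, MakePair, AllNonAdjacentPairs and MergeTables are hypothetical and are not part of CoPP; the sketch only assumes that a table is a set of unordered pairs of link identifiers, and that a serial chain is given as an ordered list of link names.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; not the CoPP API.
using LinkPair = std::pair<std::string, std::string>;
using CollisionTable = std::set<LinkPair>;

// Store each pair in a canonical order so that (a, b) and (b, a) coincide.
LinkPair MakePair(const std::string& a, const std::string& b) {
    return a < b ? LinkPair(a, b) : LinkPair(b, a);
}

// Conservative default for a serial chain: all n(n-1)/2 unordered pairs,
// minus the n-1 pairs of adjacent links that share a joint.
CollisionTable AllNonAdjacentPairs(const std::vector<std::string>& chain) {
    CollisionTable table;
    for (std::size_t i = 0; i < chain.size(); ++i)
        for (std::size_t j = i + 2; j < chain.size(); ++j)
            table.insert(MakePair(chain[i], chain[j]));
    return table;
}

// Merging the tables of two robot definitions is a plain set union.
CollisionTable MergeTables(const CollisionTable& a, const CollisionTable& b) {
    CollisionTable merged(a);
    merged.insert(b.begin(), b.end());
    return merged;
}
```

For a chain of five links this default gives 10 − 4 = 6 pairs; a user-defined table such as the one in Figure 3.8 would normally be much shorter.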

3.5.2 Multiple Inverse Kinematics Solvers

As mentioned in Section 3.3, high level planning will, at some point,
involve solving the inverse kinematics problem. To meet this need, CoPP
provides several inverse kinematics solvers. Which one is actually used is
specified by the user in the robot description file. Here we will describe
the interface to these solvers. It will also be shown that CoPP is designed
such that users can add their own solvers to the framework with minimal
effort.
[Figure 3.7, class diagram:
Robot: Move(config), Inverse(pose, solution), SetInvKinSolver(solver), NumDOF( ), GetEndEffectorPose( ), SubRobotsBegin( ); associations: subrobots, nodes (RobNode).
RobNode: Update( ), GetWorldFrame( ), GetGeoms( ), AddChild(node), Accept(RobNodeVisitor v); member: Transform pose_rel_parent; associations: parent, children, link_geoms (Geom). Diagram note: parent->GetWorldFrame() * pose_rel_parent.
Geom: AttachToFrame(...).
RobJoint (derived from RobNode): SetParams(joint_params), SetJointType(…), SetJointValue(val), GetLimits(min, max).
RobFrame (derived from RobNode): RobFrame(pose_rel_parent).]

Figure 3.7: Class diagram for the Robot class.

coll_table {
"puma_link_0" and "puma_link_3"
"puma_link_0" and "puma_link_4"
"puma_link_1" and "puma_link_4"
}
Figure 3.8: An example of a (partial) self-collision table for a Puma
robot.
The number of solutions to the inverse kinematics problem can be
zero, finite or infinite. When the problem has a finite number of
solutions, closed-form solvers are often able to find them all, whereas
iterative solvers mostly find the solution that is closest to the initial
configuration. It is often of interest to know if the problem has multiple
solutions, see, e.g., the pick-and-place planner in Chapter 5. Therefore
it was decided that the return value from the solvers must be an object
that can hold multiple solutions. So, instead of returning just a single
robot configuration, inverse kinematics solvers return objects of the class
InvKinSolution, see Figure 3.9.
Conceptually, the robot class should provide the interface for the in-
verse kinematics. In the class diagram in Figure 3.7 we can see that this
is the case, as the class Robot provides the method Inverse. However,
because we want to vary the way we solve the inverse kinematics prob-
lem without affecting the robot class, all the interface in Robot does is to
forward the request to a solver object. The abstract class InvKinSolver
serves as a base class for all solvers. Because solvers need to have infor-
mation about the robot for which the inverse kinematics will be solved,
InvKinSolver provides the virtual method SetRobot, see Figure 3.9.
This method is mostly called by the robot object itself when it is given
a new solver. See the interaction diagram in Figure 3.10 for a detailed
description of how a Robot object is given a new solver. From the interaction
diagram we can see that the robot owns and maintains a copy of
the provided solver. Thus, we can say that we have effectively changed
the guts of the Robot object. This is an example of the Strategy Pattern
as described in [45].
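
The delegation described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the CoPP source: the classes are stripped-down stand-ins, Solve( ) is a hypothetical placeholder for the real solver call (operator( ) in Figure 3.9), and ownership is expressed with std::unique_ptr, which the original code need not use.

```cpp
#include <memory>

// Simplified, hypothetical stand-ins for the classes named in the text.
class Robot;

class InvKinSolver {
public:
    virtual ~InvKinSolver() = default;
    // Prototype-style copy, so that a robot can own a private solver instance.
    virtual std::unique_ptr<InvKinSolver> Clone() const = 0;
    virtual void SetRobot(const Robot& robot) { robot_ = &robot; }
    // Placeholder for the real solve call: returns the number of solutions.
    virtual int Solve() const = 0;
    const Robot* GetRobot() const { return robot_; }
private:
    const Robot* robot_ = nullptr;
};

// A solver that never finds a solution.
class InvKinNull : public InvKinSolver {
public:
    std::unique_ptr<InvKinSolver> Clone() const override {
        return std::make_unique<InvKinNull>(*this);
    }
    int Solve() const override { return 0; }
};

class Robot {
public:
    // The robot clones the provided solver and tells the copy which robot
    // it serves; the caller's solver object is left untouched.
    void SetInvKinSolver(const InvKinSolver& solver) {
        solver_ = solver.Clone();
        solver_->SetRobot(*this);
    }
    // Inverse only forwards the request to the current strategy object.
    int Inverse() const { return solver_ ? solver_->Solve() : 0; }
    const InvKinSolver* Solver() const { return solver_.get(); }
private:
    std::unique_ptr<InvKinSolver> solver_;
};
```

Because Robot only talks to the InvKinSolver interface, a new solver class can be introduced without touching Robot at all.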
So far we have seen how we can represent different solvers and how we
can assign a specific solver to a robot. To put this machinery to use, we
would like to have a mechanism that allows users to specify the desired
solver in the robot description file. Essentially, this is nothing but a map-
ping from text strings to solver objects. However, if we put this mapping
in a single file, adding new solvers will require users to read and modify
code that has nothing to do with the solver itself. Furthermore, as the
file containing the mapping creates a lot of unnecessary dependencies,
this solution is not so scalable when it comes to maintenance and compi-
lation times. If we require that adding a new solver should be as simple
as possible, we have to look for another solution. Instead of putting the
mapping in a single file, we can put it in a single object. Solvers that we
want to use will have to register themselves to this object, which will work
as a repository for the available solvers. To make sure all solvers register
[Figure 3.9, class diagram:
InvKinRepository: + RegisterSolver(name, solver), + GetSolver(name), +$ GetInstance( ) (diagram note: return *the_instance;), - InvKinRepository( ), $ InvKinRepository* the_instance; association: solvers (InvKinSolver).
InvKinSolver: operator( )(Transform p, InvKinSolution s), Clone( ), SetRobot(robot); friend of InvKinSolution.
InvKinSolution: + NumSolutions( ), + GetSolutions( ), - Clear( ), - AddSolution(config); num_solutions, RobotConfigs configs.
Classes derived from InvKinSolver: PumaKinematics, InvKinNull, InvKinGeneral.]

Figure 3.9: Class diagram for the inverse kinematic solvers. Note that
the constructor of InvKinRepository is private, which is key to enforce
a single unique instance.

to the same repository, we must enforce the repository to be a single,
unique object. Furthermore, we want to make it possible for solvers to
register themselves to the repository from anywhere in the program. The
Singleton Pattern meets all three of these requirements, as its intent is to
“ensure a class only has one instance, and provide a global point of access
to it” [45]. The class InvKinRepository, see Figure 3.9, uses the Singleton
Pattern to enforce a single instance, which maintains a dynamic mapping
from strings to specific solvers.
Using InvKinRepository, all available solvers can register themselves
once and for all at program startup, before main is entered. This way
we are guaranteed that all solvers are available immediately as we enter
main. Figure 3.11 shows an example of how a new solver, in this case
[Figure 3.10, interaction diagram. Participants: aClient, aSolver, aRobot, anotherSolver. Messages, in order: SetInvKinSolver(aSolver); Clone( ); new PumaKinematics(*this); SetRobot(*this).]

Figure 3.10: Interaction diagram showing how a robot object is given
a new solver for the inverse kinematics. Note that the robot object owns
the new solver.

namespace {
    const PumaKinematics prototype;
    const bool isRegistered =
        InvKinRepository::GetInstance().RegisterSolver("Puma560Kin",
                                                       prototype);
} // anonymous namespace

Figure 3.11: An example of how a specific solver for the inverse kinemat-
ics is added to the repository of available solvers. Note that this code is
executed at program startup, before main is entered. The solver can now
be retrieved anywhere in the program using the string “Puma560Kin”.

PumaKinematics, is added to the repository. It is important to note that
the code for registering the solver resides in the file containing the source for
PumaKinematics, and not in the file containing the main function. Thus,
it is the writer of the solver, and not the user, who is responsible for
registering it. The anonymous namespace in Figure 3.11 has the effect of
making the global variables prototype and isRegistered invisible outside
the source file of PumaKinematics.
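
A self-contained sketch of the repository and the self-registration idiom is given below. It is an illustration under assumptions, not the CoPP implementation: the solver interface is reduced to Clone( ) and a hypothetical Name( ) method, and the single instance is held in a function-local static object rather than behind the pointer the_instance shown in Figure 3.9.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical, reduced solver interface for illustration.
class InvKinSolver {
public:
    virtual ~InvKinSolver() = default;
    virtual std::unique_ptr<InvKinSolver> Clone() const = 0;
    virtual std::string Name() const = 0;
};

class InvKinRepository {
public:
    // The instance is constructed on first use, so solvers in other source
    // files can register safely during static initialization.
    static InvKinRepository& GetInstance() {
        static InvKinRepository the_instance;
        return the_instance;
    }
    // Stores a copy of the prototype; returns false if the name is taken.
    bool RegisterSolver(const std::string& name, const InvKinSolver& prototype) {
        return solvers_.emplace(name, prototype.Clone()).second;
    }
    // Returns a fresh copy of the registered prototype, or nullptr if unknown.
    std::unique_ptr<InvKinSolver> GetSolver(const std::string& name) const {
        auto it = solvers_.find(name);
        if (it == solvers_.end()) return nullptr;
        return it->second->Clone();
    }
private:
    InvKinRepository() = default;  // private: no second instance can be made
    InvKinRepository(const InvKinRepository&) = delete;
    InvKinRepository& operator=(const InvKinRepository&) = delete;
    std::map<std::string, std::unique_ptr<InvKinSolver>> solvers_;
};

class PumaKinematics : public InvKinSolver {
public:
    std::unique_ptr<InvKinSolver> Clone() const override {
        return std::make_unique<PumaKinematics>(*this);
    }
    std::string Name() const override { return "PumaKinematics"; }
};

namespace {
// Runs before main is entered, as in Figure 3.11.
const PumaKinematics prototype{};
const bool isRegistered =
    InvKinRepository::GetInstance().RegisterSolver("Puma560Kin", prototype);
}  // anonymous namespace
```

A parser for robot description files could then call GetSolver with the string found in the file and hand the result to the robot, without knowing which concrete solver classes exist.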
3.6 Chapter Summary

In this chapter we have presented a general robot model, particularly
suited for path planning applications. The Khalil-Kleinfinger notation [68]
is used to describe the kinematic structure of the robot. This
notation is an extension of the Denavit-Hartenberg notation [39] to allow a
consistent description of tree-like kinematic chains. The main features of
the robot model in this chapter are:

• Tree-like kinematic chains
• Multiple inverse kinematics solvers
• Hierarchical robot composition
• Joint couplings

Viewing a complex robot as a combination of simpler
robots opens up several possibilities. The modular approach allows a
robot’s definition to be spread over several files. With different files for
different robot hands, changing the end-effector tool of a robot becomes
as easy as changing an include file in the definition of the composed robot.
The most useful aspect of robot composition is that it allows the extrac-
tion of the degrees of freedom that are needed for the task at hand. If
we want a humanoid robot to look in a particular direction, we could for
example extract a subrobot “Head” corresponding to the head and the
neck. Another example is given in Chapter 5, where robot composition is
used in the context of pick-and-place tasks. During the approach phase,
the arm and the hand are considered as one robot. During the transport
phase, however, the joints of the hand must not move, and hence only
the arm is considered. Here robot composition is useful to specify which
degrees of freedom should be used.
A future improvement of the robot model would include a more gen-
eral concept of joint couplings. Currently, joint couplings only allow lin-
ear relationships between the controlling joint and the controlled joints.
There are, however, important cases of nonlinear couplings that would be
useful to consider. The slider-crank mechanism in Figure 3.12 is a funda-
mental machine element that can be found in, e.g., combustion engines,
door closing mechanisms, and in excavator arms. This mechanism can be
considered a planar, closed kinematic chain with one degree of freedom.
In excavator arms, this mechanism is used to convert a translational
motion, induced by a hydraulic cylinder, to a rotational motion. Given the
position of the hydraulic cylinder, there exist closed-form solutions for
the other joint angles.

Figure 3.12: A slider-crank mechanism common in, e.g., combustion
engines and excavator arms. The mechanism can be seen as a closed loop
kinematic chain with one degree of freedom.

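
As an illustration of such a closed-form coupling, the sketch below treats the textbook in-line slider-crank. The geometry is an assumption made for this example and need not match Figure 3.12: the crank has length r, the connecting rod has length l > r, and the slider moves along an axis through the crank pivot. The function names are hypothetical.

```cpp
#include <cmath>

// Forward relation: slider position x for a given crank angle theta (radians).
double SliderPosition(double r, double l, double theta) {
    const double s = r * std::sin(theta);
    return r * std::cos(theta) + std::sqrt(l * l - s * s);
}

// Closed-form inverse from the law of cosines on the triangle formed by the
// crank, the rod and the slider offset: l^2 = r^2 + x^2 - 2 r x cos(theta).
// Returns the solution with theta in [0, pi]; -theta is the mirrored one.
double CrankAngle(double r, double l, double x) {
    return std::acos((r * r + x * x - l * l) / (2.0 * r * x));
}

// Angle of the rod relative to the slider axis, from r sin(theta) = l sin(phi).
double RodAngle(double r, double l, double theta) {
    return std::asin(r * std::sin(theta) / l);
}
```

The relation between x and theta is clearly nonlinear, which is why a coupling of this kind cannot be expressed with the linear joint couplings currently supported.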
Other joint couplings could be used to enforce closure constraints, in
which case we get a robot with loops in the kinematic structure. It is
recommended, however, that closed-loop kinematic chains are modeled in a
class that is separate from the Robot class: Putting too much functionality
in one and the same class can make it monolithic, complicated, and hard
to use.
A file format and a parser have been developed to allow robot de-
scription files. An example of a description file is given in Appendix E.5.

Chapter 4

Augmenting RRT-Planners with Local Trees

Rapidly-exploring Random Trees (RRTs), introduced by LaValle in
1998 [78], have been recognized as a very useful tool for designing efficient
single-shot path planners. Bidirectional RRT-planners work by growing
two configuration-space trees towards each other; one tree is rooted at
the start configuration, and the other is rooted at the goal configuration.
These planners have been shown to be efficient for a wide range of problems
and they are even able to solve problems involving differential constraints.
In this chapter we present a method for augmenting bidirectional
RRT-planners with local trees. In the presented examples we will see that
the addition of local trees greatly improves the performance for problems
involving several narrow passages. For problems where local trees are not
beneficial, there is a performance degradation due to the time spent on
growing the local trees. However, compared to the performance gain for
the other examples, this degradation is small.
The next section will give a brief introduction to RRTs. Section 4.2
presents the new algorithm that uses local trees. This algorithm is im-
plemented using the framework described in Chapter 2. The resulting
planner is tested on some problems and the results are presented in Sec-
tion 4.3. The chapter ends with a summary and suggestions for future
research.

Figure 4.1: (a) If the path between qnear and qrand is collision free, then
qrand becomes the new vertex in the tree. (b) If the path to qrand is not
collision free, then qstop becomes the new vertex. Note that the random
configuration need not be collision free.

4.1 The RRT-ConCon Algorithm

The RRT concept was initially proposed as a tool for solving problems
involving differential constraints [78], i.e., kinodynamic planning. Shortly
after, Kuffner and LaValle [72] showed that RRTs were efficient for holo-
nomic systems as well.

Exploration Starting from a given configuration, an RRT T is incrementally
grown to efficiently explore the configuration space. In each
iteration, a random¹ configuration, qrand, is generated. From the vertex
in T that is closest to qrand (according to some appropriate metric M),
a new edge is grown towards qrand. If no collision is found on the way
towards the random configuration, then qrand becomes a new vertex in
T, see Figure 4.1 (a). If, on the other hand, qrand cannot be reached
because of an obstacle, then the new edge extends as close as possible
towards the obstacle. In this case, the stopping configuration is the new
vertex to be added in T, see Figure 4.1 (b).
It was shown in [78] that this method leads to a Voronoi-biased growth
of T. This means that vertices with a large Voronoi cell² have a larger
probability of being extended. This is a good property as large Voronoi
cells represent unexplored areas of the configuration space. It was also
shown in [72] that in the limit, as the number of vertices in T tends to
infinity, the coverage of the configuration space is uniform. Thus, in the
limit there is no bias.

¹ The configurations need not be random. As pointed out in [79], deterministic,
dense sequences can be used as well.

In the case of a dynamic system, connecting two configurations can
lead to a nontrivial control problem. For such problems, the growth
towards qrand is incremental, applying a constant control signal over some
time interval ∆t. Whether this strategy reaches qrand is not so important;
the growth of T is still Voronoi biased. For examples of kinodynamic
planning with this approach, see [78, 81, 82].

Bidirectional Search It is clear that just growing an RRT will not
solve any path planning problem; it will only explore the C-space. As
described in [79], the simplest way to use an RRT in a planner is to
introduce a bias in the random configurations, such that the goal
configuration is drawn with some probability pgoal. This means that the RRT
will both explore and try to reach the goal configuration. The probability
pgoal that determines the bias towards the goal configuration can be seen
as a control parameter that controls the planner’s behavior: A high bias
towards the goal gives a greedy planner that easily gets stuck. A low bias,
on the other hand, gives a planner that might spend too much time on
exploration.
Kuffner and LaValle [72] suggested the use of two trees instead of
one, where one tree is rooted at the initial configuration, and the other is
rooted at the goal configuration. This led to a very efficient bidirectional
planner, which they called RRT-ConCon. The RRT-ConCon algorithm
is shown in Figure 4.2. In an iteration, Ta is grown towards a random
configuration as described above, generating a new vertex, qnew. An edge
is then grown towards qnew from the vertex in Tb that is closest to qnew.
If this edge reaches qnew, the two trees are connected, and a solution is
found. In the next iteration, the roles of the two trees are swapped. Thus,
the two trees alternate between growing towards a random configuration
(exploration) and towards each other (connection).
The Connect function, see Lines 4 and 5 in Figure 4.2, first searches
an RRT T for the vertex that is closest to q according to some metric
M1 . Thereafter a local planner tries to connect the two configurations.
² Imagine constructing a high-dimensional Voronoi diagram from the vertices in T.

RRT_ConCon(qstart, qgoal)
1 Ta.Init(qstart); Tb.Init(qgoal);
2 while (num_nodes < NODES_MAX) do
3 GetSample(qrand);
4 Connect(Ta, qrand, new_node_a);
5 if Connect(Tb, new_node_a, new_node_b) then
6 return Path(Ta, Tb, new_node_a, new_node_b);
7 Swap(Ta, Tb);
8 end while
9 return Failure;

Figure 4.2: The RRT-ConCon algorithm. The function UniRand returns
a uniformly distributed random number in [0, 1]. Note that the
function GetSample need not return a random configuration; it can as
well represent a deterministic sequence, as pointed out in [23, 79].

If Connect fails due to, e.g., a collision, it adds the last collision free
configuration to the tree T and returns false. If no collision was found,
the two configurations were connected and the function returns true. The
metric M1 is hereafter called the nearest-neighbor metric.
Note that it is possible that no new vertex at all is created by a call
to Connect. This happens if the vertex in T is so close to an obstacle
that no step is possible towards the other configuration. Thus, in a real
implementation, we would have to check that Connect actually created
a new vertex. To avoid cluttering the algorithm description, we have
chosen to omit such details.
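
To make the three possible outcomes of Connect concrete, the following sketch implements it for a point moving in the plane. Everything here is an assumption made for illustration: the types Config, Vertex and Tree, the straight-line local planner with a fixed step size, the linear nearest-neighbor search, and the ConnectResult return type are not taken from CoPP.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

struct Config { double x, y; };
struct Vertex { Config q; int parent; };
using Tree = std::vector<Vertex>;
using FreeTest = std::function<bool(const Config&)>;  // true if collision free

// Euclidean distance, playing the role of the nearest-neighbor metric M1.
double Dist(const Config& a, const Config& b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

// reached: q itself was added; added: some new vertex was created.
struct ConnectResult { bool reached; bool added; int new_node; };

// Assumes a non-empty tree and step > 0.
ConnectResult Connect(Tree& tree, const Config& q,
                      const FreeTest& is_free, double step) {
    // 1. Find the vertex closest to q (linear search).
    int nearest = 0;
    for (std::size_t i = 1; i < tree.size(); ++i)
        if (Dist(tree[i].q, q) < Dist(tree[nearest].q, q))
            nearest = static_cast<int>(i);

    // 2. Walk along the straight line towards q and remember the last
    //    collision-free configuration.
    const Config from = tree[nearest].q;
    const int n = std::max(1, static_cast<int>(std::ceil(Dist(from, q) / step)));
    Config last = from;
    bool moved = false;
    bool reached = true;
    for (int k = 1; k <= n; ++k) {
        const double t = static_cast<double>(k) / n;
        const Config c{from.x + t * (q.x - from.x), from.y + t * (q.y - from.y)};
        if (!is_free(c)) { reached = false; break; }
        last = c;
        moved = true;
    }

    // 3. No step was possible: the tree is left unchanged.
    if (!moved) return {false, false, -1};
    tree.push_back({last, nearest});
    return {reached, true, static_cast<int>(tree.size()) - 1};
}
```

A caller of this sketch would check the added flag before using new_node, which corresponds to the check that the algorithm descriptions omit.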

4.2 Local Trees

RRTs have been found to be a useful tool for building efficient single-query
planners, see, e.g., [72, 82]. However, if a problem requires passing
through a narrow passage, the RRT variants proposed so far often get
into difficulties. The “narrow passage problem” is of course nothing new
to probabilistic planning methods, but the problem has even more im-
pact on RRT planners than on PRM planners. That is because PRM
methods will save the rare, but valuable, samples that happen to fall in-
side a narrow passage, and sooner or later they will get connected to the
graph. RRT methods, on the other hand, will throw away a potentially
valuable sample if the active tree could not connect with it. Using uni-
form sampling, the planner will probably have to wait a long time until
such an important sample shows up again. The problem becomes even
more pronounced if the solution trajectory has to pass a series of narrow
passages: Once a tree is finished struggling with a passage, the next one
is waiting just around the corner. PRM methods, on the other hand,
have the advantage of being able to treat the narrow passages more or
less in parallel. In a recent paper by Akinc et al. [3], it was noted that in
environments with thin obstacles, like the “hole” problem in Figure 4.8,
RRT planners tended to produce many configurations that were stuck
near the obstacle. They also reported that for such situations, the cost
for a single RRT query approached the cost of the preprocessing phase
for the PRM method.
Based on these observations, it is clear that RRT methods need to take
better care of samples that fall into crucial, but hard to reach, regions.
The basic idea proposed here is to let important samples that
cannot yet be reached spawn a new tree. Such a tree will be called a
local tree, and the start and goal trees will hereafter be denoted global
trees. A local tree is also an RRT, and as it grows it will eventually reach
outside the “hard to reach” region and merge with one of the global trees.
Even two local trees can be merged together if they get close. There are
three important issues that arise when implementing this idea:

• When should a sample be allowed to spawn a local tree?

• How often should the local trees be allowed to grow?

• How often should the planner look for inter-tree connections?

In the extreme case, letting every sample start a new RRT and then
connecting it to, say, the nc closest vertices, the algorithm would simulate
the behavior of the PRM approach. This was pointed out by LaValle and
Kuffner [82]. In this case it is clear that the RRT approach would lose
its qualities as an efficient single-query planner, and we would be better
off using a PRM-based method right away. So clearly, there is a tradeoff
between how much time the planner should spend on the two global
trees and on the local trees, respectively. In the next section, effective
heuristics, addressing the three issues above, will be discussed. A new
algorithm, called RRT-LocTrees, is also presented.
4.2.1 The RRT-LocTrees Algorithm

For a planner using local trees, when to create them is of course one of
the most important issues. As discussed in the previous section, only
important samples should be allowed to create a local tree. Therefore
we need some rule establishing the importance of a sample. If a sample
is collision free and can be reached by neither the global trees nor
the local trees, then the sample is considered a candidate for creating
a new RRT. These requirements are very similar to the Visibility-PRM
approach of Siméon et al. [137], where nodes that are not “visible” from
other parts of the graph become guards. As these requirements are rather
weak, the number of local trees would rise very quickly. To avoid having
too many trees, we set an upper limit on the number of local trees, Nloc .
The immediate question is of course how the choice of Nloc affects the
planner and how it is related to the dimension of the configuration space.
As it turns out, a good value for Nloc is not so dependent on the dimension
of the configuration space, but rather on the number of difficult passages
the solution trajectory has to pass. As RRTs are very effective at quickly
covering large, relatively open areas of the configuration space, local trees
in those areas will quickly connect with another tree and disappear. So,
it is likely that the local trees will get more and more concentrated to the
hot spots of the problem. Thus, Nloc roughly tells us how many narrow
passages we can treat in parallel.
To check if a sample q is reachable from a tree T , we do a reachability
check using the Connect function. As Connect may add a new vertex
to an RRT, it is clear that just performing the reachability check will
also grow the local trees, as long as a single step could be taken in that
direction. Thus, doing this reachability check for each sample would
waste too much time and nodes on the local trees. Therefore, if we already
have reached the maximum number of local trees, the reachability test
will be carried out only with probability pgrow. Thus, pgrow provides a
parameter for tuning the growth of the two global trees relative to the local
trees. Note that this parameter will only have effect once the maximum
number of local trees is reached.
So far we have only dealt with the issues when to create local trees
and when to grow them. However, without trying to connect them with
the global trees (or other local trees), they will not be of much use.
Since a local tree originates from a sample that was hard to reach, there
should be some kind of advancement before we try to connect it with
another tree: If a tree has advanced, it might have done so out of the
RRT_LocTrees(qstart, qgoal)
1 Ta.Init(qstart); Tb.Init(qgoal);
2 while (num_nodes < NODES_MAX) do
3 GetSample(qrand);
4 if not Connect(Ta, qrand, new_node_a) then
5 if (num_loc_trees < Nloc) or (UniRand() < pgrow) then
6 GrowLocalTrees(Ta, qrand);
7 if Ta.BoundingBoxGrew() then
8 for all Ti : Ti ≠ Ta, Tb do
9 TryMerge(Ta, Ti, new_node_a);
10 if Connect(Tb, new_node_a, new_node_b) then
11 return Path(Ta, Tb, new_node_a, new_node_b);
12 Swap(Ta, Tb);
13 end while
14 return Failure;

Figure 4.3: The RRT-LocTrees algorithm. The main difference from
the RRT-ConCon algorithm is due to Lines 4-9.

“hard to reach” area, making it reachable for other trees. The measure
of advancement chosen here is the volume of the bounding box of an
RRT in the configuration space. Thus, each time the bounding box of a
tree grows, the new node will be used for connecting this tree with every
other tree. Since it was this node that caused the growth of the bounding
box, it will in some sense be in the “frontier” of that tree. Experimental
results have shown this to be a very effective heuristic. However, it was
found out that an even greater effect was achieved if this heuristic was
extended such that inter-tree connections are tried also if the sample was
successfully reached by a local tree.
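
The bounding-box test is cheap to maintain incrementally. The class below is a hypothetical sketch of the bookkeeping behind a method such as BoundingBoxGrew; the class name, the method AddVertex and the representation of configurations as std::vector<double> are assumptions made for this example, and all vertices are assumed to have the same dimension as the root.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Axis-aligned bounding box of the vertices of one tree.
class BoundingBox {
public:
    explicit BoundingBox(const std::vector<double>& root)
        : lo_(root), hi_(root) {}

    // Extends the box to contain q and reports whether its volume grew.
    bool AddVertex(const std::vector<double>& q) {
        const double before = Volume();
        for (std::size_t i = 0; i < q.size(); ++i) {
            lo_[i] = std::min(lo_[i], q[i]);
            hi_[i] = std::max(hi_[i], q[i]);
        }
        return Volume() > before;
    }

    // Product of the side lengths; zero for a degenerate box.
    double Volume() const {
        double v = 1.0;
        for (std::size_t i = 0; i < lo_.size(); ++i) v *= hi_[i] - lo_[i];
        return v;
    }

private:
    std::vector<double> lo_, hi_;
};
```

Note that a box that is flat in some dimension has zero volume, so in this sketch growth is only detected once the tree has spread in every dimension; testing the individual side lengths instead would be one possible variation.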
The rules described above have been used to implement a new RRT
algorithm, called RRT-LocTrees. The main loop of the algorithm builds
on the RRT-ConCon algorithm [72], described in Figure 4.2. The new algorithm
is described in Figure 4.3. To simplify the description of the algorithms,
it is assumed that all trees, both global and local, can be accessed
by an index. Furthermore, the two first trees will always be the start tree
and the goal tree. Two important sub-routines, GrowLocalTrees and
TryMerge, are described in Figure 4.4 and Figure 4.5, respectively.
GrowLocalTrees(q, Tact)
1 if not IsSatisfied(q) then return;
2 for all Ti : Ti ≠ Tact do
3 reached = Connect(Ti, q, new_node_i);
4 if reached or Ti.BoundingBoxGrew() then
5 for all Tj : j > i, Tj ≠ Tact do
6 TryMerge(Ti, Tj, new_node_i);
7 if reached then return;
8 end for
9 if (num_loc_trees < Nloc) then
10 CreateLocalTree(q);
11 return;

Figure 4.4: Pseudo-code for the GrowLocalTrees sub-routine. Note
that Tact is either the start tree or the goal tree, depending on the current
direction.

TryMerge(Tc, Td, node_in_c)
1 if Connect(Td, node_in_c, new_node) then
2 Tc.Merge(node_in_c, Td, new_node);
3 delete Td;
4 end if
5 return;

Figure 4.5: Pseudo-code for the TryMerge sub-routine.

4.3 Experiments
To study the benefits of using local trees, two planners were tested on
a set of four path planning problems. The first planner uses the orig-
inal RRT-ConCon algorithm, while the second planner uses local trees
together with the heuristics presented in Section 4.2. Both planners were
implemented using the CoPP framework, described in Chapters 2 and 3.
The function Connect, see Section 4.1, uses a local planner to try to connect
two configurations. As the local planner, the class BinaryConnector,
see Section 2.8.3, was used. This local planner uses a metric M2 and
a given maximum step size to determine how densely a path segment
has to be checked for collisions. The metric M2 is here called the path
metric. If not mentioned otherwise, the BoundingBoxDispl metric from
Figure 2.11 is used as the path metric.

The software was compiled using Visual C++ and run on a 1.2 GHz
Pentium 3 processor. Only PQP was used for the collision detection.
For the experiments, the four problems in Figures 4.6-4.9 were used
to study the effect of using local trees. In particular, the effect of pgrow
has been examined. In all experiments, the maximum number of local
trees was kept constant at Nloc = 10. Because the required planning
time is not deterministic, each problem was run 100 times to get reliable
averages. Table 4.1 presents averages and the minimum and maximum
values of both the running time and the number of nodes used. Tests
with pgrow = 0 correspond to the RRT-ConCon algorithm.
Initial results for the experiments performed in this section have been
presented in [144]. There are, however, some differences compared to the
results in this section. These differences are mostly due to using an opti-
mized metric for each individual problem; for each problem, we chose the
metric that seemed to give the best results for the RRT-ConCon algo-
rithm. The sometimes large differences from the results in [144] show the
sensitivity of these methods to the choice of metric. They also underline
the importance of documenting all settings for a particular problem, so
that other researchers can do comparisons.

2D-maze A unit square with 2 DOFs must move from the lower
left corner to the upper right. Even though this problem is only
two-dimensional, RRT-ConCon has trouble because of the many narrow
passages: Each entry to a narrow passage poses a problem. None of them is
very difficult, but there are many of them. Thus, this is an ideal setting for
testing the approach presented here, as each local tree can grow in its
own corridor, waiting for another to come near. The width of each narrow
passage is 1.6 times the width of the square. The overall size of the
maze is 100 × 100. The problem is available in the MSL distribution. As
nearest-neighbor metric, the squared Euclidean distance was used. This
metric was also used as path metric.
From Table 4.1 it is seen that without local trees, the planner succeeded
in all 100 attempts, but the planning times are not so encouraging.
As expected, adding local trees has a very positive effect on this problem,
and the performance is rather insensitive to variations in the parameter
pgrow . The runs showed that the number of local trees quickly reached
the maximum value and then dropped again as the two main trees cov-
ered more and more of the maze. As can be seen in Figure 4.3, pgrow
will not have any effect as long as the maximum number of local trees
is not reached. Since Nloc is reached only for a very short time for this
problem, the lack of influence from pgrow can be explained.

C-maze A C-shaped object must move through a ’simple’ maze. The
regularly spaced squares in the maze will force the object to repeat a
translate-rotate-translate manoeuvre. The problem is adopted from the
MSL distribution. As the solution requires passing a large number of
difficult passages in series, it is expected that the local trees approach
will have a large impact on this problem.
As nearest-neighbor metric, the RingMetric from Figure 2.11 was
used. This metric correctly handles the S1 -topology for the rotational
degree of freedom. The total distance is a weighted sum of the transla-
tions and the rotation:

ρ(q1 , q2 ) = |qx1 − qx2 | + |qy1 − qy2 | + 4|RingDiff(qθ1 , qθ2 , −π, +π)|.

The function RingDiff is described in Figure 2.7, and in this case the
result will be in [−π, +π]. As the translational coordinates can vary in
[0, 100], the translational parts will on average give a larger contribution.
The low rotational weight has the effect of producing many motions that
are pure rotations, something that is beneficial for this problem.
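
The metric above can be written out as a short sketch. The implementation of RingDiff shown here is an assumption (the actual definition is the one in Figure 2.7); it is taken to return the shortest signed difference between two values on a ring [lo, hi], for inputs that already lie in that interval. The names Config3 and CMazeMetric are hypothetical.

```cpp
#include <cmath>

struct Config3 { double x, y, theta; };

// Assumed behavior: shortest signed difference a - b on the ring [lo, hi].
// For the interval [-pi, +pi] the result lies in [-pi, +pi].
double RingDiff(double a, double b, double lo, double hi) {
    const double period = hi - lo;
    double d = std::fmod(a - b, period);
    if (d > period / 2.0) d -= period;
    else if (d < -period / 2.0) d += period;
    return d;
}

// Weighted sum of the translational and rotational differences,
// with the rotational weight 4 used for the C-maze problem.
double CMazeMetric(const Config3& q1, const Config3& q2) {
    const double pi = std::acos(-1.0);
    return std::fabs(q1.x - q2.x) + std::fabs(q1.y - q2.y)
         + 4.0 * std::fabs(RingDiff(q1.theta, q2.theta, -pi, pi));
}
```

With this definition, two orientations of 3 rad and −3 rad are about 0.28 rad apart rather than 6 rad, which is what handling the S1-topology correctly amounts to.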
As can be seen from Table 4.1, RRT-LocTrees again performs much
better than RRT-ConCon. Furthermore, as the maximum number of
local trees is reached over a long time, there is also significant influence
from pgrow ; increasing pgrow decreases the required planning time.

Hole A well-known test scene, see Geraerts and Overmars [46]. The
moving object is composed of four identical boxes of dimension 100 ×
20 × 20, such that the bounding box of the final object is 100 × 100 × 180.
The hole has dimension 100 × 100 and the plate thickness is 20. To get
through the hole, the object has to rotate in a complicated manner. Since
the passage in the configuration space will be long and winding, getting
a local tree inside the passage will greatly speed up the solution process.
It was noted that most unreachable samples actually corresponded to
configurations where the object was tightly pressed to the surface of the
plate. The local trees arising from such samples were merged with the
start or the goal tree in just a few iterations, and the maximum number of
local trees was seldom reached. That explains the lack of influence from
pgrow on this problem. Once a local tree was really inside the passage,
the solution was quickly found.
Figure 4.6: A maze with narrow corridors for a square with 2 DOFs.

In [144], Euler angles were used to represent the rotations. Here we
instead use quaternions, leading to better interpolation, see Figure 2.9,
and to a better metric. The nearest-neighbor metric is the QuatDist
metric, see Figure 2.11. For the translational part, Manhattan distance
is used, with each coordinate normalized to [0, 1]. For the rotational
part, the distance is related to the quaternion inner product, as given by
Equation (B.12). The rotational contribution is weighted by 0.1. Note
that as the bounding box of a quaternion does not make sense, we still
use Euler angles for the bounding box computations.

Manipulator A Puma manipulator has to move a large wrench from
the upper shelves to the lower shelves, see Figure 4.9. The clearance
between the shelves, starting from the top, is 300, 550 and 250,
respectively. The bounding box of the wrench is 680 × 240 × 80. Due to a
nearby wall and a cupboard, the workspace of the robot is very tight.
This is a problem where the new approach is not expected to gain much,
because most of the local trees will grow in places that are not
necessary to visit, e.g., between the second and the third shelves. Thus,
this is a good problem for examining the cost of using local trees on a
problem where they are not really beneficial. It is clear from Table 4.1
that the original algorithm does better in this case. In fact, as pgrow
increases, the performance degrades. As nearest-neighbor metric, a
weighted Manhattan distance was used, with unit weights for all degrees
of freedom.

Figure 4.7: A maze for a C-shaped object with 3 DOFs.

Figure 4.8: The rigid body has 6 DOFs and it has to pass through the
hole.

Even though the test suite is small, it seems that problems that gain
from local trees gain very much, even if pgrow is small. For problems
that do not gain from local trees, the performance degrades as pgrow
increases. It is therefore suggested that, when augmenting RRT planners
with local trees, using a small value of pgrow will greatly improve the
overall performance.

4.4 Chapter Summary

Figure 4.9: The 6 DOF Puma manipulator has to move the wrench from
the upper shelf to the lower shelf. Note the nearby wall and cupboard,
limiting the manipulator workspace.

Using the idea of local trees makes the planner more closely related
to 'classical' PRM-methods, while keeping the key ideas of the
RRT-methods: the Voronoi-biased growth of the trees and the incremental
connection with other nodes. The latter property is important for
nonholonomic systems. The important issues with this method seem to be
when to create a local tree, how often the local trees should be allowed
to grow, and when to look for inter-tree connections. Spending too much
time on local trees will make the planner lose its benefits as an
efficient single-shot planner. In this chapter, simple and powerful
heuristics have been proposed for these issues, with one parameter
controlling the amount of time spent on the local trees. Although the
experimental results are few so far, the trend seems to be that local
trees have a large impact on difficult problems, even if the parameter
pgrow is very small, say pgrow ≈ 0.05. This suggests that augmenting
RRT-planners with local trees, using a small value of pgrow, will have a
large effect on the overall performance.
According to one of the most efficient heuristics, inter-tree connec-
tions should be attempted whenever the bounding box of an RRT grows.
It could, however, be expected that the efficiency of this heuristic de-
creases as the dimension of the configuration space increases; for high-
dimensional configuration spaces, this heuristic can cause too many con-
nection attempts with other trees, as new vertices are likely to increase
the bounding box.
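The bounding-box heuristic mentioned above can be sketched as follows. This is a minimal illustration, not the planner's actual code: the class BoundingBox and the function CountConnectionAttempts are hypothetical names, and configurations are assumed to be plain coordinate vectors of equal dimension.

```cpp
#include <cstddef>
#include <vector>

// Axis-aligned bounding box of the nodes in one tree (hypothetical helper).
class BoundingBox {
public:
    explicit BoundingBox(const std::vector<double>& first_node)
        : lo_(first_node), hi_(first_node) {}

    // Enlarges the box so that it contains q.
    // Returns true if the box actually grew in some coordinate.
    bool Expand(const std::vector<double>& q) {
        bool grew = false;
        for (std::size_t i = 0; i < q.size(); ++i) {
            if (q[i] < lo_[i]) { lo_[i] = q[i]; grew = true; }
            if (q[i] > hi_[i]) { hi_[i] = q[i]; grew = true; }
        }
        return grew;
    }

private:
    std::vector<double> lo_;
    std::vector<double> hi_;
};

// Counts how many of the added nodes would trigger an inter-tree
// connection attempt under the "bounding box grew" rule.
int CountConnectionAttempts(const std::vector<std::vector<double>>& nodes) {
    if (nodes.empty()) return 0;
    BoundingBox box(nodes.front());
    int attempts = 0;
    for (std::size_t k = 1; k < nodes.size(); ++k) {
        if (box.Expand(nodes[k])) {
            ++attempts;  // a real planner would try to connect to other trees here
        }
    }
    return attempts;
}
```

The sketch also makes the dimensionality concern concrete: a new node triggers an attempt as soon as it exceeds the box in any single coordinate, so the more coordinates there are, the more likely it is that Expand returns true.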

pgrow      tmin     tmax     tavg     nmin     nmax     navg

2D-maze
0.0       262.2    983.3    550.7    15086    25449    19941
0.05        0.9      4.2      1.9      883     3314     1683
0.1         0.8      4.0      1.8      803     3131     1596
0.2         0.8      3.7      1.8      782     3055     1662

C-maze
0.0        83.7    727.3    278.7     9409    23637    15531
0.05       14.9     57.5     31.6     4676     9722     6901
0.1        11.3     39.7     19.4     3470     8687     5484
0.2         7.7     27.5     13.4     2962     7319     4604

Hole
0.0         0.3    686.5     42.2      274    36966     7547
0.05        0.5      6.9      2.1      405     4875     1753
0.1         0.6     12.8      2.5      452     7034     2070
0.2         0.2      8.3      2.1      101     5531     1760

Manipulator
0.0         1.3     31.3      9.5      175     2835      950
0.05        4.3     36.1     14.5      369     3260     1339
0.1         4.3     45.6     16.5      432     3939     1534
0.2         4.0     61.2     20.9      363     5335     1884

Table 4.1: Experimental results for the four problems and various values
for pgrow . Each problem was run 100 times. Note that pgrow = 0.0
corresponds to the original RRT-ConCon algorithm, which does not use
any local trees.

Balancing the Trees The path planning problems studied in this
chapter were balanced in the sense that it was not significantly more
difficult to add nodes to the start tree than to the goal tree, or vice
versa. Thus, applying RRT-ConCon or RRT-LocTrees led to global
trees that on average were of the same size. Many important problems
do not exhibit this property. Consider for example the Flange problem
(see footnote 3) in Figure 4.10, where the problem is to find a sequence
of motions that removes the flange from the pipe. Using RRT-LocTrees on
this type of problem typically leads to one tree becoming orders of
magnitude larger (in terms of number of nodes) than the other.
3 http://parasol-www.cs.tamu.edu/groups/amatogroup/benchmarks/mp/

Figure 4.10: The “Flange” benchmark. The goal is to separate the two
parts. Because the start configuration is severely constrained, an efficient
planner should focus more on the start tree.

In [79] a balanced bidirectional RRT-planner is described. It is similar
to RRT-ConCon, but it forces the two trees to have the same number of
nodes, thereby focusing more on the tree that has trouble exploring.
A suggestion for future research is therefore to develop a balanced
version of the RRT-LocTrees algorithm. The major problem is due to
the GrowLocalTrees sub-routine, which can cause one of the global trees
to grow “behind the back” of the other. We have done some initial exper-
iments with a balanced version, and it seems to be possible to combine
the strengths of RRT-LocTrees with those of the balanced planner de-
scribed in [79]. In these experiments, the Flange problem in Figure 4.10
was solved in about two minutes on average.
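To illustrate why balancing shifts effort towards the constrained tree, the following C++ sketch applies the rule "extend the tree with fewer nodes" to a toy model. It is neither the planner of [79] nor the balanced RRT-LocTrees variant mentioned above; all names are hypothetical, and the assumption that the start tree succeeds only on every start_period-th attempt is made purely for illustration.

```cpp
#include <cstddef>

enum class TreeId { Start, Goal };

// Balancing rule: always extend the global tree with fewer nodes
// (ties go to the start tree).
TreeId ChooseTreeToExtend(std::size_t start_nodes, std::size_t goal_nodes) {
    return (start_nodes <= goal_nodes) ? TreeId::Start : TreeId::Goal;
}

struct GrowthStats {
    std::size_t start_attempts;
    std::size_t goal_attempts;
    std::size_t start_nodes;
    std::size_t goal_nodes;
};

// Toy model of an unbalanced problem: the constrained start tree gains a
// node only on every start_period-th attempt (start_period >= 1), whereas
// every attempt on the goal tree succeeds.
GrowthStats SimulateBalancedGrowth(std::size_t iterations, std::size_t start_period) {
    GrowthStats s{0, 0, 1, 1};  // each tree starts with its root node
    for (std::size_t i = 0; i < iterations; ++i) {
        if (ChooseTreeToExtend(s.start_nodes, s.goal_nodes) == TreeId::Start) {
            ++s.start_attempts;
            if (s.start_attempts % start_period == 0) ++s.start_nodes;
        } else {
            ++s.goal_attempts;
            ++s.goal_nodes;
        }
    }
    return s;
}
```

In this toy model with start_period = 4, the two trees stay the same size, but four out of five extension attempts are spent on the start tree, which is the intended effect of balancing.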
Chapter 5

Planning of
Pick-and-Place Tasks

In this chapter we will describe a planner capable of solving high-level,
real-life tasks. The tasks considered are of the pick-and-place type, which
are common tasks in both industrial robotics and service robotics. From
the user’s point of view, such a task could be specified simply as “Put
the rice box on the table”, hiding a lot of details such as how to grasp
the object, possible regrasping operations, and the pose of the arm when
the task is done. Allowing task specifications at this level will of course
increase the complexity of an already hard problem, and the question
is how this task can be decomposed, in a general manner, into several
simpler path planning tasks, without imposing unnecessary constraints
on the solutions.
Similar to other approaches, we decompose the task into three parts:
approach, transport, and return. A modified RRT-ConCon planner is
used to plan the motions for each part. The main improvement is that
the planner is able to plan towards several goals at the same time. This
is particularly useful if several arm configurations can reach the same
grasp. Furthermore, each RRT-planner maintains its internal state be-
tween queries, allowing efficient backtracking if a previously chosen goal
configuration turns out to be a dead-end.
The next section will give an overview of previous approaches to solve
pick-and-place tasks and manipulation tasks in general. This is followed
by an overview of the proposed planner and its components. Section 5.3
presents the grasp generator concept. This concept is useful for separat-
ing the grasp planning process from the rest of the planner. In Section 5.4
we present a general preprocessing method that helps to make the problem
easier for the path planner. The method uses the inverse kinematics of
the robot to convert the start and goal configurations into configuration
space trees. Section 5.5 deals with how motions for the arm and hand
are generated. Section 5.6 presents a simple approach for handling task
constraints.
Due to the random nature of the algorithm, the produced paths are
often jerky and unnecessarily long. To be of real use, the solution must
be smoothed. Section 5.7 deals with methods for path smoothing.
In Section 5.8, we will show examples of various tasks solved by the
planner. Two of these examples involve task constraints. The chapter
ends with a summary.

5.1 Related Work

One of the first systems that automatically solved high-level manipulation
tasks was the Handey system [93, 62]. The system required as input poly-
hedral models of the robot’s links, static obstacles and the task object.
To locate the task object, Handey used model-based object recognition
together with a triangulation-based range finder. The combination of
polyhedral objects and a parallel-jaw gripper made it particularly easy
to generate stable grasps. To plan collision free motions, two planners
were used: Large arm movements were planned using a discretized ver-
sion of the configuration space, where the last three joint angles were
kept constant. For small motions near the task object, a potential field
planner was used, constraining the gripper’s motion to the grasp plane
(see footnote 1).
If necessary, the system could insert a planned sequence of regrasping
motions to solve the task.
In his thesis, Kuffner [71] proposed a pick-and-place planner suit-
able for use in animation of autonomous agents. The emphasis was put
on natural-looking motions and near realtime behavior. In one of the
demonstration applications, a virtual chess player planned and executed
commanded moves in near realtime. The decomposition of the task is
the same as that in [93], but here all motions are planned using Rapidly-
exploring Random Trees (RRTs) [78, 72], leading to a more robust and
1 In [93] the grasp plane is defined as a plane parallel to the faces being grasped
and midway between them. We use a different definition in Chapter 6.

less heuristic planner (no restrictions on the last three joints, no dis-
cretizations of the configuration space).
Nielsen and Kavraki [109] extended the Probabilistic Roadmap
(PRM) framework to handle manipulation planning. Their approach
uses the manipulation graph, whose edges can be transfer paths, or tran-
sit paths. A transfer path corresponds to a movement where the task
object is grasped and moves along with the robot. Accordingly, a tran-
sit path is a path where the object is left in a stable position and only
the robot is moving. The generality of the approach makes it possible to
solve tasks that require several regrasping motions. However, the planner
has to be initialized with a user-defined set of grasps and stable object
placements.
Whereas the approach in [109] used a discrete set of grasps and ob-
ject placements, the recent approach by Siméon et al. [134] could handle
continuous sets. A continuous set of grasps assumes that the set can be
parameterized by some coordinates. This is easy in cases that involve,
e.g., a parallel-jaw gripper and a bar with rectangular cross section. The
example in [134] showed that the planner is capable of finding solutions
to problems that require long sequences of regrasping motions.
The planner presented in this chapter has many similarities with the
one presented in [71]. However, as will be pointed out in Section 5.9, this
planner introduces several improvements over the one in [71].

5.2 Overview of the Planner

In this section we will give an overview of the proposed pick-and-place
planner. We will decompose tasks into a sequence of steps and introduce
the components that are necessary to solve each step. Each concept is
covered in more detail in the subsequent sections.
We will assume that a pick-and-place task can be decomposed into
the following sequence:

• Approach: Move the arm to the grasp configuration, while forming
  the hand to a preshape that is compatible with the grasp.

• Grasp: Close the fingers around the task object to form a stable
  grasp.

• Transport: Move the grasped object to a specified position, while
  keeping the hand configuration fixed.

• Release: Release the task object.

• Return: Move the arm to a specified or default “home” configuration.

This decomposition is similar to that used in [93] and [71]. There is, how-
ever, a class of pick-and-place tasks whose solution cannot be described
by the proposed task decomposition; some tasks require the robot to
move the object to an intermediate position, and then regrasp it to
succeed. Figure 5.1 shows some sequences from a task where the robot is
supposed to reposition a cylindrical object. To succeed with the task,
the robot must place the cylinder at an intermediate position where it
can be regrasped, see Figures 5.1 (c) and (d). We will argue here that
tasks that need such regrasping operations can be solved if an interme-
diate pick-and-place task, instantiated by a high-level planner, is solved
first. Thus, any complex pick-and-place task can be decomposed into a
sequence of simpler tasks that follow the proposed task decomposition.
The reason for this distinction between simple and complex pick-and-
place tasks is that regrasping operations in general will require a lot of
semantic knowledge, specific to the environment and the task. First, a
regrasping operation might involve placing the object at some position
other than the start- or goal position. This requires knowledge about
which surfaces in the environment are suitable as support for the object.
Second, cluttered environments could require the robot to reposition an
obstacle, which requires knowledge about which obstacles are moveable.
For these reasons we think that planning of regrasping operations should
take place at a higher level than planning of the basic pick-and-place
task. However, we think it is important that the planner solving the
basic pick-and-place task is capable of communicating the reason for a
possible failure to the level above. This information can help the
higher-level planner to, e.g., look for a regrasping operation, or
reposition a moveable obstacle. In Section 5.3.3 we look at how
statistical information that is gathered during the grasp planning
process can be used to detect if a regrasping operation is needed.

Figure 5.1: A task that requires a regrasp operation, shown in panels
(a) to (e). A possible regrasp operation is shown in (c) and (d).

Grasp Generators The most critical decision for any pick-and-place
planner is which grasp to choose; the choice of grasp will greatly
influence the required arm motions. More importantly, for some grasps
there will be no solution at all. In Section 5.3 we will introduce the
concept of grasp generators. Grasp generators are used to generate an
ordered set of stable grasps, where each grasp satisfies the task
constraints at both the start and the end position. In case of failure
(i.e., the set of grasps is empty), grasp generators can provide
higher-level planners with sufficient information to modify the current
plan. As previously mentioned, a modification of the plan might involve a
regrasping operation, or repositioning an obstacle that blocks an
important path.

Modified RRT-Planner Once a grasp has been chosen, the arm
configurations at the pick-up and put-down positions can be found by
solving the inverse kinematics problem. Similar to [71], we use the
RRT-ConCon algorithm (see also Section 4.1) to find motions that
connect these configurations, with some important changes. First, we
consider multiple goal configurations instead of a single goal
configuration. This will cause the planner to try to connect the start
tree to several goal trees in parallel. Second, the start and goals are
no longer points in C, but trees. The old planner interface is a special
case of this new interface, because a point in C can be seen as a
configuration space tree with just one node. The possibility to
initialize the RRT-planner with entire trees will be extremely useful; as
described in the following paragraph, a specialized planner can be used
to preprocess the start and goal configurations into trees that will both
accelerate the RRT-planner and help it generate more efficient motions.
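The relation between the old point-based interface and the new tree-based interface can be sketched in C++ as follows. The types ConfigTree and PlannerQuery and the two helper functions are hypothetical and only illustrate the idea; they are not the interface of the actual planner.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Config = std::vector<double>;

// Minimal configuration-space tree: node i has parent parent[i],
// and the root has parent -1.
struct ConfigTree {
    std::vector<Config> nodes;
    std::vector<int> parent;

    // A single configuration is a tree with just one node.
    static ConfigTree FromPoint(Config q) {
        ConfigTree t;
        t.nodes.push_back(std::move(q));
        t.parent.push_back(-1);
        return t;
    }

    // Adds a node below parent_index and returns the index of the new node.
    int AddNode(Config q, int parent_index) {
        nodes.push_back(std::move(q));
        parent.push_back(parent_index);
        return static_cast<int>(nodes.size()) - 1;
    }
};

// Hypothetical planner query: one start tree and several goal trees.
struct PlannerQuery {
    ConfigTree start;
    std::vector<ConfigTree> goals;
};

// The old point-to-point interface expressed through the new one.
PlannerQuery MakePointQuery(const Config& start, const Config& goal) {
    PlannerQuery q;
    q.start = ConfigTree::FromPoint(start);
    q.goals.push_back(ConfigTree::FromPoint(goal));
    return q;
}

// One start configuration and several alternative goal configurations,
// e.g., several arm configurations that reach the same grasp.
PlannerQuery MakeMultiGoalQuery(const Config& start, const std::vector<Config>& goals) {
    PlannerQuery q;
    q.start = ConfigTree::FromPoint(start);
    for (const Config& g : goals) {
        q.goals.push_back(ConfigTree::FromPoint(g));
    }
    return q;
}
```

A preprocessing step can then replace any of the one-node trees with a larger tree without changing the planner's entry point.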

Retract Planners Each transition in the proposed task decomposition,
e.g., from approach to grasping, will involve very constrained motions
near obstacles. Consider Figure 5.2, where the hand is about to
grasp a cylinder; in the close vicinity of the cylinder, the hand is
constrained to move in directions that are spanned by the y-axis and the
negative z-direction of the end-effector frame. If we already know the set
of admissible motion directions close to the transition, it seems a waste
of resources (time and computer memory) not to pass this information
to the RRT-planner. We therefore introduce the useful concept of retract
planners. A retract planner will, starting at a given grasp, move away
from the object as far as possible along prescribed workspace directions.
The result will be a tree, rooted at the grasp configuration. Such trees
will hereafter be called retract trees. See Figures 5.4 and 5.5 for examples
of projections of retract trees to the workspace. Note that each node in
a retract tree requires solving the inverse kinematics problem. In Sec-
tion 5.4 we will discuss retract planners in more detail. We will also show
that retract planners are useful for all transitions in the pick-and-place
task.

Figure 5.2: In the vicinity of the cylinder the hand can only move in the
directions spanned by the y-axis and the negative z-direction. Movement
in the x-direction is blocked by collisions between the fingers and the
cylinder surface.

Task Constraints Real-life tasks will often involve additional
constraints that are task specific. A natural example is the robot-butler
scenario, where the robot is about to serve a cup of hot tea. Here we
have an orientational constraint that, at the risk of lawsuits, had
better not be violated. In Section 5.6 we will discuss how constraints can
be handled, and in Section 5.8 we will show two examples of tasks that
involve constraints.

Path Smoothing The final solution can be expressed as a concatenation
of the approach path, the transport path, and the return path. As
is the case with most randomized planning algorithms, the solution path
will be jerky. In fact, if the problem contains some difficult passage, the
solution path will often exhibit a long sequence of 'random walk'
behavior in the vicinity of the difficult passage. To be of any practical
use, such sequences have to be removed and the resulting path has to be
smoothed. In Section 5.7 we will discuss two methods that effectively
remove unnecessary via-points and smooth the solution path.

5.3 Grasp Generators

The solution of a pick-and-place task must inevitably involve some kind
of grasp planning. Most approaches to grasp planning put the effort in
finding one optimal grasp, where optimal is with respect to some measure
of grasp stability or grasp manipulability. However, this approach fails
to recognize the constraints imposed by the task; the chosen grasp might
be very stable but cause the path planner to fail to plan a valid transport
path. To let the grasp planner take the task constraints into account is
not a solution, as that would make the grasp planning problem as hard
as the original problem we wanted to solve and would also be against
our strategy of decomposing the task. The approach proposed here is to
have grasp planners that produce a, possibly ordered, set of grasps that
are locally valid for the task. Here we define a grasp to be locally valid
if it satisfies the following criteria:

• The grasp must be stable.

• The grasp must be reachable at start and end positions, i.e., there
must exist solutions to the inverse kinematic problem for the arm.

• The grasp must be collision free at start and end positions.

The produced grasps should also have the property that they are distinct
or far away from each other in some sense. As an example, when trying
to grasp a box, grasps that approach different sides of the box can be
considered to be distinct with respect to each other. A grasp planner
working according to these principles is hereafter called a grasp generator. Note
that the extra constraints in general simplify the grasp planning process
because they can be used to effectively limit the search space.

5.3.1 Algorithms Suitable as Grasp Generators

For a client using a grasp generator, it does not matter how the grasps
are found as long as it behaves as described above. Below we will look
at some ways a grasp generator can be implemented, which will lead
to a model for the implementation. Ideally, a grasp generator should
be capable of generating a set of feasible grasps given any hand-object
pair. However, developing such a grasp planner would be a formidable
task in itself. In fact, most papers on grasping simplify the problem by
ignoring the hand kinematics in the problem formulation. While this
approach has increased the mathematical understanding of the grasping
problem, its practical use is questionable: Of what use is an optimal grasp
if the hand cannot reach the planned contact points? A more pragmatic
approach would be to simply store together with each object a set of
precomputed grasps in a database. This approach was used by, e.g.,
Petersson et al. [117] for implementing a fetch-and-carry system, and by
Kuffner [71] in his thesis on animation of autonomous agents. A database
with precomputed grasps is fast and simple to implement, but it is only
viable if the number of objects handled by the planner is small and new
objects are rarely added.
Smith et al. [140] proposed an algorithm for computing grasps for a
parallel-jaw gripper on polygonal objects. The algorithm runs in O(n^3)
time to compute and rank O(n^2) grasps for an n-sided polygon. As this
algorithm satisfies our criteria, it can clearly be seen as a grasp generator.
Fischer and Hirzinger [42] proposed a fast randomized grasp planner
for finding fingertip grasps on an arbitrary 3D object for the four-fingered
DLR hand. In this method, contact points are generated by ray-casting
from a point inside the object. A set of heuristics is used to quickly
determine if the generated contact points lead to a good grasp. If the
points are good enough, another test checks whether the contact points
are reachable by the hand. The method is fast and it can easily be
adapted to hands other than the DLR hand. If ordering of the grasps
is not important, then the algorithm could be used right away as a grasp
generator. If, on the other hand, an ordered sequence of grasps is desired,
then the algorithm can be modified to generate n distinct grasps, which
are sorted according to some quality measure. A drawback with this method
is that it can only plan for fingertip grasps.
Strandberg [142] and Morales [106] presented two similar grasp plan-
ners for the three-fingered Barrett hand. Both planners take a two-
dimensional contour as input. The planner in [142], which is also de-
scribed in more detail in Chapter 6, produces an ordered set of grasps as
output and can thus be seen as a grasp generator.
Miller et al. [102] proposed a grasp planner for the Barrett hand. They
simplified the grasp planning process by using two models of the task
object: one accurate model, and one approximation made of primitive
shapes like cylinders, spheres and boxes. Humans simplify the problem of
finding an appropriate grasp by choosing a prehensile posture appropriate
for the object and the task. Inspired by this, Miller et al. [102] defined a
set of preshapes for the Barrett hand. These preshapes were used with
a set of rules to generate a set of grasp starting positions for a given
primitive shape. From each grasp starting position, the fingers were
closed around the accurate object model, and the resulting grasps were
evaluated and sorted using the grasping simulator GraspIt! [101, 99].
We use a similar approach for the examples in Section 5.8. However,
our approach is simplified by the assumption that the primitive shapes
exactly describe the task object.
In summary, there exist many approaches that can be used for imple-
menting a grasp generator, but a general one-for-all approach is yet to
come. Therefore it is desirable to have a modular approach so that the
grasp planning strategy can easily be changed without affecting the pick-
and-place planner. To achieve this goal we must define an interface to
which all grasp generators must adhere. This is done in the next section.

5.3.2 The Grasp Generator Interface

As seen in the previous section, the implementation of a grasp genera-
tor will vary and it is also likely to be specific for a particular hand or
gripper geometry. Since all implementations do the same thing, namely
produce a set of grasps, this makes an excellent case for the use of in-
heritance to encapsulate the varying behavior and maintain a uniform
interface. Therefore all grasp generators will inherit from the abstract
base class GraspGenerator, see Figure 5.3. The most important part
of the GraspGenerator interface is how clients retrieve the generated
grasps. From a client’s point of view, a grasp generator represents a,
possibly ordered, set of grasps of unknown size. This abstraction sug-
gests that the interface should only allow clients to sequentially iterate
over the set of grasps. Hence, the most important method in the interface
is NextGrasp, see Figure 5.3. With this interface, the number of grasps
and the way they are generated are hidden from the client.
It is clear that the NextGrasp method alone will not make the interface
complete as it does not provide the grasp generator with any information.
A grasp generator will need access to the following:

• A robot hand for the grasping analysis (unless the grasp generator
  is a database of precomputed grasps)

• A robot arm for reachability tests

• An object for collision checking

• A geometric object that models the task object

• The initial pose for the task object

• Valid end poses for the task object

GraspGenerator (abstract base class)
    + bool NextGrasp(grasp)
    + NumCollFreeAtStart( )
    + NumCollFreeAtGoal( )
    + NumReachableAtStart( )
    + NumReachableAtGoal( )
    # GraspIsCollFree(…)
    - DoGraspIsCollFree(…)
    # GraspIsReachable(…)
    - DoGraspIsReachable(…)
    num_reachable_start
    …

Pseudo-code note in the diagram:
    if(DoGraspIsCollFree()) {
        ++num_reachable;
        return true;
    }
    return false;

Derived classes: GraspGeneratorA, GraspGeneratorB

Figure 5.3: From the client’s point of view, a grasp generator is just
a sequence of grasps, which are retrieved with the NextGrasp method.
The base class uses the Template Method pattern [45] to automate the
process of collecting important statistics, such as the number of reachable
grasps that were found.

A natural approach is to provide the first four objects at the creation
of a grasp generator, whereas the last two items are provided via some
initialization method. Note that we allow the specification of multiple
end poses for the task object; if all other criteria are fulfilled, then it
is sufficient if a grasp is reachable and collision-free for at least one of
the provided end poses. The importance of multiple end poses will be
discussed in Section 5.5.1.
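From the client's side, the interface described in this section could look roughly like the following C++ sketch. Only the names GraspGenerator and NextGrasp come from the text; the types Pose and Grasp, the Init method, the class DatabaseGraspGenerator and the function CollectGraspIds are hypothetical. The hand, arm, collision object and task-object model that would be passed at construction are omitted for brevity.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Placeholder types; the real classes are not shown in this section.
struct Pose { double x, y, z; };
struct Grasp { int id; double quality; };

class GraspGenerator {
public:
    virtual ~GraspGenerator() = default;

    // Task-specific data supplied after construction (hypothetical name):
    // the initial pose and the valid end poses of the task object.
    void Init(const Pose& start_pose, std::vector<Pose> end_poses) {
        start_pose_ = start_pose;
        end_poses_ = std::move(end_poses);
        Reset();
    }

    // Writes the next grasp to 'grasp'; returns false when no grasps remain.
    virtual bool NextGrasp(Grasp& grasp) = 0;

protected:
    // Restarts the sequence of grasps for a new task.
    virtual void Reset() = 0;

    Pose start_pose_{};
    std::vector<Pose> end_poses_;
};

// Simplest possible generator: a list of precomputed grasps.
class DatabaseGraspGenerator : public GraspGenerator {
public:
    explicit DatabaseGraspGenerator(std::vector<Grasp> grasps)
        : grasps_(std::move(grasps)) {}

    bool NextGrasp(Grasp& grasp) override {
        if (next_ >= grasps_.size()) return false;
        grasp = grasps_[next_++];
        return true;
    }

protected:
    void Reset() override { next_ = 0; }

private:
    std::vector<Grasp> grasps_;
    std::size_t next_ = 0;
};

// Client code sees only a sequence of grasps, not how it is produced.
std::vector<int> CollectGraspIds(GraspGenerator& gen) {
    std::vector<int> ids;
    Grasp g{};
    while (gen.NextGrasp(g)) {
        ids.push_back(g.id);
    }
    return ids;
}
```

Because the client depends only on the base class, a database-backed generator could be exchanged for a planner-backed one without changing the loop in CollectGraspIds.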

5.3.3 Feedback to Grasp Generator Clients

If a grasp generator fails to produce any valid grasps at all, it would
be useful for clients to know the reason for this. With such information
clients can modify the original plan in order to find a solution. We have
found that useful information can be gathered during the generation of
the set of grasps. Therefore GraspGenerator maintains and provides the
following data:

• The number of stable grasps

• The number of collision free grasps found at the initial pose
• The number of reachable grasps found at the initial pose
• The number of collision free grasps found at the end poses
• The number of reachable grasps found at the end poses

If a grasp generator fails to produce any grasps, a client can diagnose the
cause of failure using the statistical data. For example, if the number of
reachable grasps at the initial pose is zero the object is probably too far
away to be grasped. If the arm is mounted on a mobile platform, the
client can try to plan for a better position of the platform. If the object
instead is reachable but there are no collision free grasps at the initial
pose, the object is probably in a very cluttered area of the workspace.
A modified plan could then involve repositioning (if possible) one of the
obstacles that are in the way. As a final example, if there are no collision
free grasps at any of the possible end poses of the object, the task could
be modified by
1. Repositioning (if possible) one of the obstacles that are in the way.
2. Inserting an intermediate position where the object is regrasped
(see Figure 5.1 for an example).
Thus, by collecting simple statistics during the grasp planning, important
information can be conveyed to the client.
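The diagnostic reasoning above can be written down as a small decision function. The following C++ sketch is only an illustration: the struct GraspStatistics, the enum FailureHint and the order in which the counters are inspected are assumptions, not part of the planner described here.

```cpp
// Counters corresponding to the data listed above (hypothetical struct).
struct GraspStatistics {
    int num_stable;
    int num_reachable_start;
    int num_coll_free_start;
    int num_reachable_goal;
    int num_coll_free_goal;
};

// Possible modifications a higher-level planner might try (hypothetical).
enum class FailureHint {
    MovePlatformCloser,    // object probably too far away to be grasped
    ClearObstacleAtStart,  // object probably in a cluttered area
    RegraspOrClearGoal,    // no collision-free grasp at any end pose
    Unknown
};

// Maps the statistics of a failed grasp generator to a suggested action.
// The order of the checks is an assumption made for this sketch.
FailureHint DiagnoseFailure(const GraspStatistics& s) {
    if (s.num_reachable_start == 0) return FailureHint::MovePlatformCloser;
    if (s.num_coll_free_start == 0) return FailureHint::ClearObstacleAtStart;
    if (s.num_coll_free_goal == 0) return FailureHint::RegraspOrClearGoal;
    return FailureHint::Unknown;
}
```

A higher-level planner would call such a function only after the grasp generator has returned an empty set of grasps.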
A drawback with this approach of gathering statistics is that it
requires implementers of the derived grasp generator classes to remember
to update every variable when appropriate. Because this would be very
error-prone, the base class should make sure that these variables
are updated at the right time in all derived classes. It is clear that the
statistics variables must be updated each time we check if a grasp is
collision free or if a grasp is reachable. We have here used the Template
Method pattern [45] to automate the process of updating all the
statistical variables. The pseudo-code in Figure 5.3 gives an example of
how it is done for one of the variables. This way we ensure that the
statistics are updated correctly for all derived classes and that there is
no (easy) way of circumventing this mechanism.
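A minimal C++ sketch of this use of the Template Method pattern is given here. The method names follow Figure 5.3 where possible; the suffix AtStart on the protected wrappers, the Grasp type and the class ToyGraspGenerator are hypothetical and serve only to make the example self-contained.

```cpp
struct Grasp { int id; };

class GraspGenerator {
public:
    virtual ~GraspGenerator() = default;

    int NumCollFreeAtStart() const { return num_coll_free_start_; }
    int NumReachableAtStart() const { return num_reachable_start_; }

protected:
    // Non-virtual wrappers: derived classes call these, and the
    // statistics are updated here, in one place.
    bool GraspIsCollFreeAtStart(const Grasp& g) {
        if (DoGraspIsCollFree(g)) {
            ++num_coll_free_start_;
            return true;
        }
        return false;
    }

    bool GraspIsReachableAtStart(const Grasp& g) {
        if (DoGraspIsReachable(g)) {
            ++num_reachable_start_;
            return true;
        }
        return false;
    }

private:
    // Customization points implemented by the derived classes.
    virtual bool DoGraspIsCollFree(const Grasp& g) = 0;
    virtual bool DoGraspIsReachable(const Grasp& g) = 0;

    // Private, so derived classes cannot modify the counters directly.
    int num_coll_free_start_ = 0;
    int num_reachable_start_ = 0;
};

// Toy derived class with arbitrary rules standing in for real tests.
class ToyGraspGenerator : public GraspGenerator {
public:
    // Returns true if the grasp passes both checks at the start pose.
    bool Accept(const Grasp& g) {
        return GraspIsReachableAtStart(g) && GraspIsCollFreeAtStart(g);
    }

private:
    bool DoGraspIsCollFree(const Grasp& g) override { return g.id % 2 == 0; }
    bool DoGraspIsReachable(const Grasp& g) override { return g.id < 3; }
};
```

Since the counters are private to the base class and the virtual functions are reached only through the non-virtual wrappers, an implementer of a derived class does not have to remember to update the statistics.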

5.4 Retract Planners

To grasp an object, the robot hand must of course move very close to it,
meaning that the hand and arm motions are very constrained right before
the object is grasped. Motion is often further constrained by surrounding
obstacles like, e.g., a table on which the object is standing. This means
that path planners will spend most of their time planning for the final
part of the approach path, where the motion is most constrained. Not
only is this wasteful, most randomized planners will generate unnatural
hand and arm motions in such constrained situations, zigzagging between
obstacles while approaching the goal configuration. This should not be
necessary as we often have a good idea about how the constrained motions
look in the workspace; typically the final part of the approach path will
be a straight line in the workspace that leads to the grasp configuration.
Furthermore, studies on human grasping have shown that humans early
form their hand into a prehensile posture that is kept until the (gradual)
transition from approaching to grasping begins, see, e.g., Arbib et al. [9].
Based on these observations, we propose to preprocess the goal config-
urations, i.e., the grasps, into configuration space trees. The branches
of such a tree will correspond to linear motions in the workspace with a
constant hand preshape. We believe that such a tree will not only ac-
celerate the following path planner, but also help it to find more natural
motions.
To preprocess a grasp configuration, we start with the hand at the
grasp configuration. The hand is opened to an appropriate preshape and
then moved away from the task object along linear workspace paths. The
number of paths and their directions are determined from a set of user-
defined directions that are initially given relative to the end effector. A
planner responsible for the preprocessing phase moves the arm and the
hand stepwise along each direction, until an object is hit or some joint
limit is reached. Each successful step will generate a node in the tree.
We will call the resulting tree a retract tree to emphasize that the hand
is moved away from the task object. Retract trees are generated by a
special type of planner that we call a retract planner.
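The preprocessing loop described above can be sketched as follows. This is only a minimal illustration in Python, not the CoPP implementation: the names `build_retract_tree`, `solve_ik`, and `is_valid` are hypothetical, the retract directions are assumed to be already expressed in the world frame, and only the end-effector position is stepped (the orientation and hand preshape are assumed to be held fixed inside `solve_ik`).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Config = Tuple[float, ...]          # joint values (hypothetical representation)
Vec3 = Tuple[float, float, float]   # workspace position or direction


@dataclass
class Node:
    config: Config
    parent: Optional[int]  # index of the parent node, None for the root


def build_retract_tree(
    root_config: Config,
    root_position: Vec3,
    directions: Sequence[Vec3],
    step: float,
    max_steps: int,
    solve_ik: Callable[[Vec3, Config], Optional[Config]],
    is_valid: Callable[[Config], bool],
) -> List[Node]:
    """Step along each workspace direction until a step fails.

    solve_ik(target, seed) is assumed to return None when no solution
    exists (e.g., a joint limit is reached); is_valid(config) is assumed
    to be the collision check.
    """
    nodes = [Node(root_config, None)]
    for direction in directions:
        parent = 0            # every branch starts at the grasp configuration
        seed = root_config    # previous solution seeds the next IK call
        for k in range(1, max_steps + 1):
            target = tuple(p + k * step * d
                           for p, d in zip(root_position, direction))
            config = solve_ik(target, seed)
            if config is None or not is_valid(config):
                break         # object hit or joint limit reached
            nodes.append(Node(config, parent))
            parent = len(nodes) - 1
            seed = config
    return nodes
```

Each branch is a simple chain, so the tree can be stored as a flat list with parent indices; the number of nodes then directly gives the size measure mentioned later in this section.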
Moving along directions that are specified in the workspace implies
that each node in a retract tree requires solving the inverse kinematics of
the robot arm. This is not a problem as each Robot object, see Chapter 3,
is assured to at least have a numerical implementation of the inverse
kinematics. It is clear, though, that each node in the retract tree,
which requires an inverse kinematics solution, is more expensive than a
node produced using the forward kinematics, as in PRMs and RRTs. However,
this is nothing but the ever-present tradeoff between computational cost
and node quality: here we pay slightly more for each node in the hope that
its placement will be much more effective than that of a randomly produced
configuration.
A retract planner is useful not only for finding paths for grasping an
object, but also for finding lift-off and put-down directions when holding
the object: When we put down an object on a planar surface, the last
part of the motion is often parallel to the normal of the surface and
the orientation of the object is roughly constant. Thus, retract planners
can be applied even to this case, with the only difference that we use
another set of retract directions and that the hand is closed. In the same
way we see that retract trees are useful in every transition of the pick-
and-place task: from approaching to grasping, from lift-off to transport,
from transport to put-down, and so on. Figures 5.4 and 5.5 show some
examples of computed retract trees. We see that the size of a retract
tree can be used as a rough measure of how constrained the robot motion
is at the root configuration of the tree.
If nodes of a retract tree reach into areas with much free space, the
job is made much easier for a subsequent RRT-planner as it will quickly
bridge large open areas and connect with the retract tree. Note that
a retract tree in no way limits or constrains the possible solutions; it is
just a guide whose advice can be followed or ignored. In this thesis we
use retract trees together with RRT-planners, but they could as well be
used with PRMs. In a PRM setting, retract trees can be seen as initially
disconnected components of the roadmap graph.

5.5 Planning of Arm and Hand Motions

Once a grasp has been chosen, all we have to do is solve three path
planning problems on the standard form move-from-A-to-B. This will
give us the approach path, the transport path, and the return path. To
plan these paths we could in principle use any of the numerous planners
proposed in the literature. We could use, for example, the Probabilistic
Cell Decomposition (PCD) planner proposed in [91], which has been shown to
be extremely effective for a large class of problems. However, we will here
use a variation of the RRT-ConCon algorithm, described in Section 4.1.
The reason for this choice is that this algorithm can easily be adapted in
several ways that are very useful in the context of pick-and-place tasks.
Most important are:

• It can handle multiple goals in parallel.

• It can be initialized with configuration space trees instead of single
points.

In the following sections, we will discuss why these modifications are
important in this context.

Figure 5.4: Retract trees with the hand in preshape configuration. Both
figures show the same grasp, but the arm configurations are different,
resulting in retract trees of different size. Note that some retract tree
nodes are hidden inside the arm links.

Figure 5.5: Retract trees with the hand in grasp configuration. In (a)
we see that without any surrounding obstacles, the resulting retract tree
becomes very large. In (b) two obstacles have been placed around the
task object, resulting in a much smaller retract tree.

5.5.1 Multiple Goals

Given a grasp, i.e., a hand configuration and a transformation that de-
scribes the pose of the hand relative to the object, we can use the inverse
kinematics of the arm to compute the necessary arm configuration. Sup-
pose now that the solution to the inverse kinematics problem is not unique,
i.e., there are several arm configurations that reach the grasp. Which one
should we choose? We could choose a configuration based on some heuris-
tic, such as the one that is closest to the start or the goal configuration.
However, the configuration we choose can be a dead end, meaning that
there exists no approach, transport, or return path. In [71] the solution
was to try another grasp if no path was found. This implies that each
RRT-query is given a limited amount of resources to find a solution, e.g.,
time or maximum number of nodes. If no solution is found, we instantiate
a new query towards another goal point. This approach has a number of
drawbacks. First, for each repeated query, significant time will be spent
on rebuilding the RRT rooted at the start configuration. Second, how
many resources should we give to each RRT-query? With too many resources,
we risk spending too much time on a dead end. With too few, we risk
preempting a query when a solution was just around the corner.
The solution to this dilemma is to consider all possible goals (or at
least a subset of them) at once. Thus, for each new node in the start
tree, we try to connect each goal tree to this node. The same holds for
the other direction; for each new node in a goal tree, we try to connect
the start tree to this node. This approach can be seen as looking
for several solutions in parallel. The cost is of course that we have to
build and maintain NG − 1 extra goal trees, where NG is the number of
goals. Experiments have shown that if one of the provided goals is easy
to reach, i.e., no narrow passages and only a few sharp turns, then the
RRT-planner converges so quickly that the time spent on the other goal
trees is almost negligible. The worst-case scenario for this approach is
when there exists a solution for each of the provided goals, but each
solution is very hard to find. In this case, testing each goal separately
would be faster.
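The connection strategy above can be illustrated with a small sketch. The code below is a simplified stand-in written for this text, not the RRT-ConCon variant used in the thesis: it works on point configurations with a Euclidean metric, checks collisions only at the nodes (so `step` is assumed to be small), and all names (`Tree`, `plan_multi_goal`, `is_free`, `sample`) are hypothetical.

```python
import math
import random


def steer(a, b, step):
    """Return b if it is within `step` of a, else the point `step` towards b."""
    d = math.dist(a, b)
    if d <= step:
        return b
    t = step / d
    return tuple(x + t * (y - x) for x, y in zip(a, b))


class Tree:
    def __init__(self, root):
        self.nodes, self.parent = [root], [None]

    def nearest(self, q):
        return min(range(len(self.nodes)),
                   key=lambda i: math.dist(self.nodes[i], q))

    def add(self, q, parent):
        self.nodes.append(q)
        self.parent.append(parent)
        return len(self.nodes) - 1

    def extend(self, q, step, is_free):
        """Take one step towards q; return the new node index or None."""
        i = self.nearest(q)
        new = steer(self.nodes[i], q, step)
        if new == self.nodes[i] or not is_free(new):
            return None
        return self.add(new, i)

    def connect(self, q, step, is_free):
        """Step towards q until it is reached (index) or blocked (None)."""
        i = self.nearest(q)
        while self.nodes[i] != q:
            new = steer(self.nodes[i], q, step)
            if not is_free(new):
                return None
            i = self.add(new, i)
        return i

    def path_to_root(self, i):
        path = []
        while i is not None:
            path.append(self.nodes[i])
            i = self.parent[i]
        return path


def plan_multi_goal(start, goals, is_free, sample, step=0.05,
                    max_iter=2000, rng=random):
    """Return (goal index, path) for the first goal reached, or None."""
    start_tree = Tree(start)
    goal_trees = [Tree(g) for g in goals]
    for _ in range(max_iter):
        # New node in the start tree: try to connect every goal tree to it.
        i = start_tree.extend(sample(rng), step, is_free)
        if i is not None:
            for g, tree in enumerate(goal_trees):
                j = tree.connect(start_tree.nodes[i], step, is_free)
                if j is not None:
                    return g, (start_tree.path_to_root(i)[::-1]
                               + tree.path_to_root(j)[1:])
        # New node in a goal tree: try to connect the start tree to it.
        for g, tree in enumerate(goal_trees):
            j = tree.extend(sample(rng), step, is_free)
            if j is not None:
                i = start_tree.connect(tree.nodes[j], step, is_free)
                if i is not None:
                    return g, (start_tree.path_to_root(i)[::-1]
                               + tree.path_to_root(j)[1:])
    return None
```

Note that a goal that is a dead end only costs the failed connection attempts towards its tree; the start tree is built once and shared by all goals.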
Multiple goals appear frequently in the context of pick-and-place
tasks, not just because of multiple solutions to the inverse kinematics
problem. As mentioned in Section 5.3, grasp generators are used to
generate a set of grasps that are compatible with the task. The approach
with multiple goals allows us to treat multiple grasps in parallel as well.
Real-life tasks will most likely involve a goal region for the task object,
rather than a single goal point. For example, a task specification like
“Put the book on the top shelf” will leave it to the service robot to
find suitable regions on the shelf, where it can place the book. Goal
regions also arise naturally for objects like cylinders, that have rotational
symmetry; if the rotation around the cylinder’s axis is irrelevant to the
task, then the goal position together with the possible orientations form
a goal region. The idea here is that goal regions can be approximated
with a few representative positions and orientations.
So, we have seen that allowing multiple goals is useful in situations
where we know where we start but are not sure about where to go. There
is of course a limit, where too many goals will make the query inefficient.
A general guideline is not to add a goal that is close to an already existing
goal. In the examples in Section 5.8, the number of goals varied between
one and twenty. The exact number depended on the number of specified
goal positions, the number of grasps and arm configurations for each
grasp. In Section 5.5.3 we will see that multiple goals are also useful in
case we reach a dead-end and have to backtrack, or if we are interested
in optimizing the solution.

5.5.2 RRT-Planners and Retract Trees

In Section 5.4 we introduced the concept of retract planners. They can
be used at each transition of the task to generate retract trees, whose
nodes lie along natural directions in the workspace. If this information is
passed to an RRT-planner we will gain the following:

• Accelerated planning: The retract tree will often lead out of narrow
and difficult passages, making the task easier for the RRT-planner.

• Natural transitions: If the RRT-planner connects to a branch of the
retract tree, the transition will look smooth and natural.

For these reasons we will generalize the interface to the RRT-planner,
such that it can be initialized with configuration space trees instead of
single points.
If our goal is to automatically generate natural-looking animations of
human figures, as in [71], there is a slight risk in using retract trees; if the
RRT-planner connects with the end of a very long branch of the retract
tree, then the generated motion could look unnaturally stiff, as if the
hand were guided by invisible tracks. Using the via-point removal
algorithm in Section 5.7.1, this risk is eliminated; if it is possible to connect
to a node further in on the branch of the retract tree, the algorithm will
find that node.

5.5.3 Backtracking
Assume that we have found an approach path and a transport path, but
the planner fails² to find a return path. That means that the chosen
arm configuration or object end position was a dead end, and we have
to backtrack to find a new transport path towards another end configu-
ration. With a minor change to the RRT-planner, we can reuse much of
the work done while planning the transport path. If the planner retains
its state (i.e., start and goal trees) when returning a solution, we can
easily go back and ask it to resume planning. If one of the goals turned out
to be a dead end, we can tell the planner to delete the corresponding goal
tree. If the planner has no goal trees left, then we backtrack one step
further, i.e., we try to find a new approach path. A backtracking step is
illustrated in Figure 5.6.
Thus, if we use multiple goals and retain the planner state, we can
deal more efficiently with backtracking events than if the goals were
treated sequentially. Furthermore, if we are interested in improving our
solution, we can resume planning for each remaining goal and keep the
best solution.

² Note that the notion of planner failure will again require that each RRT-query is
given a limited amount of resources. However, in this case we do at least know that
none of the provided goal configurations has a simple solution path.

[Diagram: start, grasp, release, and home configurations connected by the
approach, transport, and return paths]

Figure 5.6: For each transition the planner does a bidirectional search
towards multiple goals. Here the current release configuration is a dead
end, as the absence of dashed lines indicates that there exists no return
path. Therefore the planner must backtrack to find another transport
path. Because each RRT-planner saves its internal state from the last
query, a new path is likely to be found in shorter time than the first one.

5.5.4 Efficient use of Robot Composition

When planning the approach and return paths, we want to consider the
degrees of freedom for the arm and the hand together. That is, we want to
treat the arm and the hand as a single kinematic structure. The situation
is different when planning the transport path, because then we want to
keep the hand configuration fixed and just let the hand follow the motion
of the arm. This could be easily accomplished if we had a mechanism
for locking certain joints of a robot. However, such an approach would
require the pick-and-place planner to know which degrees of freedom
control the hand and which control the arm. Furthermore, as we have
already seen in Section 5.3, grasp generators deal with the arm and the
hand separately. Thus, we are in fact working with three different robots
(or different views of the same robot): an arm, a hand, and a robot that
is a combination of the arm and the hand. Here the robot composition
mechanism introduced in Chapter 3 comes to good use; the planner works
with a robot that is a composition of the arm and the hand, as shown
in Figure 5.7. Note that the robot Arm in Figure 5.7 does not have
to conform to the usual notion of a robot arm; it could as well be the
combination of a manipulator and a mobile platform.
So, when planning the transport path, the RRT-planner is given the
sub-robot that corresponds to the arm, thereby fixing the hand configu-
ration automatically. The fact that the hand and the task object move
together as a single object also opens up a useful optimization; if we
introduce a new geometric object that is the union of the task object and
the hand, then we could significantly reduce the number of moving parts.
This has proven useful for the examples in Section 5.8, as much time
was otherwise spent on checking for self-collisions between the numerous
finger links and the arm.

[Diagram: the composite robot ArmHand, composed of Arm and Hand]

Figure 5.7: Robot composition provides a convenient mechanism for
moving only parts of a complex kinematic structure. The arrow indicates
that moving Arm alone will cause Hand to follow as a single rigid body.

5.6 Task Constraints

Robot tasks often involve constraints that vary in nature from task to
task. As mentioned in Section 5.2, a typical scenario is a service robot
serving a cup of tea, a task with a simple orientation constraint.³ As
another example, consider a hyper-redundant robot that has to perform
a welding task in a tight environment. Here the tip of the welding tool
has to follow the prescribed path with high accuracy. The demands on
the tool orientation, however, are much lower; as long as the tool tip
follows the path, the tool’s orientation should be free to vary within
some specified tolerance. The orientation tolerance could be specified as
a cone emanating from the current tool-tip position, and centered around
a nominal tool orientation.

³ A more rigorous approach would also include acceleration constraints. However,
we assume here that the accelerations are low enough to be ignored.

How should we handle such constraints? The simplest approach would
be to see the constraints just as extensions of the C-obstacle concept.
Thus, every configuration is tested against the constraint with a binary
result. This approach makes it easy to test new types of constraints with-
out changing the planning algorithm; the new type of constraint is simply
represented by a class derived from the abstract class BinaryConstraint,
see Appendix E. As shown in Section 5.8, this simple approach works
well together with RRTs.
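To make the idea concrete, the following Python sketch shows how the cone-shaped orientation tolerance from the welding example could be expressed as a binary constraint. It is an illustration only: the actual BinaryConstraint interface is the C++ class described in Appendix E, and the method name `is_satisfied`, the class `ConeOrientationConstraint`, and the callback `tool_axis_of` (a forward-kinematics function returning the tool axis for a configuration) are assumptions made here.

```python
import math
from abc import ABC, abstractmethod


class BinaryConstraint(ABC):
    """A constraint that a configuration either satisfies or violates."""

    @abstractmethod
    def is_satisfied(self, config) -> bool:
        ...


class ConeOrientationConstraint(BinaryConstraint):
    """Tool axis must stay within `max_angle` (radians) of a nominal axis."""

    def __init__(self, tool_axis_of, nominal_axis, max_angle):
        self.tool_axis_of = tool_axis_of   # hypothetical FK callback
        self.nominal_axis = nominal_axis
        self.max_angle = max_angle

    def is_satisfied(self, config) -> bool:
        axis = self.tool_axis_of(config)
        dot = sum(a * b for a, b in zip(axis, self.nominal_axis))
        norms = math.hypot(*axis) * math.hypot(*self.nominal_axis)
        # Clamp to guard against rounding slightly outside [-1, 1].
        angle = math.acos(max(-1.0, min(1.0, dot / norms)))
        return angle <= self.max_angle


def config_is_valid(config, collision_free, constraints):
    """Treat constraints like extra C-obstacles: all tests must pass."""
    return collision_free(config) and all(
        c.is_satisfied(config) for c in constraints)
```

With this design, the planner only ever calls a single validity test, so a new constraint type can be added without touching the planning algorithm.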
In a more elaborate approach, constraints could support interpola-
tion between two configurations. If the interpolation succeeds, the in-
terpolated motion would satisfy the constraint. With constraint-specific
interpolation methods, it would be easy to include constraints in
the PRM-framework as well; using the constraint-specific interpolation
method instead of the traditional linear interpolation, many more con-
nections will be found, resulting in a denser roadmap.
In the MSL framework, each problem instance has an internal state
space model, a set of inputs, and an integration method. Clients can
change the internal state using the provided inputs together with the
integration method. With this general approach, problems with both
nonholonomic and dynamic constraints have been solved [78, 82].
Even with constraint-specific interpolation methods, PRM-methods
will suffer from drawing many samples that do not satisfy the constraint.
(RRT-methods will not suffer so much from this problem, as they do
not require the samples themselves to satisfy the constraint, see
Figure 4.1 (b).) A further enhancement of the constraint concept would be
to provide a constraint-specific sampling method as well. This approach
was used by Oriolo et al. [113] to solve path planning problems for
redundant robots, where the end-effector path was prescribed.
Thus, the optimal constraint object would not only answer whether
configurations satisfy the constraint, but also provide constraint-specific
interpolation and sampling methods. For the examples in this chapter,
though, we do not change the interpolation or sampling methods.

5.7 Path Smoothing

The solution from path planners that use randomization is often far from
optimal and not so appealing: The motions are jerky and the solutions of-
ten contain detours. Because the research has focused much on path plan-
ning algorithms per se, the postprocessing step, involving path length
reduction and smoothing, has often been neglected.⁴ However, if the goal
is to actually use the solution path, rather than just studying path plan-
ning algorithms for their own sake, then we must consider postprocessing
methods. Typical examples are computer animation, where we want nat-
ural looking motions, and an industrial workcell, where the emphasis is
on efficient and smooth motions.

⁴ The fact that omitting the postprocessing step results in much shorter planning
times can also have something to do with it.

In this section we will look at two methods that supplement each other
to produce a smooth solution path. The first method was developed by
Hsu et al. [58], and it effectively eliminates redundant via-points and
reduces the path length. Apart from describing the method in [58],
we also discuss two alternative termination criteria. The second method
converts a piece-wise linear path to a cubic-spline path, resulting in a very
smooth path with natural velocity constraints at each transition.
This method has to the author’s knowledge not been published before.
The idea is to first use the method from [58] to produce a short piece-
wise linear path. If the application demands smooth joint trajectories,
this path is converted, using the second method, to a smooth spline path.
The resulting joint trajectories are C²-continuous everywhere, except at
the transitions (e.g., from approaching to grasping).

5.7.1 Path Optimization

For industrial applications, it is desirable to find paths that can be exe-
cuted as quickly as possible. A natural cost function for a path is there-
fore the time it takes to execute it. Under the assumption that all joints
can reach their maximum speed in a negligible amount of time, the time
it takes to traverse a straight-line path in C from $p = (p_1, p_2, \ldots, p_n)$ to
$q = (q_1, q_2, \ldots, q_n)$ is given by

\[
\rho(p, q) = \max_{1 \le i \le n} \frac{|p_i - q_i|}{\sigma_i}, \tag{5.1}
\]

where $n$ is the number of joints and $\sigma_i$ is the maximum speed of joint $i$. In [58]
it was shown that the minimum-cost path is a path that is locally straight
at each point where it is not touching a C-obstacle. In fact, this holds
for all cost functions that are equal to the path length according to some
metric. This is due to metrics obeying the triangle inequality:

ρ(p, q) ≤ ρ(p, r) + ρ(r, q).

Thus, replacing intermediate via-points with a collision-free straight-line
sub-path will never lead to a longer path.
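As a small worked illustration of the cost function (5.1) and the triangle inequality, consider the following Python sketch. The function names `rho` and `path_cost` and the numerical values are chosen here purely for illustration.

```python
def rho(p, q, max_speed):
    """Traversal time of the straight segment p -> q, as in Eq. (5.1).

    The slowest joint (largest |p_i - q_i| / sigma_i) determines the time.
    """
    return max(abs(pi - qi) / s for pi, qi, s in zip(p, q, max_speed))


def path_cost(path, max_speed):
    """Cost of a piece-wise linear path: the sum of its segment times."""
    return sum(rho(a, b, max_speed) for a, b in zip(path, path[1:]))


# Two joints with maximum speeds 2.0 and 0.5 (arbitrary example values).
sigma = (2.0, 0.5)
p, r, q = (0.0, 0.0), (1.0, 0.0), (1.0, 0.5)

direct = rho(p, q, sigma)               # max(1.0/2.0, 0.5/0.5) = 1.0
detour = path_cost([p, r, q], sigma)    # 0.5 + 1.0 = 1.5
```

Here the direct segment costs 1.0 while the path through the via-point r costs 1.5, so removing r (when the direct segment is collision free) reduces the cost.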
The Shortcut algorithm in [58] recursively breaks a path $\gamma$ into two
sub-paths $\gamma_1 = (\nu_1, \nu_2, \ldots, \nu_{[N/2]})$ and
$\gamma_2 = (\nu_{[N/2]}, \nu_{[N/2]+1}, \ldots, \nu_N)$ and
checks whether $\gamma_1$ or $\gamma_2$ can be replaced by straight-line paths. However, the
Shortcut procedure alone will in most cases not give satisfactory results;
as shown in Figure 5.8, Shortcut can get stuck at an early stage. To
address this, extra via-points $\nu_l$ and $\nu_r$ are inserted around each via-point
$\nu_i$, $1 \le i \le N$. The points $\nu_l$ and $\nu_r$ are initially chosen as the midpoints
of $(\nu_{i-1}, \nu_i)$ and $(\nu_i, \nu_{i+1})$, respectively. If the line segment $(\nu_l, \nu_r)$ is not
collision free, then the points are iteratively moved closer to $\nu_i$ using a
bisection method. After the extra via-points are inserted, Shortcut is
applied again. The procedure stops when no further improvement, in
terms of the path cost, is obtained. The resulting algorithm, which is
called Adaptive Shortcut, is described in more detail in [58].

Figure 5.8: The dashed path is the result after one iteration of the
Shortcut algorithm in [58]. In this case, the algorithm will stop in the
next iteration because the dash-dotted path is not collision free, resulting
in a path far from the optimal path γ̂. The figure is adapted from [58].
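The recursive Shortcut step can be sketched in a few lines of Python. This is a simplified reading of the procedure, not the code from [58]: the sketch first tries to replace the whole (sub-)path by a single segment and only then splits it in the middle, and `segment_is_free` is a hypothetical collision test for a straight segment. The insertion of the extra via-points in Adaptive Shortcut is not shown.

```python
def shortcut(path, segment_is_free):
    """Recursively replace sub-paths by straight segments where possible.

    path            -- list of via-points (at least one element)
    segment_is_free -- callable (a, b) -> bool, True if the straight
                       segment from a to b is collision free
    """
    if len(path) <= 2:
        return list(path)
    if segment_is_free(path[0], path[-1]):
        return [path[0], path[-1]]
    mid = len(path) // 2
    left = shortcut(path[:mid + 1], segment_is_free)
    right = shortcut(path[mid:], segment_is_free)
    return left + right[1:]   # the middle via-point appears in both halves
```

Because the middle via-point is always kept when the full shortcut fails, the procedure can stop early in situations like the one in Figure 5.8, which is what motivates the extra via-points of Adaptive Shortcut.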
As shown in Section 5.8, Adaptive Shortcut performs well in terms
of speed and quality. However, it was found that running the algorithm
for many iterations tends to produce paths where the manipulator moves
close to obstacles. This is a natural result of Adaptive Shortcut because
it converges to a path that at some places is tangent to C-obstacles. Although
we are looking for short paths, we do not want the manipulator to move
unnecessarily close to obstacles; there must be some clearance between
the manipulator and the obstacles to allow for position and control errors.
Therefore it
is important to ensure that Adaptive Shortcut stops before the solution
path is made tangent to some C-obstacle. If we use collision detection al-
gorithms that are capable of solving distance queries, a natural approach
would be to simply add a distance threshold; if the distance between
the manipulator and an obstacle becomes smaller than this threshold,
then Adaptive Shortcut terminates. For manipulation tasks, however,
this appealing solution poses several implementation difficulties: A ma-
nipulation task inevitably involves moving close to obstacles. In fact,
when grasping an object, the distance to the closest “obstacle” is zero.
Furthermore, for the transport phase of the task, the distance between
the task object and the environment is zero at the beginning and the end.
In effect, the approach of using a single threshold for all potential colli-
sions will not work. Instead we must use different thresholds for different
pairs of objects. Furthermore, these thresholds will vary, depending on
which phase of the task we are in (approach, transport, or return).
A much simpler, but less reliable, approach is to monitor the rate
at which the path cost decreases: As the processed path gets closer
and closer to the locally optimal path, the rate of improvement gets
smaller and smaller. So terminating the algorithm before the rate of
improvement gets very small can avoid the problem of passing very close
to obstacles. Using the rate of improvement as a termination criterion
is of course nothing new, as it is used by almost all iterative numerical
methods. However, experiments have shown that at the early stages
Adaptive Shortcut can produce an almost zero improvement, followed by
a much larger improvement in the next iteration. Therefore we suggest
that the termination criterion require the rate of improvement to be small
for at least two successive iterations.
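The suggested termination criterion can be sketched as follows. The function and parameter names (`smooth_until_converged`, `improve`, `cost`, `rel_tol`, `patience`) and the default values are assumptions made for this illustration; `improve` stands for one iteration of Adaptive Shortcut.

```python
def smooth_until_converged(path, improve, cost,
                           rel_tol=0.01, patience=2, max_iter=100):
    """Iterate `improve` until the relative cost reduction has been below
    `rel_tol` for `patience` successive iterations."""
    previous = cost(path)
    small_in_a_row = 0
    for _ in range(max_iter):
        path = improve(path)
        current = cost(path)
        rate = (previous - current) / previous if previous > 0 else 0.0
        previous = current
        # A single small improvement may be followed by a large one,
        # so the counter is reset whenever progress resumes.
        small_in_a_row = small_in_a_row + 1 if rate < rel_tol else 0
        if small_in_a_row >= patience:
            break
    return path
```

With `patience=1` this reduces to the usual single-iteration test, which would stop too early in the situation described above.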

5.7.2 Spline Paths

Postprocessing the solution paths with Adaptive Shortcut as described in
the previous section results in motions that for most purposes are smooth
enough. A piece-wise linear path does, however, have velocity discontinu-
ities at each via-point. These discontinuities can have a negative influence
if the path is used to control a high-speed industrial manipulator: Veloc-
ity discontinuities can lead to induced vibrations and inefficient control
of the manipulator. To the human eye, the discontinuities are most ap-
parent at the transitions, where the manipulator goes from zero to a
finite velocity in an instant. To address these two issues, we would like
to have a path that has continuous velocity profiles for all the joints and
the correct joint velocities at the start and end points.
Assume that we have a collision-free piece-wise linear path with N
via-points and that the dimension of the configuration space is D. To get
rid of the velocity discontinuities, we would like to describe each joint
trajectory with a cubic spline; cubic splines have the appealing property of
passing exactly through the interpolation points with C²-continuity. Thus,
each joint trajectory will consist of N − 1 spline segments, where each
segment is described by four parameters, yielding a total of 4(N − 1)D
parameters to determine. The following constraints will determine the
spline parameters:

• Interpolation of each via-point (2(N − 1)D constraints)

• Continuity of velocity ((N − 2)D constraints)

• Continuity of acceleration ((N − 2)D constraints)

• Start and end velocity, usually zero (2D constraints)

So, in total we have 4(N − 1)D constraint equations, which is sufficient
to determine the spline path.
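For a single joint, the constraints listed above lead to a tridiagonal linear system in the second derivatives at the via-points. The following Python sketch solves that system with the standard Thomas algorithm and returns an evaluation function. It is an illustration under our own naming (`clamped_cubic_spline`, `solve_tridiagonal`), not the thesis implementation; for a D-dimensional path the function would be applied once per joint.

```python
from bisect import bisect_right


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are ignored (set to 0)."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def clamped_cubic_spline(t, y, v_start=0.0, v_end=0.0):
    """Interpolate (t[i], y[i]) with prescribed start and end velocities."""
    n = len(t) - 1                                   # number of segments
    h = [t[i + 1] - t[i] for i in range(n)]
    slope = [(y[i + 1] - y[i]) / h[i] for i in range(n)]
    lower, diag = [0.0] * (n + 1), [0.0] * (n + 1)
    upper, rhs = [0.0] * (n + 1), [0.0] * (n + 1)
    # Start velocity constraint.
    diag[0], upper[0] = 2 * h[0], h[0]
    rhs[0] = 6 * (slope[0] - v_start)
    # Velocity continuity at the interior via-points (acceleration
    # continuity is built into the second-derivative formulation).
    for i in range(1, n):
        lower[i], diag[i], upper[i] = h[i - 1], 2 * (h[i - 1] + h[i]), h[i]
        rhs[i] = 6 * (slope[i] - slope[i - 1])
    # End velocity constraint.
    lower[n], diag[n] = h[n - 1], 2 * h[n - 1]
    rhs[n] = 6 * (v_end - slope[n - 1])
    m = solve_tridiagonal(lower, diag, upper, rhs)   # second derivatives

    def evaluate(x):
        i = min(max(bisect_right(t, x) - 1, 0), n - 1)
        a, b = t[i + 1] - x, x - t[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6 * h[i])
                + (y[i] / h[i] - m[i] * h[i] / 6) * a
                + (y[i + 1] / h[i] - m[i + 1] * h[i] / 6) * b)

    return evaluate
```

The system is strictly diagonally dominant, so the elimination needs no pivoting and its cost grows only linearly with the number of via-points.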
It turns out that the resulting spline path in general is not collision-
free, even though the underlying linear path is so. As can be seen in
Figure 5.9, large deviations from the linear path can result in a spline path
that is not collision free. Here we propose a very simple iterative scheme
to resolve this problem: Every time a collision is found on the spline path,
a new via-point is inserted at the corresponding position on the linear
path. This extra via-point will cause the new spline path to follow the
linear path more closely and the risk of collision is reduced. The process
is repeated until the whole spline path is collision free. Experiments
have shown that this simple method performs well except for situations
where the robot moves extremely close to objects over longer periods; small
spline deviations from the linear path are certain to cause collisions and
therefore many extra via-points are needed to force the spline closer to
the linear path. As the number of via-points increases, so does the cost
for computing the spline curve. This cost could be reduced if we exploit
the tri-diagonal structure of the matrix representing the spline equations.
The current implementation, however, only uses an ordinary solver based
on LU-factorization.

Figure 5.9: The via-points of a piece-wise linear path can be interpo-
lated with cubic splines to generate a smooth path. In case the spline
causes a collision, extra via-points can be inserted on the corresponding
linear path segment.

The method proposed here can be used to produce spline paths for al-
most any path planning problem (rigid-body problems could benefit more
from an interpolation scheme based on quaternions, see Section 2.5.2),
but here we will mention some details that are specific to pick-and-place
tasks. As mentioned in Section 5.2, the solution to the task can be seen
as a concatenation of the approach, transport, and return paths. If we
apply the method described in this section to the concatenated path,
the most visible effect is that the manipulator smoothly accelerates and
decelerates at the start and the stop of the task, respectively. However,
with this approach we have no control over the velocities at the transi-
tions from approaching to grasping and from grasping to returning. If we
instead apply the method to the three sub-paths, before concatenating
them, then we have precise control of the transitions. Mostly we want
to have ease-in and ease-out behaviors, such that the robot comes to
a stop when grasping or releasing the object. In such a case the joint
velocities are zero at the transitions. If we are instead interested
in motions where the object is grasped and moved in a continuous mo-
tion, i.e., the robot does not stop during grasping, then the following
approach can be used: We form the end-effector velocity based on the
approach and departure directions.⁵ The magnitude of the end-effector
velocity is a user-determined value. Based on the end-effector velocity
and the manipulator Jacobian at the grasp configuration, the corresponding
joint velocities can be computed. Note that this assumes a non-singular
Jacobian. With these non-zero velocity constraints, we can now make the
robot grasp the object in one sweeping motion. The resulting path will
be C²-continuous everywhere except at the transition from approaching
to grasping, where it will be C¹-continuous.

⁵ This method assumes these two directions are roughly similar. Thus, it would not
work for a grasp from above, where the hand first moves downwards and then lifts the
task object.

As a final remark, converting a piece-wise linear path to a spline
path assumes that we have a time associated with each via-point. So
far we have not discussed how those time values are chosen. Under the
assumption that each joint i has the maximum speed σi and can
reach this speed in a negligible amount of time, we simply assign the time
duration for each straight-line path segment so that at least one joint
moves with maximum joint velocity in each segment. If the time for the
first via-point is given, the time values for the subsequent via-points can
be computed. There are of course other, more elaborate techniques for
choosing the time values for the via-points, but we point out that the
spline conversion method described in this section is independent of that
choice.
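The time assignment just described amounts to accumulating the segment durations along the path. A minimal Python sketch, with the hypothetical name `assign_times` and arbitrary example values, could look as follows.

```python
def assign_times(path, max_speed, t0=0.0):
    """Assign a time to each via-point of a piece-wise linear path.

    Each segment gets the shortest duration for which no joint exceeds
    its maximum speed, so at least one joint moves at full speed.
    """
    times = [t0]
    for a, b in zip(path, path[1:]):
        duration = max(abs(ai - bi) / s
                       for ai, bi, s in zip(a, b, max_speed))
        times.append(times[-1] + duration)
    return times


# Two joints with maximum speeds 2.0 and 0.5 (example values only).
via_points = [(0.0, 0.0), (1.0, 0.5), (1.0, 1.5)]
times = assign_times(via_points, (2.0, 0.5))
```

In this example the second joint is the limiting one in both segments, giving durations of 1.0 and 2.0 time units.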
Figures 5.10 (a) and (b) show a comparison between the piece-wise
linear solution and the spline solution for a pick-and-place task. Fig-
ure 5.10 (a) shows the trajectory for joint 4 of the robot arm, while
Figure 5.10 (b) shows the trajectory for one of the finger joints. The
circles are the via-points of the piece-wise linear solution, and the triangles
are the via-points of the spline solution. The triangles that do not co-
incide with a circle are thus extra via-points that were inserted because
the spline solution was not collision free. As expected, the extra via-
points are inserted near the transitions (shown with dotted lines), where
the hand is moving close to the task object. From the curves in Fig-
ure 5.10 it is also seen that the spline solution gives a zero velocity at the
beginning and the end of the task.

5.8 Examples
In this section, the pick-and-place planner is tested on some problems,
ranging from easy to difficult. For each example we present the average
time required for planning, smoothing, and spline conversion. If not
stated otherwise, the average values are based on 40 trials for each prob-
lem. The planning time includes the time for grasp planning and for
generating the retract trees.
As mentioned in Section 5.5.1, the planner is able to plan toward
several possible goals at the same time. For the approach phase of the
task, the goals are determined by the first ten grasps delivered by the
grasp generator.⁶ These grasps are known to be compatible with the
initial pose and the final pose(s) of the task object. For the transport
phase, the number of goals is determined by the task and the currently
active approach path. For the return path, the number of goals is equal
to the number of “home configurations” for the arm. In these examples,
the home configuration is equal to the initial configuration of the arm.
The results are presented in Table 5.1. All examples were run on a
1.2 GHz Pentium 3 processor, and PQP was used for collision detection.

⁶ In case no solution is found for these first ten grasps, the planner will backtrack
and ask the grasp generator for more grasps.

[Plots: joint angle θ [rad] versus time for (a) joint 4 and (b) a finger joint]

Figure 5.10: Comparison of the piece-wise linear solution and the spline
solution for a pick-and-place task. The triangles show extra via-points
that were inserted because the spline solution was not collision free. The
grasping time and the release time are shown by the dotted lines.

Figure 5.11: A simple task where the robot has to put the cylinder on
the table.

Example E1, Standing Cylinder The first example is very simple
and is used to establish a rough lower bound on the required planning
time. The task is to move a standing cylinder from the top shelf to the
table, see Figure 5.11. The first ten grasps from the grasp generator
are all around the cylinder axis. The results in Table 5.1 show that the
average planning time is 2.5 seconds, which includes grasp planning and
retract tree generation. In this case, the time needed for grasp planning
is negligible as it is only about 0.1 seconds. However, for difficult tasks
with only a few valid grasps, the time required for grasp planning can be
significantly larger.

Example E2, Fixture Task In industrial applications, a typical task
is to place a workpiece in some fixture for machining. Figure 5.12 (a)
shows a simple model of a fixture, in which a cylindrical workpiece is to
be placed. For this task, the robot must grasp the workpiece close to
one of the ends; a grasp on the middle of the cylinder would make the
insertion into the fixture impossible. The results for this task are shown in
Table 5.1. Note that the spline conversion results are missing in the table:
The tight fit between the cylinder and the fixture will cause the spline
path to collide repeatedly with the fixture. Hence, spline conversion was
not used for this task. If desired, the approach path and the return
path can still be converted to splines at a low cost.
This is an example that shows the usefulness of the retract trees and
the use of multiple goals; without the use of retract trees, the planner
would spend several minutes on inserting the cylinder into the fixture.
Furthermore, for the chosen grasp, the end position in the fixture is reach-
able with six different arm configurations. Some of these configurations
are bad for the task as a straight-line insertion into the fixture is not pos-
sible. This is shown in Figure 5.12 (b); the size of the retract tree (only
one node is visible) indicates that the range of a straight-line motion is
too small to remove the cylinder from the fixture. Figure 5.12 (c) shows a
retract tree for the same grasp, but with another arm configuration. This
arm configuration is clearly better as the retract tree alone can remove
the cylinder from the fixture. With the possibility to use multiple goals,
there is no need to decide which one of the six different arm configura-
tions to choose; the planner will first connect with the retract tree that
is easiest to reach.

Example E3, Long Rod The following example is considerably more
difficult than the previous ones. The robot has to move a long
cylindrical rod from the bookshelf and place it on top of two supports
on the table, see Figure 5.13 (a). Due to the length of the rod and its
awkward initial position, this task requires careful manoeuvring. The
thin pillars of the bookshelf and the surrounding shelves act as an effective
cage for the rod.
As seen in Table 5.1, this task takes considerably more time to solve.
Most of the time was spent on finding a transport path, i.e., moving the
rod from the bookshelf to the table. When the robot has grasped the rod,
see Figure 5.13 (b), its motion is constrained in every direction. Thus,
the approach of using retract trees has little effect on this problem. In
addition, for this example, the RRT-ConCon approach is not so efficient:
Due to the constrained motion of the rod, it is much harder to add nodes
to the start tree, than to the goal trees, causing the trees to become
unbalanced. An algorithm that tries to balance the trees [79] would be
more effective in this example, as it would spend more time on trying to
move the rod out from the bookshelf.

[Figure 5.12 panels: (a) Task specification, (b) Small retract tree, (c) Large retract tree.]

Figure 5.12: The robot has to insert the cylindrical object into the
fixture on the table. For the chosen grasp, the end position is reachable
with six different arm configurations. Figures (b) and (c) show that some
arm configurations are more suitable than others for this task.

Worth noticing in Figure 5.13 (b) is that the robot is closing the hand
and moving the arm forward at the same time. Hence, there is no abrupt
transition from approaching to grasping. This gradual transition from
approaching to grasping is due to the smoothing algorithm described in
Section 5.7.1.

Example E4, Orientation Constraint Orientation constraints arise
naturally in tasks that involve manipulating, e.g., liquid-filled cups. Here
we see such orientation constraints as equivalent to forcing a body-fixed
direction vector to coincide with a world-fixed direction vector. In real-world
applications it is sufficient if the angle between the two direction
vectors is less than some threshold. Thus, an orientation constraint spec-
ification involves two direction vectors and an upper limit for the an-
gle between them. Orientation constraints are represented by the class
OrientationConstr, which is derived from BinaryConstraint, see Fig-
ure E.2 in Appendix E.
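
The test behind such a constraint can be illustrated with a minimal Python sketch. It is not the OrientationConstr class itself: the function names, the use of a 3x3 rotation matrix for the body orientation, and the radian units are assumptions made for this example only.

```python
import math

def angle_between(u, v):
    """Return the angle in radians between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against round-off before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_angle)

def rotate(rotation, vec):
    """Apply a 3x3 rotation matrix (list of rows) to a 3D vector."""
    return [sum(rotation[r][c] * vec[c] for c in range(3)) for r in range(3)]

def orientation_satisfied(rotation, body_dir, world_dir, max_angle):
    """True if the body-fixed direction, expressed in the world frame,
    deviates from world_dir by at most max_angle (radians)."""
    return angle_between(rotate(rotation, body_dir), world_dir) <= max_angle
```

A constraint specification then consists of exactly the three items named above: the two direction vectors and the upper limit for the angle between them.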
To evaluate the planner’s performance in the presence of an orienta-
tion constraint, it was tested on the problem in Figure 5.14. The task
is to move a cylinder from the top shelf to the table. The motion must
satisfy an orientation constraint where the maximum deviation angle is
1 degree. Note that the cylinder radius is too large for a natural grasp
around the cylinder axis; to increase the span between the fingers, a hand
configuration with a spread angle of 90 degrees must be used. (See Fig-
ure 6.1 in Chapter 6 for an explanation of the spread angle.) Even though
the maximum allowed deviation angle is small, Table 5.1 shows that the
task does not take significantly longer to solve than Examples
E1 and E2. Making the task simpler by increasing the constraint
angle causes the required planning time to rapidly approach that for a
task without any constraint. A typical trajectory of the cylinder is shown
in Figure 5.14.

Example E5, Camera Constraint The following example is different
from the others in that it is not a pick-and-place task. The intent is to
test the performance of the RRT-planner on a rather non-trivial task
constraint. Instead of the Barrett hand, there is now a camera mounted
on the Puma end-effector, see Figure 5.15. We now also consider the
degrees of freedom for the mobile base, resulting in a total of nine degrees
of freedom. The task of the mobile robot is to move from one position
to another, while keeping a target object in the camera’s field of view.
This is a rather complex constraint, and it is satisfied only if all of the
following criteria are satisfied:
• The distance from the camera to the target is greater than dmin.
• The distance from the camera to the target is less than dmax.
• The center of the target is within the camera cone.
• The line of sight between the camera and the target center is not
occluded.

                              planning  smoothing  splines   total
E1  Standing Cylinder              2.5        3.0      1.1     6.6
E2  Fixture                        4.1        4.7        –     8.8
E3  Long Rod                     485.9       15.6      7.0   508.5
E4  Orientation Constraint        17.7        3.0     10.0    30.7
E5  Camera Constraint             40.8        1.8        –    42.6

Table 5.1: Average planning times, in seconds, for the different examples
in this chapter. Each example was run 40 times, except example E5, which was
run 100 times.

The camera cone is defined as a circular cone with opening angle θcone
and its apex at the camera lens. In Figure 5.15 (b) the camera cone is
shown as a transparent cone with length dmax. If rotations of the camera
image are undesirable, an additional constraint can easily be added to the
previous list. Without this constraint, the last degree of freedom is of
no importance as it only rotates the camera around its axis. Thus, the
effective dimension of the task is eight degrees of freedom.
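
The four criteria can be combined into a single test, sketched here in Python. This is an illustration rather than the implementation used in the experiments: the function signature is hypothetical, the opening angle is assumed to be the full cone angle (so the half-angle is used in the comparison), and the occlusion test is left to a user-supplied callback, for example a ray cast against the obstacle models.

```python
import math

def camera_constraint_satisfied(cam_pos, cam_axis, target, d_min, d_max,
                                cone_angle, is_occluded):
    """Check the four visibility criteria for a camera and a target point.

    cam_axis is the viewing direction, cone_angle the full opening angle in
    radians (an assumption), and is_occluded(cam_pos, target) a hypothetical
    callback that reports whether the line of sight is blocked.
    """
    to_target = [t - c for t, c in zip(target, cam_pos)]
    dist = math.sqrt(sum(d * d for d in to_target))
    if not (d_min < dist < d_max):             # distance criteria
        return False
    axis_norm = math.sqrt(sum(a * a for a in cam_axis))
    cos_off = sum(a * d for a, d in zip(cam_axis, to_target)) / (axis_norm * dist)
    off_axis = math.acos(max(-1.0, min(1.0, cos_off)))
    if off_axis > 0.5 * cone_angle:            # target center outside the cone
        return False
    return not is_occluded(cam_pos, target)    # line of sight must be free
```

The cheap geometric tests are placed first so that the occlusion query, which is likely to be the most expensive part, is only issued for configurations that pass them.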
In Figure 5.15 (a) the start position and the end position of the robot
are shown. The camera target is the sphere on the table. The task is made
more difficult by the two boxes surrounding the sphere: They will force
the robot to lift the camera high to avoid occlusion. There are also two
stools and a shelf that give rise to narrow passages for the platform.
Because the platform rotation is unbounded, this degree of freedom
has to be modeled as a ring joint, as described in Section 2.5. The
parameters of the constraint are chosen as dmin = 0.5 m, dmax = 1.4 m,
and θcone = 5.0°. This problem was solved using the RRT-LocTrees
algorithm in Chapter 4. The averaged computational times for 100 trials
are shown in Table 5.1. Figure 5.15 (b) shows a snapshot from a solution;
it is clearly seen how the box on the table forces a high camera position.

5.9 Chapter Summary

In this chapter we have presented an efficient and flexible planner for
pick-and-place tasks. Just as the planner in Kuffner’s thesis [71], this
planner also uses an RRT-planner to find the approach path, trans-
port path, and return path. An important difference is that the RRT-
planner in this chapter is modified to allow multiple goals and prepro-
cessed configuration-space trees as input. Multiple goals naturally arise
when, e.g., the same grasp can be reached with several arm configura-
tions. Instead of choosing one of these configurations, the planner can
treat them in parallel.
The retract planner concept has proven useful for preprocessing the
start and the end-configurations of a task. A retract planner uses the inverse
kinematics to move the end-effector (and the grasped object) along
predefined directions, thus converting a configuration to a tree. This pre-
processing stage helps the planner to find paths out from confined areas,
and it also helps to generate more natural transitions when grasping or
releasing an object.
The grasp generator concept separates the process of grasp planning
from the trajectory planning. In case of a failure, the grasp generator
interface can provide the higher-level planner with enough information
to formulate a different plan. This plan could, for example, involve a
regrasping operation.
The backtracking mechanism in combination with multiple goals makes
the planner very robust. The planner will, however, not be efficient for
tasks that require numerous regrasping operations. For such tasks, the
multiple roadmap approach of Siméon et al. [134] and Sahbani et al. [128]
would be a better choice.
Despite the very simple constraint model used, the examples have
shown that the planner is also capable of solving tasks that involve ori-
entation constraints.
Due to the random nature of the underlying RRT-planner, the result-
ing trajectories are jerky and will often contain unnecessary via-points.
Thus, smoothing is a necessary postprocessing step before the solution
can be used. The examples in this chapter showed that smoothing can
take a significant part of the total time; for the simplest tasks, smoothing
took even longer than the planning itself. It should be mentioned
though that the maximum step size was much smaller during smooth-
ing than during planning. With more efficient methods to check path
segments, the time needed for smoothing can be reduced.

A somewhat unexpected benefit from the smoothing process is that
stiff and “robot like” transitions from approaching to grasping could be
turned into soft and gradual transitions; a great improvement if the goal
is to produce smooth and natural looking motions.
To summarize, this chapter has covered the following topics:

• Multiple Goals

• Retract planners and configuration space trees

• Grasp generators

• Backtracking

• Task constraints

• Smoothing

• Splines

Figure 5.13: A task involving a long cylindrical rod.

Figure 5.14: A typical trajectory for a pick-and-place task with an
orientation constraint. The maximum deviation angle is 1°. Note that
the figure only shows the transport phase of the task.

Figure 5.15: (a) The robot has to move from the leftmost position to
the rightmost, while keeping the sphere within the camera’s field of view.
(b) To avoid occlusion, the robot must lift the camera high.

Chapter 6

Grasp Planning for a Three-Fingered Hand

Automated grasp planning is an important step towards autonomous systems
that are able to manipulate their environment.1 Formulating a general
and efficient grasp planning algorithm has proven to be a formidable
task. Therefore, researchers have often simplified the grasp planning
problem in various ways, e.g., by going from 3D to 2D, or by neglecting
the kinematic constraints. However, neglecting the kinematic constraints
has led to grasp planning algorithms that are difficult to use in practice:
Of what use is an “optimal” grasp if it cannot be reached by the
hand?
In this chapter we present a fast grasp planning algorithm that is de-
signed to be used on a real robot platform. Based on a 2D-contour, the
planner generates an ordered sequence of grasps where the emphasis is
on robust grasps. With robust we here mean that the properties of the
grasp should not change dramatically even in the presence of small per-
turbations in the position or the geometry. The proposed planner fulfills
the grasp generator requirements, as stated in Section 5.3, so it could be
used together with the pick-and-place planner presented in Chapter 5.
The next section describes some of the related work, and Section 6.2
provides the background and the problem formulation. The details
of the grasp planner are given in Section 6.3, followed by a section with
examples of planned grasps. The chapter ends with a summary and some
conclusions.

1 There are of course other modes of manipulation, e.g., pushing, but here we
concentrate on manipulation that involves grasping.

6.1 Related Work

Automated grasp planning is a difficult problem, and its importance for
autonomous robots has led to a rich literature in the field. The grasp
planning problem is often formulated as an optimization problem, but
constraints from many different domains, e.g., kinematic and friction
constraints, make it hard to solve. A common simplification has been
to neglect the hand kinematics and try to find the optimal positions for
a fixed number of contacts. An example of this is the work of Marken-
scoff and Papadimitriou [96], who derived methods for finding optimal
grasps for 2D polygons. However, neglecting hand kinematics has the
disadvantage that the “optimal” grasp might not be reachable by the
hand.
Another often neglected aspect in grasp planning is grasp robustness,
meaning that small changes in the object geometry or hand configuration
should not dramatically change the properties of the grasp. Any grasp
planner that is to be used for real-world tasks must take grasp robustness
into account in order to not fail due to model errors or noisy sensor
data. Nguyen [108] addressed this aspect by deriving independent contact
regions instead of contact points. As long as the contacts are inside
the computed regions, the grasp is stable. Formulating the solution as
a set of regions also increases the probability of finding a grasp that is
kinematically feasible.
Fischer and Hirzinger [43] suggested a randomized grasp planner for
the DLR hand. Random contact point candidates on the surface of the
object were generated using ray-casting. If a set of contact points is found
reachable, a grasp quality measure is computed. The planner is general
and fast, but it can only plan finger-tip grasps.
Miller et al. [102] proposed a grasp planner for the Barrett hand.
Given a decomposition of the object in terms of primitive shapes, a set
of rules was used to generate grasp candidates. Each candidate was
evaluated on the accurate object model using the grasping simulator
GraspIt! [101, 99].
The planner proposed in this chapter is also developed for the Barrett
hand, and it resembles the planner in [43] in that it generates hypotheses
which are tested for feasibility. Both planners are guided by heuristics to
make the search more efficient. A special feature of our planner is that
it explicitly searches for robust grasps. This is done by accumulating the
number of feasible neighbors, where neighbors are defined as samples in
configuration space that are close to the current grasp. Independent of
the work in this chapter, Morales et al. [106, 105] developed a very similar
grasp planner.2 From a theoretical point of view, the planner presented
here has the advantage of using more detailed kinematic models and
a more explicit robustness test, without necessarily slowing down the
planner. From a practical point of view, however, the planner in [106] has
the significant advantage of being tested on a real robot platform.

6.2 Problem Description

The Centre for Autonomous Systems has a custom designed mobile robot,
which is based on a Nomadics XR4000 platform. To be able to grasp
objects, the robot is also equipped with a PUMA 560 arm, a Barrett
hand, cameras and touch sensors. The platform can be seen in Figure 3.6
in Chapter 3.
The Barrett hand, from Barrett Technology, has three fingers and a
large planar palm surface, see Figure 6.1. The two joints on each finger
are coupled and driven by DC-motors inside the hand. Thanks to a clever
clutch mechanism, the outer finger link can continue to close even if the
inner link is blocked by an object. This way, the fingers can wrap around
an object, providing a secure grasp. Additionally, the fingers F1 and F2
can rotate symmetrically around the palm. This is called spread motion
and its value is denoted by ϕ, see Figure 6.1. The spread motion angle
can take values between zero and 180 degrees, allowing for a wide range
of gripper configurations. With eight axes and only four motors, the
Barrett hand is clearly underactuated. It will be shown later that the
mechanical couplings of the hand can be utilized to simplify the grasp
planning problem.
For tasks like object recognition, visual servoing and contour extrac-
tion, a CCD-camera is mounted between fingers F1 and F2 such that the
view plane is always parallel to the palm surface (see Figure 6.1). Note
that this is the only placement of the camera that both avoids obstruc-
tion of the fingers and occlusion of the camera. For contact detection,
the whole palm surface and parts of the fingers are covered with tactile
sensors.

2 Both planners were presented in October 2002, but at different conferences.

Figure 6.1: The figure shows the Barrett hand with all fingers fully
opened. The crosses denote each finger’s base position in the palm. Note
the CCD-camera mounted between fingers F1 and F2.

The main assumption in this work is that the outer 2D contour
of an object can be extracted using the camera mounted on the hand. This
contour is of course the projection of the object onto the view plane,
but from here on it will be assumed that the contour is the cross section
of a generalized cylinder, with a straight-line axis and a homogeneous
mass distribution. The cylinder assumption can seem very restrictive,
but thanks to the clutch mechanism of the hand, the proposed grasp
planner will also be able to handle many objects that deviate from this
assumption. However, objects that are tapered towards the view plane
normal, such that the smallest cross section is closest to the hand, will
always be difficult for the hand to grasp. Note that the view plane normal
can have any global orientation, but for most practical cases it will be
either horizontal or vertical.
In the starting position, the camera is looking at the top of a cylinder,
with the palm surface parallel to the cylinder top.3 Achieving a grasp
with a plane to plane contact between the cylinder top and the palm
will often imply a strong grasp, which is desirable. Therefore, the grasp
planner will keep the palm parallel to the cylinder top. Locking the direction
of the view plane normal this way reduces the degrees of freedom
(DOFs) for the wrist from six to four. Considering the wrist and the
hand together, the problem has a total of eight DOFs.

6.3 The Grasp Planner

The contour given to the grasp planner can either be a polygon or a spline
curve. In the latter case, the spline curve is adaptively approximated by
a polygon to a user-defined degree of accuracy. The resulting polygon is
parameterized in the arc-length parameter s. The advantage of using a
spline curve as input is that the radius of curvature at the contact points
can be used as an early, and cheap, quality indicator for the contact.
Besides, a spline representation is inherent in many contour detecting
algorithms, see, e.g., Blake and Isard [16].
The direction of gravity relative to the view plane will be used in the
grasp evaluation. Additional, but not required, information is the height
of the cylinder and depth constraints due to support surfaces. It is also
possible to specify parts of the contour that may not be touched because
of, e.g., nearby obstacles or support surfaces.
The next subsection gives an overview of the grasp planner, and the
following subsections describe important parts in more detail.

6.3.1 An Overview
Basically, the grasp planner searches the configuration space for kinematically
valid grasps until a termination criterion is fulfilled. As mentioned
in Section 6.2, the problem has eight DOFs, so performing an exhaustive
search will be time consuming. Here we reduce the search space in two
ways: We utilize the special kinematic structure of the hand and we use a
set of heuristic rules that help guide the search. Each tested configuration
is called a grasp hypothesis, and Figure 6.2 shows a grasp hypothesis
together with some of the configuration parameters. For clarity, a dashed
box representing the palm is also drawn.

3 For a horizontal camera orientation, the cylinder top actually corresponds to the
side of the object.

Figure 6.2: Relevant frames and variables for a grasp hypothesis. The
position of the thumb is specified by the arc-length parameter s. Each
thumb position has a copy of the polygon relative to the frame FT.

A grasp hypothesis is formed in several steps, and at each step there
are some requirements that have to be fulfilled. These requirements can
be divided into hard and soft requirements. If a grasp hypothesis
does not pass a step associated with a hard requirement, the
hypothesis is discarded and a new one is generated. If, instead, the
step was associated with a soft requirement, the hypothesis can be kept
as idle. This way we avoid throwing away a possible solution and can
quickly backtrack to idle hypotheses if stuck. Hard requirements are typically
kinematic constraints, while soft requirements are heuristic rules
that relate to the expected quality of the grasp. Threshold parameters
are difficult to choose: On the one hand, too low thresholds will generate
many grasps, of which many are bad. On the other hand, with too high
thresholds we risk throwing away the only feasible solutions. With the
distinction between soft and hard requirements we avoid the problem of
choosing various threshold parameters. The most important steps for
forming a grasp hypothesis are shown in Figure 6.3.

Place thumb on contour at position s
Transform polygon to thumb frame
Choose dT and ϕ
Close finger F1 → d1
Close finger F2 → d2
Determine z-coordinates
Check if grasp has OK neighbors
Evaluate grasp

Figure 6.3: Pseudo-code describing the steps for forming a grasp hypothesis.

It is important that we start with placing the thumb when we form
a grasp hypothesis; placing the thumb has the effect of fixating three
DOFs, thereby achieving a desirable reduction of the search space. After
a valid thumb position has been found, grasp hypotheses can quickly be
generated by choosing combinations of dT and ϕ and thereafter closing
fingers F1 and F2 . This can be seen as if we let F1 and F2 sweep along the
contour of the object. The coupling between F1 and F2 through ϕ allows
us to quickly discard hypotheses that are not kinematically valid: If no
valid contact is found when closing F1 , then there is no need for checking
F2 too, and the hypothesis can be discarded. This of course implies that
we only allow three-finger grasps. If we represent the polygon contour in
the current thumb frame, see FT in Figure 6.2, the computations for the
contact points for F1 and F2 become simple and fast.
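
The interplay between the stepwise construction and the hard and soft requirements can be sketched in Python as follows. This is only an illustration of the gating idea: the function names, the status strings and the representation of a step as a (check, is_hard) pair are assumptions, not the data structures of the actual planner.

```python
def classify_hypothesis(hypothesis, steps):
    """Run a hypothesis through an ordered list of (check, is_hard) gates.

    Returns "discarded" at the first failed hard requirement, "idle" at the
    first failed soft requirement, and "valid" if every gate passes.
    Later gates are never evaluated once an earlier one has failed.
    """
    for check, is_hard in steps:
        if not check(hypothesis):
            return "discarded" if is_hard else "idle"
    return "valid"

def sort_hypotheses(hypotheses, steps):
    """Split hypotheses into valid and idle lists; discarded ones are dropped."""
    valid, idle = [], []
    for hyp in hypotheses:
        status = classify_hypothesis(hyp, steps)
        if status == "valid":
            valid.append(hyp)
        elif status == "idle":
            idle.append(hyp)
    return valid, idle
```

Because the loop stops at the first failed gate, a hypothesis for which F1 finds no contact is rejected without ever closing F2, which mirrors the early rejection described above. The idle list is what the planner would return to if the valid list turns out to be empty.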

Depth Constraints
When all fingers have been closed, the only remaining degree of freedom is
a translation in the z-direction for the whole hand. Ideally we would like
to place the palm against the object to achieve a more stable grasp. Due
to depth constraints imposed by, e.g., the height of the cylinder and any
support surfaces, palm contact might not be possible. In such cases we try
to place the palm as close as possible to the object, without violating any
depth constraints. This determines the value of the last remaining DOF,
the z-translation. The process of determining this translation and the
z-coordinates is accelerated through the use of precomputed trajectories,
described in Section 6.3.4.

Efficient Search
A benefit of the stepwise computation of a grasp hypothesis is that we
postpone the comparatively expensive grasp evaluation by having cheap
validation gates (the soft requirements) between each step. As a result,
the last two steps in Figure 6.3 are rarely executed. Experiments have
shown that the last step in Figure 6.3 is seldom executed for grasps that
are not good. This result indicates that the soft requirements are good
at filtering out bad grasp hypotheses.
The resolutions used when stepping through combinations of dT and ϕ
are 2 mm and 1°, respectively. Performing a depth-first search, i.e., testing
every possible combination of dT and ϕ before moving on to the next
thumb position, would not be efficient. Instead a quick and coarse scan
is done over the most promising combinations and then a new thumb
position is generated. When the number of valid thumb positions has
reached an upper limit, we go back to the first thumb position again and
try less likely combinations. Likely combinations of dT and ϕ are deter-
mined from global contour characteristics such as size and eccentricity.
The definition and computation of the characteristics will be described
in Section 6.3.2.
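
The resulting scan order can be sketched in Python as follows. The sketch is a simplification made for illustration: the function name is hypothetical, the thumb positions are assumed to be known in advance rather than generated on the fly, and the combinations of dT and ϕ are assumed to have been grouped beforehand into passes ordered from most to least promising.

```python
def round_robin_scan(thumb_positions, passes):
    """Yield (s, d_t, phi) triples in coarse-first order.

    thumb_positions: arc-length values s of the valid thumb positions.
    passes: list of lists of (d_t, phi) pairs; passes[0] holds the most
    promising combinations, later entries hold less likely ones.
    """
    for combos in passes:              # one coarse scan per pass
        for s in thumb_positions:      # visit every thumb position in turn
            for d_t, phi in combos:
                yield (s, d_t, phi)
```

In contrast to a depth-first order, every thumb position receives its most promising combinations before any thumb position is examined in more detail.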

Grasp Robustness
An important aspect of the presented grasp planner is that it explicitly
searches for robust grasps. By robust we mean here that the grasp properties
should not change dramatically for small perturbations of the geometry
and the hand/wrist configuration. Here we measure the grasp robustness
by sampling s, dT and ϕ in a neighborhood of the current hypothesis
and counting the number of kinematically valid neighbors it has, i.e., the
number of valid samples. Hypotheses that accumulate a low count of
valid neighbors are given a low confidence and they are kept idle in case
no better hypotheses are found.
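
A minimal Python sketch of such a neighbor count is given below. The sampling pattern is an assumption: here the neighborhood is a small regular grid around the current hypothesis, and is_valid is a hypothetical predicate standing in for the kinematic validity test. The wrap-around of the arc-length parameter s at the end of the contour is ignored for brevity.

```python
from itertools import product

def count_valid_neighbors(is_valid, s, d_t, phi,
                          step_s, step_d, step_phi, radius=1):
    """Count kinematically valid samples on a grid around (s, d_t, phi).

    The grid extends `radius` steps in each direction of each parameter;
    the center sample itself is not counted.
    """
    offsets = range(-radius, radius + 1)
    count = 0
    for i, j, k in product(offsets, repeat=3):
        if (i, j, k) == (0, 0, 0):
            continue
        if is_valid(s + i * step_s, d_t + j * step_d, phi + k * step_phi):
            count += 1
    return count
```

With radius=1 the count ranges from 0 to 26, so a hypothesis close to a kinematic limit, where roughly half of the neighbors are invalid, receives a visibly lower score than one in the interior of the valid region.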

Grasp Quality and Planner Termination

Each thumb position will keep a reference to the best grasp, i.e., the best
combination of dT and ϕ at this thumb position. The quality of this grasp
is also the quality of the thumb position. When the quality of the Ng
best thumb positions all exceed a lower threshold, the planner sorts these
thumb positions into an output list. The number of grasps in the output
list, Ng , is a user-defined parameter. The sorting criterion considers
both the quality (here the same as the strength) and the robustness of
the grasp. Hence, a strong grasp might end up last in the output list if
it has a low robustness.
In terms of optimization, the planning can be seen as if we, at each
thumb position, are optimizing over dT and ϕ, followed by an optimiza-
tion over thumb positions, i.e., the parameter s. With the algorithm
described here, the planner will obviously find suboptimal solutions, but
that is not a problem as long as we find grasps that are good enough for
the task at hand. The planner can also terminate directly if it finds a
grasp with quality and robustness near the theoretical upper limits. This
decision can greatly reduce the planning time in some instances, espe-
cially for symmetric objects like cylinders with a circular cross section.

6.3.2 Global Contour Characteristics

So far, the grasp planner only performs an exhaustive search over the
configuration space. To make the grasp planner more efficient, it must
also be able to decide which directions to examine first based on global
contour characteristics. Here we propose to use the size and eccentricity
to characterize the contour.
The size of the object is simply defined as the radius of the smallest
circle enclosing the polygon. This is a standard problem in computational
geometry and there exist algorithms with O(N log N) complexity, see
Preparata and Shamos [121], where N is the number of vertices in the
polygon.
The eccentricity of the contour is defined as

\[ e = \frac{I_x + I_y + \sqrt{(I_x - I_y)^2 + 4 I_{xy}^2}}{I_x + I_y - \sqrt{(I_x - I_y)^2 + 4 I_{xy}^2}}, \tag{6.1} \]

where $I_x$, $I_y$ and $I_{xy}$ are the second order area moments, defined as

\[ I_x = \int y^2 \, dA, \qquad I_y = \int x^2 \, dA, \qquad I_{xy} = -\int xy \, dA. \tag{6.2} \]

Note that in Equation (6.1), the area moments are computed with re-
spect to a frame located at the centroid of the contour. The eccentricity
measure defined by Equation (6.1) has a useful geometric interpretation:
If we replace the contour with an ellipse that has the same principal mo-
ments, then the eccentricity is equal to the squared ratio of the major
axis to the minor axis.
Using Green’s formula and the fact that the contour is a polygon, we
can easily transform the area integrals in Equation (6.2) into the following
sums:
\[ I_x = \frac{1}{12} \sum_{i=1}^{N} (x_i - x_{i+1})(y_i + y_{i+1})\left(y_i^2 + y_{i+1}^2\right), \tag{6.3} \]

\[ I_y = \frac{1}{12} \sum_{i=1}^{N} (y_{i+1} - y_i)(x_i + x_{i+1})\left(x_i^2 + x_{i+1}^2\right), \tag{6.4} \]

\[ I_{xy} = \frac{1}{24} \sum_{i=1}^{N} (x_{i+1} y_i - x_i y_{i+1}) \Big[ (x_i + x_{i+1})(y_i + y_{i+1}) + x_i y_i + x_{i+1} y_{i+1} \Big], \tag{6.5} \]

where we have introduced the convenient notation $x_{N+1} = x_1$ and
$y_{N+1} = y_1$. The summation formulas are similar to those given in [139],
but they are slightly more efficient in terms of the required number of
summations and multiplications. Using the above sums, the moments for
the contour can be quickly computed. The moments are used to compute
the eccentricity and the directions of the principal axes.
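
As an illustration, the sums in Equations (6.3) to (6.5) and the eccentricity in Equation (6.1) can be evaluated with the following Python sketch. The function name is hypothetical. The signed area and centroid are computed with the standard shoelace formulas, which are not given in the text, and the vertex order is normalized to counter-clockwise because the sums assume that orientation.

```python
import math

def contour_eccentricity(vertices):
    """Eccentricity of a simple polygon from Equations (6.1) and (6.3)-(6.5).

    vertices: list of (x, y) tuples. A clockwise polygon is reversed first.
    """
    pts = list(vertices)
    n = len(pts)

    def edges(points):
        # Pairs (p_i, p_{i+1}) with the wrap-around p_{N+1} = p_1.
        return [(points[i], points[(i + 1) % n]) for i in range(n)]

    # Signed area and centroid (standard shoelace formulas).
    area = 0.5 * sum(x0 * y1 - x1 * y0 for (x0, y0), (x1, y1) in edges(pts))
    if area < 0:
        pts.reverse()
        area = -area
    cx = sum((x0 + x1) * (x0 * y1 - x1 * y0)
             for (x0, y0), (x1, y1) in edges(pts)) / (6.0 * area)
    cy = sum((y0 + y1) * (x0 * y1 - x1 * y0)
             for (x0, y0), (x1, y1) in edges(pts)) / (6.0 * area)

    # Express the polygon in a frame located at the centroid.
    centered = [(x - cx, y - cy) for x, y in pts]

    ix = iy = ixy = 0.0
    for (x0, y0), (x1, y1) in edges(centered):
        ix += (x0 - x1) * (y0 + y1) * (y0 * y0 + y1 * y1)            # (6.3)
        iy += (y1 - y0) * (x0 + x1) * (x0 * x0 + x1 * x1)            # (6.4)
        ixy += (x1 * y0 - x0 * y1) * ((x0 + x1) * (y0 + y1)
                                      + x0 * y0 + x1 * y1)           # (6.5)
    ix, iy, ixy = ix / 12.0, iy / 12.0, ixy / 24.0

    root = math.sqrt((ix - iy) ** 2 + 4.0 * ixy ** 2)
    return (ix + iy + root) / (ix + iy - root)                       # (6.1)
```

For a rectangle with side lengths 4 and 2 the function returns a value of approximately 4, the squared ratio of the side lengths, and for a square it returns approximately 1. The result does not change when the rectangle is rotated or translated.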
Good properties of the size and eccentricity parameters used here are
that they are intuitive, global, and invariant to rotations and translations
of the contour. Their global nature can be used to quickly give the grasp
planner indications of where to look for good grasps. For very eccentric
objects, the planner first tests thumb positions that allow the hand to
wrap around the minor axis of the object, see Section 6.3.3 and Figure 6.8.
The size of the contour determines the order in which the dT -values are
tested; for a large contour large values of dT are tested first, and vice
versa.
The idea of using global contour characteristics can be applied to
grasp planners for other hands as well.

6.3.3 Choosing Good Thumb Positions

Each valid thumb position will give rise to a large number of possible grasp
hypotheses. With the current resolution, each thumb position results in
5460 possible combinations of dT and ϕ. Therefore it would be good if
we, at an early stage, could classify a thumb position as good or bad with
some degree of confidence.
Let the term contact triangle denote the triangle formed by the
contacts of F1, F2, and the thumb. A common property of grasps that
can hold heavy objects is that they have the center of gravity within
the contact triangle. This is especially true for vertical grasps,
where gravity is perpendicular to the view plane. If we place the
thumb and find that the centroid is not in front of it, there is no way we
can get the centroid within the contact triangle. Taking this idea further,
we can say that we want the centroid to be within the thumb’s field
of view (FOV), which is defined as a conical sector with a user-defined
opening angle. Hence, thumb positions that do not have the centroid in
the thumb’s FOV are put in an idle state.
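The field-of-view test described above reduces to comparing an angle with half the opening angle. The following is a minimal sketch under stated assumptions: the function name and argument layout are hypothetical, the test is done in the 2D view plane, and the thumb normal is taken to be the direction the thumb faces.

```python
from math import acos, degrees, hypot


def centroid_in_fov(thumb_pos, thumb_normal, centroid, opening_angle_deg):
    """Return True if the centroid lies inside the thumb's conical sector.

    opening_angle_deg is the full opening angle, so the deviation from the
    thumb normal is compared with half of it.
    """
    dx = centroid[0] - thumb_pos[0]
    dy = centroid[1] - thumb_pos[1]
    dist = hypot(dx, dy)
    norm = hypot(thumb_normal[0], thumb_normal[1])
    if dist == 0.0 or norm == 0.0:
        return False  # degenerate input: no meaningful direction
    cos_angle = (dx * thumb_normal[0] + dy * thumb_normal[1]) / (dist * norm)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return degrees(acos(cos_angle)) <= 0.5 * opening_angle_deg


# Thumb at the origin facing +x, with a 60 degree opening angle.
print(centroid_in_fov((0.0, 0.0), (1.0, 0.0), (1.0, 0.1), 60.0))   # in front
print(centroid_in_fov((0.0, 0.0), (1.0, 0.0), (-1.0, 0.0), 60.0))  # behind
```

A thumb position failing this test would be put in the idle state rather than discarded, so it can still be revisited later.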
For very eccentric objects, good grasps will often be those with ϕ ≈ 0°
and the hand wrapping around the minor axis of the object. So for
eccentric objects, we can, as a first step, require the normal of the thumb
to be almost perpendicular to the minor principal axis. Those thumb
positions that do not fulfill this requirement are put in an idle state.
For every thumb position, the planner keeps track of the number
of tested and valid grasps, respectively. If the number of valid grasps
remains close to zero even though the number of tested grasps is very
high, the thumb position is put in an idle state. Note that this decision
is justified only because the planner performs coarse scans
over the entire range of dT and ϕ values.
Because our strategy is to perform coarse scans over the configuration
space, new thumb positions should always be as far away as possible
from the previous ones. Therefore we let the s-values form the following
sequence:

\[ s_1 = 0, \quad s_2 = \frac{s_{tot}}{2}, \quad s_3 = \frac{s_{tot}}{4}, \quad s_4 = \frac{3 s_{tot}}{4}, \quad \ldots \]

where s_tot is the perimeter of the contour.
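A sequence of this kind can be generated by repeatedly halving the gaps between the positions already visited. The sketch below is an illustration only: the generator name is hypothetical, and the ordering within each refinement level (increasing arc length) is an assumption, since the text lists only the first four values.

```python
from itertools import islice


def thumb_positions(s_tot):
    """Yield arc-length positions 0, s_tot/2, s_tot/4, 3*s_tot/4, ..."""
    yield 0.0
    denom = 2
    while True:
        # Odd multiples of s_tot/denom are the midpoints of the gaps
        # left by all coarser levels.
        for num in range(1, denom, 2):
            yield s_tot * num / denom
        denom *= 2


# First eight positions on a contour with perimeter 8.
print(list(islice(thumb_positions(8.0), 8)))
```

Since the generator is unbounded, the planner can stop drawing positions as soon as enough good grasps have been found.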
To summarize, we have the following rules for the thumb positions:

• The centroid should be within the thumb’s FOV.
• For very eccentric objects, look for grasps that can wrap around
the minor principal axis of the object.
• Thumb positions where the accumulated number of valid grasps
is low (in comparison to the number of tested grasps) should be
avoided.
• New thumb positions are generated to be as far away as possible
from the previous ones.

6.3.4 Use of Precomputed Trajectories

As long as the inner finger link is free to move, the finger will follow a
predetermined trajectory when closing. This implies that interpolating
from precomputed trajectories will almost eliminate the need for solving
kinematical problems. For the grasp planner, a number of trajectories
were found to be useful, see Figure 6.4.
Trajectory 1 is the contact position as the finger closes. Due to the
cylinder assumption, the finger tip tangent will always be vertical at the
contact point. The x-coordinate for the contact point defines the finger
extension, denoted dT, d1, and d2 for the different fingers. All other
trajectories are expressed as functions of the finger extension. Once the
finger extension is found, all other parameters, including motor encoder
values, are quickly found through linear interpolation.
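The lookup described above can be sketched as piecewise-linear interpolation in tables indexed by the finger extension. The table values below are invented for illustration and do not come from the Barrett hand; the function and variable names are likewise hypothetical.

```python
from bisect import bisect_right


def interpolate(xs, ys, x):
    """Piecewise-linear interpolation of the table (xs, ys) at x.

    xs must be strictly increasing; x must lie inside [xs[0], xs[-1]].
    """
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("finger extension outside the precomputed range")
    i = min(bisect_right(xs, x), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    t = (x - x0) / (x1 - x0)
    return ys[i - 1] + t * (ys[i] - ys[i - 1])


# Hypothetical precomputed tables, all indexed by finger extension.
extension = [0.0, 10.0, 20.0]
contact_z = [5.0, 7.0, 6.0]          # trajectory 1, z-coordinate of contact
encoder = [0.0, 1000.0, 2500.0]      # motor encoder value

d = 15.0
print(interpolate(extension, contact_z, d), interpolate(extension, encoder, d))
```

Every quantity shares the same abscissa, so one finger extension value is enough to recover all the others without solving the finger kinematics.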
Trajectory 2 is traced out by the maximum z-coordinate for the finger.
This is useful, for example, in the case of a vertical grasp, where we want
to avoid the finger tips colliding with the support surface, see Figure 6.5.
Trajectory 3 shows how close the cylindrical object can get to the
palm. Trajectories 1, 2 and 3 are all used when determining the z-
coordinates for the contacts. The strategy is to move the palm as close
as possible to the cylinder, without violating any of the geometric con-
straints imposed by the trajectories in Figure 6.4. An example is shown
in Figure 6.5. The small circles indicate points that were determined
using the precomputed trajectories. Without the depth constraint imposed
by the support surface, the planner would have placed the palm against
the cylinder top as that would generate a more stable grasp.
Another use of the precomputed trajectories is shown in Figure 6.6.
In general it is bad to have a grasp where the finger extensions are very
Figure 6.4: Interpolating from a number of precomputed trajectories
minimizes the time spent on solving kinematical problems. Trajectory 1
shows the contact position as the finger closes. Trajectory 2 is traced
out by the maximum z-coordinate for the finger. Trajectory 3 shows how
close a cylindrical object can get to the palm. The minimum clearance
needed by a finger is given by c.

different. In Figure 6.4 we see that the z-coordinate of the contact point
varies with the finger extension (trajectory 1). If the z-coordinates of the
contact points are different, the grasp will apply a torque on the object.
This torque tends to decrease the stability of the grasp, so we would like
to keep it to a minimum. The measure we have chosen to use here is
the angle between the contact plane and the z-axis, where the contact
plane is defined by the three contact points (see Figure 6.6). This is a
Figure 6.5: Illustration of how the precomputed trajectories can be used
to quickly compute grasps that satisfy the depth constraints. The circles
denote points that were determined from the trajectories.

natural measure as it captures the fact that a large grasp, i.e., large finger
extensions, is better at tolerating differences in the contact z-coordinates
than a small grasp.⁴ If the contact plane inclination is large, then the
grasp quality evaluation and the grasp robustness test are postponed until
no better grasp is found. Morales et al. [106] also avoid grasps with large
differences between the contact z-coordinates. They used the following
criterion:

\[ (d_1 - d_2)^2 + (d_1 - d_T)^2 + (d_2 - d_T)^2 \]

However, as can be seen from trajectory 1 in Figure 6.4, a large differ-
ence in finger extensions does not necessarily imply a large difference in
contact z-coordinates. As a result, the criterion in [106] can be overly
conservative.
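The two measures can be compared numerically. In the sketch below the inclination is taken as the angle between the contact-plane normal and the z-axis, so that it is zero when all three contacts share the same z-coordinate; this reading of the measure is an assumption, as are the function names. The second function evaluates the extension-based criterion quoted above.

```python
from math import acos, degrees, sqrt


def contact_plane_inclination(p1, p2, p3):
    """Angle in degrees between the contact-plane normal and the z-axis."""
    u = [b - a for a, b in zip(p1, p2)]
    v = [b - a for a, b in zip(p1, p3)]
    # Plane normal from the cross product of two edge vectors.
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = sqrt(sum(c * c for c in n))
    if length == 0.0:
        raise ValueError("contact points are collinear")
    return degrees(acos(min(1.0, abs(n[2]) / length)))


def extension_spread(d1, d2, dT):
    """Sum of squared differences between the three finger extensions."""
    return (d1 - d2) ** 2 + (d1 - dT) ** 2 + (d2 - dT) ** 2


# Equal contact heights give zero inclination, whatever the extensions are.
print(contact_plane_inclination((1, 0, 2), (0, 1, 2), (-1, -1, 2)))
print(extension_spread(1.0, 2.0, 4.0))
```

The example shows the point made in the text: the extension-based value can be large while the contact plane is not inclined at all.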
In addition to the trajectories shown in Figure 6.4, we also compute
the horizontal distance between the contact position and the maximum
⁴ This property is due to the larger torque arms for the friction forces that have to
counteract the generated torque.

Figure 6.6: The precomputed trajectories can be used to determine the
contact plane for the grasp. The contact plane is shown with the solid
line.

x-coordinate for the finger boundary, as a function of the finger extension,
see c1 and c2 in Figure 6.4. This is the minimum clearance needed for
the finger, which is useful if, for example, the finger is surrounded by the
contour. This is the case for the thumb in the grasp shown in Figure 6.9.

6.3.5 Grasp Quality Evaluation

The grasp planner must be provided with a value function so that a grasp
hypothesis can be given a value reflecting its quality. The grasp planner
and the value function are treated as two separate objects. That opens
the possibility to switch between different value functions depending on,
e.g., the available computing time, desired properties of the planned
grasp, or task information. Currently, the only value function that has
been tested with the planner is the one described in this section.
If we want to emphasize the ability of the grasp to resist gravitational
forces, a natural value function would be the maximum admissible ob-
ject weight. Thus, the value function becomes a maximization problem
where we have constraints on the contact forces. Here it is convenient to
introduce a new coordinate frame, located at the centroid of the object,
with the z-axis in the opposite direction of gravity. The centroid’s loca-
tion in the view plane is, as mentioned above, computed from the object
contour. The z-coordinate of the centroid is estimated using the height
of the cylinder.
Using this new frame, the value function can be written as the fol-
lowing linear programming (LP) problem:

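\[
\begin{aligned}
\max_{x} \quad & f_g \\
\text{subject to} \quad & f_g \, [0, 0, 1, 0, 0, 0]^T = G x, \qquad x_i \ge 0, \\
& \sum \{x_i\}_{F_1} \le 1, \qquad \sum \{x_i\}_{F_2} \le 1, \qquad \sum \{x_i\}_{thumb} \le 1,
\end{aligned}
\]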

where f_g is the object weight, x is a column vector of length M × N of
generalized contact forces, M is the number of contacts, and N is the
number of base vectors in the contact model. The matrix G, called the
grasp matrix, is a mapping from the contact forces to the forces and
torques exerted by the grasp on the object. Given the contact model and
the position and orientation of each contact, the grasp matrix is
straightforward to compute. The notation {x_i}_{F_1} is introduced to conveniently
denote all contact forces on finger F1. Hence, the last three inequalities
in the LP problem put an upper bound on the force exerted by each
individual finger. For more details on contact models and the grasp matrix,
see Section 7.1 or the book by Murray et al. [107].
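To pass the problem to a general-purpose LP solver, it can be restated in the usual minimization form. The following is one possible restatement, not taken from the thesis: the stacked decision vector ξ, the cost vector c, the unit vector e_3, and the selector vectors s_F (with ones at the entries of x that belong to finger F and zeros elsewhere) are all symbols introduced here for illustration.

```latex
\begin{aligned}
\xi &= \begin{bmatrix} f_g \\ x \end{bmatrix} \in \mathbb{R}^{1 + MN},
\qquad e_3 = [0, 0, 1, 0, 0, 0]^T, \\
\min_{\xi} \quad & c^T \xi, \qquad c = [-1, 0, \ldots, 0]^T, \\
\text{subject to} \quad
& \begin{bmatrix} -e_3 & G \end{bmatrix} \xi = 0, \\
& \begin{bmatrix} 0 & s_F^T \end{bmatrix} \xi \le 1,
\qquad F \in \{F_1, F_2, \text{thumb}\}, \\
& x_i \ge 0 .
\end{aligned}
```

The equality constraint is the condition f_g [0, 0, 1, 0, 0, 0]^T = Gx moved to one side, and minimizing −f_g is equivalent to maximizing the object weight. The point f_g = 0, x = 0 is always feasible, so the optimal value is never negative.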
Here, each contact is modeled as a point contact with friction. The
chosen contact model has N = 8 base vectors. Nguyen [108] pointed out
that in the case of planar polygonal contact areas, there is always an
equivalent representation in terms of a finite number of point contacts.
So, even though we only use point contacts, we can still handle distributed
contacts. A value for the coefficient of friction, µ, must be assumed. Since
µ is only used for comparing grasp hypotheses, it is not so important
that it coincides with the real coefficient of friction. Here µ = 0.3 is used.
Note that even though the planner uses a 2D contour with possible depth
constraints as input, the grasp evaluation is done for 3D grasps. That is,
not only are the contact positions on the 2D curve taken into account,
but also their z-coordinates.

6.4 Examples of Planned Grasps

The proposed algorithm has been tested on a wide range of geometries
with good results. To give an indication of the required planning time
and the type of grasps produced by the planner, three examples are
presented here. In all examples, the contour was given as a spline curve,
which was adaptively approximated by a polygon. The smooth contours
in Figures 6.7 to 6.9 indicate that the approximation error is negligible.
For all the examples, only the best grasp is shown, but the output is
actually an ordered set of Ng distinct grasps (i.e., grasps with
well-separated thumb positions). For the examples we used Ng = 10. The
timing results were obtained using a Sun Blade 100 computer. The time
needed for the spline-to-polygon conversion is included in the total
planning time.
In the first example, see Figure 6.7, gravity is directed along the view
plane normal. Thus, we are planning a vertical grasp. The height of
the cylinder is so large that no grasp hypothesis violated any depth
constraints. Most of the good grasps for this geometry had ϕ ≈ 60°, simply
because the admissible object weight reaches its theoretical maximum if
all three contact forces converge at a single point. Although many strong
grasps were found, we can clearly see in Figure 6.7 that the planner also
emphasizes grasp robustness: The planned grasp can tolerate relatively
large perturbations without losing its stability. The polygon curve has
174 vertices and the required planning time was 0.38 seconds.
The second example is a vertical grasp on an eccentric ellipse, see
Figure 6.8. Here the strong eccentricity of the object (e = 16) is used to
initially bias the thumb position and the spread angle such that the grasps
wrap around the minor axis of the object. The best grasp is centered over
the centroid and achieves a large contact area between the palm and the
object, thereby making it a very secure grasp. The polygon curve has 60
vertices and the required planning time was 0.24 seconds.
The last example is a horizontal grasp, with gravity directed down-
wards in the view plane. As the intent was to implement this planner on
a (research) service robot, we chose an object more appropriate to that
context, namely an iron. Because the iron is resting on a support plane,
a constraint box is put around its lower parts, which can be seen as the
dash-dot box in Figure 6.9. This is a much harder problem compared to
the previous ones because much of the contour is not accessible.
Obviously, one or two fingers⁵ must get under the handle of the iron, which
⁵ Because of the cylinder assumption, the hook grasp with all three fingers on one
side of the handle is not considered.

Figure 6.7: The polygon has 174 vertices and the required planning time
was 0.38 seconds. Gravity is perpendicular to the view plane.

is not easy considering the minimum clearance needed for the fingers,
see Figure 6.4. Even though the algorithm assumes cylindrical objects,
this is an example where the planner does well even on objects that do
not fulfill this assumption. From Figure 6.9 we can see that the palm
has contact with the object. If the real robot were to execute this grasp,
the hand would be moved forward until the tactile sensor in the palm
senses a contact. When closing, the fingers would, due to the clutch
mechanism, wrap around the handle and secure the object. This is an
additional reason why grasps with palm contacts are preferred by the
planner. Note that the thumb in Figure 6.9 is surrounded by the contour
and that the clearance is really small. Here the precomputed clearance
measures described in Section 6.3.4 were used to ensure that the clearance
was large enough.
The polygon in this example has 153 vertices and the planning time
Figure 6.8: The polygon has 60 vertices and the required planning time
was 0.24 seconds. The palm makes contact with the object and gravity
is perpendicular to the view plane.

Figure 6.9: The polygon has 153 vertices and the required planning time
was 1.3 seconds. The dashed line indicates the palm position and the
dash-dot line is the constraint box. Note that this grasp has palm contact
with the object.
was 1.3 seconds. The much longer planning time for this example is
partly because much time was spent on hypotheses with fingers F1 and F2
inside the constraint box. Another reason is that the size heuristic is not
so appropriate in this case: The very large enclosing circle around the
contour suggests that the planner should initially look for grasps with a
large value for dT , whereas all the feasible grasps will have a small value
for the thumb extension. Without the constraint on the lower part of
the iron, the size heuristic would instead have accelerated the planner.
However, the resulting grasps would wrap around the lower and upper
parts of the iron, which, from a practical point of view, is not so good.

6.5 Chapter Summary

In this chapter we have presented a grasp planner for the three-fingered
Barrett hand. The planner utilizes the kinematic structure of the hand,
together with a set of heuristics to quickly search through the most
promising directions of the configuration space. Even though the plan-
ner takes a 2D contour as input, depth information is taken into account
and grasp evaluation is done in 3D. This makes the proposed planner
something in between a pure 2D planner and a pure 3D planner.
Grasp robustness is taken into consideration by examining the effect
of perturbations on a grasp hypothesis. The value of the grasp is decided
by considering both its strength and robustness.
Three examples showed that the planner is fast and produces effi-
cient grasps. Furthermore, the example in Figure 6.9 showed that the
planner can produce good grasps even for objects that do not fulfill the
cylinder assumption. This is due to the strategy of placing the palm as
close as possible to the object in combination with the clutch mechanism
controlling the finger curl motion.
Although the algorithm is tailored for the Barrett hand, it is argued
that some of the ideas used here can be used also for other robot hands:
Classifying objects with simple invariants can give rise to regions in the
configuration space that have high probability of producing a good grasp
for geometries in the corresponding class. This is similar to choosing
pregrasp shapes for certain types of objects. These regions, which can
be found either through learning or expert knowledge, can be used to
reduce the search space to a manageable size. Furthermore, with inaccu-
rate models and sensor data, grasp robustness must also be considered.
Another idea proposed here is that modularity and flexibility can be
promoted by separating the planner from the value function.
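The separation of planner and value function can be sketched as a planner that receives the value function as a plain callable. Everything in the example is hypothetical: the class and function names, the representation of a grasp as a (dT, ϕ) pair, and the two toy value functions, which merely stand in for a real measure such as the maximum admissible object weight.

```python
from typing import Callable, Iterable, Optional, Sequence

Grasp = Sequence[float]                    # hypothetical: (dT, phi)
ValueFunction = Callable[[Grasp], float]


class GraspPlanner:
    """Ranks hypotheses; knows nothing about how a grasp is scored."""

    def __init__(self, value_function: ValueFunction):
        self.value_function = value_function

    def best(self, hypotheses: Iterable[Grasp]) -> Optional[Grasp]:
        # max() with a key function delegates all scoring to the callable.
        return max(hypotheses, key=self.value_function, default=None)


def prefer_small_spread(grasp: Grasp) -> float:
    """Toy value function: favors spread angles close to zero."""
    return -abs(grasp[1])


def prefer_large_extension(grasp: Grasp) -> float:
    """Toy value function: favors large thumb extensions."""
    return grasp[0]


hypotheses = [(10.0, 30.0), (40.0, 60.0), (25.0, 0.0)]
print(GraspPlanner(prefer_small_spread).best(hypotheses))
print(GraspPlanner(prefer_large_extension).best(hypotheses))
```

Swapping the value function changes the selected grasp without touching the planner, which is the flexibility argued for above.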

If we extend the grasp planner such that it also checks if grasps are
reachable and collision free, then it will fulfill the grasp generator concept,
introduced in Section 5.3. However, to be really useful, an additional
level of planning should be added: a view-planner. The view-planner
would be responsible for choosing appropriate views of the task object,
projecting the object onto the view plane of the virtual camera, and sending
the resulting contour to the grasp planner. If the object does not fulfill the
cylinder assumption, the grasp quality computed by the grasp planner
could be misleading. A final step could therefore include testing the
planned grasp on the full model of the object. This would be similar to
the approach in [102], where a grasping simulator was used for the final
grasp evaluation.
Chapter 7

Grasp Stability Evaluation

Robots that autonomously grasp objects and interact with them must
have methods for choosing appropriate grasps. However, deciding on
a good grasp also requires some method for evaluating and comparing
grasps. Furthermore, evaluation can be performed with respect to differ-
ent properties of the grasp. Considering the uncertainties in models and
sensor data, grasps that are robust to positioning and modeling errors
are to be preferred. Another important property of a grasp is its ability
to apply forces to the grasped object: A good grasp should be able to
efficiently counteract external disturbance forces, especially those forces
that are expected to occur during the task that is to be performed.
In this chapter we present a novel approach to grasp evaluation. The
approach is based on the ability of the grasp to resist disturbance forces,
and it leads to a min-max formulation. We also propose an efficient
algorithm for solving this min-max problem. The result of the algorithm
is easily visualized as a surface in the force space. For polyhedral objects,
we give a proof showing that only the vertices of the object need to be
considered. Compared to other approaches to grasp evaluation, the main
benefits are:

• The procedure incorporates the complete object geometry.

• Task information is easily included and actually reduces the
computational complexity.

• The result is independent of scale and choice of reference frame.

• The result can be visualized in 3D and is easy to interpret.

In the next section we give a brief introduction to grasp analysis. In
Section 7.2 we discuss related work and in Section 7.3 we present the
proposed grasp evaluation procedure. To demonstrate the procedure, we
give some examples in Section 7.4. The chapter ends with a summary
and conclusions.
Parts of the material in this chapter have been presented in [143].

7.1 Grasp Analysis Introduction

In this section we give a brief introduction to grasp analysis and introduce
the notation and concepts that are used in the context of grasp evaluation.
It is assumed that the grasped object is rigid and that the grasp
consists of any number of point contacts with friction. The point contact
assumption might seem limiting but, as was pointed out by Nguyen [108],
any planar polygonal contact can be represented as the convex sum of
point contacts placed at the vertices of the contact polygon. Attached
to the object is a reference frame, to which all contacts and forces are
related.
Each contact will have its own reference frame, with the z-axis point-
ing in the direction of the inward surface normal, see Figure 7.1 (a).
Because friction is present, the contact force can deviate from the
z-axis. If the contact forces obey the Coulomb friction model, then the
space of all admissible contact forces forms a circular cone with opening
angle 2 tan^{-1}(µ), where µ is the coefficient of friction. This cone, called
the friction cone, imposes nonlinear constraints on the contact force
components.
In the literature, the circular friction cone is often approximated with
an n-sided pyramid, see Figure 7.1 (b). By doing this, we can write
the contact force as a positive linear combination of the force vectors
spanning the pyramid:
\[ f = \sum_{j=1}^{n} \alpha_j f_j, \qquad \alpha_j \ge 0. \tag{7.1} \]

Note that by choosing the vectors {f_j} to have unit z-component, the
normal component of the contact force is easily obtained as
\sum_{j=1}^{n} \alpha_j.
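The pyramid approximation and Equation (7.1) can be illustrated in a few lines. The sketch assumes a pyramid inscribed in the cone, with edges evenly spaced around the z-axis, and base vectors scaled to unit z-component as suggested above; the function names are hypothetical.

```python
from math import cos, hypot, pi, sin


def friction_pyramid(mu, n):
    """Edge vectors of an n-sided pyramid inscribed in the friction cone.

    Each vector has unit z-component and tangential magnitude mu.
    """
    return [(mu * cos(2.0 * pi * j / n), mu * sin(2.0 * pi * j / n), 1.0)
            for j in range(n)]


def contact_force(alphas, base):
    """Equation (7.1): nonnegative combination of the base vectors."""
    if any(a < 0.0 for a in alphas):
        raise ValueError("all coefficients must be nonnegative")
    return tuple(sum(a * f[k] for a, f in zip(alphas, base))
                 for k in range(3))


def inside_cone(force, mu, tol=1e-12):
    """Coulomb condition: tangential magnitude at most mu times normal."""
    return hypot(force[0], force[1]) <= mu * force[2] + tol


base = friction_pyramid(0.3, 8)
alphas = [0.5, 0.0, 0.25, 0.0, 0.0, 0.0, 0.0, 0.25]
force = contact_force(alphas, base)
# The normal component equals the sum of the coefficients.
print(force[2], sum(alphas), inside_cone(force, 0.3))
```

Because the pyramid is inscribed, any force built this way also satisfies the exact Coulomb condition, at the cost of slightly underestimating the admissible tangential force between the edges.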
Figure 7.1: For nonslipping contacts that obey the Coulomb friction
model, the contact forces must be inside the friction cone. (a) A side
view of a point contact together with its coordinate system. (b) An
example of a friction cone approximated by a five-sided pyramid.

It is often convenient to concatenate force and torque vectors, F and
T, into a wrench, defined as W = (F^T, T^T)^T. A wrench is thus a
six-dimensional column vector.
Each force fj will result in an object wrench wj, which can be
computed if the position and the orientation of the contact relative to the
object frame are known. Let the wj from all contacts be the columns of a 6 × mn
matrix G, where m is the number of contacts. This matrix is called the
grasp matrix. Summing up the contributions from all contacts, the total
wrench exerted by the grasp on the object, W, can be written as

\[ W = G x, \qquad x_k \ge 0, \quad k = 1, \ldots, mn \tag{7.2} \]
where x is a vector containing the αj for all contacts. See the book by
Murray et al. [107] for more details on how to construct the grasp matrix.
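As a rough illustration of Equation (7.2), the sketch below assembles a grasp matrix column by column. It assumes that the base force vectors of each contact have already been rotated into the object frame, so that each column is simply the force followed by the torque p × f about the object origin; the function names are hypothetical and the construction in [107] is more general.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def wrench(position, force):
    """Object wrench (force, torque) of a force applied at a position."""
    return tuple(force) + cross(position, force)


def grasp_matrix(contacts):
    """Build G as a list of 6 rows.

    contacts: list of (position, [base force vectors in the object frame]).
    """
    columns = [wrench(p, f) for p, forces in contacts for f in forces]
    return [list(row) for row in zip(*columns)]  # transpose to 6 x (m*n)


def total_wrench(G, x):
    """Equation (7.2): W = G x with nonnegative coefficients."""
    if any(xk < 0 for xk in x):
        raise ValueError("all coefficients must be nonnegative")
    return [sum(g * xk for g, xk in zip(row, x)) for row in G]


# Two opposing contacts on the x-axis, each with one inward base vector.
contacts = [((1, 0, 0), [(-1, 0, 0)]), ((-1, 0, 0), [(1, 0, 0)])]
G = grasp_matrix(contacts)
print(total_wrench(G, [1, 1]))  # equal squeezing forces cancel
```

With a friction pyramid at each contact, the inner list would hold the n rotated pyramid edges, giving the 6 × mn matrix described in the text.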
When analyzing a grasp, it is of interest to know the space of wrenches
that can be applied to the object by the grasp. The unit grasp wrench
space¹ (UGWS) is often defined as the space of wrenches that satisfy
Equation (7.2) and \sum_{k=1}^{mn} x_k = 1. This space is equal to the convex hull
¹ Note that by choosing another norm for the contact force vector, we get other
definitions for the UGWS. Another common, and more natural, definition of the UGWS
is to limit the normal component of each individual contact to one, see Ferrari and
Canny [41]. With this definition, however, the UGWS is much more costly to compute,
and therefore the definition based on the sum of all contact forces is more common.
of the columns of G, which can be efficiently computed using the Quickhull
algorithm, see Barber et al. [12].
An important class of grasps are those that have force closure. A
grasp has force closure if there exists a solution to Equation (7.2) for
any W. This means that the grasp can counteract any external wrench
acting on the body by adjusting the contact forces properly. If a grasp
has force closure, then the convex hull of G must contain a neighborhood
of the origin [107]. The converse is also true: If the convex hull contains
a neighborhood of the origin, then the grasp has force closure.

7.2 Related Work

Grasp evaluation can be performed with respect to different properties of
the grasp. Which properties are important is determined by the context
and the task. For example, when inserting a pin into a hole, it is
important that the grasp is compliant, so that position and orientation
errors do not give rise to large contact forces. For such a task it is important
that the grasp is not only stable, but also that it has a high manipulabil-
ity measure. Grasp manipulability is the degree to which the fingers can
impart arbitrary motions to the object, see Kerr and Roth [66].

Grasp Robustness. Considering the uncertainties in models and sensor
data, grasps that are robust to positioning and modeling errors are
to be preferred. Nguyen [108] addressed this by developing algorithms
for finding maximal independent contact regions for several important
types of grasps; among them, two-finger grasps of polygons. As long as
the contacts are within these regions, the grasp has force closure. Ponce
and Faverjon [120] extended the results of Nguyen to three-finger grasps
of polygons. Thus, if robustness to position and modeling errors is of
concern, a grasp quality measure could be chosen as the length of the
smallest contact region.
Bone and Du [19] derived a measure of robustness to positional errors
for polygon grasps. They consider all combinations of finger displace-
ments and evaluate the maximum torque magnitude the resulting grasp
can resist. The robustness measure is the sum of the relative change in
torque resisting capability for all the displacements.
It is well known that friction helps to make a grasp more stable. In
situations where the coefficient of friction is not known beforehand, we
should favor grasps whose stability does not depend on a high coefficient
of friction. Mantriota [95] defined as quality measure the minimum
coefficient of friction needed to resist a set of unit disturbance wrenches,
directed along the principal wrench space directions. The resulting grasps
are robust in the sense that they rely as little as possible on friction.

Wrench Space Volumes. Another important property of a grasp is
its ability to apply forces to the grasped object. A good grasp should be
able to efficiently counteract external forces, especially those forces that
are expected to occur during the task that is to be performed. Previous
work focusing on this aspect of a grasp is naturally divided into task in-
dependent quality measures and task directed quality measures. In task
independent grasp evaluation, information about the task is assumed to
be unavailable or ignored, and the resulting grasp quality measure there-
fore reflects, in some way, the overall stability of the grasp. In task
directed grasp evaluation, on the other hand, the suitability of a grasp
with respect to a particular task is evaluated. Both types of quality mea-
sures are often based on the UGWS. The wrench space is six-dimensional
and, intuitively, a good grasp should have a large UGWS whose shape
matches the set of wrenches that are expected to occur for the given task
(the task wrench space).
Kirkpatrick et al. [69] proposed a quality measure which is the radius
of the largest wrench space ball that just fits within the UGWS. This mea-
sure was also used by Ferrari and Canny [41] and by Pollard [119]. This
quality measure is task independent, and it assumes that all directions in
the wrench space are equally important. There are two drawbacks with
this kind of measure: It is not scale invariant and it is not invariant to
translations of the torque origin. The first drawback is easily remedied by
normalizing the torque components of the wrench vector, e.g., with the
inverse of the maximum distance from the torque origin to the surface of
the object [119]. The second drawback is more severe, because, clearly,
a grasp quality measure should not depend on the choice of reference
frame. This issue was addressed by Teichmann [148], who proposed as
an invariant measure the radius of the largest wrench space ball with
respect to all possible coordinate frames. This approach does, however,
lack a simple physical interpretation; the choice of coordinate frame for
evaluating a grasp is dependent on the grasp itself. A much simpler invariant
measure was proposed by Li and Sastry [88]; they suggested using the
volume of the UGWS as an invariant measure. Note, however, that a
volume-based quality measure can give a nonzero quality to grasps that
do not have force closure, i.e., are unstable in some direction.
Many of the problems with quality measures that are based on wrench
space balls arise because we are trying to compare forces with torques,
which does not make sense as they have different units. To avoid such
comparisons, Mirtich and Canny [104] used a decoupled approach, leading
to a two-valued quality measure. The optimal grasp is the one that
lexicographically maximizes both values; they first compute the grasps
that best counteract pure forces, and then select among those grasps the
one which best resists pure torques.
To obtain task directed measures, Li and Sastry [88] suggested the
use of six-dimensional task ellipsoids, whose shape resembles the space of
forces and torques encountered in the task. They defined a task directed
quality measure as the largest scale factor that causes the task ellipsoid
to be embedded in the UGWS. However, as pointed out in [88], “The
process of modeling a task by a task ellipsoid is quite complicated”.

Including the Object Geometry. Using the UGWS alone for the
construction of a quality measure has a drawback that, seemingly, has
not received much attention: The wrench space is constructed from
information about the contacts alone; thus, the effects of the complete object
geometry on grasp stability are ignored. A grasp quality measure that
does not take object geometry into account would treat the two grasps
in Figure 7.2 as equal; our intuition, however, tells us that the grasp of
object A should be more stable than that of object B.
Introducing the object wrench space (OWS) concept, Pollard [119]
actually incorporates the complete object geometry into the grasp
evaluation. The OWS represents the best grasp of the object that can ever
be achieved. An alternative view is that the OWS is the set of wrenches
that can be created by a (unit) distribution of disturbance forces act-
ing anywhere on the surface of the object. Thus, the OWS depends on
the geometry of the object. The OWS concept is closely related to the
idea presented in this chapter; however, the examples presented in [119]
on this subject are limited to two-dimensional polygons with frictionless
point contacts.
In a recent approach, Borst et al. [22] combined the OWS concept
of Pollard [119] with the task ellipsoids of Li and Sastry [88]. If no
task information is given, the best assumption one can make about the
disturbance wrenches is that they are distributed according to the OWS.
Borst et al. [22] therefore choose as task ellipsoid an ellipsoid that tightly
encloses the OWS. As the task ellipsoid is automatically constructed, it
removes one of the strongest objections against using them. It is also
Figure 7.2: The grasp’s stability will depend on which object is grasped,
object A or object B. Most grasp quality measures do, however, only use
the contact information.

worth noting that their method for computing the grasp quality does not
rely on any friction cone approximations.

Visualization. Visualization of a six-dimensional wrench space is of
course impossible, and therefore many papers deal only with
two-dimensional grasping problems. The wrench space of a 2D grasping
problem is three-dimensional and easy to visualize. Miller and Allen [100]
suggested several methods for projecting the 6D wrench space into 3D. Using
these projections, important characteristics of the 6D wrench space could
be visualized. The approach presented here can be seen as a natural pro-
jection from the six-dimensional wrench space to the three-dimensional
force space.

Compliant Grasps The work presented so far has assumed that
grasp compliance can be neglected. That is also the assumption of the
method proposed in this chapter. For compliant grasps, the grasp stiff-
ness matrix is a useful tool, see, e.g., Howard and Kumar [57]. A com-
pliant grasp is stable if the grasp stiffness matrix is positive definite. It
seems natural to base a quantitative measure of stability on the eigen-
values of the stiffness matrix, but care must be taken as these are not
invariant under change of reference frame, see, e.g., [90, 24].
Bruyninckx et al. [24] derived a grasp quality measure based on the
generalized eigenvalue decomposition of the grasp’s stiffness matrix. The
generalization requires a choice of a metric on the group of rigid body
displacements that enables the identification of twists with wrenches. Lin
et al. [90] derived a frame-invariant quality measure in terms of the prin-
cipal translational and rotational stiffness parameters. By introducing a
physically based conversion of rotational stiffness parameters into equivalent
translational stiffness, Lin et al. overcame the problem of comparing
translational and rotational stiffness parameters.

7.3 Grasp Evaluation Procedure

As was pointed out in [88], modeling a task using six-dimensional task
ellipsoids is laborious. Furthermore, visualization of the wrench space
is impossible, unless the grasp problem is two-dimensional. We argue
that the explicit use of the torque component is not necessary, thereby
reducing the six-dimensional wrench space into a three-dimensional force
space.

Key Idea Consider how disturbance wrenches are applied to a grasped
object: In almost all practical cases, a disturbance wrench arises from
a pure force acting on the surface of the object. The resulting torque
component is immediately given by

T = a × F, (7.3)
where a is a vector specifying where the force F is applied. Specifying the
grasp matrix G and a in the same coordinate system is key here: Using
the same torque origin assures that the result will be independent of the
choice of frame. According to Equation (7.3), the torque component is
not independent of the applied force and they are always orthogonal to
each other. Based on this observation, the grasp evaluation procedure
described below is proposed.
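The relation in Equation (7.3) can be illustrated with a few lines of Python. This is only a sketch: the point of attack `a` and the force `F` are hypothetical numbers, and the helper names are not from the thesis. It builds the six-dimensional wrench from a pure force and checks that the torque is orthogonal to the force.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(u, v):
    """Scalar product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))


def disturbance_wrench(a, force):
    """Return the 6D wrench (F, T) with T = a x F, as in Equation (7.3)."""
    return tuple(force) + cross(a, force)


a = (1.0, 1.0, 2.5)     # hypothetical point of attack on the object surface
F = (0.0, -0.2, 0.1)    # hypothetical disturbance force
W = disturbance_wrench(a, F)
torque = W[3:]          # the torque is fully determined by a and F
```

Because the torque follows from the force and its point of attack, only the three force components need to be specified freely.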
Consider a unit vector ê, representing a fixed direction for the disturbance
force so that the disturbance force can be written as f ê, where f
is a dimensionless scalar. Sweep this disturbance force over the surface of
the object, finding the smallest, positive f that results in a wrench that is
exactly on the border of the UGWS. Let us denote this value by f⋆. Note
that when performing this ‘sweeping’ operation, only those points on the
object for which ê is inside the friction cone can come into consideration.
We also require f ≥ 0 because we cannot allow tractional disturbance
forces. Repeating this process for all directions of the disturbance force,
we will end up with a closed surface S in force space. Specifying force directions
with spherical coordinates ϕ and θ, the surface S is given by the
vectors $\{ f^\star_{\varphi\theta}\, \hat{e}_{\varphi\theta} \}$. The interpretation of this surface is straightforward:
If a disturbance force is inside S, a unit grasp will be able to resist the
resulting wrench, no matter where the force is applied.

Min-Max Formulation The above procedure can be formulated
mathematically as a min-max problem. Let ∂D denote the surface of
the object, and FCa the friction cone at the surface point specified by
the vector a. To include forces that cannot be seen as a force acting on
the surface of the object (e.g., gravitational forces), we can include an
offset wrench W0. For each direction ê, we want to solve the following
problem:

\[ f^\star = \min_{a} \max_{x} \left\{ f \in \mathbb{R}^+ : -W_0 - f \begin{pmatrix} \hat{e} \\ a \times \hat{e} \end{pmatrix} = G x,\ \hat{e} \in FC_a,\ a \in \partial D,\ x_i \ge 0,\ \sum_{i=1}^{mn} x_i = 1 \right\}. \tag{7.4} \]

The resulting surface S can be seen as the space of admissible disturbance
forces around W0, assuming a unit grasp and that the disturbance
forces are always applied at the worst case surface point. Taking the com-
plete geometry into account, we thereby assume that disturbance forces
can occur anywhere on the object surface. However, as illustrated by
the following example, task information can easily be included by only
choosing those parts of the object surface where disturbance forces are
likely to occur.

Task Directed Evaluation The usual way to hold a pen when writing
is to place the fingertips close to the tip of the pen, letting its upper part
rest between the thumb and the index finger, see Figure 7.3. Clearly,
this grasp cannot easily resist forces acting on the upper part of the pen.
Using the complete surface of the pen as input to the evaluation procedure
would thus result in a poor overall stability for the grasp. However, when
writing, we know that external forces are only exerted on the tip of the
pen, and the chosen grasp is excellent for balancing those. Accordingly, a
task directed measure, indicating a more stable grasp, could be computed
by using only the tip of the pen as input.

Disturbance Force Friction Cone Note that upon using only parts
of the object surface, S might not be closed. This can happen if some
of the force directions ê never fall inside a friction cone. The same can
happen if the assumed friction cone for the disturbance force is very
small. However, choosing large enough friction cones for the disturbance
forces, the procedure will cover the force directions relevant to the task.
Thus the procedure depends on some imagined coefficient of friction, which
should be chosen considerably larger than the real one.

Figure 7.3: The grasp for holding the pen is good at resisting forces
that are applied to the tip of the pen, but bad at resisting forces at the
upper part. A task directed evaluation of the grasp would include
only the tip of the pen, as this is where external forces are expected to
occur.

Disturbance Force Upper Bound Since we assume a unit grasp,
there exists a theoretical upper bound for S. It is easily shown that, if
W0 = 0, S is bounded by a sphere with radius $\sqrt{1 + \mu^2}$. For a nonzero
offset wrench, this bounding sphere will be distorted and translated.
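One way to see where such a bound can come from is sketched below. It rests on an assumption that is not restated in this section: each friction cone edge vector $f_i$ is normalized to have unit normal component, so that its tangential component has magnitude at most $\mu$ and $\|f_i\| \le \sqrt{1+\mu^2}$. With $W_0 = 0$, the force part of the equilibrium condition gives $-f\,\hat{e} = \sum_i x_i f_i$, and hence

```latex
\[
f = \Big\| \sum_{i} x_i f_i \Big\|
  \;\le\; \sum_{i} x_i \,\| f_i \|
  \;\le\; \sqrt{1+\mu^2} \sum_{i} x_i
  \;=\; \sqrt{1+\mu^2},
\]
```

where the triangle inequality, $x_i \ge 0$, and $\sum_i x_i = 1$ have been used. Under a different normalization of the cone edges the constant would change accordingly.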

7.3.1 An Algorithm for Solving the Min-Max Problem

One immediate objection against the proposed procedure is that it seems
both unnecessary and computationally overwhelming to traverse the en-
tire surface of the object. However, it is shown in Section 7.3.3 that if
the object is a polyhedron, then it is enough to traverse its vertices only.
Therefore, in the following analysis, we will assume polyhedral objects.
The UGWS can be seen as the interior of a set of NP hyperplanes in
6D. Each hyperplane has a unit normal np and an offset d′p. The purpose
of the prime will be explained below. For a wrench W to be inside the
hull we must have

\[ n_p^T W + d'_p \le 0, \quad p = 1, \dots, N_P. \tag{7.5} \]
Looking for wrenches that are exactly on the boundary of the wrench
space can be done by looking for intersections with these hyperplanes.
Here we use the Quickhull program [12] to compute the hyperplanes.
Clearly, we must discretize the space of disturbance force directions.
This is most easily done using spherical coordinates: ê =
(cos ϕ sin θ, sin ϕ sin θ, cos θ)^T, where θ ∈ [0, π] and ϕ ∈ [0, 2π). Assume
that the grasp must resist some default offset wrench W0. If, in addition,
a disturbance force is applied at a point a on the object’s surface, then
the following must hold for the grasp to remain stable:

\[ -W_0 - f \begin{pmatrix} \hat{e} \\ a \times \hat{e} \end{pmatrix} = G x, \qquad x_i \ge 0, \qquad \sum_{i=1}^{mn} x_i = 1. \tag{7.6} \]

The minus signs in the left hand side of Equation (7.6) are necessary
because the grasp has to exert a wrench that cancels the external wrench
to maintain equilibrium.
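The spherical-coordinate discretization of the force directions can be sketched as follows. The function name and the regular grid are assumptions made for illustration; the exact grid used for the experiments is not specified here, so the number of samples produced by this sketch need not match the counts reported later.

```python
import math


def force_directions(step_deg=10):
    """Unit vectors (cos(phi) sin(theta), sin(phi) sin(theta), cos(theta))
    on a regular grid with theta in [0, pi] and phi in [0, 2*pi)."""
    directions = []
    n_theta = 180 // step_deg
    n_phi = 360 // step_deg
    for i in range(n_theta + 1):
        theta = math.radians(i * step_deg)
        for j in range(n_phi):
            phi = math.radians(j * step_deg)
            directions.append((math.cos(phi) * math.sin(theta),
                               math.sin(phi) * math.sin(theta),
                               math.cos(theta)))
    return directions


directions = force_directions(10)
```

Note that a grid of this kind repeats the pole directions for every value of ϕ, so it samples the regions near θ = 0 and θ = π more densely than the equator.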
With ê and a constant, the left hand side of Equation (7.6) describes
a line in wrench space parameterized by f:

\[ W = -W_0 - f\,W^\star, \tag{7.7} \]

where we have introduced the convenient notation

\[ W^\star = \begin{pmatrix} \hat{e} \\ a \times \hat{e} \end{pmatrix}. \tag{7.8} \]

Thus, if −W0 is inside the hull there exist two intersections between the line
and the hull boundary: One with negative f, which we are not interested
in, and one with positive f, which we are looking for. Now we only
need an effective algorithm for finding this positive f-value for each force
direction.
Assume that the wrench W from Equation (7.7) is exactly on hyperplane
p. Then we obtain, from Equation (7.5)

\[ -n_p^T (W_0 + f\,W^\star) + d'_p = 0, \quad p = 1, \dots, N_P. \tag{7.9} \]

We note that the term $-n_p^T W_0$ can be seen as a coordinate translation
of the wrench space, changing the plane offsets. Thus, we can introduce
$d_p = d'_p - n_p^T W_0$, which simplifies Equation (7.9) to

\[ -f\, n_p^T W^\star + d_p = 0, \quad p = 1, \dots, N_P. \tag{7.10} \]
Since it is assumed that the grasp is able to resist the offset wrench, we
must have dp ≤ 0 for all hyperplanes. This is a more general condition
than the commonly used d′p < 0, which requires the grasp to have force
closure. With only one unknown in Equation (7.10), we can solve directly
for f:

\[ f_p = \frac{d_p}{n_p^T W^\star}, \quad p = 1, \dots, N_P. \tag{7.11} \]

The smallest positive fp, taken over all hyperplanes, will correspond to
the largest disturbance force that can be applied to a particular vertex a.
Repeating this procedure for all vertices, for which ê is inside the friction
cone, will give the worst case disturbance force. However, computing fp
according to Equation (7.11) for all hyperplanes and all vertices seems
highly inefficient. The complexity of such an algorithm would be close to
O(NV NP ), where NV is the number of vertices on the object. In reality
it would be slightly lower due to ê falling outside the friction cone for
many vertices.
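The straightforward O(NV NP) evaluation described above can be sketched as follows. The hull used here is a hypothetical stand-in (an axis-aligned box in wrench space), not the convex hull of a real grasp matrix, and the vertices are assumed to have been filtered by the friction cone test already.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def max_force_brute(e, vertices, planes):
    """Worst-case admissible force magnitude along the unit direction e.

    planes: list of (n, d), with n a 6-vector and d <= 0 the offset already
    shifted by the offset wrench, so that n.W + d <= 0 inside the hull.
    """
    f_star = float("inf")
    for a in vertices:
        w_star = tuple(e) + cross(a, e)       # Equation (7.8)
        for n, d in planes:
            denom = dot(n, w_star)
            if denom >= 0.0:
                continue                      # plane is not reached for f > 0
            f_star = min(f_star, d / denom)   # Equation (7.11)
    return f_star


# Hypothetical stand-in hull: |force_k| <= 1.0 and |torque_k| <= 0.5.
planes = []
for k in range(6):
    for sign in (1.0, -1.0):
        n = [0.0] * 6
        n[k] = sign
        planes.append((tuple(n), -1.0 if k < 3 else -0.5))

# Vertices of a 2 x 2 x 5 box centred at the origin.
vertices = [(x, y, z) for x in (-1.0, 1.0) for y in (-1.0, 1.0)
            for z in (-2.5, 2.5)]

f_star = max_force_brute((1.0, 0.0, 0.0), vertices, planes)
```

For this stand-in hull the limiting plane is a torque plane: a force along x applied at z = ±2.5 produces a torque of magnitude 2.5 per unit force about the y-axis, which reaches the torque limit of 0.5 at f = 0.2.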

7.3.2 Improvement from Sorting the Hyperplanes

One way of reducing the complexity would be to precompute and keep
track of the minimum, positive f that can be achieved from each hyper-
plane. Calling these minimum values fmin,p , we can sort the hyperplanes
so that fmin,p is always increasing. This way we are likely to find the
limiting hyperplane earlier and we can stop the loop as soon as the cur-
rent minimum f is smaller than fmin,p . Using Equations (7.8) and (7.11)
together with ||ê|| = 1, we find the following simple lower bound:

\[ f_{\min,p} = \frac{-d_p}{\sqrt{n_{1p}^2 + n_{2p}^2 + n_{3p}^2} + a_{\max} \sqrt{n_{4p}^2 + n_{5p}^2 + n_{6p}^2}}, \tag{7.12} \]

where amax is the maximum vertex distance. Note that the minus sign is
necessary because dp is negative.
Noticing that ||a × ê|| is constant while looping over the hyperplanes,
this sorting idea can be further refined: With knowledge about ||a × ê||
we can obtain a better lower bound than that of Equation (7.12). So
instead of sorting only once, sorting can be done NB times where each
sorting is associated with an interval for ||a × ê||. Thus, we have NB
buckets, where the hyperplane ordering in each bucket b is determined
by sorting them increasing in $f^b_{\min,p}$, given by:

\[ f^b_{\min,p} = \frac{-d_p}{\sqrt{n_{1p}^2 + n_{2p}^2 + n_{3p}^2} + \frac{b\,a_{\max}}{N_B} \sqrt{n_{4p}^2 + n_{5p}^2 + n_{6p}^2}}. \tag{7.13} \]

So if (b − 1)amax/NB ≤ ||a × ê|| < b amax/NB, the current cross product
belongs to bucket b. One could find even better values for $f^b_{\min,p}$ using
the fact that ê ⊥ (a × ê), but the resulting optimization problem would
cost too much to make it worthwhile.
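The lower bounds and the bucket bookkeeping can be sketched as below. All function names are hypothetical, and a general-purpose sort is used in place of a specialized one. The torque-arm bound passed to `f_min` is amax for Equation (7.12) and b·amax/NB for bucket b.

```python
import math


def f_min(n, d, torque_arm_bound):
    """Lower bound on the positive f for one hyperplane (n, d) with d <= 0,
    valid for all points of attack with ||a x e|| <= torque_arm_bound."""
    nf = math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)
    nt = math.sqrt(n[3] ** 2 + n[4] ** 2 + n[5] ** 2)
    return -d / (nf + torque_arm_bound * nt)


def bucket_index(cross_norm, a_max, n_buckets):
    """Bucket b in 1..N_B with (b-1)*a_max/N_B <= cross_norm < b*a_max/N_B."""
    b = int(n_buckets * cross_norm / a_max) + 1
    return min(b, n_buckets)    # cross_norm == a_max goes to the last bucket


def bucket_orderings(planes, a_max, n_buckets):
    """For each bucket b, the plane indices sorted by increasing f_min^b."""
    orderings = {}
    for b in range(1, n_buckets + 1):
        bound = b * a_max / n_buckets
        orderings[b] = sorted(
            range(len(planes)),
            key=lambda p: f_min(planes[p][0], planes[p][1], bound))
    return orderings
```

The ordering can differ between buckets, because planes with a large torque part in their normal become more restrictive as the torque arm grows.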
In Figure 7.4 it is seen that the proposed sorting scheme has a significant
influence on the computational time. For sorting, a variation of the
Radix Sort algorithm [34] was used.2 This algorithm is special in that it
has a linear complexity in the number of elements to be sorted. In this
case, sorting always requires four traversals over the data, thus we know
that the time spent on sorting will be exactly proportional to the number
of buckets. As the number of buckets is increased, at some point there
will be no further gain from having the hyperplanes sorted. Any further
increase in the number of buckets will then cause a linear increase of the
total computational time. This is also seen in Figure 7.4.
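A radix sort on floating point keys can be sketched as follows. This is not the variation used for the experiments, only a minimal illustration under the assumption of 32-bit IEEE 754 keys, which gives exactly four passes of eight bits each. The key transform is one common way to handle the sign bit: negative values have all bits flipped, non-negative values get the sign bit set.

```python
import struct


def float_key(x):
    """Map a float (rounded to 32 bits) to an unsigned integer whose
    ordering matches the ordering of the floats."""
    (u,) = struct.unpack("<I", struct.pack("<f", x))
    if u & 0x80000000:          # negative: flip all bits
        return u ^ 0xFFFFFFFF
    return u | 0x80000000       # non-negative: set the sign bit


def radix_sort_floats(values):
    """Least-significant-digit radix sort: four stable passes of 8 bits."""
    items = [(float_key(v), v) for v in values]
    for shift in (0, 8, 16, 24):
        buckets = [[] for _ in range(256)]
        for key, v in items:
            buckets[(key >> shift) & 0xFF].append((key, v))
        items = [kv for bucket in buckets for kv in bucket]
    return [v for _, v in items]


data = [0.5, -1.25, 3.0, 0.0, -0.125, 2.5, 1e-3, -7.0]
result = radix_sort_floats(data)
```

Each pass touches every element once, so the work is linear in the number of elements. Values that coincide after rounding to 32 bits keep their input order.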
For optimal effect of the sorting, the number of buckets should be
set adaptively for each problem, depending upon the number of object
vertices and the number of hyperplanes. How this should be done re-
quires more numerical experiments and a deeper analysis of the algo-
rithm. However, experiments so far have shown that choosing NB = 30
gives a significant performance increase for most problems.
Assuming that the vertices have a uniform range distribution around
the origin, the limiting, or worst case, vertices will often be those which
generate large torque components. A further optimization is therefore
to sort the vertices, decreasing in ||a × ê||, before we loop over them.
To satisfy the assumption about uniform range distribution, the object
frame should be placed at the object’s centroid. The final algorithm is
described in Algorithm 1.
2 In books on algorithms, Radix Sort is described as an algorithm for sorting integer
values. It is, however, due to the IEEE Standard 754 floating point representation,
possible to use it for sorting floating point numbers as well; if they are interpreted as
integers, comparisons between them will still be correct (although some care has to
be taken with the sign bit).
[Figure 7.4: plot omitted. Vertical axis: time [ms]; horizontal axis: NB, from 0 to 100.]

Figure 7.4: The grasp from Figure 7.5 (a) was evaluated with Fg = 0
(8 object vertices, 416 hyperplanes, and 578 force directions). The curve
shows required computational time, in milliseconds, as a function of NB.
For the special case NB = 0, no sorting at all is done.

7.3.3 Disturbance Forces and Object Vertices

An important question is how many points on the surface of the body
we have to consider when moving around the disturbance force. Here
we will give a proof showing that, if the body is a polyhedron, then it is
sufficient to only consider its vertices.

Theorem 1 For a polyhedral body grasped by a unit grasp, the worst
point of attack for a disturbance force will always be a vertex of the body.

Proof: Without loss of generality, we assume that each face of the poly-
hedral body is a convex polygon; in case of a nonconvex polygon, we can
always decompose it into a finite number of convex polygons. The kth
face is the convex hull of the vertices a1 , a2 , . . . , aNk , where Nk is the
number of vertices.3 Any position a on the kth face can thus be written
as a convex combination of its vertices:
3 A more strict notation would show that the vertices of each polygon face are a subset
of the vertices of the whole body. However, besides running out of letters,
we believe that additional indices in this case would only lead to a notation that is
harder to read.

Algorithm 1 Compute disturbance-force surface

choose object vertices relevant to the task
give the offset wrench → W0
compute the convex hull of G → {n, d′}
transform the hyperplanes, dp = d′p − np^T W0
if (dp > 0 for any p) then
    abort {−W0 is not inside the hull}
end if
sort hyperplanes NB times, increasing in f^b_min,p

for all force directions êi do
    select vertices aj for which êi ∈ FCj
    sort vertices, descending in ||aj × êi||
    fi⋆ = +∞
    for all sorted vertices aj do
        find bucket, b = ⌊NB ||aj × êi|| / amax⌋ + 1
        for all sorted hyperplanes np, dp do
            if (fi⋆ < f^b_min,p) then
                break
            end if
            fcand = dp / (np^T W⋆)
            if (fcand ≥ 0 and fcand < fi⋆) then
                fi⋆ = fcand
            end if
        end for
    end for
end for
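The structure of Algorithm 1 can be sketched in Python as below. This is an illustration under several assumptions rather than the implementation used for the experiments: the hull planes are supplied by the caller (no convex hull code is included), a general-purpose sort replaces the radix sort, the friction cone test is a caller-supplied predicate `in_cone` (hypothetical, defaulting to accepting every vertex), and amax is assumed to be positive.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def disturbance_surface(directions, vertices, planes, w0, n_buckets=30,
                        in_cone=lambda e, j: True):
    """Return f* for every direction, following the structure of Algorithm 1.

    planes: list of (n, d_prime) with n.W + d_prime <= 0 inside the hull.
    """
    # Shift the plane offsets by the offset wrench: d = d' - n.W0.
    shifted = [(n, dp - dot(n, w0)) for n, dp in planes]
    if any(d > 0 for _, d in shifted):
        raise ValueError("-W0 is not inside the hull")
    a_max = max(norm(a) for a in vertices)

    def f_min(n, d, bound):
        # Lower bound on positive f when ||a x e|| <= bound.
        return -d / (norm(n[:3]) + bound * norm(n[3:]))

    # One plane ordering per bucket, increasing in the lower bound.
    order = []
    for b in range(1, n_buckets + 1):
        bound = b * a_max / n_buckets
        order.append(sorted((f_min(n, d, bound), n, d) for n, d in shifted))

    result = []
    for e in directions:
        cand = [(norm(cross(a, e)), a)
                for j, a in enumerate(vertices) if in_cone(e, j)]
        cand.sort(reverse=True)             # large torque arms first
        f_star = math.inf
        for c_norm, a in cand:
            w_star = tuple(e) + cross(a, e)
            # Zero-based bucket; order[b] was built with bound (b+1)*a_max/N_B.
            b = min(int(n_buckets * c_norm / a_max), n_buckets - 1)
            for lower, n, d in order[b]:
                if f_star < lower:
                    break                   # no later plane can improve f*
                denom = dot(n, w_star)
                if denom < 0.0:             # plane is reached for positive f
                    f_star = min(f_star, d / denom)
        result.append(f_star)
    return result
```

The early `break` is safe because the planes in each bucket are visited in order of increasing lower bound, so once the current minimum falls below the bound of one plane it is below the bounds of all remaining planes.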

\[ a = \sum_{i=1}^{N_k} \beta_i a_i, \qquad \sum_{i=1}^{N_k} \beta_i = 1, \qquad \beta_i \ge 0. \tag{7.14} \]

For a given disturbance force direction ê we only have to consider the
subset of faces for which the disturbance force satisfies the friction cone
constraint. In this subset, let us now consider the kth face, so that the
position of the disturbance force can be expressed with Equation (7.14).
Using Equations (7.8), (7.11), and (7.14), we can write the intersection
between the applied wrench and the pth hyperplane as

\[ f_p = \frac{d_p}{b_{0p} + b_{1p}\beta_1 + \dots + b_{ip}\beta_i + \dots + b_{N_k p}\beta_{N_k}}, \tag{7.15} \]
where b0p = n1p ex + n2p ey + n3p ez and bip , 1 ≤ i ≤ Nk , can be computed
from n4p , n5p , n6p , ex , ey , ez and a1 , . . . , aNk . It is seen that, for each
hyperplane, the minimum positive fp is achieved by letting the βi with
smallest coefficient become one (note that dp is negative for a stable
grasp). This clearly corresponds to one of the vertices of the polygon. So
for each hyperplane, the worst point of attack for the disturbance force
will correspond to a vertex of the polygon. Hence, the overall worst point
of attack will also be a vertex.
Since it is sufficient to consider only the vertices for each polygon
face, we conclude that we only have to consider the vertices of a polyhedral
body. Note that this conclusion is independent of the friction cone
opening angle: Narrowing the friction cone will only reduce the number
of polygon faces that have to be considered for each disturbance force
direction.
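The statement of the theorem can be checked numerically on a small case. The sketch below uses a hypothetical stand-in hull (an axis-aligned box in wrench space) and one rectangular face; it samples random convex combinations of the face vertices and compares the worst f found in the interior with the worst f found at the vertices.

```python
import random


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def worst_f(a, e, planes):
    """Smallest positive f over all planes for a force along e applied at a."""
    w_star = tuple(e) + cross(a, e)
    best = float("inf")
    for n, d in planes:
        denom = dot(n, w_star)
        if denom < 0.0:
            best = min(best, d / denom)
    return best


# Hypothetical stand-in hull: |force_k| <= 1.0 and |torque_k| <= 0.5.
planes = []
for k in range(6):
    for sign in (1.0, -1.0):
        n = [0.0] * 6
        n[k] = sign
        planes.append((tuple(n), -1.0 if k < 3 else -0.5))

face = [(1.0, -1.0, 2.5), (1.0, 1.0, 2.5), (-1.0, 1.0, 2.5), (-1.0, -1.0, 2.5)]
e = (0.6, 0.0, -0.8)                 # unit force pressing on the top face

f_vertices = min(worst_f(a, e, planes) for a in face)

rng = random.Random(0)
f_interior = float("inf")
for _ in range(1000):
    w = [rng.random() for _ in face]
    s = sum(w)
    # Random convex combination of the face vertices, Equation (7.14).
    a = tuple(sum(wi / s * v[k] for wi, v in zip(w, face)) for k in range(3))
    f_interior = min(f_interior, worst_f(a, e, planes))
```

In agreement with the proof, no sampled interior point gives a smaller f than the worst vertex, since each plane's denominator is linear in the coefficients βi.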

7.4 Examples
To illustrate the procedure proposed in this chapter, it was tested on two
small problems. For both problems, the coefficient of friction was set to
0.3, and each friction cone was approximated by eight force vectors. The
resolution for the disturbance force directions was set to 10° for both ϕ and
θ, resulting in a total of 578 force directions. This simple discretization
scheme will produce more samples for the force directions near θ = 0°
and 180°. A uniform, and therefore more efficient, discretization could be
made by choosing the force directions as the vertices of a geodesic dome.
When evaluating the grasp, the coefficient of friction for the disturbance
force was set to 1.5. Note that the friction cone for a vertex can be
defined somewhat arbitrarily. Here it is defined as the union of the cones
belonging to the planes forming the vertex. An additional cone, in the
direction of the averaged vertex normal is also added.

Adding Gravity In the first example we will look at the effect of
adding gravity in the evaluation of a grasp. To simplify interpretation,
the grasped object is simply a box and the grasp is symmetrical. The box
has dimensions 2 × 2 × 5. The object frame is placed in the middle of the box,
such that the z-axis points along the longer direction of the box. The
box is grasped by four frictional point contacts that are symmetrically
placed in the xy-plane. The coordinates of the contacts are (±1, 0, 0)^T
and (0, ±1, 0)^T, see Figure 7.5 (a). Solving for the convex hull of the
grasp matrix G results in 416 hyperplanes. The grasp was evaluated two
times; first without gravity and then with the gravitational force set to
Fg = 0.15. Gravity acts along the negative z-axis. Note that Fg = 0.3
is the upper limit for the unit grasp since it is assumed that µ = 0.3. In
Figure 7.6, two views of S are given for each grasp evaluation. In the
first case, Fg = 0, the problem is totally symmetric. Hence, the surface
S is symmetric around all axes and the grasp has almost equal strength
in all directions. Breaking the symmetry by adding a small gravitational
force has the major effect of translating S upwards, compare the upper
diagrams in Figure 7.6. This is due to upward directed disturbance forces
having to work against the gravitational force and vice versa. It is also
seen that the total volume enclosed by S has decreased due to the grasp
being preloaded by a nonzero W0 . As the gravitational force is further
increased, the volume reducing effect quickly becomes dominant. This
behavior simply reflects the loss of stability margins for the grasp as it is
being used more and more for just holding the weight of the object.
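The contact model of this example can be sketched in code as follows. The sketch makes assumptions that are not stated in this section: contact normals point into the box, each cone edge vector is normalized to unit normal component with tangential magnitude µ, and torques are not scaled. The function names are hypothetical. It produces the 4 × 8 = 32 wrench columns of a grasp matrix for the four contacts; computing the convex hull of these columns is left to a hull code and is not shown.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def cone_edges(normal, mu, m=8):
    """m force vectors on the boundary of the friction cone around a unit
    normal; each has unit normal component and tangential magnitude mu."""
    helper = (1.0, 0.0, 0.0) if abs(normal[0]) < 0.9 else (0.0, 1.0, 0.0)
    t1 = cross(normal, helper)
    length = math.sqrt(sum(c * c for c in t1))
    t1 = tuple(c / length for c in t1)      # first unit tangent
    t2 = cross(normal, t1)                  # second unit tangent
    edges = []
    for k in range(m):
        ang = 2.0 * math.pi * k / m
        edges.append(tuple(
            normal[i] + mu * (math.cos(ang) * t1[i] + math.sin(ang) * t2[i])
            for i in range(3)))
    return edges


def grasp_matrix_columns(contacts, mu, m=8):
    """One 6D wrench column (f, r x f) per cone edge and contact."""
    columns = []
    for r, normal in contacts:
        for f in cone_edges(normal, mu, m):
            columns.append(f + cross(r, f))
    return columns


# The four contacts of the box example, with assumed inward normals.
contacts = [((1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)),
            ((-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
            ((0.0, 1.0, 0.0), (0.0, -1.0, 0.0)),
            ((0.0, -1.0, 0.0), (0.0, 1.0, 0.0))]
G = grasp_matrix_columns(contacts, mu=0.3)
```

With this normalization, the vertical friction force available from a unit grasp is at most µ, which is one way to read the upper limit on the gravitational force mentioned above.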

Adding Contacts In this example we want to study the effect of
adding a contact to the grasp in the previous example. In the
presence of gravity, the grasp in Figure 7.5 (a) is weak in the downward
direction. To make most use of an additional contact, it should therefore
be placed under the box so that it can work against the gravitational
force. The resulting grasp is shown in Figure 7.5 (b). When compar-
ing the two grasps, the gravitational force was set to 0.1. The result is
shown in Figure 7.7. It is seen that the extra contact expands the surface
S in all directions, except in the positive z-direction. As expected, the
expansion is largest in the negative z-direction. The expansion of S in
the xy-plane is explained by the friction force between the box and the
extra contact: Due to its large torque arm, this friction force efficiently
counteracts the torque generated by a horizontal disturbance applied to
the upper or lower parts of the box.

Figure 7.5: Illustration of the grasps evaluated in the examples: (a) four
point contacts, (b) five point contacts. The
box dimensions are 2 × 2 × 5 and gravity is directed along the negative
z-axis. The coefficient of friction is set to 0.3 for all the contacts and
each friction cone is approximated by eight force vectors. Note that the
friction cones have been plotted on the outside of the object to make
them more visible.

7.5 Chapter Summary

When comparing task directed and task independent grasp measures to
each other, one most often finds that the former are more complicated
and take longer to compute. The proposed algorithm breaks this
pattern: Task information is easily included by removing object vertices
that are unlikely to be in contact during the task, thereby reducing the
size of the problem.
One of the main features of the proposed evaluation procedure is that
the torque-component of the wrench space is used implicitly, thereby
reducing the result to a 3D surface in force space. This surface is easily
interpreted and visualized. Furthermore, in the case of a zero offset
wrench, this surface is bounded by a sphere with radius $\sqrt{1 + \mu^2}$.
An algorithm solving the resulting min-max problem was also pro-
posed. It was shown that introducing a sorting procedure greatly reduced
the complexity of the algorithm. Further improvements would involve
using more uniformly distributed force directions (e.g., the vertices of
a geodesic dome) and downsampling of detailed objects (to reduce the
number of vertices that have to be considered).
Offset wrenches, like gravitational forces, are easily taken into ac-
count. In some cases the offset wrench can even have a stabilizing effect
on the grasp, as when carrying a plate in the open palm; the weight of
the plate helps the grasp to resist horizontal disturbance forces.

The surface S can be used as a quality measure as it is, but in many
cases one would prefer to work with a scalar quality measure. How to
derive a scalar quality measure from S is not addressed here, but it is
argued that the proposed procedure provides a sound basis for doing so.
One suggestion is to choose the minimum f⋆ over all directions.
A force with this magnitude can be resisted by a unit grasp no
matter where it is applied and no matter its direction. The drawback of
such a quality measure is that it could be overly conservative. If S is a
closed surface, one could instead use the volume enclosed by S as a less
conservative measure.
The proposed procedure can be used in grasp planning, for finding
good grasps for robot grippers. It can also be used as a validation gate in
a more reactive manner: If the executed grasp is good enough, the robot
will continue its task. Otherwise the object will be re-grasped.
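The two scalar measures suggested above can be sketched as follows. The sampling layout is an assumption chosen for simplicity (midpoints of a regular grid in θ and ϕ), and the function names are hypothetical. The volume uses the fact that S is given radially, so that V = (1/3) ∫∫ f⋆(θ, ϕ)³ sin θ dθ dϕ.

```python
import math


def min_radius(f_star):
    """Most conservative measure: the smallest f* over all sampled directions."""
    return min(f for row in f_star for f in row)


def enclosed_volume(f_star, n_theta, n_phi):
    """Volume enclosed by a closed surface r = f*(theta, phi), midpoint rule.

    f_star[i][j] is sampled at theta_i = (i + 0.5) * pi / n_theta and
    phi_j = (j + 0.5) * 2 * pi / n_phi.
    """
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            total += f_star[i][j] ** 3 * math.sin(theta)
    return total * d_theta * d_phi / 3.0


# Sanity check on a sphere of radius 0.1 in force space.
n_theta, n_phi = 90, 180
sphere = [[0.1] * n_phi for _ in range(n_theta)]
volume = enclosed_volume(sphere, n_theta, n_phi)
```

For the sphere the result should be close to (4/3)π · 0.1³, which gives a quick check of the quadrature before it is applied to a computed surface.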
[Figure 7.6: plots omitted. Left column: Fg = 0; right column: Fg = 0.15. Upper row: Fz versus Fx; lower row: Fy versus Fx.]

Figure 7.6: Illustration of the effect of adding gravity to the grasp
in Figure 7.5 (a). The left-hand figures show two views of the surface
S when gravity is neglected. In the right-hand figures, a gravitational
force of 0.15 is included. Because µ = 0.3, this is also the maximum
gravitational force a unit grasp can withstand. Note that the top of S is
plotted darker to make the views easier to identify and that the scale is
the same in all the figures.
[Figure 7.7: plots omitted. Left column: four contacts; right column: five contacts. Upper row: Fz versus Fx; lower row: Fy versus Fx.]

Figure 7.7: The left-hand figures show two views of the result for the
grasp with four point contacts in Figure 7.5 (a). The right-hand figures
show the result of adding one extra contact, working against the gravity,
see Figure 7.5 (b). The extra contact causes the surface S to expand
in all directions, except for the positive z-direction. As expected, the
expansion is largest in the negative z-direction. For both grasps, the
gravitational force was 0.1 and the coefficient of friction was 0.3.
Chapter 8

Summary and Suggestions for Future Work

This chapter briefly summarizes the key ideas presented in this thesis.
The summary of each chapter ends with suggestions for future work and
research.

Chapter 2
Chapter 2 presented the design of CoPP, an object-oriented framework for
path planning. The framework can be considered as a set of LEGO blocks
that can be used to build path planners; the blocks can be combined in
many different ways and blocks that share the same interface can
replace each other. The most important aspects of the framework are:

• It makes it easy to make fair comparisons between variations of a
concept or an algorithm.
• It is flexible and easy to extend.
• It is portable.

Future work for this framework would involve adding classes for dy-
namic systems. With this addition, the framework could be used for
studying methods for kinodynamic motion planning as well.

Chapter 3

A path planning problem involves some moving system, which could be
a single rigid body, a kinematic chain, or a dynamic model of an aircraft.
Chapter 3 presented a class for representing robots with a tree-like kine-
matic structure. This class can be used to model a wide range of robots
and includes features such as

• Customizable inverse kinematics

• Joint couplings

• Hierarchical robot composition

The possibility to compose complex robots from simpler ones has several
benefits. With a complex robot it might not make sense to consider all
degrees of freedom at the same time. With a hierarchical composition it
is easy to extract the sub-robot that involves the degrees of freedom we
are interested in. The resulting modularity also makes it easy to change
various parts of a robot. One can for example change the end-effector tool
of a robot from a parallel-jaw gripper to a welding tool.
Note that the CoPP framework does not hinge upon this robot class;
other models of motion are easily added to the framework.
A possible improvement for the robot class would be to allow other
types of joint couplings. Specialized joint couplings could be used to
model slider-crank mechanisms common in, e.g., excavator arms.

Chapter 4

Chapter 4 presented a method for augmenting bidirectional RRT-planners
with local trees. The experimental results in this chapter show
that local trees help to reduce the required planning time for problems
that involve narrow passages. The best effect was seen on problems where
the solution trajectory had to pass a sequence of difficult passages.
For unbalanced problems where either the start tree or the goal tree is
stuck, the method is inefficient because too many nodes are added to one
of the trees. A future version of the algorithm should therefore strive
to keep the trees balanced, focusing more on the tree that has difficul-
ties to expand. Future research would also involve more experiments to
determine better heuristics for when and how to grow local trees.

Chapter 5
Chapter 5 presented a high-level planner for pick-and-place tasks. Such
tasks were divided into three phases: approach, transport, and return. A
bidirectional RRT-planner was used to find motions for each phase. The
contributions of this chapter are:

• The use of multiple goals in case many arm configurations can reach
the same grasp.

• Retract planners that preprocess start and goal configurations into
configuration space trees.

• Grasp generators that serve as a useful abstraction of the grasp planning
process.

• A simple model that allows the planner to handle different task
constraints.

This planner has been shown to perform well in virtual environments, where
the map is exact and sensors are perfect. The real challenge would be
to implement this planner on a real robot system. This would call for
sensor-based path planning methods for mobile manipulation, an area
where not much work has been done.

Chapter 6
The last part of the thesis shifted focus towards grasp planning and grasp
stability. Chapter 6 presented a fast grasp planner for a three-fingered
hand. The input to the planner is only a 2D contour, with optional depth
information. Even though the input may be purely two-dimensional, the
grasp planning takes place in 3D. The grasp planner works by quickly
generating grasp hypotheses, which are evaluated in a stepwise fashion; a
hypothesis that does not pass an intermediate evaluation step is idle until
no better grasp is found. This way, the most computationally expensive
steps are postponed until the planner has found a really promising grasp
hypothesis.
Considering noisy sensor data and modeling errors, planned grasps
should be robust against positional and geometric errors. The planner
explicitly tests a grasp’s robustness by disturbing the grasp configuration
in a fixed number of directions. The number of valid “neighbors” is used
as a measure of the grasp’s robustness.

In most cases, the 2D contour that is the input to the planner corresponds
to the top view of the object. To be more flexible, however, the
grasp planner should be combined with a view planner that plans which
view of the object should be used to extract the 2D contour.

Chapter 7
Chapter 7 presented a novel approach to grasp stability evaluation. The
approach is based on the ability of a grasp to reject external disturbance
forces. The advantages compared to other approaches are:

• Inclusion of the complete object geometry

• Independent of the choice of reference frame

• Easy to do task directed evaluations

The proposed method produces a three-dimensional surface in the
force-space, but grasp planning algorithms need scalar measures to compare
grasps. Hence, a subject for future research would be how to derive
a scalar measure from the three-dimensional force surface.
Appendix A

Class Diagram Notation

Throughout this thesis we have used class diagrams to describe designs
and relationships between classes. This appendix describes the class diagram
notation itself. The notation is based on OMT (Object Modeling
Technique), invented by Rumbaugh et al. [126], but we have added some
conventions to be able to express C++ constructs like template functions.
A class diagram is a graph whose nodes are classes and whose arcs
are relationships between classes; it depicts classes, their structure, and
the static relationships between them. In the OMT notation, a class is
denoted by a box with the class name in bold font at the top. The key
operations of the class are listed in a separate compartment below the
class name. Any member variables are listed in another compartment,
below the operations.1 See Figure A.1 for two examples.

Abstract and Concrete Classes An abstract class is a class where
one or more of its operations lack implementation; its purpose is to define
a common interface for derived classes. An operation that has no imple-
mentation is an abstract operation (also known as a pure virtual function
in C++). A class that has no abstract operations is a concrete class.
The OMT notation [126] has no typographical construct to distinguish
between abstract and concrete classes. Here we follow the convention of
Gamma et al. [45] and set the names of abstract classes in italics, see
Figure A.1. We use the same convention for the operations also. Note that
it is not possible to instantiate any objects using an abstract class; that
would not make sense as some of the operations lack implementation.

Footnote 1: Rumbaugh et al. [126] place the member variables above the list
of operations. We do the opposite to stress that it is the operations of a
class, and not its member variables, that are the most important.

[Figure A.1: two class boxes. AbstractClassName lists the operations
AbstractOperation(Type arg) and Type ConcreteOperation( ). ConcreteClassName
lists the operations ConcreteOperation1(arg1, arg2) and
Type ConcreteOperation2( ). The member variable compartment lists
member_variable1 and Type member_variable2.]

Figure A.1: Examples of abstract and concrete classes. The names of
abstract classes and of pure virtual functions (functions without an
implementation) are set in italics. Note that type information for arguments
and return value is optional.

Class Inheritance New classes can be defined in terms of existing

classes using inheritance. An inheritance relationship is indicated with a
vertical line and a triangle that points toward the parent class, see Fig-
ure A.2. As a convention, the parent class is placed above its subclasses.
As it is clear that a subclass inherits all the operations and all the mem-
ber variables of its parent class, inherited attributes are not repeated in
the class diagram, unless there is a special reason for doing so.

Aggregation and Acquaintance It is common to use object compo-

sition to create more complex objects. For example, an object represent-
ing a drawing can be composed of graphical objects such as lines and
polygons, see Figure A.2 (a). As another example, a company can be
seen as a composition of its divisions, and each division as a composi-
tion of its departments, see Figure A.2 (b). This type of relationship is
denoted aggregation, see [126, 45]. An aggregation relationship implies
that one object owns or is responsible for another object. As a result,
an aggregated object should never outlive its owner. In class diagrams,
aggregation is shown as an arrow with a diamond shape at its base; the
arrow points toward the aggregated object and an optional label can fur-
ther describe the role of the relationship, see Figure A.2. A filled circle at
the arrow defines a one-to-many relationship. In the case of the drawing
in Figure A.2 (a), a single drawing may consist of many graphical objects.
[Figure A.2, panel (a) Drawing application: classes Drawing, Shape, Line,
Polygon, and Texture, with the relationship from Drawing to Shape labeled
"shapes". Panel (b) Company: classes Company, Division, Department, and
Person, with the relationship from Company to Person labeled "employees".]

Figure A.2: Illustration of different class relationships.

Objects of one class may use the services provided by objects of other
classes. For example, as shown in Figure A.2 (b), a company may use
the services provided by its employees. In the example in Figure A.2 (a),
we allow polygons to be textured. As texture objects consume much
memory, a possible optimization is to let polygons with identical tex-
tures share a common texture object. Thus, polygon objects do not own
the texture objects; they only use them. This type of relationship is
denoted acquaintance, see [126, 45]. With acquaintance, an object merely
knows of another object; thus it is a much weaker relationship than
aggregation. In class
diagrams, acquaintance is depicted just as aggregation, except that there
is no diamond shape at the base of the arrow, see Figure A.2.
Although aggregation and acquaintance are usually implemented the
same way, they may lead to different semantics. Consider again the exam-
ple of the company class diagram in Figure A.2 (b): Had the relationship
between a company and its employees been aggregation instead of ac-
quaintance, the meaning would be that companies are like labor camps,
from which the employees never leave.
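
In C++, the difference between aggregation and acquaintance can be expressed through ownership. The following is a minimal sketch and not CoPP code: the class names follow Figure A.2 (a), whereas the members Area, Add, NumShapes, TotalArea, and GetTexture are hypothetical names introduced for illustration. The drawing owns its shapes through std::unique_ptr, while a polygon only holds a non-owning pointer to a shared texture. Shape also serves as an example of an abstract class, since Area is a pure virtual function.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Abstract class: Area is pure virtual, so Shape cannot be instantiated.
class Shape {
public:
    virtual ~Shape() = default;
    virtual double Area() const = 0;
};

struct Texture {
    int id;
};

// Concrete class that is acquainted with a Texture but does not own it.
class Polygon : public Shape {
public:
    Polygon(double area, const Texture* texture)
        : area_(area), texture_(texture) {}
    double Area() const override { return area_; }
    const Texture* GetTexture() const { return texture_; }

private:
    double area_;             // simplification: the area is stored directly
    const Texture* texture_;  // acquaintance: shared, not owned
};

// Aggregation: a Drawing owns its shapes, so they never outlive it.
class Drawing {
public:
    void Add(std::unique_ptr<Shape> shape) {
        assert(shape != nullptr);
        shapes_.push_back(std::move(shape));
    }
    std::size_t NumShapes() const { return shapes_.size(); }
    double TotalArea() const {
        double sum = 0.0;
        for (const auto& s : shapes_) sum += s->Area();
        return sum;
    }

private:
    std::vector<std::unique_ptr<Shape>> shapes_;  // one-to-many aggregation
};
```

When a Drawing object is destroyed, its shapes are destroyed with it, whereas the Texture, being only an acquaintance, is left untouched and must be kept alive by its actual owner.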

Conventions Names of types and operations begin with a capital let-

ter and contain no underscores. The names of any attributes or function
arguments begin with a lower case letter. If an attribute name is a com-
bination of several words, the words are separated by underscores. These
conventions make it easier to distinguish between types and operations
on the one hand and attributes and arguments on the other hand.
[Figure A.3, panel (a) Template class: the class Complex with operations
RealT Abs( ), RealT Imag( ), ... and member variables RealT real and
RealT imag. Panel (b) Access modifiers: the class ClassName with operations
+ PublicOperation( ), # ProtectedOperation( ), - PrivateOperation( ), and
$ ClassOperation( ).]

Figure A.3: Conventions used to illustrate template classes and to convey
information about access rights.

The OMT class diagram notation has no means of expressing
C++ constructs like template classes and template functions. To show
that a function is a template function, we use the convention to let the
name of the type of its argument end with the letter “T”. We use the
same convention to show that a class is a template with respect to a
certain type. As an example, consider the class Complex for representing
complex numbers. We can implement this class as a template, where
the numeric type of the real part and the imaginary part is the template
argument, see Figure A.3 (a).
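
In C++, the template class of Figure A.3 (a) could look roughly as follows. This is only a sketch: the constructor and the accessor Real are assumptions added to make the example complete, and the trailing underscores on the member variables are a choice made here to avoid name clashes with the accessors.

```cpp
#include <cmath>

// Complex number whose numeric type RealT is the template argument.
template <typename RealT>
class Complex {
public:
    Complex(RealT real, RealT imag) : real_(real), imag_(imag) {}

    RealT Real() const { return real_; }
    RealT Imag() const { return imag_; }

    // Magnitude of the complex number.
    RealT Abs() const { return std::sqrt(real_ * real_ + imag_ * imag_); }

private:
    RealT real_;
    RealT imag_;
};
```

The same class can then be instantiated as, e.g., Complex<float> or Complex<double> without duplicating any code.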
For some design patterns, the access rights to the participating class
methods are an important part of the pattern. Where it is important to
convey access rights, we have borrowed the conventions from the UML
notation [127]. See Figure A.3 for a description of this notation. In case
these symbols are omitted, it can be assumed that all class methods are
public. Member variables are always private, thus no access symbols are
used for these.
In this thesis we use class diagrams to illustrate the high-level design
of classes. Thus, the class diagrams never list every operation of a class.
Instead they list the operations that are most important to convey the
design of a class.
Appendix B

Rigid Body
Transformations

Because path planning deals with moving geometric objects, we need a

systematic way to describe position and orientation. In this appendix
we discuss rigid body transformations. In particular we look at different
ways of representing rotations and their drawbacks and advantages from a
path planning point of view. Note that the words rotation and orientation
are used interchangeably in this discussion; any orientation of a geometric
object can be seen as the result of applying a single rotation to the object
when it is in its default orientation.

B.1 The Homogeneous Transformation Matrix

In this section we briefly discuss the homogeneous transform as a way
of representing rigid body transformations. The discussion will form a
basis for the design and implementation of a class that encapsulates the
concept of homogeneous transformations.
[Figure B.1: the frames $F_A$ and $F_B$ with their axes, a point $P$, and
the vectors ${}^{A}P$, ${}^{B}P$, and ${}^{A}P_{B,ORG}$.]

Figure B.1: Illustration of the notation used to distinguish between
different coordinate frames.

To each object we attach a coordinate frame. Using the notation of
Craig [36], let ${}^{A}_{B}T$ denote the homogeneous transformation matrix
that maps position vectors from frame $F_B$ to frame $F_A$. That is, given a
position ${}^{B}P$ in frame $F_B$, its representation in frame $F_A$,
${}^{A}P$, is given by

$$ {}^{A}P = {}^{A}_{B}T \, {}^{B}P. \tag{B.1} $$

Note that the position vectors must be expressed in homogeneous coordinates,
i.e., $P = (x, y, z, 1)^T$ (if we assume no scaling). The notation with
leading sub- and superscripts might seem awkward at first sight, but it
provides a simple mnemonic that helps us write transform equations
correctly: a subscript must match the following superscript. If we only
consider rigid body transformations, then the homogeneous transformation
matrix has the following structure:

$$ {}^{A}_{B}T = \begin{bmatrix} {}^{A}_{B}R & {}^{A}P_{B,ORG} \\ 0_{1\times 3} & 1 \end{bmatrix}, \tag{B.2} $$

where ${}^{A}_{B}R$ is the 3 × 3 rotation matrix that describes the
orientation of frame $F_B$ with respect to frame $F_A$ and
${}^{A}P_{B,ORG}$ denotes the origin of frame $F_B$, see Figure B.1.
In fields like computer graphics, the last row can be used for perspective
transformations. For our purposes however, the extra dimension can
be seen as a construct that allows us to treat rotations and translations in
a uniform manner. The columns of the rotation matrix in Equation (B.2)
can be interpreted as the principal directions of frame $F_B$ expressed in
frame $F_A$:

$$ {}^{A}_{B}R = \begin{bmatrix} {}^{A}\hat{X}_B & {}^{A}\hat{Y}_B & {}^{A}\hat{Z}_B \end{bmatrix}. \tag{B.3} $$

Given ${}^{A}_{B}T$, we can use ordinary matrix inversion to find
${}^{B}_{A}T$. However, if we consider that rotation matrices are
orthonormal, the following equation gives a more efficient method for
computing the inverse:

$$ {}^{A}_{B}T^{-1} = \begin{bmatrix} {}^{A}_{B}R^{T} & -{}^{A}_{B}R^{T}\,{}^{A}P_{B,ORG} \\ 0_{1\times 3} & 1 \end{bmatrix}. \tag{B.4} $$

So far we have seen the homogeneous transform as just a mapping
of position vectors from one frame to another. It can also be seen as
the description of one frame relative to another; if we attach a frame $F_B$
to a geometric object, such that $F_B$ will move together with it, then
the position and orientation of the object relative to the world frame $F_W$
is completely specified by ${}^{W}_{B}T$. In this context we will denote
${}^{W}_{B}T$ as the pose of the object.

Often the pose of an object is not given directly in terms of the world
frame. This is the case when we deal with kinematic chains, where the
pose of each frame is known only with respect to the previous frame. To
obtain the global pose of, say, frame j in the chain, we multiply all the
transforms leading to this frame:

$$ {}^{W}_{j}T = {}^{W}_{0}T \, {}^{0}_{1}T \cdots {}^{j-2}_{j-1}T \, {}^{j-1}_{j}T. \tag{B.5} $$

Note again how the notation helps us to get the order of matrix
multiplications right.
As an example, consider Figure B.2 where a robot is about to grasp
a cylinder. For the sake of clarity, the robot arm, to which the gripper is
attached, is not shown. Using Equation (B.5), we have obtained the pose
of the hand, ${}^{W}_{E}T$, where subscript E stands for end effector. We
assume that the pose of the cylinder is known, and we denote it
${}^{W}_{C}T$. To plan the task, we would like to know the pose of the
cylinder in terms of the end-effector frame. Using Equation (B.5), we
readily obtain

$$ {}^{E}_{C}T = {}^{E}_{W}T \, {}^{W}_{C}T, \tag{B.6} $$

where ${}^{E}_{W}T = {}^{W}_{E}T^{-1}$.

Figure B.2: A parallel-jaw gripper that is about to grasp a cylinder.

More detailed expositions on the material presented in this section can
be found in any introductory textbook on robotics or computer graphics.
See, e.g., Craig [36], Murray et al. [107], or Watt [153].

B.1.1 A Class for Homogeneous Transformations

The homogeneous transform introduced in Equation (B.2) is very useful
because it allows us to treat both rotations and translations in a uni-
form manner. However, for computational purposes, the representation
is not efficient; using 4 × 4 matrices in a computer program would not
only waste memory, but also time on multiplying by zeros and ones.
Thus, we would like to have a class that mimics the syntax of Equa-
tions (B.1) and (B.5), but implements it more efficiently. If we consider
Equations (B.2) and (B.1), and go back to ordinary vector representation,
it is easy to see what the homogeneous transform actually does:

$$ {}^{A}P = {}^{A}_{B}R \, {}^{B}P + {}^{A}P_{B,ORG}. \tag{B.7} $$

So, considering the absence of scaling and perspective transformations,
the interface of the class Transform in CoPP follows the form of Equa-
tion (B.1), whereas the implementation uses Equation (B.7).
So far we have only discussed the transformation of position vectors,
which according to Equation (B.7) are transformed by both a rotation
and a translation. However, sometimes we want to transform free vectors,
i.e., vector quantities like velocities and forces. Transforming such vectors
according to Equation (B.7) would be an error, because they should be
rotated only. If we use the homogeneous transform, this can be solved if
we follow the convention that free vectors have the following homogeneous
representation: $V = (x, y, z, 0)^T$. In CoPP, position vectors and free
vectors are treated as two different types. That allows multiplication
between a transform and a vector to have different meanings, depending
on the type of the vector.
As seen from Equations (B.2) and (B.5), multiplying two transforms
involves multiplying two rotation matrices. If ordinary matrix algebra
is used, the operation of multiplying two rotation matrices will require
27 multiplications and 18 additions. However, if we utilize that the last
column in a rotation matrix is equal to the cross product of the first two
columns, then the operation count can be reduced to 24 multiplications
and 15 additions. This optimization is used by the Transform class.
In summary, the class Transform overloads the operator * to repre-
sent:

• Multiplication of two transforms

• Transformation of position vectors

• Transformation of free vectors

In addition the class also provides several useful routines, such as conver-
sion from and to the equivalent axis-angle representation of the rotational
part.
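
The ideas of this section can be collected in a small sketch of such a class. It is not the actual CoPP implementation: the types Point and Vector, the row-major storage, and the method Inverse are assumptions made for illustration, and the cross-product optimization for multiplying rotation matrices is left out for brevity. The interface follows the form of Equation (B.1), whereas the implementation rotates and translates position vectors, only rotates free vectors, and computes the inverse as in Equation (B.4).

```cpp
#include <array>

struct Point  { double x, y, z; };  // position vector: rotated and translated
struct Vector { double x, y, z; };  // free vector: rotated only

class Transform {
public:
    // rot is the 3x3 rotation matrix in row-major order, org the frame origin.
    Transform(const std::array<double, 9>& rot, const Point& org)
        : r_(rot), p_(org) {}

    // Transformation of a position vector: R * q + p.
    Point operator*(const Point& q) const {
        return Point{r_[0] * q.x + r_[1] * q.y + r_[2] * q.z + p_.x,
                     r_[3] * q.x + r_[4] * q.y + r_[5] * q.z + p_.y,
                     r_[6] * q.x + r_[7] * q.y + r_[8] * q.z + p_.z};
    }

    // Transformation of a free vector: R * v, without translation.
    Vector operator*(const Vector& v) const {
        return Vector{r_[0] * v.x + r_[1] * v.y + r_[2] * v.z,
                      r_[3] * v.x + r_[4] * v.y + r_[5] * v.z,
                      r_[6] * v.x + r_[7] * v.y + r_[8] * v.z};
    }

    // Multiplication of two transforms, as used in Equation (B.5).
    Transform operator*(const Transform& t) const {
        std::array<double, 9> r{};
        for (int i = 0; i < 3; ++i)
            for (int j = 0; j < 3; ++j)
                for (int k = 0; k < 3; ++k)
                    r[3 * i + j] += r_[3 * i + k] * t.r_[3 * k + j];
        return Transform(r, (*this) * t.p_);
    }

    // Inverse as in Equation (B.4): transpose R and rotate the negated origin.
    Transform Inverse() const {
        const std::array<double, 9> rt = {{r_[0], r_[3], r_[6],
                                           r_[1], r_[4], r_[7],
                                           r_[2], r_[5], r_[8]}};
        const Transform rot_only(rt, Point{0.0, 0.0, 0.0});
        return Transform(rt, rot_only * Point{-p_.x, -p_.y, -p_.z});
    }

private:
    std::array<double, 9> r_;
    Point p_;
};
```

Because Point and Vector are distinct types, the compiler selects the correct overload of operator*, so a velocity or a force can never be translated by mistake.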

B.2 Representing Rotations

Whereas representing a translation is straightforward, this is not the
case with orientations in 3D. Many different representations have been
suggested, each with its own merits and drawbacks. Here we will take a
look at the more common representations and see how they are useful in
the context of path planning.

B.2.1 Rotation Matrices

As seen in Section B.1, a 3×3 rotation matrix can be used to represent the
orientation of one frame with respect to another. This is a very common
way of representing rotations with several advantages such as: elegant
and straightforward syntax, efficient transformation of coordinates, and
existence of specialized hardware for such operations.
There are also some drawbacks with rotation matrices. The nine elements
of a rotation matrix are clearly not independent of each other; in
fact, three parameters are sufficient to describe any rotation
matrix. Thus, this representation is inefficient when it comes to memory
use. Furthermore, due to the finite precision of computers, multiplica-
tion of rotation matrices introduces numerical errors. If many successive
rotations are combined, the resulting matrix may no longer be orthonor-
mal. Using such a matrix to transform an object will not only rotate, but
also shear and scale the object. The problem with numerical drift can be
avoided if the resulting matrix after each multiplication is replaced with
the “closest” orthonormal matrix. This operation is however nontrivial
and costly. If numerical drift should become a problem, then quaternions
provide a more efficient solution, see Section B.2.3.
For applications that require input from humans, rotation matrices
are not user-friendly; imagine typing a nine-element orthonormal matrix
correctly. For such situations, other representations such as axis-angle,
or Euler angles, are used instead.

B.2.2 Euler Angles

Euler angles are a more compact way of representing three-dimensional
rotations compared to rotation matrices. With Euler angles, a rotation
is expressed as the result of three successive rotations, α, β, and γ, where
each rotation is about one of the coordinate axes. Commonly the rota-
tions are chosen as: α about the x-axis, followed by β about the y-axis,
and finally γ about the z-axis. The rotation angles are sometimes referred
to as roll, pitch, and yaw angles [36].
Euler angles have the benefit of being more compact than rotation
matrices; now only three numbers are required instead of nine. They are
also more user-friendly for applications that require the user to specify
rotations. There are however several drawbacks with Euler angles when
used in the context of path planning. Most of these drawbacks are be-
cause Euler angles are incapable of correctly representing the topology of
SO(3), causing both theoretical and practical problems.
The roll, pitch, yaw angles are just one possible convention for the
sequence of rotations and for the axes about which they are performed.
Some conventions do not even rotate around the axes of the fixed reference
frame; instead they rotate around the axes of the moving frame.
Craig [36] showed that, if only the principal axes of the frames are used,
there are no less than 24 possible Euler angle sets. Even though only a
few of these are used in practice, confusion may arise if it is not clear
which particular convention is used.

Even if it is clear which particular convention is used, there are other,

more severe, problems. For a given convention, there are always multiple
sets of parameters which yield the same rotation. Furthermore, there are
even cases where an interval of one of the rotation angles yields the same
rotation. As an example, consider the following common convention, where
the rotations are about the axes of the moving frame: Rotate α about
the z-axis, followed by β about the (new) y-axis, and finally γ about the
(once again, new) z-axis. It is easily verified that rotations (α, β, γ) of the
form (α, 0, −α) yield the unit rotation matrix. Thus there are infinitely
many representations of the identity rotation with this convention. This
behavior is due to a singularity in the conversion from a rotation to the
corresponding Euler angles. Other conventions avoid the singularity for
the identity rotation, but then it occurs for some other rotation instead.
Indeed, it is a fundamental topological fact that singularities can never
be eliminated in any three-dimensional representation of SO(3) [107].
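
The singularity in the example above is easy to check numerically. The following sketch assumes the moving-axes z-y-z convention described in the text, for which the combined rotation is R = Rz(α) Ry(β) Rz(γ); the type Mat3 and all function names are hypothetical helpers. It measures how far the resulting matrix is from the identity.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

using Mat3 = std::array<double, 9>;  // row-major 3x3 matrix

Mat3 Mul(const Mat3& a, const Mat3& b) {
    Mat3 c{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                c[3 * i + j] += a[3 * i + k] * b[3 * k + j];
    return c;
}

Mat3 RotZ(double a) {
    const double c = std::cos(a), s = std::sin(a);
    return Mat3{{c, -s, 0.0, s, c, 0.0, 0.0, 0.0, 1.0}};
}

Mat3 RotY(double a) {
    const double c = std::cos(a), s = std::sin(a);
    return Mat3{{c, 0.0, s, 0.0, 1.0, 0.0, -s, 0.0, c}};
}

// Rotations about the axes of the moving frame compose by
// post-multiplication: R = Rz(alpha) * Ry(beta) * Rz(gamma).
Mat3 EulerZYZ(double alpha, double beta, double gamma) {
    return Mul(Mul(RotZ(alpha), RotY(beta)), RotZ(gamma));
}

// Largest absolute difference between an element of r and the identity.
double DeviationFromIdentity(const Mat3& r) {
    double worst = 0.0;
    for (int i = 0; i < 9; ++i) {
        const double expected = (i % 4 == 0) ? 1.0 : 0.0;  // diagonal entries
        worst = std::max(worst, std::fabs(r[i] - expected));
    }
    return worst;
}
```

For any α, the angle triple (α, 0, −α) gives a deviation that is zero up to round-off, which illustrates that the identity rotation has infinitely many representations in this convention.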
Another problem with Euler angles is that they are not suitable for
interpolation. Interpolating between two orientations using Euler angles
often results in jerky, unnatural looking motions. This is not only a cos-
metic problem, because the interpolated motions also cause the moving
body to generate a larger swept-volume [70]. This is disadvantageous
to a path planning algorithm as the larger swept-volume increases the
probability of collision.
When choosing the limits for an Euler angle representation, care must
be taken to avoid a double coverage of SO(3). To set the range of each
angle to [−π, +π], would thus be an error; one of the angles should have
range [−π/2, +π/2]. Which angle should have reduced range depends on
which angle convention is used.

B.2.3 Quaternions
In this section we give a brief introduction to quaternions and the benefits
from using them to represent rotations. For more details, see, e.g., the
book by Kuipers [73].
Quaternions are a generalization of complex numbers and can be used
to represent rotations in much the same way as complex numbers on
the unit-circle can be used to represent two-dimensional rotations. A
quaternion h can be written as a linear combination

$$ h = w \cdot 1 + x\,i + y\,j + z\,k, \qquad x, y, z, w \in \mathbb{R}, $$

where w is the scalar component of the quaternion, and the symbols i, j,
and k denote the “imaginary” parts of a quaternion. Thus, a quaternion
can be seen as a four-dimensional vector. To honor the originator,
the mathematician William Rowan Hamilton, the set of quaternions is
often denoted by H. Hamilton defined the following relationships for the
imaginary parts: $i^2 = j^2 = k^2 = ijk = -1$. From these definitions, the
formula for multiplication of two quaternions can be derived. This for-
mula can be written compactly if the imaginary part of a quaternion,
xi + yj + zk, is written as vector v. Multiplication of two quaternions h1
and h2 is then given by:

h1 · h2 = (w1 w2 − v1 · v2 , w1 v2 + w2 v1 + v1 × v2 ). (B.8)
Note that due to the cross-product in Equation (B.8), quaternion multi-
plication does not commute.

Quaternions and Rotations It can be shown that any unit quater-

nion represents a rotation [73]. More precisely, the rotation about the
unit vector n by an angle θ is given by the quaternion

h = (cos(θ/2), sin(θ/2)n). (B.9)

The set of all unit quaternions, and hence the set of all rotations, can be
seen as the surface of the unit sphere in four dimensions, S3. One of the
advantages of this representation is that it, in contrast to Euler angles,
lacks singularities. The only “tricky” part that requires consideration is
that antipodal points of S3 represent the same rotation. Thus h and −h
yield the same rotation. This is no stranger than the fact that rotating by
an angle θ about the axis n is equivalent to rotating by an angle −θ about
the axis −n. To ensure that each rotation is associated with a unique quaternion,
we could for example choose to only use the upper hemisphere of S3 .
Applying a sequence of rotations, represented by quaternions h1 and
h2 , is equivalent to quaternion multiplication as given by Equation (B.8).
That rotations are not commutative is thus reflected by the noncommu-
tative quaternion multiplication. Equation (B.8) shows two additional
advantages of using quaternions to represent rotations. Compared to
matrix multiplication, Equation (B.8) uses fewer operations. Furthermore,
as with rotation matrices, quaternions can also suffer from numerical
drift that accumulates with each quaternion multiplication. For quaternions,
this problem is easily remedied as it is reduced to normalizing a
four-dimensional vector to unit length.
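
Equations (B.8) and (B.9), together with the normalization step, can be sketched in a few lines of C++. The struct Quat and the functions Mul, FromAxisAngle, and Normalize are hypothetical names, and FromAxisAngle assumes that the axis passed in is already of unit length.

```cpp
#include <cmath>

struct Quat { double w, x, y, z; };  // w is the scalar part, (x, y, z) is v

// Equation (B.8) written out component-wise:
// (w1 w2 - v1.v2, w1 v2 + w2 v1 + v1 x v2).
Quat Mul(const Quat& a, const Quat& b) {
    return Quat{a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
                a.w * b.x + b.w * a.x + a.y * b.z - a.z * b.y,
                a.w * b.y + b.w * a.y + a.z * b.x - a.x * b.z,
                a.w * b.z + b.w * a.z + a.x * b.y - a.y * b.x};
}

// Equation (B.9): rotation by theta about the unit axis (nx, ny, nz).
Quat FromAxisAngle(double nx, double ny, double nz, double theta) {
    const double c = std::cos(theta / 2.0), s = std::sin(theta / 2.0);
    return Quat{c, s * nx, s * ny, s * nz};
}

// Removes accumulated numerical drift by rescaling to unit length.
Quat Normalize(const Quat& q) {
    const double n = std::sqrt(q.w * q.w + q.x * q.x + q.y * q.y + q.z * q.z);
    return Quat{q.w / n, q.x / n, q.y / n, q.z / n};
}
```

Multiplying the imaginary units in both orders gives ij = k but ji = −k, which shows the noncommutativity directly. The product uses 16 multiplications, compared to 27 for a general product of two 3 × 3 matrices.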

Quaternions and Interpolation One of the main reasons for the popularity
of quaternions is that they allow smooth and simple interpolation
between orientations. As the set of unit quaternions is identified with
S3 , the shortest path between two unit quaternions is a great-circle arc.
Shoemake [131] presented the following elegant formula for spherical lin-
ear interpolation (SLERP) between two quaternions:

$$ h(h_1, h_2, t) = \frac{\sin((1 - t)\theta)}{\sin(\theta)}\, h_1 + \frac{\sin(t\theta)}{\sin(\theta)}\, h_2, \tag{B.10} $$

where t ∈ [0, 1], and θ = arccos(h1 · h2 ). As antipodal points on S3 are
equivalent, an implementation must make sure to negate, e.g., h1 in case
the inner product between them is negative. Otherwise the interpolated
motion will not take the shortest path. Furthermore, if the inner product
between the two quaternions is close to one, then the quaternions are
almost identical. In such a case, an implementation should use linear
interpolation to avoid division by a number close to zero, i.e., sin(θ), see
Kuffner [70].
As reported by Kuffner [70], the smooth interpolation provided by
Equation (B.10) was important to solve difficult rigid body problems
efficiently.
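
A possible implementation of spherical linear interpolation, including the two precautions mentioned above, is sketched below. The names Quat, Dot, and Slerp are hypothetical, and the threshold 1e-6 for switching to linear interpolation is an arbitrary assumption, not a value taken from [70] or [131].

```cpp
#include <cmath>

struct Quat { double w, x, y, z; };

double Dot(const Quat& a, const Quat& b) {
    return a.w * b.w + a.x * b.x + a.y * b.y + a.z * b.z;
}

// Interpolates between the unit quaternions h1 and h2 for t in [0, 1].
Quat Slerp(Quat h1, const Quat& h2, double t) {
    double d = Dot(h1, h2);
    if (d < 0.0) {  // antipodal representation: negate h1 for the shortest path
        h1 = Quat{-h1.w, -h1.x, -h1.y, -h1.z};
        d = -d;
    }
    double a, b;
    if (d > 1.0 - 1e-6) {
        // Nearly identical quaternions: linear weights avoid dividing by
        // sin(theta), which is close to zero here.
        a = 1.0 - t;
        b = t;
    } else {
        const double theta = std::acos(d);
        const double s = std::sin(theta);
        a = std::sin((1.0 - t) * theta) / s;
        b = std::sin(t * theta) / s;
    }
    const Quat r{a * h1.w + b * h2.w, a * h1.x + b * h2.x,
                 a * h1.y + b * h2.y, a * h1.z + b * h2.z};
    // Renormalize to guard against round-off and the linear fallback.
    const double n = std::sqrt(Dot(r, r));
    return Quat{r.w / n, r.x / n, r.y / n, r.z / n};
}
```

Interpolating halfway between the identity and a rotation of 90 degrees about the z-axis gives a rotation of 45 degrees about the same axis, regardless of which of the two antipodal quaternions is used for the end point.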

Quaternions and Metrics As the shortest path between two quater-

nions is given by the great-circle arc on S3 , it seems natural to define the
distance between them to be proportional to the length of the arc. The
arc length is proportional to the angle between the quaternions, and a
possible metric is therefore given by

ρ(h1 , h2 ) = arccos(h1 · h2 ). (B.11)

Note again that it is important to check whether the inner product is
negative. If so, one of the quaternions should be negated.
As a path planning algorithm may compute the distance between pairs
of configurations many thousands of times, we are sometimes willing to
use a faster, but less accurate metric. In [70] it was suggested to use the
following metric instead of that in Equation (B.11):

ρ(h1 , h2 ) = wr (1 − |h1 · h2 |), (B.12)

where wr is a weight, relating the rotational distance for a rigid body to
the translational distance.
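
The two metrics can be compared in a short sketch. The function names ArcMetric and FastMetric are hypothetical. Taking the absolute value of the inner product in ArcMetric has the same effect as negating one of the quaternions when the inner product is negative, and the clamp to 1 is an added safeguard against round-off pushing the argument of arccos out of range.

```cpp
#include <algorithm>
#include <cmath>

struct Quat { double w, x, y, z; };

double Dot(const Quat& a, const Quat& b) {
    return a.w * b.w + a.x * b.x + a.y * b.y + a.z * b.z;
}

// Equation (B.11), with the antipodal check folded into the absolute value.
double ArcMetric(const Quat& h1, const Quat& h2) {
    const double d = std::min(1.0, std::fabs(Dot(h1, h2)));
    return std::acos(d);
}

// Equation (B.12): cheaper, since no inverse trigonometric function is needed.
double FastMetric(const Quat& h1, const Quat& h2, double w_r) {
    return w_r * (1.0 - std::fabs(Dot(h1, h2)));
}
```

Both functions return zero for h and −h, as they should, since the two quaternions represent the same rotation.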

B.3 Summary
We have introduced a class Transform that follows the syntax of the
homogeneous transformation matrix. It is used to specify the pose of
geometric objects, to represent moving coordinate frames, and in the
communication with collision detection algorithms. The class can also
be used for transformation of vectors, and here position vectors and free
vectors are transformed differently.
For problems involving kinematic chains, the class Transform is also
used. For rigid body problems, however, we recommend using quater-
nions to specify the rotation. We summarize the following advantages of
using quaternions to represent rotations:

• When using quaternions, it is much easier to compensate for numerical
errors, compared to when using rotation matrices.

• Quaternions are a more compact representation than rotation matrices.

• It is easy to interpolate between orientations using quaternions.

• The quaternion representation does not suffer from any singularities.

It should be mentioned though that quaternion-based interpolation takes
slightly longer to compute than interpolation of Euler angles. But,
as pointed out in [70], the total computational time for path planning
problems is likely to be reduced due to the better motions produced by
the quaternion interpolation.
Thus, for rigid body problems, the CoPP framework uses quaternions
as the default representation. To make it easy to do comparisons be-
tween quaternion representations and, e.g., Euler angle representations,
the framework does support other representations as well.
Appendix C

Geometric
Representations

Many path planners represent objects either as sets of triangles or as

convex polyhedra. Representing objects as sets of triangles has several
advantages:

• Almost any object can be modeled, up to a certain accuracy, using
enough triangles.

• Many graphical modelers and CAD tools can give output in this
format.

• Triangle sets can directly be used for visualization.

• There exist efficient collision detection algorithms that work on
triangle soups.

For convex polyhedra, there exist numerous efficient collision detection
algorithms and the geometric representation is often more compact
compared to triangle sets. However, with convex polyhedra, the family
of objects that we can model is much smaller. This drawback can be
alleviated somewhat if we allow objects that are unions of convex polyhedra.
Still there are objects that are very awkward to decompose into
convex polyhedra, such as tori and other curved objects.
Most available collision detection algorithms either work on triangle
sets or on convex polyhedra. To support both types of algorithms, we
decided that CoPP should provide both triangle sets and convex polyhedra
as geometric representations. We also want to provide a geometric
type to represent unions of convex polyhedra.

[Figure C.1: a set of eight vertices, numbered 0 to 7, and three faces,
numbered [0] to [2], with
face arr = [0, 1, 2, 3, −1, 1, 4, 2, −1, 7, 5, 6, −1].]

Figure C.1: Example of an indexed face set with three faces. Note
that the faces need not be joined together.

C.1 Indexed Face Sets

Before implementing the concrete geometric types right away we must
ask ourselves what they have in common; doing so can help us define a
base class for them. It can be seen that all three types are actually sets
of polygons, albeit with different constraints. In the case of a triangle
set, all polygons are constrained to be triangles, whereas in the case of
a convex polyhedron, the polygons must join together to form, well, a
convex polyhedron. Based on these observations, we should have some
class to represent a set of polygons. A data structure for representing
a set of polygons already exists, namely the indexed face set. An indexed
face set consists of a contiguous array of vertices that can be accessed
through an index starting at zero. Each face is specified with the indices
of the vertices that form the face. The end of a face is marked by a
-1. One of the main reasons for using this data structure is its efficiency
regarding memory consumption: If each face stored its own vertex
data, there would in general be a lot of duplicate vertices. An example of
an indexed face set is shown in Figure C.1. Notice that the faces need
not be joined together. Without any constraints though, all we have is a
polygon soup.
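
To make the layout of the face array concrete, the following sketch splits a −1 terminated index array into one index list per face. The function name SplitFaces is hypothetical and not part of CoPP.

```cpp
#include <vector>

// Splits a face array, in which every face is terminated by -1, into one
// list of vertex indices per face. Indices after the last -1 are ignored.
std::vector<std::vector<int>> SplitFaces(const std::vector<int>& face_arr) {
    std::vector<std::vector<int>> faces;
    std::vector<int> current;
    for (int index : face_arr) {
        if (index == -1) {
            faces.push_back(current);
            current.clear();
        } else {
            current.push_back(index);
        }
    }
    return faces;
}
```

Applied to the face array of Figure C.1, the function yields three faces: one quadrilateral and two triangles. Vertices 1 and 2 are referenced by two faces each but are stored only once in the vertex array.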

The class IndexedFaceSet is an abstract class that inherits from

Geom. From the class diagram in Figure C.2 it is seen that this class has
methods for reading the vertices and faces of an indexed face set. An in-
dexed face set is constructed using the methods AddVertex and AddFace.
These functions are protected, meaning that only derived classes (and the
class itself) can call them. As a result, it is up to the derived classes to
implement any required constraints on the polygon sets.
By extracting the common properties of triangle sets and convex poly-
hedra into the abstract class IndexedFaceSet we have gained several
advantages:

• The book-keeping methods for vertices and faces are used by all
derived classes, thereby promoting code reuse.

• Functions that operate on indexed face sets will also work with
triangle sets and convex polyhedra. This is the case, for example,
for the visualization functions; the same function can be used for
drawing both triangle sets and convex polyhedra.

• A cleaner design in that it better expresses separation of concerns.

• The class hierarchy is easy to extend.

C.2 Triangle Sets

Triangle sets are a very common way to represent geometric objects,
and in computer graphics they are almost ubiquitous. Thanks to the
class IndexedFaceSet defined in the previous section, the concrete class
TriangleSet can be implemented with little effort. The only way to
build a TriangleSet is through the two methods AddTriangle and
AddTriangles, which make sure that the appropriate constraint is en-
forced, see Figure C.2. The second method is a template member function
that takes two iterators as arguments, specifying a sequence of triangles.
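
The division of responsibilities between the two classes can be sketched as follows. This is a simplified illustration, not the CoPP source: the base class Geom is left out, the types Vertex and Triangle are assumptions, instantiation of the base class is prevented here with a protected constructor rather than with pure virtual functions, and no attempt is made to merge duplicate vertices.

```cpp
#include <array>
#include <vector>

struct Vertex { double x, y, z; };
using Triangle = std::array<Vertex, 3>;

class IndexedFaceSet {
public:
    virtual ~IndexedFaceSet() = default;
    const std::vector<Vertex>& GetVertices() const { return vertices_; }
    const std::vector<int>& GetFaces() const { return faces_; }

protected:
    IndexedFaceSet() = default;  // only derived classes can be created

    // Stores a vertex and returns its index.
    int AddVertex(const Vertex& v) {
        vertices_.push_back(v);
        return static_cast<int>(vertices_.size()) - 1;
    }

    // Appends the vertex indices of a face, followed by the -1 terminator.
    void AddFace(const std::vector<int>& indices) {
        faces_.insert(faces_.end(), indices.begin(), indices.end());
        faces_.push_back(-1);
    }

private:
    std::vector<Vertex> vertices_;
    std::vector<int> faces_;
};

class TriangleSet : public IndexedFaceSet {
public:
    // The only way to add faces, so every face has exactly three vertices.
    void AddTriangle(const Triangle& t) {
        AddFace({AddVertex(t[0]), AddVertex(t[1]), AddVertex(t[2])});
    }

    // Template member function that accepts any sequence of triangles.
    template <typename Iter>
    void AddTriangles(Iter first, Iter last) {
        for (; first != last; ++first) AddTriangle(*first);
    }
};
```

Since AddVertex and AddFace are protected, client code cannot bypass AddTriangle and insert, e.g., a quadrilateral into a TriangleSet.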

C.3 Convex Polyhedra

The numerous efficient techniques for computing distances between convex
polyhedra have contributed to making this geometry representation
a popular choice in many path planners and other virtual-environment
applications. A natural way to construct such an object would be to compute
the convex hull of a set of points. Several algorithms for computing
convex hulls efficiently have been proposed. One of the more popular is
the quickhull algorithm, Barber et al. [12], which is publicly available in
the software package Qhull. CoPP uses Qhull, but to make it easier to
use, CoPP provides a wrapper class called Hull3D.

[Figure C.2: class diagram with the following classes.
Geom: GetPose( ), Move(Transform t), Move(Vector args),
SetMoveFormat(format), GetBoundingBox(…), AttachToFrame(…),
Accept(GeomVisitor v); relationships labeled "material" (to Material) and
"pose" (to Transform).
IndexedFaceSet: + GetVertices( ), + GetFaces( ), # AddVertex(…),
# AddFace(…); member variables vertices and faces.
ConvexABC: ConvexABC(points), GetHalfEdges( ); member variable half_edges.
TriangleSet: AddTriangle(…), GetTriangles( ).
GeomConvex: GeomConvex(points).
GeomConvexGrp: GetParts( ); label "parts".]

Figure C.2: Class diagram for the geometry classes currently implemented
in CoPP.

[Figure C.3: a tetrahedron with vertices numbered 0 to 3 and
he arr = [4, 8, 12, 16, 1, 2, 3, −1, 0, 2, 3, −1, 0, 1, 3, −1, 0, 1, 2, −1].]

Figure C.3: An example showing the half edge structure for a tetrahedron.
The first four entries in the array are offsets for each vertex
entry.

Without any additional topological information, the class for convex
polyhedra is nothing but an indexed face set whose faces happen to form a
convex polyhedron. Clearly there should be more topological information
we can utilize for such a special instance of an indexed face set. One of
the most popular collision detection algorithms for convex polyhedra is
GJK, originally proposed by Gilbert et al. [48]. The original algorithm
used as input only two sets of vertices, one for each convex hull. Later
Cameron [27] showed that the algorithm could be more efficient if it is
provided additional information in the form of an adjacency structure
called half edges. The half edge structure tells us which vertices we
can reach in one step by following the edges leading from the vertex
we are currently standing at. This type of adjacency information has
also been used by Sato et al. [129], Ong and Gilbert [111, 112], and
several others. Therefore it was decided that this information is part
of the representation of convex polyhedra. The half edge structure can
be implemented in many ways, for example as a linked list as in [129].
Here we choose to use an array of indices into the vertex array. An
example of what the half edge structure looks like for a tetrahedron is
shown in Figure C.3. The concrete class GeomConvex in Figure C.2 is
used to represent convex polyhedra. Its constructor accepts a set of
points and invokes Hull3D to compute the corresponding convex hull.
It is also seen in Figure C.2 that GeomConvex inherits from an abstract
class ConvexABC. The reason for this intermediate class will be explained
in the next section.
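
As an illustration of the array layout in Figure C.3, the following sketch reads the neighbours of a vertex from the flat array. It assumes, as the figure suggests, that each neighbour list is terminated by -1; the function names Neighbours and TetrahedronHalfEdges are hypothetical and not part of CoPP.

```cpp
#include <cstddef>
#include <vector>

// Flat half edge array as in Figure C.3: the first n entries hold, for each
// of the n vertices, the offset at which its neighbour list starts. Every
// neighbour list is assumed to be terminated by -1.
std::vector<int> Neighbours(const std::vector<int>& he_arr, int vertex)
{
    std::vector<int> result;
    // Start at the offset stored for this vertex and read until the -1.
    for (std::size_t i = static_cast<std::size_t>(he_arr[vertex]);
         he_arr[i] != -1; ++i) {
        result.push_back(he_arr[i]);
    }
    return result;
}

// The tetrahedron from Figure C.3.
std::vector<int> TetrahedronHalfEdges()
{
    return {4, 8, 12, 16,      // offsets for vertices 0..3
            1, 2, 3, -1,       // neighbours of vertex 0
            0, 2, 3, -1,       // neighbours of vertex 1
            0, 1, 3, -1,       // neighbours of vertex 2
            0, 1, 2, -1};      // neighbours of vertex 3
}
```

With this layout, moving from one vertex to an adjacent one costs a single array lookup, which is what a hill-climbing search over the vertices needs.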

C.4 Non-Convex Objects and Groups of Convex Objects
The family of convex polyhedra is rather restricted. We can broaden
it significantly by also considering shapes that are formed by the union of
several convex polyhedra; then we can represent any shape that can be de-
composed into a finite set of convex objects. A naive attempt would be to
implement such shapes as an array of GeomConvex, where all the objects
move together because they are attached to the same frame. Perform-
ing collision detection between two such shapes would have O(M1 M2 )
complexity, where M1 and M2 are the number of convex parts in each
object. This is clearly not the kind of performance we are looking for, so
we need to organize the parts better. Bounding volumes such as spheres
and boxes are often used in the early steps of collision detection as an
attempt to quickly rule out pairs of objects that are well separated. Be-
cause we are already using convex polyhedra, why not use the convex hull
of the parts as a bounding volume? This will give us the tightest convex
bounding volume possible. The convex hull of all the parts will hereafter
be called the closure. If the closures remain separated most of the time,
those M1 M2 collision checks will seldom be required. We can take this
idea one step further by using a recursive definition. Our objects will
then be a hierarchy of convex polyhedra and closures.
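
The recursive idea can be sketched as follows. To keep the example short, axis-aligned boxes stand in for both the convex parts and their closures; this is an assumption of the sketch only, as CoPP uses general convex polyhedra. The names Box, Node, Overlap, and Collides are hypothetical.

```cpp
#include <vector>

// Stand-in for a convex shape: an axis-aligned box.
struct Box {
    double lo[3];
    double hi[3];
};

bool Overlap(const Box& a, const Box& b)
{
    for (int k = 0; k < 3; ++k) {
        if (a.hi[k] < b.lo[k] || b.hi[k] < a.lo[k]) return false;
    }
    return true;
}

// A node is either a leaf (no parts) or a group whose closure bounds all of
// its parts. For a leaf, the closure is the shape itself.
struct Node {
    Box closure;
    std::vector<Node> parts;
};

// Recursive test: descend only into groups whose closures overlap.
bool Collides(const Node& a, const Node& b)
{
    if (!Overlap(a.closure, b.closure)) return false;     // pruned early
    if (a.parts.empty() && b.parts.empty()) return true;  // two leaves overlap
    if (!a.parts.empty()) {
        for (const Node& part : a.parts) {
            if (Collides(part, b)) return true;
        }
        return false;
    }
    for (const Node& part : b.parts) {
        if (Collides(a, part)) return true;
    }
    return false;
}
```

An object that lies inside the closure of a group, but between its parts, is reported as collision free only after the parts have been examined; an object outside the closure is rejected after a single test.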

C.4.1 Three Possible Designs

How such a hierarchical model should be implemented depends of course
on our intended usage of it, but also on our notion of the grouped ob-
ject: We can choose to view a group of convex objects as just any non-
convex object, where the occurring closures are nothing but an imple-
mentation artifact. Alternatively, we can choose to view the closures
as objects in their own right. It turns out that the appropriate de-
sign will depend to a large extent on which view we take. Figure C.4
shows three possible designs that were considered. If we adopt the first
view, then the resulting class has no properties in common with the
class GeomConvex; it is just implemented in terms of it. The class would
have methods such as GetClosure and GetParts, which both return ob-
jects of type GeomConvex. An informative name for this class would be
GroupOfConvex.
If we instead adopt the second view, then a group of convex objects
actually is a convex object in itself. This view is correct if that convex
object is the closure of the group. The IS-A relationship suggests that
we should use inheritance. However, inheriting directly from a concrete
class such as GeomConvex will often lead to constrained class hierarchies
and subtle traps that are easy to fall into. See, e.g., the book by Meyers [98]
for why inheriting from concrete classes should be avoided in general.
Therefore a new abstract class should be introduced, ConvexABC, from
which all other convex objects inherit. With this step taken, we could
choose to implement hierarchies of convex objects using the Composite
pattern [45], see Figure C.4 (b), or we could use the approach shown
in Figure C.4 (c). The implementation using the Composite pattern is
appealing because of its clarity in showing the recursive definition of the
composite objects. However, as clients use GetParts, they will not know
whether these parts themselves are compositions. After all, the intent
of the Composite pattern is to hide from clients whether or not they are
dealing with composites. The Composite pattern would have been a good
solution if collision detection were part of ConvexABC itself; as each
object knows its own type, it can take appropriate actions. Because the
philosophy in CoPP is to enforce a good separation of concerns, collision
detection is not part of the geometry interface. Therefore the Composite
pattern is not appropriate in this situation. Note that adding
the method GetParts to the interface of ConvexABC would be
bad design in this situation: having parts is not an intrinsic property of
convex objects (unless those parts are vertices, faces, etc.).

In the third design, shown in Figure C.4 (c), the recursive definition
is carried out on the class itself. As in Figure C.4 (b), the data structures
needed to represent a single convex polyhedron are obtained through
the inheritance relationship. In the general case this derived part is the
closure of the group. A single convex polyhedron will be modeled as a
group with no parts. In this case, the closure and the actual polyhedron
are the same.

The designs in Figure C.4 (a) and in Figure C.4 (c) are both sound
and the main difference lies in how we view the closures. However, it was
found that the class diagram in Figure C.4 (c) leads to several implemen-
tation advantages such as more code reuse and simpler code. Therefore
the class GeomConvexGrp was chosen over GroupOfConvex.

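[Class diagrams; classes and members as they appear in the figure.]

(a) GroupOfConvex: GetClosure( ), GetParts( ); it holds a closure and a set
    of parts, all of type GeomConvex.

(b) ConvexABC, with derived classes GeomConvex and ConvexComposite.
    ConvexComposite: GetParts( ); attribute: parts

(c) ConvexABC, with derived classes GeomConvex and GeomConvexGrp.
    GeomConvexGrp: GetParts( ); attribute: parts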

Figure C.4: Three possible designs for modeling a union of convex
polyhedra.

C.4.2 An Example of a Hierarchical Geometric Object
With all the machinery in place, it is time to take a look at an example
geometry. Figure C.5 (a) shows a simple model of a book shelf.1 The
book shelf is clearly non-convex, but is easily decomposed into 20 convex
parts, all of them regular boxes. If all these parts are put together into a
single group of convex objects, the resulting GeomConvex object encloses
1 Frequent IKEA customers may recognize it to be the Ivar book shelf.
C.4 Non-Convex Objects and Groups of Convex Objects 205

the parts with their closure. In this particular case, the closure becomes
extremely simple; except for a few extra vertices due to diagonal struts,
the closure becomes a regular box. The shelf, together with a slightly
transparent closure, is shown in Figure C.5 (b).
For the purpose of efficient collision detection, however, this arrange-
ment of the parts is far from optimal; once an object penetrates the
closure of the shelf, there are again 20 more collision checks to perform.
Because the sides of the shelf contain six parts each, and they are com-
pact, i.e., they are rather box-like themselves, it is a good idea to let each
side be a group of its own. The resulting model is shown in Figure C.5 (c),
where the outermost closure has been made totally transparent to better
show the inner closures. The GeomConvexGrp object for the shelf will
now require storage for three more convex polyhedra in addition to the
original 20. This is a typical example of the tradeoff between memory
and speed. In general though, the memory overhead for this hierarchical
representation is modest, because the number of vertices in a closure is
usually much lower than the sum of the vertices of the parts that form
it. Admittedly, the book shelf example is a bit extreme in this respect:
Together, the parts have 160 vertices, whereas the closure has only 16.
In a typical pick-and-place task involving an object placed on one
of the shelves, there would be many configurations where the closure of
the shelf is penetrated by both the robot arm and the grasped object.
To confirm a collision-free configuration that places the robot between
two shelves, the arrangement in Figure C.5 (c) will require 11 collision
tests. Without the two inner closures, such configurations would require
21 collision checks. Thus, the added complexity in the arrangement of
the parts pays off in terms of fewer collision checks. Of course there exist
degenerate cases as well: Consider a cylindrical rod that is long enough
to collide with the closure of both sides at the same time. If the rod
is placed between two shelves such that it touches the closures of both
sides, then 23 collision checks will be required to confirm a collision-free
configuration.
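
The counts 11, 21, and 23 can be reproduced with a small model of the hierarchy. The sketch below does no geometry at all: whether a closure is penetrated is given as input, and every leaf is taken to be free, as when a collision-free configuration is confirmed. The split into two sides of six parts and eight remaining parts follows from the numbers above (20 - 2 * 6 = 8); the names HNode, CountChecks, and MakeShelf are hypothetical.

```cpp
#include <vector>

// Abstract model of a closure hierarchy. `penetrated` says whether the query
// object intersects this node's convex shape (a closure, or for a leaf the
// part itself). The flags are inputs here, not computed from geometry.
struct HNode {
    bool penetrated;
    std::vector<HNode> parts;
};

// Number of convex-convex tests needed to examine the hierarchy: one test
// for the node itself, plus the tests for its parts if it is penetrated.
int CountChecks(const HNode& node)
{
    int checks = 1;
    if (node.penetrated) {
        for (const HNode& part : node.parts) checks += CountChecks(part);
    }
    return checks;
}

// The book shelf: eight parts directly below the outer closure and two sides
// of six parts each. `grouped_sides` selects the arrangement in
// Figure C.5 (c), where each side has a closure of its own.
HNode MakeShelf(bool grouped_sides, bool sides_penetrated)
{
    const HNode free_leaf{false, {}};
    HNode shelf{true, {}};                 // the outer closure is penetrated
    shelf.parts.assign(8, free_leaf);
    for (int s = 0; s < 2; ++s) {
        if (grouped_sides) {
            shelf.parts.push_back(
                HNode{sides_penetrated, std::vector<HNode>(6, free_leaf)});
        } else {
            shelf.parts.insert(shelf.parts.end(), 6, free_leaf);
        }
    }
    return shelf;
}
```

Under these assumptions, CountChecks gives 1 + 8 + 2 = 11 for the grouped arrangement, 1 + 20 = 21 for the flat one, and 1 + 8 + 2 + 12 = 23 for the degenerate case where both side closures are penetrated.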
The two steps required to make a hierarchical model such as the one just
shown, decomposition into convex parts and grouping, are done manually
in the description file. It would of course be nice if at least one of the
steps could be automated, but the study of such algorithms is beyond
the scope of this thesis.
[Panels (a), (b), and (c).]

Figure C.5: Illustration of how a non-convex object is represented as
a hierarchy of convex polyhedra. The hierarchical representation is
visualized using three different views of the object.
Appendix D

Pairwise Collision Detection with PQP and GJK

The CoPP framework currently supports two different collision detection
algorithms, PQP [76] and Enhanced GJK [27]. In this appendix we look
at how these algorithms are encapsulated behind the uniform interface
defined in CoPP.
In Section D.1 we look at C++ specific techniques that allow differ-
ent geometric types to be used with PQP. The only requirement on the
geometric type is that it must support triangulation. Section D.2 deals
with Enhanced GJK and distance computations for convex polyhedra. In
Section D.3, techniques for mixing different collision detection algorithms
are briefly discussed.

D.1 Encapsulating PQP

The Proximity Query Package (PQP), developed by Larsen et al. [76], is a
package that provides many different types of proximity queries on pairs
of geometric objects. It does not assume any specific topology on these
objects; an object is just seen as a set of triangles, from which PQP builds
its own internal representation, which has the type PQP_Model. Users of
PQP can easily create objects of type PQP_Model by adding the triangles
one at a time and then tell PQP to build the internal representation.
Thus, any object that we can triangulate can be used with PQP.
The class that encapsulates PQP for a pair of geometric objects is called
GeomPairPQP, and it inherits from WitnessPair, see Figure D.1.
Objects of type GeomPairPQP always reference a pair of geometric objects.
The best way to enforce this constraint is to only define constructors
that require pairs of geometric objects. The question is, what type
should these geometric objects have? A straightforward solution would be
to introduce a new geometric class (derived from Geom) that internally
stores a PQP_Model. This geometric class would then be compatible with
PQP. However, this approach would go against one of the main goals of
the CoPP framework, namely that changing an algorithm should require
as little work as possible. If, for one reason or another, a user would
like to change to another collision detection algorithm, then she has to
change the geometric type as well. If the geometric type is changed,
chances are that other things will change too. This chain reaction is
caused by the, rather unnecessary, coupling between the geometric type
and the collision detection algorithm via the type PQP_Model. In principle,
we should be able to use GeomPairPQP with any geometric type that
can be triangulated. If we know how to triangulate a certain geometric
type, then it is easy to provide a conversion from that type to PQP_Model.
Thus, the constructor of GeomPairPQP should accept any combination of
geometric types for which both types support this conversion. We could
achieve such a generous constructor if we overload it on every combina-
tion we are interested in. However, this solution is not very scalable as
the interface of GeomPairPQP would become very cluttered by all the dif-
ferent constructors. Furthermore, as users add new geometric types that
they want to use with this class, they will have to add the appropriate
constructor too, which is definitely not what most users would expect.
There is a solution to the problem with the cluttered interface of
GeomPairPQP, which also minimizes the coupling between this class and
classes for geometric types. This solution builds on the template facility
available in the C++ language. For every geometric type that we want
to use with PQP, there has to exist an overloaded version of the function
CreatePQP:

PQP_Model* CreatePQP(const TriangleSet& obj);
PQP_Model* CreatePQP(const GeomConvex& obj);
PQP_Model* CreatePQP(const GeomConvexGrp& obj);
...

[Class diagram; classes and members as they appear in the figure.]

CollisionPair: + Collides( ), + GetNumCollisions( ), - DoCollides( );
    attribute: num_collisions
    Note attached to Collides( ):
        if (DoCollides()) {
            ++num_collisions;
            return true;
        }
        return false;
DistancePair: + Collides(tolerance), + Distance( ), + DistanceSqrd( ),
    - DoCollides(tolerance)
WitnessPair: + GetClosestPoints(p1, p2)
PairGJK_Base: PairGJK_Base(ConvexABC a, ConvexABC b), SetToTracking(…),
    bool UseSimplex( ), GetSimplex( );
    attributes: SimplexGJK last_result, bool is_tracking
GeomPairPQP: GeomPairPQP(GeomT1 a, GeomT2 b)
LeafPairGJK: LeafPairGJK(ConvexABC a, ConvexABC b)
MixedPairGJK: MixedPairGJK(GeomConvex a, GeomConvexGrp b)
ConvexGrpPairGJK
GeomPairGJK: attribute: impl (PairGJK_Base)

Figure D.1: Classes that encapsulate proximity queries for pairs of
geometric objects. The constructor of GeomPairPQP is a template that
accepts any geometric types, as long as there exist matching versions of
the function CreatePQP.

The ellipses indicate that we would add an overloaded version for ev-
ery new geometric type. Note that these functions are not part of the
interface for any geometric type, hence no unnecessary couplings are introduced.
Now we can make the constructor of GeomPairPQP a C++
template, with the geometric types as template arguments. At compile
time, the compiler will look for the appropriate versions of CreatePQP
and generate the needed constructors for us. In effect, we have got n²
constructors for the price of one, where n is the number of overloaded
versions of CreatePQP.
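
A minimal sketch of this technique is given below. To make it compile on its own, PQP_Model and the geometry classes are replaced by trivial stand-ins, and the member TotalTriangles is a hypothetical addition used only to show that the right overloads were selected. The sketch also lets each pair own its two models, which leaves out the sharing of models between pairs.

```cpp
#include <cstddef>
#include <memory>

// Stand-ins so the sketch compiles without PQP or CoPP; the real PQP_Model
// and geometry classes are much richer.
struct PQP_Model { std::size_t num_tris; };
struct TriangleSet { std::size_t num_tris; };
struct GeomConvex { std::size_t num_faces; };

// One free-function overload per geometric type, kept outside the geometry
// classes so that they do not depend on PQP.
PQP_Model* CreatePQP(const TriangleSet& obj)
{
    return new PQP_Model{obj.num_tris};
}
// Assumption for the sketch: every face of the polyhedron is a triangle.
PQP_Model* CreatePQP(const GeomConvex& obj)
{
    return new PQP_Model{obj.num_faces};
}

class GeomPairPQP {
public:
    // Constructor template: overload resolution picks CreatePQP separately
    // for each argument, so any pair of supported types is accepted.
    template<class GeomT1, class GeomT2>
    GeomPairPQP(const GeomT1& a, const GeomT2& b)
        : model_a_(CreatePQP(a)), model_b_(CreatePQP(b)) {}

    std::size_t TotalTriangles() const
    {
        return model_a_->num_tris + model_b_->num_tris;
    }

private:
    std::unique_ptr<PQP_Model> model_a_;
    std::unique_ptr<PQP_Model> model_b_;
};
```

Passing a type without a matching CreatePQP overload fails at compile time, which is the behaviour one would want from such a constructor.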
A final note on this class: As each PQP_Model can consume large
amounts of memory, creating two such objects for every GeomPairPQP
would be very wasteful. Therefore, the class uses reference counting
techniques to ensure that only one PQP_Model per geometric object is
created.
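
How the reference counting is implemented is not detailed here. One possible way to obtain the same effect with the standard library is sketched below, using std::shared_ptr and std::weak_ptr as a modern stand-in; the class ModelCache and the types Model and Geometry are hypothetical.

```cpp
#include <map>
#include <memory>

struct Model { int id; };      // stand-in for an expensive PQP_Model
struct Geometry { int id; };   // stand-in for any geometric object

// Hands out one shared model per geometric object. Weak pointers let a model
// be destroyed when the last pair that uses it goes away.
class ModelCache {
public:
    std::shared_ptr<Model> Get(const Geometry& g)
    {
        std::weak_ptr<Model>& slot = cache_[&g];
        std::shared_ptr<Model> model = slot.lock();
        if (!model) {              // not built yet, or already released
            model = std::make_shared<Model>(Model{g.id});
            slot = model;
            ++num_built_;
        }
        return model;
    }

    int NumBuilt() const { return num_built_; }

private:
    std::map<const Geometry*, std::weak_ptr<Model>> cache_;
    int num_built_ = 0;
};
```

If a robot link takes part in many pairs, every pair asks the cache for the model of that link, and only the first request builds it.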

D.2 Encapsulating Enhanced GJK

Enhanced GJK [27] is a fast algorithm for computing the minimum dis-
tance between two convex polyhedra. An advantage of this algorithm
is that it requires very little preprocessing of the geometric data. The
base class for convex polyhedra, ConvexABC, already contains all the data
needed by the algorithm, so an object for doing collision detection on a
pair only needs two references to such objects. To speed up queries in
situations with strong temporal coherence, each pair object will also con-
tain cached information about the latest query. This information is the
simplex (in the Minkowski difference of the two polyhedra) to which GJK
converged the last time.
The class GeomPairGJK, see Figure D.1, takes care of distance
queries on a pair of convex polyhedra. It inherits from the abstract
PairGJK_Base, which implements code for maintaining the cached information
from each query. The class GeomPairGJK is used as an extra level
of indirection; it only forwards queries to an object of type LeafPairGJK,
MixedPairGJK, or ConvexGrpPairGJK. The extra level of indirection makes
sure that clients only have to deal with GeomPairGJK, no matter which
types of convex objects are used. This is an example of the Handle/Body
idiom, see Coplien [33].
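
The forwarding can be sketched as follows. The sketch makes several assumptions: the body is selected by overloaded constructors, the handle does not itself inherit from PairGJK_Base, and the only forwarded operation is a hypothetical Kind function that reports which body was chosen. The geometry classes are empty stand-ins.

```cpp
#include <memory>
#include <string>

// Minimal stand-ins for the geometry classes.
struct ConvexABC { virtual ~ConvexABC() = default; };
struct GeomConvex : ConvexABC {};
struct GeomConvexGrp : ConvexABC {};

// Body interface. Here it only reports which body was chosen; the real
// classes would implement the distance queries.
class PairGJK_Base {
public:
    virtual ~PairGJK_Base() = default;
    virtual std::string Kind() const = 0;
};
class LeafPairGJK : public PairGJK_Base {
public:
    std::string Kind() const override { return "LeafPairGJK"; }
};
class MixedPairGJK : public PairGJK_Base {
public:
    std::string Kind() const override { return "MixedPairGJK"; }
};
class ConvexGrpPairGJK : public PairGJK_Base {
public:
    std::string Kind() const override { return "ConvexGrpPairGJK"; }
};

// Handle: clients only see this class; every query is forwarded to the body.
class GeomPairGJK {
public:
    GeomPairGJK(const GeomConvex&, const GeomConvex&)
        : impl_(new LeafPairGJK) {}
    GeomPairGJK(const GeomConvex&, const GeomConvexGrp&)
        : impl_(new MixedPairGJK) {}
    GeomPairGJK(const GeomConvexGrp&, const GeomConvex&)
        : impl_(new MixedPairGJK) {}
    GeomPairGJK(const GeomConvexGrp&, const GeomConvexGrp&)
        : impl_(new ConvexGrpPairGJK) {}

    std::string Kind() const { return impl_->Kind(); }

private:
    std::unique_ptr<PairGJK_Base> impl_;
};
```

Client code constructs a GeomPairGJK from whatever convex types it has and never names the three body classes.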
To model non-convex objects we introduced the class GeomConvexGrp,
which provides hierarchical compositions of convex objects. Distance
queries on a pair of such objects will traverse the two hierarchies and
find the pair of leaf objects that minimize the distance. Collision detec-
tion queries are made much faster, because parts of the hierarchies can
simply be skipped if their bounding spheres or closures do not intersect.
The class ConvexGrpPairGJK deals with pairs whose geometries are compositions
of convex polyhedra. It also inherits from PairGJK_Base. For
pairs of hierarchical objects, the cached information only applies to the root
object of each hierarchy.

D.3 Mixing Different Algorithms

A problem that is often discussed in association with pairwise collision de-
tection is that of double-dispatch; when calling a virtual function through
a pointer to an object, the function that is actually executed will de-
pend on the dynamic type of the object pointed to (i.e., single-dispatch).
However, sometimes we would like the choice of function to depend on the
dynamic type of two objects. As a motivating example, consider that we
want to take advantage of the many closed form solutions that exist for
computing the distance between pairs of simple objects. Closed-form ex-
pressions exist for the distance between, e.g., spheres, rectangular boxes
and line segments. If we want to take advantage of this we are faced with
the double-dispatch problem; the collision checking code that is executed
depends on whether we are checking, e.g., a sphere-sphere pair or a sphere-line
pair. Languages like CLOS, the Common Lisp Object System, have built-in
support for double-dispatching [98], whereas C++ does not. Therefore,
solutions to the double-dispatch problem in C++ depend on various tricks to
emulate it. Numerous authors have suggested how to solve this problem
in C++, see e.g., Meyers [98] and Alexandrescu [4]. Most of the
solutions make use of runtime type information or double virtual function
calls to look up the correct function. Doing this for every
collision check can result in a notable overhead, making the approach
with specialized collision detection algorithms unattractive. However, if
no new objects are likely to be created once the application is running
(as opposed to, e.g., video games), we can do all the expensive lookups
once and for all at initialization time. Once the correct DistancePair is
created, it also carries with it the correct distance algorithm. At a design
pattern level, this could be implemented using the Factory Method pat-
tern [45]; a class GeomPairFactory is responsible for associating a pair
of geometric objects with the correct distance algorithm via the function
CreateGeomPair. At the implementation level, we could, e.g., use the
generic Factory and BasicDispatcher class templates available in the
Loki library by Alexandrescu [4].
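
To illustrate the idea of a one-time lookup, the sketch below implements a small factory with standard library containers instead of Loki, so none of it reflects the Loki interface. The geometry classes, the pair classes, the Register function, and the algorithm names are all hypothetical; only the names GeomPairFactory and CreateGeomPair are taken from the text.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

struct Geom { virtual ~Geom() = default; };
struct Sphere : Geom {};
struct Box : Geom {};

struct DistancePair {
    virtual ~DistancePair() = default;
    virtual std::string Algorithm() const = 0;
};
struct SphereSpherePair : DistancePair {
    std::string Algorithm() const override { return "sphere-sphere"; }
};
struct SphereBoxPair : DistancePair {
    std::string Algorithm() const override { return "sphere-box"; }
};

class GeomPairFactory {
public:
    using Creator =
        std::function<std::unique_ptr<DistancePair>(const Geom&, const Geom&)>;

    // Associates an ordered pair of geometric types with a creator function.
    template<class A, class B>
    void Register(Creator creator)
    {
        creators_[Key(typeid(A), typeid(B))] = std::move(creator);
    }

    // The runtime type lookup happens here, once per pair, when the pair
    // object is created. Returns a null pointer if nothing is registered.
    std::unique_ptr<DistancePair> CreateGeomPair(const Geom& a,
                                                 const Geom& b) const
    {
        auto it = creators_.find(Key(typeid(a), typeid(b)));
        if (it == creators_.end()) return nullptr;
        return it->second(a, b);
    }

private:
    using KeyT = std::pair<std::type_index, std::type_index>;
    static KeyT Key(const std::type_info& a, const std::type_info& b)
    {
        return KeyT(std::type_index(a), std::type_index(b));
    }
    std::map<KeyT, Creator> creators_;
};
```

After creation, each query on the returned pair object is an ordinary virtual call; the map is not consulted again.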
Sometimes it is sufficient though to do the lookup only at compile
time. This is the case if the type of each geometric object is known be-
forehand, or if we want to be able to change the geometric types in our
program without affecting collision detection code. Compile-time lookup
is easily achieved using a factory method implemented as a function tem-
plate:

template<class GeomT1, class GeomT2>
DistancePair* CreateDistancePair(GeomT1& a, GeomT2& b);

This function would lack an implementation for the general case, but
for every combination of geometric types that we are interested in, there
would exist specialized versions of the function template. Suppose now
that we change the geometric type in our code from, say, TriangleSet to
GeomConvexGrp, then the compiler would automatically switch to another
specialization of CreateDistancePair. Thus, the change of geometric
type would require a minimum of changes in the source code. The use of
a factory method also opens the door for other possibilities: Suppose that
experiments show that Enhanced GJK is faster than PQP on queries in-
volving compositions of convex polyhedra that are not too complex. Then
we could implement the factory method to produce ConvexGrpPairGJK
in situations that favor Enhanced GJK, and GeomPairPQP otherwise.
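
The compile-time variant can be sketched as below. The geometry and pair classes are empty stand-ins, the Algorithm function is a hypothetical addition for checking which specialization was used, and the mapping from types to algorithms is only an example.

```cpp
#include <memory>
#include <string>

struct TriangleSet {};
struct GeomConvexGrp {};

struct DistancePair {
    virtual ~DistancePair() = default;
    virtual std::string Algorithm() const = 0;
};
struct GeomPairPQP : DistancePair {
    std::string Algorithm() const override { return "PQP"; }
};
struct ConvexGrpPairGJK : DistancePair {
    std::string Algorithm() const override { return "Enhanced GJK"; }
};

// Primary template: declared, but intentionally never defined. Using an
// unsupported combination of types then fails when the program is linked.
template<class GeomT1, class GeomT2>
DistancePair* CreateDistancePair(GeomT1& a, GeomT2& b);

// One explicit specialization per supported combination of types.
template<>
DistancePair* CreateDistancePair<TriangleSet, TriangleSet>(TriangleSet&,
                                                           TriangleSet&)
{
    return new GeomPairPQP;
}

template<>
DistancePair* CreateDistancePair<GeomConvexGrp, GeomConvexGrp>(GeomConvexGrp&,
                                                               GeomConvexGrp&)
{
    return new ConvexGrpPairGJK;
}
```

If client code names its geometric type through a single type alias, changing that alias from TriangleSet to GeomConvexGrp is enough to switch the algorithm; the calls to CreateDistancePair stay as they are.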
Appendix E

Framework Details

In this appendix we cover some details of the CoPP framework that are
not covered elsewhere. Section E.1 presents a class for representing paths
in the configuration space. By delegating some of its behavior to metric
objects and interpolation objects, this class can model many different
types of paths. Section E.2 deals with binary constraints, which are
useful abstractions, allowing a planner to transparently handle different
types of constraints.
Section E.3 introduces a problem class, which is used to hold the
definition of a path planning problem. In addition to robots and obstacles,
objects of this class can also contain information about which metric,
interpolation method, or sampling method to use.
To avoid introducing platform dependencies, CoPP uses the platform
independent VRML-file format to store animations. Section E.4 gives an
example of how the Visitor pattern [45] can be used to make a function
behave as if it were virtual with respect to a class hierarchy, without
being part of it. Here the pattern is used to allow a uniform interface for
drawing objects of different geometric types.
The last section gives an example of a robot description file for a
simple SCARA robot.

E.1 Configuration Space Path

The result from a path planner is often a finite series of via-points in
the configuration space. The path to take in-between the via-points is

Path
AddViaPoint(time, config)
Length(metric)
Sample(time, config, interpolator)

Figure E.1: A class for representing paths in the configuration space
as a sequence of via-points. The path to take between the via-points is
determined by the interpolator argument.

determined by a chosen interpolation scheme. The simplest is of
course linear interpolation, but in general the appropriate interpolation
method is problem dependent.
The class Path, see Figure E.1, represents sequences of via-points
in an n-dimensional configuration space. Each via-point consists of a
configuration (or input signal in case of a dynamic system) and a time
value. In addition to being the basic return type of many planners, it
is also used to produce off-line animations in VRML format. The class
supports useful operations, such as concatenating two paths. Often it
is useful to compute the length of a path, e.g., to choose the shortest
path among several alternatives. Therefore the class also has a Length
method. The immediate question is of course with which metric this
length is computed. The choice of metric is left to the user, who can
provide an object of type Metric as argument to Length. If no argument
is provided, the Manhattan metric is used as the default choice.
Smoothing algorithms like Adaptive Shortcut, see Hsu et al. [58], need
to sample the path between the via-points. To produce smooth anima-
tions, there is also a need for dense sampling between the via-points.
How these intermediate via-points are computed depends on the chosen
interpolation scheme. Thus, the Sample method requires an interpola-
tion object, see Figure E.1. If no interpolation object is provided, linear
interpolation is used.
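
A simplified sketch of such a path class is shown below. It is not the CoPP implementation: metrics and interpolators are modelled as std::function objects rather than classes, Sample returns the configuration instead of filling in an argument, and via-points are assumed to be added with strictly increasing time values. The defaults follow the text, that is, the Manhattan metric and linear interpolation.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Config = std::vector<double>;
using Metric = std::function<double(const Config&, const Config&)>;
// Interpolator: the configuration at fraction s in [0, 1] between a and b.
using Interpolator =
    std::function<Config(const Config&, const Config&, double)>;

double ManhattanMetric(const Config& a, const Config& b)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += std::fabs(a[i] - b[i]);
    return sum;
}

Config LinearInterpolation(const Config& a, const Config& b, double s)
{
    Config c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) c[i] = a[i] + s * (b[i] - a[i]);
    return c;
}

class Path {
public:
    // Via-points are assumed to be added with strictly increasing time.
    void AddViaPoint(double time, const Config& config)
    {
        times_.push_back(time);
        configs_.push_back(config);
    }

    // Sum of the distances between consecutive via-points.
    double Length(const Metric& metric = ManhattanMetric) const
    {
        double length = 0.0;
        for (std::size_t i = 1; i < configs_.size(); ++i) {
            length += metric(configs_[i - 1], configs_[i]);
        }
        return length;
    }

    // Assumes at least one via-point; times outside the range are clamped.
    Config Sample(double time,
                  const Interpolator& interp = LinearInterpolation) const
    {
        if (time <= times_.front()) return configs_.front();
        if (time >= times_.back()) return configs_.back();
        std::size_t i = 1;
        while (times_[i] < time) ++i;   // first via-point at or after `time`
        const double s = (time - times_[i - 1]) / (times_[i] - times_[i - 1]);
        return interp(configs_[i - 1], configs_[i], s);
    }

private:
    std::vector<double> times_;
    std::vector<Config> configs_;
};
```

Because the metric and the interpolator are arguments, the same stored via-points can be measured or sampled in different ways without changing the class.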

E.2 Binary Constraints

The most basic requirement of a planned motion is that it must be col-
lision free. For many applications there are also additional requirements
that have to be satisfied. For example, to provide safety margins, a trajec-
tory could be required to keep a minimum clearance to the obstacles. For
welding tasks and pick-and-place tasks, there are orientation constraints
that the robot’s end-effector must satisfy. A flexible path planning system
should thus be able to handle a wide variety of constraints.
Many path planning algorithms work by probing the C-space to test
whether a configuration is collision-free or not. A more general question to
ask is whether the configuration satisfies a given constraint, leading to the concept of binary
constraints. A planner that is implemented in terms of this abstract
concept is more flexible in that it can handle different types of constraints.
Binary constraints are represented by objects of type BinaryConstraint,
see Figure E.2. In addition to specifying an interface, this class uses the
Template Method pattern [45] to automate the gathering of statistics.
These statistics could be used to compare the number of collision checks
needed by two different planning algorithms to solve the same problem.
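
The use of the Template Method pattern for gathering statistics can be sketched as follows. The base class follows the interface in Figure E.2, except that Clone is left out; the derived class JointRangeConstr is a hypothetical constraint introduced only for illustration.

```cpp
#include <vector>

using Config = std::vector<double>;

class BinaryConstraint {
public:
    virtual ~BinaryConstraint() = default;

    // Template Method: the non-virtual public function does the bookkeeping
    // and delegates the actual test to the private virtual hook.
    bool IsSatisfied(const Config& q)
    {
        ++num_queries_;
        const bool ok = DoIsSatisfied(q);
        if (ok) ++num_satisfied_;
        return ok;
    }

    int GetNumQueries() const { return num_queries_; }
    int GetNumSatisfied() const { return num_satisfied_; }

private:
    virtual bool DoIsSatisfied(const Config& q) = 0;
    int num_queries_ = 0;
    int num_satisfied_ = 0;
};

// Hypothetical constraint: every joint value must lie in [-limit, limit].
class JointRangeConstr : public BinaryConstraint {
public:
    explicit JointRangeConstr(double limit) : limit_(limit) {}

private:
    bool DoIsSatisfied(const Config& q) override
    {
        for (double v : q) {
            if (v < -limit_ || v > limit_) return false;
        }
        return true;
    }
    double limit_;
};
```

Derived classes cannot forget to update the counters, because they only override the private hook and never the public function.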
The class CollFreeConstr in Figure E.2 is the most commonly used
constraint class, and it tests whether a configuration is collision-free or
not. To make this class more generic, it is a template class, with the
type of the moving system and the type of the collision detection object
as template arguments. Thus, CollFreeConstr can be used with rigid
bodies, robot objects, or any other moving system that has a function
Move with the correct signature.
The class OrientationConstr represents orientation constraints on
a robot’s end-effector. The constraint is specified by: a direction vector
attached to the end-effector frame, a direction vector attached to the
fixed world frame, and a maximum deviation angle. The constraint is
satisfied if the angle between the moving direction vector and the fixed
direction vector is less than the maximum deviation angle. To avoid
moving a robot twice for every configuration to be tested, this class also
checks whether a configuration is collision-free.
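
The angle test itself can be written in a few lines. The sketch below assumes that the direction vector attached to the end-effector has already been transformed to the world frame; how that vector is obtained from the robot, and the additional collision check, are outside the sketch. The function names and the minimal Vector3D struct are hypothetical.

```cpp
#include <cmath>

struct Vector3D { double x, y, z; };

double Dot(const Vector3D& a, const Vector3D& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

double Norm(const Vector3D& a) { return std::sqrt(Dot(a, a)); }

// Angle in radians between two non-zero vectors.
double AngleBetween(const Vector3D& a, const Vector3D& b)
{
    double c = Dot(a, b) / (Norm(a) * Norm(b));
    if (c > 1.0) c = 1.0;      // guard against rounding before acos
    if (c < -1.0) c = -1.0;
    return std::acos(c);
}

// `moving_dir` is the end-effector direction expressed in the world frame.
bool OrientationSatisfied(const Vector3D& moving_dir,
                          const Vector3D& global_dir,
                          double max_dev_angle)
{
    return AngleBetween(moving_dir, global_dir) < max_dev_angle;
}
```

For a task where a grasped object must be kept roughly upright, global_dir would be the vertical axis and max_dev_angle the allowed tilt in radians.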

E.3 Problem Class

There are some basic components that need to be defined for every basic
path planning problem. They are: a robot, a set of obstacles, a start-
ing point, and at least one goal point. There are of course variations,
like problems involving multiple robots and more elaborate task specifi-
cations, but the ideas presented here can be applied to them as well.
Writing a path planner will also require writing a lot of code that
has nothing to do with the path-planning algorithm that is to be tested.
Code to instantiate a path planning problem definitely falls into this

BinaryConstraint
+ bool IsSatisfied(config)
+ GetNumQueries( )
+ GetNumSatisfied( )
+ Clone( )
- bool DoIsSatisfied(config)
num_queries
num_satisfied

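OrientationConstr: Vector3D global_dir, Vector3D local_axis, max_dev_angle;
    references: robot (Robot), objSet (ObjectSet)
CollFreeConstr: template parameters AgentT and ObjSetT;
    references: agent (AgentT, which provides Move(config)),
    objSet (ObjSetT, which provides IsCollisionfree( ))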

Figure E.2: Class diagram for different types of constraints. Note that
CollFreeConstr is a class template.

category. To shorten the development time and let developers focus more
on their algorithms, the CoPP framework provides a class representing
a basic path planning problem. This class, named PathPlanProblem,
is completely decoupled from any path planner classes and should be
thought of more as a utility class. Given a problem definition file, the
constructor takes care of building the required objects. In the case of an
error the current parser will provide a, hopefully, helpful error message.
In addition to the robot and the geometric objects, the problem class
also contains formatted information about things like which metric to
use, which sampling strategy to use, which interpolation method to use,
and the maximum allowed step size. How this information should be
interpreted and used is up to the specific application; the problem class is
thus like a recipe, from which we build the desired path planner. With
this approach, every component of the path planner can be specified in
a configuration file.

The CoPP framework provides different geometric types. What type
do the geometric objects in the PathPlanProblem class have? The approach
of storing references to the base class Geom is ruled out because it would
hide all class specific interfaces. It then seems as if there would have to
be at least one problem class for every geometric class. However, as the
purpose of the problem class is to serve as a container for the components
of a path planning problem, it should be a perfect candidate for a C++
class template. Based on this idea, PathPlanProblem is implemented
as a class template with two template arguments; one for the geometric
type of the obstacles and one for the geometric type of the robot links.
Because the most common case is that both template parameters have
the same type, the second template parameter defaults to the first. Next
we show an example of how to instantiate two different path planning
problems:

PathPlanProblem<TriangleSet> problem1(file1);
PathPlanProblem<TriangleSet, GeomConvex> problem2(file2);

In the first case, TriangleSet is used for all geometric objects, whereas
the second case uses GeomConvex for the robot links. Note that changing
template parameters also changes the parsers used internally by the class.
This makes it easy to extend the class for use with other geometric types;
just provide a parser with a conforming interface for the new geometric
type.
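
The mechanism with a defaulted second template parameter can be sketched as follows. How the parsers are selected inside the class is not specified above; the sketch assumes a traits-like class template GeomParser with one specialization per geometric type, which is a hypothetical design. No file is actually parsed, and the file names used with it are placeholders.

```cpp
#include <string>
#include <vector>

struct TriangleSet {};
struct GeomConvex {};

// Hypothetical per-type parsers; a real parser would read geometry data.
template<class GeomT> struct GeomParser;
template<> struct GeomParser<TriangleSet> {
    static const char* Name() { return "TriangleSet parser"; }
};
template<> struct GeomParser<GeomConvex> {
    static const char* Name() { return "GeomConvex parser"; }
};

// The second template parameter defaults to the first, so the common case
// needs only one argument.
template<class ObstacleGeomT, class LinkGeomT = ObstacleGeomT>
class PathPlanProblem {
public:
    // The sketch only stores the file name; nothing is read from disk.
    explicit PathPlanProblem(const std::string& file) : file_(file) {}

    std::string ObstacleParser() const
    {
        return GeomParser<ObstacleGeomT>::Name();
    }
    std::string LinkParser() const { return GeomParser<LinkGeomT>::Name(); }

private:
    std::string file_;
    std::vector<ObstacleGeomT> obstacles_;
    std::vector<LinkGeomT> links_;
};
```

Under this assumption, supporting a new geometric type amounts to adding one more GeomParser specialization; the problem class itself is left untouched.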

E.4 Visualization
Visualization of the output from a path planner is an important part of a
path planning application. It is, however, most often the parts concern-
ing visualization that cause existing frameworks to be system dependent,
thereby reducing the number of potential users. To avoid that situa-
tion, we choose to output animations in the VRML file format. See the
textbook by Ames et al. [8] for an excellent introduction to VRML. As
there are free VRML viewers for the most common operating systems,
this solution avoids most of the portability issues. Furthermore, stor-
ing animations in VRML files means less work for the user; saving an
animation with traditional graphical user interfaces requires the user to
convert the ongoing animation on the screen to, e.g., the MPEG format.
With the approach taken here, the VRML file itself is a perfect format
for the animation; put on a web-page, it will provide more information

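[Class diagram; classes and members as they appear in the figure.]

GeomVisitor: + VisitTriangleSet(obj), + VisitGeomConvex(obj),
    + VisitGeomConvexGrp(obj)
VrmlGraphics: + Draw(Geom geom), + Draw(Robot robot), …,
    - VisitTriangleSet(obj), - VisitGeomConvex(obj), ...
    Note attached to Draw(Geom geom): geom.Accept(*this);
    Note attached to VisitTriangleSet(obj): draw TriangleSet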

Figure E.3: Class diagram for VrmlGraphics. Private inheritance is
used to restrict the access to the Visit methods.

to visitors because it is interactive. That is, visitors can zoom in and
out and change the view of the scene as they like. Unless the geometric
models in the animation are made up of several thousands of triangles, the
VRML file format is also a very compact method for saving an animation.

In CoPP, output to a VRML file is handled by objects of the class
VrmlGraphics, see Figure E.3. The class has a number of Draw methods
for drawing geometric objects, robots, coordinate frames and animations.
Note that the interface of VrmlGraphics never mentions any concrete
geometric types, only the base class Geom. An important design issue is
therefore how to determine the specific type of the geometric object, such
that we can know how to draw it. This design problem was solved using
the Visitor pattern from [45]; as seen from Figure E.3, VrmlGraphics
inherits from the abstract class GeomVisitor. Here, private inheritance
is used, so that the Visit-methods can only be called from within the
VrmlGraphics class itself. Each Visit... method will now take care
of drawing the corresponding geometric objects. An interaction diagram
that illustrates the Visitor pattern in this context is shown in Figure E.4.
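
The mechanism can be sketched in a few lines of C++. The following is a simplified illustration and not the CoPP source: the class names follow Figure E.3, but the method signatures, the output stream member, and the strings that are written are assumptions made for the example.

```cpp
#include <ostream>
#include <sstream>

class TriangleSet;
class GeomConvex;

// Abstract visitor: one Visit method per concrete geometric type.
class GeomVisitor {
public:
    virtual ~GeomVisitor() = default;
    virtual void VisitTriangleSet(const TriangleSet& obj) = 0;
    virtual void VisitGeomConvex(const GeomConvex& obj) = 0;
};

// Base class of all geometric objects; Accept performs the double dispatch.
class Geom {
public:
    virtual ~Geom() = default;
    virtual void Accept(GeomVisitor& visitor) const = 0;
};

class TriangleSet : public Geom {
public:
    void Accept(GeomVisitor& visitor) const override { visitor.VisitTriangleSet(*this); }
};

class GeomConvex : public Geom {
public:
    void Accept(GeomVisitor& visitor) const override { visitor.VisitGeomConvex(*this); }
};

// Private inheritance: only VrmlGraphics itself can hand out a GeomVisitor
// reference to itself, so clients cannot call the Visit methods directly.
class VrmlGraphics : private GeomVisitor {
public:
    explicit VrmlGraphics(std::ostream& out) : out_(out) {}

    // The interface mentions only the base class Geom.
    void Draw(const Geom& geom) { geom.Accept(*this); }

private:
    // Placeholder output; a real implementation would write VRML nodes here.
    void VisitTriangleSet(const TriangleSet&) override { out_ << "draw TriangleSet\n"; }
    void VisitGeomConvex(const GeomConvex&) override { out_ << "draw GeomConvex\n"; }

    std::ostream& out_;
};
```

With a VrmlGraphics object constructed on, for example, a std::ostringstream, calling Draw with a reference of type Geom reaches the matching Visit method, whereas client code that tries to call VisitTriangleSet on the graphics object is rejected by the compiler.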

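Participants: aClient, aVrmlGraphics, aTriangleSet, aGeomConvex

aClient       -> aVrmlGraphics : Draw(aTriangleSet)
aVrmlGraphics -> aTriangleSet  : Accept(*this)
aTriangleSet  -> aVrmlGraphics : VisitTriangleSet(*this)
aVrmlGraphics -> aVrmlGraphics : DrawTriangleSet(aTriangleSet)

aClient       -> aVrmlGraphics : Draw(aGeomConvex)
aVrmlGraphics -> aGeomConvex   : Accept(*this)
aGeomConvex   -> aVrmlGraphics : VisitGeomConvex(*this)
aVrmlGraphics -> aVrmlGraphics : DrawGeomConvex(aGeomConvex)

Figure E.4: An interaction diagram that illustrates how geometric
objects of different types are drawn. Note that neither the client nor the
graphics object needs to know the geometric type when calling Draw.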

E.5 Robot Description Files

Robot models can be loaded at runtime using robot description files. A
robot description file contains information about:

• The kinematic structure of the robot

• Joint types and joint limits

• Joint couplings

• Links to geometry files

• Robot composition

• Links that can self-collide

The kinematic structure of a robot is described by a tree of “nodes”,
where each node represents a moving coordinate frame. The parent of a
node is determined by the pred x directive, where x is a number iden-
tifying the parent node. The relative position between the coordinate
frames and how they move is determined by the Kleinfinger-Khalil pa-
rameters of the nodes. A node can also represent a coordinate frame that
is fixed with respect to its parent. This type of node is useful to represent
constant offsets, for example a manipulator’s position in the world.
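
To make the tree structure concrete, the following sketch stores each node's parent index and walks from a node back to the root. It is only an illustration: the struct FrameNode, its field names, and the function ChainToRoot are hypothetical and are not taken from CoPP. The convention that the root node refers to itself follows the example description file.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical in-memory form of one node of the kinematic tree.
struct FrameNode {
    std::size_t pred;  // index of the parent node; the root refers to itself
    bool fixed;        // true if the frame is fixed with respect to its parent
};

// Returns the node indices on the path from `node` up to the root, inclusive.
inline std::vector<std::size_t> ChainToRoot(const std::vector<FrameNode>& nodes,
                                            std::size_t node) {
    std::vector<std::size_t> chain;
    for (;;) {
        if (node >= nodes.size()) {
            throw std::out_of_range("pred refers to a node that does not exist");
        }
        if (chain.size() > nodes.size()) {
            throw std::logic_error("pred references form a cycle");
        }
        chain.push_back(node);
        if (nodes[node].pred == node) {
            return chain;  // reached the root
        }
        node = nodes[node].pred;
    }
}
```

For a single chain such as a fixed base frame followed by three joints, the path from the last joint visits every node once; for a branching robot such as a hand, the paths from two fingertips share the nodes below the branch point.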
Without any geometric objects, a robot would just be a set of moving
coordinate frames. To give a robot geometric links, one or more geomet-
ric objects can be attached to each frame. It is mandatory that each
geometric object is given a name; these names are used to specify which
links of the robot can collide with each other. The potential self-collisions
are listed in a collision table, which is also a part of the description file.
When attaching a geometric object, it can be given an optional offset and
a material specification. This allows the same geometry file to be reused
at several places in the same description file. As an example, consider a
robot hand where the finger links are identical, except for some offset or
color. In this case, a single geometry file, describing a finger link, can be
used several times.
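
One way to picture the collision table is as a set of unordered pairs of geometry names. The class below is a hypothetical sketch under that assumption; the name CollisionTable and its methods are invented for the example and say nothing about how CoPP stores the table internally.

```cpp
#include <set>
#include <string>
#include <utility>

// A set of unordered name pairs: (a, b) and (b, a) denote the same entry.
class CollisionTable {
public:
    void Add(const std::string& a, const std::string& b) {
        pairs_.insert(Ordered(a, b));
    }

    bool MayCollide(const std::string& a, const std::string& b) const {
        return pairs_.count(Ordered(a, b)) > 0;
    }

private:
    // Store each pair with the lexicographically smaller name first.
    static std::pair<std::string, std::string> Ordered(const std::string& a,
                                                       const std::string& b) {
        return a < b ? std::make_pair(a, b) : std::make_pair(b, a);
    }

    std::set<std::pair<std::string, std::string>> pairs_;
};
```

Because lookups use the names given in the description file, two attachments of the same geometry file remain distinguishable as long as they are given different names.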
Figure E.5 shows an example of a robot description file, describing
a simple SCARA robot. As the example robot is a single kinematic
chain without any branches, the Denavit-Hartenberg notation is suffi-
cient. Thus, the two extra parameters of the Kleinfinger-Khalil notation,
ε and γ, are optional. The resulting robot is shown in Figure E.6.
We assume that there exists a specific inverse kinematics solver,
named ScaraInvKin, for this robot type. To use this solver, the robot
description file uses the directive inverse_kin "ScaraInvKin". If no
solver is specified, a general numerical solver is used instead.
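
Selecting a solver by name, with a numerical solver as the default, could be organized as a small registry of factories. The sketch below is an assumption about one possible design, not a description of CoPP: the interface InvKinSolver, the type SolverRegistry, and the function MakeSolver are hypothetical, and treating an unknown name as an error is a choice made for the example.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical solver interface; only a name is exposed in this sketch.
struct InvKinSolver {
    virtual ~InvKinSolver() = default;
    virtual std::string Name() const = 0;
};

struct NumericalInvKin : InvKinSolver {
    std::string Name() const override { return "numerical"; }
};

struct ScaraInvKin : InvKinSolver {
    std::string Name() const override { return "ScaraInvKin"; }
};

using SolverFactory = std::function<std::unique_ptr<InvKinSolver>()>;
using SolverRegistry = std::map<std::string, SolverFactory>;

// An empty name means that the description file had no inverse_kin directive.
inline std::unique_ptr<InvKinSolver> MakeSolver(const SolverRegistry& registry,
                                                const std::string& requested) {
    if (requested.empty()) {
        return std::unique_ptr<InvKinSolver>(new NumericalInvKin);
    }
    const auto it = registry.find(requested);
    if (it == registry.end()) {
        throw std::invalid_argument("unknown inverse kinematics solver: " + requested);
    }
    return it->second();
}
```

A registry of this kind lets new closed-form solvers be added without changing the code that reads the description file.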

robot "simple_scara" {

    # choose a specific solver
    # for the inverse kinematics
    inverse_kin "ScaraInvKin"

    frame 0 {
        transform {
            # determines the robot's
            # position in the world
            translation 340 -200 980
        }
        # the root node must
        # refer to itself
        pred 0

        geom {
            name "link0"
            file "link0.geom"
        }
    }

    joint 1 {
        parameters {
            a 0.0
            alpha 0.0
            d 0.0
            theta 0.0
        }
        type revolute
        limits { -180.0d 180.0d }
        pred 0

        geom {
            name "link1"
            file "link1.geom"
        }
    }

    joint 2 {
        parameters {
            a 50.0
            alpha 0.0
            d 8.5
            theta 0.0
        }
        type revolute
        limits { -154.0d 154.0d }
        pred 1

        geom {
            name "link2"
            file "link2.geom"
        }
    }

    joint 3 {
        parameters {
            a 45.0
            alpha 180.0d
            # nonzero joint offset
            d 40.0
            theta 0.0
        }
        type prismatic
        limits { -25.0 40.0 }
        pred 2

        geom {
            name "link3"
            file "link3.geom"
        }
    }

    coll_table {
        "link0" and "link3"
    }
}

Figure E.5: An example of a robot description file for a robot with three
degrees of freedom. The resulting robot is shown in Figure E.6.

Figure E.6: A simple SCARA robot with three degrees of freedom. The
world frame and the end-effector frame are also drawn.
Appendix F

The Boost Libraries

At this point we have covered most of the building blocks of the CoPP
framework. There are, however, still a few functionalities that have to
be added to make the framework easy to work with: random number
generators, matrix classes, graph classes, and parsers. On the one hand,
as we do not want to reinvent the wheel, such general purpose code should
come from existing freely available libraries. On the other hand, we also
want to keep the number of external libraries to a minimum. This makes
the Boost libraries an ideal candidate, as they provide all the
functionalities listed above. The Boost web-site¹ provides free
peer-reviewed portable C++ libraries. The libraries are of a very high
standard; some of them will be included in the C++ Standards Committee’s
upcoming C++ Standard Library Technical Report as a step toward becoming
part of a future C++ Standard.
The next two sections will describe two Boost libraries that are
particularly useful in CoPP: the Boost Graph Library, and the parser
generator framework Spirit.

¹ www.boost.org

F.1 Boost Graph Library

Almost all path planning algorithms that are not purely potential-field
based use some graph or tree structure. As the algorithms for searching
and traversing a graph are independent of the data that is stored in it, it
seems like a good idea to separate these and provide an abstract data type
for all graphs. The Boost Graph Library (BGL) [133] provides generic
graph algorithms for constructing, modifying, traversing, and searching
graphs. Using template parameters, users can determine the data type
stored in the graph vertices and the graph edges, respectively. Other
template arguments can be used to determine whether the graph should be
directed or not, and other properties such as the underlying memory
representation. Experiments by Lee et al. [84] showed that for breadth-first
search, depth-first search, and Dijkstra’s algorithm, the BGL² was 5 to
7 times faster than the purely object-oriented LEDA C++ library [97].
This is probably due to BGL relying on compile-time polymorphism,
whereas LEDA relies on runtime polymorphism: compile-time polymorphism
avoids the overhead of virtual function calls and allows the compiler
to do more optimizations. The conclusion is that the efficiency and
the generic nature of BGL make it an ideal choice for graph-intensive
path planning methods like PRM.
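
The idea of compile-time polymorphism can be illustrated without BGL itself. The sketch below uses only the standard library and does not reproduce BGL's interface: the type AdjacencyGraph, its member functions, and the function BreadthFirstSearch are hypothetical names chosen for the example. The search is written once as a template, and both the graph type and the visitor are resolved at compile time, so no virtual function calls are involved.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// One possible graph type: adjacency lists indexed by vertex number.
struct AdjacencyGraph {
    std::vector<std::vector<std::size_t>> adj;

    std::size_t NumVertices() const { return adj.size(); }
    const std::vector<std::size_t>& Neighbors(std::size_t v) const { return adj[v]; }
};

// Breadth-first search for any Graph type that offers NumVertices() and
// Neighbors(v). The visitor is a template parameter, so the call visit(v)
// is bound at compile time and can be inlined by the compiler.
template <typename Graph, typename Visitor>
void BreadthFirstSearch(const Graph& g, std::size_t start, Visitor visit) {
    std::vector<bool> seen(g.NumVertices(), false);
    std::queue<std::size_t> frontier;
    seen[start] = true;
    frontier.push(start);
    while (!frontier.empty()) {
        const std::size_t v = frontier.front();
        frontier.pop();
        visit(v);
        for (std::size_t w : g.Neighbors(v)) {
            if (!seen[w]) {
                seen[w] = true;
                frontier.push(w);
            }
        }
    }
}
```

A runtime-polymorphic design would instead pass the visitor through a pointer to an abstract base class, which costs one virtual call per visited vertex and hides the call target from the optimizer.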

F.2 Spirit
It is inevitable that initializing a path planning application is rather pars-
ing intensive; geometric models, robot descriptions, and problem defini-
tions are all usually loaded from text files. As the expressiveness of these
description files increases, so does the complexity of the corresponding
parser. For really small projects, it is often enough to hand-code a parser
using an ad-hoc approach. However, if the project requires several
parsers, or if the language specification becomes complex, a more
systematic approach is needed. The usual approach when writing parsers
is to use parser generator tools like YACC [61, 87] and ANTLR [114].
With these tools, the grammar of the language is specified in a special
file. The grammar definition is then used by the tool to generate C or
C++ code for the corresponding parser. Tools like YACC and its
companion FLEX [85] are well known, scalable and generate fast and compact
parsers. However, users who want to modify an existing parser, or write
a new one, might not be familiar with these tools. Furthermore, as the
output from YACC and FLEX is pure C code, it is sometimes awkward
to work with if you want the result of the parsing to be a C++
object. Finally, the many extra files make maintenance harder.
² Before becoming a part of the Boost libraries, BGL was named GGCL, which
stands for Generic Graph Component Library. Thus, in [84] the name GGCL is used
instead.

A more recent approach to writing parsers is to use XML [53]. Here
the drawback is that interfacing the XML parser with your application
is not so straightforward.
The approach used here is to use the Spirit parser generator
framework [38] that is available in Boost. The key idea of Spirit is to use
the operator overloading feature of C++ to allow EBNF-style grammar
specifications to be written directly in the source code. This is a major
advantage over the approaches of YACC, ANTLR and XML because: (1)
there is one tool less to learn about, (2) there are fewer files to
maintain, and (3) there are no interface troubles between the parser and
the application. The Spirit framework also relies heavily on templates and
generic programming. The benefit is that one and the same parser definition
can transparently be used to parse input from a file, an input stream or
even a string in memory. Furthermore, existing parsers can be composed at
compile time into more complex parsers.
As a short example of how Spirit works, we will write a parser that
recognizes a comma-separated list of real numbers. Spirit comes with
several predefined parser primitives that are used to build more complex
parsers. One such parser is real_p, which recognizes and parses real
numbers. Another such parser is ch_p, which recognizes a single
character. We can now use these parser primitives, together with overloaded
operators, to build a more complex parser:

real_p >> *(ch_p(',') >> real_p)

This parser will recognize input that is a comma-separated list of real
numbers. The operator >> is overloaded such that a >> b means “b
must follow a”. The expression *a means that a must be matched zero
or more times. The operator * is also known as the Kleene star and in
conventional notation it appears after the expression it modifies.
However, as C++ has no postfix * operator, Spirit requires the Kleene star
operator to appear before the expression.
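
How overloaded operators can turn ordinary C++ expressions into grammar rules may be easier to see in a toy version. The following sketch is not Spirit and does not use Spirit's implementation techniques; it is a minimal stand-alone recognizer built on std::function, and the names Parser, Ch, and Real are hypothetical. Unlike Spirit, it resolves the composition at runtime.

```cpp
#include <cstdlib>
#include <functional>

// A parser maps a position in the input to the position just after the
// match, or to nullptr if the input does not match.
struct Parser {
    std::function<const char*(const char*)> match;
};

// Primitive: matches exactly the character c.
inline Parser Ch(char c) {
    return {[c](const char* s) -> const char* { return *s == c ? s + 1 : nullptr; }};
}

// Primitive: matches a real number by delegating to std::strtod
// (which also accepts leading white space).
inline Parser Real() {
    return {[](const char* s) -> const char* {
        char* end = nullptr;
        (void)std::strtod(s, &end);
        return end != s ? end : nullptr;
    }};
}

// Sequence: a >> b means "b must follow a".
inline Parser operator>>(Parser a, Parser b) {
    return {[a, b](const char* s) -> const char* {
        const char* mid = a.match(s);
        return mid ? b.match(mid) : nullptr;
    }};
}

// Kleene star as a prefix operator: *a matches a zero or more times.
inline Parser operator*(Parser a) {
    return {[a](const char* s) -> const char* {
        while (const char* next = a.match(s)) {
            if (next == s) break;  // guard against matches that consume nothing
            s = next;
        }
        return s;
    }};
}
```

With these definitions, the expression Real() >> *(Ch(',') >> Real()) has the same shape as the Spirit grammar and recognizes the same kind of list.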
The parser real_p >> *(ch_p(',') >> real_p) is not of much use as it
stands; it is merely a recognizer that can tell us whether the input
matched its grammar. To be of real use, we must bind semantic actions to
the parser, actions that we want to be executed every time the parser
finds a match. In this case we would probably want to store all the
numbers in a vector. Spirit also comes with a set of predefined semantic
actions. One of those is append, which fits our purpose perfectly. Assume
that we have defined a variable v that is of type vector<double>. We can
now put our list of real numbers in v with the following parser:

real_p[append(v)] >> *(ch_p(',') >> real_p[append(v)])

From this example we see that Spirit uses the [] operator to attach
semantic actions to a parser. Users can easily define their own semantic
actions, which they can attach to parsers.
Finally, to activate our parser we have to call one of the many parse
functions available in Spirit. If we assume that our input is stored in an
ordinary character array, we can wrap the parser call in a function as
shown in Figure F.1.

const char* ParseNumbers(const char* str,
                         vector<double>& v)
{
    return parse(str,
                 real_p[append(v)] >> *(ch_p(',') >> real_p[append(v)]),
                 space_p).stop;
}

Figure F.1: A function for parsing a comma-separated list of real
numbers. The list items can be separated by any amount of white space.
The function returns a pointer to the first character not consumed by
the parser.
The function in Figure F.1 will consume the input in str as long as it
matches the specified grammar, or until the end of the string is reached.
The return value is a pointer to the first character that was not
consumed. Often the input contains characters that we are not interested
in, such as white space or comments. To decide which characters to skip,
the parse function needs a third argument, a skip parser that acts like a
filter on the input. In this example, we used the predefined space_p as
skip parser. The space_p argument causes our parser to simply ignore
any white space.
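
For comparison, the same behavior can be written by hand with the standard library only. The function below is a sketch of the ad-hoc alternative, not Spirit code; the names SkipSpace and ParseNumbersByHand are hypothetical, and the handling of corner cases (for example a trailing comma or trailing white space) is an assumption that may differ from what Spirit does.

```cpp
#include <cctype>
#include <cstdlib>
#include <vector>

// Plays the role of the skip parser: advance past any white space.
inline const char* SkipSpace(const char* s) {
    while (std::isspace(static_cast<unsigned char>(*s))) ++s;
    return s;
}

// Parses "number (',' number)*", appends each number to v, and returns a
// pointer to the first character that was not consumed.
inline const char* ParseNumbersByHand(const char* str, std::vector<double>& v) {
    const char* pos = SkipSpace(str);
    char* end = nullptr;
    double value = std::strtod(pos, &end);
    if (end == pos) return str;  // no leading number: nothing is consumed
    v.push_back(value);
    pos = end;
    for (;;) {
        const char* p = SkipSpace(pos);
        if (*p != ',') break;
        p = SkipSpace(p + 1);
        value = std::strtod(p, &end);
        if (end == p) break;     // comma without a number: leave it unconsumed
        v.push_back(value);
        pos = end;
    }
    return pos;
}
```

Even for this small grammar, the hand-written version mixes the grammar, the white space handling, and the semantic action in one loop, which is the kind of entanglement the Spirit expression avoids.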
Hopefully this little example has conveyed the main ideas behind
Spirit. To conclude:
• Spirit is easy to learn and relieves users from the need to learn a
specific parser generator tool.
• The ability to define grammars directly in the C++ code leads to:
    – Shorter development times
    – Easier maintenance
    – More reuse of domain specific parsers
• The resulting parsers are compact, flexible and reasonably fast.

On the downside we can mention that due to being completely template
based and relying heavily on template meta-programming, Spirit
pushes the compiler to its limits; complex parsers might require the user
to increase, e.g., the compiler heap limit. Furthermore, a syntax error
during compilation will cause the compiler to emit almost
indecipherable error messages, spanning several lines. Despite
these two drawbacks, we think that Spirit is well suited for small to
medium-sized languages.
References

[1] D. Aarno, D. Kragic, and H. I. Christensen. Artificial potential

biased probabilistic roadmap method. In IEEE International Con-
ference on Robotics and Automation, volume 1, pages 461–466, New
Orleans, LA, USA, Apr. 2004.
[2] J. M. Ahuactzin and A. Portilla. A basic algorithm and data struc-
tures for sensor-based path planning in unknown environments. In
IEEE/RSJ International Conference on Intelligent Robots and Sys-
tems, volume 2, pages 903–908, Takamatsu, Japan, Nov. 2000.
[3] M. Akinc, K. E. Bekris, B. Y. Chen, A. M. Ladd, E. Plaku, and
L. E. Kavraki. Probabilistic roadmaps of trees for parallel com-
putation of multiple query roadmaps. In Eleventh Int. Symp. on
Robotics Research, Siena, Italy, Oct. 2003.
[4] A. Alexandrescu. Modern C++ Design: Generic Programming
and Design Patterns Applied. The C++ In-Depth Series. Addison-
Wesley, 2001.
[5] N. M. Amato, O. B. Bayazit, L. K. Dale, C. Jones, and D. Vallejo.
OBPRM: An obstacle-based PRM for 3D workspaces. In Proc.
third International Workshop on the Algorithmic Foundations of
Robotics, 1998.
[6] N. M. Amato, O. B. Bayazit, L. K. Dale, C. Jones, and D. Vallejo.
Choosing good distance metrics and local planners for probabilistic
roadmap methods. IEEE Transactions on Robotics and Automa-
tion, 16(4):442–447, Aug. 2000.
[7] N. M. Amato and G. Song. Using motion planning to study protein
folding pathways. Journal of Computational Biology, 9(2):149–168,
2002.

[8] A. L. Ames, D. R. Nadeau, and J. L. Moreland. VRML 2.0 Source-

book. John Wiley & Sons, second edition, 1997.

[9] M. A. Arbib, T. Iberall, and D. Lyons. Schemas that integrate

vision and touch for hand control. In M. A. Arbib and A. R. Hanson,
editors, Vision, Brain, and Cooperative Computation, pages 489–
510. MIT Press, 1987.

[10] A. Autere and J. Lehtinen. Robot motion planning by a hierarchical

search on a modified discretized configuration space. In IEEE/RSJ
International Conference on Intelligent Robots and Systems, vol-
ume 2, pages 1208–1213, Grenoble, France, Sept. 1997.

[11] B. Baginski. Efficient dynamic collision detection using expanded

geometry models. In IEEE/RSJ International Conference on Intel-
ligent Robots and Systems, volume 3, pages 1714–1720, Grenoble,
France, Sept. 1997.

[12] C. B. Barber, D. P. Dobkin, and H. T. Huhdanpaa. The Quickhull

algorithm for convex hulls. ACM Transactions on Mathematical
Software, 22(4):469–483, Dec. 1996.

[13] J. Barraquand, B. Langlois, and J.-C. Latombe. Numerical poten-

tial field techniques for robot path planning. IEEE Transactions
on Systems, Man and Cybernetics, 22(2):224–241, 1992.

[14] O. B. Bayazit, G. Song, and N. M. Amato. Ligand binding with
OBPRM and user input. In IEEE International Conference on
Robotics and Automation, volume 1, pages 954–959, Seoul, Korea,
May 2001.

[15] A. Bicchi, G. Casalino, and C. Santilli. Planning shortest bounded-

curvature paths for a class of nonholonomic vehicles among obsta-
cles. In IEEE International Conference on Robotics and Automa-
tion, volume 2, pages 1349–1354, Nagoya, Japan, May 1995.

[16] A. Blake and M. Isard. Active Contours. Springer-Verlag, 1998.

[17] R. Bohlin and L. E. Kavraki. Path planning using lazy PRM.

In IEEE International Conference on Robotics and Automation,
volume 1, pages 521–528, San Francisco, CA, USA, Apr. 2000.

[18] R. Bohlin and L. E. Kavraki. A randomized algorithm for robot

path planning based on lazy evaluation. In S. Rajasekaran,
P. Pardalos, J. Reif, and J. Rolim, editors, Handbook on randomized
computing. Kluwer Academic Publishers, 2001.

[19] G. M. Bone and Y. Du. Multi-metric comparison of optimal 2D

grasp planning algorithms. In IEEE International Conference on
Robotics and Automation, volume 3, pages 3061–3066, Seoul, Ko-
rea, May 2001.

[20] V. Boor, M. H. Overmars, and A. F. van der Stappen. The Gaussian

sampling strategy for probabilistic roadmap planners. In IEEE
International Conference on Robotics and Automation, volume 2,
pages 1018–1023, Detroit, MI, USA, May 1999.

[21] J. Borenstein, H. R. Everett, and L. Feng. Navigating Mobile

Robots: Systems and Techniques. A. K. Peters, Ltd., 1996.

[22] C. Borst, M. Fischer, and G. Hirzinger. Grasp planning: How

to choose a suitable task wrench space. In IEEE International
Conference on Robotics and Automation, New Orleans, LA, USA,
Apr. 2004.

[23] M. S. Branicky, S. M. LaValle, K. Olson, and L. Yang. Quasi-

randomized path planning. In IEEE International Conference on
Robotics and Automation, volume 2, pages 1481–1487, Seoul, Ko-
rea, May 2001.

[24] H. Bruyninckx, S. Demey, and V. Kumar. Generalized stability of

compliant grasps. In IEEE International Conference on Robotics
and Automation, volume 3, pages 2396–2402, Leuven, Belgium,
May 1998.

[25] S. Cameron. Collision detection by four-dimensional intersection

testing. IEEE Transactions on Robotics and Automation, 6(3):291–
302, June 1990.

[26] S. Cameron. A comparison of two fast algorithms for computing

the distance between two convex polyhedra. IEEE Transactions on
Robotics and Automation, 13(6):915–920, Dec. 1997.

[27] S. Cameron. Enhancing GJK: Computing minimum and penetration
distances between convex polyhedra. In IEEE International
Conference on Robotics and Automation, volume 4, pages 3112–3117,
Albuquerque, NM, USA, Apr. 1997.

[28] S. Cameron. Motion planning and collision avoidance with complex

geometry. In Proceedings of the IEEE Industrial Electronics Society,
volume 4, pages 2222–2226, Aachen, Germany, Sept. 1998.

[29] S. Cameron and J. Pitt-Francis. Using OxSim for path planning.

Journal of Robotic Systems, 18(8):421–431, Aug. 2001.

[30] J. A. Castellanos and J. D. Tardós. Mobile Robot Localization and

Map Building: A Multisensor Fusion Approach. Kluwer Academic
Publishers, 1999.

[31] K. Chung. An efficient collision detection algorithm for polytopes in

virtual environments. M. Phil. Thesis, Dept. of Computer Science,
University of Hong Kong, Sept. 1996.

[32] J. D. Cohen, M. C. Lin, D. Manocha, and M. Ponamgi. I-

COLLIDE: An interactive and exact collision detection system for
large-scale environments. In Proc. Symposium on Interactive 3D
Graphics, pages 189–196, Apr. 1995.

[33] J. O. Coplien. Advanced C++: Programming Styles and Idioms.

Addison-Wesley, 1991.

[34] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to
Algorithms. The MIT Press, McGraw-Hill, June 1990.

[35] J. Cortés, T. Siméon, and J. Laumond. A random loop genera-

tor for planning the motions of closed kinematic chains using PRM
methods. In IEEE International Conference on Robotics and Au-
tomation, volume 2, pages 2141–2146, Washington, DC, USA, May
2002.

[36] J. J. Craig. Introduction to Robotics: Mechanics and Control.

Addison-Wesley, second edition, 1989.

[37] T. Danner and L. E. Kavraki. Randomized planning for short in-

spection paths. In IEEE International Conference on Robotics and
Automation, volume 2, pages 971–976, San Francisco, CA, USA,
Apr. 2000.

[38] J. de Guzman and D. Nuffer. The Spirit library: Inline parsing in

C++. C/C++ Users Journal, 21:22, Sept. 2003.

[39] J. Denavit and R. S. Hartenberg. A kinematic notation for lower-

pair mechanisms based on matrices. Journal of Applied Mechanics,
22:215–221, 1955.

[40] R. Earnshaw, N. Magnenat-Thalmann, D. Terzopoulos, and

D. Thalmann. Computer animation for virtual humans. IEEE
Computer Graphics and Applications, 18(5):20–23, Sept. 1998.

[41] C. Ferrari and J. Canny. Planning optimal grasps. In IEEE Inter-

national Conference on Robotics and Automation, volume 3, pages
2290–2295, Nice, France, May 1992.

[42] M. Fischer and G. Hirzinger. Fast planning of precision grasps for

3D objects. In IEEE/RSJ International Conference on Intelligent
Robots and Systems, volume 1, pages 120–126, Grenoble, France,
Sept. 1997.

[43] M. Fischer and G. Hirzinger. Fast planning of precision grasps for

three-dimensional objects. Advanced Robotics, 12(5):535–549, 1999.

[44] L. Flückiger. Interface pour le pilotage et l’analyse des robots basée

sur un générateur de cinématiques. PhD thesis, École Polytech-
nique Fédérale de Lausanne, EPFL, Lausanne, Switzerland, 1998.

[45] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns:

Elements of Reusable Object-Oriented Software. Addison-Wesley
professional computing series. Addison-Wesley, 1995.

[46] R. Geraerts and M. H. Overmars. A comparative study of prob-

abilistic roadmap planners. Technical Report UU-CS-2002-041,
Dept. of Computer Sci., Utrecht Univ., Utrecht, the Netherlands,
2002.

[47] E. G. Gilbert and S. M. Hong. A new algorithm for detecting the

collision of moving objects. In IEEE International Conference on
Robotics and Automation, volume 1, pages 8–14, Scottsdale, AZ,
USA, May 1989.

[48] E. G. Gilbert, D. W. Johnson, and S. S. Keerthi. A fast procedure
for computing the distance between two complex objects in
three-dimensional space. IEEE Journal of Robotics and Automation,
4(2):193–203, Apr. 1988.

[49] G. H. Golub and C. F. Van Loan. Matrix Computations. The Johns

Hopkins University Press, second edition, 1989.

[50] F. Gravot, R. Alami, and T. Siméon. Playing with several roadmaps

to solve manipulation problems. In IEEE/RSJ International Con-
ference on Intelligent Robots and Systems, volume 3, pages 2311–
2316, EPFL, Lausanne, Switzerland, Oct. 2002.

[51] L. J. Guibas, D. Halperin, H. Hirukawa, J.-C. Latombe, and R. H.

Wilson. A simple and efficient procedure for polyhedral assembly
partitioning under infinitesimal motions. In IEEE International
Conference on Robotics and Automation, volume 3, pages 2553–
2560, Nagoya, Japan, May 1995.

[52] M. W. Hannan and I. D. Walker. The ’elephant trunk’ manipulator,

design and implementation. In IEEE International Conference on
Advanced Intelligent Mechatronics, volume 1, pages 14–19, Como,
Italy, July 2001.

[53] E. R. Harold and W. S. Means. XML in a Nutshell: A Desktop

Quick Reference. O’Reilly, second edition, Jan. 2001.

[54] M. Hernando and E. Gambao. New concept of visibility tetrahe-

dra for fast robot motion planning. In IEEE/ASME International
Conference on Advanced Intelligent Mechatronics, volume 2, pages
1340–1345, Como, Italy, July 2001.

[55] M. Hernando and E. Gambao. Visibility analysis and genetic al-

gorithms for fast robot motion planning. In IEEE/RSJ Interna-
tional Conference on Intelligent Robots and Systems, volume 3,
pages 2413–2418, EPFL, Lausanne, Switzerland, Oct. 2002.

[56] C. Holleman and L. E. Kavraki. A framework for using the

workspace medial axis in PRM planners. In IEEE International
Conference on Robotics and Automation, volume 2, pages 1408–
1413, San Francisco, CA, USA, Apr. 2000.

[57] W. S. Howard and V. Kumar. On the stability of grasped objects.

IEEE Transactions on Robotics and Automation, 12(6):904–917,
Dec. 1996.

[58] D. Hsu, J.-C. Latombe, and S. Sorkin. Placing a robot manipula-

tor amid obstacles for optimized execution. In IEEE International
Symposium on Assembly and Task Planning, pages 280–285, Porto,
Portugal, July 1999.

[59] G. Immega and K. Antonelli. The KSI tentacle manipulator. In

IEEE International Conference on Robotics and Automation, vol-
ume 3, pages 3149–3154, Nagoya, Japan, May 1995.

[60] P. Isto. Path planning by multiheuristic search via subgoals. In

Proc. 27th Int. Symposium on Industrial Robots, pages 712–726,
1996.

[61] S. C. Johnson. Yacc: Yet another compiler-compiler. Technical

Report Computing Science Technical Report No. 32, Bell Labora-
tories, Murray Hill, New Jersey, 1975.

[62] J. L. Jones and T. Lozano-Pérez. Planning two-fingered
grasps for pick-and-place operations on polyhedra. In IEEE International
Conference on Robotics and Automation, volume 1, pages
683–688, Cincinnati, OH, May 1990.

[63] L. E. Kavraki and J.-C. Latombe. Randomized preprocessing of

configuration space for fast path planning. In IEEE International
Conference on Robotics and Automation, volume 3, pages 2138–
2145, San Diego, CA, USA, May 1994.

[64] L. E. Kavraki, P. Švestka, J.-C. Latombe, and M. H. Overmars.

Probabilistic roadmaps for path planning in high-dimensional con-
figuration spaces. IEEE Transactions on Robotics and Automation,
12(4):566–580, Aug. 1996.

[65] M. Kazemi and M. Mehrandezh. Robot navigation using harmonic

function-based probabilistic roadmaps. In IEEE International Con-
ference on Robotics and Automation, volume 5, pages 4765–4770,
New Orleans, LA, USA, Apr. 2004.

[66] J. Kerr and B. Roth. Analysis of multifingered hands. The Inter-

national Journal of Robotics Research, 4(4):3–17, 1986.

[67] W. Khalil and E. Dombre. Modeling, identification & control of

robots. Hermes Penton Science, 2002.

[68] W. Khalil and J. F. Kleinfinger. A new geometric notation for

open and closed-loop robots. In IEEE International Conference on
Robotics and Automation, volume 3, pages 1174–1179, 1986.

[69] D. Kirkpatrick, B. Mishra, and C. K. Yap. Quantitative Steinitz’s

theorems with applications to multifingered grasping. In Proceed-
ings of the 20th ACM Symposium on Theory of Computing, pages
341–351, 1990.

[70] J. J. Kuffner. Effective sampling and distance metrics for 3D rigid

body path planning. In IEEE International Conference on Robotics
and Automation, volume 4, pages 3993–3998, New Orleans, LA,
USA, Apr. 2004.

[71] J. J. Kuffner Jr. Autonomous Agents for Real-Time Animation.

PhD thesis, Stanford University, Stanford, CA, USA, Dec. 1999.

[72] J. J. Kuffner Jr. and S. M. LaValle. RRT-Connect: An efficient

approach to single-query path planning. In IEEE International
Conference on Robotics and Automation, volume 2, pages 995–1001,
San Francisco, CA, USA, Apr. 2000.

[73] J. B. Kuipers. Quaternions and Rotation Sequences. Princeton

University Press, 1999.

[74] F. Lamiraux, E. Ferré, and E. Vallée. Kinodynamic motion plan-

ning: Connecting exploration trees using trajectory optimization
methods. In IEEE International Conference on Robotics and Au-
tomation, volume 4, pages 3987–3992, New Orleans, LA, USA, May
2004.

[75] P. T. Lansbury Jr. Evolution of amyloid: What normal protein

folding may tell us about fibrillogenesis and disease. In Proc. Natl.
Acad. Sci. USA, volume 96, pages 3342–3344, Mar. 1999.

[76] E. Larsen, S. Gottschalk, M. C. Lin, and D. Manocha. Fast distance
queries with rectangular swept sphere volumes. In IEEE International
Conference on Robotics and Automation, volume 4, pages
3719–3726, San Francisco, CA, Apr. 2000.

[77] J.-C. Latombe. Robot Motion Planning. Kluwer Academic Pub-

lishers, 1991.

[78] S. M. LaValle. Rapidly-exploring random trees: A new tool for path

planning. Technical Report TR 98-11, Computer Science Dept.,
Iowa State Univ., Oct. 1998.
[79] S. M. LaValle. Planning Algorithms. 2004. Available at
http://msl.cs.uiuc.edu/planning.
[80] S. M. LaValle and J. E. Hinrichsen. Visibility-based pursuit-
evasion: The case of curved environments. IEEE Transactions on
Robotics and Automation, 17(2):196–202, Apr. 2001.
[81] S. M. LaValle and J. J. Kuffner Jr. Randomized kinodynamic plan-
ning. In IEEE International Conference on Robotics and Automa-
tion, volume 1, pages 473–479, Detroit, MI, USA, May 1999.
[82] S. M. LaValle and J. J. Kuffner Jr. Rapidly-exploring random
trees: Progress and prospects. In Workshop on the Algorithmic
Foundations of Robotics, 2000.
[83] S. M. LaValle, D. Lin, L. J. Guibas, J.-C. Latombe, and R. Mot-
wani. Finding an unpredictable target in a workspace with obsta-
cles. In IEEE International Conference on Robotics and Automa-
tion, volume 1, pages 737–742, Albuquerque, NM, USA, Apr. 1997.
[84] L.-Q. Lee, J. G. Siek, and A. Lumsdaine. The generic graph com-
ponent library. In ACM SIGPLAN conference on Object-oriented
programming, systems, languages, and applications, pages 399–414,
Denver, CO, USA, 1999.
[85] M. E. Lesk and E. Schmidt. Lex – a lexical analyzer generator.
Technical Report Computing Science Technical Report No. 39, Bell
Laboratories, Murray Hill, New Jersey, July 1975.
[86] E. Levey, C. Peters, and C. O’Sullivan. New metrics for evalua-
tion of collision detection techniques. Technical Report TCD-CS-
1999-55, Dept. of Computer Sci., Univ. of Dublin, Trinity College,
Dublin, Ireland, Nov. 1999.
[87] J. R. Levine, T. Mason, and D. Brown. lex & yacc. O’Reilly, second
edition, Oct. 1992.
[88] Z. Li and S. S. Sastry. Task-oriented optimal grasping by multi-
fingered robot hands. IEEE Journal of Robotics and Automation,
4(1):32–44, Feb. 1988.

[89] M. C. Lin and J. F. Canny. A fast algorithm for incremental dis-

tance calculation. In IEEE International Conference on Robotics
and Automation, pages 1008–1014, Sacramento, CA, Apr. 1991.

[90] Q. Lin, J. W. Burdick, and E. Rimon. A stiffness-based quality

measure for compliant grasps and fixtures. IEEE Transactions on
Robotics and Automation, 16(6):675–688, Dec. 2000.

[91] F. Lingelbach. Path planning using probabilistic cell decomposi-

tion. In IEEE International Conference on Robotics and Automa-
tion, volume 1, pages 467–472, New Orleans, LA, USA, Apr. 2004.

[92] T. Lozano-Pérez. Spatial planning: A configuration space ap-

proach. IEEE Transactions on Computers, C-32(2):108–120, Feb.
1983.

[93] T. Lozano-Pérez, J. L. Jones, E. Mazer, P. O'Donnell, W. E. L.
Grimson, P. Tournassoud, and A. Lanusse. Handey: A robot system
that recognizes, plans, and manipulates. In IEEE International
Conference on Robotics and Automation, volume 4, pages 843–849,
Mar. 1987.

[94] L. Lu and S. Akella. Folding cartons with fixtures: A motion plan-

ning approach. IEEE Transactions on Robotics and Automation,
16(4):346–356, Aug. 2000.

[95] G. Mantriota. Communication on optimal grip points for con-

tact stability. The International Journal of Robotics Research,
18(5):502–513, May 1999.

[96] X. Markenscoff and C. H. Papadimitriou. Optimum grip of a poly-

gon. The International Journal of Robotics Research, 8(2):17–29,
Apr. 1989.

[97] K. Mehlhorn and S. Näher. LEDA: A Platform for Combinatorial

and Geometric Computing. Cambridge University Press, Nov. 1999.

[98] S. Meyers. More Effective C++. Addison-Wesley, 1996.

[99] A. T. Miller. GraspIt!: A Versatile Simulator for Robotic Grasping.

PhD thesis, Department of Computer Science, Columbia Univer-
sity, June 2001.

[100] A. T. Miller and P. K. Allen. Examples of 3D grasp quality com-

putations. In IEEE International Conference on Robotics and Au-
tomation, volume 2, pages 1240–1246, Detroit, MI, USA, May 1999.

[101] A. T. Miller and P. K. Allen. GraspIt!: A versatile simulator for

grasp analysis. In Proceedings ASME International Mechanical En-
gineering Congress & Exposition, pages 1251–1258, Orlando, FL,
USA, Nov. 2000.

[102] A. T. Miller, S. Knoop, H. I. Christensen, and P. K. Allen. Au-

tomatic grasp planning using shape primitives. In IEEE Interna-
tional Conference on Robotics and Automation, volume 2, pages
1824–1829, Taipei, Taiwan, Sept. 2003.

[103] B. Mirtich. V-Clip: Fast and robust polyhedral collision detection.

ACM Transactions on Graphics, 17(3):177–208, July 1998.

[104] B. Mirtich and J. Canny. Easily computable optimum grasps in

2-D and 3-D. In IEEE International Conference on Robotics and
Automation, volume 1, pages 739–747, San Diego, CA, USA, May
1994.

[105] A. Morales, P. J. Sanz, and A. P. del Pobil. Vision-based com-

putation of three-finger grasps on unknown planar objects. In
IEEE/RSJ International Conference on Intelligent Robots and Sys-
tems, volume 2, pages 1711–1716, EPFL, Lausanne, Switzerland,
Oct. 2002.

[106] A. Morales, P. J. Sanz, A. P. del Pobil, and A. H. Fagg. An ex-

periment in constraining vision-based finger contact selection with
gripper geometry. In IEEE/RSJ International Conference on In-
telligent Robots and Systems, volume 2, pages 1693–1698, 2002.

[107] R. M. Murray, Z. Li, and S. S. Sastry. A Mathematical Introduction

to Robotic Manipulation. CRC Press, 1994.

[108] V.-D. Nguyen. Constructing force-closure grasps. The International

Journal of Robotics Research, 7(3):3–16, June 1988.

[109] C. L. Nielsen and L. E. Kavraki. A two level fuzzy PRM for ma-
nipulation planning. In IEEE/RSJ International Conference on
Intelligent Robots and Systems, volume 3, pages 1716–1721, Taka-
matsu, Japan, Nov. 2000.

[110] C. Nissoux, T. Siméon, F. Lamiraux, J. Cortés, and C. van Geem.

Move3D Programming Manual. LAAS-CNRS, Groupe Robotique
et Intelligence Artificielle, Toulouse, France, Mar. 2001.
[111] C. J. Ong and E. G. Gilbert. The Gilbert-Johnson-Keerthi algo-
rithm: A fast version for incremental motions. In IEEE Interna-
tional Conference on Robotics and Automation, pages 1183–1189,
Albuquerque, NM, USA, Apr. 1997.
[112] C. J. Ong and E. G. Gilbert. Fast versions of the Gilbert-Johnson-
Keerthi distance algorithm: Additional results and comparisons.
IEEE Transactions on Robotics and Automation, 17(4):531–539,
Aug. 2001.
[113] G. Oriolo, M. Ottavi, and M. Vendittelli. Probabilistic motion
planning for redundant robots along given end-effector paths. In
IEEE/RSJ International Conference on Intelligent Robots and Sys-
tems, volume 2, pages 1657–1662, EPFL, Lausanne, Switzerland,
Oct. 2002.
[114] T. J. Parr and R. W. Quong. ANTLR: A predicated-LL(k) parser
generator. Software–Practice and Experience, 25(7):789–810, July
1995.
[115] R. P. Paul. Robot manipulators: Mathematics, programming and
control. The computer control of robot manipulators. The MIT
Press, Cambridge, Massachusetts and London, England, fifth edi-
tion, 1983.
[116] R. P. Paul and H. Zhang. Computationally efficient kinematics for
manipulators with spherical wrists based on the homogeneous trans-
formation representation. The International Journal of Robotics
Research, 5(2):32–44, 1986.
[117] L. Petersson, P. Jensfelt, D. Tell, M. Strandberg, D. Kragic, and
H. I. Christensen. Systems integration for real-world manipulation
tasks. In IEEE International Conference on Robotics and Automa-
tion, volume 3, pages 2500–2505, Washington, DC, USA, May 2002.
[118] J. Pettré, T. Siméon, and J. Laumond. Planning human walk in
virtual environments. In IEEE/RSJ International Conference on
Intelligent Robots and Systems, volume 3, pages 3048–3053, EPFL,
Lausanne, Switzerland, Oct. 2002.

[119] N. S. Pollard. Parallel Methods for Synthesizing Whole-Hand

Grasps from Generalized Prototypes. PhD thesis, Department of
Electrical Engineering and Computer Science, Massachusetts Insti-
tute of Technology, 1994.

[120] J. Ponce and B. Faverjon. Computing three-finger force-closure

grasps of polygonal objects. IEEE Transactions on Robotics and
Automation, 11(6):868–881, Dec. 1995.

[121] F. P. Preparata and M. I. Shamos. Computational Geometry: An

Introduction, chapter 6.4. Springer-Verlag, 1985.

[122] C. Qin, S. Cameron, and A. McLean. Towards efficient motion

planning for manipulators with complex geometry. In Proc. IEEE
International Symposium on Assembly and Task Planning, pages
207–212, Pittsburgh, PA, USA, Aug. 1995.

[123] J. A. Reeds and L. A. Shepp. Optimal paths for a car that goes
both forwards and backwards. Pacific Journal of Mathematics,
145(2):367–393, 1990.

[124] M. Reggiani, M. Mazzoli, and S. Caselli. An experimental evalua-

tion of collision detection packages for robot motion planning. In
IEEE/RSJ International Conference on Intelligent Robots and Sys-
tems, volume 3, pages 2329–2334, EPFL, Lausanne, Switzerland,
Oct. 2002.

[125] J. H. Reif. Complexity of the mover’s problem and generalizations.

In Proceedings IEEE Symposium on Foundations of Computer Sci-
ence, pages 421–427, San Juan, Puerto Rico, Oct. 1979.

[126] J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, and

W. Lorensen. Object-Oriented Modeling and Design. Prentice Hall,
1991.

[127] J. Rumbaugh, I. Jacobson, and G. Booch. The Unified Modeling

Language Reference Manual. Addison-Wesley, second edition, 2004.

[128] A. Sahbani, J. Cortés, and T. Siméon. A probabilistic algorithm for

manipulation planning under continuous grasps and placements. In
IEEE/RSJ International Conference on Intelligent Robots and Sys-
tems, volume 2, pages 1560–1565, EPFL, Lausanne, Switzerland,
Oct. 2002.

[129] Y. Sato, M. Hirata, T. Maruyama, and Y. Arita. Efficient collision

detection using fast distance-calculation algorithms for convex and
non-convex objects. In IEEE International Conference on Robotics
and Automation, volume 1, pages 771–778, Minneapolis, MN, USA,
Apr. 1996.

[130] P. N. Sheth and J. J. Uicker. A generalized symbolic notation for

mechanisms. J. of Engineering for Industry, Transactions of the
ASME, 93:102–112, 1971.

[131] K. Shoemake. Animating rotations with quaternion curves. In

Proceedings of SIGGRAPH ’85, pages 245–254, San Francisco, CA,
USA, July 1985.

[132] K. Shoemake. Graphics Gems III, chapter Uniform Random Ro-

tations, pages 124–132. Academic Press, San Diego, CA, USA,
1992.

[133] J. G. Siek, L.-Q. Lee, and A. Lumsdaine. The Boost Graph Library.
The C++ In-Depth Series. Addison-Wesley, 2002.

[134] T. Siméon, J. Cortés, A. Sahbani, and J.-P. Laumond. A manipula-

tion planner for pick and place operations under continuous grasps
and placements. In IEEE International Conference on Robotics and
Automation, volume 2, pages 2022–2027, Washington, DC, May
2002.

[135] T. Siméon, J.-P. Laumond, and F. Lamiraux. Move3D: a generic

platform for path planning. In IEEE International Symposium on
Assembly and Task Planning, pages 25–30, Fukuoka, Japan, May
2001.

[136] T. Siméon, J.-P. Laumond, and F. Lamiraux. Towards a software

development kit for motion synthesis in virtual worlds. In IEEE In-
ternational Conference on Virtual Systems and Multimedia, pages
854–863, Berkeley, CA, USA, Oct. 2001.

[137] T. Siméon, J.-P. Laumond, and C. Nissoux. Visibility based prob-

abilistic roadmaps for motion planning. Advanced Robotics, 14(6),
2000.

[138] B. H. Simov, G. Slutzki, and S. M. LaValle. Pursuit-evasion using

beam detection. In IEEE International Conference on Robotics and
Automation, volume 2, pages 1657–1662, San Francisco, CA, USA,
Apr. 2000.
[139] M. H. Singer. A general approach to moment calculation for poly-
gons and line segments. Pattern Recognition, 26(7):1019–1028, Jan.
1993.
[140] G. Smith, E. Lee, K. Goldberg, K. Böhringer, and J. Craig. Com-
puting parallel-jaw grips. In IEEE International Conference on
Robotics and Automation, volume 3, pages 1897–1903, Detroit, MI,
USA, May 1999.
[141] G. Song and N. M. Amato. Randomized motion planning for car-
like robots using C-PRM. In IEEE/RSJ International Conference
on Intelligent Robots and Systems, volume 1, pages 37–42, Maui,
HI, USA, Nov. 2001.
[142] M. Strandberg. A fast grasp planner for a three-fingered hand based
on 2D contours. In 33rd International Symposium on Robotics,
Stockholm, Sweden, Oct. 2002. IFR, Swira.
[143] M. Strandberg. A grasp evaluation procedure based on distur-
bance forces. In IEEE/RSJ International Conference on Intelligent
Robots and Systems, volume 2, pages 1699–1704, EPFL, Lausanne,
Switzerland, Oct. 2002.
[144] M. Strandberg. Augmenting RRT-planners with local trees. In
IEEE International Conference on Robotics and Automation, vol-
ume 4, pages 3258–3262, New Orleans, LA, USA, Apr. 2004.
[145] S. Sundaram, I. Remmler, and N. M. Amato. Disassembly sequenc-
ing using a motion planning approach. In IEEE International Con-
ference on Robotics and Automation, volume 2, pages 1475–1480,
Seoul, Korea, May 2001.
[146] M. Suppa, P. Wang, K. Gupta, and G. Hirzinger. C-space explo-
ration using noisy sensor models. In IEEE International Conference
on Robotics and Automation, volume 5, pages 4777–4782, New Or-
leans, LA, USA, Apr. 2004.
[147] P. Švestka and M. H. Overmars. Coordinated motion planning
for multiple car-like robots using probabilistic roadmaps. In IEEE
International Conference on Robotics and Automation, volume 2,
pages 1631–1636, Nagoya, Japan, May 1995.

[148] M. Teichmann. A grasp metric invariant under rigid motions. In

IEEE International Conference on Robotics and Automation, vol-
ume 3, pages 2143–2148, Minneapolis, MN, USA, Apr. 1996.

[149] C.-H. Tsai, J.-S. Lee, and J.-H. Chuang. Path planning of 3-D
objects using a new workspace model. In IEEE International Sym-
posium on Computational Intelligence in Robotics and Automation,
pages 420–425, Alberta, Canada, July 2001.

[150] G. van den Bergen. A fast and robust GJK implementation for
collision detection of convex objects. Journal of Graphics Tools,
4(2):7–25, 1999.

[151] C. Van Geem. On using a Manhattan distance-like function for

robot motion planning on a non-uniform grid in configuration space.
Technical Report 94-74a, Research Inst. for Symbolic Computation,
Joh. Kepler Univ., Linz, Austria, 1994.

[152] M. Vendittelli, J.-P. Laumond, and C. Nissoux. Obstacle distance

for car-like robots. IEEE Transactions on Robotics and Automa-
tion, 15(4):678–691, Aug. 1999.

[153] A. Watt. 3D Computer Graphics. Addison-Wesley, third edition,

2000.
[154] J. Wernecke. The Inventor Mentor. Addison-Wesley, 1994.

[155] S. A. Wilmarth, N. M. Amato, and P. F. Stiller. MAPRM: A

probabilistic roadmap planner with sampling on the medial axis of
the free space. In IEEE International Conference on Robotics and
Automation, volume 2, pages 1024–1031, Detroit, MI, USA, May
1999.