Path Planning Oriented Objects
Path Planning Oriented Objects
An Object-Oriented Approach
Morten Strandberg
TRITA–S3–REG–0401
ISSN 1404–2150
ISBN 91-7283-868-X
Automatic Control
Department of Signals, Sensors and Systems
Royal Institute of Technology (KTH)
Stockholm, Sweden, 2004
Automatic Control
Department of Signals, Sensors and Systems
Royal Institute of Technology (KTH)
SE-100 44 Stockholm, Sweden
Abstract
Path planning has important applications in many areas, for exam-
ple industrial robotics, autonomous systems, virtual prototyping, and
computer-aided drug design. This thesis presents a new framework for
developing and evaluating path planning algorithms. The framework is
named CoPP (Components for Path Planning). It consists of loosely cou-
pled and reusable components that are useful for building path planning
applications. The framework is especially designed to make it easy to do
fair comparisons between different path planning algorithms.
CoPP is also designed to allow almost any user-defined moving sys-
tem. The default type of moving system is a robot class, which is capable
of describing tree-like kinematic chains. Additional features of this robot
class are: joint couplings, numerical or closed-form inverse kinematics,
and hierarchical robot representations. The last feature is useful when
planning for complex systems like a mobile platform equipped with an
arm and a hand.
During the last six years, Rapidly-exploring Random Trees (RRTs)
have become a popular framework for developing randomized path plan-
ning algorithms. This thesis presents a method for augmenting bidirec-
tional RRT-planners with local trees. For problems where the solution
trajectory has to pass through several narrow passages, local trees help
to reduce the required planning time.
To reduce the work needed for programming of industrial robots, it
is desirable to allow task specifications at a very high level, leaving it up
to the robot system to figure out what to do. Here we present a fast and
flexible pick-and-place planner. Given an object that has to be moved to
another position, the planner chooses a suitable grasp of the object and
finds motions that bring the object to the desired position. The planner
can also handle constraints on, e.g., the orientation of the manipulated
object.
For planning of pick-and-place tasks it is necessary to choose a grasp
suitable to the task. Unless the grasp is given, some sort of grasp planning
has to be performed. This thesis presents a fast grasp planner for a three-
fingered robot hand. The grasp planner could be used in an industrial
setting, where a robot is to pick up irregularly shaped objects from a
conveyor belt. In conjunction with grasp planning, a new method for
evaluating grasp stability is presented.
v
Acknowledgments
First I would like to thank my advisors, Professor Bo Wahlberg and Pro-
fessor Henrik Christensen, for giving me the opportunity to be a graduate
student at the Centre for Autonomous Systems. As my toughest critics,
they have also been of invaluable help during the writing of this thesis.
I am especially indebted to Frank Lingelbach for many fruitful dis-
cussions on path planning, for proof reading, and for being a brave CoPP
test pilot.
I am thankful to the following persons for suggestions and correc-
tions to the manuscript: Fredrik Almqvist, Fredrik Niemelä, and Paul
Sundvall.
Finally, I thank my wife Firozeh and our children Nadja, Jonatan and
Peter for their love and support during these years; I promise that you
will see more of me from now on!
This research has been sponsored by the Swedish Foundation for
Strategic Research through the Centre for Autonomous Systems at KTH.
The support is gratefully acknowledged.
Contents
1 Introduction 1
1.1 Path Planning Applications . . . . . . . . . . . . . . . . . 2
1.2 Notation and Terminology . . . . . . . . . . . . . . . . . . 4
1.3 Probabilistic Roadmap Methods . . . . . . . . . . . . . . 6
1.4 Outline and Contributions of the Thesis . . . . . . . . . . 8
1.5 Publications . . . . . . . . . . . . . . . . . . . . . . . . . . 11
References 229
Chapter 1
Introduction
In its most basic form, robot path planning is about finding a collision-
free motion from one position to another. Efficient algorithms for solving
problems of this type have important applications in areas such as: indus-
trial robotics, computer animation, drug design, and automated surveil-
lance. It is therefore not surprising that the research activity in this field
has been steadily increasing over the last two decades.
In the first part of this thesis, we study how object-oriented design
methods can be of use in the context of path planning. The result is an
object-oriented framework that makes it easy to develop and compare
path planning algorithms. Using this framework, two new algorithms
have been developed; one for the basic form of the path planning problem,
and one aimed for pick-and-place tasks.
The last chapters of the thesis are about grasp planning, which is a
related but slightly different type of planning; whereas path planning is
about finding motions, grasp planning is about finding a single configu-
ration, i.e., the grasp, that satisfy certain specifications.
The next section gives examples of application areas for path planning
algorithms. Section 1.2 introduces the terminology and notation that
is used in the path planning community. Section 1.3 presents a type
of randomized path planning algorithms that are most common today,
namely probabilistic roadmap methods (PRMs). The framework to be
presented in the thesis is not constrained to a particular planning method,
but will focus on PRM-like methods. Hence, Section 1.3 is useful to
understand the motivation behind the framework.
The final section gives an overview of the thesis and its contributions.
2 1 Introduction
Cobs = {q ∈ C | A(q) ∩ O =
6 ∅} (1.1)
Equation (1.1) can be seen as a mapping of the obstacles in W to obstacles
in the configuration space C. The resulting regions are often denoted
configuration space obstacles, or C-space obstacles for short. Note that
for robots consisting of several moving objects, the definition of Cobs in
Equation (1.1) have to be extended to include self-collisions.
The remaining portion of the configuration space is called the free
space. The free space is defined and denoted as Cf ree = C \ Cobs .
Since the seminal work of Lozano-Pérez [92], most planning algo-
rithms work in the configuration space. Transforming the problem from
W to the C-space using Equation (1.1) is a useful abstraction: Path plan-
ning problems that look very different in W still look similar in a C-space
formulation, allowing the same type of algorithm to solve widely differ-
ent problems. Note, however, that the transformation in Equation (1.1)
is carried out explicitly only for very simple problems of low-dimension.
Most algorithms instead work by probing C to find out wether a config-
uration belongs to Cf ree .
6 1 Introduction
qi
qg
In the query phase, the initial configuration, qi , and the goal con-
figuration, qg , are connected to the graph using the same local planner
that was used for the learning phase. If this connection succeeds, then
the path planning problem has been reduced to a graph search problem,
which can be answered in fractions of a second. Figure 1.1 (a) shows
an example of a roadmap for a two-dimensional configuration space. An
example of a query to the same roadmap is shown in Figure 1.1 (b).
PRM methods are particularly useful if repeated queries are expected
for the same environment; then the cost for constructing the roadmap
is amortized over each query. Thus, given a static environment, the
overall efficiency of a PRM-planner usually increases with the number of
queries. Due this behavior, there is often a distinction between single
shot planners, and multiple query planners. PRM-planners have for a
long time been thought of as multiple query planners due to the costly
learning phase, but recent contributions by Bohlin and Kavraki [17, 18]
have changed on that; they showed that the costly operation of verifying
wether path segments are collision free could be postponed until the query
phase. Thus, the constructed roadmap contains many infeasible edges,
but these are detected and deleted during the query phase. With this
Lazy PRM approach, the number of required collision detections were
significantly reduced, making the approach competitive as a single shot
8 1 Introduction
planner as well.
Variations Since the original work in [63, 147, 64], numerous varia-
tions, improvements, and extensions have been suggested. Already in [64]
it was pointed out that using uniform sampling of the C-space would not
be efficient in case the problem involves a narrow passage. To get through
the passage, the roadmap must have at least one vertex in that area.
However, as the critical region usually covers an extremely small portion
of the configuration space, the chance of getting a configuration there at
random is very small. Therefore many researchers have suggested sam-
pling schemes that bias the distribution towards the narrow passages of
the configuration space.
Other variations explore different heuristics for determining between
which vertices connection attempts should be made. These heuristics
often involve the choice of a metric, according to which the distance
between two configurations is determined. Throughout the thesis we will
use the words vertices and nodes interchangeably.
In Chaper 2 we will describe many of these PRM-methods in more
detail. In that chapter we will also present an object-oriented framework
that offer a systematic way to deal with all these variations.
Chapter 3
Chapter 3 describes a general class for representing robots. The class
can describe robots with tree-like kinematic chains and joint couplings.
The possibility to add joint couplings makes it easy to model robots
with coupled degrees of freedom, such as, e.g., the Barrett hand.2 Joint
couplings also allow us to model certain types of robots with closed kine-
matic chains; an important feature as many industrial robots have closed
kinematic chains to increase the stiffness of the structure.
The robot class also provides features like general and specific in-
verse kinematics and robot composition. These two features have shown
to be useful in high-level path planning applications like pick-and-place
planning, see Chapter 5.
Chapter 4
Rapidly-exploring Random Trees (RRTs) [78] have become a popular
framework for solving path planning problems. In Chapter 4 we address
some of the weaknesses of bidirectional RRT-planners and present an im-
proved algorithm. The improved algorithm uses local trees, in addition to
the ones rooted at the start and goal configuration. Several experiments
show that the new algorithm performs much better than the original one
in case the solution has to pass several difficult regions.
Chapter 5
In Chapter 5 we present a pick-and-place planner that builds upon the
work by Kuffner [71]. The proposed planner is fast; it can solve every-
day pick-and-place tasks in a few seconds. Some improvements over the
planner in [71] allow the new planner to handle difficult problems more
2 The Barrett hand is a three-fingered robot hand from Barrett Technology, Inc.
The second part of the thesis focuses on grasp planning and grasp sta-
bility. To be of real use, a robot must be able to interact with its envi-
ronment by picking up and manipulating objects. Thus, for robots that
autonomously interact with the environment, grasp planning and path
planning are needed in combination.
Chapter 6
In Chapter 6 we present a fast grasp planner for a three-fingered hand.
The planner uses a two-dimensional contour of the object as input and
the result is an ordered set of stable grasps. This contribution could be
useful for example in industrial applications where a robot is to pick up
objects of unknown shape from a conveyor belt.
Chapter 7
Grasp planning is often formulated as an optimization problem, where
the goal is to find the best grasp according to some criterion. In Chapter 7
we present a novel approach to grasp stability evaluation. The approach
is based on the grasp’s ability to withstand disturbance forces that act
on the surface of the grasped object. Previous approaches have often
treated the grasp stability analysis as independent of the object geometry,
whereas we claim that the stability depends on the geometry also. Our
approach takes the geometry into account, and it can also incorporate
task information to derive a task directed quality measure for a grasp.
Appendices
Much of the framework in Chapter 2 is illustrated using a variant of the
class diagram notation introduced by Rumbaugh et al. [126]. Appendix A
gives a brief introduction to class diagrams. Here we also make clear the
additional conventions that are used in this thesis.
Appendix B gives a short introduction introduction to rigid body
transformations and various ways of representing rotations.
The framework presented in Chapter 2 is designed to support different
geometric representations. Users of the framework can define their own
1.5 Publications 11
geometric types, or they can use one of the three types that the frame-
work provides: triangle sets, convex polyhedra, and hierarchies of convex
polyhedra. These three geometric types are covered in Appendix C.
In a similar manner, the framework is not bound to a specific collision
detection algorithm. The collision detection algorithms that have been
tested so far are PQP [76] and Enhanced GJK [27]. These algorithms are
covered in Appendix D.
Appendix E covers additional details of the framework, such as visu-
alization, task constraints, and path representation.
Appendix F presents parts of the Boost libraries that have been found
useful for implementing path planning algorithms. Examples are: generic
graph functions, random number generation, matrix classes, and parsing.
1.5 Publications
Some of the work presented in this thesis has been published elsewhere.
The material in Chapters 4, 6, and 7 is presented in part in the following
papers:
The following publication contains results that are relevant, but not
covered in the thesis.
Since the original work in [63, 147, 64], many variations of the basic
PRM algorithm have been published. Examples are: different sampling
schemes, different metrics, and various local planners. However, it is
difficult to compare these contributions objectively as “they were tested
on different types of scenes, using different underlying libraries, imple-
mented by different people on different machines” [46]. This problem
underlines the importance of a homogeneous software platform, where
changing, e.g., the collision detection algorithm is as simple as changing
one line of code or a directive in a configuration file. Only then can we
truly compare two versions of the same concept.
To address this problem, we will in this chapter present an object-
oriented framework for building path planners. The design rationale be-
hind the framework is plug-and-play capability between variations of all
the major concepts in path planning. Apart from the framework itself,
the contributions in this chapter are the identification of a set of or-
thogonal concepts, and the definition of coherent and efficient interfaces
to these concepts. The resulting framework is a set of loosely coupled
components that can be combined and extended to build path planners.
To emphasize the component-based approach, the framework is named
CoPP, which stands for Components for Path Planning.
As this is not the first attempt to create a framework for path plan-
ning, the next section will describe other approaches and we will see how
14 2 A Framework for Path Planning
3 www.kineocam.com
4 Object-oriented programming is of course possible in C as well, but it usually
Geometric Representations
Models of Motion move objects
- triangle sets
- rigid body - convex polyhedra
- kinematic chain - octrees
Planner
Collision Detection/
collisions/distances Distance Computation
- metrics
- interpolation
- sampling
Figure 2.1: Illustration of the main components that are involved when
solving a path planning problem. The arrows show the flow of infor-
mation for a typical implementation. The dashed line indicates a more
passive relationship; the geometric objects that need to be checked for
collisions are usually registered once at startup to the object responsible
for collision detection.
the chosen representation is tightly coupled to the method used for col-
lision detection, but here we want to see the geometric representation as
a concept that can vary as freely as possible. At the highest level, as
described by Figure 2.1, we only require a geometric object to have have
a few basic properties such as a position, an orientation, and a bounding
box, determining the object’s extent in space. We make no distinction
between stationary and moving objects; all objects are seen as moveable.
More details on geometric representations can be found in Section 2.3.
• Easy to use
• Easy to extend
• Loosely coupled
• Portable
• Efficient
Among the techniques and principles we have used are the design pat-
terns in [45]. We have for example used the Template Method pattern to
automate (from the point of view of the derived classes) the gathering of
useful statistics, such as the number of collision checks.
For each concept mentioned in this section there is a corresponding
class hierarchy. The result is a set of small, loosely coupled class hier-
archies, where each hierarchy ideally has one well-defined responsibility,
e.g., interpolation between two configurations. Each class hierarchy orig-
inates from an abstract class, whose only purpose is to define a uniform
interface to the modeled concept. The idea is that planners that only refer
to the abstract interface of a concept can be reconfigured with any vari-
ation of that concept. Thus, making it easy to perform fair comparisons
between variations of a concept, which was one of the main motivations
behind the framework. An additional advantage is that we have in effect
got a multitude of planners at the price of one; just a reconfigure the
planner with another combination of concepts.
2.3 Dealing with Geometry 19
must also be suitable for rendering. It turns out that these two require-
ments often are incompatible, so many path planners use two set of data
structures for each geometric object; one for collision detection and one
for visualization. For some planners, the planning method itself requires
adding “embellishments” to an otherwise standard geometry represen-
tation. Examples of this are workspace potential field methods, which
require each geometric object to have a set control points attached to it,
see, e.g., Tsai et al. [149] and Barraquand et al. [13]. At each configura-
tion, the forces resulting from the potential field are summed over all the
control points to give a resultant force and torque acting on the object.
Other examples are the method of Holleman and Kavraki [56] to sample
the medial axis of the workspace and the visibility tetrahedron concept of
Hernando and Gambao [54, 55], which also need special points attached
to each geometry. In the planning method suggested by Baginski [11],
each link of the robot is covered with several protective layers of increas-
ing thickness. These layers are used both to speed up collision checks
along path segments and to provide an estimate of the penetration depth
in case of a collision.
It is clear from the above that testing different collision detection algo-
rithms and developing new planning methods is difficult if the path plan-
ner framework restricted to one single geometric representation. There-
fore it was decided that CoPP should support several geometric represen-
tations through an extensible class hierarchy, where all geometric objects
inherit from a single base class.
Geom
GetPose( )
Move(Transform t) material
Move(Vector args) Material
SetMoveFormat(format)
GetBoundingBox(…)
pose
AttachToFrame(…) Transform
Accept(GeomVisitor v)
about the (oriented) bounding box of the geometry. The bounding box
can be used to speedup collision detection, since we can quickly discard
pairs of geometries whose bounding boxes do not overlap. It is also used
to estimate the displacement of a robot link as the robot moves from one
configuration to another (see Section 2.6 on metrics).
We have found that these requirements form a base class interface that
is rich enough to be useful, yet imposes hardly no constraints on which
geometric representations we can use. The base class of all geometric
types is called Geom and it is shown in a class diagram in Figure 2.2.
From Figure 2.2 we can see that the base class declares a virtual
function Accept that takes a single argument of type GeomVisitor. This
is a prerequisite for the Visitor pattern [45], whose intent can be described
as adding new virtual methods to a class hierarchy without modifying
the hierarchy. The methods we want to add are put in a class that
derives from GeomVisitor. An example of the Visitor pattern is given
in Appendix E, where it is used to determine how to draw a geometric
object.
For visualization purposes, geometric objects also have material prop-
erties. The class Material in Figure 2.2 has the same type of data as the
material descriptions used in, e.g., VRML [8] and Open Inventor [154].
It can be used to model a wide range of effects such as shininess, specu-
lar color and transparency. Even though only a subset of these settings
are used most of the time, we have found that the transparency setting
is particularly useful for visualizing trajectories or hierarchically com-
posed geometric objects. See Figure 5.14 in Chapter 5 or Figure C.5 in
Appendix C for examples.
22 2 A Framework for Path Planning
inefficient and limiting for some type of planners. Therefore there is also
an indirect method for moving objects. This method uses the concept
of moving coordinate frames, to which we can attach any number of
geometric objects. Once an object is attached to a frame, it will move
along with the frame. As an example, we can think of a robot manipulator
with a moving coordinate frame at each joint. To make the robot’s links
move we just attach them to the corresponding frame. When we ask
a link about its current pose, it will return the pose of the frame it is
connected to.
The member function AttachToFrame takes a single Transform and
attaches the object to it. An overloaded version of this function takes two
Transform arguments, where the second argument specifies a constant
offset between the moving frame and the object. This second version is
especially convenient in pick-and-place planners, where a grasped object
suddenly moves along with the robot. When an object is grasped, the
following code makes sure it moves correctly together with the robot:
grasped_object.AttachToFrame(robot.GetEndEffectorPose(),
offset);
The second argument specifies the pose of the grasped object relative
the end effector, and is obtained from the inverse of Equation (B.6) in
Appendix B.
a finite set of convex polyhedra. Details about these classes can be found
in Appendix C.
To make it easy to construct rather complex environments and ob-
jects, we have developed a VRML-like file format in which geometric
objects can be defined. The parser for this format was developed with
Spirit parser generator framework. Spirit is part of the Boost libraries
and is briefly described in Appendix F.
detection and collision detection; the former checks wether two static objects overlap
in space, whereas the latter checks wether two objects interfere over a whole range of
motions. In this thesis we follow the mainstream and use the term collision detection
for both problems.
2.4 Collision Detection 25
Simplex-Based Algorithms
One of the most well-known simplex-based algorithms is the Gilbert-
Johnson-Keerthi (GJK) algorithm [48]. Conceptually, the GJK algorithm
work with the Minkowski difference of the two convex polyhedra. The
Minkowski difference is also a convex polyhedron and the minimum dis-
tance problem is reduced to finding the point in that polyhedron that is
closest to the origin; if the polyhedron includes the origin, then the two
polyhedra intersect. However, forming the Minkowski difference explic-
itly would be a costly approach. Instead GJK work iteratively with small
subsets, simplices, of the Minkowski difference that quickly converge to a
subset that contains the point closest to the origin. For problems involv-
ing continuous motion, the temporal coherence between two time instants
can be exploited by initializing the algorithm with the final simplex from
the previous distance computation. Experiments in [48] showed that the
algorithm has a linear complexity in terms of the total number of vertices,
except for a few degenerate cases.
Since the original paper [48], several improvements on GJK have been
published. Cameron [26, 27] suggested the use of adjacency information
for each vertex to speedup the algorithm. With this adjacency infor-
mation, the algorithm uses hill-climbing to achieve faster convergence.
This variant of GJK is often referred to as Enhanced GJK. Experi-
ments in [26, 27] involving tracking pairs of convex polyhedra showed
26 2 A Framework for Path Planning
that the computational time was close to constant when the total num-
ber of vertices was varied between 10 and 500. This almost-constant
time complexity has been confirmed by Ong and Gilbert [111, 112]. Both
Cameron [26, 27] and Ong and Gilbert [111, 112] point out that this be-
havior only holds for situations with high temporal coherence. Nagle7
pointed out that there are cases where the GJK algorithm experiences
problems with numerical errors; in situations where two faces, very differ-
ent in size, are almost parallel, the algorithm might end up in an infinite
loop. This was in fact experienced during tests of the pick-and-place
planner in Chapter 5; at certain stages of a task, the robot hand was
moving close to parallel to the table top, causing Enhanced GJK8 to cy-
cle when computing the distance between the finger links and the table.
In a later version of his implementation of Enhanced GJK9 , Cameron
addressed this problem by changing the termination condition and also
returning an error parameter. This version has not yet been tested in the
CoPP framework.
The robustness of GJK was also addressed by van den Bergen [150].
Furthermore, efficient caching of repeated dot products tend to make this
the fastest GJK implementation currently available.
To summarize:
• There are only two numerical tolerances and the algorithm is easy
to implement.
It is also interesting to note that R3 is just a special case for the original
formulation of GJK; it remains valid in higher dimensions as well.
7 J. Nagle. GJK collision detection algorithm wanted. Posted on the
Available at web.comlab.ox.ac.uk/oucl/work/stephen.cameron/distances/
9 version 2.4
2.4 Collision Detection 27
Feature-Based Algorithms
A polyhedron can be seen as a set of features, where the different fea-
tures are vertices, edges and faces. The Lin-Canny closest features al-
gorithm [89] traverses the surfaces of two convex polyhedra to find the
closest pair of features. This search is made very efficient because of the
use of precomputed Voronoi regions for each feature. The Voronoi re-
gions basically divides the exterior of each polyhedron into semi-infinite
regions, where each region belongs to a specific feature. Figure 2.3 shows
examples of the three types of Voronoi regions for a simple geometry.
Once the closest features are found and cached, subsequent queries will
run in expected constant time. Even though Lin-Canny is considered to
be among the fastest algorithms for this problem, it has several draw-
backs as pointed by Mirtich [103]. First, it cannot handle the case of
intersecting polyhedra, in which case the algorithm cycle forever. Sec-
ond, other geometrically degenerate situations can cause the algorithm
to cycle as well. Third, the implementation depend on over six numerical
tolerances that have to be tweaked. It should be mentioned though that
the implementation of Cohen et al. [32] is able to handle the intersecting
case also, but at the cost of extra pre-processing.
Addressing the shortcomings of Lin-Canny, Mirtich [103] developed
28 2 A Framework for Path Planning
CollisionPair
if (DoCollides()) {
+ Collides( ) ++num_collisions;
+ GetNumCollisions( ) return true;
- DoCollides( ) }
return false;
num_collisions
DistancePair
+ Collides(tolerance)
+ Distance( )
+ DistanceSqrd( )
- DoCollides(tolerance)
WitnessPair
+ GetClosestPoints(p1, p2)
Figure 2.4: Classes that encapsulate proximity queries for pairs of geo-
metric objects.
then it is clear that each rank extends the capabilities of the previous
one. Thus, we have in effect found an inheritance hierarchy. We have
chosen the names of the corresponding classes to be CollisionPair,
DistancePair, and WitnessPair. The name WitnessPair was chosen
because the two points that realize the minimum distance are often called
witness points in the literature [27]. A class diagram for the resulting
hierarchy is shown in Figure 2.4.
Note that none of the abstract classes have any members that refer-
ence a pair of geometric objects; it is up to the derived classes to decide
which specific type of geometric objects they accept. Still, clients can
access the pair through the base class interface using the pure virtual
functions GetFirst and GetSecond, which all derived classes have to
implement.
The class CollisionPair supports only the binary collision query
through the member function Collides. This function is implemented
using the Template Method pattern [45] to automate the computation of
statistics like the number of collisions detected for a particular pair. The
30 2 A Framework for Path Planning
++num_calls;
ObjectSet
free = DoIsCollisionFree();
+ IsCollisionFree( ) time_used += ElapsedTime();
+ GetCollidingPair(...) return free;
# DoIsCollisionFree( )
num_calls
time_used
DistEngine
+ IsCollisionFree(tolerance)
+ MinDist( )
# DoIsCollisionFree(tolerance)
# DoMinDist( )
DistEnginePQP DistEngineGJK
AddPair(GeomT1 a, AddPair(GeomConvex a,
GeomT2 b) GeomConvex b)
AddPair(GeomConvexGrp a,
GeomConvexGrp b)
AddPair(GeomConvex a,
GeomConvexGrp b)
SetTrackingMode(…)
PairBasedDistEngine
pairs
Add(DistancePair p) DistancePair
plate, using the same technique described in Appendix D.1 for the class
DistPairPQP; here reference counting techniques are used to avoid du-
plicate PQP Model objects. In the case of DistEngineGJK, we would like
to use it transparently with any combination of the convex types de-
2.5 Configuration Space Interpolation 33
(a) (b)
Figure 2.6: (a) A maze on the surface of a torus. (b) The same maze,
stretched out on a planar surface. The start and goal positions are also
drawn.
Figure 2.8: Interpolation between two ring joint values by the fraction
t ∈ [0, 1].
Interpolator
Interpolate(q1, q2, t, qt)
Clone( )
RingInterp
min_vals
max_vals
ring_joints
2.6 Metrics
The definition of a metric is of fundamental importance to any sampling-
based path planner, but it is far from clear which metric one should
choose. Using the straightforward Euclidean distance measure is in most
cases not so good, because the topology of the configuration space is often
very different from the Euclidean space. In this section we will present
some of the most commonly used metrics for path planning.
A metric space is a topological space X together with a real valued
function ρ : X × X → R (called a metric) such that, for every x, y, z ∈ X,
ρ(x, y) ≥ 0, (2.1)
40 2 A Framework for Path Planning
ρ(x, y) = 0 ⇔ x = y, (2.2)
ρ(x, y) = ρ(y, x), (2.3)
ρ(x, y) + ρ(y, z) ≥ ρ(x, z). (2.4)
h i1/2
2 2 2
ρ(x, y) = (x1 − y1 ) + (x2 − y2 ) + · · · + (xn − yn ) (2.5)
where n is the dimension of the configuration space. This metric has been
widely used, especially in path planners using a discretized representation
of the configuration space, see, e.g., Autere and Lehtinen [10] and Van
Geem [151]. The reason for its popularity is part due to its simplicity, but
also because it is better than the Euclidean distance in the case of grid-
based planners. This is due to the fact that grid-based planners rarely
allow diagonal motions, causing the Manhattan distance to be a better
estimate of the distance. Another advantage of the Manhattan metric
is its low cost, compared to, e.g., the Euclidean distance that requires a
square root. As the metric is computed many thousands of times, the
cost of computing the metric can be an important factor.
Studying a serial manipulator it is seen that moving one joint will
move all links subsequent to that joint. Thus, moving a joint closer to
the base of the robot will in most cases cause a larger gross motion of the
robot than moving a joint closer to the end-effector. The more links that
are in motion, the higher the probability for a collision. This observation
suggests that the cost for moving a joint should be higher the closer to the
base it is. The simplest way to achieve this is to extend Equation (2.6)
to a weighted Manhattan distance:
n
X
ρ(x, y) = ci |xi − yi |, ci > 0 (2.7)
i=1
Metric
Distance(q1, q2)
Clone( )
Euclidean WeightedManhattan
weights
BoundingBoxDispl QuatMetric
AddGeom(geom) rot_weight
RemoveGeom(geom)
use_summed_displ RingMetric
min_vals
geoms agent max_vals
ring_joints
AgentT weights
Geom
Move(config)
In the method of Cortés et al. [35], loops in a kinematic chains are cut.
Each cut defines an active chain and a passive chain. Random sampling
and forward kinematics are used for the active chains. The passive chains
are then forced to close the open loops using inverse kinematics. To be
really effective, this method assumes that closed-form inverse kinematics
exist for the passive chains.
Oriolo et al. [113] considered the problem of a redundant robot that
has to move among obstacles along a given end-effector path. As the
robot is constrained to follow the end-effector path, it has to use its re-
dundancy to avoid the obstacles. The sampling method in [113] generated
samples that satisfied the end-effector constraint.
ConfigSpaceSampler
GetSample(sample )
Clone( )
ObstBasedSampler UniformQuats
AllowCollFreeSamples(...) min_vals
NumDrawnSamples( ) range_vals
NumCollidingSamples( )
SetNumDirections(n)
robot dist_engine
Robot DistEngine
Move( ) IsCollisionFree( )
MinDist( )
the planner. For rigid body problems, the class UniformQuats should
be used. The translational part is drawn with a uniform distribution
from a box in Rn , while the rotational part is generated from uniformly
distributed quaternions, as described in [132].
tection at a finite number of points along each segment. The question is,
how dense do we have to sample a given segment? To keep the proba-
bility of missing a collision low, it is important that the relative motion
between each collision check is small. Thus, sampling a segment until,
e.g., the bounding box displacement between all samples is less than some
threshold seems as a good idea. The sampling of the segment could be
done in an incremental fashion, from one end towards the other. It is in
general, however, more efficient to recursively divide the path segment in
smaller and smaller segments and do a collision check on the mid-point of
each segment. For a collision free path segment, there is no difference be-
tween the two approaches. If there is a collision, however, then the second
approach will find this quicker, see e.g., Geraerts and Overmars [46].
For rigid body problems, it is possible to find exact upper bounds
on the displacement between two configurations. These bounds depend
on the object geometry, the center of rotation, and assumes that the
motion taken between the two configurations is given by interpolation
on SE(3) (or SE(2) for 2D problems). Thus, given the distance to the
nearest obstacle at two configurations, and the upper bounds on the
displacement, then a path segment is either classified as collision free
with certainty, or more samples are needed.
There are also other methods, which are not based on sampling of the
path segment. These methods have the advantage of being more exact
than the sampling based techniques, but are, on the other hand, also
more costly. Gilbert and Hong [47] formulated the collision detection
problem as a root finding problem, where the root (if it exists) is the first
collision on the given path of motion. Another technique is to check for
collision between the volumes that are swept out by the geometric objects
as they move, see Xavier [156]. A drawback with methods that use swept-
volumes is that they are conservative; two objects might occupy the same
space, but at different times. Cameron [25] suggested that the sweeping
operation instead should be done in both space and time, generating
four dimensional objects. Unless the motions are very simple, these four-
dimensional objects are difficult to construct.
necting path segment looks like, and a boolean result is all the information
needed. More elaborate local planners could, however, find a non-trivial
path connecting the two configurations. In such a case, the client must
be given the path also. A uniform interface to all kinds of local plan-
ners must hence return a path also. The resulting interface is shown in
Figure 2.13.
The general interface of the abstract class ConfigConnector makes
it easy to encapsulate almost any planning algorithm and use as a local
planner. One could, for example, use a PRM planner with RRT as local
planner, making it easy to compare the tradeoff between the speed and
connection rate of the local planner.
ConfigConnector
bool Connect(from, to, path)
metric
BinaryConnector Metric
SetMaxStepSize(…)
interp
SetToPersistent(…) Interpolator
SetMetric(…)
SetInterpolator(…) constr
SetConstraint(…) BinaryConstraint
12 This project is carried out at the Centre for Autonomous Systems by: Daniel
Path planning deals with the problem of finding motion strategies for
movable objects or articulated structures. An articulated structure can
be used to model things like, e.g., the motion of a computer animated
character, a robotic manipulator or a complex protein molecule. From
hereafter we will call an articulated structure a robot. The goal in this
chapter is to develop a general robot class that is able to represent a
wide range of different robot types. The main purpose of this class is to
maintain the kinematic structure of the robot and to provide functions
for both the forward and the inverse kinematics.
It is important to have a systematic method for describing the kine-
matic structure of a robot. The next section will go through some of the
proposed notations. It will be pointed out that the standard Denavit-
Hartenberg notation [39] is incapable of describing robots with tree-like
kinematic structures. We use instead the notation of Kleinfinger and
Khalil [68], which in the case of a serial kinematic chain reduces to that
of Denavit and Hartenberg.
To reduce the number of actuators in, e.g., robot hands and hyper-
redundant robots, it is a common design strategy to mechanically couple
joints together. To allow modeling of such robots, the proposed robot
model incorporates the concept of joint couplings, which is discussed in
Section 3.2. Another important idea of the proposed robot model is
that simple robots can be combined to form a more complex robot. For
56 3 A General Robot Model for Path Planning
See Figure 3.1 for a picture of two adjacent frames with the corresponding
parameters. In the literature, these parameters are known as the link
twist, the link length, the joint angle and the joint offset, respectively.
From Figure 3.1 we see that we can derive the transform i−1 i T , de-
scribing the pose of link i relative to link i − 1, as a concatenation of two
rotations and two translations:
58 3 A General Robot Model for Path Planning
i−1
i T = RX (αi )DX (ai )RZ (θi )DZ (di ), (3.1)
where RX stands for rotation about the X-axis and DX stands for dis-
placement along the X-axis. See, e.g., the book by Craig [36] for details.
Evaluating Equation (3.1) gives:
cos θi − sin θi 0 ai
i−1
cos αi sin θi cos αi cos θi − sin αi −di sin αi
T =
sin αi sin θi
(3.2)
i
sin αi cos θi cos αi di cos αi
0 0 0 1
For a revolute joint, θi will be the joint variable, while all the other pa-
rameters are constant. For a prismatic joint, di will be the joint variable.
The pose of the last link relative the base of the robot can be found
by applying all the transforms from the base leading up to the last link:
0 n−1
nT = 01T 12T . . . n T (3.3)
Note that this equation also gives the pose of all the intermediate links
relative the base.
One of the main reasons for the popularity of the D-H notation is its
simplicity and compactness; for any link we only need to specify three
parameters, while the fourth one is the joint variable. This compactness
comes from the clever choice of the frames fixed to each link, reducing
the number of parameters to a minimum. The main drawback of the D-H
notation is that it runs into trouble when trying to represent kinematic
chains with more than one branch, i.e., a link has more than two joints.
If a link i connected to joint i, also is connected to m other joints, the
definition of frame i fixed to this link becomes ambiguous. This is because
Xi is defined from the relationship between joint axis i and the next joint
axis. In this case we have m choices for the next joint axis. If we choose
one of the m joints, say joint j, as the next joint axis, then the frame for
link i and j can be defined as usual, but the frames for the other m − 1
one links cannot be defined using the standard D-H parameters because
Xi does not point towards the right joint axis.
To overcome this problem, Sheth and Uicker [130] (S-U) proposed a
new notation for robots with tree-like or closed loop kinematic chains.
In their method, two frames are assigned to each link, and the result-
ing transformation matrix can be seen as composed of two parts: One
3.1 Notation for Articulated Structures 59
Axis i − 1 Axis i
Zi−1 Li−1 Li
Zi
Xi
Xi−1 θi
di
ai
αi
constant transform related to the shape of the link, called the shape ma-
trix, and one variable transform representing the joint motion, called the
joint matrix. In the S-U notation, six parameters are needed to define
the shape matrix. The joint type and the joint variable defines the joint
matrix. The large number of frames, two for every link, and parameters
makes the S-U notation much more complicated than its D-H counter-
part. Because of this, the D-H notation has always been the choice unless
studying robots with loop kinematics. Thus, for a unified notation to be
used for both serial robots and robots with loop kinematics, it should re-
duce to the simple D-H notation in the case of a serial manipulator. This
is exactly the case of the notation proposed by Kleinfinger and Khalil [68]
(K-K). In the case of a serial manipulator, their notation coincide with
that the D-H notation described above, but for a branching link, two ex-
tra parameters, together with the usual D-H parameters, are needed for
each extra frame. Next we will describe the K-K notation in detail for a
tree-structure robot. In the case of a serial manipulator the notation is
60 3 A General Robot Model for Path Planning
Cγi Cθi − Sγi Cαi Sθi −Cγi Sθi − Sγi Cαi Cθi
i−1
Sγi Cθi + Cγi Cαi Sθi −Sγi Sθi + Cγi Cαi Cθi
T =
i Sαi Sθi Sαi Cθi
0 0
Sγi Sαi ai Cγi + di Sγi Sαi
−Cγi Sαi ai Sγi − di Cγi Sαi
(3.4)
Cαi di Cαi + ǫi
0 1
As pointed out by Craig [36], the D-H notation is ambiguous. That
is also the case of the K-K notation. Below some guidelines from [36]
are given for how to choose the frames in case a parameter is not well-
defined. To begin with, the choice of Zi is ambiguous, since there are
two directions in which we can point Zi when making it coincident with
joint axis i. In the case of joint axis a(j) and j being parallel, the choice
of origin location for frame a(j) is arbitrary. Here Craig [36] suggests to
choose the origin such that da(j) becomes zero, which is possible only if
the joint is revolute. If the two joint axes intersect, e.g., in the case of a
spherical wrist, then the mutual perpendicular is not defined. In such a
case, the Xa(j) axis is chosen to be perpendicular to the plane containing
both joint axes, leaving two choices for the direction of Xa(j) . The origin
of frame a(j) is located at the intersection of the two joint axes. The
choice of the first frame of the robot is arbitrary, but by convention [36]
it is chosen such that F0 = F1 when the first joint variable is zero. This
will make sure that a1 = 0 and α1 = 0. Additionally, d1 = 0 for a
revolute joint and θ1 = 0 for a prismatic joint.
It was shown in [68] that the K-K notation can also be applied in
the case of a robot with loop kinematics. By cutting each loop at an
arbitrary joint, such a robot can be transformed to an equivalent tree-
structure robot if a loop-closure constraint is added for each loop that is
cut.
Xi′
Xk
Zi dk
Li Xj
Lk
γk Zk
ǫk Lj
αj Li
θj
aj Xi
Zj
consists of eight revolute joints, where the first two control the others.
With all gear factors equal to one, we achieve constant curvature along
each segment. The resulting ’elephant trunk’, see Figure 3.3, has eight
degrees of freedom.
Modern industrial manipulators often have loops in the kinematic
structure to increase the stiffness and the payload capacity of the manip-
ulator. Solving the kinematics of manipulators with closed loop kinematic
chains is not so straightforward in that it requires satisfying the loop con-
straints. Often iterative solution techniques are required. However, there
is an important sub-class of such manipulators that we can handle with-
out, again using joint couplings. Robots like the Acma SR400 robot,
see [67], have a loop in the shape of a parallelogram. When such a par-
allelogram is deformed, all the joint angles change by the same amount.
Thus, we can model a parallelogram structure using one active joint that
control three passive joints, see Figure 3.4.
In summary, joint couplings are useful to model:
Figure 3.4: A closed loop kinematic structure where the loop constraint
is satisfied using joint couplings. Due to the parallelogram structure, the
relationships between the joint angles are linear.
be slow and they often only find one solution even if there are more.
Allowing only one solver would therefore be very inefficient for those
robots where we have a closed form solution. The method chosen here
is to have a repository of various solvers, from which users can choose at
runtime.
In the following, we will discuss a straightforward method to numeri-
cally solve the inverse kinematics problem. We will test it on two robots
and also compare its efficiency to a closed form solution. Finally we will
see how different solvers coexist in CoPP.
Ẋ = J(q)q̇, (3.5)
where the end-effector velocity is composed of both the linear and the
T
rotational velocities, Ẋ = VT ΩT . If not stated otherwise, the end-
effector velocities are measured relative the base-frame of the robot. The
column i of the Jacobian can be interpreted as the velocity of the end
effector when joint i has unit velocity and all other joints are locked. This
suggests that J can be computed column by column, looking at one joint
at a time. Below we will look at the velocity contributions to the end-
effector frame from a prismatic joint and a revolute joint, respectively.
The velocities will be expressed relative the end-effector frame itself, but
a simple transformation can express them relative another frame, e.g.,
the base frame.
Joint k produces the linear velocity nVk and the angular velocity nΩk
with respect to the frame n. If joint k is prismatic, then we have [67]:
n
Vk = nZk q̇k (3.6)
n
Ωk = 0 (3.7)
n
Vk = ( nZk × nPk ) q̇k (3.8)
n
Ωk = nZk q̇k (3.9)
where nPk is the vector connecting frame k to frame n, expressed relative
frame n. Introducing the auxiliary variable σk , which is defined as
(
1 if joint k is prismatic,
σk = (3.10)
0 if joint k is revolute,
the velocity contribution from joint k can be written in the general form
n
Vk = [σk nZk + σ̄k ( nZk × nPk )] q̇k , (3.11)
n
Ωk = σ̄k nZk q̇k , (3.12)
where σ̄k = 1 − σk . Summing the contributions from all joints from the
base to the end-effector, we get the Jacobian as
σ1 Z1 + σ̄1 (Z1 × P1 ) . . . σn Zn + σ̄n (Zn × Pn )
J= , (3.13)
σ̄1 Z1 ... σ̄n Zn
where the leading superscript n has been omitted.
The Jacobian for a system with coupled joints will of course be dif-
ferent from that of the non-coupled system, since one or several joint
velocities can expressed in terms of the velocity of other joints. If we
have a coupling between joint i and j such that θ̇i = kij θ̇j , where kij is
the ’gear factor’, then it is seen that the jth column is given by
cj = cj + kij ci . (3.14)
The ith column is discarded, so each coupling will reduce the number of
columns in the Jacobian by one.
Note that the Jacobian in Equation (3.13) gives the end-effector ve-
locities relative to the end-effector frame itself. To express the velocities
relative another frame Fi , we can use the following transformation:
i
nR 03×3
i nJ,
J= (3.15)
i
03×3 nR
where inR is the rotation matrix that specify the orientation of the end-
effector frame relative frame Fi and nJ is from Equation (3.13).
3.3 Inverse Kinematics 67
∆X = J(q)∆q. (3.16)
This differential model will remain valid as long as ∆X and ∆q are
sufficiently small. The main idea when solving the inverse kinematics
numerically is to let ∆X be the pose error and use Equation (3.16) to
solve for ∆q. However, just inverting J will in most cases not work
because the Jacobian may not be a square matrix. Instead we use the
pseudo-inverse J+ of J. The pseudo-inverse of a matrix does always exist
and can easily be computed from the singular-value decomposition of the
matrix, see Golub and Van Loan [49]. Furthermore, it can be shown that
the solution given by
∆q = J+ ∆X (3.17)
minimizes the residual kJ∆q − ∆Xk2 .
Let 0nTd denote the desired end-effector pose and 0nTc the current end-
effector pose. The following algorithm can be used to iteratively solve for
the inverse kinematics problem [67]:
• Compute the position error dXp and the rotation error dXr , rep-
resenting the difference between 0nTd and 0nTd .
• If dXp and dXr are sufficiently small, then qd = qc and the itera-
tion terminates.
If it were not for the two thresholds Sp and Sr , this algorithm would
run efficiently for robots of any size. According to [67], the values 0.2
meter and 0.2 radians are acceptable for industrial robots of typical size.
For robots considerably larger, these small thresholds would make the
algorithm take small and inefficient steps. Tests by the author have also
shown that too large values of Sp and Sr can make the iterations vary
erratically a long time before converging.
Another, more severe drawback of this simple algorithm is that takes
no concern to joint limits; the solution provided by the algorithm can
have joint angles that are outside the valid range for the robot. As long
as the robot only contains revolute joints with 2π range this is not a
problem, but as the joint ranges decrease, more and more solutions fall
outside the joint limits.
Table 3.1: A comparison between the closed form solution and the gen-
eral solver for the Puma inverse kinematics. The table shows the outcome
of 10,000 calls to each solver. Only once did the general solver fail be-
cause it exceeded the maximum number of iterations. The other times it
failed because the generated configuration violated the joint limits.
also more reliable; it never failed to find a solution, and it reported on av-
erage 4.8 solutions to each query. The general solver, on the other hand,
did not look for multiple solutions and found a solution to only about
half the queries. The reason for the high failure rate is the algorithms
ignorance of joint limits; only once did it not converge to the desired
end-effector pose within the maximum number of iterations, which was
set to 1,000, but the found solution often violated the joint limits of one
or more joints. It was found that repeating a failed query with a different
initial configuration made the solver converge to a valid configuration.
of them. In all cases, the solver had no problem with the joint limits
until the robot reached really awkward configurations. Also, the motions
were generated very quickly as the number of iterations for solving each
step was usually very small. An example is shown in Figure 3.5, where
the end-effector was forced to follow a linear path with constant orienta-
tion. This shows that the general solver can be of use in path planning
problems where we have constraints on, e.g., the orientation.
Obelix
13 DOFs ArmHand
10 DOFs
XR4000
3 DOFs Puma
6 DOFs
BarrettHand
4 DOFs
Figure 3.6: The left-hand figure shows Obelix, a mobile robot that
consists of a Nomadics XR4000 platform, a Puma manipulator and a
Barrett hand. The right-hand figure shows how this complex robot can
be seen as a composition of simpler robots. The arrowheads show how
sub-robots are attached to each other. Moving a sub-robot will cause any
attached robot to follow the motion as a single rigid body.
Robot
Move(config)
Inverse(pose, solution)
SetInvKinSolver(solver)
NumDOF( )
GetEndEffectorPose( ) subrobots
SubRobotsBegin( )
nodes
RobNode
parent->GetWorldFrame() *
pose_rel_parent;
Update( )
GetWorldFrame( ) parent
GetGeoms( )
link_geoms AddChild(node)
Geom
Accept(RobNodeVisitor v)
AttachToFrame(...)
Transform pose_rel_parent children
RobJoint RobFrame
SetParams(joint_params) RobFrame(pose_rel_parent)
SetJointType(…)
SetJointValue(val)
GetLimits(min, max)
coll_table {
"puma_link_0" and "puma_link_3"
"puma_link_0" and "puma_link_4"
"puma_link_1" and "puma_link_4"
}
Figure 3.8: An example of a (partial) self-collision table for a Puma
robot.
3.5 Design and Implementation 75
InvKinRepository
+ RegisterSolver(name, solver)
+ GetSolver(name) ...
+$ GetInstance( ) return *the_instance;
- InvKinRepository( )
$ InvKinRepository* the_instance
solvers
InvKinSolver InvKinSolution
operator( )(Transform p, InvKinSolution s) + NumSolutions( )
Clone( ) friend + GetSolutions( )
SetRobot(robot) - Clear( )
- AddSolution(config)
num_solutions
RobotConfigs configs
PumaKinematics InvKinNull
InvKinGeneral
Figure 3.9: Class diagram for the inverse kinematic solvers. Note that
the constructor of InvKinRepository is private, which is key to enforce
a single unique instance.
SetInvKinSolver(aSolver)
Clone( )
new PumaKinematics(*this)
SetRobot(*this)
namespace {
const PumaKinematics prototype;
const bool isRegistered =
InvKinRepository::GetInstance().RegisterSolver("Puma560Kin",
prototype);
} // anonymous namespace
Figure 3.11: An example of how a specific solver for the inverse kinemat-
ics is added to the repository of available solvers. Note that this code is
executed at program startup, before main is entered. The solver can now
be retrieved anywhere in the program using the string “Puma560Kin”.
position of the hydraulic cylinder, there exists closed form solutions for
the other joint angles.
Other joint couplings could be used to enforce closure constraints, in
which case we a robot with loops in the kinematic structure. It is rec-
ommended, however, that closed-loop kinematic chains are modeled in a
class that is separate from the Robot class: Putting to much functionality
in one and the same class can make it monolithic, complicated, and hard
to use.
A file format and a parser have been developed to allow robot de-
scription files. An example of a description file is given in Appendix E.5.
Chapter 4
Augmenting
RRT-Planners with Local
Trees
qrand
qnear
qnear
qstop
qrand
(a) (b)
Figure 4.1: (a) If the path between qnear and qrand is collision free, then
qrand becomes the new vertex in the tree. (b) If the path to qrand is not
collision free, then qstop becomes the new vertex. Note that the random
configuration need not be collision free.
of T . This means that vertices with a large Voronoi cell2 have a larger
probability for being extended. This is a good property as large Voronoi
cells represent unexplored areas of the configuration space. It was also
shown in [72] that in the limit, as the number of vertices in T tend to
infinity, the coverage of the configuration space is uniform. Thus, in the
limit there is no bias.
In the case of a dynamic system, connecting two configurations can
lead to a nontrivial control problem. For such problems, the growth
towards qrand is incremental, applying a constant control signal over some
time interval ∆t. Wether this strategy reach qrand is not so important;
the growth of T is still Voronoi biased. For examples on kinodynamic
planning with this approach, see [78, 81, 82].
If Connect fails due to, e.g., a collision, it adds the last collision free
configuration to the tree T and returns false. If no collision was found,
the two configurations were connected and the function returns true. The
metric M1 is hereafter denoted nearest-neighbor metric.
Note that it is possible that no new vertex at all is created by a call
to Connect. This happens if the vertex in T is so close to an obstacle
that no step is possible towards the other configuration. Thus, in a real
implementation, we would have to check that Connect actually created
a new vertex. To avoid cluttering the algorithm description, we have
chosen to omit such details.
valuable sample if the active tree could not connect with it. Using uni-
form sampling, the planner will probably have to wait a long time until
such an important sample shows up again. The problem becomes even
more pronounced if the solution trajectory has to pass a series of narrow
passages: Once a tree is finished struggling with a passage, the next one
is waiting just around the corner. PRM methods, on the other hand,
have the advantage of being able to treat the narrow passages more or
less in parallel. In a recent paper by Akinc et al. [3], it was noted that in
environments with thin obstacles, like the “hole” problem in Figure 4.8,
RRT planners tended to produce many configurations that were stuck
near the obstacle. They also reported that for such situations, the cost
for a single RRT query approached the cost of the preprocessing phase
for the PRM method.
Based on these observations, is clear that RRT methods need to take
better care of samples that fall into crucial, but hard to reach, regions.
The basic idea that is proposed here, is to let important samples that
cannot yet be reached, spawn a new tree. Such a tree will be called a
local tree, and the start and goal trees will hereafter be denoted global
trees. A local tree is also an RRT, and as it grows it will eventually reach
outside the “hard to reach” region and merge with one of the global trees.
Even two local trees can be merged together if they get close. There are
three important issues that arise when implementing this idea:
In the extreme case, letting every sample start a new RRT and then
connecting it to, say, the nc closest vertices, the algorithm would simulate
the behavior of the PRM approach. This was pointed out by LaValle and
Kuffner [82]. In this case it is clear that the RRT approach would loose
its qualities as an efficient single-query planner, and we would be better
off using a PRM based method right away. So clearly, there is a tradeoff
between how much time the planner should spend on the two global
trees and on the local trees, respectively. In the next section, effective
heuristics, addressing the three issues above, will be discussed. A new
algorithm, called RRT-LocTrees, is also presented.
86 4 Augmenting RRT-Planners with Local Trees
“hard to reach” area, making it reachable for other trees. The measure
of advancement chosen here is the volume of the box bounding of an
RRT in the configuration space. Thus, each time the bounding box of a
tree grows, the new node will be used for connecting this tree with every
other tree. Since it was this node that caused the growth of the bounding
box, it will in some sense be in the “frontier” of that tree. Experimental
results have shown this to be a very effective heuristic. However, it was
found out that an even greater effect was achieved if this heuristic was
extended such that inter-tree connections are tried also if the sample was
successfully reached by a local tree.
The rules described above have been used to implement a new RRT
algorithm, called RRT-LocTrees. The main loop of the algorithm builds
on the RRT-ConCon algorithm[72], described in Figure 4.2. The new al-
gorithm is described in Figure 4.3. To simplify the description of the algo-
rithms, it is assumed that all trees, both global and local, can be accessed
by an index. Furthermore, the two first trees will always be the start tree
and the goal tree. Two important sub-routines, GrowLocalTrees and
TryMerge, are described in Figure 4.4 and Figures 4.5, respectively.
88 4 Augmenting RRT-Planners with Local Trees
GrowLocalTrees(q, Tact )
1 if not IsSatisfied(q) then return;
2 for all Ti : Ti 6= Tact do
3 reached = Connect(Ti , q, new nodei );
4 if reached or Ti .BoundingBoxGrew() then
5 for all Tj : j > i, Tj 6= Tact do
6 TryMerge(Ti , Tj , new nodei );
7 if reached then return;
8 end for
9 if (num loc trees < Nloc ) then
10 CreateLocalTree(q);
11 return;
TryMerge(Tc , Td , node in c)
1 if Connect(Td , node in c, new node) then
2 Tc .Merge(node in c, Td , new node);
3 delete Td ;
4 end if
5 return;
4.3 Experiments
To study the benefits of using local trees, two planners were tested on
a set of four path planning problems. The first planner uses the orig-
inal RRT-ConCon algorithm, while the second planner uses local trees
together with the heuristics presented in Section 4.2. Both planners were
implemented using the CoPP framework, described in Chapters 2 and 3.
The function Connect, see Section 4.1, uses a local planner to try to con-
nect two configurations. As local planner, the class BinaryConnector,
see Section 2.8.3, was used. This local planner uses a metric M2 and
a given maximum step size to determine how densely a path segment
has to be checked for collisions. The metric M2 is here denoted path
metric. If not mentioned otherwise, the BoundingBoxDispl metric from
4.3 Experiments 89
2D-maze A unit square with 2 DOFs must move from the lower
left corner to the upper right. Even though this problem only is two-
dimensional, RRT-ConCon has trouble because of the many narrow pas-
sages: Each entry to a narrow passage poses a problem. Albeit not so
difficult, but there are many of them. Thus, this is an ideal setting for
testing the approach presented here, as each local tree can grow in its
own corridor, waiting for another to come near. The width of each nar-
row passage is 1.6 times the width of the square. The overall size of the
maze is 100 × 100. The problem is available in the MSL distribution. As
nearest-neighbor metric, the squared Euclidean distance was used. This
was metric was also used as path metric.
From Table 4.1 it is seen that without local trees, the planner suc-
ceeded at all 100 attempts, but the planning times are not so encouraging.
As expected, adding local trees has a very positive effect on this problem,
and the performance is rather insensitive to variations in the parameter
pgrow . The runs showed that the number of local trees quickly reached
the maximum value and then dropped again as the two main trees cov-
ered more and more of the maze. As can be seen in Figure 4.3, pgrow
will not have any effect as long as the maximum number of local trees
90 4 Augmenting RRT-Planners with Local Trees
is not reached. Since Nloc is reached only for a very short time for this
problem, the lack of influence from pgrow can be explained.
Hole A well-known test scene, see Geraerts and Overmars [46]. The
moving object is composed of four identical boxes of dimension 100 ×
20 × 20, such that the bounding box of the final object is 100 × 100 × 180.
The hole has dimension 100 × 100 and the plate thickness is 20. To get
through the hole, the object has to rotate in a complicated manner. Since
the passage in the configuration space will be long and winding, getting
a local tree inside the passage will greatly speed up the solution process.
It was noted that most unreachable samples actually corresponded to
configurations where the object was tightly pressed to the surface of the
plate. The local trees arising from such samples were merged with the
start or the goal tree in just a few iterations, and the maximum number of
local trees was seldom reached. That explains the lack of influence from
pgrow on this problem. Once a local tree was really inside the passage,
the solution was quickly found.
4.3 Experiments 91
Figure 4.6: A maze with narrow corridors for a square with 2 DOFs.
Figure 4.8: The rigid body has 6 DOFs and it has to pass through the
hole.
Figure 4.9: The 6 DOF Puma manipulator has to move the wrench from
the upper shelf to the lower shelf. Note the nearby wall and cupboard,
limiting the manipulator workspace.
methods: The Voronoi biased growth of the trees and the incremental
connection with other nodes. The latter property is important for non-
holonomic systems. The important issues with this method seem to be
when to create a local tree, how often the local trees should be allowed
to grow and when to look for inter-tree connections. Spending too much
time on local trees will make the planner loose its benefits as an efficient
single-shot planner. In this paper, simple and powerful heuristics have
been proposed for both these issues, with one parameter controlling the
amount of time spent on the local trees. Although the experimental re-
sults are few so far, the trend seems to be that local trees have a large
impact on difficult problems, even if the parameter pgrow is very small,
say pgrow ≈ 0.05. This suggests that augmenting RRT-planners with
local trees, using a small value on pgrow , will have a large effect on the
overall performance.
According to one of the most efficient heuristics, inter-tree connec-
tions should be attempted whenever the bounding box of an RRT grows.
It could, however, be expected that the efficiency of this heuristic de-
creases as the dimension of the configuration space increases; for high-
dimensional configuration spaces, this heuristic can cause too many con-
nection attempts with other trees, as new vertices are likely to increase
the bounding box.
94 4 Augmenting RRT-Planners with Local Trees
Table 4.1: Experimental results for the four problems and various values
for pgrow . Each problem was run 100 times. Note that pgrow = 0.0
corresponds to the original RRT-ConCon algorithm, which does not use
any local trees.
Figure 4.10: The “Flange” benchmark. The goal is to separate the two
parts. Because the start configuration is severely constrained, an efficient
planner should focus more on the start tree.
Planning of
Pick-and-Place Tasks
presents the grasp generator concept. This concept is useful for separat-
ing the grasp planning process from the rest of the planner. In Section 5.4
we present a general preprocessing method that help to make the problem
easier for the path planner. The method uses the inverse kinematics of
the robot to convert the start and goal configurations into configuration
space trees. Section 5.5 deals with how motions for the arm and hand
are generated. Section 5.6 presents a simple approach for handling task
constraints.
Due to the random nature of the algorithm, the produced paths are
often jerky and unnecessary long. To be of real use, smoothing of the
solution is necessary. Section 5.7 deals with methods for path smoothing.
In Section 5.8, we will show examples of various tasks solved by the
planner. Two of these examples involve task constraints. The chapter
ends with a summary.
less heuristic planner (no restrictions on the last three joints, no dis-
cretizations of the configuration space).
Nielsen and Kavraki [109] extended the Probabilistic Roadmap
(PRM) framework to handle manipulation planning. Their approach
uses the manipulation graph, whose edges can be transfer paths, or tran-
sit paths. A transfer path corresponds to a movement where the task
object is grasped and moves along with the robot. Accordingly, a tran-
sit path is a path where the object is left in a stable position and only
the robot is moving. The generality of the approach makes it possible to
solve tasks that require several regrasping motions. However, the planner
has to be initialized with a user-defined set of grasps and stable object
placements.
Whereas the approach in [109] used a discrete set of grasps and ob-
ject placements, the recent approach by Siméon et al. [134] could handle
continuous sets. A continuous set of grasps assumes that the set can be
parameterized by some coordinates. This is easy in cases that involve,
e.g., a parallel-jaw gripper and a bar with rectangular cross section. The
example in [134] showed that the planner is capable of finding solutions
to problems that require long sequences of regrasping motions.
The planner presented in this chapter has many similarities with the
one presented in [71]. However, as will be pointed out in Section 5.9, this
planner introduces several improvements over the one in [71].
• Grasp: Close the fingers around the task object to form a stable
grasp.
This decomposition is similar to that used in [93] and [71]. There is, how-
ever, a class of pick-and-place tasks whose solution cannot be described
by the proposed task decomposition; some tasks require the robot to
move the object to an intermediate position, and then regrasp it to suc-
ceed. Figure 5.1 show some sequences from a task where the robot is
supposed to reposition a cylindrical object. To succeed with the task,
the robot must place the cylinder at an intermediate position where it
can be regrasped, see Figures 5.1 (c) and (d). We will argue here that
tasks that need such regrasping operations can be solved if an interme-
diate pick-and-place task, instantiated by a high-level planner, is solved
first. Thus, any complex pick-and-place task can be decomposed into a
sequence of simpler tasks that follow the proposed task decomposition.
The reason for this distinction between simple and complex pick-and-
place tasks is that regrasping operations in general will require a lot of
semantic knowledge, specific to the environment and the task. First, a
regrasping operation might involve placing the object at some position
other than the start- or goal position. This requires knowledge about
which surfaces in the environment are suitable as support for the object.
Second, cluttered environments could require the robot to reposition an
obstacle, which requires knowledge about which obstacles are moveable.
For these reasons we think that planning of regrasping operations should
take place at a higher level than planning of the basic pick-and-place
task. However, we think it is important that the planner solving the
basic pick-and-place task is capable of communicating the reason of an
eventual failure to the level above. This information can help the higher-
level planner to, e.g., look for a regrasping operation, or reposition a
moveable obstacle. In Section 5.3.3 we look at how statistical informa-
tion that is gathered during the grasp planning process can be used to
detect if a regrasping operation is needed.
(a)
(b) (c)
(d) (e)
of stable grasps, where each grasp satisfies the task constraints at both
the start and the end position. In case of an eventual failure (i.e., the
set of grasps is the empty set), grasp generators can provide higher-level
102 5 Planning of Pick-and-Place Tasks
Modified RRT-Planner Once a grasp has been chosen, the arm con-
figuration at the pick up and put down position, respectively, can be
found solving the inverse kinematics problem. Similar to [71], we use
the RRT-ConCon (see also Section 4.1) algorithm to find motions that
connect these configurations, with some important changes. First, we
consider multiple goal configurations instead of a single goal configura-
tion. This will cause the planner to try to connect the start tree to several
goal trees in parallel. Second, the start and goals are no longer points
in C, but trees. The old planner interface is a special case of this new
interface, because a point in C can bee seen as a configuration space tree
with just one node. The possibility to initialize the RRT-planner with
entire trees will be extremely useful; as described in the following para-
graph, a specialized planner can be used to preprocess the start and goal
configurations into trees that will both accelerate the RRT-planner and
help it generate more efficient motions.
x
Figure 5.2: In the vicinity of the cylinder the hand can only move in the
directions spanned by the y-axis and the negative z-direction. Movement
in the x-direction is blocked by collisions between the fingers and the
cylinder surface.
• The grasp must be reachable at start and end positions, i.e., there
must exist solutions to the inverse kinematic problem for the arm.
The produced grasps should also have the property that they are distinct
or far away from each other in some sense. As an example, when trying
to grasp a box, grasps that approach different sides of the box can be
considered to be distinct with respect to each other. A grasp planner
working after these principles is hereafter called a grasp generator. Note
that the extra constraints in general simplify the grasp planning process
because they can be used to effectively limit the search space.
primitive shape. From each grasp starting position, the fingers were
closed around the accurate object model, and the resulting grasps were
evaluated and sorted using the grasping simulator GraspIt! [101, 99].
We use a similar approach for the examples in Section 5.8. However,
our approach is simplified by the assumption that the primitive shapes
exactly describe the task object.
In summary, there exist many approaches that can be used for imple-
menting a grasp generator, but a general one-for-all approach is yet to
come. Therefore it is desirable to have a modular approach so that the
grasp planning strategy can easily be changed without affecting the pick-
and-place planner. To achieve this goal we must define an interface to
which all grasp generators must adhere. This is done in the next section.
• A robot hand for the grasping analysis (unless the grasp generator
is a database of precomputed grasps)
GraspGenerator
+ bool NextGrasp(grasp)
+ NumCollFreeAtStart( )
+ NumCollFreeAtGoal( )
+ NumReachableAtStart( ) if(DoGraspIsCollFree()) {
+ NumReachableAtGoal( ) ++num_reachable;
# GraspIsCollFree(…) return true;
- DoGraspIsCollFree(…) }
# GraspIsReachable(…) return false;
- DoGraspIsReachable(…)
num_reachable_start
…
GraspGeneratorA GraspGeneratorB
Figure 5.3: From the client’s point of view, a grasp generator is a just
a sequence of grasps, which are retrieved with the NextGrasp method.
The base class uses the Template Method pattern [45] to automate the
process of collecting important statistics, such as the number of reachable
grasps that were found.
clients can modify the original plan in order to find a solution. We have
found that useful information can be gathered during the generation of
the set of grasps. Therefore GraspGenerator maintains and provides the
following data:
If a grasp generator fails to produce any grasps, a client can diagnose the
cause of failure using the statistical data. For example, if the number of
reachable grasps at the initial pose is zero the object is probably too far
away to be grasped. If the arm is mounted on a mobile platform, the
client can try to plan for a better position of the platform. If the object
instead is reachable but there are no collision free grasps at the initial
pose, the object is probably in a very cluttered area of the workspace.
A modified plan could then involve repositioning (if possible) one of the
obstacles that are in the way. As a final example, if there are no collision
free grasps at any of the possible end poses of the object, the task could
be modified by
1. Repositioning (if possible) one of the obstacles that are in the way.
2. Inserting an intermediate position where the object is regrasped
(see Figure 5.1 for an example).
Thus, by collecting simple statistics during the grasp planning, important
information can be conveyed to the client.
A drawback with this approach of gathering statistics is that it re-
quires implementers of the derived grasp generator classes to remember
to update every variable when appropriate. Because this would be very
error prone indeed, the base class should make sure that these variables
are updated at the right time in all derived classes. It is clear that the
statistics variables must be updated each time we check if a grasp is col-
lision free or if a grasp is reachable. We have here used the Template
Method pattern [45] to automate the process of updating all the statisti-
cal variables. The pseudo-code in Figure 5.3 gives an example how it is
5.4 Retract Planners 109
done for one of the variables. This way we ensure that the statistics are
updated correctly for all derived classes and that there is no (easy) way
of circumventing this mechanism.
that each node in a retract tree requires solving the inverse kinematics of
the robot arm. This is not a problem as each Robot object, see Chapter 3,
is assured to at least have a numerical implementation of the inverse
kinematics. It is clear though that, using the inverse kinematics, each
node in the retract tree is more expensive than a node produced using
the forward kinematics, as in PRMs and RRTs. However, this is nothing
but the ever occurring tradeoff between computational cost and node
quality: Here we pay slightly more for each node in the hope that its
placement will be much more effective compared to a randomly produced
configuration.
A retract planner is useful not only for finding paths for grasping an
object, but also for finding lift-off and put-down directions when holding
the object: When we put down an object on a planar surface, the last
part of the motion is often parallel to the normal of the surface and
the orientation of the object is roughly constant. Thus, retract planners
can be applied even to this case, with the only difference that we use
another set of retract directions and that the hand is closed. In the same
way we see that retract trees are useful in every transition of the pick-
and-place task: from approaching to grasping, from lift-off to transport,
from transport to put-down, and so on. Figures 5.4 and 5.5 show some
examples of computed retracted trees. We see that the size of a retract
tree can be used as a rough measure of how constrained the robot motion
is at the root configuration of the tree.
If nodes of a retract tree reach into areas with much free space, the
job is made much easier for a subsequent RRT-planner as it will quickly
bridge large open areas and connect with the retract tree. Note that
a retract tree in no way limit or constrain the possible solutions; it is
just a guide whose advice can be followed or ignored. In this thesis we
use retract trees together with RRT-planners, but they could as well be
used with PRMs. In a PRM setting, retract trees can be seen as initially
disconnected components of the roadmap graph.
(a) (b)
Figure 5.4: Retract trees with the hand in preshape configuration. Both
figures show the same grasp, but the arm configurations are different,
resulting in retract trees of different size. Note that some retract tree
nodes are hidden inside the arm links.
(a) (b)
Figure 5.5: Retract trees with the hand in grasp configuration. In (a)
we see that without any surrounding obstacles, the resulting retract tree
becomes very large. In (b) two obstacles have been placed around the
task object, resulting in a much smaller retract tree.
trees is almost negligible. The worst case scenario for this approach is
when there exist a solution for all the provided goals, but each solution
is very hard to find. In this case, testing each goal separately would be
faster.
Multiple goals appears frequently in the context of pick-and-place
tasks, not just because of multiple solutions to the inverse kinematics
problem. As mentioned in Section 5.3, grasp generators are used to
generate a set of grasps that are compatible with the task. The approach
with multiple goals allow us to treat multiple grasps in parallel as well.
Real-life tasks will most likely involve a goal region for the task object,
rather than a single goal point. For example, a task specification like
“Put the book on the top shelf” will leave it to the service robot to
find suitable regions on the shelf, where it can place the book. Goal
regions also arise naturally for objects like cylinders, that have rotational
symmetry; if the rotation around the cylinder’s axis is irrelevant to the
task, then the goal position together with the possible orientations form
a goal region. The idea here is that goal regions can be approximated
with a few representative positions and orientations.
So, we have seen that allowing multiple goals is useful in situations
where we know where we start but are not sure about where to go. There
is of course a limit, where too many goals will make the query inefficient.
A general guideline is not to add a goal that is close to an already existing
goal. In the examples in Section 5.8, the number of goals varied between
one and twenty. The exact number depended on the number of specified
goal positions, the number of grasps and arm configurations for each
grasp. In Section 5.5.3 we will see that multiple goals are also useful in
case we reach a dead-end and have to backtrack, or if we are interested
in optimizing the solution.
• Accelerated planning: The retract tree will often lead out from nar-
row and difficult passages, making the task easier for the RRT-
planner.
114 5 Planning of Pick-and-Place Tasks
5.5.3 Backtracking
Assume that we have found an approach path and a transport path, but
the planner fail2 to find a return path. That means that the chosen
arm configuration or object end position was a dead-end, and we have
to backtrack to find a new transport path towards another end configu-
ration. With a minor change to the RRT-planner, we can reuse much of
the work done while planning the transport path. If the planner retains
its state (i.e., start and goal trees) when returning a solution, we can
easily go back and ask it to resume planning. If one of goals turned out
to be a dead end, we can tell the planner to delete the corresponding goal
tree. If the planner has no goal trees left, then we backtrack one step
further, i.e., we try to find a new approach path. A backtracking step is
illustrated in Figure 5.6.
Thus, if we use multiple goals and retain the planner state, we can
deal more efficiently with backtracking events than if goals would be
treated sequentially. Furthermore, if we are interested in improving our
solution, we can resume planning for each remaining goal and keep the
best solution.
2 Note that the notion of planner failure will again require that each RRT-query is
given a limited amount of resources. However, in this case we do at least know that
none of the provided goal configurations has a simple solution path.
5.5 Planning of Arm and Hand Motions 115
return path
start release
grasp home
Figure 5.6: For each transition the planner does a bidirectional search
towards multiple goals. Here the current release configuration is a dead-
end as the absence of dashed lines indicates that there exists no return
path. Therefore the planner must backtrack to find another transport
path. Because each RRT-planner saves its internal state from the last
query, a new path is likely to be found in shorter time than the first one.
ArmHand
Arm Hand
introduce a new geometric object that is the union of the task object and
the hand, then we could significantly reduce the number of moving parts.
This has shown to be useful for the examples in Section 5.8, as much time
was otherwise spent on checking for self-collisions between the numerous
finger links and the arm.
|pi − qi |
ρ(p, q) = max , (5.1)
1≤i≤n σi
where n is the number of joints and σi is the maximum speed of i. In [58]
it was shown that the minimum-cost path is a path that is locally straight
at each point where it is not touching a C-obstacle. In fact, this holds
for all cost functions that are equal to the path length according to some
metric. This is due to metrics obeying the triangle inequality:
ν2
ν3
γ0
ν1 γ̂
γ1
ν4
ν0
Figure 5.8: The dashed path is the result after one iteration of the
Shortcut algorithm in [58]. In this case, the algorithm will stop in the
next iteration because the dash-dotted path is not collision free, resulting
in a path far from the optimal path γ̂. The figure is adopted from [58].
Shortcut procedure alone will in most cases not give satisfactory results;
as shown in Figure 5.8, Shortcut can get stuck at an early stage. To
address this, extra via-points νl and νr are inserted around each via-point
νi , 1 ≤ i ≤ N . The points ν1 and ν2 are initially chosen as the midpoints
of (νi−1 , νi ) and (νi , νi+1 ), respectively. If the line segment (νl , νr ) is not
collision free, then the points are iteratively moved closer to vi using a
bisection method. After the extra via-points are inserted, Shortcut is
applied again. The procedure stops when no further improvement, in
terms of the path cost, is obtained. The resulting algorithm, which is
called Adaptive Shortcut, is described in more detail in [58].
As shown in Section 5.8, Adaptive Shortcut performs well in terms
of speed and quality. However, it was found that running the algorithm
many iterations tend to produce paths where the manipulator moves close
to obstacles. This is a natural result of Adaptive Shortcut because it con-
verges to a path that at some places is tangent to C-obstacles. Although
we are looking for short paths, we do not want the manipulator to move
unnecessarily close to obstacles; there must be some clearances between
the manipulator to allow for position and control errors. Therefore it
is important to ensure that Adaptive Shortcut stops before the solution
path is made tangent to some C-obstacle. If we use collision detection al-
gorithms that are capable of solving distance queries, a natural approach
would be to simply add a distance threshold; if the distance between
the manipulator and an obstacle becomes smaller than this threshold,
then Adaptive Shortcut terminates. For manipulation tasks, however,
120 5 Planning of Pick-and-Place Tasks
ν0
ν2
ν1
ν3
ν4
with this approach we have no control over the velocities at the transi-
tions from approaching to grasping and from grasping to returning. If we
instead apply the method to the three sub-paths, before concatenating
them, then we have precise control of the transitions. Mostly we want
to have ease-in and ease-out behaviors, such that the robot comes to
a stop when grasping or releasing the object. In such a case the joint
velocities are zero at the transitions. If we are instead are interested
in motions where the object is grasped and moved in a continuous mo-
tion, i.e., the robot does not stop during grasping, then the following
approach can be used: We form the end-effector velocity based on the
approach and departure directions.5 The magnitude of the end-effector
velocity is a user-determined value. Based on the end-effector velocity
and the manipulator Jacobian at grasp configuration, the corresponding
joint velocities can be computed. Note that this assumes a non-singular
Jacobian. With these non-zero velocity constraints, we can now make the
robot to grasp the object in one sweeping motion. The resulting path will
be C 2 -continuous everywhere except at the transition from approaching
to grasping, where it would be C 1 -continuous.
5 This method assumes these two directions are roughly similar. Thus, it would not
work for a grasp from above, where the hand first move downwards and then lifts the
task object.
5.8 Examples 123
5.8 Examples
In this section, the pick-and-place planner is tested on some problems,
ranging from easy to difficult. For each example we present the average
time required for: planning, smoothing, and spline conversion. If not
stated otherwise, the average values are based on 40 trials for each prob-
lem. In the planning time is included the time for grasp planning and for
generating the retract trees.
As mentioned in Section 5.5.1, the planner is able to plan toward
several possible goals at the same time. For the approach phase of the
task, the goals are determined by the ten first grasps delivered by the
grasp generator.6 These grasps are known to be compatible with the
6 In case no solution is found for these ten first grasps, the planner will backtrack
1.5
0.5
θ [rad]
0
−0.5
−1
−1.5
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
time
(a) joint 4
1.4
1.2
0.8
θ [rad]
0.6
0.4
0.2
0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
time
(b) finger joint
Figure 5.10: Comparison of the piece-wise linear solution and the spline
solution for a pick-and-place task. The triangles show extra via-points
that were inserted because the spline solution was not collision free. The
grasping time and the release time are shown by the dotted lines.
5.8 Examples 125
Figure 5.11: A simple task where the robot has to put the cylinder on
the table.
initial pose and the final pose(s) of the task object. For the transport
phase, the number of goals is determined by the task and the currently
active approach path. For the return path, the number of goals is equal
to the number of “home configurations” for the arm. In these examples,
the home configuration is equal to the initial configuration of the arm.
The results are presented in Table 5.1. All examples were run on a
1.2 GHz Pentium 3 processor, and PQP was used for collision detection.
be placed. For this task, the robot must grasp the workpiece close to
one of the ends; a grasp on the middle of the cylinder would make the
insertion into the fixture impossible. The results for this task is shown in
Table 5.1. Note that the spline conversion results are missing in the table:
The tight fitting between the cylinder and the fixture will cause the spline
path to collide repeatedly with the fixture. Hence, spline conversion was
not used for this task. If desirable, the approach path and the return
path can still be converted to splines with a low cost.
This is an example that shows the usefulness of the retract trees and
the use of multiple goals; without the use of retract trees, the planner
would spend several minutes on inserting the cylinder into the fixture.
Furthermore, for the chosen grasp, the end position in the fixture is reach-
able with six different arm configurations. Some of these configurations
are bad for the task as a straight-line insertion into the fixture is not pos-
sible. This is shown in Figure 5.12 (b); the size of the retract tree (only
one node is visible) indicates that the range of a straight-line motion is
too small to remove the cylinder from the fixture. Figure 5.12 (c) shows a
retract tree for the same grasp, but with another arm configuration. This
arm configuration is clearly better as the retract tree alone can remove
the cylinder from the fixture. With the possibility to use multiple goals,
there is no need to decide which one of the six different arm configura-
tions to choose; the planner will first connect with the retract tree that
is easiest to reach.
Figure 5.12: The robot has to insert the cylindrical object into the
fixture on the table. For the chosen grasp, the end position is reachable
with six different arm configurations. Figures (b) and (c) show that some
arm configurations are more suitable than others for this task.
Section 5.7.1.
Table 5.1: Average planning times for the different examples in this
chapter. Each example was run 40 times, except example E5, which was
run 100 times.
• The line of sight between the camera and the target center is not
occluded.
The camera cone is defined as a circular cone with opening angle θcone
and its apex at the camera lens. In Figure 5.15 (b) the camera cone is
shown as a transparent cone with length dmax . If rotations of the camera
image is undesirable, an additional constraint can easily be added to the
previous list. Without this constraint, the last degree of freedom is of
no importance as it only rotates the camera around its axis. Thus, the
effective dimension of the task is eight degrees of freedom.
In Figure 5.15 (a) the start position and the end position of the robot
is shown. The camera target is the sphere on the table. The task is made
more difficult by the two boxes surrounding the sphere: They will force
the robot to lift the camera high to avoid occlusion. There are also two
stools and a shelf that give rise to narrow passages for the platform.
Because the platform rotation is unbounded, this degree of freedom
has to be modeled as a ring joint, as described in Section 2.5. The
parameters of the constraint are chosen as dmin = 0.5 m, dmax = 1.4 m,
and θcone = 5.0◦ . This problem was solved using the RRT-LocTrees
algorithm in Chapter 4. The averaged computational times for 100 trials
are shown in Table 5.1. Figure 5.15 (b) shows a snapshot from a solution;
it is clearly seen how the box on the table forces a high camera position.
130 5 Planning of Pick-and-Place Tasks
• Multiple Goals
• Backtracking
• Task constraints
• Smoothing
• Splines
(a) (b)
(c) (d)
(e) (f)
(a) (b)
Figure 5.15: (a) The robot has to move from the leftmost position to
the rightmost, while keeping the sphere within the camera’s field of view.
(b) To avoid occlusion, the robot must lift the camera high.
Chapter 6
amples of planned grasps. The chapter ends with a summary and some
conclusions.
make the search more efficient. A special feature of our planner is that
it explicitly searches for robust grasps. This is done by accumulating the
number of feasible neighbors, where neighbors is defined as samples in
configuration space that are close to the current grasp. Independent of
the work in this chapter, Morales et al. [106, 105] developed a very similar
grasp planner.2 From a theoretical point of view, the planner presented
here has the advantage of using more detailed kinematic models and
a more explicit robustness test, without necessarily slowing down the
planner. From practical point of view, however, the planner in [106] has
the significant advantage of being tested on a real robot platform.
F2
thumb
CCD
F1
Figure 6.1: The figure shows the Barrett hand with all fingers fully
opened. The crosses denote each finger’s base position in the palm. Note
the CCD-camera mounted between fingers F1 and F2 .
The main assumptions in this work are that the outer 2D contour
of an object is extracted, using the camera mounted on the hand. This
contour is of course the projection of the object onto the view plane,
but from here on it will be assumed that the contour is the cross section
of a generalized cylinder, with a straight-line axis and a homogeneous
mass distribution. The cylinder assumption can seem very restrictive,
but thanks to the clutch mechanism of the hand, the proposed grasp
planner will also be able to handle many objects that deviate from this
assumption. However, objects that are tapered towards the view plane
6.3 The Grasp Planner 139
normal, such that the smallest cross section is closest to the hand, will
always be difficult for the hand to grasp. Note that the view plane normal
can have any global orientation, but for most practical cases it will be
either horizontal or vertical.
In the starting position, the camera is looking at the top of a cylinder,
with the palm surface parallel to the cylinder top.3 Achieving a grasp
with a plane to plane contact between the cylinder top and the palm
will often imply a strong grasp, which is desirable. Therefore, the grasp
planner will keep the palm parallel to cylinder top. Locking the direc-
tion of the view plane normal this way, reduces the degrees of freedom
(DOFs) for the wrist from six to four. Considering the wrist and the
hand together, the problem has a total of eight DOFs.
6.3.1 An Overview
Basically the grasp planner searches the configuration space for kinemat-
ically valid grasps until a termination criterion is fulfilled. As mentioned
in Section 6.2 , the problem has eight DOFs, so performing an exhaustive
search will be time consuming. Here we reduce the search space in two
3 For a horizontal camera orientation, the cylinder top actually corresponds to the
F2 yF thumb
T
d2 palm
dT
x
ϕ FO
x
s
ϕ d1
F1
Figure 6.2: Relevant frames and variables for a grasp hypothesis. The
position of the thumb is specified by the arc-length parameter s. Each
thumb position has a copy of the polygon relative the frame FT .
ways: We utilize the special kinematic structure of the hand and we use a
set of heuristic rules that help guiding the search. Each tested configura-
tion is called a grasp hypothesis, and Figure 6.2 shows a grasp hypothesis
together with some of the configuration parameters. For clarity, a dashed
box representing the palm is also drawn.
A grasp hypothesis is formed in several steps, and at each step there
are some requirements that has to be fulfilled. These requirements can
be divided into hard and soft requirements, respectively. If a grasp hy-
pothesis does not pass a step associated with a hard requirement, the
hypothesis is discarded and a new one is generated. If, instead, the
step was associated with a soft requirement, the hypothesis can be kept
as idle. This way we avoid throwing away a possible solution and can
quickly backtrack idle hypotheses if stuck. Hard requirements are typ-
6.3 The Grasp Planner 141
Figure 6.3: Pseudo-code describing the steps for forming a grasp hy-
pothesis.
Depth Constraints
When all fingers have been closed, the only remaining degree of freedom is
a translation in the z-direction for the whole hand. Ideally we would like
142 6 Grasp Planning for a Three-Fingered Hand
to place the palm against the object to achieve a more stable grasp. Due
to depth constraints imposed by, e.g., the height of cylinder and eventual
support surfaces, palm contact might not be possible. In such cases we try
to place the palm as close as possible to the object, without violating any
depth constraints. This determines the value of the last remaining DOF,
the z-translation. The process of determining this translation and the
z-coordinates is accelerated through the use of precomputed trajectories,
described in Section 6.3.4.
Efficient Search
A benefit with the stepwise computation of a grasp hypothesis is that we
postpone the, comparatively, expensive grasp evaluation by having cheap
validation gates (the soft requirements) between each step. As a result,
the last two steps in Figure 6.3 are rarely executed. Experiments have
shown that the last step in Figure 6.3 is seldom executed for grasps that
are not good. This result indicates that the soft requirements are good
at filtering out bad grasp hypotheses.
The resolutions used when stepping through combinations of dT and ϕ
are 2 mm and 1◦ , respectively. Performing a depth-first search, i.e., test-
ing every possible combination of dT and ϕ before moving on to the next
thumb position, would not be efficient. Instead a quick and coarse scan
is done over the most promising combinations and then a new thumb
position is generated. When the number of valid thumb positions has
reached an upper limit, we go back to the first thumb position again and
try less likely combinations. Likely combinations of dT and ϕ are deter-
mined from global contour characteristics such as size and eccentricity.
The definition and computation of the characteristics will be described
in Section 6.3.2.
Grasp Robustness
An important aspect of the presented grasp planner is that it explicitly
searches for robust grasps. By robust is here meant that the grasp prop-
erties should not change dramatically for perturbations of the geometry
and the hand/wrist configuration. Here we measure the grasp robustness
by sampling s, dT and ϕ in a neighborhood of the current hypothesis
and counting the number of kinematically valid neighbors it has, i.e., the
number of valid samples. Hypotheses that accumulate a low count of
valid neighbors are given a low confidence and they are kept idle in case
6.3 The Grasp Planner 143
where Ix , Iy and Ixy are the second order area moments, defined as
Z Z Z
2 2
Ix = y dA, Iy = x dA, Ixy = − xy dA. (6.2)
Note that in Equation (6.1), the area moments are computed with re-
spect to a frame located at the centroid of the contour. The eccentricity
measure defined by Equation (6.1) has a useful geometric interpretation:
If we replace the contour with an ellipse that has the same principal mo-
ments, then the eccentricity is equal to the squared ratio of the major
axis to the minor axis.
Using Green’s formula and the fact that the contour is a polygon, we
can easily transform the area integrals in Equation (6.2) into the following
sums:
N
1 X
(xi − xi+1 )(yi + yi+1 ) yi2 + yi+1
2
Ix = , (6.3)
12 i=1
N
1 X
(yi+1 − yi )(xi + xi+1 ) x2i + x2i+1 ,
Iy = (6.4)
12 i=1
N
1 X h
Ixy = (xi+1 yi − xi yi+1 ) (xi + xi+1 ) (yi + yi+1 )
24 i=1
i
+ xi yi + xi+1 yi+1 , (6.5)
c1
2.
c2
1.
3.
z
different. In Figure 6.4 we see that the z-coordinate of the contact point
varies with the finger extension (trajectory 1). If the z-coordinates of the
contact points are different, the grasp will apply a torque on the object.
This torque tend to decrease the stability of the grasp so we would like
to keep it to a minimum. The measure we have chosen to use here is
the angle between the contact plane and the z-axis, where the contact
plane is defined by the three contact points (see Figure 6.6). This is a
148 6 Grasp Planning for a Three-Fingered Hand
natural measure as it captures the fact that a large grasp, i.e., large finger
extensions, is better at tolerating differences in the contact z-coordinates
than a small grasp.4 If the contact plane inclination is large, then the
grasp quality evaluation and the grasp robustness test are postponed until
no better grasp is found. Morales et al. [106] also avoid grasps with large
differences between the contact z-coordinates. They used the following
criterion:
with the z-axis in the opposite direction of gravity. The centroid’s loca-
tion in the view plane is, as mentioned above, computed from the object
contour. The z-coordinate of the centroid is estimated using the height
of the cylinder.
Using this new frame, the value function can be written as the fol-
lowing linear programming (LP) problem:
max fg
x
T
fg [0, 0, 1, 0, 0, 0] = Gx, xi ≥ 0,
X X
{xi }F1 ≤ 1, {xi }F2 ≤ 1,
X
{xi }thumb ≤ 1,
F1
d1 ϕ
y x
dT
ϕ
d2
F2
Figure 6.7: The polygon has 174 vertices and required planning time
was 0.38 seconds. Gravity is perpendicular to the view plane.
is not easy considering the minimum clearance needed for the fingers,
see Figure 6.4. Even though the algorithm assumes cylindrical objects,
this is an example where the planner does well even on objects that do
not fulfill this assumption. From Figure 6.9 we can see that the palm
has contact with the object. If the real robot would execute this grasp,
the hand would be moved forward until the tactile sensor in the palm
senses a contact. When closing the fingers, they would, due to the clutch
mechanism, wrap around the handle and secure the object. This is an
additional reason why grasps with palm contacts are preferred by the
planner. Note that the thumb in Figure 6.9 is surrounded by the contour
and that the clearance is really small. Here the precomputed clearance
measures described in Section 6.3.4 was used to ensure that the clearance
was large enough.
The polygon in this example has 153 vertices and the planning time
6.4 Examples of Planned Grasps 153
Figure 6.8: The polygon has 60 vertices and required planning time
was 0.24 seconds. The palm makes contact with the object and gravity
is perpendicular to the view plane.
F1
F2
mg
Figure 6.9: The polygon has 153 vertices and required planning time
was 1.3 seconds. The dashed line indicates the palm position and the
dash-dot line is the constraint box. Note that this grasp has palm contact
with the object.
154 6 Grasp Planning for a Three-Fingered Hand
was 1.3 seconds. The much longer planning time for this example is
partly due to much time was used on hypotheses with fingers F1 and F2
inside the constraint box. Another reason is that the size heuristic is not
so appropriate in this case: The very large enclosing circle around the
contour suggests that the planner should initially look for grasps with a
large value for dT , whereas all the feasible grasps will have a small value
for the thumb extension. Without the constraint on the lower part of
the iron, the size heuristic would instead have accelerated the planner.
However, the resulting grasps would wrap around the lower and upper
parts of the iron, which, from a practical point of view is not so good.
Grasp Stability
Evaluation
Robots that autonomously grasp objects and interact with them must
have methods for choosing appropriate grasps. However, deciding about
a good grasp also requires some method for evaluating and comparing
grasps. Furthermore, evaluation can be performed with respect to differ-
ent properties of the grasp. Considering the uncertainties in models and
sensor data, grasps that are robust to positioning and modeling errors
are to be preferred. Another important property of a grasp is its ability
to apply forces to the grasped object: A good grasp should be able to
efficiently counteract external disturbance forces, especially those forces
that are expected to occur during the task that is to be performed.
In this chapter we present a novel approach to grasp evaluation. The
approach is based on the ability of the grasp to resist disturbance forces,
and it leads to a min-max formulation. We also propose an efficient
algorithm for solving this min-max problem. The result of the algorithm
is easily visualized as a surface in the force space. For polyhedral objects,
we give a proof showing that only the vertices of the object need to be
considered. Compared to other approaches to grasp evaluation, the main
benefits are:
y z
fj
z
2 tan−1 (µ)
y
x
(a) Point contact (b) Friction cone approximation
Figure 7.1: For nonslipping contacts that obey the Coulomb friction
model, the contact forces must be inside the friction cone. (a) A side
view of a point contact together with its coordinate system. (b) An
example of a friction cone approximated by a five-sided pyramid.
W = Gx, xk ≥ 0, k = 1, . . . , mn (7.2)
where x is a vector containing the αj for all contacts. See the book by
Murray et al. [107] for more details on how to construct the grasp matrix.
When analyzing a grasp, it is of interest to know the space of wrenches
that can be applied to the object by the grasp. The unit grasp wrench
1
space (UGWS) is oftenPmn defined as the space of wrenches that satisfies
Equation (7.2) and k=1 xk = 1. This space is equal to the convex hull
1 Note that by choosing another norm for the contact force vector, we get other def-
initions for the UGWS. Another common, and more natural, definition of the UGWS
is to limit the normal component of each individual contact to one, see Ferrari and
Canny [41]. With this definition, however, the UGWS is much more costly to compute,
and therefore the definition based on the sum of all contact forces is more common.
160 7 Grasp Stability Evaluation
Much of the problems with quality measures that are based on wrench
space balls arise because we are trying to compare forces with torques,
which does not make sense as they have different units. To avoid such
comparisons, Mirtich and Canny [104] used a decoupled approach, leading
to a two-valued quality measure. The optimal grasp is the one that
lexicographically maximizes both values; they first compute the grasps
that best counteract pure forces, and then select among those grasps the
one which best resists pure torques.
To obtain task directed measures, Li and Sastry [88] suggested the
use of six-dimensional task ellipsoids, whose shape resembles the space of
forces and torques encountered in the task. They defined a task directed
quality measure as the largest scale factor that causes the task ellipsoid
to be embedded in the UGWS. However, as pointed out in [88], “The
process of modeling a task by a task ellipsoid is quite complicated”.
Including the Object Geometry Using the UGWS alone for the
construction of a quality measure has a drawback that, seemingly, has
not received much attention: The wrench space is constructed from in-
formation about the contacts alone, thus, effects of the complete object
geometry on grasp stability are ignored. A grasp quality measure that
does not take object geometry into account would treat the two grasps
in Figure 7.2 as equal; our intuition, however, tells us that the grasp of
object A should be more stable than that of object B.
Introducing the object wrench space (OWS) concept, Pollard [119],
actually incorporates the complete object geometry into the grasp eval-
uation. The OWS represents the best grasp of the object that can ever
be achieved . An alternative view is that the OWS is the set of wrenches
that can be created by a (unit) distribution of disturbance forces act-
ing anywhere on the surface of the object. Thus, the OWS depends on
the geometry of the object. The OWS concept is closely related to the
idea presented in this chapter, however, the examples presented in [119]
on this subject are limited to two-dimensional polygons with frictionless
point contacts.
In a recent approach, Borst et al. [22] combined the OWS concept
of Pollard [119] with the task ellipsoids of Li and Sastry [88]. If no
task information is given, the best assumption one can make about the
disturbance wrenches is that they are distributed according to the OWS.
Borst et al. [22] therefore choose as task ellipsoid an ellipsoid that tightly
encloses the OWS. As the task ellipsoid is automatically constructed, it
removes one of the strongest objections against using them. It it is also
7.2 Related Work 163
x
A
B
Figure 7.2: The grasp’s stability will depend on which object is grasped,
object A or object B. Most grasp quality measures do, however, only use
the contact information.
worth noting that their method for computing the grasp quality does not
rely on any friction cone approximations.
Compliant Grasps The work presented so far have all assumed that
grasp compliance can be neglected. That is also the assumption of the
method proposed in this chapter. For compliant grasps, the grasp stiff-
ness matrix is a useful tool, see, e.g., Howard and Kumar [57]. A com-
pliant grasp is stable if the grasp stiffness matrix is positive definite. It
seems natural to base a quantitative measure of stability on the eigen-
values of the stiffness matrix, but care must be taken as these are not
invariant under change of reference frame, see, e.g., [90, 24].
Bruyninckx et al. [24] derived a grasp quality measure based on the
generalized eigenvalue decomposition of the grasp’s stiffness matrix. The
generalization requires a choice of a metric on the group of rigid body
displacements that enables the identification of twists with wrenches. Lin
et al. [90] derived a frame-invariant quality measure in terms of the prin-
cipal translational and rotational stiffness parameters. By introducing a
164 7 Grasp Stability Evaluation
T = a × F, (7.3)
where a is a vector specifying where the force F is applied. Specifying the
grasp matrix G and a in the same coordinate system is key here: Using
the same torque origin assures that the result will be independent of the
choice of frame. According to Equation (7.3), the torque component is
not independent of the applied force and they are always orthogonal to
each other. Based on this observation, the grasp evaluation procedure
described below is proposed.
Consider a unit vector ê, representing a fixed direction for the distur-
bance force so that the disturbance force can be written as f ê, where f
is a dimensionless scalar. Sweep this disturbance force over the surface of
the object, finding the smallest, positive f that results in a wrench that is
exactly on the border of the UGWS. Let us denote this value by f ⋆ . Note
that when performing this ‘sweeping’ operation, only those points on the
object for which ê is inside the friction cone can come into consideration.
We also require f ≥ 0 because we cannot allow tractional disturbance
forces. Repeating this process for all directions of the disturbance force,
we will end up with a closed surface S in force space. Specifying force di-
rections with spherical coordinates ϕ and θ, the surface S is given by the
⋆
vectors {fϕθ êϕθ }. The interpretation of this surface is straightforward:
7.3 Grasp Evaluation Procedure 165
ê
f ⋆ = min max f ∈ R+ : −W0 − f = Gx,
a x a × ê
mn
)
X
ê ∈ FCa , a ∈ ∂D, xi ≥ 0, xi = 1 . (7.4)
i=1
Task Directed Evaluation The usual way to hold a pen when writing
is to place the fingertips close to the tip of the pen, letting its upper part
rest between the thumb and the index finger, see Figure 7.3. Clearly,
this grasp cannot easily resist forces acting on the upper part of the pen.
Using the complete surface of the pen as input to the evaluation procedure
would thus result in a poor overall stability for the grasp. However, when
writing, we know that external forces are only exerted on the tip of the
pen, and the chosen grasp is excellent for balancing those. Accordingly, a
task directed measure, indicating a more stable grasp, could be computed
by using only the tip of the pen as input.
Disturbance Force Friction Cone Note that upon using only parts
of the object surface, S might not be closed. This can happen if some
166 7 Grasp Stability Evaluation
y
F
x
Figure 7.3: The grasp for holding the pen is good at resisting forces
that are applied to the tip of the pen, but bad at resisting forces at the
upper part. A task directed evaluation of the grasp would only include
only the tip of the pen, as this is where external forces are expected to
occur.
of the force directions ê never fall inside a friction cone. The same can
happen if the assumed friction cone for the disturbance force is very
small. However, choosing large enough friction cones for the disturbance
forces, the procedure will cover the force directions relevant to the task.
Thus the procedure depends on some imagined coefficient of friction, that
should be chosen considerably larger than the real one.
nT ′
p W + dp ≤ 0, p = 1, . . . , NP . (7.5)
Looking for wrenches that are exactly on the boundary of the wrench
space can be done by looking for intersections with these hyperplanes.
Here we use the Quickhull program [12] to compute the hyperplanes.
Clearly, we must discretize the space disturbance force direc-
tions. This is most easily done using spherical coordinates: ê =
(cos ϕ sin θ, sin ϕ sin θ, cos θ)T , where θ ∈ [0, π] and ϕ ∈ [0, 2π). Assume
that the grasp must resist some default offset wrench W0 . If, in addition,
a disturbance force is applied at a point a on the object’s surface, then
the following must hold for the grasp to remain stable:
mn
ê X
−W0 − f = Gx, xi ≥ 0, xi = 1. (7.6)
a × ê
i=1
The minus signs in the left hand side of Equation (7.6) are necessary
because the grasp has to exert a wrench that cancels the external wrench
to maintain equilibrium.
With ê and a constant, the left hand side of Equation (7.6) describes
a line in wrench space parameterized by f :
W = −W0 − f W⋆ , (7.7)
where we have introduced the convenient notation
ê
W⋆ = . (7.8)
a × ê
Thus, if −W0 is inside the hull there exists two intersections with the hull
boundary and the line: One with negative f , which we are not interested
in, and one with positive f , which we are looking for. Now we only
need an effective algorithm for finding this positive f -value for each force
direction.
Assume that the wrench W from Equation (7.7) is exactly on hyper-
plane p. Then we obtain, from Equation (7.5)
−nT ⋆ ′
p (W0 + f W ) + dp = 0, p = 1, . . . , NP . (7.9)
We note that the term -nT
p W0can be seen as coordinate translation
of the wrench space, changing the plane offsets. Thus, we can introduce
dp = d′p − nT
p W0 which simplifies Equation (7.9) to
168 7 Grasp Stability Evaluation
−f nT ⋆
p W + dp = 0, p = 1, . . . , NP . (7.10)
Since it is assumed that the grasp is able to resist the offset wrench, we
must have dp ≤ 0 for all hyperplanes. This is a more general condition
than the commonly used d′p < 0, which require the grasp to have force
closure. With only one unknown in Equation (7.10), we can solve directly
for f :
dp
fp = , p = 1, . . . , NP . (7.11)
np W⋆
T
−dp
fmin,p = q q , (7.12)
n21p + n22p + n23p + amax n24p + n25p + n26p
where amax is the maximum vertex distance. Note that the minus sign is
necessary because dp is negative.
Noticing that ||a × ê|| is constant while looping over the hyperplanes,
this sorting idea can be further refined: With knowledge about ||a × ê||
we can obtain a better lower bound than that of Equation (7.12). So
instead of sorting only once, sorting can be done NB times where each
7.3 Grasp Evaluation Procedure 169
b −dp
fmin,p =q q . (7.13)
amax
n21p + n22p + n23p + b n24p + n25p + n26p
So if (b − 1)amax /NB ≤ ||a × ê|| < bamax /NB , the current cross product
b
belongs to bucket b. One could find even better values for fmin,p using
the fact that ê⊥(a × ê), but the resulting optimization problem would
cost too much to make it worthwhile.
In Figure 7.4 it is seen that the proposed sorting scheme has a signifi-
cant influence on the computational time. For sorting, a variation of the
Radix Sort algorithm [34] was used.2 This algorithm is special in that it
has a linear complexity in the number of elements to be sorted. In this
case, sorting always require four traversals over the data, thus we know
that the time spent on sorting will be exactly proportional to the number
of buckets. As the number of buckets are increased, at some point there
will be no further gain from having the hyperplanes sorted. Any further
increase in the number of buckets will then cause a linear increase of the
total computational time. This is also seen in Figure 7.4.
For optimal effect of the sorting, the number of buckets should be
set adaptively for each problem, depending upon the number of object
vertices and the number of hyperplanes. How this should be done re-
quires more numerical experiments and a deeper analysis of the algo-
rithm. However, experiments so far have shown that choosing NB = 30
gives a significant performance increase for most problems.
Assuming that the vertices have a uniform range distribution around
the origin, the limiting, or worst case, vertices will often be those which
generate large torque components. A further optimization is therefore
to sort the vertices, decreasing in ||a × ê||, before we loop over them.
To satisfy the assumption about uniform range distribution, the object
frame should be placed at the object’s centroid. The final algorithm is
described in listing 1.
2 In books on algorithms, Radix Sort is described as an algorithm for sorting integer
values. It is, however, due to the IEEE Standard 754 floating point representation,
possible to use it for sorting floating point numbers as well; if they are interpreted as
integers, comparisons between them will still be correct (although some care has to
be taken with the sign bit).
170 7 Grasp Stability Evaluation
80
75
70
65
time [ms]
60
55
50
45
40
35
0 20 40 60 80 100
NB
Figure 7.4: The grasp from Figure 7.5 (a) was evaluated with Fg = 0
(8 object vertices, 416 hyperplanes, and 578 force directions). The curve
shows required computational time, in milliseconds, as a function of NB .
For the special case NB = 0, no sorting at all is done.
Proof: Without loss of generality, we assume that each face of the poly-
hedral body is a convex polygon; in case of a nonconvex polygon, we can
always decompose it into a finite number of convex polygons. The kth
face is the convex hull of the vertices a1 , a2 , . . . , aNk , where Nk is the
number of vertices.3 Any position a on the kth face can thus be written
as a convex combination of its vertices:
3 A more strict notation would show that the vertices of each polygon face is a sub-
set of the vertices of the whole body. However, besides from running out of letters,
we believe that additional indices in this case would only lead to a notation that is
harder to read.
7.3 Grasp Evaluation Procedure 171
Nk
X Nk
X
a= βi ai , βi = 1, βi ≥ 0. (7.14)
i=1 i=1
dp
fp = , (7.15)
b0p + b1p β1 + · · · + bip βi + · · · + bNk p βNk
where b0p = n1p ex + n2p ey + n3p ez and bip , 1 ≤ i ≤ Nk , can be computed
from n4p , n5p , n6p , ex , ey , ez and a1 , . . . , aNk . It is seen that, for each
hyperplane, the minimum positive fp is achieved by letting the βi with
smallest coefficient become one (note that dp is negative for a stable
grasp). This clearly corresponds to one of the vertices of the polygon. So
for each hyperplane, the worst point of attack for the disturbance force
will correspond to a vertex of the polygon. Hence, the overall worst point
of attack will also be a vertex.
Since it is sufficient to consider only the vertices for each polygon
face, we conclude that we only have to consider the vertices of a polyhe-
dral body. Note that this conclusion is independent of the friction cone
opening angle: Narrowing the friction cone will only reduce the num-
ber of polygon faces that has to be considered for each disturbance force
direction.
7.4 Examples
To illustrate the procedure proposed in this chapter, it was tested on two
small problems. For both problems, the coefficient of friction was set to
0.3, and each friction cone was approximated by eight force vectors. The
resolution for the disturbance force directions are set to 10◦ for both ϕ and
θ, resulting in a total of 578 force directions. This simple discretization
scheme will produce more samples for the force directions near θ = 0◦
and 180◦ . A uniform, and therefore more efficient, discretization could be
made by choosing the force directions as the vertices of a geodesic dome.
When evaluating the grasp, the coefficient of friction for the disturbance
force was set to 1.5. Note that the friction cone for a vertex can be
defined somewhat arbitrarily. Here it is defined as the union of the cones
belonging to the planes forming the vertex. An additional cone, in the
direction of the averaged vertex normal is also added.
z z
y y
x
x
and take longer time to compute. The proposed algorithm breaks this
pattern: Task information is easily included by removing object vertices
that are unlikely to be in contact during the task, thereby reducing the
size of the problem.
One of the main features of the proposed evaluation procedure is that
the torque-component of the wrench space is used implicitly, thereby
reducing the result to a 3D surface in force space. This surface is easily
interpreted and visualized. Furthermore, in the case of p a zero offset
wrench, this surface is bounded by a sphere with radius 1 + µ2 .
An algorithm solving the resulting min-max problem was also pro-
posed. It was shown that introducing a sorting procedure greatly reduced
the complexity of the algorithm. Further improvements would involve
using more uniformly distributed force directions (e.g., the vertices of
a geodesic dome) and downsampling of detailed objects (to reduce the
number of vertices that has to be considered).
Offset wrenches, like gravitational forces, are easily taken into ac-
count. In some cases the offset wrench can even have a stabilizing effect
on the grasp, as when carrying a plate in the open palm; the weight of
7.5 Chapter Summary 175
0.1 0.1
0.05 0.05
Fz Fz
0 0
−0.05 −0.05
−0.1 −0.1
−0.15 −0.15
−0.1 −0.05 0 0.05 0.1 −0.1 −0.05 0 0.05 0.1
Fx Fx
0.1 0.1
0.05 0.05
Fy 0 Fy 0
−0.05 −0.05
−0.1 −0.1
−0.1 −0.05 0 0.05 0.1 −0.1 −0.05 0 0.05 0.1
Fx Fx
0.1 0.1
0.05 0.05
Fz 0 Fz 0
−0.05 −0.05
−0.1 −0.1
−0.15 −0.15
−0.2 −0.2
−0.25 −0.25
0.08 0.08
0.04 0.04
Fy 0
Fy 0
−0.04 −0.04
−0.08 −0.08
−0.12 −0.08 −0.04 0 0.04 0.08 0.12 −0.12 −0.08 −0.04 0 0.04 0.08 0.12
Fx Fx
Figure 7.7: The left-hand figures show two views of the result for the
grasp with four point contacts in Figure 7.5 (a). The right-hand figures
show the result of adding one extra contact, working against the gravity,
see Figure 7.5 (b). The extra contact causes the surface S to expand
in all directions, except for the positive z-direction. As expected, the
expansion is largest in the negative z-direction. For both grasps, the
gravitational force was 0.1 and the coefficient of friction was 0.3.
Chapter 8
Summary and
Suggestions for Future
Work
This chapter briefly summarizes the key ideas presented in this thesis.
The summary of each chapter ends with suggestions for future work and
research.
Chapter 2
Chapter 2 presented the design of CoPP, an object-oriented framework for
path planning. The framework can be considered as a set of LEGO blocks
that can be used to build path planners; the blocks can be combined in
many different ways and blocks that share the same same interface can
replace each other. The most important aspects of the framework are:
Future work for this framework would involve adding classes for dy-
namic systems. With this addition, the framework could be used for
studying methods for kinodynamic motion planning as well.
180 8 Summary and Suggestions for Future Work
Chapter 3
• Joint couplings
The possibility to compose complex robots from simpler ones has several
benefits. With a complex robot it might not make sense to consider all
degrees of freedom at the same time. With a hierarchical composition it
is easy to extract the sub-robot that involves the degrees of freedom we
are interested in. The resulting modularity also makes it easy to change
various part of a robot. One can for example change the end-effector tool
of a robot from a parallel-jaw gripper to a welding tool.
Note that the CoPP framework does not hinge upon this robot class;
other models of motion are easily added to the framework.
A possible improvement for the robot class would be to allow other
types of joint couplings. Specialized joint couplings could be used to
model slider-crank mechanisms common in, e.g., excavator arms.
Chapter 4
Chapter 5
Chapter 5 presented a high-level planner for pick-and-place tasks. Such
tasks were divided into three phases: approach, transport, and return. A
bidirectional RRT-planner was used to find motions for each phase. The
contributions of this chapter are:
• The use of multiple goals in case many arm configurations can reach
the same grasp.
Chapter 6
The last part of the thesis shifted focus towards grasp planning and grasp
stability. Chapter 6 presented a fast grasp planner for a three-fingered
hand. The input to the planner is only a 2D contour, with optional depth
information. Even though the input may be purely two-dimensional, the
grasp planning takes place in 3D. The grasp planner works by quickly
generating grasp hypotheses, which are evaluated in a step wise fashion; a
hypothesis that does not pass an intermediate evaluation step is idle until
no better grasp is found. This way, the most computationally expensive
steps are postponed until the planner has found a really promising grasp
hypothesis.
Considering noisy sensor data and modeling errors, planned grasps
should be robust against positional and geometric errors. The planner
explicitly tests a grasp’s robustness by disturbing the grasp configuration
in a fixed number of directions. The number of valid “neighbors” is used
as a measure of the grasp’s robustness.
182 8 Summary and Suggestions for Future Work
In most cases, the 2D contour that is the input to the planner corre-
spond to the top view of the object. To be more flexible, however, the
grasp planner should be combined with a view planner that plans which
view of the object should be used to extract the 2D contour.
Chapter 7
Chapter 7 presented a novel approach to grasp stability evaluation. The
approach is based on the ability of a grasp to reject external disturbance
forces. The advantages compared to other approaches are:
We do the opposite to stress that it is the operations of a class, and not its member
variables, that are the most important.
184 A Class Diagram Notation
AbstractClassName ConcreteClassName
AbstractOperation(Type arg) ConcreteOperation1(arg1, arg2)
Type ConcreteOperation( ) Type ConcreteOperation2( )
member_variable1
Type member_variable2
shapes
Drawing Shape
Person
(b) Company
the services provided by its employees. In the example in Figure A.2 (a),
we allow polygons to be textured. As texture objects consume much
memory, a possible optimization is to let polygons with identical tex-
tures share a common texture object. Thus, polygon objects do not own
the texture objects, they only use them. This type of relationship is de-
noted acquaintance, see [126, 45]. Acquaintance merely knows of another
object, thus it is a much weaker relationship than aggregation. In class
diagrams, acquaintance is depicted just as aggregation, except that there
is no diamond shape at the base of the arrow, see Figure A.2.
Although aggregation and acquaintance are usually implemented the
same way, they may lead to different semantics. Consider again the exam-
ple of the company class diagram in Figure A.2 (b): Had the relationship
between a company and its employees been aggregation instead of ac-
quaintance, the meaning would be that companies are like labor camps,
from which the employees never leave.
Complex
RealT Abs( ) ClassName
RealT Imag( ) + PublicOperation( )
... # ProtectedOperation( )
RealT real - PrivateOperation( )
RealT imag $ ClassOperation( )
(a) Template class (b) Access modifiers
The OMT class diagram notation does not have any construct for
C++ constructs like template classes and template functions. To show
that a function is a template function, we use the convention to let the
name of the type of its argument end with the letter “T”. We use the
same convention to show that a class is a template with respect to a
certain type. As an example, consider the class Complex for representing
complex numbers. We can implement this class as a template, where
the numeric type of the real part and the imaginary part is the template
argument, see Figure A.3 (a).
For some design patterns, the access right to the participating class
methods are an important part of the pattern. Where it is important to
convey access rights, we have borrowed the conventions from the UML
notation [127]. See Figure A.3 for a description of this notation. In case
these symbols are omitted, it can be assumed that all class methods are
public. Member variables are always private, thus no access symbols are
used for these.
In this thesis we use class diagrams to illustrate the high-level design
of classes. Thus, the class diagrams never list every operation of a class.
Instead they list the operations that are most important to convey the
design of a class.
Appendix B
Rigid Body
Transformations
ŶA
P
A P
B
FB
PB,ORG
A
X̂B
ẐA
ẐB
ŶA
X̂A
FA
A
P = A B
BT P. (B.1)
Note that the position vectors must be expressed in homogeneous coordi-
nates, i.e., P = (x, y, z, 1)T (if we assume no scaling). The notation with
leading sub- and superscripts might seem awkward at first sight, but it
provides a simple mnemonic that help us in writing transform equations
correctly–a subscript must match the following superscript. If we only
consider rigid body transformations, then the homogeneous transforma-
tion matrix has the following structure:
A
BR PB,ORG
A
A
T = , (B.2)
B
01×3 0
where A BR is the 3 × 3 rotation matrix that describes the orientation of
frame FB with respect to frame FA and APB,ORG denotes the origin of
frame FB , see Figure B.1.
In fields like computer graphics, the last row can be used for perspec-
tive transformations. For our purposes however, the extra dimension can
be seen as a construct that allow us to treat rotations and translations in
B.1 The Homogeneous Transformation Matrix 189
0 j−2 j−1
0 T 1T · · · j−1T
W
j T =W jT (B.5)
Note again how the notation help us to get the order of matrix multipli-
cations right.
As an example, consider Figure B.2 where a robot is about to grasp
a cylinder. For the sake of clarity, the robot arm, to which the gripper is
attached, is not shown. Using Equation (B.5), we have obtained the pose
of the hand, W E T , where subscript E stands for end effector. We assume
that the pose of the cylinder is known, and we denote it W C T . To plan
the task, we would like to know the pose of the cylinder in terms of the
end-effector frame. Using Equation (B.5), we readily obtain
E E W
C T = WT T,
C (B.6)
−1
where T =
E
W
W
ET .
190 B Rigid Body Transformations
FE
FC
A
P = A B
BT P +
A
PB,ORG . (B.7)
So, considering the absence of scaling and perspective transformations,
the interface of the class Transform in CoPP follows the form of Equa-
tion (B.1), whereas the implementation uses Equation (B.7).
So far we have only discussed the transformation of position vectors,
which according to Equation (B.7) are transformed by both a rotation
and a translation. However, sometimes we want to transform free vectors,
i.e., vector quantities like velocities and forces. Transforming such vectors
according to Equation (B.7) would be an error, because they should be
rotated only. If we use the homogeneous transform, this can be solved if
we follow the convention that free vectors have the following homogeneous
B.2 Representing Rotations 191
In addition the class also provides several useful routines, such as conver-
sion from and to the equivalent axis-angle representation of the rotational
part.
B.2.3 Quaternions
In this section we give a brief introduction to quaternions and the benefits
from using them to represent rotations. For more details, see, e.g., the
book by Kuipers [73].
Quaternions are a generalization of complex numbers and can be used
to represent rotations in much the same way as complex numbers on
the unit-circle can be used to represent two-dimensional rotations. A
quaternion h can be written as a linear combination
h = w · 1 + xi + yj + yk, x, y, z, w ∈ R,
194 B Rigid Body Transformations
h1 · h2 = (w1 w2 − v1 · v2 , w1 v2 + w2 v1 + v1 × v2 ). (B.8)
Note that due to the cross-product in Equation (B.8), quaternion multi-
plication does not commute.
B.3 Summary
We have introduced a class Transform that follows the syntax of the
homogeneous transformation matrix. It is used to specify the pose of
geometric objects, to represent moving coordinate frames, and in the
communication with collision detection algorithms. The class can also
be used for transformation of vectors, and here position vectors and free
vectors are transformed differently.
For problems involving kinematic chains, the class Transform is also
used. For rigid body problems, however, we recommend using quater-
nions to specify the rotation. We summarize the following advantages of
using quaternions to represent rotations:
Geometric
Representations
• Many graphical modelers and CAD tools can give output in this
format.
2
3
[0] [1]
1 4
0 6
[2] face arr = [0, 1, 2, 3, −1,
7
1, 4, 2, −1,
5 7, 5, 6, −1]
Figure C.1: Example of and indexed face set with three faces. Note
that the faces need not be joined together.
decided that CoPP should provide both triangle sets and convex polyhe-
dra as geometric representations. We also want to provide a geometric
type to represent unions of convex polyhedra.
• The book-keeping methods for vertices and faces are used by all
derived classes, thereby promoting code reuse.
• Functions that operate on indexed face sets will also work with
triangle sets and convex polyhedra. This is the case, for example,
for the visualization functions; the same function can be used for
drawing both triangle sets and convex polyhedra.
Geom
GetPose( )
Move(Transform t) material
Move(Vector args) Material
SetMoveFormat(format)
GetBoundingBox(…)
pose
AttachToFrame(…) Transform
Accept(GeomVisitor v)
IndexedFaceSet
+ GetVertices( )
+ GetFaces( )
# AddVertex(…)
# AddFace(…)
vertices
faces
ConvexABC TriangleSet
ConvexABC(points) AddTriangle(…)
GetHalfEdges( ) GetTriangles( )
half_edges
GeomConvex GeomConvexGrp
GeomConvex(points) GetParts( )
parts
Figure C.2: Class diagram for the geometry classes currently imple-
mented in CoPP.
3
2
1
0
he arr = [4, 8, 12, 16, 1, 2, 3, −1, 0, 2, 3, −1, 0, 1, 3, −1, 0, 1, 2, −1]
Figure C.3: An example showing the half edge structure for a tetra-
hedron. The first four entries in the array are offsets for each vertex
entry.
the software package Qhull. CoPP uses Qhull, but to make it easier to
use, CoPP provides a wrapper class called Hull3D.
Without any additional topological information, the class for convex
polyhedra is nothing but an indexed face set whose faces happen to form a
convex polyhedron. Clearly there should be more topological information
we can utilize for such a special instance of an indexed face set. One of
the most popular collision detection algorithms for convex polyhedra is
GJK, originally proposed by Gilbert et al. [48]. The original algorithm
used as input only two set of vertices, one for each convex hull. Later
Cameron [27] showed that the algorithm could be more efficient if it is
provided additional information in the form of an adjacency structure
called half edges. The half edge structure tells us which vertices we
can reach in one step by following the edges leading from the vertex
we are currently standing at. This type of adjacency information has
also been used by Sato et al. [129], Ong and Gilbert [111, 112], and
several others. Therefore it was decided that this information is part
of the representation of convex polyhedra. The half edge structure can
be implemented in many ways, for example as a linked list as in [129].
Here we choose to use an array of indices into the vertex array. An
example of how the half edge structure looks like for a tetrahedron is
shown in Figure C.3. The concrete class GeomConvex in Figure C.2 is
used to represent convex polyhedra. Its constructor accepts a set of
points and invokes Hull3D to compute the corresponding convex hull.
It is also seen in Figure C.2 that GeomConvex inherits from an abstract
class ConvexABC. The reason for this intermediate class will be explained
in the next section.
202 C Geometric Representations
object is the closure of the group. The IS-A relationship suggest that
we should use inheritance. However, inheriting directly from a concrete
class such as GeomConvex will often lead to constrained class hierarchies
and subtle traps that are easy to fall into. See, e.g., the book by Mey-
ers [98] why inheriting from concrete classes should be avoided in general.
Therefore a new abstract class should be introduced, ConvexABC, from
which all other convex objects inherit. With this step taken, we could
choose to implement hierarchies of convex objects using the Composite
pattern [45], see Figure C.4 (b), or we could use the approach shown
in Figure C.4 (c). The implementation using the Composite pattern is
appealing because of its clarity in showing the recursive definition of the
composite objects. However, as clients use GetParts, they will not know
wether these parts themselves are compositions. After all, the intent
of the Composite pattern is to hide from clients wether or not they are
dealing with composites. The Composite pattern would have been a good
solution if collision detection was part of the ConvexABC itself; as each
object know its own type, it can take appropriate actions. Because the
philosophy in CoPP is to enforce a good separation of concerns, collision
detection is not part of the geometry interface. Therefore the Composite
pattern is not appropriate in this situation. Note that the solution to add
the method GetParts to the interface of ConvexABC is to be considered
bad design in this situation: It is not an intrinsic property of convex
objects to have parts. (Unless those parts are vertices, faces, etc.)
In the third design, shown in Figure C.4 (c), the recursive definition
is carried out on the class itself. As in Figure C.4 (b), the data struc-
tures needed to represent a single convex polyhedron is obtained through
the inheritance relationship. In the general case this derived part is the
closure of the group. A single convex polyhedron will be modeled as a
group with no parts. In this case, the closure and the actual polyhedron
are the same.
The designs in Figure C.4 (a) and in Figure C.4 (c) are both sound
and the main difference lies in how we view the closures. However, it was
found that the class diagram in Figure C.4 (c) leads to several implemen-
tation advantages such as more code reuse and simpler code. Therefore
the class GeomConvexGrp was chosen before GroupOfConvex.
204 C Geometric Representations
GroupOfConvex parts
GetClosure( ) GeomConvex
closure
GetParts( )
(a)
ConvexABC
(b)
ConvexABC
GeomConvex GeomConvexGrp
GetParts( ) parts
(c)
the parts with their closure. In this particular case, the closure becomes
extremely simple; except for a few extra vertices due to diagonal struts,
the closure becomes a regular box. The shelf, together with a slightly
transparent closure, is shown in Figure C.5 (b).
For the purpose of efficient collision detection, however, this arrange-
ment of the parts is far from optimal; once an object penetrates the
closure of the shelf, there are again 20 more collision checks to perform.
Because the sides of the shelf contain six parts each, and they are com-
pact, i.e., they are rather box-like themselves, it is a good idea to let each
side be a group of its own. The resulting model is shown in Figure C.5 (c),
where the outermost closure has been made totally transparent to bet-
ter show the inner closures. The GeomConvex object for the shelf will
now require storage for three more convex polyhedra in addition to the
original 20. This is a typical example of the tradeoff between memory
and speed. In general though, the memory overhead for this hierarchical
representation is modest, because the number of vertices in a closure is
usually much lower than the sum of the vertices of the parts that form
it. Admittedly, the book shelf example is a bit extreme in this respect:
All the parts have together 160 vertices, whereas the closure has only 16.
In a typical pick-and-place task involving an object placed on one
of the shelves, there would be many configurations where the closure of
the shelf is penetrated by both the robot arm and the grasped object.
To confirm a collision-free configuration that places the robot between
two shelves, the arrangement in Figure C.5 (c) will require 11 collision
tests. Without the two inner closures, such configurations would require
21 collision checks. Thus, the added complexity in the arrangement of
the parts pays off in terms of less collision checks. Of course there exist
degenerate cases as well: Consider a cylindrical rod that is long enough
to collide with the closure of both sides at the same time. If the rod
is placed between two shelves such that it touches the closures of both
sides, then 23 collision checks will be required to confirm a collision free
configuration.
The two steps required to make a hierarchical model as the one just
shown, decomposition into convex parts and grouping, are done manually
in the description file. It would of course be nice if at least one of the
steps could be automated, but the study of such algorithms is beyond
the scope of this thesis.
(a) (b) (c)
Pairwise Collision
Detection with PQP and
GJK
one at a time and then tell PQP to build the internal representation.
Thus, any object that we can triangulate can be used with be PQP.
The class that encapsulates PQP for a pair of geometric objects is called
GeomPairPQP, and it inherits from WitnessPair, see Figure D.1.
Objects of type GeomPairPQP always reference a pair of geometric ob-
jects. The best way to enforce this constraint is to only define construc-
tors that require pairs of geometric objects. The question is, what type
should these geometric types have? A straightforward solution would be
to introduce a new geometric class (derived from Geom) that internally
stores a PQP Model. This geometric class would then be compatible with
PQP. However, this approach would go against one of the main goals with
the CoPP framework, namely that changing an algorithm should require
as little work as possible. If, for one reason or another, a user would
like to change to another collision detection algorithm, then she has to
change the geometric type as well. If the geometric type is changed,
chances are that other things will change too. This chain reaction is
caused by the, rather unnecessary, coupling between the geometric type
and the collision detection algorithm via the type PQP Model. In princi-
ple, we should be able to use GeomPairPQP with any geometric type that
can be triangulated. If we know how to triangulate a certain geometric
type, then it is easy to provide a conversion from that type to PQP Model.
Thus, the constructor of GeomPairPQP should accept any combination of
geometric types for which both types support this conversion. We could
achieve such a generous constructor if we overload it on every combina-
tion we are interested in. However, this solution is not very scalable as
the interface of GeomPairPQP would become very cluttered by all the dif-
ferent constructors. Furthermore, as users add new geometric types that
they want to use with this class, they will have to add the appropriate
constructor too, which is definitely not what most users would expect.
There is a solution to the problem with the cluttered interface of
GeomPairPQP, which also minimizes the coupling between this class and
classes for geometric types. This solution builds on the template facility
available in the C++ language. For every geometric type that we want
to use with PQP, there has to exist an overloaded version of the function
CreatePQP:
CollisionPair
+ Collides( ) if (DoCollides()) {
+ GetNumCollisions( ) ++num_collisions;
return true;
- DoCollides( ) }
num collisions return false;
DistancePair
+ Collides(tolerance)
+ Distance( )
+ DistanceSqrd( )
- DoCollides(tolerance)
WitnessPair
+ GetClosestPoints(p1, p2)
PairGJK_Base GeomPairPQP
PairGJK_Base(ConvexABC a, GeomPairPQP(GeomT1 a,
ConvexABC b) GeomT2 b)
SetToTracking(…)
bool UseSimplex( )
GetSimplex( )
SimplexGJK last_result
bool is_tracking
LeafPairGJK MixedPairGJK
LeafPairGJK(ConvexABC a, MixedPairGJK(GeomConvex a,
ConvexABC b) GeomConvexGrp b)
impl
ConvexGrpPairGJK GeomPairGJK PairGJKBase
The ellipses indicate that we would add an overloaded version for ev-
ery new geometric type. Note that these functions are not part of the
interface for any geometric type, hence no unnecessary couplings are in-
troduced. Now we can make the constructor to GeomPairPQP a C++
template, with the geometric types as template arguments. At compile
time, the compiler will look for the appropriate versions of CreatePQP
and generate the needed constructors for us. In effect, we have got n2
constructors for the price of one, where n is the number of overloaded
versions of CreatePQP.
A final note on this class: As each PQP Model can consume large
amounts of memory, creating two such objects for every GeomPairPQP
would be very wasteful. Therefore, the class uses reference counting
techniques to ensure that only one PQP Model per geometric object is
created.
find the pair of leaf objects that minimize the distance. Collision detec-
tion queries are made much faster, because parts of the hierarchies can
simply be skipped if their bounding spheres or closures do not intersect.
The class ConvexGrpPairGJK deals with pairs whose geometries are com-
positions of convex polyhedra. It also inherits from PairGJK Base. For
pairs of hierarchical objects, the cached information only applies to root
object of each hierarchy.
template<GeomT1, GeomT2>
DistancePair* CreateDistancePair(GeomT1& a, GeomT2& b);
This function would lack an implementation for the general case, but
for every combination of geometric types that we are interested in, there
would exist specialized versions of the function template. Suppose now
that we change the geometric type in our code from, say, TriangleSet to
GeomConvexGrp, then the compiler would automatically switch to another
specialization of CreateDistancePair. Thus, the change of geometric
type would require a minimum of changes in the source code. The use of
a factory method also opens the door for other possibilities: Suppose that
experiments show that Enhanced GJK is faster than PQP on queries in-
volving compositions of convex polyhedra that are not too complex. Then
we could implement the factory method to produce ConvexGrpPairGJK
in situations that favor Enhanced GJK, and GeomPairPQP otherwise.
Appendix E
Framework Details
In this appendix we cover some details of the CoPP framework that are
not covered elsewhere. Section E.1 presents a class for representing paths
in the configuration space. By delegating some of its behavior to metric
objects and interpolation objects, this class can model many different
types of paths. Section E.2 deals with binary constraints, which are
useful abstractions, allowing a planner to transparently handle different
types of constraints.
Section E.3 introduces a problem class, which is used to hold the
definition of a path planning problem. In addition to robots, and obsta-
cles, object of this class can also contain information about which metric,
interpolation method, or sampling method to use.
To avoid introducing platform dependencies, CoPP uses the platform
independent VRML-file format to store animations. Section E.4 gives an
example on how the Visitor pattern [45] can be used to make a function
behave as if it was virtual with respect to a class hierarchy, without
being part of it. Here the pattern is used to allow a uniform interface for
drawing objects of different geometric types.
The last section gives an example of a robot description file for a
simple SCARA robot.
Path
AddViaPoint(time, config)
Length(metric)
Sample(time, config, interpolator)
BinaryConstraint
+ bool IsSatisfied(config)
+ GetNumQueries( )
+ GetNumSatisfied( )
+ Clone( )
- bool DoIsSatisfied(config)
num_queries
num_satisfied
OrientationConstr CollFreeConstr
Vector3D global_dir agent objSet
Vector3D local_axis
max_dev_angle AgentT ObjSetT
robot objSet Move(config) IsCollisionfree( )
Robot ObjectSet
Figure E.2: Class diagram for different types of constraints. Note that
CollFreeConstr is a class template.
category. To shorten the development time and let developers focus more
on their algorithms, the CoPP framework provides a class representing
a basic path planning problem. This class, named PathPlanProblem,
is completely decoupled from any path planner classes and should be
thought of more as a utility class. Given a problem definition file, the
constructor takes care of building the required objects. In the case of an
error the current parser will provide a, hopefully, helpful error message.
In addition to the robot and the geometric objects, the problem class
also contains formatted information about things like: which metric to
use; which sampling strategy to use, which interpolation method to use,
and the maximum allowed step size. How this information should be
interpreted and used is up the specific application; the problem class is
thus like a recipe, from which we build the desired path planner. With
this approach, every component of the path planner can be specified in
a configuration file.
E.4 Visualization 217
PathPlanProblem<TriangleSet> problem1(file1);
PathPlanProblem<TriangleSet, GeomConvex> problem2(file2);
In the first case, TriangleSet is used for all geometric objects, whereas
the second case uses GeomConvex for the robot links. Note that changing
template parameters also changes the parsers used internally by the class.
This makes it easy to extend the class for use with other geometric types;
just provide a parser with a conforming interface for the new geometric
type.
E.4 Visualization
Visualization of the output from a path planner is an important part of a
path planning application. It is, however, most often the parts concern-
ing visualization that cause existing frameworks to be system dependent,
thereby reducing the number of potential users. To avoid that situa-
tion, we choose to output animations in the VRML file format. See the
textbook by Ames et al. [8] for an excellent introduction to VRML. As
there are free VRML viewers for the most common operating systems,
this solution avoids most of the portability issues. Furthermore, stor-
ing animations in VRML files means less work for the user; saving an
animation with traditional graphical user interfaces requires the user to
convert the ongoing animation on the screen to, e.g., the mpeg format.
With the approach taken here, the VRML file itself is a perfect format
for the animation; put on a web-page, it will provide more information
218 E Framework Details
GeomVisitor
+VisitTriangleSet(obj)
+VisitGeomConvex(obj)
+VisitGeomConvexGrp(obj)
VrmlGraphics
+Draw(Geom geom) geom.Accept(*this);
+Draw(Robot robot)
…
-VisitTriangleSet(obj) draw TriangleSet
-VisitGeomConvex(obj)
...
Draw(aTriangleSet)
Accept(*this)
VisitTriangleSet(*this)
DrawTriangleSet(aTriangleSet)
Draw(aGeomConvex)
Accept(*this)
VisitGeomConvex(*this)
DrawGeomConvex(aGeomConvex)
• Joint couplings
• Robot composition
Figure E.5: An example of a robot description file for a robot with three
degrees of freedom. The resulting robot is shown in Figure E.6.
Figure E.6: A simple SCARA robot with three degrees of freedom. The
world frame and the end-effector frame are also drawn.
Appendix F
At this point we have covered most of the building blocks of the CoPP
framework. There are, however, still a few functionalities that has to
be added to make the framework easy to work with: random number
generators, matrix classes, graph classes, and parsers. On the one hand,
as we do not want to reinvent the wheel, such general purpose code should
come from existing public-domain libraries. On the other hand, we also
want to keep the number of external libraries to a minimum. This makes
the Boost libraries an ideal candidate, as it provides all the functionalities
listed above. The Boost web-site1 provides free peer-reviewed portable
C++ libraries. The quality of the libraries is of very high standard; some
of them will be included in the C++ Standards Committee’s upcoming
C++ Standard Library Technical Report as a step toward becoming part
of a future C++ Standard.
The next two sections will describe two Boost libraries that are partic-
ularly useful in CoPP: the Bost Graph Library, and the parser generator
framework Spirit.
for all graphs. The Boost Graph Library (BGL) [133] provides generic
graph algorithms for constructing, modifying, traversing, and searching
graphs. Using template parameters, users can determine the data type
stored in the graph vertices and the graph edges, respectively. Other tem-
plate arguments can be used to determine wether the graph should be
directed or not, and other properties such as the underlying memory rep-
resentation. Experiments by Lee et al. [84] showed that for breadth-first
search, depth-first search, and Dijksktra’s algorithm, the BGL2 was 5 to
7 times faster than the purely object-oriented LEDA C++ library [97].
This is probably due to BGL relying on compile time polymorphism,
whereas LEDA relies on runtime polymorphism: Compile time polymor-
phism avoids the overhead of virtual function calls and allows the com-
piler to do more optimizations. The conclusion is that the efficiency and
the generic nature of BGL makes it an ideal choice for graph intensive
path planning methods like PRM.
F.2 Spirit
It is inevitable that initializing a path planning application is rather pars-
ing intensive; geometric models, robot descriptions, and problem defini-
tions are all usually loaded from text files. As the expressiveness of these
description files increases, so does the complexity of the corresponding
parser. For really small projects, it is often enough to hand-code a parser
using some ad-hoc approaches. However, if the project requires several
parsers, or if the language specification becomes complex, more system-
atic approaches are needed. The usual approach when writing parsers
is to use parser generator tools like YACC [61, 87] and ANTLR [114].
With these tools, the grammar of the language is specified in a special
file. The grammar definition is then used by the tool to generate C or
C++ code for the corresponding parser. Tools like YACC and its com-
panion FLEX [85] are well known, scalable and generate fast and compact
parsers. However, users that want to modify an existing parser, or write
a new one, might not be familiar with these tools. Furthermore, as the
output from YACC and FLEX is pure C-code, they are sometimes awk-
ward to work with if you want the result of the parsing to be a C++
object. Finally, the many extra files makes maintenance harder.
2 Before becoming a part of the Boost libraries, BGL was named GGCL, which
stands for Generic Graph Component Library. Thus, in [84] the name GGCL is used
instead.
F.2 Spirit 225
From this example we see that Spirit uses the [] operator to attach
semantic actions to a parser. Users can easily define their own semantic
actions, which they can attach to parsers.
Finally, to activate our parser we have to call one of the many parse
functions available in Spirit. If we assume that our input is stored in an
ordinary character array, we can wrap the parser call in a function as
shown in Figure F.1.
The function in Figure F.1 will consume the input in str as long as it
matches the specified grammar, or until the end of the string is reached.
The return value is a pointer that points to the last character that was not
consumed. Often the input contains characters that we are not interested
in, such as white space or comments. To decide which characters to skip,
the parse function needs a third argument, a skip parser that acts like a
filter on the input. In this example, we used the predefined space p as
skip parser. The space p argument causes our parser to simply ignore
any white space.
Hopefully this little example has conveyed the main ideas behind
Spirit. To conclude:
• Spirit is easy to learn and relieves users from the need to learn a
specific parser generator tool.
• The ability to define grammars directly in the C++ code leads to:
– Shorter development times
– Easier maintenance
F.2 Spirit 227
[109] C. L. Nielsen and L. E. Kavraki. A two level fuzzy PRM for ma-
nipulation planning. In IEEE/RSJ International Conference on
Intelligent Robots and Systems, volume 3, pages 1716–1721, Taka-
matsu, Japan, Nov. 2000.
240 References
[123] J. A. Reeds and L. A. Shepp. Optimal paths for a car that goes
both forwards and backwards. Pacific Journal of Mathematics,
145(2):367–393, 1990.
[133] J. G. Siek, L.-Q. Lee, and A. Lumsdaine. The Boost Graph Library.
The C++ In-Depth Series. Addison-Wesley, 2002.
[149] C.-H. Tsai, J.-S. Lee, and J.-H. Chuang. Path planning of 3-D
objects using a new workspace model. In IEEE International Sym-
posium on Computational Intelligence in Robotics and Automation,
pages 420–425, Alberta, Canada, July 2001.
[150] G. van den Bergen. A fast and robust GJK implementation for
collision detection of convex objects. Journal of Graphics Tools,
4(2):7–25, 1999.