Sampling-Based A* Algorithm for Robot Path-Planning
All content following this page was uploaded by Inna Sharf on 18 November 2015.
Abstract
This paper presents a generalization of the classic A* algorithm to the domain of sampling-based motion planning. The
root assumptions of the A* algorithm are examined and reformulated in a manner that enables a direct use of the search
strategy as the driving force behind the generation of new samples in a motion graph. Formal analysis is presented to
show probabilistic completeness and convergence of the method. This leads to a highly exploitative method which does
not sacrifice entropy. Many improvements are presented to this versatile method, most notably, an optimal connection
strategy, a bias towards the goal region via an Anytime A* heuristic, and balancing of exploration and exploitation on a
simulated annealing schedule. Empirical results are presented to assess the proposed method both qualitatively and
quantitatively in the context of high-dimensional planning problems. The potential of the proposed methods is apparent, both
in terms of reliability and quality of solutions found.
Keywords
Sampling-based algorithm, path-planning, motion-planning, A*, RRT*, probabilistic completeness, simulated
annealing, robot manipulator, high-dimensional planning, optimization
Kavraki, 2010). The important concepts introduced include mainly the formal analysis of probabilistic completeness and of space connectivity, as well as the more practical concept of monitoring the density of nodes in the motion-graph and leveraging that information to "push" the expansion in useful regions.
This last concept will be critical in the present paper. Another early algorithm that builds on this idea is the expansive spaces tree (EST) (Hsu et al., 1999) which generates a tree via random walks (or expansions) of existing nodes in the tree, which are picked with a probability inversely proportional to the number of nodes in their direct neighborhood. The formal analysis in the present paper will rely heavily on the analysis of the EST algorithm, as presented in the thesis (Hsu, 2000). Few later algorithms directly fall under this category of algorithms, but two notable later developments are: the single-query bi-directional lazy planner (SBL) (Sanchez and Latombe, 2001) which introduces many practical improvements to the EST algorithm; and the Guided-EST algorithm (Phillips et al., 2004) which introduces a more sophisticated density heuristic which incorporates, among other things, the A* cost of the nodes. In that sense, the algorithm presented in this paper arrives at a similar result, but with the critical difference that the density and A* cost combination are derived from a generalization of the A* algorithm itself and are justified by formal analysis.
Another important early development in sampling-based motion planning was the realization that quasi-random sampling could, in general, be sufficient and beneficial to such algorithms (Branicky et al., 2001). By their nature, quasi-random samplers rely on a finite discretization of space, with controlled interval sizes, and generate samples from the finite set which tends to produce more uniform distributions. In other words, this can be seen as a probabilistic discovery of a finite motion graph, avoiding the problem of representing the complete motion graph in memory or having to traverse it entirely, while benefiting from its limited and uniform density, another important concept in the present paper.
The present paper inscribes itself into the trend of bringing useful concepts from deterministic path planning into sampling-based methods. The classic deterministic path-planning method is the A* algorithm (Hart et al., 1968) which relies on a best-first exploration of the motion-graph to find an optimal path from a starting node to a goal node. Then, two relatively recent modifications to this classic algorithm have been made. First, the Anytime A* algorithm was developed to speed up the initial search for a feasible path and then progressively improve it, thus achieving an anytime behavior (Likhachev et al., 2003). Then, dynamic reconfiguration of the motion-graph was added to create the Anytime Dynamic A* (AD*) algorithm (Likhachev et al., 2005).
Some attempts have been made to incorporate these aforementioned deterministic algorithms more intimately with different kinds of sampling-based algorithms. For example, the strategies for the dynamic reconfiguration of the motion-graph were applied in a straightforward manner to an RRT planner (Ferguson et al., 2006). Similarly, since the PRM algorithm requires a shortest-path algorithm to resolve a query once a sufficiently connected roadmap is available, one can alternate the generation of the motion graph and an anytime dynamic search through it, as in Berg et al. (2006). Yet another possibility is to use a layered approach such as super-imposing a coarse discrete search and a sampling-based planner (Plaku, 2012).
More interestingly and more relevant to the present paper is the concept of expanding the motion graph through the shortest-path algorithm itself. One interesting example is the Flexible Anytime Dynamic PRM (FADPRM) (Belghith et al., 2006) which uses a mixture of the density metric (from PRM) and the AD* heuristic to select nodes to be expanded by random walks. The FADPRM relies on many heuristics and fine-tuned parameters that make its application rather difficult in practice. Another notable example of driving the sampling with a shortest-path heuristic is the Utility-guided RRT (Burns and Brock, 2007). This method is especially relevant to the present paper because it attempts to model the utility of the exploration in a probability theoretic fashion. Building off a prior method by the same authors (Burns and Brock, 2005) which explored the idea of predictive statistical models to learn the topology of the free configuration-space, the Utility-guided RRT uses local predictions of the utility of exploring around a given node and tries to sample along the least-explored directions therefrom. Similarly, the application of equivalence classes as a means to exhaust neighborhoods is another predictive model that can be applied (Gonzalez and Likhachev, 2011). In the present paper, we rely on a similar idea by relying on the expected value of the total path cost (as in the A* algorithm) with the aim of maximizing exploitation by sampling around optimal regions.
Coming back to the classical sampling-based approaches, Karaman and Frazzoli (2011) have presented a very influential paper in which they describe three new algorithms, RRT*, rapidly-exploring random graph (RRG), and PRM*, which are proven to be asymptotically optimal. The RRT* is probably the most widely used sampling-based algorithm today and varies from the RRT mainly in the fact that it keeps track of the accumulated travel cost and performs optimal re-wirings to conserve a record of the optimal path from the root to any other node. A branch-and-bound strategy can also be added to the algorithm (Karaman et al., 2011). The argument behind the branch-and-bound strategy is that nodes that cannot contribute to optimal paths (from start to goal) should be pruned from the motion tree to reduce wasted efforts.
In machine learning and numerical optimization literature in particular, there is a recurring theme, that of exploration versus exploitation. This is the problem of choosing between broadening a search in order to discover all possible solutions versus refining the current best solution(s) (e.g. gradient descent). This balancing act has been less
Downloaded from ijr.sagepub.com by guest on October 17, 2015
Persson and Sharf 1685
present in sampling-based motion-planning literature, which has implicitly favored exploration since its inception (Kavraki et al., 1996; LaValle, 1998) and sometimes explicitly (Sucan and Kavraki, 2012), but there are notable exceptions. In many practical implementations of uni-directional RRT methods, the goal region is sampled on a regular basis in a greedy attempt to grow the tree more rapidly towards it. The exploration-exploitation tree (EET) (Rickert et al., 2008) is a method which draws samples from a region around the goal, making incremental expansions towards the obtained sample, and expanding or shrinking the sampling region according to a heuristic based on the success rate of the expansions, thus balancing exploration and exploitation via a larger or smaller sampling region around the goal.
Another interesting method, presented in Vazquez-Otero et al. (2012), uses a dynamic reaction–diffusion process to expand a search for the goal location and then contract while leaving a simulated tension between the start and goal, resulting in an optimal path. This strategy is strangely reminiscent of simulated annealing methods used in numerical optimization and certain machine learning algorithms. The authors did not find any reported use of simulated annealing in a sampling-based motion planner, but it has been shown to be useful when solving a path-planning problem as a general non-linear trajectory-optimization problem. Most recently, the covariant Hamiltonian optimization for motion planning (CHOMP) method presents one state-of-the-art use of simulated annealing in motion planning (Zucker et al., 2013). We mention these methods mainly because one of the central novelties in the present paper is the application of a simulated annealing strategy to balance exploration and exploitation in the proposed sampling-based motion planner.
This paper is organized as follows. Section 2 presents a generalization of the A* algorithm by deconstructing its basal assumptions and casting them in a probability theoretic framework that can be used with local predictive models to drive the expansion of a motion graph. We name this method the sampling-based A* algorithm (SBA*) and provide formal analysis to characterize its convergence. Then, in Section 3, the practical SBA* algorithm is presented with a number of refinements to it, notably, an optimal connection strategy, an anytime heuristic to provide a stronger goal-bias, and the use of simulated annealing to balance exploration and exploitation. Finally, in Section 4, results are presented to characterize the behavior of the proposed algorithm in a cluttered environment, in high-dimensional spaces, and for the practical application motivating this work: motion planning for a seven-degree-of-freedom (7-dof) manipulator to capture a free-floating target.
2. Generalizing the A* algorithm
This section outlines the process of generalization of the A* algorithm (Hart et al., 1968) to a sampling-based path-planning approach. The A* search strategy has a number of advantages, two of which are lacking in current sampling-based approaches, that is, the search is focused on the current most promising path and the first solution found is the optimal one (within the discretization). However, the A* search is an exhaustive search over a finite motion graph (e.g. a "grid"), which is problematic since, by definition, a sampling-based approach involves the construction of the motion-graph through sampling of the configuration space, and can thus be generated ad infinitum. We tackle this problem by examining the underlying rationale behind the A* algorithm, and we generalize that rationale, through local predictive statistical models, so that it is suitable for a sampling-based approach.
2.1 Overview of A*
At the core of the A* search algorithm is a priority queue which chooses the most promising node to visit. Visiting a node entails surveying its neighboring nodes, adding them to the priority queue as needed, and the process is repeated until the goal node is discovered. In other words, nodes of the motion graph start with the label "undiscovered", then become "discovered but not expanded" (or "open"), and finally, become "expanded" (or "closed"). Needless to say, the search strategy is very simple, yet very effective, in fact, optimal under most usual conditions.
Clearly, the algorithm is driven by a measure of how "promising" the visit of a node is to the search. This measure is obtained through an approximation that underestimates the total travel cost when taking a path through a given node. In concrete terms, given a node $u$, if we accumulate the cost to travel from the start to that node via the shortest path through the discovered portion of the motion-graph, denoting that cost as $g(u)$, and then compute a heuristic value for the remaining cost to the goal, denoting it as $h(u)$, we can obtain a lower-bound approximation of the total cost as
\[ f(u) \equiv g(u) + h(u) \tag{1} \]
where the heuristic value $h(u)$ is required to be equal to or less than the actual travel cost along a non-colliding path from $u$ to the goal node. Usually, $h(u)$ is simply the "bird-flight" distance, i.e. the distance between $u$ and the goal node in the configuration space when obstacles are ignored. Once the total travel cost can be approximated, the node with the least total travel cost is considered as the most promising and, thus, the priority queue chooses the minimum element (i.e. a "min-heap"). The optimality of the A* search hinges upon requirements on the heuristic value, most notably, that it does not overestimate the remaining cost (called admissibility) and that it is monotonically decreasing as progress is made towards the goal (called consistency). A "bird-flight" distance in a configuration space with a proper metric automatically satisfies these conditions, and since most path-planning applications, even
1686 The International Journal of Robotics Research 33(13)
in the most esoteric domains, have these properties already for other reasons (e.g. having consistent neighborhoods), the conditions for optimality of the A* search are easily met.
The A* algorithm is well-known to most researchers in the field, and we will not provide further detail on it in this section. The key elements to keep in mind are, first, the idea of choosing the most "promising" nodes and, second, approximating the total travel cost by a combination of the accumulated cost from the start node and the heuristic evaluation of the remaining cost to reach the goal. The latter is simple to do and is available to virtually all application domains. The former, however, will need further dissection to make the A* search applicable as the driver for a sampling-based path-planning approach.
2.2 Generalization of the node value function
The classic formulation of the A* search algorithm (Hart et al., 1968) makes a silent assumption, that is, expanding a node is only useful if there is information to be gained from expanding it. This can seem like a trivial point, but it becomes important in a sampling-based approach. Given that the A* search is normally conducted on a finite motion graph, upon the first examination of a node, one can exhaustively survey its neighborhood, after which there is obviously no more information to be gained from that neighborhood, and the node can be "closed" and never expanded again.
If one were to simply take an A* search algorithm and replace the exhaustive survey of the neighborhood by the generation of samples within the nearby configuration space, the algorithm would make no progress at all, since it would repeatedly choose nodes from the same area and generate more samples in that neighborhood, ad infinitum. We need a measure of the information gain to be expected from generating a sample in a given neighborhood, that is, a measure of how exhaustively searched the neighborhood of a node is. In other words, we must depart from the binary concept of "open" versus "closed" nodes from the classic A* algorithm, and generalize the concept to a more appropriate model of information gain.
In more concrete terms, the A* search algorithm chooses the node $u_i$ to expand next based on the following criteria:
\[ u_i = \arg\min_{u} f(u) \quad \text{with } u \in \{\text{OPEN}\} \tag{2} \]
which could be reformulated by defining a new node value function:
\[ r(u) \equiv \begin{cases} f(u) & \text{if } u \in \{\text{OPEN}\} \\ \infty & \text{otherwise} \end{cases} \tag{3} \]
\[ u_i = \arg\min_{u} r(u) \tag{4} \]
which clearly exposes the binary nature of the expected information gain from expanding a node. In other words, the value of $r(u)$ reflects the expected total travel cost of a new path discovered that goes through node $u$. The emphasis here is on the fact that the partial path is new, and that it is feasible, both encoded by whether the node $u$ belongs to the OPEN set or not, since a node will not be in the OPEN set if it is unreachable by a collision-free path or has already been expanded.
We define two types of events that can occur during a visitation.
\[ \text{NEW}: \text{A new partial path was constructed} \tag{5} \]
\[ \text{FREE}: \text{A collision-free segment was discovered} \tag{6} \]
This leads to the following redefinition of $r(u)$:
\[ r(u) = P(\text{NEW}, \text{FREE} \mid u)\, f(u) + \bigl(1 - P(\text{NEW}, \text{FREE} \mid u)\bigr)\, \infty \tag{7} \]
\[ P(\text{NEW}, \text{FREE} \mid u) = \begin{cases} 1 & \text{if } u \in \{\text{OPEN}\} \\ 0 & \text{otherwise} \end{cases} \tag{8} \]
where $P(\text{NEW}, \text{FREE} \mid u)$ is the probability that, given a node $u$, a visitation will yield a new and collision-free path, which, in a classic A* algorithm, is encoded by the OPEN set.
Clearly, $r(u)$ is an expected value from a binomial distribution, which we can express as such:
\[ r(u) = E_{\mathcal{N}(u)}[f(u)] \tag{9} \]
which is the expected value of $f(u)$ in the neighborhood of $u$, when rejecting nodes that are not reachable through a collision-free path or have already been explored. This formulation is clearly far more general, and now holds more hope in transposing the A* search method to sampling-based path planning.
For any continuous probability $P(\text{NEW}, \text{FREE} \mid u)$, we obtain the following expression for $r(u)$:
\[ r(u) = \frac{f(u)}{P(\text{NEW}, \text{FREE} \mid u)} \tag{10} \]
which, as expected, goes to infinity as the probability of discovering a new, collision-free node in the neighborhood $\mathcal{N}(u)$ goes to zero. In addition, we note that, given the definition of the NEW and FREE events, they are independent distributions, meaning that we can do the following:
\[ P(\text{NEW}, \text{FREE} \mid u) = P(\text{NEW} \mid u)\, P(\text{FREE} \mid u) \tag{11} \]
In a classic A* search algorithm, any node belonging to the OPEN set is given a priori values of 1 for the probabilities $P(\text{NEW} \mid u)$ and $P(\text{FREE} \mid u)$, and given that a visitation is an exhaustive search in the finite neighborhood $\mathcal{N}(u)$, when the node is closed, those probabilities drop to 0. In a sampling-based approach, this open–close transition is progressively achieved as the neighborhood $\mathcal{N}(u)$ gets filled with nodes sampled from the configuration space.
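As a small illustration of the generalized node value of Equations (10) and (11), the following Python sketch ranks a few nodes by r(u) = f(u) / (P(NEW|u) P(FREE|u)). It is only a sketch under stated assumptions: the `Node` class, the `node_value` and `choose_node` helpers, and the numeric probabilities are hypothetical placeholders, and how the two probabilities would be estimated in practice is not addressed here.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    g: float       # accumulated cost from the start, g(u)
    h: float       # admissible heuristic cost to the goal, h(u)
    p_new: float   # assumed estimate of P(NEW | u)
    p_free: float  # assumed estimate of P(FREE | u)


def node_value(node: Node) -> float:
    """r(u) = f(u) / (P(NEW|u) * P(FREE|u)), with f(u) = g(u) + h(u)."""
    p = node.p_new * node.p_free
    if p <= 0.0:
        # An exhausted or unreachable neighborhood behaves like a closed node.
        return math.inf
    return (node.g + node.h) / p


def choose_node(nodes):
    """Pick the node with the smallest r(u), as in Equation (4)."""
    return min(nodes, key=node_value)


nodes = [
    Node("a", g=2.0, h=3.0, p_new=1.0, p_free=1.0),  # fresh node: r = f = 5
    Node("b", g=1.0, h=3.0, p_new=0.5, p_free=0.5),  # partly explored: r = 4 / 0.25
    Node("c", g=1.0, h=2.0, p_new=0.0, p_free=1.0),  # exhausted: r = infinity
]
best = choose_node(nodes)
```

With probabilities restricted to 0 or 1, `node_value` reduces to the classic OPEN/CLOSED rule of Equation (3); intermediate values let a cheaper but partly explored node lose priority to a fresh one.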
2.3 Idealized sampling-based A* algorithm
For analytical purposes, it is sufficient to assume a simplified algorithm, and therefore, Algorithm 1 presents a general random-expansion algorithm which is simply a re-statement of Algorithm 4.1 of Hsu (2000) with the sampling strategy substituted for the CHOOSEBESTSAMPLE function. The RANDOMWALK function is a simple random steering function that attempts to produce a new sample $u_{new}$ in the neighborhood of $u_{exp}$ with a feasible path to it. Then, the algorithm integrates that new sample to the motion graph. Finally, the iterations stop when a sample is within the goal region.
This idealized algorithm is presented mainly for exposition purposes and to be used in the subsequent formal analysis, while the practical algorithm is presented in Section 3. The key aspect of this idealized algorithm is choosing the "best" sample. From the previous section, it should be apparent how we would choose such a best sample for expansion. Faithfully conforming to the classic A* search algorithm, we would choose the best sample to expand based on the best expected value of $r(u)$, as so
\[ u_{exp} = \arg\min_{u \in V} r(u) \tag{12} \]
Broadly speaking, this algorithm captures the essential elements of the SBA* algorithm. Given the similarities with the EST algorithm (Hsu et al., 1999), we can consider that the SBA* falls under that family of methods. The similarities between these algorithms run deep enough, in fact, that it is natural to express the formal analysis of the idealized SBA* algorithm in terms of the formal analysis that supports the EST family of algorithms.
2.4 Formal analysis
Given the similarities between EST and SBA*, we start the formal analysis by building upon the work of Hsu (2000) on an idealized EST algorithm, to which we add a few relevant definitions and lemmas to build a proof of probabilistic completeness with a convergence rate of the idealized SBA* algorithm. In this section, we refer to the configuration space as $\mathcal{X}$ and its collision-free subset as $\mathcal{F} \subseteq \mathcal{X}$. Moreover, we use $\mathcal{R}_\ell(u) \subseteq \mathcal{F}$ to denote the region of space locally reachable from a point $u$, e.g. a hyperball around $u$ or a region constrained by controllability limitations.
The analysis relies on the concept of expansive spaces (Hsu et al., 1999), and we begin with a brief reminder of two fundamental definitions relevant to this concept, as follows.
Definition 1 ($\beta$-LOOKOUT). Given a constant $\beta \in (0,1]$, the $\beta$-LOOKOUT of a set $S \subseteq \mathcal{F}$ is
\[ \beta(S) \equiv \{ u \in S \mid \mu(\mathcal{R}_\ell(u) \setminus S) \ge \beta\, \mu(\mathcal{R}_\ell(S) \setminus S) \} \tag{13} \]
where $\mu(S)$ denotes the volume of a set $S$ and $\beta(S)$ denotes its $\beta$-LOOKOUT.
Definition 2 (($\alpha$, $\beta$)-expansive). Given constants $\alpha, \beta \in (0,1]$, the set $\mathcal{F}$ is $(\alpha, \beta)$-expansive if for every point $u \in \mathcal{F}$ and subset $S \subseteq \mathcal{R}_\ell(u)$, we have that $\mu(\beta(S)) \ge \alpha\, \mu(S)$.
In less formal terms, the $\beta$-LOOKOUT denotes the region from which a significant portion of new space could be discovered by a random walk or local sampling, while an $(\alpha, \beta)$-expansive space has the property of a lower bound on the volume of the lookout region around any arbitrary point. Together, these definitions form the basis for constructing sequentially reachable chains of milestones that explore the space (Hsu, 2000), making them an ideal basis for the formal analysis to follow.
2.4.1 Unbiased sampling. We begin the analysis by considering the case of unbiased sampling, that is, considering uniform exploration of the space. To start the construction of this formal analysis, we present a definition of the local Voronoi regions.
Definition 3 (Local Voronoi region). Given a point $u \in \mathcal{F}$ being the seed of a Voronoi region $\mathcal{V}(u)$, the local Voronoi region $\mathcal{V}_\ell(u)$ is defined as
\[ \mathcal{V}_\ell(u) \equiv \mathcal{R}_\ell(u) \cap \mathcal{V}(u) \tag{14} \]
where $\mathcal{R}_\ell(u) \subseteq \mathcal{F}$ is the locally reachable region around $u$. In other words, the local Voronoi region is the set of all points that are closer to $u$ than to any other point and that are locally reachable from $u$. The local Voronoi region of a point set $V$ is then $\mathcal{V}_\ell(V) \equiv \bigcup_{u \in V} \mathcal{V}_\ell(u)$.
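To make the local Voronoi region of Definition 3 concrete, the following Python sketch estimates, by Monte Carlo sampling, the fraction of the total locally reachable volume that belongs to the local Voronoi region of each point. It is a minimal sketch under simplifying assumptions that are not part of the analysis: the free space is taken to be an obstacle-free unit square, every locally reachable region is a disc of the same radius, and the function name `local_voronoi_fractions` is hypothetical.

```python
import math
import random


def local_voronoi_fractions(points, radius, n_samples=20000, seed=0):
    """Monte Carlo estimate of the share of reachable volume per local Voronoi region.

    Assumes an obstacle-free unit square as the free space and a disc of the
    given radius as the locally reachable region of every point.
    """
    rng = random.Random(seed)
    counts = [0] * len(points)
    for _ in range(n_samples):
        x = (rng.random(), rng.random())
        # The nearest point is the seed of the Voronoi region containing x.
        nearest = min(range(len(points)), key=lambda j: math.dist(x, points[j]))
        # x lies in the local Voronoi region only if it is also reachable.
        if math.dist(x, points[nearest]) <= radius:
            counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no sample fell inside any reachable region")
    return [c / total for c in counts]


# One isolated point and two crowded points that share most of their discs.
points = [(0.25, 0.5), (0.75, 0.5), (0.8, 0.55)]
fractions = local_voronoi_fractions(points, radius=0.2)
```

In this toy layout the isolated point keeps its whole disc, while the two crowded points split their overlapping discs, so the isolated point receives the largest fraction. This is the kind of density-dependent weight that the local Voronoi volume provides.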
Lemma 1. Given a metric space $(\mathcal{X}, d)$ and a set $V$ of points $u \in \mathcal{X}$ for which there is a local reachable region $\mathcal{R}_\ell(u) = \{ v \in \mathcal{X} \mid d(u, v) \le R \}$, the local Voronoi space of $V$ (as per Definition 3) is coincident with the local reachable space of $V$, i.e. $\mathcal{V}_\ell(V) = \mathcal{R}_\ell(V)$.
Proof. Since the Voronoi regions span the entire space $\mathcal{X}$, a point can only be excluded from $\mathcal{V}_\ell(V)$ if it is not reachable from the point that is closest to it (the seed of the Voronoi region), and because the local reachable regions $\mathcal{R}_\ell(u)$ are hyperballs of equal radius, such a point cannot be reachable from any other point in $V$, and thus, must necessarily be excluded from $\mathcal{R}_\ell(V)$ as well. That is, a point belongs to $\mathcal{V}_\ell(V)$ if and only if it belongs to $\mathcal{R}_\ell(V)$.
It is important to note that Lemma 1 extends to a free configuration space $\mathcal{F}$ over a metric space, as long as $\mathcal{F}$ is compact (cf. fully controllable state space). The proof is only more involved if one has to consider reachable regions $\mathcal{R}_\ell(u)$ that intersect the non-convex boundaries of $\mathcal{F}$. For that reason, we omit this generalization here.
Also, one should note that a consequence of Lemma 1 is that it presents an alternative to computing the volume of $\mathcal{R}_\ell(V)$ by taking the sum of the volumes of the local Voronoi regions:
\[ \mu(\mathcal{R}_\ell(V)) = \sum_{v \in V} \mu(\mathcal{V}_\ell(v)) \tag{15} \]
which suggests a sampling strategy based on the fraction $\mu(\mathcal{V}_\ell(u))$ of the total volume $\mu(\mathcal{R}_\ell(V))$ as a basis for generating nearly uniform samples in that region.
Lemma 2. Given a point set $V$ where each point $u \in \mathcal{F}$ has a locally reachable region $\mathcal{R}_\ell(u) \subseteq \mathcal{F}$, then drawing a sample $u'$ uniformly from the reachable neighborhood $\mathcal{R}_\ell(u)$ of a point $u$ chosen with probability:
\[ P(u) = \frac{\mu(\mathcal{V}_\ell(u))}{\mu(\mathcal{R}_\ell(V))} \tag{16} \]
yields a sample distribution that is an unbiased approximation of a uniform distribution over $\mathcal{R}_\ell(V)$.
Proof. The integral definition of a uniform probability distribution over $\mathcal{X}$ is
\[ \mu(\mathcal{X}) \int_G p(x)\, \mu(dx) = \mu(G) \quad \text{for all } G \subseteq \mathcal{X} \tag{17} \]
which can only be true if the distribution is constant and integrates to 1 over $\mathcal{X}$.
Following the same logic, we can examine the following integral:
\[ \chi(G) = \mu(\mathcal{R}_\ell(V)) \int_G p(u')\, \mu(du') \tag{18} \]
\[ \chi(G) = \mu(\mathcal{R}_\ell(V)) \int_G \sum_{u \in \mathcal{N}(u')} \frac{\mu(\mathcal{V}_\ell(u))}{\mu(\mathcal{R}_\ell(V))\, \mu(\mathcal{R}_\ell(u))}\, \mu(du') \tag{19} \]
\[ \chi(G) = \int_G \sum_{u \in \mathcal{N}(u')} \frac{\mu(\mathcal{V}_\ell(u))}{\mu(\mathcal{R}_\ell(u))}\, \mu(du') \tag{20} \]
If we look at $G \subseteq \mathcal{R}_\ell(V)$ with $N = \{ u \in V \mid u \in G \} \subseteq V$, the integral $\chi(G)$ reduces to
\[ \chi(G) = \mu(\mathcal{R}_\ell(N)) - \sum_{u \in N} \frac{\mu(\mathcal{V}_\ell(u))}{\mu(\mathcal{R}_\ell(u))}\, \mu(\mathcal{R}_\ell(u) \setminus G) + \sum_{u \in V \setminus N} \frac{\mu(\mathcal{V}_\ell(u))}{\mu(\mathcal{R}_\ell(u))}\, \mu(G \cap \mathcal{R}_\ell(u)) \tag{21} \]
By defining the sets $I = \{ u \in N \mid \mathcal{R}_\ell(u) \setminus G \ne \emptyset \}$ and $O = \{ u \in V \setminus N \mid \mathcal{R}_\ell(u) \cap G \ne \emptyset \}$, the bounds on $\chi(G)$ can be expressed as
\[ \mu(G) - \mu(\mathcal{R}_\ell(I)) \le \chi(G) \le \mu(G) + \mu(\mathcal{R}_\ell(O)) \tag{22} \]
which are expected to be conservative bounds. More importantly, the true integral always lies within a limited interval around the uniform value, with both the upper and lower margins being of equal volume on average, meaning that the adopted sampling strategy approximates uniform sampling and is unbiased.
The significance of Lemma 2 should be clear as it presents the proposed sampling strategy as a viable alternative to uniform sampling in $\mathcal{R}_\ell(V)$, which is difficult in practice (and for analysis). The local deviations of the integral probabilities merely imply that the distribution of samples deviates from a uniform distribution symmetrically within a reachable band around the boundary of $G$, but is not biased by the density of the existing points in $V$ inside or around $G$. It is also worth noting that for most choices of $G$ the deviations are limited to a small fraction of $\mu(G)$. Lemma 2 allows us to prove the lemma that follows, which is key to obtaining a convergence rate and thus, probabilistic completeness.
Lemma 3. Given a point set $V$ where each point $u \in \mathcal{F}$ has a locally reachable region $\mathcal{R}_\ell(u) \subseteq \mathcal{F}$, and that $\mathcal{F}$ is $(\alpha, \beta)$-expansive for some values of $\alpha, \beta \in (0,1]$ (Hsu, 2000), then drawing a sample $u'$ uniformly from the reachable neighborhood $\mathcal{R}_\ell(u)$ of a point $u$ chosen with probability:
\[ P(u) = \frac{\mu(\mathcal{V}_\ell(u))}{\mu(\mathcal{R}_\ell(V))} \tag{23} \]
yields a probability of sampling $u'$ in the $\beta$-LOOKOUT of $\mathcal{R}_\ell(V)$ that is at least $\alpha$.
Proof. Following the proof of Lemma 2, we can choose to take the integrals $\chi$ over $\mathcal{R}_\ell(V) \setminus \beta(V)$ and its complement, where $\beta(V)$ denotes the $\beta$-LOOKOUT of $\mathcal{R}_\ell(V)$, which satisfy the following:
\[ \chi(\beta(V)) = \mu(\mathcal{R}_\ell(V)) - \chi(\mathcal{R}_\ell(V) \setminus \beta(V)) \tag{24} \]
\[ \chi(\beta(V)) \ge \mu(\mathcal{R}_\ell(V)) - \mu(\mathcal{R}_\ell(V) \setminus \beta(V)) - \mu(\mathcal{R}_\ell(O)) \tag{25} \]
c(b(V )) m(b(V )) m(R‘ (O)) ð26Þ Definition 4 (h-Entanglement). Given a free metric space
(F , ds ) (X , d) with connected components F 1 , . . . , F k ,
where O is the set of points from V that lie in the b- and some constant value h 1, then the free-space is said
LOOKOUT of R‘ (V ), which is empty, because, by definition, to be h-entangled if, for all i 2 [1,k], the total length of
points in the b-LOOKOUT must be able to reach beyond the collision-free path between any pair of points in F i is
R‘ (V ), which none of the points in V can do. Therefore, no more than an h multiple of the ‘‘bird-flight’’ distance
m(R‘ (O)) = 0. Furthermore, in a (a, b)-expansive space, the between that pair of points, i.e.
volume of the b-LOOKOUT is bounded below by a fraction a
of the total volume, meaning that c(b(V )) am(R‘ (V )), or, ds (u, v) hd(u, v) for all u, v 2 F i ð28Þ
in terms of the integral probability,
where ds(u,v) is the length of the shortest non-colliding
Z
c(b(V )) path between u and v, and d() denotes the ‘‘bird-flight’’ dis-
P(u0 2 b(V )) = p(u0 )m(du0 ) = a ð27Þ tance between two points in X .
b(V ) m(R‘ (V ))
Definition 4 characterizes the space in terms of how
which completes the proof. much more than a bird-flight distance one has to travel to
For analytical purposes, we use Algorithm 1, but substi- get from any point to any other within a single connected
tute, in place of the CHOOSEBESTSAMPLE function, the component of the free space. Naturally, one could trivially
‘‘Voronoi-sample’’ strategy described in Lemma 2. Because refine the definition to a single connected component F 0
the sampling strategy is the only relevant difference com- by saying that F 0 is h-entangled, or to a single-query prob-
pared to the idealized EST algorithm, we are able to re-use lem by considering only a single pair of points. The key
Theorem 4.3 from Hsu (2000) in its entirety by using here is that the h value gives a bound on how difficult the
Lemma 3 to arrive at the same probabilistic convergence problem can be compared with a trivial straight-line
rate, under the same assumptions. solution.
Theorem 4. Let g . 0 be the volume of the goal region in Corollary 5. Given a pair of points (uinit , ugoal ) 2 F 0 where
X and g be a constant in (0,1]. A sequence V of n nodes F 0 is an h-entangled connected component of F X for
generated by Algorithm 1 with the sample uexp chosen by some finite value h 1, let g be a constant in (0,1]. If we
the strategy described in Lemma 3 contains a node in the have a sequence V of n nodes generated by Algorithm 1 with
goal region with probability at least $1 - \gamma$, if $n \geq (k/\alpha)\ln(2k/\gamma) + (2/\gamma)\ln(2/\gamma)$, where $k = (1/\beta)\ln(2/\gamma)$.

Proof. The proof for this theorem is identical to the proof of Theorem 4.3 of Hsu (2000), since Lemma 3 establishes that the probability of sampling a point in the lookout of $R_\ell(V)$ is at least $\alpha$, as is required by the ideal sampler in that proof.

Theorem 4 is a significant result, even beyond the algorithms proposed in this paper. The sampling strategy adopted in the analysis can be related to many existing sampling strategies and density metrics. Most notably, the RRT algorithms implicitly generate samples in regions where the Voronoi cells are the largest by volume and, therefore, the bound on the convergence probability in Theorem 4 applies to the RRT-style algorithms as well, which may prove more convenient than existing bounds (LaValle and Kuffner, 2001). Moreover, many algorithms based on expansion or random walks, such as kinodynamic motion planners, rely on density heuristics (Hsu et al., 1999), and Lemma 2 implies that the sampling strategy adopted here should be the target value that density heuristics should approximate or be compared with.

2.4.2 Sampling heuristic. So far, we have ignored the presence of the heuristic function that prioritizes the search. We now look at the effect it has on the convergence of the algorithm, but first, we must characterize the difficulty of the problem with respect to the heuristic and the sought solution path.

the following modified expansion probability:

$$P'(u) = \frac{1}{f(u)} \frac{\mu(V_\ell(u))}{\mu(R_\ell(V))} \tag{29}$$

where $f(u) \equiv d_s(u_{init}, u) + d(u, u_{goal})$, then $V$ will contain a node in the goal region $R_\ell(u_{goal})$ with probability at least $1 - \gamma$, if $n \geq (k\eta/\alpha)\ln(2k/\gamma) + (2/\gamma)\ln(2/\gamma)$, where $k = (1/\beta)\ln(2/\gamma)$.

Proof. By definition of $\eta$-entanglement, the heuristic $f(u)$ is bounded as

$$f(u) \leq \eta\, d_{max}, \qquad d_{max} \equiv \max_{(u,v) \in F'^2} d(u, v) \tag{30}$$

where the maximum distance value $d_{max}$ is a constant for a given connected component $F'$. And because $d_{max}$ is constant, we can use it to multiply $P'(u)$ without affecting the distribution, giving

$$P''(u) = P'(u)\, d_{max} = \frac{d_{max}}{f(u)} \frac{\mu(V_\ell(u))}{\mu(R_\ell(V))} \geq \frac{1}{\eta} \frac{\mu(V_\ell(u))}{\mu(R_\ell(V))} \tag{31}$$

which implies, from Lemma 3, that for a sample $u'$, we have a bounded probability that it lies in the lookout of $R_\ell(V)$:

$$P(u' \in \beta(V)) \geq \frac{\alpha}{\eta} \tag{32}$$

and from this point on, the probability bound $\alpha/\eta$ replaces the probability $\alpha$ in Theorem 4, reaching the probability bound stated, and completing the proof.
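To get a feel for the sample-count bounds of Theorem 4 and Corollary 5, they can be evaluated numerically. The following is a minimal Python sketch; the function name `samples_needed` and the parameter values in the usage lines are illustrative assumptions, not taken from the paper.

```python
import math


def samples_needed(alpha, beta, gamma, eta=1.0):
    """Smallest integer n satisfying
    n >= (k*eta/alpha) * ln(2k/gamma) + (2/gamma) * ln(2/gamma),
    with k = (1/beta) * ln(2/gamma).

    eta = 1 gives the bound of Theorem 4; eta > 1 gives the
    entanglement-weighted bound of Corollary 5.
    """
    if not (0.0 < alpha <= 1.0 and 0.0 < beta <= 1.0):
        raise ValueError("alpha and beta must lie in (0, 1]")
    if not (0.0 < gamma < 1.0):
        raise ValueError("gamma must lie in (0, 1)")
    if eta <= 0.0:
        raise ValueError("eta must be positive")
    k = math.log(2.0 / gamma) / beta
    n = (k * eta / alpha) * math.log(2.0 * k / gamma) \
        + (2.0 / gamma) * math.log(2.0 / gamma)
    return math.ceil(n)


# Hypothetical expansiveness parameters, chosen only for illustration.
n_uniform = samples_needed(alpha=0.1, beta=0.1, gamma=0.05)
n_entangled = samples_needed(alpha=0.1, beta=0.1, gamma=0.05, eta=2.0)
```

Only the first term of the bound scales with $\eta$, so doubling the entanglement less than doubles the required number of samples.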
Downloaded from ijr.sagepub.com by guest on October 17, 2015
1690 The International Journal of Robotics Research 33(13)
The value of $f(u)$ distorts the sampling probability distribution in a fashion that is concentric around the straight-line path from $u_{init}$ to $u_{goal}$, which will lower the worst-case probability of sampling in the $\beta$-LOOKOUT of $R_\ell(V)$ if the lookout region is far from that central axis. Nevertheless, as Corollary 5 states, the entanglement of the space or the problem worsens the probabilistic convergence in a $(\alpha, \beta)$-expansive space by a factor $1/\eta$ on the probability $\alpha$.

At this point, a salient question might arise: if the heuristic worsens the convergence rate, why use it? To that, we must remind the reader that the heuristic may worsen the worst-case convergence rate, but will increase the average convergence rate in problems with low entanglement. A formal analysis of the average case is beyond the scope of this article as it would require far more elaborate constructs, but practical experience and formal analysis on discrete search algorithms, such as A*, support this conjecture.

2.4.3 Practical implications. As we emerge from the formal analysis, it is important to outline some of the key practical implications of the analysis and how they carry over to a practical algorithm.

First, the sampling strategy adopted in the analysis relies on the volume of local Voronoi regions around the points of the motion-graph. Clearly, computing those volumes is prohibitively hard and cannot be used in a practical algorithm. However, it is clear that the probability $P(\text{NEW}, \text{FREE} \mid u)$ is a direct analog of those Voronoi volumes. Therefore, to achieve an unbiased sampling within $R_\ell(V)$, the objective should be to approximate either the volume of the local Voronoi regions or, equivalently, the probability $P(\text{NEW}, \text{FREE} \mid u)$. Intuition also seems to indicate that if we wish to increase the convergence rate, we should use a sampling strategy that is even more biased towards the larger Voronoi volumes, or least explored regions, i.e. inducing a Voronoi bias, as is the hallmark of the RRT sampling strategy.

Furthermore, the sampling strategy of Equation (12) is clearly not identical to the probabilistic sampling adopted in Corollary 5. However, given that each expansion will decrease the $P(\text{NEW}, \text{FREE} \mid u)$ value of its neighbors, there is a natural convection occurring in the ranking of the nodes by their expected value $r(u)$ which provides a similar probabilistic effect at a much cheaper computational cost. And as was previously noted, the introduction of further bias towards the larger local Voronoi volumes is only expected to improve the convergence by increasing the chance of sampling in the lookout of $R_\ell(V)$, and thus, adopting Equation (12) is likely to be beneficial, rather than detrimental as compared to the idealized sampling strategy.

Finally, Corollary 5 assumes, in the definition of the value function $f(u)$, that the optimal path distance $d_s(u_{init}, u)$ is available. Computing this value is as hard as solving the single-query problem itself and, therefore, only an approximation can be used in practice. An approximation of this value can be accumulated in the nodes of the motion graph in the same way as in discrete search algorithms. Evidently, this approximation will necessarily over-estimate the actual optimal path distance, and an asymptotic approach of the actual value hinges on the fact that, as the motion-graph grows, its edges can capture the optimal path. Therefore, adopting a connection strategy that will capture the optimal path is critical, but, fortunately, such proven connection strategies have already been developed as part of the RRG and RRT* algorithms (Karaman and Frazzoli, 2011).

In that vein, observing the proofs of asymptotic optimality for RRG/RRT* (Karaman and Frazzoli, 2011), there are clear parallels between the sequences of connected balls and the linking sequences used to prove probabilistic completeness both here and for EST-like algorithms (Hsu, 2000). Moreover, those proofs assume a uniform sampling in the reachable free-space, which we establish in Section 2.4.1 and modify with a bias in Section 2.4.2 with a bounded distortion on the sampling. Together, these observations make a strong argument to support the proposition that the idealized sampling-based A* algorithm is also asymptotically optimal when it employs the optimizing connection strategy of either the RRG or RRT* algorithms.

3. Practical sampling-based A* algorithms
In this section, we present a complete and detailed description of practical SBA* algorithms, moving away from the general and idealized discussions of the previous section. First, we present a predictive model that can approximate the node values $r(u)$ in a convenient and computationally efficient way. Second, we define a basic, concrete form of the SBA* algorithm. Finally, we present a number of important improvements to this basic algorithm.

3.1 Predictive model for node values
In the rich field of information theory, there are many tools that can help us construct a predictive statistical model for the expected value $r(u)$, but probably the most widely used is the Kullback–Leibler (KL) divergence (Kullback and Leibler, 1951), which is especially useful here due to its straight-forward relationship to the concept of surprisal. Given a sample drawn from the configuration-space neighborhood of node $u$, we want to know what is the probability that the sample is surprising (i.e. NEW) with respect to the current set of nodes in the neighborhood of $u$.

To represent the configuration-space neighborhood $N(u)$, we use $S_u(x)$ as a probability distribution reflecting the sampling region, i.e. $S_u(x)$ gives the probability of the configuration $x$ in the neighborhood of $u$. Let $N_u$ be the set of neighboring samples within the reachable region of $u$, that are currently part of the motion graph or that were once attempted to be added to it (and failed due to a collision or to the connection strategy). Then, we can define $S_{N,u}(x)$ as the probability distribution reflecting the sampling of nodes
Persson and Sharf 1691
from the sampling regions around the nodes in $N_u$, instead of the neighborhood of $u$ directly. This allows us to formulate the probability of a surprisal around node $u$ as

If we use the same standard deviation $\sigma$ (defining the size of the sampling region), the expression further reduces to the following:

and assuming that during iteration $k$, we add a new node $v$ in the neighborhood of $u$, we can accumulate the density value as

$$d_{k+1}(u) = (1 - S_u(p(v)))\, d_k(u) + S_u(p(v))\, e^{-D_{KL}(S_u \| S_v)} \tag{42}$$

where $d_k(u)$ is the density value of $u$ before the addition of node $v$, and $d_{k+1}(u)$ is the density value after adding node $v$. This incremental calculation is extremely convenient from a computational point of view since traversals of neighborhoods of nodes in a graph are relatively expensive operations, even with state-of-the-art cache-optimized data structures, and minimizing their occurrences in the iterations of an algorithm can be of great benefit.

We can proceed in a similar fashion to obtain the probability of sampling an unreachable node in the neighborhood of $u$. This time, we define the constriction value $c(u)$ as follows:

$$c(u) \equiv \sum_{v \in C_u} S_u(p(v))\, e^{-D_{KL}(S_u \| S^c_v)} \tag{43}$$

standard deviation, we must use Equation (36) to calculate their KL divergence. Finally, as before, the constriction value can be incrementally computed as new unreachable nodes are found:

$$c_{k+1}(u) = (1 - S_u(p(v)))\, c_k(u) + S_u(p(v))\, e^{-D_{KL}(S_u \| S^c_v)} \tag{44}$$

At this point, we can come back to the expression for the expected value of a sample from the configuration space neighborhood of $u$, which we defined as $r(u)$ and compute using Equation (10). We get the following expression for $r(u)$,

$$r(u) = \frac{g(u) + h(u)}{(1 - c(u))(1 - d(u))} \tag{45}$$

which depends on four values associated with each node: the accumulated travel distance or cost $g(u)$, the heuristic distance or cost $h(u)$, the constriction value $c(u)$, and the density value $d(u)$.
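The incremental density update of Equation (42) and the key value of Equation (45) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling distributions are taken to be isotropic Gaussians sharing one standard deviation (for which the KL divergence is the squared distance divided by twice the variance), $S_u(x)$ is replaced by an unnormalized Gaussian weight in (0, 1], and the exponent is taken as negative so that each contribution stays in (0, 1]. All function names are hypothetical.

```python
import math


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kl_same_sigma(p_u, p_v, sigma):
    """KL divergence between two isotropic Gaussians centered at p_u and
    p_v that share the standard deviation sigma."""
    return sq_dist(p_u, p_v) / (2.0 * sigma ** 2)


def sampling_weight(p_u, x, sigma):
    """Assumed stand-in for S_u(x): an unnormalized Gaussian weight."""
    return math.exp(-sq_dist(p_u, x) / (2.0 * sigma ** 2))


def update_density(d_k, p_u, p_v, sigma):
    """One incremental step in the form of Equation (42)."""
    s = sampling_weight(p_u, p_v, sigma)
    return (1.0 - s) * d_k + s * math.exp(-kl_same_sigma(p_u, p_v, sigma))


def key_value(g, h, c, d):
    """Key value in the form of Equation (45); infinite once a node is
    saturated (c or d reaching 1), so it sinks to the back of the queue."""
    if c >= 1.0 or d >= 1.0:
        return math.inf
    return (g + h) / ((1.0 - c) * (1.0 - d))
```

Under these assumptions a nearby new node raises the density of `u` noticeably, while a distant one leaves it essentially unchanged, and a rising density inflates the key value so that crowded nodes are expanded less often.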
function SBA*-LOOP(Q, V, E)
    Require: Q is a priority queue ordered by minimum r[u] value.
    Require: V is the list of all vertices of the motion-graph.
    Require: E is the list of all undirected edges of the motion-graph.
    Require: Each vertex u has associated values for position p[u], heuristic h[u],
             accumulated distance g[u], density d[u], constriction c[u], key value r[u] and
             predecessor pred[u].
    Ensure: There are no more useful vertices to explore, or the termination condition
            was met.
    Ensure: The pred[u] values trace out the optimal path from any vertex back to the
            start.
    repeat
        u ← TOP(Q)
        {p_new, success} ← RANDOMWALK(p[u])        ▷ Generate reachable sample near p[u]
        if success then
            {P, S} ← NEARESTNEIGHBORS(p_new, V)    ▷ Get set of neighbors of p_new
            v ← NEWVERTEX(p_new, V)                ▷ Create new vertex at position p_new
            CONNECTPREDECESSORS(v, P, Q, V, E)
            CONNECTSUCCESSORS(v, S, Q, V, E)
        else
            Update d[u] and c[u] using Equations (41) and (43).    ▷ or, record the failure
            REQUEUE(u, Q)
        end if
    until EMPTY(Q) or SHOULDTERMINATE(V, E)
end function
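The TOP and REQUEUE operations in the loop above assume a priority queue whose keys can change after insertion. One common way to obtain this on top of a binary heap is lazy invalidation: a re-queued vertex gets a fresh heap entry and its old entry is marked stale and skipped later. The following Python sketch illustrates that pattern; the class name `KeyedQueue` and its methods are hypothetical and are not the data structure used in the authors' implementation.

```python
import heapq
import itertools


class KeyedQueue:
    """Min-priority queue supporting key updates by lazy invalidation."""

    def __init__(self):
        self._heap = []                   # entries: [key, tie_breaker, vertex]
        self._live = {}                   # vertex -> its current heap entry
        self._count = itertools.count()   # unique tie-breaker, keeps order stable

    def push(self, vertex, key):
        """Insert a vertex, or re-position it if it is already queued."""
        if vertex in self._live:
            self._live[vertex][2] = None  # mark the old entry as stale
        entry = [key, next(self._count), vertex]
        self._live[vertex] = entry
        heapq.heappush(self._heap, entry)

    def top(self):
        """Return (vertex, key) with the smallest key, without removing it."""
        while self._heap and self._heap[0][2] is None:
            heapq.heappop(self._heap)     # discard stale entries
        if not self._heap:
            raise IndexError("top from an empty queue")
        key, _, vertex = self._heap[0]
        return vertex, key

    def remove(self, vertex):
        """Drop a vertex, e.g. once its key exceeds a termination threshold."""
        self._live.pop(vertex)[2] = None

    def __len__(self):
        return len(self._live)


queue = KeyedQueue()
queue.push("a", 3.0)
queue.push("b", 5.0)
u, _ = queue.top()    # vertex with the minimum key value
queue.push(u, 7.5)    # a failed expansion raised r[u]; re-queue with the new key
```

Stale entries cost memory until they surface at the top of the heap, which is usually an acceptable trade for avoiding an explicit decrease-key or increase-key operation.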
ConnectPredecessors and ConnectSuccessors, respectively. In both pseudo-code presentations, many of the implementation details have been omitted for brevity, outlining only the main logic of the algorithm.¹

3.2.1 Search and connection strategies. The main loop simply keeps a priority queue for all of the nodes in the motion-graph with respect to their associated key value r[u], prioritizing the nodes with minimum key values. At
each iteration, the best node is obtained from the queue and a sample is drawn from its neighborhood with an attempt to achieve a collision-free path to that sample: this process is performed by an application-specific RANDOMWALK function. If the sampling and connection were unsuccessful, the failure must be recorded in the cumulative values of the density d[u] and constriction c[u] using Equations (41) and (43), respectively, or using a recursive formula if possible, as per the remarks in the previous section. Then, the node's rank in the priority queue is updated since its key value was changed through the recorded failure. If the sampling and connection were successful, then the new sample can be inserted into the motion graph by first obtaining sets of neighbors (predecessors and successors), then creating a new vertex in the motion graph and, finally, calling Algorithms 3 and 4 to create the connections. The sets of nearest neighbors can have a size (number of neighbors) and range that are either fixed or dictated by an adaptive strategy such as the so-called star neighborhood (Karaman and Frazzoli, 2011).

The loop finishes when there are no more nodes in the queue or when some specific termination condition is reached (checked with the ShouldTerminate function). In non-optimizing sampling-based motion planners, the natural termination condition is reached when a connection is established between the start and goal locations. However, like most asymptotically optimal planners, the SBA* algorithm does not have a natural stopping criterion. In this case, this is manifested by the queue never becoming empty as samples are consistently requeued to it. One option to cause a natural termination is to impose a threshold on the density d[u] or value r[u] above which the samples are no longer requeued, which will eventually exhaust all valuable samples and empty the queue. This termination method introduces additional parameters to tune and it is unclear what repercussions it would have on the convergence of the algorithm. However, simple termination conditions that are often used in sampling-based algorithms can be easily applied here, such as terminating after a certain number of iterations have passed or when a sufficiently good solution was obtained.

In this nominal version of the sampling-based A* algorithm, an exhaustive and eager connection strategy is used, as seen in Algorithms 3 and 4, which is similar to the connection strategy in the RRG algorithm (Karaman and Frazzoli, 2011); a more economical alternative will be presented in Section 3.3.1. The CONNECTPREDECESSORS routine proceeds to connect a new vertex u to its potential predecessors, and, at the same time, it finds the optimal predecessor in that set. Then, the ConnectSuccessors routine makes the successor connections going from the vertex u to other neighboring vertices in the motion graph. Both routines rely on a Steer function that attempts to steer between two points and returns a record of the path $s_{new}$ that links the two points with cost cost($s_{new}$), if successful. In addition, of course, the successors must be recursively traversed to update their accumulated distance g[v] and be re-positioned in the priority queue. The main difference between Algorithms 3 and 4 as compared with the RRG connection routines is the work done to update the density, constriction, and key values, as well as keeping the priority queue consistent with those values.

3.2.2 Implementation remarks. At this point, we make a few general implementation remarks: these will be obvious to any seasoned implementer, but are worth mentioning nevertheless. First, our presentation of the algorithms includes explicit mentions of the points at which the priority queue is updated, which is not customary in standard pseudo-code expositions. We aim to show explicitly where those updates are needed since they can represent an important computational cost in an implementation.

Second, the recursive updates of the accumulated distances, after re-wirings (or successor connections) have been performed, are presented as a non-recursive breadth-first traversal (via the inconsistent set I). Here, we want to make it clear that an implementation through actual recursive calls is out of the question in this type of application given the potentially great depth of any branch of the optimal motion tree.

Third, when the local planner and distance metric are asymmetric, which is quite common in practice, the set of potential nearest predecessors to a given configuration point is different from the set of potential nearest successors from that point. The nearest-neighbor query for a set of predecessors and successors can, in general, be performed in one operation, thereby making fairly substantial savings in computational effort, especially in terms of making good use of cached memory. On the contrary, if the planner and corresponding distance metric are symmetric, i.e. the path and distance from A to B are exactly the same as the path and distance from B to A, then the algorithm must simply use the same set of neighbors for predecessors and successors. Furthermore, in that case, Algorithms 3 and 4 can be combined into a single routine, allowing some beneficial modifications such as the removal of the second set of steering attempts needed when connecting successors.

Finally, as in most sampling-based motion planners (and even planning on a fixed motion graph), the topology of the motion graph is essentially that of a nearest-neighbor graph, and most operations done (re-wirings) are local as well and, thus, it is recommended to choose a storage strategy that reflects this pattern in the memory layout of the nodes, i.e. nodes that are close to each other in configuration space ought to be close to each other in physical memory. There are some appropriate cache-aware or cache-oblivious data structures that can be used for that purpose (Kasheff, 2004; Chowdhury, 2007; Jamriska et al., 2012). Moreover, an important computational cost in all sampling-based motion planners is the nearest-neighbor queries performed at each iteration. It is thus important, for the sake of performance, to have an effective space-partitioning tree to resolve those queries in O(log(N)) time (Fu et al., 2000), which often implies a
cache-friendly data structure to keep the performance from degenerating towards O(N) time due to cache thrashing.

3.3 Refining the sampling-based A*
Thus far, we have presented a very general version of the sampling-based A* algorithm, but it has many shortcomings which can be rectified, especially when simplifying assumptions can be made. In this section, we present a number of improvements to the general algorithm with the aim of mitigating those problems. First, because the general version uses an RRG connection strategy, an obvious improvement is to adopt an RRT* connection strategy, i.e. pruning sub-optimal edges and evaluating collisions in a lazy fashion, that is, under the assumption that full-neighborhood connectivity is not required for the density computations. Second, one shortcoming of the SBA* algorithm is that it is biased towards improving current optimal paths (i.e. exploitation) and not towards finding a connection to the goal region. Given that this shortcoming is also an issue with the A* algorithm, we employ the same solution that is employed in that domain, i.e. the Anytime A* heuristic (Likhachev et al., 2003) is used to drive the growth of the motion-graph more rapidly towards the goal. Finally, a fundamental issue with the exploitation bias is the lack of exploration of the unknown regions of the configuration space, which could potentially yield better paths. To solve this problem, we employ a simulated annealing (Ingber, 1996) approach that balances the SBA* algorithm with a purely exploratory algorithm, that is, RRT*.

3.3.1 Lazy and pruned connections. In this section, we present an alternative to the connection strategy presented in Algorithms 3 and 4 which prunes away sub-optimal edges and delays collision-checking to the point of creation of a new optimal edge. This alternative strategy is certainly more economical in terms of computational time and memory required to store the motion graph, which now becomes a motion tree due to the pruning of sub-optimal edges. However, it is important to note the assumptions that must be made and a trade-off involved in choosing this strategy.

The first necessary assumption is that the distance metric must reflect the actual cost of travel between two nodes (if a collision-free path exists); this assumption is necessary to use the lazy collision checking strategy. In the general SBA* algorithm, a steering routine is always called before the distance value is required, i.e. eagerly and, thus, the general form should be used when a simple and relatively inexpensive distance metric cannot accurately reflect the steering cost-to-go.

It is also necessary to assume that recursive (or incremental) formulations for the density and constriction values are available for the given problem. If we prune sub-optimal edges from the motion-graph, then it implies that nodes will only be connected to their "optimal neighbors", which means that re-computing the density or constriction values by surveying the neighborhood will not accurately reflect the true neighborhood of the node, unless a nearest-neighbor search is conducted, which would be prohibitively expensive relative to keeping a fully connected motion graph. Hence, we recommend the general form if a survey of the neighborhood is required to update the density or constriction values associated with the nodes.

Finally, there is a trade-off involved in checking collisions lazily in the SBA* algorithm. Because the SBA* heuristic uses a constriction value to reflect the likelihood of sampling colliding points in the neighborhood of a point, if collision is only checked for potentially optimal edges of the motion graph, then the constriction will be underestimated in general since it will not capture colliding paths in sub-optimal directions. Fortunately, this does not seem to have a significant impact on the algorithm, and in fact, the SBA* algorithm can still work even without a constriction value. In our opinion, this trade-off is reasonable given the potential benefit that lazy collision checking can have on overall performance.

Mostly for the sake of completeness, we present the connection strategy in the form of Algorithms 5 and 6. These constitute rather straight-forward modifications to the general connection algorithms. Mainly, the distance metric is first computed and tested for giving rise to an optimal edge and, if so, non-colliding steering is attempted and, if successful, a new edge is created to replace the existing incident edge. What is noteworthy, however, are the updates to the density and constriction values. Whether a connection is useful or not (optimal), and whether a connection is possible or not (collision-free), the density value must still be updated to reflect that a new neighboring point has been added to the motion graph and, thus, the update is performed for all potential predecessors and successors. As per the aforementioned trade-off, the constriction value can only be updated once a collision-free travel was attempted and failed, as seen in Algorithms 5 and 6. Clearly, the astute reader will notice that this connection strategy is in effect the same as that used in the classic RRT* algorithm.

3.3.2 Anytime A* as goal bias. As is well known, the A* algorithm is driven to fully explore the regions around the optimal path, but not towards actually finding a feasible path as fast as possible. The SBA* algorithm will behave in a similar fashion, generating samples around the optimal regions to try and enrich those areas (up to a certain density), but there is no bias towards actually reaching the goal region.

In the domain of discrete path-planning, this problem can be solved with the so-called Anytime A* algorithm (Likhachev et al., 2003). The idea with this method is to artificially inflate the heuristic value such that priority for choosing nodes is biased towards those nodes that have a smaller heuristic value, i.e. are closer to the goal region. This method rapidly finds a feasible solution, and as the heuristic values are deflated back to their true values, the
product of the algorithm is brought back to being the optimal path, thus, giving it an anytime quality which is useful in many applications.

For the SBA* algorithm, this anytime strategy could prove to be very useful as a bias towards establishing a connection to the goal region. The problem statement is as follows: we seek an optimal path that connects to the goal, that is, we are looking for a solution to the following problem:

$$\min_{u} \left( g(u) + h(u) \right) \quad \text{with} \quad h(u) = 0 \tag{46}$$

which, in the classic A* heuristic, does not include the equality constraint because an exhaustive search is
guaranteed to find the node that satisfies the constraint if one exists. If we hope to find a solution that satisfies the equality constraint, without requiring an exhaustive search, we must put emphasis on satisfying the constraint, as is usually done in constrained optimization algorithms. For this purpose, we can introduce a Lagrange multiplier $\lambda \in [0, \infty[$:

$$m(u) \equiv g(u) + h(u) + \lambda h(u) \tag{47}$$

The effect of a large value for the Lagrange multiplier is to drive the search more greedily towards satisfying the constraint, i.e. establishing a connection to the goal region. Once the connection has been established, the Lagrange multiplier can be relaxed down to 0. However, given that a single connection to the goal is probably not useful enough in practice, the relaxation must be progressive.

By going through the same derivation as in Section 3.1, one can obtain the following expected value for a given node $u$:

$$r_{any}(u) \equiv E_{N(u)}[m(u)] = \frac{g(u) + h(u)}{(1 - c(u))(1 - d(u))} + \lambda (h(u) - h_b) \tag{48}$$

where $h_b$ represents an optional goal bias in the sampling strategy which measures how much closer to the goal a new sample is expected to be compared with node $u$. The above heuristic formula illustrates how inducing a constant bias towards the goal, as is often done in uni-directional sampling-based planning algorithms, may not have a significant impact on the prioritization of nodes. We also caution the reader that a naive application of a goal bias, such as repeatedly attempting to expand directly towards the goal, could have negative impacts on the entropy of the overall search. For the remainder of this paper, we will assume no bias, i.e. $h_b = 0$.

3.3.3 Balancing RRT* and SBA* with simulated annealing. The final improvement to the SBA* algorithm is aimed at counter-balancing its exploitation bias with an exploratory method. The key issue here is in generating new nodes for the motion graph that are driven towards unexplored regions of the configuration space. This is commonly referred to as the Voronoi bias and is the central element of the RRT family of methods (Kuffner and LaValle, 2000). In RRT methods, new nodes are generated by taking a random sample from the configuration space, finding its nearest neighbor in the current motion graph, and expanding that node towards the random sample. This method can be used as a drop-in replacement for the expansion method of the SBA* algorithm, which would essentially yield the RRG or RRT* algorithm, if using the full connection strategy or the pruned connection strategy, respectively. It thus follows that this node generation method can serve to easily introduce exploration into the SBA* algorithm.

The only problem remaining now is to strike a balance between nodes generated with the SBA* strategy and with the RRT* strategy, i.e. between exploitation and exploration. A classic technique used for this purpose in the context of general optimization problems is simulated annealing (Ingber, 1996). We propose to employ this strategy in the context of motion planning, specifically as a means of balancing the two sampling strategies. The choice between exploitation and exploration is determined randomly at each iteration with a probability driven by a temperature value which cools as the algorithm progresses. Initially, at high temperatures, there is a greater chance of choosing exploration, and as cooling takes effect, the focus is shifted towards exploitation.

The classic cooling formula (Ingber, 1996) used in simulated annealing methods yields the following probability of choosing an exploitation step:

$$e_{th} = 1 - e^{-T_0 / \log(N)} \tag{49}$$

where $T_0$ is the initial temperature and $N$ is the number of iterations performed. There are, of course, other alternative formulations to drive the cooling, but the above is both theoretically optimal under certain assumptions and works well in practice.

Algorithm 7 shows the complete SBA* algorithm with simulated annealing to schedule the generation of RRT-style nodes. As one can see, it simply chooses to generate RRT nodes or SBA* nodes randomly depending on the simulated annealing schedule, and uses the same connection strategy in either case. This algorithm combines the best of both sampling-based algorithms. The RRT* algorithm has the advantage of rapidly exploring the space, but generating enough nodes to sufficiently refine the solution is often prohibitively expensive. On the other hand, the basic SBA* strategy has the opposite problem: it can leave many regions of the space unexplored, thus missing potential optimal solutions. Through the smooth transition from exploration to exploitation, we can retain the essential benefits of the RRT* algorithm while being able to refine the optimal paths more effectively.

4. Simulation results
In this section, we wish to present as complete a picture as possible to characterize the SBA* algorithm empirically. To this end, we first present a set of results on a simple two-dimensional point robot such that visual representations of the resulting motion graphs are clear. Moreover, we present statistical analysis for environments representative of the intended application and also a brief qualitative assessment of more difficult scenarios. Then, to better understand the behavior of the SBA* algorithm with respect to dimensionality, we plan paths through empty spaces (i.e. obstacle-free) of increasing dimensionality. Finally, we apply the proposed method to a real-world example with a 7-dof manipulator performing a static interception task through a
lightly cluttered environment and analyse its performance in Monte Carlo runs.

In all cases, different variants of the proposed SBA* algorithm will be compared with each other and with the RRT* algorithm. The RRT* algorithm is one of the best performing algorithms for these types of scenarios, i.e. static environments, no kinodynamic constraints, and assuming uni-directional planning. We omit comparison with the RRT algorithm as it shares the same sampling method as RRT* and is otherwise superseded by the RRT* optimal connection strategy.

It is important to note that all of the algorithms presented were tested with the same software written by the same implementer, only changing the core algorithmic loop. The implementation is in C++ and is available open-source under GPLv3 as part of the ReaK library.² Nearest-neighbor queries are performed, in all cases, using a dynamic vantage-point tree (DVP-tree) (Fu et al., 2000) implementation which is capable of efficient queries for any distance metric (including asymmetric metrics). In all cases, the vertices of the motion graph are stored within the DVP-tree's data storage, which, in turn, is a quaternary tree laid out in breadth-first order in contiguous memory, providing better locality for efficient use of cache memory.

4.1 Two-dimensional point-robot environments
As a first set of tests, we present a number of results on a simple two-dimensional environment represented by a black-and-white image whose white pixels represent free areas and black pixels represent obstacles. This simple environment is especially useful for presentation of the shape and evolution of the motion graph as generated by the different path-planning algorithms. The comparisons mainly involve the RRT* algorithm as a reference point, the SBA* algorithm and the SBA* algorithm with simulated annealing (SA-SBA*; as per Section 3.3.3). For both SBA* variants, the lazy connection strategy and the anytime heuristics are used, as per Sections 3.3.1 and 3.3.2, respectively. In all experiments, the initial relaxation factor for the anytime heuristic was 5.0, and the initial temperature for the simulated annealing schedule was 2.0, which results in approximately 500 nodes generated by the RRT-style mechanism during the entire initial high-entropy phase of the SA-SBA* algorithm.

The distance metric used here is the Euclidean distance between points, measured in pixels, and the distance is also used as the travel cost along edges. For each sample $u$, with associated position $p(u) \equiv (x, y)$, the heuristic value is computed as

$$h(u) \equiv \| p_{goal} - p(u) \|_2 \tag{50}$$

where $p_{goal}$ is the position of the goal pixel and $\| \cdot \|_2$ denotes the Euclidean norm.

4.1.1 Single runs. In this section, we present a qualitative assessment of the SBA* algorithms by presenting the motion-graphs resulting from single runs of the different variants of SBA* and RRT* on example environments. The idea is to illustrate, compare and contrast the
Downloaded from ijr.sagepub.com by guest on October 17, 2015
Persson and Sharf 1699
Figure 1. Motion-graphs generated on moderately cluttered environment: (a) SBA* at 500; (b) SBA* at 1000; (c) SBA* at 1500; (d)
SA-SBA* at 500; (e) SA-SBA* at 1000; (f) SA-SBA* at 1500.
characteristics of the different algorithms. More quantitative assessments with Monte Carlo runs follow in the next section.
Figure 1 shows a comparison of the evolution of the SBA* algorithm versus the SA-SBA* algorithm for a moderately cluttered environment with the start position at the bottom-left corner and the goal location at the upper-right corner of the image. As one can see, the anytime SBA* algorithm evolves by pushing strongly for an expansion towards the goal region, but as the area towards the goal region gets increasingly dense, more nodes are generated back towards the starting area to obtain a more uniform density of nodes all along the region around the optimal path. As mentioned earlier, this greedy push to make a connection to the goal region makes the algorithm susceptible to local minima and variance in the results. When exploration through RRT* is scheduled with simulated annealing, one can see from the bottom row in Figure 1 that the initial growth is much more reminiscent of the RRT* algorithm, as expected, but it rapidly adopts an SBA* strategy which helps populate the optimal path region while leaving the far-reaching regions virtually unaffected. This latter aspect is very desirable because continuing with RRT* in the hope of refining the optimal path wastes a lot of computational capital on far-reaching, sub-optimal regions of the configuration space.
Continuing on the topic of using RRT* entirely, we present Figure 2 which compares the RRT* algorithm and the SA-SBA* algorithm, both with and without branch-and-bound pruning, after an equal number of iterations (1500) on a highly cluttered environment. From Figure 2(a), one can see the aforementioned issue about continuing to generate an RRT* tree in the hope of refining the solution, that is, the optimal path does not improve significantly since the additional computational capital is spread over the entire motion graph. A standard method to mitigate this problem is to use branch-and-bound pruning, which is the simple act of eliminating nodes from the motion graph that cannot possibly be part of a path that would be shorter than the current best solution. Figure 2(b) shows the result of 1500 iterations of RRT* with branch-and-bound pruning. In this particular run, the final motion graph contains about 400 nodes, including about 50 nodes added since the first pruning pass (after the first solution was found). This highlights a problem with RRT* and branch-and-bound: the pruning strategy wants to limit the search to the optimal region while the sampling strategy is biased towards unexplored regions. This conflict results in a lot of iterations with very little progress in the number of nodes or in total cost of the optimal path, because the overlap between the unexplored areas of the configuration space and the areas that can pass the pruning criteria is very small, leading, in this particular case, to a new-node-to-iteration ratio of less than 10%. This is a considerable waste of effort since rejected samples must still be added to the tree before being rejected, and thus, the computational expense is per iteration, not per successfully added node.
By contrast, the SA-SBA* algorithm is much more graceful in the solution refinement phase. As seen from Figure 2(c), the overall continuation of the SA-SBA*
1700 The International Journal of Robotics Research 33(13)
Figure 2. Motion-graphs generated after 1500 iterations on a highly cluttered environment: (a) RRT*; (b) RRT* with BnB; (c) SA-SBA*; (d) SA-SBA* with BnB.
algorithm past the initial solution is much more effective at creating a richer motion-graph along the optimal path, while leaving uninteresting regions unaffected. This generally leads to more progress towards refining the current solution, and thus, an earlier termination of the overall algorithm. With branch-and-bound pruning, an additional benefit of the SA-SBA* algorithm comes to light. As seen from Figure 2(d), when branch-and-bound pruning is applied to the motion-graph, the resulting motion-graph has, in fact, about twice the number of nodes (about 750 in this case) and is much denser around the optimal path. The key difference here is that the pruning criteria and the sampling strategy are in accord with each other, leading, in this particular case, to a new-node-to-iteration ratio above 50%.
4.1.2 Monte Carlo runs. In this section, we compare the three main algorithms, RRT*, SBA* and SA-SBA*, with Monte Carlo runs, which carry more statistical significance. There are three variables we are interested in: the quality of the solutions, the termination rates, and the execution time. For the purpose of exposition, we have gathered the results of 100 Monte Carlo runs on the moderately cluttered environment (see Figure 1) and on the highly cluttered environment (see Figure 2) as a representative set of results. Many more tests have been performed, always showing comparable results.
We begin by assessing the quality of the solutions obtained from the different algorithms. To this end, Figures 3(a) and 3(b) show the average solutions found among 100 runs of the 3 algorithms. The total travel distance is measured in pixels of the image, with the absolute optimal solution being about 301 pixels. The main observation we can make here is that the SBA* and SA-SBA* algorithms significantly out-perform the RRT* algorithm on early solutions (below 1000 nodes). The early solutions produced by the SBA* algorithm are the best of the three methods tested; however, the reliability of the SBA* algorithm is only comparable to or slightly better than the RRT* algorithm (see Figure 4) and it is rather slow at improving the solutions in the later phase. This suggests that the exploitative (or greedy) nature of the algorithm causes it to more often miss the global optimum in favor of alternative sub-optimal paths (i.e. local minima). On the other hand, the SA-SBA* appears to find very good early solutions, with more reliability than either the RRT* or SBA* algorithms, and exhibits an asymptotic behavior that closely matches that of the RRT* algorithm.
To further assess the reliability of the three methods, Figures 4(a) and 4(b) show the number of planners, out of 100, that have found at least one solution after a given number of iterations. What is clear from those results is that the SA-SBA* algorithm is very reliable at finding an initial solution quickly, with a median arrival at about 250 iterations and a success probability above 99% after 600 iterations for both the moderately and highly cluttered environments. By contrast, those figures are more than doubled for the RRT* algorithm. As expected, the SBA* algorithm becomes less reliable as the cluttering in the environment is increased, due to the higher likelihood of
[Plots: Comparison of RRT* and Sampling-based A* Algorithms. (a) Average Solution Distances on Moderately Cluttered 2D Env.; (b) Average Solution Distances on Highly Cluttered 2D Env. Vertical axis: Average Solution Distance after N iterations (pixels), 300 to 400; horizontal axis: Number of Nodes, 0 to 3000; curves: RRT*, SA-SBA*, SBA*.]
Figure 3. Monte Carlo results comparing RRT*, SBA* and SA-SBA* on (a) moderately and (b) highly cluttered environments, showing the average cost of the solutions found by 100 planners plotted against the number of iterations.
[Plots: Comparison of RRT* and Sampling-based A* Algorithms. (a) Cumulative Success Probability on Moderately Cluttered 2D Env.; (b) Cumulative Success Probability on Highly Cluttered 2D Env. Vertical axis: Success Probability after N iterations (%), 0 to 100; horizontal axis: Number of Nodes, 0 to 2000; curves: RRT*, SBA*, SA-SBA*.]
Figure 4. Monte Carlo results comparing RRT*, SBA* and SA-SBA* on (a) moderately and (b) highly cluttered environments,
showing the percentage of successful planners versus the number of iterations.
being led astray into dead-ends or unfruitful local minima. The most interesting phenomenon observed here is that the
One would expect that since the SA-SBA* algorithm is part RRT* and part SBA*, under a simulated annealing
combination that performs better than the sum of its parts.
[Plot: Comparison of RRT* and Sampling-based A* Algorithms. Execution-time per iteration on Highly Cluttered 2D Env.]
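The cumulative success curves of Figure 4 can be reproduced from per-run records of the iteration at which each planner found its first solution. The following is a minimal sketch of that bookkeeping, not the authors' tooling: the function names are hypothetical, the sample data are invented for illustration and are not the paper's results, and the median is assumed to be taken over the successful runs only.

```python
from statistics import median

def cumulative_success(first_solution_iters, checkpoints):
    """Percentage of planners that have a solution by each checkpoint.

    first_solution_iters holds, per run, the iteration of the first
    solution, or None if the run never found one.
    """
    total = len(first_solution_iters)
    curve = []
    for n in checkpoints:
        solved = sum(1 for it in first_solution_iters
                     if it is not None and it <= n)
        curve.append(100.0 * solved / total)
    return curve

def median_arrival(first_solution_iters):
    """Median first-solution iteration over the runs that succeeded
    (an assumption; returns None when no run succeeded)."""
    solved = [it for it in first_solution_iters if it is not None]
    return median(solved) if solved else None

# Hypothetical records for 10 runs (illustrative only, not measured data).
runs = [120, 180, 210, 240, 250, 260, 300, 420, 580, None]
curve = cumulative_success(runs, [200, 400, 600])
arrival = median_arrival(runs)
```

Plotting `curve` against the checkpoints for each algorithm gives one line of a Figure 4 style panel.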
Figure 6. Motion-graphs generated on ‘‘indoor’’ environments such as a bug-trap, an office space, and a symmetric world, all
captured at the moment when start-goal connectivity is achieved: (a) SBA* at 1600; (b) SA-SBA* at 1600; (c) SA-SBA* at 2100; (d)
SA-SBA* at 3000.
whenever it is exhausted, leading to those spikes, which amortize to a constant run-time cost and are thus negligible artifacts from an analytical point of view. The second observation is that in all cases, the execution time per iteration is a logarithmic function of the number of nodes in the motion graph. This is the cost incurred by nearest-neighbor queries performed during each iteration. Thus, as expected, the empirical time complexity of each algorithm is O(N log N) (for N nodes). Finally, we observe that the computational cost of each algorithm is essentially the same, with the SBA* algorithms being marginally better, which is attributable to the fact that only one nearest-neighbor query is required for an SBA* iteration, as opposed to two for an RRT* iteration (one for the expansion and one to find the connection neighborhood).
4.1.3 Indoor environments. Although it is not the intended application, it is interesting to take a brief look at the behavior of the SBA* algorithms in “indoor” environments. By this, we mean typical office environments, which are characterized by poor connectivity (in the sense of ε-goodness or (α, β)-expansiveness), narrow passages and entangled solutions (in the sense of η-entanglement). As is clear from the convergence rates derived in Theorems 4 and 5, these factors have a direct effect on the performance of the idealized SBA* algorithm. It is thus expected that the SBA* algorithms tested will be adversely affected by these difficulties as well.
Figure 6 shows a number of examples of the SBA* and SA-SBA* motion graphs on a number of these difficult spaces, such as the classic “bug-trap” environment, a simple office space, and a classic symmetric environment riddled with narrow corridors and poor connectivity. The numbers of nodes displayed in Figure 6 correspond to the moment when start-goal connectivity is achieved, at 1600, 2100, and 3000, for each environment. The RRT* solved the same problems with 1000, 800, and 1000 nodes, respectively. Even without statistical analysis, it is clear that the number of nodes required for a solution is quite substantial in comparison with an exploratory algorithm such as RRT*, as expected. The algorithms nevertheless make progress and succeed in solving the problems. Another qualitative observation in these scenarios is that narrow passages appear to be less of an obstacle to the planners than the entanglement of the solution paths, which is, again, expected from the direct effect that entanglement has on the probability of progress by the expansions. By rough inspection of the problems used here, we could estimate that the η value for the entanglement of these problems ranges between 1.5 and 3. Overall, these results confirm the idea that the SBA* algorithms are more appropriate for problems that feature low entanglement of the solution paths.
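The O(N log N) total quoted in the execution-time discussion above follows from summing a logarithmic per-iteration cost over the growth of the motion graph. As a short worked derivation, assume that the iteration performed when the graph holds n nodes costs at most c log n, where the constant c is introduced here purely for illustration:

```latex
\[
T(N) \;\le\; \sum_{n=1}^{N} c \log n \;=\; c \log (N!) \;\le\; c\, N \log N ,
\qquad
\log (N!) \;=\; N \log N - N + O(\log N) \quad \text{(Stirling's approximation)} .
\]
```

The Stirling estimate shows that the bound is tight up to lower-order terms, so a per-iteration cost that grows as log n accumulates to N log N growth overall, while the occasional buffer-growth spikes only add an amortized constant per iteration.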
[Plot: Comparison of RRT* and Sampling-based A* Algorithms. Number of Nodes to First Solution on High-dimensional Spaces. Vertical axis: Number of Nodes to First Solution (logarithmic, 10^2 to 10^4); horizontal axis: Number of Dimensions, 4 to 20; curves: RRT*, SA-SBA*, SBA*.]
Figure 7. Monte Carlo results comparing RRT*, SBA*, and SA-SBA* on high-dimensional empty spaces, showing the mean number of vertices required to find the first solution versus the number of dimensions of the space.
4.2 High-dimensional empty spaces
Another interesting aspect to investigate is the behavior of the algorithm as the dimensionality of the problem is increased. In this section, we want to shed some light on the effect of dimensionality alone and, thus, we use “empty” spaces as the configuration space, i.e. a high-dimensional Euclidean space (from 3 to 20 dimensions) with no obstacles, bounded by a unit hyper-cube and with a start and goal point on opposite corners of the hyper-cube. The heuristic function is, again, the Euclidean norm between the sample and the goal position, as in Equation (50). We limit the size of any segment (edge) connecting two configuration points to a length of $0.2\sqrt{N}$, where N is the number of dimensions in the space, i.e. there is always a minimum of 5 segments required to travel from the starting point to the goal point in a straight line.
Figure 7 shows the mean number of vertices required to reach an initial solution by all three algorithms for 100 runs. As expected, the RRT* algorithm shows an exponential relationship to the number of dimensions, as per the well-known curse of dimensionality. The RRT* algorithm seems to require approximately twice the number of vertices for every additional dimension, i.e. the mean node count is $O(\gamma^N)$ with $\gamma = 1.92$. By contrast, if we direct our attention to the SBA* algorithms, we see sub-exponential relationships. Indeed, the SBA* results show a linear relationship between the mean node count and the dimensionality of the space, with the best regression giving $O(N^r)$ with r = 1.02. The SA-SBA* results are more difficult to characterize or find a good regression for, but it appears the relationship is quadratic, with a power-function regression giving $O(N^r)$ with r = 2.01, and the quadratic curve fit having the least residual error of all regressions tried. It is reasonable to conclude that the SA-SBA* algorithm achieves a mean node count that is bounded above by the RRT* performance and bounded below by the SBA* performance in the absence of obstacles. Where exactly the SA-SBA* performance lies will naturally depend on the tuning of the simulated annealing schedule (e.g. initial temperature or cooling formula).
The linear relationship between the required node count and the dimensionality of the space when running the SBA* algorithm is interesting and deserves additional clarifications. For an unbiased, exploratory algorithm such as RRT*, the volume of space to be explored is an exponential function of the dimensionality, and so is the number of nodes needed to explore it. In a heuristically driven algorithm such as SBA*, the volume of “useful” space to be explored is limited to a corridor between the start and the goal points, in the absence of obstacles, i.e. when the space is η-entangled with a low value of η. The volume of this corridor is proportional to the length of the central path. The same phenomenon is true of A*, and does not depend on the relaxation of the heuristic. This phenomenon was also reported for the Guided-EST (Phillips et al., 2004). It is important to note, however, that the introduction of obstacles and, thus, raising the η value that characterizes the entanglement of the problem will expand the volume to be explored to find a solution, ultimately requiring a full exploration, as the RRT* does.
4.3 7-dof Manipulator
The target application of the proposed algorithm is an uncooperative satellite capture scenario involving a robotic manipulator mounted on a servicing spacecraft. As an uncontrolled satellite is drifting in the vicinity of the servicing spacecraft after an orbital rendezvous, the robotic manipulator should be able to autonomously move to capture the satellite while avoiding a self-collision or a collision with the target satellite. In this section, we show that the SBA* algorithms can be successfully applied to solve problems in this real scenario for a static target. What makes the SBA* algorithm a suitable candidate in this scenario is that the navigation problem is characterized by a rather straight-forward path to the target location while side-stepping a danger zone around the target’s collision geometry, with possibly narrow corridors to traverse.
Our simulations of the satellite capture task are based on the experimental facility available at the Aerospace Mechatronics Laboratory at McGill University which uses a neutrally buoyant airship to emulate the target satellite and a robotic manipulator on a 3 m linear track to represent the space manipulator (see Figure 8(a)). The capture target for the end-effector of the manipulator is a single grapple fixture mounted on the airship, which itself is a spherical blimp 6 feet in diameter. The mobile manipulator system has a total of 7 dofs (6-dof manipulator and the linear track), and planning is performed in the 14-dimensional joint state-space, i.e. using the 7 joint positions and 7 joint velocities as the state representation stored in the motion graph and using per-joint cubic-spline interpolations between states. Collision geometries are represented by simple 3-dimensional primitives, and proximity queries (collision detection) are performed with simple closed-form expressions for each pair of primitives. Figure 8(b) shows a three-dimensional rendering of the planning environment.
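The empty-space set-up of Section 4.2 can be checked numerically. The sketch below, which is an illustration and not part of the ReaK implementation, evaluates the Euclidean heuristic of Equation (50) between opposite corners of the unit hyper-cube and confirms that the edge-length limit of 0.2√N always implies a minimum of 5 segments. The function names are hypothetical, and the small tolerance subtracted before rounding up is an added assumption to absorb floating-point error.

```python
import math

def heuristic(p, p_goal):
    """Euclidean-norm heuristic of Equation (50): ||p_goal - p||_2."""
    return math.sqrt(sum((g - x) ** 2 for x, g in zip(p, p_goal)))

def max_edge_length(n_dims):
    """Edge-length limit used in the empty-space tests: 0.2 * sqrt(N)."""
    return 0.2 * math.sqrt(n_dims)

def min_segments(n_dims):
    """Fewest edges needed to join opposite corners of the unit hyper-cube."""
    start = [0.0] * n_dims
    goal = [1.0] * n_dims
    ratio = heuristic(start, goal) / max_edge_length(n_dims)
    # Subtract a tiny tolerance so that 5.000000000000001 does not round to 6.
    return math.ceil(ratio - 1e-9)

# The diagonal has length sqrt(N), so the ratio is 5 for every dimension.
segment_counts = {n: min_segments(n) for n in range(3, 21)}
```

Because both the diagonal and the edge limit scale with √N, the straight-line problem keeps the same difficulty in terms of segment count across all tested dimensions, which isolates the effect of dimensionality itself.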
Figure 8. Picture of the robot-airship facility at the Aerospace Mechatronics Laboratory, McGill University: (a) actual picture; (b)
virtual environment.
Figure 10. Resulting motion graphs after 1000 iterations on the robot-airship facility, shown as traces of the end-effector positions for
(a) RRT*, (b) SBA*, and (c) SA-SBA*, and a still animation of the SA-SBA* solution in (d).
required to do because the grapple fixture, and thus the target configuration, is on that side. The main feature of this scenario is that the corridor along which to travel to move from one side of the track to the other is quite narrow and requires the manipulator to be turned to the side (profile) and tilted away from the target to allow a safe (collision-free) passage across.
As an example, Figure 10 shows the resulting motion-graphs from running the RRT*, SBA*, and SA-SBA* algorithms for 1000 iterations (1000 nodes). One such solution path is illustrated as a still animation in Figure 10(d), which shows clearly that the manipulator does indeed make considerable motion to avoid colliding with the airship on its way to the capture configuration.
4.3.1 Monte Carlo runs. To further validate the proposed methods in this real scenario, we present the results of Monte Carlo runs of the different algorithms for planning in the configuration space (zeroth order) and in the state space (first order) of this 7 degrees-of-freedom manipulator, amounting to a 7-dimensional space and a 14-dimensional space, respectively. The results presented in this section are accumulated over 1000 runs of the different algorithms. For configuration-space planning, the interpolation (or local planner) is linear and the distance metric used is the $L_\infty$-norm of the reach-time representation of the joint positions or angles. For state-space planning, the interpolation is cubic and the distance is taken as the $L_\infty$-norm of the joint-wise Euclidean norm of position and velocity, in reach-time representation. In these experiments, the initial relaxation factor for the anytime heuristic was 5.0, and the initial temperature for the simulated annealing schedule was 5.0.
Figures 11(a) and 11(b) show the average solutions found after a given number of iterations of the different algorithms. The results corroborate those obtained in the two-dimensional cases, as do the success rates seen in Figures 11(c) and 11(d). We can see that the SA-SBA* algorithm out-performs the RRT* algorithm on early solutions and then achieves a comparable asymptotic behavior. On the state-space planning (14 dimensions), the difference is the most significant, in favor of the SA-SBA* algorithm. Moreover, the reliability, in terms of success rates, of the SA-SBA* is also comparable to that of the RRT* algorithm in these scenarios. However, the SBA* algorithm does suffer from a significant reliability problem, especially when the dimensionality of the problem is increased to 14 in the state-space planning. Clearly, in very high-dimensional problems, the SBA* algorithm has more difficulty circumventing the obstacles and dead-ends due to the higher number of samples required to sufficiently increase the density values. Nevertheless, it is clearly beneficial to mix SBA* iterations into RRT* iterations, as demonstrated by the SA-SBA* algorithm’s performance.
Table 1 presents an additional view of the results of the Monte Carlo runs. First, we have the average first solutions
Table 1. First solution statistics for the robot-airship capture scenario, comparing RRT*, SBA* and SA-SBA*, in a zeroth-order space
(configuration space, 7 dimensions) and a first-order space (state space, 14 dimensions).
[Plots: Comparison of RRT* and Sampling-based A* Algorithms. (a), (b): solution-cost panels against Number of Nodes (0 to 300 and 0 to 3000), with curves labelled SA-SBA* and SBA*; (c) Cumulative Success Probability on 7-dof Manipulator; (d) Cumulative Success Probability on 7-dof Manipulator (state-space). Vertical axis of (c), (d): Success Probability after N iterations (%), 0 to 100; horizontal axis: Number of Nodes (0 to 300 and 0 to 6000); curves: RRT*, SA-SBA*, SBA*.]
Figure 11. Monte Carlo results comparing RRT*, SBA*, and SA-SBA* for the robot-airship capture scenario in a zeroth-order space
(position only, 7 dimensions) and a first-order space (state-space, 14 dimensions), showing the average cost of solutions (a), (b) and
the cumulative success rates (c), (d).
found by the planners on the 1000 runs. We can see that all algorithms produce comparable results, with only marginal improvements on the mean value and the standard deviation for the SA-SBA* algorithm. Then, we report the average number of nodes required to find the first solution for each planner. We can observe that the SBA* algorithm is significantly less reliable, but the SA-SBA* recoups that reliability and even achieves a lower standard deviation than the RRT* algorithm. Finally, the table shows the running time, in seconds, required to run 3000 and 6000 iterations for the zeroth-order and first-order planning problems, respectively. By comparing the results, we can see that the SBA* is significantly faster than RRT*, and that the SA-SBA* falls in between, as expected. The significant run-time reductions achieved by SBA* iterations must be due to the fact that it requires only a single nearest-neighbor query as opposed to two for RRT* iterations, since the only other significant computational cost is the collision-checking, which we would expect to incur the same cost in all algorithms tested.
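The configuration-space metric of Section 4.3.1 is described as the L∞-norm of the reach-time representation of the joint positions. The text does not spell out that representation, so the sketch below rests on an assumption: the reach time of a joint is taken to be its displacement divided by a per-joint speed limit, and the distance is the largest such time. The function name, the speed limits, and the sample configurations are all hypothetical and are not taken from the ReaK library.

```python
def reach_time_distance(q_a, q_b, max_speeds):
    """L-infinity norm of per-joint reach times |dq_i| / v_max_i.

    Assumed definition (not confirmed by the paper): each joint's
    displacement is scaled by its speed limit, and the slowest joint
    determines the distance, expressed in units of time.
    """
    if not (len(q_a) == len(q_b) == len(max_speeds)):
        raise ValueError("all inputs must have one entry per joint")
    return max(abs(b - a) / v for a, b, v in zip(q_a, q_b, max_speeds))

# Hypothetical 7-joint example: a linear track plus six revolute joints.
q_start = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
q_goal = [1.5, 0.4, -0.8, 0.2, 0.0, 1.0, -0.3]
speed_limits = [0.5, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0]  # illustrative values
travel_time = reach_time_distance(q_start, q_goal, speed_limits)
```

Under this assumption the metric measures how long the slowest joint needs to complete the move, which is a natural cost when all joints move simultaneously.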
Ferguson D, Kalra N and Stentz A (2006) Replanning with RRTs. In: Proceedings of the 2006 IEEE international conference on robotics and automation (ICRA’06), Orlando, FL, pp. 1243–1248.
Fu AWC, Chan PMS, Cheung YL and Moon YS (2000) Dynamic VP-tree indexing for N-nearest neighbor search given pair-wise distances. The Very Large Data Bases Journal (VLDB) 9: 154–173.
Geraerts RJ and Overmars MH (2002) A Comparative Study of Probabilistic Roadmap Planners. Technical Report, Utrecht University: Information and Computer Science, Utrecht, The Netherlands.
Gonzalez JP and Likhachev M (2011) Search-based planning with provable suboptimality bounds for continuous state spaces. In: Proceedings of the fourth international symposium on combinatorial search (SoCS’11), Castell de Cardona, Barcelona, Spain. AAAI Press, pp. 60–67.
Hart PE, Nilsson NJ and Raphael B (1968) A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics 4(2): 100–107.
Hershey JR and Olsen PA (2007) Approximating the Kullback–Leibler divergence between Gaussian mixture models. In: IEEE international conference on acoustics, speech and signal processing, volume 4, Honolulu, HI, pp. 317–320.
Hsu D (2000) Randomized Single-Query Motion Planning in Expansive Spaces. PhD Thesis. Department of Computer Science, Stanford University, Stanford, CA, pp. 71–90.
Hsu D, Latombe JC and Motwani R (1999) Path planning in expansive configuration spaces. International Journal of Computational Geometry & Applications 09(4/5): 495–512.
Ingber L (1996) Adaptive simulated annealing (ASA): lessons learned. Journal of Control and Cybernetics 25: 33–54.
Jamriska O, Sykora D and Hornung A (2012) Cache-efficient graph cuts on structured grids. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR’12). IEEE Computer Society Press, pp. 3673–3680.
Karaman S and Frazzoli E (2011) Sampling-based algorithms for optimal motion planning. The International Journal of Robotics Research 30(7): 846–894.
Karaman S, Walter MR, Perez A, Frazzoli E and Teller S (2011) Anytime motion planning using the RRT*. In: IEEE international conference on robotics and automation (ICRA), Shanghai, China, pp. 1478–1483.
Kasheff Z (2004) Cache-Oblivious Dynamic Search Trees. M.Eng. Thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology.
Kavraki LE and Latombe JC (1998) Probabilistic roadmaps for robot path planning. In: Motion Planning in Robotics. New York: John Wiley & Sons.
Kavraki LE, Svestka P, Latombe JC and Overmars MH (1996) Probabilistic roadmaps for path planning in high-dimensional configuration spaces. IEEE Transactions on Robotics and Automation 12(4): 566–580.
Kuffner JJ and LaValle SM (2000) RRT-Connect: an efficient approach to single-query path planning. In: Proceedings of the 2000 IEEE international conference on robotics and automation (ICRA’00), vol. 2, San Francisco, CA, pp. 995–1001.
Kullback S and Leibler RA (1951) On information and sufficiency. Annals of Mathematical Statistics 22(1): 79–86.
LaValle SM (1998) Rapidly-exploring Random Tree: A New Tool for Path Planning. Technical Report 98-11, Department of Computer Science. Iowa State University.
LaValle SM and Kuffner JJ (2001) Randomized kinodynamic planning. The International Journal of Robotics Research 20: 378–400.
Likhachev M, Ferguson D, Gordon G, Stentz A and Thrun S (2005) Anytime dynamic A*: An anytime, replanning algorithm. In: Proceedings of the international conference on automated planning and scheduling (ICAPS), Monterey, CA. AAAI Press, pp. 262–271.
Likhachev M, Gordon G and Thrun S (2003) ARA*: Anytime A* with provable bounds on sub-optimality. In: Proceedings of the 2003 conference in advances in neural information processing systems (NIPS), Whistler, BC, Canada. Cambridge, MA: MIT Press, pp. 767–774.
Phillips JM, Bedrossian N and Kavraki LE (2004) Guided expansive spaces trees: a search strategy for motion- and cost-constrained state spaces. In: Proceedings of the 2004 IEEE international conference on robotics and automation (ICRA’04), vol. 4, New Orleans, LA, pp. 3968–3973.
Plaku E (2012) Guiding sampling-based motion planning by forward and backward discrete search. In: Su CY, Rakheja S and Liu H (eds.), Intelligent Robotics and Applications (Lecture Notes in Computer Science, vol. 7508). Berlin: Springer, pp. 289–298.
Rickert M, Brock O and Knoll A (2008) Balancing Exploration and Exploitation in Motion Planning. In: IEEE international conference on robotics and automation (ICRA), Pasadena, CA, pp. 2812–2817.
Sanchez G and Latombe JC (2001) A single-query bi-directional probabilistic roadmap planner with lazy collision checking. In: Jarvis R and Zelinsky A (eds.), International Symposium on Robotics Research (Springer Tracts in Advanced Robotics, vol. 6). Berlin: Springer, pp. 403–417.
Sucan I and Kavraki L (2012) A sampling-based tree planner for systems with complex dynamics. IEEE Transactions on Robotics 28(1): 116–131.
Sucan IA and Kavraki LE (2010) On the implementation of single-query sampling-based motion planners. In: Proceedings of the 2010 IEEE international conference on robotics and automation (ICRA’10), Anchorage, AK, pp. 2005–2011.
Vazquez-Otero A, Faigl J and Munuzuri AP (2012) Path planning based on reaction–diffusion process. In: IEEE/RSJ international conference on intelligent robots and systems, Vilamoura, Algarve, Portugal, pp. 896–901.
Zucker M, Ratliff N, Dragan A, et al. (2013) CHOMP: Covariant Hamiltonian Optimization for Motion Planning. The International Journal of Robotics Research 32(11): 1164–1193.