Volumetric Range Image Integration Method
Abstract

A number of techniques have been developed for reconstructing surfaces by integrating groups of aligned range images. A desirable set of properties for such algorithms includes: incremental updating, representation of directional uncertainty, the ability to fill gaps in the reconstruction, and robustness in the presence of outliers. Prior algorithms possess subsets of these properties. In this paper, we present a volumetric method for integrating range images that possesses all of these properties.

Our volumetric representation consists of a cumulative weighted signed distance function. Working with one range image at a time, we first scan-convert it to a distance function, then combine this with the data already acquired using a simple additive scheme. To achieve space efficiency, we employ a run-length encoding of the volume. To achieve time efficiency, we resample the range image to align with the voxel grid and traverse the range and voxel scanlines synchronously. We generate the final manifold by extracting an isosurface from the volumetric grid. We show that under certain assumptions, this isosurface is optimal in the least squares sense. To fill gaps in the model, we tessellate over the boundaries between regions seen to be empty and regions never observed.

Using this method, we are able to integrate a large number of range images (as many as 70) yielding seamless, high-detail models of up to 2.6 million triangles.

CR Categories: I.3.5 [Computer Graphics] Computational Geometry and Object Modeling

Additional keywords: Surface fitting, three-dimensional shape recovery, range image integration, isosurface extraction

Authors' Address: Computer Science Department, Stanford University, Stanford, CA 94305
E-mail: {curless,levoy}@cs.stanford.edu
World Wide Web: http://www-graphics.stanford.edu

1 Introduction

Recent years have witnessed a rise in the availability of fast, accurate range scanners. These range scanners have provided data for applications such as medicine, reverse engineering, and digital film-making. Many of these devices generate range images; i.e., they produce depth values on a regular sampling lattice. Figure 1 illustrates how an optical triangulation scanner can be used to acquire a range image. By connecting nearest neighbors with triangular elements, one can construct a range surface as shown in Figure 1d. Range images are typically formed by sweeping a 1D or 2D sensor linearly across an object or circularly around it, and generally do not contain enough information to reconstruct the entire object being scanned. Accordingly, we require algorithms that can merge multiple range images into a single description of the surface. A set of desirable properties for such a surface reconstruction algorithm includes:

Representation of range uncertainty. The data in range images typically have asymmetric error distributions with primary directions along sensor lines of sight, as illustrated for optical triangulation in Figure 1a. The method of range integration should reflect this fact.

Utilization of all range data, including redundant observations of each object surface. If properly used, this redundancy can reduce sensor noise.

Incremental and order independent updating. Incremental updates allow us to obtain a reconstruction after each scan or small set of scans and allow us to choose the next best orientation for scanning. Order independence is desirable to ensure that results are not biased by earlier scans. Together, they allow for straightforward parallelization.

Time and space efficiency. Complex objects may require many range images in order to build a detailed model. The range images and the model must be represented efficiently and processed quickly to make the algorithm practical.

Robustness. Outliers and systematic range distortions can create challenging situations for reconstruction algorithms. A robust algorithm needs to handle these situations without catastrophic failures such as holes in surfaces and self-intersecting surfaces.

No restrictions on topological type. The algorithm should not assume that the object is of a particular genus. Simplifying assumptions such as "the object is homeomorphic to a sphere" yield useful results in only a restricted class of problems.

Ability to fill holes in the reconstruction. Given a set of range images that do not completely cover the object, the surface reconstruction will necessarily be incomplete. For some objects, no amount of scanning would completely cover the object, because some surfaces may be inaccessible to the sensor. In these cases, we desire an algorithm that can automatically fill these holes with plausible surfaces, yielding a model that is both "watertight" and esthetically pleasing.

In this paper, we present a volumetric method for integrating range images that possesses all of these properties. In the next section, we review some previous work in the area of surface reconstruction. In section 3, we describe the core of our volumetric algorithm. In section 4, we show how this algorithm can be used to fill gaps in the reconstruction using knowledge about the emptiness of space. Next, in section 5, we describe how we implemented our volumetric approach so as to keep time and space costs reasonable. In section 6, we show the results of surface reconstruction from many range images of complex objects. Finally, in section 7 we conclude and discuss limitations and future directions.
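Constructing a range surface from a range image, as described above for Figure 1d, amounts to triangulating the regular sampling lattice: each quad of four neighboring samples is split into two triangles. The following is a minimal sketch of that indexing (it ignores the depth-discontinuity tests a real scanner pipeline would apply before connecting neighbors; the function name is ours):

```python
def range_surface_triangles(width, height):
    """Triangulate a width x height sampling lattice by splitting each
    quad of neighboring samples into two triangles. Vertices are
    indexed row-major, as (row * width + column)."""
    triangles = []
    for row in range(height - 1):
        for col in range(width - 1):
            v00 = row * width + col          # upper-left corner of the quad
            v10 = v00 + 1                    # upper-right
            v01 = v00 + width                # lower-left
            v11 = v01 + 1                    # lower-right
            triangles.append((v00, v10, v01))
            triangles.append((v10, v11, v01))
    return triangles

# A 4 x 3 lattice of depth samples yields 2 * 3 * 2 = 12 triangles.
tris = range_surface_triangles(4, 3)
```

Connecting nearest neighbors this way produces the piecewise linear range surface that the later sections merge volumetrically.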
© 1996 ACM-0-89791-746-4/96/008...$3.50

Figure 1. From optical triangulation to a range surface. (a) In 2D, a narrow laser beam illuminates a surface, and a linear sensor images the reflection from an object. The center of the image pulse maps to the center of the laser, yielding a range value. The uncertainty, σx, in determining the center of the pulse results in range uncertainty, σz, along the laser's line of sight. When using the spacetime analysis for optical triangulation [6], the uncertainties run along the lines of sight of the CCD. (b) In 3D, a laser stripe triangulation scanner first spreads the laser beam into a sheet of light with a cylindrical lens. The CCD observes the reflected stripe from which a depth profile is computed. The object sweeps through the field of view, yielding a range image. Other scanner configurations rotate the object to obtain a cylindrical scan or sweep a laser beam or stripe over a stationary object. (c) A range image obtained from the scanner in (b) is a collection of points with regular spacing. (d) By connecting nearest neighbors with triangles, we create a piecewise linear range surface.

2 Previous work

Surface reconstruction from dense range data has been an active area of research for several decades. The strategies have proceeded along two basic directions: reconstruction from unorganized points, and reconstruction that exploits the underlying structure of the acquired data. These two strategies can be further subdivided according to whether they operate by reconstructing parametric surfaces or by reconstructing an implicit function.

A major advantage of the unorganized points algorithms is the fact that they do not make any prior assumptions about connectivity of points. In the absence of range images or contours to provide connectivity cues, these algorithms are the only recourse. Among the parametric surface approaches, Boissonnat [2] describes a method for Delaunay triangulation of a set of points in 3-space. Edelsbrunner and Mücke [9] generalize the notion of a convex hull to create surfaces called alpha-shapes. Examples of implicit surface reconstruction include the method of Hoppe, et al [16] for generating a signed distance function followed by an isosurface extraction. More recently, Bajaj, et al [1] used alpha-shapes to construct a signed distance function to which they fit implicit polynomials. Although unorganized points algorithms are widely applicable, they discard useful information such as surface normal and reliability estimates. As a result, these algorithms are well-behaved in smooth regions of surfaces, but they are not always robust in regions of high curvature and in the presence of systematic range distortions and outliers.

Among the structured data algorithms, several parametric approaches have been proposed, most of them operating on range images in a polygonal domain. Soucy and Laurendeau [25] describe a method using Venn diagrams to identify overlapping data regions, followed by re-parameterization and merging of regions. Turk and Levoy [30] devised an incremental algorithm that updates a reconstruction by eroding redundant geometry, followed by zippering along the remaining boundaries, and finally a consensus step that reintroduces the original geometry to establish final vertex positions. Rutishauser, et al [24] use errors along the sensor's lines of sight to establish consensus surface positions followed by a re-tessellation that incorporates redundant data. These algorithms typically perform better than unorganized point algorithms, but they can still fail catastrophically in areas of high curvature, as exemplified in Figure 9.

Several algorithms have been proposed for integrating structured data to generate implicit functions. These algorithms can be classified as to whether voxels are assigned one of two (or three) states or are samples of a continuous function. Among the discrete-state volumetric algorithms, Connolly [4] casts rays from a range image accessed as a quad-tree into a voxel grid stored as an octree, and generates results for synthetic data. Chien, et al [3] efficiently generate octree models under the severe assumption that all views are taken from the directions corresponding to the 6 faces of a cube. Li and Crebbin [19] and Tarbox and Gottschlich [28] also describe methods for generating binary voxel grids from range images. None of these methods has been used to generate surfaces. Further, without an underlying continuous function, there is no mechanism for representing range uncertainty or for combining overlapping, noisy range surfaces.

The last category of our taxonomy consists of implicit function methods that use samples of a continuous function to combine structured data. Our method falls into this category. Previous efforts in this area include the work of Grosso, et al [12], who generate depth maps from stereo and average them into a volume with occupancy ramps of varying slopes corresponding to uncertainty measures; they do not, however, perform a final surface extraction. Succi, et al [26] create depth maps from stereo and optical flow and integrate them volumetrically using a straight average. The details of their method are unclear, but they appear to extract an isosurface at an arbitrary threshold. In both the Grosso and Succi papers, the range maps are sparse, the directions of range uncertainty are not characterized, they use no time or space optimizations, and the final models are of low resolution. Recently, Hilton, et al [14] have developed a method similar to ours in that it uses weighted signed distance functions for merging range images, but it does not address directions of sensor uncertainty, incremental updating, space efficiency, and characterization of the whole space for potential hole filling, all of which we believe are crucial for the success of this approach.

Other relevant work includes the method of probabilistic occupancy grids developed by Elfes and Matthies [10]. Their volumetric space is a scalar probability field which they update using a Bayesian formulation. The results have been used for robot navigation, but not for surface extraction. A difficulty with this technique is the fact that the best description of the surface lies at the peak or ridge of the probability function, and the problem of ridge-finding is not one with robust solutions [8]. This is one of our primary motivations for taking an isosurface approach in the next section: it leverages off of well-behaved surface extraction algorithms.

The discrete-state implicit function algorithms described above also have much in common with the methods of extracting volumes from silhouettes [15] [21] [23] [27]. The idea of using backdrops to help carve out the emptiness of space is one we demonstrate in section 4.

3 Volumetric integration

Our algorithm employs a continuous implicit function, D(x), represented by samples. The function we represent is the weighted signed
Figure 2. Unweighted signed distance functions in 3D. (a) A range sensor looking down the x-axis observes a range image, shown here as a reconstructed range surface. Following one line of sight down the x-axis, we can generate a signed distance function as shown. The zero crossing of this function is a point on the range surface. (b) The range sensor repeats the measurement, but noise in the range sensing process results in a slightly different range surface. In general, the second surface would interpenetrate the first, but we have shown it as an offset from the first surface for purposes of illustration. Following the same line of sight as before, we obtain another signed distance function. By summing these functions, we arrive at a cumulative function with a new zero crossing positioned midway between the original range measurements.

Figure 3. Signed distance and weight functions in one dimension. (a) The sensor looks down the x-axis and takes two measurements, r1 and r2. d1(x) and d2(x) are the signed distance profiles, and w1(x) and w2(x) are the weight functions. In 1D, we might expect two sensor measurements to have the same weight magnitudes, but we have shown them to be of different magnitude here to illustrate how the profiles combine in the general case. (b) D(x) is a weighted combination of d1(x) and d2(x), and W(x) is the sum of the weight functions. Given this formulation, the zero-crossing, R, becomes the weighted combination of r1 and r2 and represents our best guess of the location of the surface. In practice, we truncate the distance ramps and weights to the vicinity of the range points.

distance of each point x to the nearest range surface along the line of sight to the sensor. We construct this function by combining signed distance functions d1(x), d2(x), ..., dn(x) and weight functions w1(x), w2(x), ..., wn(x) obtained from range images 1 ... n. Our combining rules give us for each voxel a cumulative signed distance function, D(x), and a cumulative weight W(x). We represent these functions on a discrete voxel grid and extract an isosurface corresponding to D(x) = 0. Under a certain set of assumptions, this isosurface is optimal in the least squares sense. A full proof of this optimality is beyond the scope of this paper, but a sketch appears in appendix A.

Figure 2 illustrates the principle of combining unweighted signed distances for the simple case of two range surfaces sampled from the same direction. Note that the resulting isosurface would be the surface created by averaging the two range surfaces along the sensor's lines of sight. In general, however, weights are necessary to represent variations in certainty across the range surfaces. The choice of weights should be specific to the range scanning technology. For optical triangulation scanners, for example, Soucy [25] and Turk [30] make the weight depend on the dot product between each vertex normal and the viewing direction, reflecting greater uncertainty when the illumination is at grazing angles to the surface. Turk also argues that the range data at the boundaries of the mesh typically have greater uncertainty, requiring more down-weighting. We adopt these same weighting schemes for our optical triangulation range data.

Figure 3 illustrates the construction and usage of the signed distance and weight functions in 1D. In Figure 3a, the sensor is positioned at the origin looking down the +x axis and has taken two measurements, r1 and r2. The signed distance profiles, d1(x) and d2(x), may extend indefinitely in either direction, but the weight functions, w1(x) and w2(x), taper off behind the range points for reasons discussed below.

Figure 3b is the weighted combination of the two profiles. The combination rules are straightforward:

    D(x) = Σi wi(x) di(x) / Σi wi(x)    (1)

    W(x) = Σi wi(x)    (2)

where di(x) and wi(x) are the signed distance and weight functions from the ith range image.

Expressed as an incremental calculation, the rules are:

    Di+1(x) = [Wi(x) Di(x) + wi+1(x) di+1(x)] / [Wi(x) + wi+1(x)]    (3)

    Wi+1(x) = Wi(x) + wi+1(x)    (4)

where Di(x) and Wi(x) are the cumulative signed distance and weight functions after integrating the ith range image.

In the special case of one dimension, the zero-crossing of the cumulative function is at a range, R, given by:

    R = Σi wi ri / Σi wi    (5)

i.e., a weighted combination of the acquired range values, which is what one would expect for a least squares minimization.

In principle, the distance and weighting functions should extend indefinitely in either direction. However, to prevent surfaces on opposite sides of the object from interfering with each other, we force the weighting function to taper off behind the surface. There is a trade-off involved in choosing where the weight function tapers off. It should persist far enough behind the surface to ensure that all distance ramps will contribute in the vicinity of the final zero crossing, but it should also be as narrow as possible to avoid influencing surfaces on the other side. To meet these requirements, we force the weights to fall off at a distance equal to half the maximum uncertainty interval of the range measurements. Similarly, the signed distance and weight functions need not extend far in front of the surface. Restricting the functions to the vicinity of the surface yields a more compact representation and reduces the computational expense of updating the volume.

In two and three dimensions, the range measurements correspond to curves or surfaces with weight functions, and the signed distance ramps have directions that are consistent with the primary directions of sensor uncertainty. The uncertainties that apply to range image integration include errors in alignment between meshes as well as errors inherent in the scanning technology. A number of algorithms for aligning sets of range images have been explored and shown to yield excellent results [11][30]. The remaining error lies in the scanner itself. For optical triangulation scanners, for example, this error has been shown to be ellipsoidal about the range points, with the major axis of the ellipse aligned with the lines of sight of the laser [13][24]. Figure 4 illustrates the two-dimensional case for a range curve derived from a single scan containing a row of range samples. In practice, we use a fixed point representation for the signed distance func-
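The combination rules (1)-(5) are simple enough to sketch directly. The following fragment applies the incremental rules (3) and (4) along a single 1D line of sight and then locates the zero-crossing R of equation (5) by linear interpolation; the linear ramp model, the truncation width, and the function names are our illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def integrate(D, W, xs, r, w, ramp=1.0):
    """Fold one range measurement r (weight w) into the cumulative
    signed distance D and weight W, per equations (3) and (4).
    Voxels sit at positions xs along the sensor's line of sight;
    the linear ramp and hard truncation are illustrative choices."""
    d = r - xs                      # signed distance: positive in front of the surface
    near = np.abs(d) < ramp        # truncate the ramp to the surface's vicinity
    wi = np.where(near, w, 0.0)    # weight drops to zero outside the ramp
    D_new = (W * D + wi * d) / np.maximum(W + wi, 1e-12)   # eq. (3)
    W_new = W + wi                                          # eq. (4)
    return np.where(near, D_new, D), W_new

def zero_crossing(D, W, xs):
    """Locate the isosurface D(x) = 0 by linear interpolation between
    adjacent voxels that hold observed data (W > 0)."""
    for i in range(len(xs) - 1):
        if W[i] > 0 and W[i + 1] > 0 and D[i] > 0 >= D[i + 1]:
            t = D[i] / (D[i] - D[i + 1])
            return xs[i] + t * (xs[i + 1] - xs[i])
    return None

xs = np.linspace(0.0, 4.0, 81)
D, W = np.zeros_like(xs), np.zeros_like(xs)
D, W = integrate(D, W, xs, r=2.0, w=1.0)    # first measurement, r1
D, W = integrate(D, W, xs, r=2.2, w=3.0)    # second, more certain, r2
R = zero_crossing(D, W, xs)                 # weighted combination, eq. (5)
```

With these two measurements, the recovered R is the weighted mean (1 * 2.0 + 3 * 2.2) / 4 = 2.15, matching equation (5).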
Figures 4 and 5. [Only the figure labels survive here: the truncated signed distance ramp d, running from Dmin to Dmax through the isosurface within the volume, and the weights wa, wb, wc assigned to voxels along the sensor's viewing rays near the range surface.]
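The three extremal voxel states identified in Figure 6 can be recovered directly from a voxel's stored (D, W) pair. A minimal sketch follows; the saturation constants D_MIN and D_MAX and the function name are our own stand-ins, and the actual system stores D in fixed point rather than floats.

```python
# Illustrative saturation values; the real system picks Dmin and Dmax
# from the fixed point representation of the signed distance.
D_MIN, D_MAX = -1.0, 1.0

def voxel_state(D, W):
    """Classify a voxel per the three extremal states of Figure 6(b):
    near the surface (Dmin < D < Dmax, W > 0), unseen (D = Dmax, W = 0),
    or empty (D = Dmin, W = 0)."""
    if W > 0:
        return "near_surface"
    return "unseen" if D >= D_MAX else "empty"
```

Hole fill triangles are then generated wherever an "empty" voxel abuts an "unseen" one, which is exactly the empty/unseen transition tessellated in the method's space carving step.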
Figure 6. Volumetric grid with space carving and hole filling. (a) The regions in front of the surface are seen as empty, regions in the vicinity of the surface ramp through the zero-crossing, while regions behind remain unseen. The green (dashed) segments are the isosurfaces generated near the observed surface, while the red (dotted) segments are hole fillers, generated by tessellating over the transition from empty to unseen. In (b), we identify the three extremal voxel states with their corresponding function values: unseen (D(x) = Dmax, W(x) = 0), empty (D(x) = Dmin, W(x) = 0), and near the surface (Dmin < D(x) < Dmax, W(x) > 0).

We take advantage of this distinction when smoothing surfaces as described below.

Figure 6 illustrates the method for a single range image, and provides a diagram for the three-state classification scheme. The hole filler isosurfaces are "false" in that they are not representative of the observed surface, but they do derive from observed data. In particular, they correspond to a boundary that confines where the surface could plausibly exist. In practice, we find that many of these hole filler surfaces are generated in crevices that are hard for the sensor to reach.

Because the transition between unseen and empty is discontinuous and hole fill triangles are generated as an isosurface between these binary states, with no smooth transition, we generally observe aliasing artifacts in these areas. These artifacts can be eliminated by prefiltering the transition region before sampling on the voxel lattice using straightforward methods such as analytic filtering or supersampling and averaging down. In practice, we have obtained satisfactory results by applying another technique: post-filtering the mesh after reconstruction using weighted averages of nearest vertex neighbors as described in [29]. The effect of this filtering step is to blur the hole fill surface. Since we know which triangles correspond to hole fillers, we need only concentrate the surface filtering on these portions of the mesh. This localized filtering preserves the detail in the observed surface reconstruction. To achieve a smooth blend between filtered hole fill vertices and the neighboring "real" surface, we allow the filter weights to extend beyond and taper off into the vicinity of the hole fill boundaries.

We have just seen how "space carving" is a useful operation: it tells us much about the structure of free space, allowing us to fill holes in an intelligent way. However, our algorithm only carves back from observed surfaces. There are numerous situations where more carving would be useful. For example, the interior walls of a hollow cylinder may elude digitization, but by seeing through the hollow portion of the cylinder to a surface placed behind it, we can better approximate its geometry. We can extend the carving paradigm to cover these situations by placing such a backdrop behind the surfaces being scanned. By placing the backdrop outside of the voxel grid, we utilize it purely for carving space without introducing its geometry into the model.

5 Implementation

5.1 Hardware

The examples in this paper were acquired using a Cyberware 3030 MS laser stripe optical triangulation scanner. Figure 1b illustrates the scanning geometry: an object translates through a plane of laser light while the reflections are triangulated into depth profiles through a CCD camera positioned off axis. To improve the quality of the data, we apply the method of spacetime analysis as described in [6]. The benefits of this analysis include reduced range noise, greater immunity to reflectance changes, and fewer artifacts near range discontinuities.

When using traditional triangulation analysis implemented in hardware in our Cyberware scanner, the uncertainty in triangulation for our system follows the lines of sight of the expanding laser beam. When using the spacetime analysis, however, the uncertainty follows the lines of sight of the camera. The results described in section 6 of this paper were obtained with one or the other triangulation method. In each case, we adhere to the appropriate lines of sight when laying down signed distance and weight functions.

5.2 Software

The creation of detailed, complex models requires a large amount of input data to be merged into high resolution voxel grids. The examples in the next section include models generated from as many as 70 scans containing up to 12 million input vertices with volumetric grids ranging in size up to 160 million voxels. Clearly, time and space optimizations are critical for merging this data and managing these grids.

5.2.1 Run-length encoding

The core data structure is a run-length encoded (RLE) volume with three run types: empty, unseen, and varying. The varying fields are stored as a stream of varying data, rather than runs of constant value. Typical memory savings vary from 10:1 to 20:1. In fact, the space required to represent one of these voxel grids is usually less than the memory required to represent the final mesh as a list of vertices and triangle indices.

5.2.2 Fast volume traversal

Updating the volume from a range image may be likened to inverse volume rendering: instead of reading from a volume and writing to an image, we read from a range image and write to a volume. As a result, we leverage off of a successful idea from the volume rendering community: for best memory system performance, stream through the volume and the image simultaneously in scanline order [18]. In general, however, the scanlines of a range image are not aligned with the scanlines of the voxel grid, as shown in Figure 7a. By suitably resampling the range image, we obtain the desired alignment (Figure 7b). The resampling process consists of a depth rendering of the range surface using the viewing transformation specific to the lines of sight of the range sensor and using an image plane oriented to align with the voxel grid. We assign the weights as vertex "colors" to be linearly interpolated during the rendering step, an approach equivalent to Gouraud shading of triangle colors.

To merge the range data into the voxel grid, we stream through the voxel scanlines in order while stepping through the corresponding scanlines in the resampled range image. We map each voxel scanline to the correct portion of the range scanline as depicted in Figure 7d, and we resample the range data to yield a distance from the range surface. Using the combination rules given by equations 3 and 4, we update the run-length encoded structure. To preserve the linear memory structure of the RLE volume (and thus avoid using linked lists of runs scattered through the memory space), we read the voxel scanlines from the current volume and write the updated scanlines to a second RLE volume; i.e., we double-buffer the voxel grid. Note that depending on the scanner geometry, the mapping from voxels to range image pixels may not be linear, in which case care must be taken to resample appropriately [5].

For the case of merging range data only in the vicinity of the surface, we try to avoid processing voxels distant from the surface. To that end, we construct a binary tree of minimum and maximum depths for every adjacent pair of resampled range image scanlines. Before processing each voxel scanline, we query the binary tree to decide
which voxels, if any, are near the range surface. In this way, only relevant pieces of the scanline are processed. In a similar fashion, the space carving steps can be designed to avoid processing voxels that are not seen to be empty for a given range image. The resulting speed-ups from the binary tree are typically a factor of 15 without carving, and a factor of 5 with carving. We did not implement a brute-force volume update method; however, we would expect the overall algorithm described here to be much faster by comparison.

Figure 7. Range image resampling and scanline order voxel updates. (a) Range image scanlines are not in general oriented to allow for coherently streaming through voxel and range scanlines. (b) By resampling the range image, we can obtain the desired range scanline orientation. (c) Casting rays from the pixels on the range image means cutting across scanlines of the voxel grid, resulting in poor memory performance. (d) Instead, we run along scanlines of voxels, mapping them to the correct positions on the resampled range image.

5.2.3 Fast surface extraction

To generate our final surfaces, we employ a Marching Cubes algorithm [20] with a lookup table that resolves ambiguous cases [22]. To reduce computational costs, we only process voxels that have varying data or are at the boundary between empty and unseen.

6 Results

We show results for a number of objects designed to explore the robustness of our algorithm, its ability to fill gaps in the reconstruction, and its attainable level of detail. To explore robustness, we scanned a thin drill bit using the traditional method of optical triangulation. Due to the false edge extensions inherent in data from triangulation scanners [6], this particular object poses a formidable challenge, yet the volumetric method behaves robustly where the zippering method [30] fails catastrophically. The dragon sequence in Figure 11 demonstrates the effectiveness of carving space for hole filling. The use of a backdrop here is particularly effective in filling the gaps in the model. Note that we do not use the backdrop at all times, in part because the range images are much denser and more expensive to process, and also because the backdrop tends to obstruct the path of the object when automatically repositioning it with our motion control platform. Finally, the "Happy Buddha" sequence in Figure 12 shows that our method can be used to generate very detailed, hole-free models suitable for rendering and rapid manufacturing.

Statistics for the reconstruction of the dragon and Buddha models appear in Figure 8. With the optimizations described in the previous section, we were able to reconstruct the observed portions of the surfaces in under an hour on a 250 MHz MIPS R4400 processor. The space carving and hole filling algorithm is not completely optimized, but the execution times are still in the range of 3-5 hours, less than the time spent acquiring and registering the range images. For both models, the RMS distance between points in the original range images and points on the reconstructed surfaces is approximately 0.1 mm. This figure is roughly the same as the accuracy of the scanning technology, indicating a nearly optimal surface reconstruction.

7 Discussion and future work

We have described a new algorithm for volumetric integration of range images. The algorithm has a number of desirable properties, including the representation of directional sensor uncertainty, incremental and order independent updating, robustness in the presence of sensor errors, and the ability to fill gaps in the reconstruction by carving space. Our use of a run-length encoded representation of the voxel grid and synchronized processing of voxel and resampled range image scanlines make the algorithm efficient. This in turn allows us to acquire and integrate a large number of range images. In particular, we demonstrate the ability to integrate up to 70 scans into a high resolution voxel grid to generate million polygon models in a few hours. These models are free of holes, making them suitable for surface fitting, rapid prototyping, and rendering.

There are a number of limitations that prevent us from generating models from an arbitrary object. Some of these limitations arise from the algorithm while others arise from the limitations of the scanning technology. Among the algorithmic limitations, our method has difficulty bridging sharp corners if no scan spans both surfaces meeting at the corner. This is less of a problem when applying our hole-filling algorithm, but we are also exploring methods that will work without hole filling. Thin surfaces are also problematic. As described in section 3, the influences of observed surfaces extend behind their estimated positions for each range image and can interfere with distance functions originating from scans of the opposite side of a thin surface. In this respect, the apexes of sharp corners also behave like thin surfaces. While we have limited this influence as much as possible, it still places a lower limit on the thickness of surface that we can reliably reconstruct without causing artifacts such as thickening of surfaces or rounding of sharp corners. We are currently working to lift this restriction by considering the estimated normals of surfaces.

Other limitations arise from the scanning technologies themselves. Optical methods such as the one we use in this paper can only provide data for external surfaces; internal cavities are not seen. Further, very complicated objects may require an enormous amount of scanning to cover the surface. Optical triangulation scanning has the additional problem that both the laser and the sensor must observe each point on the surface, further restricting the class of objects that can be scanned completely. The reflectance properties of objects are also a factor. Optical methods generally operate by casting light onto an object, but shiny surfaces can deflect this illumination, dark objects can absorb it, and bright surfaces can lead to interreflections. To minimize these effects, we often paint our objects with a flat, gray paint.

Straightforward extensions to our algorithm include improving the execution time of the space carving portion of the algorithm and demonstrating parallelization of the whole algorithm. In addition, more aggressive space carving may be possible by making inferences about sensor lines of sight that return no range data. In the future, we hope to apply our methods to other scanning technologies and to large
range images, leading to a surface reconstruction without holes. The scale objects such as terrain and architectural scenes.
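The cumulative update at the heart of these reconstructions (equations 1 and 2, revisited in Appendix A) can be sketched as an incremental per-voxel rule. The sketch below is illustrative rather than definitive: the incremental form, the 1-D toy grid, and the particular linear weight ramp are assumptions made here for clarity.

```python
import numpy as np

def integrate_range_image(D, W, d_new, w_new):
    """Fold one range image into the cumulative voxel grid.

    D, W  -- cumulative weighted signed distance and weight per voxel
    d_new -- signed distance from each voxel to the new range surface,
             measured along the sensor's line of sight
    w_new -- per-voxel weight (zero outside a ramp near the surface)

    Incremental form of the additive scheme: the running value stays the
    weight-normalized sum, so scans may be merged in any order.
    """
    W_next = W + w_new
    D_next = np.where(W_next > 0,
                      (W * D + w_new * d_new) / np.maximum(W_next, 1e-12),
                      0.0)
    return D_next, W_next

# Toy 1-D grid: two scans observe the same surface near z = 0.5.
z = np.linspace(0.0, 1.0, 11)
D = np.zeros_like(z)
W = np.zeros_like(z)
for surface in (0.48, 0.52):                       # two noisy observations
    d = z - surface                                # signed distance along line of sight
    w = np.clip(1.0 - np.abs(d) / 0.3, 0.0, None)  # illustrative weight ramp
    D, W = integrate_range_image(D, W, d, w)

# The D = 0 isosurface falls between the two observations, near z = 0.5.
```

Because each update preserves the weight-normalized sum, the result is identical regardless of the order in which the two scans are folded in.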
[Figure 8 (table): reconstruction statistics — Model, Scans, Input triangles, Voxel size (mm), Volume dimensions, Exec. time (min), Output triangles, Holes.]
Acknowledgments
We would like to thank Phil Lacroute for his many helpful sugges-
tions in designing the volumetric algorithms. Afra Zomorodian wrote
the scripting interface for scanning automation. Homan Igehy wrote
the fast scan conversion code, which we used for range image resampling. Thanks to Bill Lorensen for his marching cubes tables and
mesh decimation software, and for getting the 3D hardcopy made.
Matt Pharr did the accessibility shading used to render the color Bud-
dha, and Pat Hanrahan and Julie Dorsey made helpful suggestions for
RenderMan tricks and lighting models. Thanks also to David Addle-
man and George Dabrowski of Cyberware for their help and for the
use of their scanner. This work was supported by the National Sci-
ence Foundation under contract CCR-9157767 and Interval Research
Corporation.
References
[1] C.L. Bajaj, F. Bernardini, and G. Xu. Automatic reconstruction of surfaces and
scalar fields from 3D scans. In Proceedings of SIGGRAPH ’95 (Los Angeles, CA, Aug. 6-11, 1995), pages 109–118. ACM Press, August 1995.
[2] J.-D. Boissonnat. Geometric structures for three-dimensional shape representation. ACM Transactions on Graphics, 3(4):266–286, October 1984.
[3] C.H. Chien, Y.B. Sim, and J.K. Aggarwal. Generation of volume/surface octree
from range data. In The Computer Society Conference on Computer Vision and
Pattern Recognition, pages 254–60, June 1988.
[4] C. I. Connolly. Cumulative generation of octree models from range data. In Pro-
ceedings, Intl. Conf. Robotics, pages 25–32, March 1984.
[5] B. Curless. Better optical triangulation and volumetric reconstruction of complex
models from range images. PhD thesis, Stanford University, 1996.
[6] B. Curless and M. Levoy. Better optical triangulation through spacetime analysis.
In Proceedings of IEEE International Conference on Computer Vision, pages 987–
994, June 1995.
[7] A. Dolenc. Software tools for rapid prototyping technologies in manufactur-
ing. Acta Polytechnica Scandinavica: Mathematics and Computer Science Series,
Ma62:1–111, 1993.
[8] D. Eberly, R. Gardner, B. Morse, S. Pizer, and C. Scharlach. Ridges for image anal-
ysis. Journal of Mathematical Imaging and Vision, 4(4):353–373, Dec 1994.
Figure 9. Merging range images of a drill bit. We scanned a 1.6 mm drill bit from 12 orientations at a 30 degree spacing using traditional optical triangulation methods. Illustrations (a) - (d) each show a plan (top) view of a slice taken through the range data and two reconstructions. (a) The range data shown as unorganized points: algorithms that operate on this form of data would likely have difficulty deriving the correct surface. (b) The range data shown as a set of wire frame tessellations of the range data: the false edge extensions pose a challenge to both polygon and volumetric methods. (c) A slice through the reconstructed surface generated by a polygon method: the zippering algorithm of Turk [30]. (d) A slice through the reconstructed surface generated by the volumetric method described in this paper. (e) A rendering of the zippered surface. (f) A rendering of the volumetrically generated surface. Note the catastrophic failure of the zippering algorithm. The volumetric method, however, produces a watertight model. (g) A photograph of the original drill bit. The drill bit was painted white for scanning.

[9] H. Edelsbrunner and E.P. Mücke. Three-dimensional alpha shapes. In Workshop on Volume Visualization, pages 75–105, October 1992.
[10] A. Elfes and L. Matthies. Sensor integration for robot navigation: combining sonar and range data in a grid-based representation. In Proceedings of the 26th IEEE Conference on Decision and Control, pages 1802–1807, December 1987.
[11] H. Gagnon, M. Soucy, R. Bergevin, and D. Laurendeau. Registration of multiple range views for automatic 3-D model building. In Proceedings 1994 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 581–586, June 1994.
[12] E. Grosso, G. Sandini, and C. Frigato. Extraction of 3D information and volumetric uncertainty from multiple stereo images. In Proceedings of the 8th European Conference on Artificial Intelligence, pages 683–688, August 1988.
[13] P. Hebert, D. Laurendeau, and D. Poussart. Scene reconstruction and description: geometric primitive extraction from multiple viewed scattered data. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 286–292, June 1993.
[14] A. Hilton, A.J. Stoddart, J. Illingworth, and T. Windeatt. Reliable surface reconstruction from multiple range images. In Fourth European Conference on Computer Vision, volume I, pages 117–126, April 1996.
[15] Tsai-Hong Hong and M.O. Shneier. Describing a robot’s workspace using a sequence of views from a moving camera. IEEE Transactions on Pattern Analysis and Machine Intelligence, 7(6):721–726, November 1985.
[16] H. Hoppe, T. DeRose, T. Duchamp, J. McDonald, and W. Stuetzle. Surface reconstruction from unorganized points. In Computer Graphics (SIGGRAPH ’92 Proceedings), volume 26, pages 71–78, July 1992.
[17] V. Krishnamurthy and M. Levoy. Fitting smooth surfaces to dense polygon meshes. In these proceedings.
[18] P. Lacroute and M. Levoy. Fast volume rendering using a shear-warp factorization of the viewing transformation. In Proceedings of SIGGRAPH ’94 (Orlando, FL, July 24-29, 1994), pages 451–458. ACM Press, July 1994.
[19] A. Li and G. Crebbin. Octree encoding of objects from range images. Pattern Recognition, 27(5):727–739, May 1994.
[20] W.E. Lorensen and H.E. Cline. Marching cubes: A high resolution 3D surface construction algorithm. In Computer Graphics (SIGGRAPH ’87 Proceedings), volume 21, pages 163–169, July 1987.
[21] W.N. Martin and J.K. Aggarwal. Volumetric descriptions of objects from multiple views. IEEE Transactions on Pattern Analysis and Machine Intelligence, 5(2):150–158, March 1983.
[22] C. Montani, R. Scateni, and R. Scopigno. A modified look-up table for implicit disambiguation of marching cubes. Visual Computer, 10(6):353–355, 1994.
[23] M. Potmesil. Generating octree models of 3D objects from their silhouettes in a sequence of images. Computer Vision, Graphics, and Image Processing, 40(1):1–29, October 1987.

Figure 10. Two range surfaces, f1 and f2, are tessellated range images acquired from directions v1 and v2. The possible range surface, z = f(x, y), is evaluated in terms of the weighted squared distances to points on the range surfaces taken along the lines of sight to the sensor. A point, (x, y, z), is shown here being evaluated to find its corresponding signed distances, d1 and d2, and weights, w1 and w2.
[24] M. Rutishauser, M. Stricker, and M. Trobina. Merging range images of arbitrarily shaped objects. In Proceedings 1994 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 573–580, June 1994.
[25] M. Soucy and D. Laurendeau. A general surface approach to the integration of a set of range views. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(4):344–358, April 1995.

A Isosurface as least squares minimizer

It is possible to show that the isosurface of the weighted signed distance function is equivalent to a least squares minimization of squared distances between points on the range surfaces and points on the desired reconstruction. The key assumptions are that the range sensor is orthographic and that the range errors are independently distributed along sensor lines of sight. A full proof is beyond the scope of this paper, but we provide a sketch here. See [5] for details.

Consider a region, R, on the desired surface, f, which is observed by n range images. We define the error between an observed range surface and a possible reconstructed surface as the integral of the weighted squared distances between points on the range surface and the reconstructed surface. These distances are taken along the lines of sight of the sensor, commensurate with the predominant directions of uncertainty (see Figure 10). The total error is the sum of the integrals for the n range images:

E(f) = \sum_{i=1}^{n} \iint_{A_i} w_i(s,t,f) \, d_i(s,t,f)^2 \, ds \, dt   (6)

where each (s, t) corresponds to a particular sensor line of sight for each range image, A_i is the domain of integration for the i'th range image, and w_i(s,t,f) and d_i(s,t,f) are the weights and signed distances taken along the lines of sight. We seek the surface f that minimizes this integral. Solving this minimization, we arrive at the following relation:

\sum_{i=1}^{n} \partial_{v_i} [ w_i(x,y,z) \, d_i(x,y,z)^2 ] = 0   (8)

where \partial_{v_i} is the directional derivative along v_i. Since the weight associated with a line of sight does not vary along that line of sight, and the signed distance has a derivative of unity along the line of sight, we can simplify this equation to:

\sum_{i=1}^{n} w_i(x,y,z) \, d_i(x,y,z) = 0   (9)

This weighted sum of signed distances is the same as what we compute in equations 1 and 2, without the division by the sum of the weights. Since this divisor is always positive, the isosurface we extract in section 3 is exactly the least squares minimizing surface described here.
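The claim above can be checked numerically along a single orthographic line of sight: the zero crossing of the weighted sum of signed distances coincides with the least squares minimizer, and dividing by the sum of the weights does not move it. The sample observations and weights below are illustrative assumptions, not data from the paper.

```python
import numpy as np

# One orthographic line of sight, observed by several range images.
# r[i] is the i'th scan's range observation; w[i] its positive weight.
rng = np.random.default_rng(0)
r = rng.normal(0.5, 0.05, size=6)   # noisy observations of the surface
w = rng.uniform(0.2, 1.0, size=6)   # arbitrary positive weights

# Least squares minimizer of sum_i w_i * (z - r_i)^2 along the line
# of sight is the weighted mean of the observations ...
z_star = np.sum(w * r) / np.sum(w)

# ... which is exactly where the weighted sum of signed distances
# d_i(z) = z - r_i vanishes (equation 9).
assert abs(np.sum(w * (z_star - r))) < 1e-12
```

Normalizing by sum(w), as in equations 1 and 2, rescales the left-hand side of equation 9 by a positive factor and therefore leaves the zero crossing, and hence the extracted isosurface, unchanged.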
Figure 11. Reconstruction of a dragon. Illustrations (a) - (d) are full views of the dragon. Illustrations (e) - (h) are magnified views of the section highlighted
by the green box in (a). Regions shown in red correspond to hole fill triangles. Illustrations (i) - (k) are slices through the corresponding volumetric grids at
the level indicated by the green line in (e). (a)(e)(i) Reconstruction from 61 range images without space carving and hole filling. The magnified rendering
highlights the holes in the belly. The slice through the volumetric grid shows how the signed distance ramps are maintained close to the surface. The gap in
the ramps leads to a hole in the reconstruction. (b)(f)(j) Reconstruction with space carving and hole filling using the same data as in (a). While some holes are
filled in a reasonable manner, some large regions of space are left untouched and create extraneous tessellations. The slice through the volumetric grid reveals
that the isosurface between the unseen (brown) and empty (black) regions will be connected to the isosurface extracted from the distance ramps, making it part
of the connected component of the dragon body and leaving us with a substantial number of false surfaces. (c)(g)(k) Reconstruction with 10 additional range
images using “backdrop” surfaces to effect more carving. Notice how the extraneous hole fill triangles nearly vanish. The volumetric slice shows how we have
managed to empty out the space near the belly. The bumpiness along the hole fill regions of the belly in (g) corresponds to aliasing artifacts from tessellating
over the discontinuous transition between unseen and empty regions. (d)(h) Reconstruction as in (c)(g) with filtering of the hole fill portions of the mesh. The
filtering operation blurs out the aliasing artifacts in the hole fill regions while preserving the detail in the rest of the model. Careful examination of (h) reveals
a faint ridge in the vicinity of the smoothed hole fill. This ridge is actual geometry present in all of the renderings, (e)-(h). The final model contains 1.8 million
polygons and is watertight.