CSEN 226 Engineering Surveying II Notes 2023-24
LECTURE NOTES
CSEN 226 - ENGINEERING SURVEYING II
By OKUSIMBA George
[email protected]
1. INTRODUCTION
Course Content
[1] Horizontal Control: Triangulation;
[2] Point Positioning Techniques: Resection, Intersection;
[3] Curves in Highway Engineering: Horizontal Curves, Vertical Curves;
[4] Principles of Least Squares Adjustment: Errors and their Propagation, Weighting of Observations,
Principles of Least Squares Adjustment;
[5] Field work: Electromagnetic Distance Measurements, setting out of curves.
[6] Laboratory work: computer exercises on least squares method of adjustment.
Assessment
SNo.  Item                                          Marks
1     Examination                                    70
2     Course Work (CATs, Practicals, Assignments)    30
      Total                                         100
      Pass Mark                                      40
References
[i] ASEC: Journal of Engineering Survey.
[ii] Bannister, A. and Raymond, S. (1998). Surveying, 7th Edition. ELBS.
[iii] Ghilani, C.D. and Wolf, P.R. (2003). Elementary Surveying: An Introduction to Geomatics, 12th Edition. Pearson/Prentice-Hall, London.
[iv] Ghilani, C.D. and Wolf, P.R. (2006). Adjustment Computations: Spatial Data Analysis, 4th Edition. Wiley & Sons, Inc., Hoboken, New Jersey.
[v] Schofield, W. (2001). Engineering Surveying: Theory and Examination Problems for Students. Butterworth-Heinemann, London.
[vi] Schofield, W. and Breach, M. (1997). Engineering Surveying, 6th Edition. Elsevier Ltd.
2. TRIANGULATION
Introduction
A horizontal control survey provides a framework of survey points, whose relative positions, in two
dimensions, are known to specified degrees of accuracy. The areas covered by these points may extend
over a whole country and form the basis for the national maps of that country. Alternatively the area may
be relatively small, encompassing a construction site for which a large-scale plan is required. Although the
areas covered in construction are usually quite small, the accuracy may be required to a very high order.
The types of engineering project envisaged are the construction of long tunnels and/or bridges,
deformation surveys for dams and reservoirs, three-dimensional tectonic ground movement for landslide
prediction, to name just a few. Hence control networks provide a reference framework of points for:
(i) Topographic mapping and large-scale plan production.
(ii) Dimensional control of construction work.
(iii) Deformation surveys for all manner of structures, both new and old.
(iv) The extension and densification of existing control networks.
Control frameworks can be formed in many different ways using a variety of observing techniques. Their position and orientation can be fixed with respect to other data, such as the national coordinate system or magnetic north.
Triangulation
Because, at one time, it was easier to measure angles than distances, triangulation was the preferred
method of establishing the position of control points. Many countries used triangulation as the basis of
their national mapping system. The procedure was generally to establish primary triangulation networks,
with triangles having sides ranging from 30 to 50 km in length. The primary trig points were fixed at the
corners of these triangles and the sum of the measured angles was correct to ±3”. These points were
usually established on the tops of mountains to afford long, uninterrupted sight lines. The primary
network was then densified with points at closer intervals connected into the primary triangles. This
secondary network had sides of 10–20 km with a reduction in observational accuracy. Finally, a third-
order net, adjusted to the secondary control, was established at 3–5-km intervals and fourth-order points
fixed by intersection (to be discussed later). The base line and check base line were measured by invar
tapes in catenary and connected into the triangulation by angular extension procedures. This approach is
classical triangulation, which is now obsolete. The more modern approach would be to use GPS which
would be much easier and would afford greater control of scale error.
In triangulation, the area is divided into a series of standard geometrical figures, the corners of which form
a series of accurately located control points/stations. In the figure above, AB is the base line. Its length
and orientation are known. The coordinates of A are also known. The angles a, b, c, d, …, t are measured.
The other sides of the whole triangulation and the coordinates of the stations are worked out from these
observed angles.
Measurement of one line provides the scale of the survey, while the angular observations define its shape. More data than the minimum are measured for checking purposes and for adjustment to improve precision.
These simple designs are useful when EDM instruments and calculating aids are not available, since distance measurement and calculation can be kept to a minimum; distance measurement by tape is difficult and tedious when carried out to the same order of accuracy as good angle measurements. When EDM equipment is available, the layout of the control scheme need not be limited to braced quadrilaterals and center-point polygons. Alternatively, only distances may be measured, producing a trilateration framework. Most modern control schemes involve both angular and distance measurements; this is called triangulateration.
Choice of Stations
Use existing maps of the area to be covered by the triangulation scheme during reconnaissance to select the most suitable control points. The following should be kept in mind when planning the triangulation scheme:
(i) Every station should be visible from adjacent stations; grazing rays, or rays passing through obstacles, should be avoided;
(ii) Triangles should be well conditioned (nearly equilateral), and no angle should be less than 30°;
(iii) Make the scheme as simple as possible, but with redundant observations;
(iv) Triangles should be as large as possible.
Once the framework has been designed, stations are marked and beaconed. Stations should be visible to a theodolite at a distance: poles with flags, or tall tripods with brightly colored conical tops, are the norm. If observations are to be made at night, luminous beacons are usually employed.
Angular Measurements
Horizontal angles are measured using the method of rounds to improve accuracy. The theodolite is set up on, say, station O (centering, levelling and elimination of parallax); the optical micrometer is set to read approximately zero in the face-left (FL) position and a pointing is made on the reference station A.
Moving in a clockwise direction, the instrument is pointed to B and the horizontal circle reading noted. The same is done for C, D, E and finally back to A. The face is then changed and pointings made to stations E, D, C and B. The final reading on A should now differ from the first by 180°. This completes one set (round) of readings.
The zero setting is then altered to use other parts of the circle and a similar procedure is followed. The number of zeros depends on the degree of precision required. For primary triangulation, with side lengths of up to 50 km, 16-32 zeros may be taken; for secondary triangulation, with side lengths of up to 15 km, 8-16 zeros.
Individual angles are deduced from the readings in the usual way, and if the difference between any one reading and the mean for that angle exceeds 4″, the round should be repeated before leaving the station.
Distance Measurement
The side lengths of the triangulation framework are measured using EDM. For larger triangles, GPS may be used to obtain coordinates directly. For simple schemes only the base line is measured, but it is good practice to have more than one measured length as a check on the quality of the work.
Before the advent of EDM, at least one length had to be measured by taping: repeated measurements were made with an invar band hung in catenary, and corrections for tension, temperature, etc. had to be applied. This gave accuracies of the order of 0.0001 m, i.e. about 1:500 000. GPS and EDM have rendered these methods obsolete.
Accuracy of Triangulation
Precision in triangulation is measured by the average triangular error, which is the average deviation of the sum of the measured angles in each triangle from 180° after correction for spherical excess. For small triangles with sides of the order of 2 km, the curvature of the Earth may be neglected and the three measured angles should sum to 180°. In larger triangles, the curvature of the Earth results in the three angles adding up to more than 180°, and the excess is known as the spherical excess:

Σ(measured angles) − (180° + E) = ε

where E is the spherical excess and ε is the triangular error. The spherical excess is given by

E = A / (R² sin 1″)

where A is the area of the triangle and R is the mean radius of the Earth. For all works (except geodetic) the area of the triangle can be estimated as if it were a plane, so that:

E = (area of triangle in km² / 1000) × 5.09″

In geodetic work (highest accuracy) the average value of ε should be less than 1″ of arc.
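These relations can be checked numerically. The sketch below is a minimal illustration (the function names are mine, not from the notes), using a mean Earth radius of 6371 km:

```python
import math

def spherical_excess_arcsec(area_km2, R_km=6371.0):
    """Spherical excess E (in arc-seconds): E = A / (R^2 sin 1")."""
    return area_km2 / (R_km ** 2 * math.sin(math.radians(1 / 3600)))

def triangular_error_arcsec(angles_deg, excess_arcsec=0.0):
    """Misclosure: sum(measured angles) - (180 deg + E), in arc-seconds."""
    return sum(angles_deg) * 3600 - (180 * 3600 + excess_arcsec)
```

For a 1000 km² triangle this gives E ≈ 5.08″, which is where the 5.09″-per-1000 km² plane approximation comes from.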
Computation of Coordinates
Coordinates are computed using the basic formulae for obtaining coordinates from observed angles and distances, as discussed previously. In triangulation, more data are usually observed than are strictly necessary, and hence some adjustment is required to obtain the best unique set of coordinates.
Application of Triangulation
Horizontal control for majority of survey work is carried out by traversing because distances are easily
measured by EDMs. However, triangulation, trilateration and triangulateration are usually selected for:
i) Establishment of accurately located control points for surveys of large areas.
ii) Accurate location of engineering works (center lines, shafts for long tunnels, etc.).
iii) Accurate control points for aerial photogrammetry.
iv) Measurement of deformations of structures e.g. dams.
3. POINT POSITIONING TECHNIQUES
Intersection
This is a process of locating and coordinating a point from at least two existing control stations by
observing horizontal directions at the control points.
Applications:
i) Coordinating new control points. Could be high or inaccessible points;
ii) Surveying detail in inaccessible positions;
iii) Point location in industrial measurement system.
(a) Intersection with angles
Consider the figure above, A and B are control points and the coordinates of point Q are to be determined
as follows:
Compute the bearing AB and the distance AB;
Measure angles a and b;
Determine lengths AQ and BQ using the sine rule;
Establish the bearings AQ and BQ;
Compute the coordinates of Q using bearing AQ and distance S_AQ, or bearing BQ and distance S_BQ.
The use of standard expressions is common if a number of points are to be located from A and B.
From similar triangles ARP and ASB:

E_P = E_A + (AP/AB)(E_B − E_A)
E_Q = E_P + PQ cos φ

so that

E_Q = E_A + (AP/AB)(E_B − E_A) + PQ(N_B − N_A)/AB

But AP = PQ cot a and AB = PQ cot a + PQ cot b, so

E_Q = E_A + [PQ cot a/(PQ cot a + PQ cot b)](E_B − E_A) + PQ(N_B − N_A)/(PQ cot a + PQ cot b)

which simplifies to

E_Q = (E_B cot a + E_A cot b + N_B − N_A)/(cot a + cot b)

Similarly,

N_Q = (N_B cot a + N_A cot b + E_A − E_B)/(cot a + cot b)
ABQ is lettered in a clockwise direction and care must be taken to ensure that the data is presented in a
similar manner.
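The standard expressions can be sketched as follows (the function name and test coordinates are mine), with A, B, Q lettered clockwise, a the angle measured at A and b the angle measured at B:

```python
import math

def intersect_angles(EA, NA, EB, NB, a_deg, b_deg):
    """Intersection from the baseline AB: a is the angle at A (from AB
    round to AQ), b the angle at B (from BQ round to BA); A, B, Q are
    lettered in a clockwise direction."""
    cot_a = 1 / math.tan(math.radians(a_deg))
    cot_b = 1 / math.tan(math.radians(b_deg))
    EQ = (EB * cot_a + EA * cot_b + NB - NA) / (cot_a + cot_b)
    NQ = (NB * cot_a + NA * cot_b + EA - EB) / (cot_a + cot_b)
    return EQ, NQ
```

With A = (0, 0), B = (100, 0) and a = b = 45°, Q comes out at (50, −50), i.e. on the clockwise side of AB, as the lettering convention requires.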
(b) Intersection with known bearings
tan A1 = (E3 − E1)/(N3 − N1) …… (i)
tan A2 = (E3 − E2)/(N3 − N2) …… (ii)

Rearranging (i) and (ii) to make E3 the subject:

E1 + (N3 − N1) tan A1 = E3 …… (ia)
E2 + (N3 − N2) tan A2 = E3 …… (ib)

Equating (ia) and (ib):

E1 + (N3 − N1) tan A1 = E2 + (N3 − N2) tan A2

Making N3 the subject:

N3 = [(E2 − E1) + N1 tan A1 − N2 tan A2]/(tan A1 − tan A2)

Similarly:

E3 = [(N2 − N1) + E1 cot A1 − E2 cot A2]/(cot A1 − cot A2)
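A short sketch of these expressions (the function name is mine). Here E3 is recovered by back-substituting N3 into equation (ia) rather than using the cotangent form, which gives the same point:

```python
import math

def intersect_bearings(E1, N1, brg1_deg, E2, N2, brg2_deg):
    """Point 3 from stations 1 and 2 with known whole-circle bearings
    brg1 (1 -> 3) and brg2 (2 -> 3)."""
    t1 = math.tan(math.radians(brg1_deg))
    t2 = math.tan(math.radians(brg2_deg))
    N3 = ((E2 - E1) + N1 * t1 - N2 * t2) / (t1 - t2)
    E3 = E1 + (N3 - N1) * t1   # back-substitute into equation (ia)
    return E3, N3
```

As a quick check, bearings of 135° from (0, 0) and 225° from (100, 0) intersect at (50, −50).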
(c) Intersection with distances (trilateration)
N3 = ½(N1 + N2) + (l1² − l2²)(N1 − N2)/(2l3²) + 2A(E1 − E2)/l3²
The area of the triangle is computed using the three sides
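A sketch of the computation (the function name is mine, and the companion expression for E3 is my symmetric completion of the N3 formula, which the notes do not show). The measured sides are labelled by the opposite station, l1 = distance 2–3 and l2 = distance 1–3, with l3 the baseline 1–2 and station 3 on the clockwise side of 1–2:

```python
import math

def intersect_distances(E1, N1, E2, N2, l1, l2):
    """Trilateration fix of point 3 from stations 1 and 2.
    l1 = measured distance 2-3, l2 = measured distance 1-3 (sides
    labelled by the opposite station); 1, 2, 3 lettered clockwise."""
    l3 = math.hypot(E2 - E1, N2 - N1)
    s = (l1 + l2 + l3) / 2                              # semi-perimeter
    A = math.sqrt(s * (s - l1) * (s - l2) * (s - l3))   # Heron's formula
    N3 = (N1 + N2) / 2 + (l1**2 - l2**2) * (N1 - N2) / (2 * l3**2) \
         + 2 * A * (E1 - E2) / l3**2
    E3 = (E1 + E2) / 2 + (l1**2 - l2**2) * (E1 - E2) / (2 * l3**2) \
         + 2 * A * (N2 - N1) / l3**2
    return E3, N3
```

For stations 1 = (0, 0) and 2 = (100, 0) with l1 = √6500 and l2 = 50, this returns (30, −40), which can be verified directly from the distances.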
Resection
It is a method of locating a single point by measuring horizontal angles from it to three visible stations
whose positions are known. The method is often called the three-point problem (a special case of simple triangulation) and gives a weaker solution than intersection. It is an extremely useful technique for quickly fixing position, and is particularly suited to setting-out purposes.
The theodolite occupies station P and the angles α and β are measured between stations A and B, and between B and C.
Analytical method
Let angle BAP be θ; then
BCP = (360° − α − β − φ) − θ = S − θ
where φ (the angle ABC) is computed from the coordinates of A, B and C; therefore S is known.
From triangle PAB, PB = BA sin θ / sin α …………… (i)
From triangle PCB, PB = BC sin(S − θ) / sin β …………… (ii)
Equating (i) and (ii) gives θ.
The co-ordinates of P can be solved with the three values; distances and bearings of AP, BP and CP. This
method fails if P lies on the circumference of a circle passing through A, B, and C, and has an infinite
number of positions.
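The algebra above can be packaged as a short sketch (the function and variable names are mine; the solution assumes the standard figure, with P facing the arc A-B-C and the clockwise angle α from PA to PB and β from PB to PC observed at P):

```python
import math

def brg(p, q):
    """Whole-circle bearing p -> q (coordinates given as (E, N))."""
    return math.atan2(q[0] - p[0], q[1] - p[1]) % (2 * math.pi)

def resect(A, B, C, alpha_deg, beta_deg):
    """Analytical resection: locate P from clockwise angles alpha
    (PA to PB) and beta (PB to PC) observed at P. A sketch for the
    standard configuration, not a general implementation."""
    alpha, beta = math.radians(alpha_deg), math.radians(beta_deg)
    BA = math.hypot(A[0] - B[0], A[1] - B[1])
    BC = math.hypot(C[0] - B[0], C[1] - B[1])
    phi = (brg(B, A) - brg(B, C)) % (2 * math.pi)  # angle ABC from coords
    S = 2 * math.pi - alpha - beta - phi
    # Equate PB from triangles PAB and PCB, solve for theta = angle BAP:
    #   BA sin(theta)/sin(alpha) = BC sin(S - theta)/sin(beta)
    Q = BC * math.sin(alpha) / (BA * math.sin(beta))
    theta = math.atan2(Q * math.sin(S), 1 + Q * math.cos(S))
    PB = BA * math.sin(theta) / math.sin(alpha)
    # In triangle PAB the angle at B is 180 deg - alpha - theta
    brg_BP = (brg(B, A) - (math.pi - alpha - theta)) % (2 * math.pi)
    return B[0] + PB * math.sin(brg_BP), B[1] + PB * math.cos(brg_BP)
```

A self-check: stations at bearings 315°, 0° and 60° from the origin (so α = 45°, β = 60°) should resect back to (0, 0).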
Danger Circle
Although the computations will always give E and N coordinates for the resected station, those coordinates may well be unreliable. In choosing the resection station, care should be exercised to ensure that it does not lie on, or close to, the circumference of the "danger circle" through A, B and C.
Tienstra Formula
It is the standard formula used when the horizontal coordinates of several points have to be determined by the method of resection. The stations are lettered/numbered in a clockwise direction:
α is the clockwise angle between directions PB and PC;
β is the clockwise angle between directions PC and PA;
γ is the clockwise angle between directions PA and PB.
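The Tienstra formula itself is not reproduced in this excerpt; the standard form, with K-weights built from the cotangents of the triangle angles at A, B, C and of the angles at P, can be sketched as follows (function and variable names are mine):

```python
import math

def tienstra(A, B, C, alpha_deg, beta_deg, gamma_deg):
    """Tienstra resection: A, B, C lettered clockwise; alpha, beta,
    gamma are the clockwise angles at P defined above (summing to
    360 deg).  P = (K_A*A + K_B*B + K_C*C)/(K_A + K_B + K_C) with
    K_A = 1/(cot A - cot alpha), etc."""
    def tri_angle(p, q, r):
        # interior angle of triangle ABC at vertex p (cosine rule)
        a = math.hypot(q[0] - p[0], q[1] - p[1])
        b = math.hypot(r[0] - p[0], r[1] - p[1])
        c = math.hypot(q[0] - r[0], q[1] - r[1])
        return math.acos((a * a + b * b - c * c) / (2 * a * b))
    def K(tri_ang, p_ang_deg):
        return 1 / (1 / math.tan(tri_ang) - 1 / math.tan(math.radians(p_ang_deg)))
    KA = K(tri_angle(A, B, C), alpha_deg)
    KB = K(tri_angle(B, C, A), beta_deg)
    KC = K(tri_angle(C, A, B), gamma_deg)
    s = KA + KB + KC
    return ((KA * A[0] + KB * B[0] + KC * C[0]) / s,
            (KA * A[1] + KB * B[1] + KC * C[1]) / s)
```

A quick self-check is to compute α, β, γ from a known P and confirm the formula returns that point.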
Question: A new control point F is to be established from existing control points T and D as shown in the figure below. The horizontal clockwise angles at T and D have been observed as DTF = 44°52′36″ and TDF = 284°26′38″ respectively. From the data given below, determine the coordinates of station F.
4. CURVES
The center line of a road consists of a series of straight lines interconnected by curves that are used to
change the alignment, direction, or slope of the road. Those curves that change the alignment or direction
are known as horizontal curves, and those that change the slope are vertical curves.
Horizontal Curves
When a highway changes horizontal direction, making the point where it changes direction a point of
intersection between two straight lines is not feasible. The change in direction would be too abrupt for
the safety of modern, high-speed vehicles. It is therefore necessary to interpose a curve between the
straight lines. The straight lines of a road are called tangents because the lines are tangent to the curves
used to change direction. In practically all modern highways, the curves are circular curves; that is, curves
that form circular arcs. The smaller the radius of a circular curve, the sharper the curve. For modern, high-
speed highways, the curves must be flat, rather than sharp. That means they must be large-radius curves.
In highway work, the curves needed for the location or improvement of small secondary roads may be
worked out in the field. Usually, however, the horizontal curves are computed after the route has been
selected, the field surveys have been done, and the survey base line and necessary topographic features
have been plotted. In urban work, the curves of streets are designed as an integral part of the preliminary
and final layouts, which are usually done on a topographic map. In highway work, the road itself is the
end result and the purpose of the design. But in urban work, the streets and their curves are of secondary
importance; the best use of the building sites is of primary importance.
The principal consideration in the design of a curve is the selection of the length of the radius or the degree
of curvature (explained later).
3. REVERSE. A reverse curve consists of two simple curves joined together but curving in opposite directions. For safety reasons, the use of this curve should be avoided when possible.
4. SPIRAL. The spiral is a curve that has a varying radius. It is used on railroads and most modern highways.
Its purpose is to provide a transition from the tangent to a simple curve or between simple curves in a
compound curve.
POC POINT OF CURVE. The point of curve is any point along the curve.
L LENGTH OF CURVE. The length of curve is the distance from the PC to the PT, measured along the curve.
T TANGENT DISTANCE. The tangent distance is the distance along the tangents from the PI to the PC or the PT. These distances are equal on a simple curve.
LC LONG CHORD. The long chord is the straight-line distance from the PC to the PT. Other types of chords
are designated as follows:
C The full-chord distance between adjacent stations (full, half, quarter, or one tenth stations)
along a curve.
C1 The subchord distance between the PC and the first station on the curve.
C2 The subchord distance between the last station on the curve and the PT.
E EXTERNAL DISTANCE. The external distance (also called the external secant) is the distance from the PI
to the midpoint of the curve. The external distance bisects the interior angle at the PI.
M MIDDLE ORDINATE. The middle ordinate is the distance from the midpoint of the curve to the midpoint
of the long chord. The extension of the middle ordinate bisects the central angle.
D DEGREE OF CURVE. The degree of curve defines the sharpness or flatness of the curve.
Curve Formulas
The relationship between the elements of a curve is expressed in a variety of formulas. The equations that
will be used in the computation of curves are discussed below.
The curve length L is the distance from the PC to the PT, measured along the curve.
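The standard relationships between these elements, for a radius R and deflection (intersection) angle Δ, can be collected in one place (a sketch; the function name is mine):

```python
import math

def curve_elements(R, delta_deg):
    """Circular-curve elements from radius R and deflection angle
    delta: tangent distance T, curve length L, long chord LC,
    external distance E and middle ordinate M."""
    d = math.radians(delta_deg)
    T = R * math.tan(d / 2)              # PI to PC (= PI to PT)
    L = R * d                            # arc length PC to PT
    LC = 2 * R * math.sin(d / 2)         # straight-line chord PC to PT
    E = R * (1 / math.cos(d / 2) - 1)    # PI to midpoint of curve
    M = R * (1 - math.cos(d / 2))        # mid-curve to midpoint of LC
    return T, L, LC, E, M
```

For example, R = 100 and Δ = 60° give T ≈ 57.735, L ≈ 104.720 and LC = 100.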
Degree of Curvature
The last of the elements of a curve listed above (degree of curve) deserves special attention. Curvature
may be expressed by simply stating the length of the radius of the curve. Stating the radius is a common
practice in land surveying and in the design of urban roads. For highway and railway work, however,
curvature is expressed by the degree of curve. Two definitions are used for the degree of curve: the arc definition and the chord definition.
Under the arc definition, the degree of curve is the central angle subtended by an arc 100 feet (or 100 meters) long; under the chord definition, it is the central angle subtended by a chord 100 feet (or 100 meters) long. If you take a flat curve, mark a 100-foot chord, and determine the central angle to be 0°30′, then you have a 30-minute curve (chord definition).
Notice that in both the arc definition and the chord definition, the radius of curvature is inversely
proportional to the degree of curvature. In other words, the larger the degree of curve, the shorter the
radius; for example, using the arc definition, the radius of a 1° curve is 5,729.58 units, and the radius of a
5° curve is 1,145.92 units. Under the chord definition, the radius of a 1° curve is 5,729.65 units, and the
radius of a 5° curve is 1,146.28 units.
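The two definitions convert to radius as follows (a sketch; the function names are mine). With 100-foot stations, the arc definition gives R = 100/D (D in radians) and the chord definition gives R = 50/sin(D/2):

```python
import math

def radius_arc_def(D_deg, unit=100.0):
    """Arc definition: D is the central angle subtended by a
    100-unit arc."""
    return unit / math.radians(D_deg)

def radius_chord_def(D_deg, unit=100.0):
    """Chord definition: D is the central angle subtended by a
    100-unit chord."""
    return (unit / 2) / math.sin(math.radians(D_deg) / 2)
```

These reproduce the figures quoted above: 5,729.58 and 1,145.92 units for 1° and 5° curves under the arc definition, and 5,729.65 and 1,146.28 units under the chord definition.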
A further point to note in regard to chainage is that if the chainage at I1 is known, then the chainage at T1 = Chn I1 − tangent length I1T1. However, the chainage at T2 = Chn T1 + curve length, as chainage is measured along the route under construction.
Consider Figure 10.3. The straights OI1, I1I2, I2I3, etc., will have been designed on the plan in the first
instance. Using railway curves, appropriate curves will now be designed to connect the straights. The
tangent points of these curves will then be fixed, making sure that the tangent lengths are equal, i.e. T1I1
= T2I1 and T3I2 = T4I2. The coordinates of the origin, point O, and all the intersection points only will now be
carefully scaled from the plan. Using these coordinates, the bearings of the straights are computed and,
using the tangent lengths on these bearings, the coordinates of the tangent points are also
computed. The difference of the bearings of the straights provides the deflection angles (Δ) of the curves
which, combined with the tangent length, enables computation of the curve radius, through chainage and
all setting-out data. Now the tangent and intersection points are set out from existing control survey
stations and the curves ranged between them using the methods detailed below.
prepare a list of appropriate stations and cumulative deflection angles:

Chord No.  Chord length (m)  Chainage    Deflection angle  Setting-out angle  Remarks
0          0                 0+196.738   00°00′00″         00°00′00″          BC
1          3.262             0+200       00°14′01″         00°14′01″          Peg 1
2          20                0+220       01°25′57″         01°39′58″          Peg 2
3          20                0+240       01°25′57″         03°05′55″          Peg 3
4          20                0+260       01°25′57″         04°31′52″          Peg 4
5          20                0+280       01°25′57″         05°57′49″          Peg 5
6          6.448             0+286.448   00°27′42″         06°25′31″          Peg 6 (EC)
Station 0+220 can be located by placing a stake on the theodolite line at 01°39′58″, at a distance of 20 m along the arc from the stake that locates 0+200. The other stations are located in a similar manner.
Chord Computations
The above technique has errors because the distances measured are not arc distances; they are chords/subchords. Subchords can be computed using the chord equation derived earlier: any subchord can be calculated once its deflection angle is known, and for the last example the relevant chords are computed in this way.
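Both the deflection angles and the corresponding chords can be generated together. The sketch below (names are mine) assumes a radius of 400 m, which is not stated in this excerpt but reproduces the tabulated angles, and uses δ = arc/2R with chord = 2R sin δ:

```python
import math

def setting_out_table(R, chn_BC, chn_EC, interval=20.0):
    """Pegs on a circular curve at through-chainage multiples of
    `interval`.  Returns (chainage, arc from previous peg, chord to
    lay out, cumulative setting-out angle in degrees) per peg."""
    rows, total, chn = [], 0.0, chn_BC
    nxt = math.ceil(chn_BC / interval) * interval
    while chn < chn_EC - 1e-9:
        end = min(nxt, chn_EC)
        arc = end - chn
        delta = arc / (2 * R)               # deflection angle (radians)
        chord = 2 * R * math.sin(delta)     # subchord actually taped
        total += math.degrees(delta)
        rows.append((end, arc, chord, total))
        chn, nxt = end, nxt + interval
    return rows

rows = setting_out_table(400.0, 196.738, 286.448)
```

The first peg comes out at 0°14′01″ and the final cumulative angle at about 6°25′30″, matching the table to within the 1″ rounding of the individual angles; at this radius the 3.262 m and 6.448 m subchords differ from their arcs by well under a millimetre.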
Vertical Curves
In addition to horizontal curves that go to the right or left, roads also have vertical curves that go up or
down. Vertical curves at a crest or the top of a hill are called summit curves, or oververticals. Vertical
curves at the bottom of a hill or dip are called sag curves, or underverticals.
Grades
Vertical curves are used to connect stretches of road that go up or down at a constant slope. These lines
of constant slope are called grade tangents. The rate of slope is called the gradient, or simply the grade.
Grades that ascend in the direction of the stationing are designated as plus; those that descend in the
direction of the stationing are designated as minus. Grades are measured in terms of percent; that is, the
number of feet of rise or fall in a 100-foot horizontal stretch of the road.
After the location of a road has been determined and the necessary fieldwork has been obtained, the
engineer designs or fixes (sets) the grades. A number of factors are considered, including the intended
use and importance of the road and the existing topography. If a road is too steep, the comfort and safety
of the users and fuel consumption of the vehicles will be adversely affected; therefore, the design criteria
will specify maximum grades.
Typical maximum grades are a 4-percent desired maximum and a 6-percent absolute maximum for a
primary road. (The 6 percent means, as indicated before, a 6-foot rise for each 100 feet ahead on the
road.) For a secondary road or a major street, the maximum grades might be a 5-percent desired and an
8-percent absolute maximum; and for a tertiary road or a secondary street, an 8-percent desired and a
10-percent (or perhaps a 12-percent) absolute maximum. Conditions may sometimes demand that grades
or ramps, driveways, or short access streets go as high as 20 percent. The engineer must also consider
minimum grades. A street with curb and gutter must have enough fall so that the storm water will drain
to the inlets; 0.5 percent is a typical minimum grade for curb and gutter (that is, a 1/2-foot minimum fall for
each 100 feet ahead). For roads with side ditches, the desired minimum grade might be 1 percent; but
since ditches may slope at a grade different from the pavement, a road may be designed with a zero-
percent grade. Zero-percent grades are not unusual, particularly through plains or tidewater areas.
Another factor considered in designing the finished profile of a road is the earthwork balance; that is, the
grades should be set so that all the soil cut off of the hills may be economically hauled to fill in the low
areas. In the design of urban streets, the best use of the building sites next to the street will generally be
more important than seeking an earthwork balance.
Elements of a Vertical Curve
The figure below shows the elements of a vertical curve. The meaning of the symbols and the units of
measurement usually assigned to them follow:
PVC Point of vertical curvature; the place where the curve begins.
x Horizontal distance from the PVC to any POVC or POVT back of the PVI, or the distance from the PVT to any POVC or POVT ahead of the PVI, measured in feet.
y Vertical distance (offset) from any POVT to the corresponding POVC, measured in feet. It is given by y = (x/l)²e, which is the fundamental relationship of the parabola that permits convenient calculation of the vertical offsets.
The vertical curve computation takes place after the grades have been set and the curve designed.
Therefore, at the beginning of the detailed computations, the following are known: g1, g2, l1, l2, L, and the
elevation of the PVI. The general procedure is to compute the elevations of certain POVTs and then to use
the foregoing formulas to compute G, then e, and then the y values that correspond to the selected POVTs.
When the y is added or subtracted from the elevation of the POVT, the result is the elevation of the POVC.
The POVC is the finished elevation on the road, which is the end result being sought. In the figure showing
the elements of a vertical curve above, the y is subtracted from the elevation of the POVT to get the
elevation of the curve; but in the case of a sag curve, the y is added to the POVT elevation to obtain the
POVC elevation.
The computation of G requires careful attention to the signs of g1 and g2. Vertical curves are used at
changes of grade other than at the top or bottom of a hill; for example, an uphill grade that intersects an
even steeper uphill grade will be eased by a vertical curve. The six possible combinations of plus and minus
grades, together with sample computations of G, are shown in figure below. Note that the algebraic sign
for G indicates whether to add or subtract y from a POVT.
The selection of the points at which to compute the y and the elevations of the POVT and POVC is generally
based on the stationing. The horizontal alignment of a road is often staked out on 50-foot or 100-foot
stations. Customarily, the elevations are computed at these same points so that both horizontal and
vertical information for construction will be provided at the same point. The PVC, PVI, and PVT are usually
set at full stations or half stations. In urban work, elevations are sometimes computed and staked every
25 feet on vertical curves. The same, or even closer, intervals may be used on complex ramps and
interchanges.
Symmetrical Vertical Curves
A symmetrical vertical curve is one in which the horizontal distance from the PVI to the PVC is equal to
the horizontal distance from the PVI to the PVT. In other words, l1 equals l2. The solution of a typical
problem dealing with a symmetrical vertical curve will be presented step by step. Assume that you know
the following data:
The problem is to compute the grade elevation of the curve to the nearest hundredth of a foot at each
50-foot station. Figure below shows the vertical curve to be solved.
table of computation of elevations on symmetrical vertical curve
STEP 1: Prepare a table as the one above. In this table, column 1 shows the stations; column 2, the
elevations on tangent; column 3, the ratio of x/l; column 4, the ratio of (x/l)²; column 5, the vertical offsets
[(x/l)²e]; column 6, the grade elevations on the curve; column 7, the first difference; and column 8, the
second difference.
STEP 2: Compute the elevations and set the stations on the PVC and the PVT. Knowing both the gradients
at the PVC and PVT and the elevation and station at the PVI, you can compute the elevations and set the
stations on the PVC and the PVT. The gradient (g1) of the tangent at the PVC is given as +9 percent. This
means a rise in elevation of 9 feet for every 100 feet of horizontal distance. Since L is 400.00 feet and the
curve is symmetrical, l1 equals l2 equals 200.00 feet; therefore, there will be a difference of 9 x 2, or 18,
feet between the elevation at the PVI and the elevation at the PVC. The elevation at the PVI in this problem
is given as 239.12 feet; therefore, the elevation at the PVC is
239.12 – 18 = 221.12 feet.
Calculate the elevation at the PVT in a similar manner. The gradient (g2) of the tangent at the PVT is given as –7 percent. This means a drop in elevation of 7 feet for every 100 feet of horizontal distance. Since l1 equals l2 equals 200 feet, there will be a difference of 7 x 2, or 14, feet between the elevation at the PVI and the elevation at the PVT. The elevation at the PVT therefore is 239.12 – 14 = 225.12 feet.
In setting stations on a vertical curve, remember that the length of the curve (L) is always measured as a
horizontal distance. The half-length of the curve is the horizontal distance from the PVI to the PVC. In this
problem, l1 equals 200 feet. That is equivalent to two 100-foot stations and may be expressed as 2 + 00.
Thus the station at the PVC is
30 + 00 minus 2 + 00, or 28 + 00.
The station at the PVT is
30 + 00 plus 2 + 00, or 32 + 00.
List the stations under column 1.
STEP 3: Calculate the elevations at each 50-foot station on the tangent. From Step 2, you know there is a
9-foot rise in elevation for every 100 feet of horizontal distance from the PVC to the PVI. Thus, for every
50 feet of horizontal distance, there will be a rise of 4.50 feet in elevation. The elevation on the tangent
at station 28 + 50 is
221.12 + 4.50 = 225.62 feet.
The elevation on the tangent at station 29 + 00 is
225.62 + 4.50 = 230.12 feet.
The elevation on the tangent at station 29 + 50 is
230.12 + 4.50 = 234.62 feet.
The elevation on the tangent at station 30 + 00 is
234.62 + 4.50 = 239.12 feet.
In this problem, to find the elevation on the tangent at any 50-foot station starting at the PVC, add 4.50
to the elevation at the preceding station until you reach the PVI. At this point use a slightly different
method to calculate elevations because the curve slopes downward toward the PVT. Think of the
elevations as being divided into two groups—one group running from the PVC to the PVI; the other group
running from the PVT to the PVI. Going downhill on a gradient of –7 percent from the PVI to the PVT, there
will be a drop of 3.50 feet for every 50 feet of horizontal distance. To find the elevations at stations
between the PVI to the PVT in this particular problem, subtract 3.50 from the elevation at the preceding
station. The elevation on the tangent at station 30 + 50 is 239.12 – 3.50, or 235.62 feet.
The elevation on the tangent at station 31 + 00 is 235.62 – 3.50, or 232.12 feet.
The elevation on the tangent at station 31 + 50 is 232.12 – 3.50, or 228.62 feet.
The elevation on the tangent at station 32 + 00 (PVT) is 228.62 – 3.50, or 225.12 feet.
The last subtraction provides a check on the work you have finished. List the computed elevations under
column 2.
STEP 4: Calculate (e), the middle vertical offset at the PVI.
First, find G, the algebraic difference of the gradients, using the formula
G = g2– g1
G= -7 –(+9)
G= –16%
The middle vertical offset (e) is calculated as follows, with L expressed in 100-foot stations:
e = LG/8 = [(4)(–16)]/8 = –8.00 feet.
The negative sign indicates e is to be subtracted from the PVI.
STEP 5: Compute the vertical offsets at each 50-foot station, using the formula (x/l)²e. To find the vertical offset at any point on a vertical curve, first find the ratio x/l; then square it and multiply by e; for example, at station 28 + 50, the ratio x/l = 50/200 = 1/4.
Therefore, the vertical offset is (1/4)²e = (1/16)e.
The vertical offset at station 28 + 50 equals (1/16)(–8) = –0.50 foot.
Repeat this procedure to find the vertical offset at each of the 50-foot stations. List the results under
columns 3, 4, and 5.
STEP 6: Compute the grade elevation at each of the 50-foot stations. When the curve is on a crest, the
sign of the offset will be negative; therefore, subtract the vertical offset (the figure in column 5) from the
elevation on the tangent (the figure in column 2); for example, the grade elevation at station 29 + 50 is
234.62 – 4.50 = 230.12 feet.
Obtain the grade elevation at each of the stations in a similar manner. Enter the results under column 6.
Note: When the curve is in a dip, the sign will be positive; therefore, you will add the vertical offset (the
figure in column 5) to the elevation on the tangent (the figure in column 2).
STEP 7: Find the turning point on the vertical curve. When the curve is on a crest, the turning point is the
highest point on the curve. When the curve is in a dip, the turning point is the lowest point on the curve.
The turning point will be directly above or below the PVI only when both tangents have the same percent
of slope (ignoring the algebraic sign); otherwise, the turning point will be on the same side of the curve as
the tangent with the least percent of slope.
The horizontal location of the turning point is measured either from the PVC, if the tangent with the lesser
slope begins there, or from the PVT, if the tangent with the lesser slope ends there. The horizontal location
is found by the formula:
xt = gL/G
Where:
xt = distance of turning point from the PVC or PVT (in stations)
g = lesser slope (ignoring signs)
L = length of curve in stations
G = algebraic difference of slopes (ignoring sign).
For the curve we are calculating, the computation is (7 x 4)/16 = 1.75 stations; therefore, the turning
point is 1.75 stations, or 175 feet, from the PVT (station 30 + 25).
The vertical offset for the turning point is found by the formula:
offset = [xt/(L/2)]² x e
For this curve, then, the computation is (1.75/2)² x 8 = 6.12 feet. The elevation of the POVT at 30 + 25
would be 237.37, calculated as explained earlier. The elevation on the curve would be 237.37 − 6.12 =
231.25.
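The turning-point arithmetic of Step 7 can be checked with a short sketch (a hypothetical helper, using the example's values: lesser grade 7%, L = 4 stations, G = –16%, e = –8 ft):

```python
def turning_point(g_lesser, length_sta, G, e):
    """Distance of the turning point from the PVC or PVT (in stations),
    and its vertical offset from the tangent (in feet, magnitude)."""
    xt = g_lesser * length_sta / abs(G)                 # xt = gL/G
    offset = (xt / (length_sta / 2.0)) ** 2 * abs(e)    # [xt/(L/2)]^2 * e
    return xt, offset

xt, off = turning_point(7.0, 4.0, -16.0, -8.0)
# xt = 1.75 stations from the PVT; offset = 6.125 ft (the text rounds to 6.12)
```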
Unsymmetrical Vertical Curves
An unsymmetrical vertical curve is a curve in which the horizontal distance from the PVI to the PVC is
different from the horizontal distance between the PVI and the PVT. In other words, l1 does NOT equal l2.
Unsymmetrical curves are sometimes described as having unequal tangents and are referred to as dog
legs. Figure below shows an unsymmetrical curve with a horizontal distance of 400 feet on the left and a
horizontal distance of 200 feet on the right of the PVI.
The gradient of the tangent at the PVC is –4 percent; the gradient of the tangent at the PVT is +6 percent.
Note that the curve is in a dip.
As an example, let’s assume you are given the following values:
Elevation at the PVI is 332.68
Station at the PVI is 42 + 00
l1 is 400 feet
l2 is 200 feet
g1 is –4%
g2 is +6%
In this example, then, the middle vertical offset at the PVI is calculated in the following manner:
e = [l1 l2/2(l1 + l2)] x (g2 – g1) = [(4 x 2)/2(4 + 2)] x [(+6) – (–4)] = 6.67 feet.
Second, you are cautioned that the check on your computations by the use of second difference does NOT
work out the same way for unsymmetrical curves as for a symmetrical curve. The second difference will
not check for the differences that span the PVI. The reason is that an unsymmetrical curve is really two
parabolas, one on each side of the PVI, having a common POVC opposite the PVI; however, the second
difference will check out back, and ahead, of the first station on each side of the PVI.
Third, the turning point is not necessarily above or below the tangent with the lesser slope.
The horizontal location is found by the use of one of two formulas: xt = (l1)²(g1)/2e, measured from the PVC, or xt = (l2)²(g2)/2e, measured from the PVT.
The procedure is to estimate on which side of the PVI the turning point is located and then use the proper
formula to find its location. If the formula indicates that the turning point is on the opposite side of the
PVI, you must use the other formula to determine the correct location; for example, you estimate that
the turning point is between the PVC and PVI for the curve in figure above. Solving the formula:
xt = (l1)²(g1)/2e
xt = [(4)²(4)]/(2 x 6.67) = 4.80 stations from the PVC, or Station 42 + 80.
However, Station 42 + 80 is between the PVI and PVT; therefore, use the formula:
xt = (l2)²(g2)/2e
xt = [(2)²(6)]/(2 x 6.67) = 1.80 stations from the PVT, or Station 42 + 20.
Station 42 + 20 is the correct location of the turning point. The elevation of the POVT, the amount of the
offset, and the elevation on the curve is determined as previously explained.
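The trial-and-error choice between the two formulas can be sketched as follows (a hypothetical helper, using the example's l1 = 4 and l2 = 2 stations, g1 = –4%, g2 = +6%):

```python
def unsym_turning_point(l1, l2, g1, g2):
    """Return which end the turning point is measured from, and xt in stations."""
    e = (l1 * l2) / (2.0 * (l1 + l2)) * (g2 - g1)   # middle vertical offset
    xt = l1 ** 2 * abs(g1) / (2.0 * abs(e))          # trial: measured from the PVC
    if xt <= l1:
        return "from PVC", xt
    xt = l2 ** 2 * abs(g2) / (2.0 * abs(e))          # otherwise: measured from the PVT
    return "from PVT", xt

side, xt = unsym_turning_point(4.0, 2.0, -4.0, 6.0)
# the PVC-side trial gives 4.80 > l1, so the point is 1.80 stations from the PVT
```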
5. PRINCIPLES OF LEAST SQUARES ADJUSTMENT
Introduction
Error Types
[1] Gross errors: are results of mistakes that are due to the carelessness of the observer. The gross errors
must be detected and eliminated from the survey measurements before such measurements can be
used. Identification can be through verification of recorded data, use of common sense or having
independent repeated checks.
[2] Systematic errors: these arise from some physical phenomenon, the instrument or the environment.
Because they follow the physical laws governing these contributory factors, systematic errors can be
eliminated by applying corrections. They are caused by incorrect calibration or failure to standardize
equipment; imperfect measurement techniques; failure to make necessary corrections; observer bias;
constructional faults in equipment; and environmental conditions.
[3] Random errors: after all mistakes and systematic errors have been detected and removed from the
measurements, there will still remain some errors in the measurements, called the random or
accidental errors. They have no known functional relationship and are quantified by repeated
measurements. The random errors are treated using probability models. Theory of errors and Least
Squares Adjustment deals only with such type of observational errors.
Precision is the degree of consistency between observations. It depends on the stability of the environment at the
time of measurement, the quality of the equipment used to make the observations, and the observer’s
skill with the equipment and observational procedures. A discrepancy is defined as the algebraic
difference between two observations of the same quantity. When small discrepancies exist between
repeated observations, it is generally believed that only small errors exist. Thus, the tendency is to give
higher credibility to such data and to call the observations precise. However, precise values are not
necessarily accurate values.
Accuracy is the measure of the absolute nearness of a measured quantity to its true value. Since the true
value of a quantity can never be determined, accuracy is always an unknown.
Least squares adjustment has the following advantages over other methods of adjustment:
[a] it is the most rigorous of adjustments;
[b] it can be applied with greater ease than other adjustments;
[c] it enables rigorous post adjustment analyses to be made; and
[d] it can be used to perform pre-survey planning.
Least squares adjustment is rigorously based on the theory of mathematical probability, whereas in
general, the other methods do not have this rigorous base. As described later, in a least squares
adjustment, the following condition of mathematical probability is enforced: The sum of the squares of
the errors times their respective weights is minimized. By enforcing this condition in any adjustment, the
set of errors that is computed has the highest probability of occurrence. Another aspect of least squares
adjustment that adds to its rigor is that it permits all observations, regardless of their number or type, to
be entered into the adjustment and used simultaneously in the computations. Thus, an adjustment can
combine distances, horizontal angles, azimuths, zenith or vertical angles, height differences, coordinates,
and even GPS observations. One important additional asset of least squares adjustment is that it enables
‘‘relative weights’’ to be applied to the observations in accordance with their estimated relative
reliabilities. These reliabilities are based on estimated precisions. Thus, if distances were observed in the
same survey by pacing, taping, and using an EDM instrument, they could all be combined in an adjustment
by assigning appropriate relative weights.
Years ago, because of the comparatively heavy computational effort involved in least squares, non-
rigorous or ‘‘rule-of-thumb’’ adjustments were most often used. However, now because computers have
eliminated the computing problem, the reverse is true and least squares adjustments are performed more
easily than these rule-of-thumb techniques. Least squares adjustments are less complicated because the
same fundamental principles are followed regardless of the type of survey or the type of observations.
Also, the same basic procedures are used regardless of the geometric figures involved (e.g., triangles,
closed polygons, quadrilaterals, or more complicated networks). On the other hand, rules of thumb are
not the same for all types of surveys (e.g., level nets use one rule and traverses use another), and they
vary for different geometric shapes. Furthermore, the rule of thumb applied for a particular survey by one
surveyor may be different from that applied by another surveyor. A favorable characteristic of least
squares adjustments is that there is only one rigorous approach to the procedure, and thus no matter
who performs the adjustment for any particular survey, the same results will be obtained.
Least squares has the advantage that after an adjustment has been finished, a complete statistical analysis
can be made of the results. Based on the sizes and distribution of the errors, various tests can be
conducted to determine if a survey meets acceptable tolerances or whether the observations must be
repeated. If blunders exist in the data, these can be detected and eliminated. Least squares enables
precisions for the adjusted quantities to be determined easily.
Besides its advantages in adjusting survey data, least squares can be used to plan surveys. In this
application, prior to conducting a needed survey, simulated surveys can be run in a trial-and-error
procedure. For any project, an initial trial geometric figure for the survey is selected. Based on the figure,
trial observations are either computed or scaled. Relative weights are assigned to the observations in
accordance with the precision that can be estimated using different combinations of equipment and field
procedures. A least squares adjustment of this initial network is then performed and the results analyzed.
If goals have not been met, the geometry of the figure and the observation precisions are varied and the
adjustment performed again. In this process different types of observations can be used, and observations
can be added or deleted. These different combinations of geometric figures and observations are varied
until one is achieved that produces either optimum or satisfactory results. The survey crew can then
proceed to the field, confident that if the project is conducted according to the design, satisfactory results
will be obtained.
The middle value of a numerically ordered data set is
also known as the median. When there is an even number of values, the median is given by the average
of the two values closest to (which straddle) the midpoint.
Histogram
Although an ordered numerical tabulation of data allows for some data distribution analysis, it can be
improved with a frequency histogram, usually called simply a histogram. Histograms are bar graphs that
show the frequency distributions in data. To create a histogram, the data are divided into classes: subregions
of data that usually have a uniform range in values, or class width. Although there are no
universally applicable rules for the selection of class width, generally 5 to 20 classes are used. As a rule of
thumb, a data set of 30 values may have only five or six classes, whereas a data set of 100 values may
have 10 or more classes. In general, the smaller the data set, the lower the number of classes used.
The histogram class width (range of data represented by each histogram bar) is determined by dividing
the total range by the selected number of classes. Consider data of below table. If they were divided into
seven classes, the class width would be the range divided by the number of classes, or 6.0/7 = 0.857 ≈0.86.
The first class interval is found by adding the class width to the lowest data value. For the data in the table,
the first class interval is from 20.1 to (20.1 + 0.86), or 20.96. This class interval includes all data from 20.1
up to (but not including) 20.96. The next class interval is from 20.96 up to (20.96 + 0.86), or 21.82.
Remaining class intervals are found by adding the class width to the upper boundary value of the
preceding class. The class intervals for the data are listed in column (1) of table below.
After creating class intervals, the number of data values in each interval, called the class frequency, is
tallied. Obviously, having data ordered consecutively as shown aids greatly in this counting process.
Column (2) of below shows the class frequency for each class interval of the data. Often, it is also useful
to calculate the class relative frequency for each interval. This is found by dividing the class frequency by
the total number of observations. For the data, the class relative frequency for the first class interval is
2/50 = 0.04. Similarly, the class relative frequency of the fourth interval (from 22.67 to 23.53) is 13/50 =
0.26. The class relative frequencies for the data are given in column (3) of table below. The sum of all class
relative frequencies is always 1. The class relative frequency enables easy determination of percentages.
For instance, the class interval from 21.82 to 22.67 contains 16% (0.16 x 100%) of the sample observations.
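The class-interval and relative-frequency bookkeeping just described can be sketched in code. Since the 50-value data set from the table is not reproduced here, a small illustrative sample is used instead; the helper name is hypothetical:

```python
def class_table(data, n_classes):
    """Class boundaries, class frequencies, and class relative frequencies."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_classes                      # class width = range / classes
    bounds = [lo + i * width for i in range(n_classes + 1)]
    freq = [0] * n_classes
    for y in data:
        # each class includes its lower bound; the maximum falls in the last class
        k = min(int((y - lo) / width), n_classes - 1)
        freq[k] += 1
    rel = [f / len(data) for f in freq]                # relative frequencies sum to 1
    return bounds, freq, rel

bounds, freq, rel = class_table([1, 2, 2, 3, 3, 3, 4, 4, 5, 6], 5)
```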
A histogram is a bar graph plotted with either class frequencies or relative class frequencies on the
ordinate, versus values of the class interval bounds on the abscissa. Using the data from second table
above, the histogram shown was constructed. Notice that in this figure, relative frequencies have been
plotted as ordinates.
Histograms drawn with the same ordinate and abscissa scales can be used to compare two data sets. If
one data set is more precise than the other, it will have comparatively tall bars in the center of the
histogram, with relatively short bars near its edges. Conversely, the less precise data set will yield a wider
range of abscissa values, with shorter bars at the center.
Items easily seen on a histogram include:
Whether the data are symmetrical about a central value;
The range or dispersion in the measured values;
The frequency of occurrence of the measured values;
The steepness of the histogram, which is an indication of measurement precision
Several possible histogram shapes are shown below. Figure (a) depicts a histogram that is symmetric
about its central value with a single peak in the middle. Figure (b) is also symmetric about the center but
has a steeper slope than Figure (a), with a higher peak for its central value. Assuming the ordinate and
abscissa scales to be equal, the data used to plot Figure (b) are more precise than those used for Figure
(a). Symmetric histogram shapes are common in surveying practice as well as in many other fields. In fact,
they are so common that the shapes are said to be examples of a normal distribution.
Figure (c) has two peaks and is said to be a bimodal histogram. In the histogram of Figure (d), there is a
single peak with a long tail to the left. This results from a skewed data set, and in particular, these data
are said to be skewed to the right. The data of histogram Figure (e) are skewed to the left.
In surveying, the varying histogram shapes just described result from variations in personnel, physical
conditions, and equipment.
Typically, the symbol ŷ is used to represent a sample’s arithmetic mean and the symbol µ is used to
represent the population mean.
[b] Median: this has been mentioned earlier.
[c] Mode: Within a sample of data, the mode is the most frequently occurring value. It is seldom used in
surveying because of the relatively small number of values observed in a typical set of observations.
In small sample sets, several different values may occur with the same frequency, and hence the mode
can be meaningless as a measure of central tendency.
[d] Variance, σ2: a value by which the precision for a set of data is given.
Population variance applies to a data set consisting of an entire population. It is the mean of the
squares of the errors and is given by
σ² = (∑ⁿᵢ₌₁ εᵢ²)/n ……equation 6.4
Sample variance applies to a sample set of data. It is an unbiased estimate for the population variance
and is calculated as
S² = (∑ⁿᵢ₌₁ ʋᵢ²)/(n – 1) ……equation 6.5
Note that the two equations are identical except that ε has been changed to ʋ and n has been changed
to n – 1. Variance can also be given by the below expression (derive!):
S² = (∑yᵢ² – nŷ²)/(n – 1) ……equation 6.6
[e] Standard error, σ: the square root of the population variance. The following equation is written for
the standard error:
σ = √(∑ⁿᵢ₌₁ εᵢ²/n) ……equation 6.7
Where n is the number of observations and ∑𝒏𝒊=𝟏 Ɛ𝟐𝒊 is the sum of the squares of the errors.
Note that both the population variance and the standard error are indeterminate because true values,
and hence errors, are indeterminate. As will be explained later, 68.3% of all observations in a
population data set lie within ±σ of the true value, µ. Thus, the larger the standard error, the more
dispersed are the values in the data set and the less precise is the measurement
[f] Standard deviation, S: the square root of the sample variance. It is calculated using the expression
S = √(∑ⁿᵢ₌₁ ʋᵢ²/(n – 1)) ……equation 6.8
Where S is the standard deviation, n - 1 the degrees of freedom or number of redundancies, and
∑𝒏𝒊=𝟏 ʋ𝟐𝒊 the sum of the squares of the residuals. Standard deviation is an estimate for the standard
error of the population. Since the standard error cannot be determined, the standard deviation is a
practical expression for the precision of a sample set of data. Residuals are used rather than errors
because they can be calculated from most probable values, whereas errors cannot be determined.
As will be discussed later, for a sample data set, 68.3% of the observations will theoretically lie
between the most probable value plus and minus the standard deviation, S.
[g] Standard deviation of the mean: the error in the mean computed from a sample set of measured
values that results because all measured values contain errors. The standard deviation of the mean is
computed from the sample standard deviation according to the equation
Sŷ = ±S/√n ……equation 6.9
Notice that as n → ∞, then 𝑺ŷ → 0. This illustrates that as the size of the sample set approaches the
total population, the computed mean ŷ will approach the true mean µ.
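Equations 6.5 through 6.9 can be collected into one small sketch (a hypothetical helper; residuals are taken as ʋ = ŷ – y, and the alternative form of equation 6.6 can be verified against it):

```python
import math

def sample_stats(y):
    """Mean, sample variance (eq. 6.5), standard deviation (eq. 6.8),
    and standard deviation of the mean (eq. 6.9) for a data set."""
    n = len(y)
    mean = sum(y) / n
    s2 = sum((mean - yi) ** 2 for yi in y) / (n - 1)   # residuals, not errors
    s = math.sqrt(s2)
    return mean, s2, s, s / math.sqrt(n)

mean, s2, s, s_mean = sample_stats([2.0, 4.0, 4.0, 6.0])
```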
Error Propagation
Unknown values are often determined indirectly by making direct measurements of other quantities
which are functionally related to the desired unknowns. Examples in surveying include computing station
coordinates from distance and angle observations, obtaining station elevations from rod readings in
differential leveling, and determining the azimuth of a line from astronomical observations. Since all
quantities that are measured directly contain errors, any values computed from them will also contain
errors. This intrusion, or propagation, of errors that occurs in quantities computed from direct
measurements is called error propagation.
To derive the basic error propagation equation, consider the simple function, 𝑧 = 𝑎1 𝑥1 + 𝑎2 𝑥2 where x1
and x2 are two independently observed quantities with standard errors σ1 and σ2, and a1 and a2 are
constants. By analyzing how errors propagate in this function, a general expression can be developed for
the propagation of random errors through any function.
Since x1 and x2 are two independently observed quantities, they each have different probability density
functions. Let the errors in n determinations of x1 be 𝜀1𝑖 , 𝜀1𝑖𝑖 , 𝜀1𝑖𝑖𝑖 , … . . 𝜀1𝑛 and the errors in n determinations
of x2 be 𝜀2𝑖 , 𝜀2𝑖𝑖 , 𝜀2𝑖𝑖𝑖 , … . . 𝜀2𝑛 ; then zT, the true value of z for each independent measurement, is
z_T = a1(x1ⁱ – ε1ⁱ) + a2(x2ⁱ – ε2ⁱ)
z_T = a1(x1ⁱⁱ – ε1ⁱⁱ) + a2(x2ⁱⁱ – ε2ⁱⁱ)
⋮
The values of z computed from the observations are
zⁱ = a1x1ⁱ + a2x2ⁱ
zⁱⁱ = a1x1ⁱⁱ + a2x2ⁱⁱ
zⁱⁱⁱ = a1x1ⁱⁱⁱ + a2x2ⁱⁱⁱ
⋮
zⁿ = a1x1ⁿ + a2x2ⁿ
Substituting this equation in the above equation we get
zⁱ – z_T = a1ε1ⁱ + a2ε2ⁱ
zⁱⁱ – z_T = a1ε1ⁱⁱ + a2ε2ⁱⁱ
⋮
zⁿ – z_T = a1ε1ⁿ + a2ε2ⁿ
The equation for variance is given as
σ² = (∑ⁿᵢ₌₁ εᵢ²)/n
This implies that nσ² = ∑ⁿᵢ₌₁ εᵢ², and thus for the case under consideration, the sum of the squared errors
for the value computed is
∑ⁿᵢ₌₁ εᵢ² = (a1ε1ⁱ + a2ε2ⁱ)² + (a1ε1ⁱⁱ + a2ε2ⁱⁱ)² + (a1ε1ⁱⁱⁱ + a2ε2ⁱⁱⁱ)² + ⋯ + (a1ε1ⁿ + a2ε2ⁿ)² = nσz²
Expanding the squares and grouping terms gives
nσz² = a1²[(ε1ⁱ)² + (ε1ⁱⁱ)² + (ε1ⁱⁱⁱ)² + ⋯] + a2²[(ε2ⁱ)² + (ε2ⁱⁱ)² + (ε2ⁱⁱⁱ)² + ⋯] + 2a1a2[ε1ⁱε2ⁱ + ε1ⁱⁱε2ⁱⁱ + ε1ⁱⁱⁱε2ⁱⁱⁱ + ⋯]
For a set of m functions with n independently measured quantities, x1, x2, . . . , xn, the above equation
expands to
Similarly, if the functions are nonlinear, a first-order Taylor series expansion can be used to linearize
them. Thus, a11, a12, . . . are replaced by the partial derivatives of Z1, Z2, . . . with respect to the unknown
parameters, x1, x2,… Therefore, after linearizing a set of nonlinear equations, the matrix for the function
of Z can be written in linear form as
The above two equations are known as the general law of propagation of variances (GLOPOV) for linear
and nonlinear equations, respectively. They can be written symbolically in matrix notation as
∑𝒛𝒛 = 𝑨∑𝑨𝑻
where ∑zz is the covariance matrix for the function Z. For a nonlinear set of equations that is linearized
using Taylor’s theorem, the coefficient matrix (A) is called a Jacobian matrix, a matrix of partial derivatives
with respect to each unknown, as shown above. If the measurements x1, x2, . . . , xn are unrelated (i.e., are
statistically independent), the covariance terms σx1x2, σx1x3, . . . are equal to zero, and thus the right-hand
sides of the above two equations can be rewritten, respectively, as
If there is only one function Z, involving n unrelated quantities, x1, x2, . . . , xn, the equation above can be
rewritten in algebraic form as
σZ = √[(∂Z/∂x1 · σx1)² + (∂Z/∂x2 · σx2)² + ⋯ + (∂Z/∂xn · σxn)²]
The above three equations express the special law of propagation of variances (SLOPOV). They govern the
manner in which errors from statistically independent measurements (i.e., 𝝈𝒙𝒊 𝒙𝒋 = 𝟎) propagate in a
function. In these equations, individual terms represent the individual contributions to the total
error that occur as the result of observational errors in each independent variable. When the size of a
function’s estimated error is too large, inspection of these individual terms will indicate the largest
contributors to the error. The most efficient way to reduce the overall error in the function is to closely
examine ways to reduce the largest error terms.
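SLOPOV for a linear function of independent observations can be sketched directly, including the per-term contributions the text recommends inspecting. The helper name and the example sigmas are hypothetical:

```python
import math

def slopov_linear(coeffs, sigmas):
    """Total standard error of z = sum(a_i * x_i) for independent x_i,
    plus each term's squared contribution to the error budget."""
    terms = [(a * s) ** 2 for a, s in zip(coeffs, sigmas)]
    return math.sqrt(sum(terms)), terms

sigma_z, terms = slopov_linear([1.0, 1.0, 1.0], [0.02, 0.03, 0.06])
# the largest term flags the observation worth improving first
```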
In the equations above, standard error (σ) and standard deviation (S) can be used interchangeably.
Applying SLOPOV to the mean ŷ = (y1 + y2 + ⋯ + yn)/n, where each observation has the same standard deviation S, gives
Sŷ = √[(1/n · sy1)² + (1/n · sy2)² + ⋯ + (1/n · syn)²] = √(nS²/n²) = S/√n
Errors associated with any indirect measurement problem can be analyzed as described above. Besides
being able to compute the estimated error in a function, the sizes of the individual errors contributing to
the functional error can also be analyzed. This identifies those observations whose errors are most critical
in reducing the functional error. An alternative use of the error propagation equation involves computing
the error in a function of observed values prior to fieldwork. The calculation can be based on the geometry
of the problem and the observations that are included in the function. The estimated errors in each value
can be varied to correspond with those expected using different combinations of available equipment and
field procedures. The particular combination that produces the desired accuracy in the final computed
function can then be adopted in the field.
Weights of Observations
When surveying data are collected, they must usually conform to a given set of geometric conditions, and
when they do not, the measurements are adjusted to force that geometric closure. For a set of
uncorrelated observations, a measurement with high precision, as indicated by a small variance, implies
a good observation, and in the adjustment it should receive a relatively small portion of the overall
correction. Conversely, a measurement with lower precision, as indicated by a larger variance, implies an
observation with a larger error, and should receive a larger portion of the correction.
The weight of an observation is a measure of its relative worth compared to other measurements. Weights
are used to control the sizes of corrections applied to measurements in an adjustment. The more precise
an observation, the higher its weight; in other words, the smaller the variance, the higher the weight.
From this analysis it can be stated intuitively that weights are inversely proportional to variances. Thus, it
also follows that correction sizes should be inversely proportional to weights.
In situations where measurements are correlated, weights are related to the inverse of the covariance
matrix, ∑. The elements of this matrix are variances and covariances. Since weights are relative, variances
and covariances are often replaced by cofactors. A cofactor is related to its covariance by the equation
qij = σij/σ₀²
where qij is the cofactor of the ijth measurement, σij the covariance of the ijth measurement, and 𝜎02 the
reference variance, a value that can be used for scaling. The equation can be expressed in matrix notation
as
Q = (1/σ₀²)∑
where Q is defined as the cofactor matrix. For uncorrelated observations, the ∑ matrix is diagonal, and
thus Q is also a diagonal matrix with elements equal to σ²xi/σ₀². The inverse of a diagonal matrix is also a
diagonal matrix, with its elements being the reciprocals of the original diagonals, and therefore the
equation becomes
W = Q⁻¹ = σ₀²∑⁻¹
From the above equation, any independent measurement with variance equal to σᵢ² has a weight of
wᵢ = σ₀²/σᵢ²
If the ith observation has a weight wi = 1, then 𝝈𝟐𝟎 = 𝝈𝟐𝒊 = 𝟏. Thus, 𝝈𝟐𝟎 is often called the variance of an
observation of unit weight, shortened to variance of unit weight or simply unit variance. Its square root is
called the standard deviation of unit weight. If 𝝈𝟐𝟎 is set equal to 1, then
wᵢ = 1/σᵢ²
Note in Equation above that as stated earlier, the weight of an observation is inversely proportional to its
variance. With correlated observations, it is possible to have a covariance matrix, ∑ , and a cofactor matrix,
Q, but not a weight matrix. This occurs when the cofactor matrix is singular, and thus an inverse for Q
does not exist. Most situations in surveying involve uncorrelated observations and therefore only the
uncorrelated case with variance of unit weight is considered.
Weighted Mean
If two measurements are taken of a quantity and the first is twice as good as the second, their relative
worth can be expressed by giving the first measurement a weight of 2 and the second a weight of 1. A
simple adjustment involving these two measurements would be to compute the mean value. In this
calculation, the observation of weight 2 could be added twice, and the observation of weight 1 added
once. As an illustration, suppose that a distance is measured with a tape to be 151.9 ft, and the same
distance is measured with an EDM instrument as 152.5 ft. Assume that experience indicates that the
electronically measured distance is twice as good as the taped distance, and accordingly, the taped
distance is given a weight of 1 and the electronically measured distance is given a weight of 2. Then one
method of computing the mean from these observations is
M̄ = (151.9 + 152.5 + 152.5)/3 = 152.3
This can also be written as
M̄ = [1(151.9) + 2(152.5)]/(1 + 2) = 152.3
Note that the weights of 1 and 2 were entered directly into the second computation and that the result
of this calculation is the same as the first. Note also that the computed mean tends to be closer to the
measured value having the higher weight (i.e., 152.3 is closer to 152.5 than it is to 151.9). A mean value
computed from weighted observations is called the weighted mean.
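The tape/EDM illustration above, in code (a minimal sketch of the weighted mean):

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * z) / sum(w)."""
    return sum(w * z for w, z in zip(weights, values)) / sum(weights)

# taped distance (weight 1) and EDM distance (weight 2)
m = weighted_mean([151.9, 152.5], [1, 2])   # closer to the higher-weight value
```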
To develop a general expression for computing the weighted mean, suppose that we have m independent
uncorrelated observations (z1, z2, . . . , zm) for a quantity z and that each observation has standard deviation
σ. Then the mean of the observations is
z̄ = (∑ᵐᵢ₌₁ zᵢ)/m
If these m observations were now separated into two sets, one of size ma and the other mb such that ma
+ mb = m, the means for these two sets would be
z̄a = (∑ᵢ₌₁ᵐᵃ zᵢ)/ma
z̄b = (∑ᵢ₌ₘₐ₊₁ᵐ zᵢ)/mb
The mean z̄ is found by combining the means of these two sets as
z̄ = (∑ᵢ₌₁ᵐᵃ zᵢ + ∑ᵢ₌ₘₐ₊₁ᵐ zᵢ)/m = (∑ᵢ₌₁ᵐᵃ zᵢ + ∑ᵢ₌ₘₐ₊₁ᵐ zᵢ)/(ma + mb)
But
z̄a·ma = ∑ᵢ₌₁ᵐᵃ zᵢ
z̄b·mb = ∑ᵢ₌ₘₐ₊₁ᵐ zᵢ
Thus
z̄ = (z̄a·ma + z̄b·mb)/(ma + mb)
Note the correspondence between Equation above and the second equation used to compute the
weighted mean in the simple illustration given earlier.
By intuitive comparison it should be clear that ma and mb correspond to weights that could be symbolized
as wa and wb, respectively. Thus, the Equation can be written as
z̄ = (wa·z̄a + wb·z̄b)/(wa + wb) = ∑wz/∑w
Equation above is used in calculating the weighted mean for a group of uncorrelated observations having
unequal weights. The weighted mean is the most probable value for a set of weighted observations.
Relationship between Weights and Standard Errors
By applying the special law of propagation of variances above to the equation for weighted mean of za
below we get
z̄a = (∑ᵢ₌₁ᵐᵃ zᵢ)/ma
Substituting partial derivatives with respect to the measurements into the above equation yields
σ²z̄a = (1/ma)²·ma·σ² = σ²/ma
Similarly
σ²z̄b = σ²/mb
In the Equations, σ is a constant and the weights of 𝒛̅𝒂 and 𝒛̅𝒃 were established as ma and mb respectively.
Since the weights are relative, it follows that
Conclusion: With uncorrelated observations, the weights of the observations are inversely proportional
to their variances.
σ = √(∑ⁿᵢ₌₁ εᵢ²/n)
In the case where the observations are not equal in weight, the above equation becomes
σ₀ = √(∑wε²/n)
Standard Error of Weight and Standard Error of Weighted Mean
The standard error of an observation of weight wᵢ is given by
σ₁ = σ₀/√w₁, σ₂ = σ₀/√w₂, …, σₙ = σ₀/√wₙ
Substituting the expression for σ₀, the standard error of each observation is
σ₁ = σ₀/√w₁ = √(∑wε²/n)·(1/√w₁) = √(∑wε²/(nw₁))
σ₂ = σ₀/√w₂ = √(∑wε²/n)·(1/√w₂) = √(∑wε²/(nw₂))
⋮
σₙ = σ₀/√wₙ = √(∑wε²/n)·(1/√wₙ) = √(∑wε²/(nwₙ))
The standard error of the weighted mean is
σM̄ = √(∑wε²/(n∑w))
and the standard deviation of the weighted mean is
SM̄ = √(∑wʋ²/((n – 1)∑w))
In angle observation, when all conditions are equal except for the number of turnings, angle weights are
proportional to the number of times the angles are turned. It has also been shown that weights of
differential leveling lines are inversely proportional to their lengths, and since any course length is
proportional to its number of setups, weights are also inversely proportional to the number of setups.
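The leveling-weight rule (weights inversely proportional to line length) can be sketched with a hypothetical example of three level routes carrying the same elevation difference:

```python
def weighted_mean_by_length(obs, lengths):
    """Weight each leveling observation by 1/length, then take the weighted mean."""
    weights = [1.0 / L for L in lengths]
    return sum(w * d for w, d in zip(weights, obs)) / sum(weights)

# observed elevation differences (m) over routes of 2, 4, and 1 km:
# the shortest route receives the largest weight
dh = weighted_mean_by_length([10.52, 10.46, 10.50], [2.0, 4.0, 1.0])
```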
Principles of Least Squares
In surveying, observations must often satisfy established numerical relationships known as geometric
constraints. As an example, in a differential leveling loop, the elevation differences should sum to a given
quantity. However, because the geometric constraints are rarely, if ever, satisfied exactly, the data are adjusted.
As discussed earlier, errors in observations conform to the laws of probability; that is, they follow normal
distribution theory. Thus, they should be adjusted in a manner that follows these mathematical laws.
Whereas the mean has been used extensively throughout history, the earliest works on least squares
started in the late eighteenth century. Its earliest application was primarily for adjusting celestial
observations. Laplace first investigated the subject and laid its foundation in 1774. The first published
article on the subject, entitled ‘‘Méthode des moindres quarrés’’ (Method of Least Squares), was written
in 1805 by Legendre. However, it is well known that although Gauss did not publish until 1809, he
developed and used the method extensively as a student at the University of Göttingen beginning in 1794
and thus is given credit for the development of the subject.
To develop the principle of least squares, a specific case is considered. Suppose that there are n
independent equally weighted measurements, z1, z2, . . . , zn, of the same quantity z, which has a most
probable value denoted by M. By definition, the residuals are
ʋ₁ = M – z₁, ʋ₂ = M – z₂, …, ʋₙ = M – zₙ
M is the quantity that is to be selected in such a way that it gives the greatest probability of occurrence,
or, stated differently, the value of M that maximizes the value of P. The probability P is maximized when
the quantity ʋ₁² + ʋ₂² + ʋ₃² + ⋯ + ʋₙ² is minimized. In other words, to maximize P, the sum of the squares
of the residuals must be minimized. The equation below expresses the fundamental principle of least squares:
∑ʋ² = ʋ₁² + ʋ₂² + ʋ₃² + ⋯ + ʋₙ² = minimum
This condition states: The most probable value (MPV) for a quantity obtained from repeated observations
of equal weight is the value that renders the sum of the residuals squared a minimum. From calculus, the
minimum value of a function can be found by taking its first derivative and equating the resulting function
with zero. That is, the condition stated in Equation above is enforced by taking the first derivative of the
function with respect to the unknown variable M and setting the results equal to zero. The equation can
be rewritten as
∑v² = (M − z₁)² + (M − z₂)² + ⋯ + (M − zₙ)²
Taking the first derivative with respect to M and setting the resulting equation equal to zero yields
∂(∑v²)/∂M = 2(M − z₁) + 2(M − z₂) + ⋯ + 2(M − zₙ) = 0
Dividing through by 2 and solving for M gives nM = z₁ + z₂ + ⋯ + zₙ, or M = ∑z/n; that is, for observations
of equal weight the most probable value is the arithmetic mean. When the observations carry unequal
weights w₁, w₂, . . . , wₙ, the function ∑wv² is minimized instead, and the same differentiation leads to
w₁z₁ + w₂z₂ + ⋯ + wₙzₙ = w₁M + w₂M + ⋯ + wₙM
M = ∑wz / ∑w
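Both results can be checked numerically; a minimal sketch in Python (the observation values and weights below are hypothetical, chosen only for illustration):

```python
# Most probable value of repeated observations of the same quantity:
# the (weighted) mean minimizes the (weighted) sum of squared residuals.

def weighted_mpv(z, w):
    """Return M = sum(w*z) / sum(w)."""
    return sum(wi * zi for wi, zi in zip(w, z)) / sum(w)

def sum_sq(z, w, M):
    """Weighted sum of squared residuals, sum(w * (M - z)^2)."""
    return sum(wi * (M - zi) ** 2 for wi, zi in zip(w, z))

# Hypothetical repeated observations of one distance (metres) and weights.
z = [25.42, 25.45, 25.43, 25.44]
w = [1.0, 1.0, 2.0, 4.0]              # relative weights, w_i = 1/sigma_i^2

M_equal = weighted_mpv(z, [1.0] * len(z))   # equal weights -> arithmetic mean
M_weighted = weighted_mpv(z, w)             # M = sum(wz)/sum(w)

# Perturbing M away from the weighted mean always increases sum(w*v^2).
assert sum_sq(z, w, M_weighted) < sum_sq(z, w, M_weighted + 0.001)
assert sum_sq(z, w, M_weighted) < sum_sq(z, w, M_weighted - 0.001)
print(round(M_equal, 4), round(M_weighted, 4))
```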
Stochastic Model
The determination of variances, and subsequently the weights of the observations, is known as the
stochastic model in a least squares adjustment. It is crucial in adjustment to select a proper stochastic
(weighting) model since, as was discussed, the weight of an observation controls the amount of correction
it receives during the adjustment. However, development of the stochastic model is important not only
for weighted adjustments. When doing an unweighted adjustment, all observations are assumed to be of
equal weight, and thus the stochastic model is created implicitly.
Functional Model
A functional model in adjustment computations is an equation or set of equations that represents or
defines an adjustment condition. It must be either known or assumed. If the functional model represents
the physical situation adequately, the observational errors can be expected to conform to the normal
distribution curve. For example, a well-known functional model states that the sum of angles in a triangle
is 180°. This model is adequate if the survey is limited to a small region. However, when the survey covers
very large areas, this model does not account for the systematic errors caused by Earth’s curvature. In this
case, the functional model is inadequate and needs to be modified to include corrections for spherical
excess. Needless to say, if the model does not fit the physical situation, an incorrect adjustment will result.
There are two basic forms for functional models: the conditional and parametric adjustments. In a
conditional adjustment, geometric conditions are enforced on the observations and their residuals.
Examples of conditional adjustment are: (1) the sum of the angles in a closed polygon is (n − 2)180°, where
n is the number of sides in the polygon; and (2) the changes in the northings and eastings around a closed
polygon traverse sum to zero. A least squares adjustment example using condition equations is discussed later.
When performing a parametric adjustment, observations are expressed in terms of unknown parameters
that were never observed directly. For example, the well-known coordinate equations are used to model
the angles, directions, and distances observed in a horizontal plane survey. The adjustment yields the
most probable values for the coordinates (parameters), which in turn provide the most probable values
for the adjusted observations.
The choice of the functional model will determine which quantities or parameters are adjusted. A primary
purpose of an adjustment is to ensure that all observations are used to find the most probable values for
the unknowns in the model. In least squares adjustments, no matter if conditional or parametric, the
geometric checks at the end of the adjustment are satisfied and the same adjusted observations are
obtained. In complicated networks, it is often difficult and time consuming to write the equations to
express all conditions that must be met for a conditional adjustment.
The combination of stochastic and functional models results in a mathematical model for the adjustment.
The stochastic and functional models must both be correct if the adjustment is to yield the most probable
values for the unknown parameters. That is, it is just as important to use a correct stochastic model as it
is to use a correct functional model. Improper weighting of observations will result in the unknown
parameters being determined incorrectly.
Observation Equations
Equations that relate observed quantities to both observational residuals and independent unknown
parameters are called observation equations. One equation is written for each observation and for a
unique set of unknowns. For a unique solution of unknowns, the number of equations must equal the
number of unknowns. Usually, there are more observations (and hence equations) than unknowns, and
this permits determination of the most probable values for the unknowns based on the principle of least
squares.
As an example of a least squares adjustment by the observation equation method, consider the following
three equations:
𝒙 + 𝒚 = 𝟑. 𝟎
𝟐𝒙 − 𝒚 = 𝟏. 𝟓
𝒙 − 𝒚 = 𝟎. 𝟐
The equations relate the two unknowns, x and y, to the quantities observed (the values on the right side
of the equations). One equation is redundant since the values for x and y can be obtained from any two
of the three equations. For example, if Equations (1) and (2) are solved, x would equal 1.5 and y would
equal 1.5, but if Equations (2) and (3) are solved, x would equal 1.3 and y would equal 1.1, and if Equations
(1) and (3) are solved, x would equal 1.6 and y would equal 1.4. Based on the inconsistency of these
equations, the observations contain errors. Therefore, new expressions, called observation equations, can
be rewritten that include residuals.
𝒙 + 𝒚 − 𝟑. 𝟎 = 𝒗𝟏
𝟐𝒙 − 𝒚 − 𝟏. 𝟓 = 𝒗𝟐
𝒙 − 𝒚 − 𝟎. 𝟐 = 𝒗𝟑
These equations relate the unknown parameters to the observations and their errors. Obviously, it is
possible to select values of v1, v2, and v3 that will yield the same values for x and y no matter which pair of
equations are used. For example, to obtain consistencies through all of the equations, arbitrarily let v1 =
0, v2 = 0, and v3 = -0.2. In this arbitrary solution, x would equal 1.5 and y would equal 1.5, no matter which
pair of equations is solved. This is a consistent solution; however, there are other values for the v’s that
will produce a smaller sum of squares. To find the least squares solution for x and y, the residual equations
are squared and these squared expressions are added to give a function, ƒ(x,y), that equals ∑𝒗𝟐 .
𝒇(𝒙, 𝒚) = ∑𝒗𝟐 = (𝒙 + 𝒚 − 𝟑. 𝟎)𝟐 + (𝟐𝒙 − 𝒚 − 𝟏. 𝟓)𝟐 + (𝒙 − 𝒚 − 𝟎. 𝟐)𝟐
As discussed, to minimize a function, its derivatives must be set equal to zero. Thus, in the Equation, the
partial derivatives of Equation with respect to each unknown must be taken and set equal to zero.
This leads to the two equations
∂f(x,y)/∂x = 2(x + y − 3.0) + 2(2x − y − 1.5)(2) + 2(x − y − 0.2) = 0
∂f(x,y)/∂y = 2(x + y − 3.0) + 2(2x − y − 1.5)(−1) + 2(x − y − 0.2)(−1) = 0
Equations above are called normal equations. Simplifying them gives reduced normal equations of
𝟔𝒙 − 𝟐𝒚 − 𝟔. 𝟐 = 𝟎
−𝟐𝒙 + 𝟑𝒚 − 𝟏. 𝟑 = 𝟎
Simultaneous solution of these equations yields x equal to 1.514 and y equal to 1.443. Substituting these
adjusted values into the observation equations, numerical values for the three residuals can be computed.
The table below compares the arbitrary solution to the least squares solution. The tabulated
sums of squared residuals show that the least squares solution yields the smaller total and is thus
the better solution. In fact, it is the most probable solution for the unknowns based on the observations.

            Arbitrary solution     Least squares solution
v₁                 0.0                   −0.043
v₂                 0.0                    0.086
v₃                −0.2                   −0.129
∑v²                0.040                  0.026
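This comparison can be verified with a short script; a minimal sketch using the three observation equations of the example:

```python
# Compare the arbitrary consistent solution (x=1.5, y=1.5) with the
# least squares solution (x=1.514, y=1.443) for the observation
# equations x + y = 3.0, 2x - y = 1.5, x - y = 0.2.

def residuals(x, y):
    return [x + y - 3.0, 2 * x - y - 1.5, x - y - 0.2]

def sum_squares(v):
    return sum(vi ** 2 for vi in v)

v_arbitrary = residuals(1.5, 1.5)        # v = [0, 0, -0.2]
v_lsq = residuals(1.5143, 1.4429)        # least squares values

# The least squares residuals give the smaller sum of squares.
assert sum_squares(v_lsq) < sum_squares(v_arbitrary)
print(round(sum_squares(v_arbitrary), 3), round(sum_squares(v_lsq), 3))
```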
Systematic Formulation of Normal Equations
Equal-Weight Case
In large systems of observation equations, it is helpful to use systematic procedures to formulate the
normal equations. In developing these procedures, consider the following generalized system of linear
observation equations having variables (A, B, C, . . . , N):
a₁A + b₁B + c₁C + ⋯ + n₁N = l₁ + v₁
a₂A + b₂B + c₂C + ⋯ + n₂N = l₂ + v₂
⋯
aₘA + bₘB + cₘC + ⋯ + nₘN = lₘ + vₘ
Squaring the residuals of the above equations and summing them, the function ƒ(A, B, C, . . . , N) = ∑v² is
obtained, which expresses the equal-weight least squares condition as
∑v² = (a₁A + b₁B + ⋯ + n₁N − l₁)² + (a₂A + b₂B + ⋯ + n₂N − l₂)² + ⋯ + (aₘA + bₘB + ⋯ + nₘN − lₘ)²
According to least squares theory, the minimum for the above Equation is found by setting the partial
derivatives of the function with respect to each unknown equal to zero. This results in the normal
equations
Dividing each expression by 2 and regrouping the remaining terms results in
The generalized normal equations are now written as
∑(aa)A + ∑(ab)B + ∑(ac)C + ⋯ + ∑(an)N = ∑(al)
∑(ab)A + ∑(bb)B + ∑(bc)C + ⋯ + ∑(bn)N = ∑(bl)
⋯
∑(an)A + ∑(bn)B + ∑(cn)C + ⋯ + ∑(nn)N = ∑(nl)
In these equations the a’s, b’s, c’s, . . . , n’s are the coefficients of the unknowns A, B, C, . . . , N; the l values
are the observations; and ∑ signifies summation from i = 1 to m.
Weighted Case
In a similar manner, it can be shown that normal equations can be formed systematically for weighted
observation equations in the following manner:
∑(waa)A + ∑(wab)B + ⋯ + ∑(wan)N = ∑(wal)
∑(wab)A + ∑(wbb)B + ⋯ + ∑(wbn)N = ∑(wbl)
⋯
∑(wan)A + ∑(wbn)B + ⋯ + ∑(wnn)N = ∑(wnl)
In these equations, the w’s are the weights of the observations l; the a’s, b’s, c’s, . . . , n’s are the coefficients
of the unknowns A, B, C, . . . , N; the l values are the observations; and ∑ signifies summation from i = 1
to m.
Notice that the terms in the Equations are the same as those in earlier Equations except for the addition
of the w’s which are the relative weights of the observations. In fact, the above Equations can be thought
of as the general set of equations for forming the normal equations, since if the weights are equal, they
can all be given a value of 1.
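The summation rules above can be coded directly; a minimal sketch that forms the (weighted) normal equations for the three observation equations of the earlier example, using unit weights so the result can be checked against the reduced normal equations found there:

```python
# Form weighted normal equations by summation:
# N[j][k] = sum_i w_i * a_ij * a_ik,  t[j] = sum_i w_i * a_ij * l_i.

def normal_equations(coeffs, obs, weights):
    m = len(coeffs)       # number of observation equations
    n = len(coeffs[0])    # number of unknowns
    N = [[sum(weights[i] * coeffs[i][j] * coeffs[i][k] for i in range(m))
          for k in range(n)] for j in range(n)]
    t = [sum(weights[i] * coeffs[i][j] * obs[i] for i in range(m))
         for j in range(n)]
    return N, t

# x + y = 3.0;  2x - y = 1.5;  x - y = 0.2  (unit weights)
A = [[1, 1], [2, -1], [1, -1]]
L = [3.0, 1.5, 0.2]
N, t = normal_equations(A, L, [1.0, 1.0, 1.0])

# N = [[6, -2], [-2, 3]] and t = [6.2, 1.3], i.e. the reduced normal
# equations 6x - 2y = 6.2 and -2x + 3y = 1.3 obtained earlier.
print(N, t)
```

Note that N is symmetric by construction, since N[j][k] and N[k][j] sum the same products.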
Equal-Weight Case
To develop the matrix expressions for performing least squares adjustments, an analogy will be made with
the systematic procedures demonstrated. For this development, let a system of observation equations be
represented by the matrix notation
𝑨𝑿 = 𝑳 + 𝑽
where A is the m × n matrix of coefficients of the unknowns, X the n × 1 vector of unknowns, L the m × 1
vector of observations, and V the m × 1 vector of residuals.
In the system of observation equations above the unknowns are x1, x2, . . . , xn instead of A, B, . . . , N, and
the coefficients of the unknowns are a11, a12, . . . , a1n instead of a1, b1, . . . , n1. Subjecting the foregoing
matrices to the manipulation shown in the following expression produces the normal
equations:
𝑨𝑻 𝑨𝑿 = 𝑨𝑻 𝑳
Which can also be expressed as
𝑵𝑿 = 𝑨𝑻 𝑳
The individual elements of the N matrix can be expressed in the following summation form: nⱼₖ = ∑(aᵢⱼaᵢₖ), where the sum is taken from i = 1 to m.
The above equations produce the normal equations of a least squares adjustment. By inspection, it can
also be seen that the N matrix is always symmetric (i.e., nij = nji). By employing matrix algebra, the solution
of normal equations is
𝑿 = (𝑨𝑻 𝑨)−𝟏 𝑨𝑻 𝑳
= 𝑵−𝟏 𝑨𝑻 𝑳
Worked Example
The observation equations below can be written in matrix form as indicated.
𝒙 + 𝒚 − 𝟑. 𝟎 = 𝒗𝟏
𝟐𝒙 − 𝒚 − 𝟏. 𝟓 = 𝒗𝟐
𝒙 − 𝒚 − 𝟎. 𝟐 = 𝒗𝟑
In matrix form, A = [1 1; 2 −1; 1 −1], X = [x y]ᵀ and L = [3.0 1.5 0.2]ᵀ. Finally, the adjusted unknowns, the X matrix, are obtained using the matrix methods, yielding x = 1.514 and y = 1.443.
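These matrix operations can be carried out numerically; a minimal sketch using NumPy:

```python
import numpy as np

# Observation equations: x + y - 3.0 = v1, 2x - y - 1.5 = v2, x - y - 0.2 = v3
A = np.array([[1.0, 1.0],
              [2.0, -1.0],
              [1.0, -1.0]])      # coefficient matrix
L = np.array([3.0, 1.5, 0.2])    # observations

N = A.T @ A                      # normal matrix (always symmetric)
X = np.linalg.solve(N, A.T @ L)  # X = N^-1 A^T L
print(np.round(X, 3))            # [1.514 1.443]

V = A @ X - L                    # residuals of the adjusted observations
```

Using `np.linalg.solve` on the normal equations is preferred over forming the inverse N⁻¹ explicitly, since it is cheaper and numerically more stable.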
Weighted Case
A system of weighted linear observation equations can be expressed in matrix notation as
𝑾𝑨𝑿 = 𝑾𝑳 + 𝑾𝑽
Using the methods demonstrated, it is possible to show that the normal equations for this weighted
system are
𝑨𝑻 𝑾𝑨𝑿 = 𝑨𝑻 𝑾𝑳
𝑵𝑿 = 𝑨𝑻 𝑾𝑳
Where 𝑵 = 𝑨𝑻 𝑾𝑨
Using matrix algebra, the least squares solution of these weighted normal equations is
𝑿 = (𝑨𝑻 𝑾𝑨)−𝟏 𝑨𝑻 𝑾𝑳
= 𝑵−𝟏 𝑨𝑻 𝑾𝑳
In the Equation, W is the weight matrix as defined earlier.
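As an illustration of the weighted solution, the same three observation equations can be adjusted with a diagonal weight matrix; the weight values below are assumptions chosen for illustration, not part of the example:

```python
import numpy as np

# Same three observation equations, now with assumed relative weights.
A = np.array([[1.0, 1.0], [2.0, -1.0], [1.0, -1.0]])
L = np.array([3.0, 1.5, 0.2])
W = np.diag([2.0, 1.0, 1.0])     # hypothetical weight matrix, W = diag(w_i)

N = A.T @ W @ A                  # weighted normal matrix, N = A^T W A
X = np.linalg.solve(N, A.T @ W @ L)   # X = (A^T W A)^-1 A^T W L
print(np.round(X, 3))            # [1.522 1.456]

# The more heavily weighted first equation is now fitted more closely
# (its residual is smaller than in the equal-weight adjustment).
V = A @ X - L
```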
Least Squares Solution of Non-Linear Systems
Solving a nonlinear system of equations uses a Taylor series approximation. Following this procedure, the
least squares solution for a system of nonlinear equations can be found as follows:
[1] Write the first-order Taylor series approximation for each equation;
[2] Determine initial approximations for the unknowns in the equations of step 1;
[3] Use matrix methods to find the least squares solution for the equations of step 1 (these are corrections to the initial approximations);
[4] Apply the corrections to the initial approximations;
[5] Repeat steps 1 through 4 until the corrections become sufficiently small.
A system of nonlinear equations that are linearized by a Taylor series approximation can be written as
𝑱𝑿 = 𝑲 + 𝑽
where the Jacobian matrix J contains the coefficients (partial derivatives) of the linearized observation
equations, X is the vector of corrections to the initial approximations, K is the vector of observed values
minus the values computed from the current approximations, and V is the vector of residuals.
The vector of least squares corrections in the equally weighted system is given by
𝑿 = (𝑱𝑻 𝑱)−𝟏 𝑱𝑻 𝑲
= 𝑵−𝟏 𝑱𝑻 𝑲
Similarly, the system of weighted equations is
𝑾𝑱𝑿 = 𝑾𝑲
And the solution is
𝑿 = (𝑱𝑻 𝑾𝑱)−𝟏 𝑱𝑻 𝑾𝑲
Where W is the weight matrix. Notice that the least squares solution of a nonlinear system of equations
is similar to the linear case. In fact, the only difference is the use of the Jacobian matrix rather than the
coefficient matrix and the use of the K matrix rather than the observation matrix, L. Many authors use the
same nomenclature for both the linear and nonlinear cases. In these cases, the differences in the two
systems of equations are stated implicitly.
Worked Example
Find the least squares solution for the following system of nonlinear equations:
F: x + y − 2y² = −4.0
G: x² + y² = 8.0
H: 3x² − y² = 7.7
Solution
Step 1: Determine the elements of the J matrix by taking partial derivatives of the equations with respect to
the unknowns x and y, and then write the first-order Taylor series equations:
∂F/∂x = 1, ∂F/∂y = 1 − 4y;  ∂G/∂x = 2x, ∂G/∂y = 2y;  ∂H/∂x = 6x, ∂H/∂y = −2y
Step 2: Determine initial approximations for the solution of the equations. Initial approximations can be
derived by solving any two equations for x and y. The solution of equations F and G gives x₀ = 2 and y₀ =
2. Using these values, evaluation of the J and K matrices yields
J₀ = [1 −7; 4 4; 12 −4],  K = [0 0 −0.3]ᵀ
It should not be surprising that the first two rows of the K matrix are zero since the initial approximations
were determined using these two equations.
In successive iterations, these values will change and all terms will become nonzero.
Step 3: Solve the system using Equation 𝑿 = (𝑱𝑻 𝑱)−𝟏 𝑱𝑻 𝑲
Substituting these matrices into the equation in step 3, the solution for the first iteration is dx = −0.021 and dy = +0.005.
Step 4: Apply the corrections to the initial approximations for the first iteration: x = 2.0 − 0.021 = 1.979 and y = 2.0 + 0.005 = 2.005.
Iterating a third time yields extremely small corrections, and thus the final solution, rounded to the
hundredths place, is x = 1.98 and y = 2.00. Notice that N changed by a relatively small amount from the
first iteration to the second iteration. If the initial approximations are close to their final values, this can
be expected. Thus, when doing these computations by hand, it is common to use the initial N for each
iteration, making it only necessary to recompute JTK between iterations. However, this procedure should
be used with caution since if the initial approximations are poor, it will result in an incorrect solution. One
should always perform complete computations when doing the solution with the aid of a computer.
Worked Example
Three horizon-closing angles about a point were observed as a₁ = 134°38′56″ with a standard deviation of ±6.7″, a₂ = 83°17′35″ with ±9.9″, and a₃ = 142°03′14″ with ±4.3″. What are the most probable values for these observations?
Solution
In a conditional adjustment, the most probable set of residuals is found that satisfies a given functional
condition. In this case, the condition is that the sum of the three angles equals 360°. Since the three
angles observed actually sum to 359°59′45″, the angular misclosure is 15″. Thus, errors are present. The
following residual equations are written for the observations listed above.
v₁ + v₂ + v₃ = 360° − (a₁ + a₂ + a₃) = 15″
In the Equation, the a’s represent the observations and the v’s are residuals. Applying the fundamental
condition for a weighted least squares adjustment, the following equation must be minimized:
𝑭 = 𝒘𝟏 𝒗𝟐𝟏 + 𝒘𝟐 𝒗𝟐𝟐 + 𝒘𝟑 𝒗𝟐𝟑
Where the w’s are weights, which are the inverses of the squares of the standard deviations. Equation
can be rearranged such that v3 is expressed as a function of the other two residuals, or
𝒗𝟑 = 𝟏𝟓" − (𝒗𝟏 + 𝒗𝟐 )
=> 𝑭 = 𝒘𝟏 𝒗𝟐𝟏 + 𝒘𝟐 𝒗𝟐𝟐 + 𝒘𝟑 (𝟏𝟓" − (𝒗𝟏 + 𝒗𝟐 ))𝟐
Taking the partial derivatives of F with respect to v₁ and v₂, respectively, results in the following two equations:
∂F/∂v₁ = 2w₁v₁ − 2w₃(15″ − (v₁ + v₂)) = 0
∂F/∂v₂ = 2w₂v₂ − 2w₃(15″ − (v₁ + v₂)) = 0
Rearranging the Equations and substituting in the appropriate weights yields the following normal
equations:
Note that geometric closure has been enforced in the adjusted angles to make their sum exactly 360°.
Note also that the angle having the smallest standard deviation received the smallest correction (i.e., its
residual is smallest).
X is computed as
It can now be determined that a₃ is 360° − (134°39′00.2″ + 83°17′44.1″) = 142°03′15.7″. The same result is
obtained as with the condition equation method. It is important to note that no matter which method of least
squares adjustment is used, if the procedures are performed properly, the same solution will always be
obtained.
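The single-condition angle adjustment can be sketched in code. For one condition Σv = misclosure, minimizing Σwv² gives each residual proportional to its variance, vᵢ = misclosure · σᵢ²/∑σⱼ². The observed angles and standard deviations below are reconstructed from the adjusted values quoted above (they reproduce all three results exactly), so treat them as an illustration rather than the original data:

```python
# Conditional adjustment of three angles closing the horizon (sum = 360°).
# Residual of each angle is proportional to its variance:
# v_i = misclosure * sigma_i^2 / sum(sigma_j^2).

def dms_to_sec(d, m, s):
    """Degrees-minutes-seconds to seconds of arc."""
    return d * 3600 + m * 60 + s

def sec_to_dms(sec):
    """Seconds of arc to (deg, min, sec) with seconds rounded to 0.1."""
    d, rem = divmod(sec, 3600)
    m, s = divmod(rem, 60)
    return int(d), int(m), round(s, 1)

angles = [dms_to_sec(134, 38, 56.0),
          dms_to_sec(83, 17, 35.0),
          dms_to_sec(142, 3, 14.0)]     # observed values (reconstructed)
sigmas = [6.7, 9.9, 4.3]                # standard deviations in seconds

misclosure = 360 * 3600 - sum(angles)   # +15" here
total_var = sum(s ** 2 for s in sigmas)
adjusted = [a + misclosure * s ** 2 / total_var
            for a, s in zip(angles, sigmas)]

for a in adjusted:
    print(sec_to_dms(a))
# (134, 39, 0.2), (83, 17, 44.1), (142, 3, 15.7): the angle with the
# smallest standard deviation receives the smallest correction, and
# the adjusted angles sum to exactly 360°.
```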