
Unit 5 – Graphics Pipeline

5.1 Computer Graphics in Games

Of all the applications of computer graphics, computer and video games attract perhaps the most attention.
The graphics methods selected for a given game have a profound effect, not only on the game engine code,
but also on the art asset creation, and even sometimes on the gameplay, or core game mechanics. Games
need to make highly efficient use of graphics hardware, so an understanding of the material is important. In
this unit, specific considerations that apply to graphics in game development are examined, from the platforms on
which games run to the game production process.

5.6.1 Platforms
The term platform refers to a specific combination of hardware, operating system, and API (application
programming interface) for which a game is designed. Games run on a large variety of platforms, ranging from
virtual machines used for browser-based games to dedicated game consoles using specialized hardware and
APIs. In the past, it was common for games to be designed for a single platform. The increasing cost of game
development has made this rare; multiplatform game development is now the norm. The incremental increase
in development cost to support multiple platforms is more than repaid by a potential doubling or tripling of the
customer base.

Some platforms are quite loosely defined. For example, when developing a game for the Windows PC platform,
the developer must account for a very large variety of possible hardware configurations. Games are even
expected to run (and run well) on PC configurations that did not exist when the game was developed! This is
only possible due to the abstractions afforded by the APIs defining the Windows platform.

5.6.2 Limited Resources


One of the primary challenges of game graphics is the need to manage multiple pools of limited resources.
Each platform imposes its own constraints on hardware resources such as processing time, storage, and
memory bandwidth. At a higher level, development resources also need to be managed; there is a fixed-size
team of programmers, artists, and game designers with limited time to complete the game, hopefully without
working too much overtime! This needs to be taken into account when deciding which graphics techniques to
adopt.

5.6.3 Optimization Techniques


Making proper use of these limited resources is the primary challenge of the game graphics programmer. To
this end, various optimization techniques are commonly employed. In many games, pixel shader processing
is a primary bottleneck. Most GPUs contain hierarchical depth-culling hardware which can avoid executing
pixel shaders on occluded surfaces. To make good use of this hardware, opaque objects can be rendered
front-to-back, so that nearby occluders fill the depth buffer early. Alternatively, optimal depth-culling usage can be achieved by performing a depth prepass, i.e.,
rendering all the opaque objects into the depth buffer (without any colour output or pixel shaders) before
rendering the scene normally. This does incur some overhead (due to the need to render every object twice),
but in many cases the performance gain is worth it.
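
As a rough illustration of this two-pass idea, the sketch below loops over the opaque objects twice; render_depth_only and render_full are hypothetical stand-ins for an engine's draw functions, defined here only as stubs.

# Minimal sketch of a depth prepass; the draw calls are stand-ins for a
# real engine's rendering functions, defined here as printing stubs.
def render_depth_only(obj, camera):
    print(f"depth-only pass: {obj}")

def render_full(obj, camera):
    print(f"full shading pass: {obj}")

def draw_opaque(objects, camera):
    # Pass 1: fill the depth buffer without any colour output or pixel shaders.
    for obj in objects:
        render_depth_only(obj, camera)
    # Pass 2: shade normally; hierarchical depth culling can now reject
    # occluded pixels before their expensive pixel shaders execute.
    for obj in objects:
        render_full(obj, camera)

draw_opaque(["terrain", "building", "character"], camera="main")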

The fastest way to render an object is to not render it at all; thus any method of discerning early on that an
object is occluded can be useful. This saves not only pixel processing but also vertex processing and even
CPU time that would be spent submitting the object to the graphics API. View frustum culling is universally
employed, but in many games it is not sufficient. High-level occlusion culling algorithms are often used, utilizing
data structures such as PVS (potentially visible sets) or BSP (binary space partitioning) trees to quickly narrow
down the pool of potentially visible objects.
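
A minimal sketch of view frustum culling is shown below, assuming each frustum plane is stored as (nx, ny, nz, d) with the normal pointing toward the inside of the frustum; the plane and sphere values are illustrative only.

# Frustum culling sketch: test an object's bounding sphere against frustum
# planes (nx, ny, nz, d), normals pointing inward.
def sphere_visible(center, radius, planes):
    cx, cy, cz = center
    for nx, ny, nz, d in planes:
        # Signed distance of the sphere centre to the plane.
        if nx * cx + ny * cy + nz * cz + d < -radius:
            return False   # entirely outside this plane: cull the object
    return True            # not trivially rejected; it may still be occluded

# Example near plane: the camera looks down -z, points with z <= -1 are inside.
near_plane = (0.0, 0.0, -1.0, -1.0)
print(sphere_visible((0.0, 0.0, -5.0), 1.0, [near_plane]))  # True
print(sphere_visible((0.0, 0.0,  2.0), 1.0, [near_plane]))  # False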

5.6.4 Game Types


Since game requirements vary widely, the selection of graphics techniques is driven by the exact type of game
being developed. The allocation of processing time depends strongly on the frame rate. Currently, most
console games tend to target 30 frames per second, since this enables much higher graphics quality. However,
certain game types with fast gameplay require very low latency, and such games typically render at 60 frames
per second. This includes music games such as Guitar Hero and first-person shooters such as
Call of Duty.

The frame rate determines the available time to render the scene. The composition of the scene itself also
varies widely from game to game. Most games have a division between background geometry (scenery, mostly
static) and foreground geometry (characters and dynamic objects). These are handled differently by the
rendering engine. For example, background geometry will often have lightmaps containing precomputed
lighting, which is not feasible for foreground objects. Precomputed lighting is typically applied to foreground
objects via some type of volumetric representation which can take account of the changing position of each
object over time.

Some games have relatively enclosed environments, where the camera remains largely in place. The purest
examples are fighting games such as the Street Fighter series, but this is also true to some extent for games
such as Devil May Cry and God of War. These games have cameras that are not under direct player control,
and the game play tends to move from one enclosed environment to another, spending a significant amount
of playing time in each. This allows the game developer to lavish large amounts of resources (processing,
storage, and artist time) on each room or enclosed environment, resulting in very high levels of graphics fidelity.

Other games have extremely large worlds, where the player can move about freely. This is most true for
“sandbox games” such as the Grand Theft Auto series and online role-playing games such as World of
Warcraft. Such games pose great challenges to the graphics developer, since resource allocation is very
difficult when the player can see a large extent of the world during each frame. Further complicating things,
the player can freely go to some formerly distant part of the world and observe it from up close. Such games
typically have changing time of day, which makes precomputation of lighting difficult at best, if not impossible.

5.6.5 The Game Production Process


The game production process starts with the basic game design or concept. In some cases (such as sequels),
the basic gameplay and visual design is clear, and only incremental changes are made. In the case of a new
game type, extensive prototyping is needed to determine gameplay and design. Most cases sit somewhere in
the middle, where there are some new gameplay elements and the visual design is somewhat open. After this
step there may be a greenlight stage where some early demo or concept is shown to the game publisher to
get approval (and funding!) for the game.

The next step is typically pre-production. While other teams are working on finishing up the last game, a small
core team works on making any needed changes to the game engine and production tool chain, as well as
working out the rough details of any new gameplay elements. This core team is working under a strict deadline.
After the existing game ships and the rest of the team comes back from a well-deserved vacation, the entire
tool chain and engine must be ready for them. If the core team misses this deadline, several dozen developers
may be left idle—an extremely expensive proposition!

Full production is the next step, with the entire team creating art assets, designing levels, tweaking gameplay,
and implementing further changes to the game engine. In a perfect world, everything done during this process
would be used in the final game, but in reality there is an iterative nature to game development which will result
in some work being thrown out and redone. The goal is to minimize this with careful planning and prototyping.
When the game is functionally complete, the final stage begins. The term alpha release usually refers to the
version which marks the start of extensive internal testing, beta release to the one which marks the start of
extensive external testing, and gold release to the final release submitted to the console manufacturer, but
different companies have slightly varying definitions of these terms. In any case, testing, or quality assurance
(QA) is an important part of this phase, and it involves testers at the game development studio, at the publisher,
at the console manufacturer, and possibly external QA contractors as well. These various rounds of testing
result in bug reports which are submitted back to the game developers and worked on until the next release.

After the game ships, most of the developers go on vacation for a while, but a small team may have to stay
to work on patches or downloadable content. In the meantime, a small core team has been working on pre-
production for the next game.

Test your knowledge

1. Suppose that in the perspective transform we have n = 1 and f = 2. Under what circumstances will we
have a “reversal” where a vertex before and after the perspective transform flips from in front of to
behind the eye or vice versa?

2. Is there any reason not to clip in x and y after the perspective divide (see Figure 11.2 of Marschner,
et al. 2016, stage 3)?
3. Derive the incremental form of the midpoint line-drawing algorithm with colours at endpoints for 0 < m
≤ 1.
4. Modify the triangle-drawing algorithm so that it will draw exactly one pixel for points on a triangle
edge which goes through (x, y) = (−1, −1).
5. Suppose you are designing an integer z-buffer for flight simulation where all of the objects are at
least one meter thick, are never closer to the viewer than 4 meters, and may be as far away as 100
km. How many bits are needed in the z-buffer to ensure there are no visibility errors? Suppose that
visibility errors only matter near the viewer, i.e., for distances less than 100 meters. How many bits
are needed in that case?
6. Examine the visuals of two dissimilar games. What differences can you deduce in the graphics
requirements of these two games? Analyse the effect on rendering time, storage budgets, etc.
Unit 6 – Visualisation and Computer Animation

This unit is aligned to:
Learning outcomes
• Understand graphics in computer games and visualisation
Assessment criteria:
• Explore graphics in computer games

• Explore visualisation in graphics

6.1 Introduction
In this unit we explore the concepts of computer animation and visualisation. Animation derives from the
Latin word anima, meaning life or soul: it is the act, process, or result of imparting life, interest, spirit, motion,
or activity. Visualization is another major application area of computer graphics, where computer-generated
images are used to help people understand both spatial and non-spatial data. The unit concludes with some
related examples to consolidate these ideas.

6.2 Computer Animation


Animation means giving life to any object in computer graphics. It has the power of injecting energy and
emotions into the most seemingly inanimate objects. Computer-assisted animation and computer-generated
animation are two categories of computer animation. It can be presented via film or video.

The basic idea behind animation is to play back the recorded images at the rates fast enough to fool the human
eye into interpreting them as continuous motion. Animation can make a series of static images come alive.
Animation can be used in many areas like entertainment, computer aided-design, scientific visualization,
training, education, e-commerce, and computer art. There are four main animation approaches:
• Keyframing: This gives the most direct control to the animator who provides necessary data at
some moments in time and the computer fills in the rest.
• Procedural: Here animation involves specially designed, often empirical, mathematical functions
and procedures whose output resembles some particular motion.
• Physics-based: These techniques solve the differential equations of motion.
• Motion capture: This uses special equipment or techniques to record real-world motion and then
transfers this motion to computer models.
We discuss each of the four main animation approaches in detail.

6.2.1 Keyframing
The term keyframing can be misleading when applied to 3D computer animation since no actual completed
frames (i.e., images) are typically involved. At any given moment, a 3D scene being animated is specified by
a set of numbers: the positions of centres of all objects, their RGB colours, the amount of scaling applied to
each object in each axis, modelling transformations between different parts of a complex object, camera
position and orientation, light sources intensity, etc. To animate a scene, some subset of these values have to
change with time. One can, of course, directly set these values at every frame, but this will not be particularly
efficient. Short of that, some number of important moments in time (key frames 𝑡𝑘 ) can be chosen along the
timeline of animation for each of the parameters and values of this parameter (key values 𝑓𝑘 ) are set only for
these selected frames.

We will call a combination (𝑡𝑘 , 𝑓𝑘 ) of key frame and key value simply a key. Key frames do not have to be the
same for different parameters, but it is often logical to set keys at least for some of them simultaneously. For
example, key frames chosen for x-, y- and z-coordinates of a specific object might be set at exactly the same
frames forming a single position vector key (𝑡𝑘 ,𝑃𝑘 ). These key frames, however, might be completely different
from those chosen for the object’s orientation or colour. The closer key frames are to each other, the more
control the animator has over the result; however, this comes at the cost of the extra work of setting more
keys. It is, therefore, typical to have large spacing between keys in parts of the animation which are
relatively simple, concentrating them in intervals where complex action occurs, as shown in the Figure below.

Different patterns of setting keys (black circles above) can be used simultaneously for the same scene.
It is assumed that there are more frames before, as well as after, this portion.
(Source: Marschner, et al., 2016)

Once the animator sets the key (𝑡𝑘 ,𝑓𝑘 ), the system has to compute values of 𝑓 for all other frames. Although
we are ultimately interested only in a discrete set of values, it is convenient to treat this as a classical
interpolation problem which fits a continuous animation curve 𝑓(𝑡) through a provided set of data points (see
the Figure below).

Continuous curve f(t) is fit through the keys provided by the animator.
(Source: Marschner, et al., 2016)
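
As a minimal illustration of evaluating a keyed parameter, the sketch below linearly interpolates between neighbouring keys (t_k, f_k); production systems normally fit smoother spline curves through the keys, as described above.

# Linear keyframe interpolation: given keys (t_k, f_k), evaluate f(t).
def eval_keys(keys, t):
    keys = sorted(keys)                      # (time, value) pairs
    if t <= keys[0][0]:
        return keys[0][1]
    if t >= keys[-1][0]:
        return keys[-1][1]
    for (t0, f0), (t1, f1) in zip(keys, keys[1:]):
        if t0 <= t <= t1:
            u = (t - t0) / (t1 - t0)         # normalised position between keys
            return (1 - u) * f0 + u * f1

keys = [(0, 0.0), (10, 20.0)]                # the animator's keys
print(eval_keys(keys, 5))                    # 10.0, halfway between the keys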

6.2.1.1 Motion Controls


So far, we have described how to control the shape of the animation curve through key positioning and fine
tweaking of tangent values at the keys. This, however, is generally not sufficient when one would like to have
control both over where the object is moving, i.e., its path, and how fast it moves along this path. Given a set
of positions in space as keys, automatic curve-fitting techniques can fit a curve through them, but resulting
motion is only constrained by forcing the object to arrive at a specified key position 𝑃𝑘 at the corresponding key
frame 𝑡𝑘 , and nothing is directly said about the speed of motion between the keys. This can create problems.

For example, if an object moves along the x-axis with velocity 11 meters per second for 1 second and then
with 1 meter per second for 9 seconds, it will arrive at position x = 20 after 10 seconds thus satisfying animator’s
keys (0,0) and (10, 20). It is rather unlikely that this jerky motion was actually desired, and uniform motion with
speed 2 meters/second is probably closer to what the animator wanted when setting these keys. Although
typically not displaying such extreme behaviour, polynomial curves resulting from standard fitting procedures
do exhibit nonuniform speed of motion between keys, as demonstrated in the Figure below.

All three motions are along the same 2D path and satisfy the set of keys at the tips of the black triangles.
The tips of the white triangles show object position at Δt = 1 intervals. Uniform speed of motion between the keys (top) might be closer to what the
animator wanted, but automatic fitting procedures could result in either of the other two motions.
(Source: Marschner, et al., 2016).

While this can be tolerable (within limits) for some parameters for which the human visual system is not very
good at detecting nonuniformities in the rate of change (such as colour or even rate of rotation), we have to
do better for the position P of the object, where velocity directly corresponds to everyday experience.

However, we will first distinguish curve parameterization used during the fitting procedure from that used for
animation. When a curve is fit through position keys, we will write the result as a function p(u) of some
parameter u. This will describe the geometry of the curve in space. The arc length s is the physical length of
the curve. A natural way for the animator to control the motion along the now existing curve is to specify an
extra function s(t) which corresponds to how far along the curve the object should be at any given time. To get
an actual position in space, we need one more auxiliary function u(s) which computes a parameter value u for
given arc length s. The complete process of computing an object position for a given time t is then given by
composing these functions (see Figure below)
𝑝(𝑡) = 𝑝(𝑢(𝑠(𝑡))).

To get position in space at a given time t, one first utilizes user-specified motion control to obtain the distance along the curve s(t) and then computes
the corresponding curve parameter value u(s(t)). Previously fitted curve P(u) can now be used to find the position P(u(s(t))). (Source: Marschner, et
al., 2016)
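
The sketch below illustrates this composition p(t) = p(u(s(t))) under simplifying assumptions: the path p(u) is an arbitrary illustrative curve, s(t) specifies constant speed, and u(s) is approximated with a precomputed arc-length table.

import bisect

# Sketch of motion control p(t) = p(u(s(t))): an arc-length table maps
# distance s back to curve parameter u; p(u) and s(t) are illustrative.
def p(u):                           # geometry of the path (here, a parabola)
    return (u, u * u)

def s_of_t(t):                      # animator-specified distance along the path
    return 2.0 * t                  # constant speed of 2 units per second

def build_arc_length_table(p, samples=1000):
    table = [(0.0, 0.0)]            # (arc length, parameter u)
    last = p(0.0)
    for i in range(1, samples + 1):
        u = i / samples
        q = p(u)
        step = ((q[0] - last[0]) ** 2 + (q[1] - last[1]) ** 2) ** 0.5
        table.append((table[-1][0] + step, u))
        last = q
    return table

def u_of_s(table, s):
    lengths = [row[0] for row in table]
    i = min(bisect.bisect_left(lengths, s), len(table) - 1)
    return table[i][1]

table = build_arc_length_table(p)
for t in (0.0, 0.25, 0.5):
    print(p(u_of_s(table, s_of_t(t))))   # positions at uniform speed along the path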

6.2.1.2 Interpolating Rotation


The techniques presented above can be used to interpolate the keys set for most of the parameters
describing the scene. Three-dimensional rotation is one important motion for which more specialized
interpolation methods and representations are common. The reason for this is that applying standard
techniques to 3D rotations often leads to serious practical problems. Rotation (a change in orientation of an
object) is the only motion other than translation which leaves the shape of the object intact. It therefore plays
a special role in animating rigid objects. There are several ways to specify the orientation of an object. First,
one can use transformation matrices; unfortunately, naïve (element-by-element) interpolation of rotation
matrices does not produce a correct result.

For example, the matrix “halfway” between the 2D clockwise and counterclockwise 90 degree rotations is the null
matrix:

$$\frac{1}{2}\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.$$

The correct result is, of course, the unit matrix corresponding to no rotation. Second, one can specify
arbitrary orientation as a sequence of exactly three rotations around coordinate axes chosen in some
specific order. These axes can be fixed in space (fixed-angle representation) or embedded into the object
therefore changing after each rotation (Euler-angle representation as shown in Figure below).

Three Euler angles can be used to specify arbitrary object orientation through a sequence of three rotations around coordinate axes
embedded into the object (axis Y always points to the tip of the cone). Note that each rotation is given in a new coordinate system.
Fixed angle representation is very similar, but the coordinate axes it uses are fixed in space and do not rotate with the object
(Source: Marschner, et al., 2016)

These three angles of rotation can be animated directly through standard keyframing, but a subtle problem
known as gimbal lock arises. Gimbal lock occurs if during rotation one of the three rotation axes is by accident
aligned with another, thereby reducing by one the number of available degrees of freedom as shown in the
Figure below for a physical device.

In this example, gimbal lock occurs when a 90 degree turn around axis
Z is made. Both X and Y rotations are now performed around the same axis leading to the loss of one degree of freedom.
(Source: Marschner, et al., 2016)

This effect is more common than one might think—a single 90 degree turn to the right (or left) can potentially
put an object into a gimbal lock. Finally, any orientation can be specified by choosing an appropriate axis in
space and angle of rotation around this axis. While animating in this representation is relatively straightforward,
combining two rotations, i.e., finding the axis and angle corresponding to a sequence of two rotations both
represented by axis and angle, is nontrivial. A special mathematical apparatus, quaternions, has been
developed to make this representation suitable both for combining several rotations into a single one and for
animation.
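
One standard quaternion operation is spherical linear interpolation (slerp) between two orientations; the sketch below is a generic illustration of that formula, not an algorithm taken from the text.

import math

# Slerp: interpolate between two unit quaternions q0 and q1, given as
# (w, x, y, z), at parameter t in [0, 1].
def slerp(q0, q1, t):
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:                       # take the shorter arc
        q1, dot = tuple(-c for c in q1), -dot
    dot = min(dot, 1.0)
    theta = math.acos(dot)              # angle between the quaternions
    if theta < 1e-6:                    # nearly identical: fall back to lerp
        return tuple((1 - t) * a + t * b for a, b in zip(q0, q1))
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return tuple(s0 * a + s1 * b for a, b in zip(q0, q1))

identity = (1.0, 0.0, 0.0, 0.0)                              # no rotation
z90 = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))  # 90 deg about z
print(slerp(identity, z90, 0.5))                             # 45 deg about z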

6.2.2 Character Animation


Animation of articulated figures is most often performed through a combination of keyframing and specialized
deformation techniques. The character model intended for animation typically consists of at least two main
layers as shown below. The motion of a highly detailed surface representing the outer shell or skin of the
character is what the viewer will eventually see in the final product. The skeleton underneath it is a
hierarchical structure (a tree) of joints which provides a kinematic model of the figure and is used exclusively
for animation. In some cases, additional intermediate layer(s) roughly corresponding to muscles are inserted
between the skeleton and the skin.

Left) A hierarchy of joints, a skeleton, serves as a kinematic abstraction of the character; (middle) repositioning the skeleton deforms a separate skin
object attached to it; (right) a tree data structure is used to represent the skeleton. For compactness, the internal structure of several nodes is hidden
(they are identical to a corresponding sibling). (Source: Marschner, et al., 2016)

Each of the skeleton’s joints acts as a parent for the hierarchy below it. The root represents the whole character
and is positioned directly in the world coordinate system. If a local transformation matrix which relates a joint
to its parent in the hierarchy is available, one can obtain a transformation which relates local space of any joint
to the world system (i.e., the system of the root) by simply concatenating transformations along the path from
the root to the joint. To evaluate the whole skeleton (i.e., find position and orientation of all joints), a depth-first
traversal of the complete tree of joints is performed.
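
A minimal sketch of this evaluation is given below; for brevity each joint's local transform is reduced to a 2D translation rather than a full transformation matrix, and the joint names are made up.

# Evaluating a skeleton: each joint stores a local transform relative to its
# parent (here just a 2D translation); a depth-first traversal concatenates
# transforms from the root down to obtain world-space positions.
skeleton = {
    "root":      {"parent": None,    "local": (0.0, 0.0)},
    "spine":     {"parent": "root",  "local": (0.0, 1.0)},
    "left_arm":  {"parent": "spine", "local": (-0.5, 0.3)},
    "right_arm": {"parent": "spine", "local": (0.5, 0.3)},
}

def evaluate(skeleton, root="root", root_world=(0.0, 0.0)):
    world = {}
    def visit(joint, parent_world):
        lx, ly = skeleton[joint]["local"]
        w = (parent_world[0] + lx, parent_world[1] + ly)   # concatenate transforms
        world[joint] = w
        for child, data in skeleton.items():               # depth-first over children
            if data["parent"] == joint:
                visit(child, w)
    visit(root, root_world)
    return world

print(evaluate(skeleton))   # world-space position of every joint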

6.2.2.1 Facial Animation


Skeletons are well suited for creating most motions of a character’s body, but they are not very convenient for
realistic facial animation. The reason is that the skin of a human face is moved by muscles directly attached
to it, contrary to other parts of the body where the primary objective of the muscles is to move the bones of the
skeleton and any skin deformation is a secondary outcome. The result of this facial anatomical arrangement
is a very rich set of dynamic facial expressions humans use as one of the main instruments of communication.
We are all very well trained to recognize such facial variations and can easily notice any unnatural appearance.
This not only puts special demands on the animator but also requires a high-resolution geometric model of the
face and, if photorealism is desired, accurate skin reflection properties and textures.

While it is possible to set key poses of the face vertex-by-vertex and interpolate between them or directly
simulate the behaviour of the underlying muscle structure using physics-based techniques, more specialized
high-level approaches also exist.

6.2.2.2 Motion Capture


To create a realistic-looking character animation from scratch remains a daunting task. It is therefore only
natural that much attention is directed toward techniques which record an actor’s motion in the real world and
then apply it to computer-generated characters. Two main classes of such motion capture (MC) techniques
exist: electromagnetic and optical.

In electromagnetic motion capture, an electromagnetic sensor directly measures its position (and possibly
orientation) in 3D, often providing the captured results in real time. Disadvantages of this technique include
significant equipment cost, possible interference from nearby metal objects, and noticeable size of sensors
and batteries which can be an obstacle in performing high-amplitude motions.

In optical MC, small coloured markers are used instead of active sensors making it a much less intrusive
procedure. The Figure below shows the operation of such a system. In the most basic arrangement, the motion
is recorded by two calibrated video cameras, and simple triangulation is used to extract the marker’s 3D
position. More advanced computer vision algorithms used for accurate tracking of multiple markers from video
are computationally expensive, so, in most cases, such processing is done offline.

Optical motion capture: markers attached


to a performer’s body allow skeletal motion to be extracted.
Image courtesy of Motion Analysis Corp.
(Source: Marschner, et al., 2016)

6.2.3 Physics-Based Animation


Physics-based animation is most commonly used in situations when other techniques are either unavailable
or do not produce sufficiently realistic results. Prime examples include animation of fluids (which includes many
gaseous phase phenomena described by the same equations—smoke, clouds, fire, etc.), cloth simulation,
rigid body motion, and accurate deformation of elastic objects. Governing equations and details of commonly
used numerical approaches are different in each of these cases, but many fundamental ideas and difficulties
remain applicable across applications. Many methods for numerically solving ODEs and PDEs exist, but
discussing them in detail is far beyond the scope of this section. To give the reader a flavour of physics-based
techniques and some of the issues involved, we will briefly mention here only the finite difference approach—
one of the conceptually simplest and most popular families of algorithms which has been applied to most, if
not all, differential equations encountered in animation.

The key idea of this approach is to replace a differential equation with its discrete analog—a difference
equation. To do this, the continuous domain of interest is represented by a finite set of points at which the
solution will be computed. In the simplest case, these are defined on a uniform rectangular grid as shown in
the Figure below.

Two possible difference schemes for an equation involving derivatives ∂f/∂x and ∂f/∂t. (Left) An explicit scheme expresses unknown values
(open circles) only through known values at the current (orange circles) and possibly past (blue circles) time; (Right) Implicit schemes mix known
and unknown values in a single equation making it necessary to solve all such equations as a system. For both schemes, information about values on
the right boundary is needed to close the process.

Every derivative present in the original ODE or PDE is then replaced by its approximation through function
values at grid points. One way of doing this is to subtract the function value at a given point from the function
value for its neighbouring point on the grid:

$$\frac{df(t)}{dt} \approx \frac{\Delta f}{\Delta t} = \frac{f(t+\Delta t)-f(t)}{\Delta t} \qquad \text{or} \qquad \frac{\partial f(x,t)}{\partial x} \approx \frac{\Delta f}{\Delta x} = \frac{f(x+\Delta x,t)-f(x,t)}{\Delta x}$$

These expressions are, of course, not the only way. One can, for example, use f(t − Δt) instead of f(t) above
and divide by 2Δt. For an equation containing a time derivative, it is now possible to propagate values of an
unknown function forward in time in a sequence of Δt-size steps by solving the system of difference equations
(one at each spatial location) for unknown f(t + Δt). Some initial conditions, i.e., values of the unknown function
at t = 0, are necessary to start the process. Other information, such as values on the boundary of the domain,
might also be required depending on the specific problem.
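
As a small illustration of an explicit scheme, the sketch below advances the 1D advection equation df/dt + c df/dx = 0 forward in time using a backward (upwind) difference in x; the grid, time step, and initial condition are arbitrary choices for demonstration.

# Explicit finite-difference step for 1D advection on a uniform grid.
def step(f, c, dt, dx):
    new = f[:]
    for i in range(1, len(f)):
        # Replace df/dx by a backward difference (upwind for c > 0).
        new[i] = f[i] - c * dt / dx * (f[i] - f[i - 1])
    return new

dx, dt, c = 0.1, 0.05, 1.0
f = [1.0 if 0.2 <= i * dx <= 0.4 else 0.0 for i in range(50)]  # initial bump
for _ in range(20):          # propagate forward in time in dt-size steps
    f = step(f, c, dt, dx)
print(max(range(len(f)), key=lambda i: f[i]) * dx)   # peak has moved to the right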

6.2.4 Procedural Techniques


Imagine that one could write (and implement on a computer) a mathematical function which outputs precisely
the desired motion given some animator guidance. Physics-based techniques outlined above can be treated
as a special case of such an approach when the “function” involved is the procedure to solve a particular
differential equation and “guidance” is the set of initial and boundary conditions, extra equation terms, etc.

However, if we are only concerned with the final result, we do not have to follow a physics-based approach.
For example, a simple constant amplitude wave on the surface of a lake can be directly created by applying
the function f(x, t) = Acos(ωt − kx + φ) with constant frequency ω, wave vector k and phase φ to get
displacement at the 2D point x at time t. A collection of such waves with random phases and appropriately
chosen amplitudes, frequencies, and wave vectors can result in a very realistic animation of the surface of
water without explicitly solving any fluid dynamics equations. It turns out that other rather simple mathematical
functions can also create very interesting patterns or objects. Adding time dependence to these functions
allows us to animate certain complex phenomena much more easily and cheaply than with physics-based
techniques while maintaining very high visual quality of the results. If noise(x) is the underlying pattern-
generating function, one can create a time-dependent variant of it by moving the argument position through
the lattice.

The simplest case is motion with constant speed: timenoise(x, t) = noise(x + vt), but more complex motion
through the lattice is, of course, also possible and, in fact, more common. One such path, a spiral, is shown in
Figure below.

A path through the cube defining procedural


noise is traversed to animate the resulting pattern.
(Source: Marschner, et al., 2016)
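
The sketch below illustrates both ideas from this subsection: the constant-amplitude wave f(x, t) = A cos(ωt − kx + φ) and a translated-noise function timenoise(x, t) = noise(x + vt). The one-dimensional value-noise stand-in used here is purely illustrative.

import math, random

# A constant-amplitude travelling wave ...
def wave(x, t, A=0.2, w=2.0, k=3.0, phi=0.0):
    return A * math.cos(w * t - k * x + phi)

# ... and a crude 1D value noise defined on an integer lattice.
random.seed(0)
lattice = [random.random() for _ in range(256)]

def noise(x):
    i = int(math.floor(x))
    u = x - i                                        # fractional position
    a, b = lattice[i % 256], lattice[(i + 1) % 256]
    return (1 - u) * a + u * b                       # linear interpolation

def timenoise(x, t, v=1.5):
    return noise(x + v * t)                          # translate through the lattice

for t in (0.0, 0.5, 1.0):
    print(wave(1.0, t), timenoise(1.0, t))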

Another approach is to animate parameters used to generate the noise function. This is especially appropriate
if the appearance changes significantly with time—a cloud becoming more turbulent, for example. In this way
one can animate the dynamic process of formation of clouds using the function which generates static ones.

For some procedural techniques, time dependence is a more integral component. The simplest cellular
automata operate on a 2D rectangular grid where a binary value is stored at each location (cell). To create a
time varying pattern, some user-provided rules for modifying these values are repeatedly applied. Rules
typically involve some set of conditions on the current value and that of the cell’s neighbours. For example, the
rules of the popular 2D Game of Life cellular automaton invented in 1970 by British mathematician John
Conway are the following:
1. A dead cell (i.e., binary value at a given location is 0) with exactly three live neighbours becomes a
live cell (i.e., its value set to 1).
2. A live cell with two or three live neighbours stays alive.
3. In all other cases, a cell dies or remains dead.

Once the rules are applied to all grid locations, a new pattern is created and a new evolution cycle can be
started. Three sample snapshots of the live cell distribution at different times are shown below.

Several (non-consecutive) stages in the evolution of a Game of Life automaton. Live cells are shown in black. Stable objects, oscillators, traveling
patterns, and many other interesting constructions can result from the application of very simple rules. Figure created using a program by Alan
Hensel. (Source: Marschner, et al., 2016)
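
A minimal sketch of one evolution cycle under the three rules above is given below, using a small grid with toroidal (wrap-around) boundaries, which is one common but not the only choice.

# One evolution cycle of the Game of Life on a 2D grid (1 = live, 0 = dead).
def life_step(grid):
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            live = sum(grid[(r + dr) % rows][(c + dc) % cols]
                       for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                       if (dr, dc) != (0, 0))
            if grid[r][c] == 0 and live == 3:
                new[r][c] = 1            # rule 1: birth
            elif grid[r][c] == 1 and live in (2, 3):
                new[r][c] = 1            # rule 2: survival
            # rule 3: every other cell dies or stays dead (already 0)
    return new

# A "blinker" oscillator: three live cells in a row flip orientation each cycle.
grid = [[0] * 5 for _ in range(5)]
grid[2][1] = grid[2][2] = grid[2][3] = 1
print(life_step(grid))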

6.2.5 Principles of Animation


Disney's 12 principles of animation were first introduced by animators Ollie Johnston and Frank Thomas in
their book The Illusion of Life: Disney Animation, first released in 1981. Through examining the
work of leading Disney animators from the 1930s onwards, this book sees Johnston and Thomas boil their
approach down to 12 basic principles of animation. These basic principles include: squash and stretch, timing,
anticipation, follow through and overlapping action, slow-in and slow-out, staging, arcs, secondary action,
straight-ahead and pose-to-pose action, exaggeration, solid drawing skill, and appeal.

Forming the basis of all animation work, these principles are relevant for a number of different fields. Though
the clearest use is for animating a character, these rules are also an invaluable guide in other areas, for
instance, when introducing motion into your interface with some CSS animation.

Squash and Stretch


The squash and stretch principle is considered the most important of the 12 principles of animation. When
applied, it gives your animated characters and objects the illusion of gravity, weight, mass and flexibility. Think
about how a bouncing rubber ball may react when tossed into the air: the ball stretches when it travels up and
down and squishes when it hits the ground.

When using squash and stretch, it's important to keep the object's volume consistent. So when you stretch
something it needs to get thinner, and when you squash something it needs to get wider.

Anticipation
Anticipation helps to prepare the viewer for what's about to happen. When applied, it has the effect of making
the object's action more realistic.

Consider how it might look if you were to jump in the air without bending your knees, or perhaps to throw a
ball without first pulling your arm back. It would appear very unnatural (it may not even be possible to jump
without bending your knees!). In the same way, animating movements without a flicker of anticipation will
also make your motion seem awkward, stale and lifeless.

Staging
Staging in animation is akin to composition in artwork. This means that you should use motion to guide the
viewer's eye and draw attention to what is important within the scene. Keep the focus on what matters, and
keep the motion of everything else to a minimum.

Straight ahead action and pose to pose


There are two ways to handle drawing animation: straight ahead and pose to pose. Each has its own
benefits, and the two approaches are often combined. Straight ahead action involves drawing frame-by-
frame from start to finish. If you're looking for fluid, realistic movements, straight ahead action is your best
bet.

With the pose to pose technique, you draw the beginning frame, the end frame, and a few key frames in-
between. Then you go back and complete the rest. This technique gives you a bit more control within the
scene and allows you to increase the dramatic effect of the motion.

Follow through and overlapping action


When objects come to a standstill after being in motion, different parts of the object will stop at different
rates. Similarly, not everything on an object will move at the same rate. This forms the essence of the fifth of
Disney's principles of animation.

If your character is running across the scene, their arms and legs may be moving at a different rate from their
head. This is overlapping action. Likewise, when they stop running, their hair will likely continue to move for a
few frames before coming to rest – this is follow through. These are important principles to understand if you
want your animation to flow realistically.

Slow in and slow out


The best way to understand slow in and slow out is to think about how a car starts up and stops. It will start
moving slowly, before gaining momentum and speeding up. The reverse will happen when the car brakes. In
animation, this effect is achieved by adding more frames at the beginning and end of an action sequence.
Apply this principle to give your objects more life.

Arc
When working in animation, it's best to stick with the laws of physics. Most objects follow an arc or a path
when they're moving, and your animations should reflect that arc. For example, when you toss a ball into the
air, it follows a natural arc as the effects of the Earth's gravity act upon it.

Secondary action
Secondary actions are used to support or emphasise the main action going on within a scene. Adding
secondary actions helps add more dimension to your characters and objects.

For instance, the subtle movement of your character’s hair as they walk, or perhaps a facial expression or a
secondary object reacting to the first. Whatever the case may be, this secondary action should not distract
from the primary one.

Timing
For this principle of animation we need to look to the laws of physics again, and apply what we see in the
natural world to our animations. In this case, the focus is on timing.

If you move an object more quickly or slowly than it would naturally move in the real world, the effect won't
be believable. Using the correct timing allows you to control the mood and the reaction of your characters
and objects. That's not to say you can't push things a little (especially if you're creating an imaginary world) –
but if you do, be consistent.

Exaggeration
Too much realism can ruin an animation, making it appear static and boring. Instead, add some exaggeration
to your characters and objects to make them more dynamic. Find ways to push the limits just beyond what's
possible, and your animations will pop.

Solid drawing
You need to understand the basics of drawing. This includes knowing how to draw in three-dimensional
space and understanding form and anatomy, weight and volume, and lights and shadows.

While you can push the limits here, too, it's important to remain consistent. If your world has wonky doors
and a warped perspective, keep that perspective throughout the entire animation. Otherwise, things will fall
apart.

Appeal

Your characters, objects, and the world in which they live need to appeal to the viewer. This includes having
an easy-to-read design, solid drawing, and a personality. There is no formula for getting this right, but it
starts with strong character development and being able to tell your story through the art of animation.

6.2.6 Deformation
Although techniques for object deformation might be more properly treated as modelling tools, they are
traditionally discussed together with animation methods. Probably the simplest example of an operation which
changes object shape is a non-uniform scaling. More generally, some function can be applied to local
coordinates of all points specifying the object (i.e., vertices of a triangular mesh or control polygon of a spline
surface), repositioning these points and creating a new shape: p′ = f(p, γ) where γ is a vector of parameters
used by the deformation function. Choosing different f (and combining them by applying one after another) can
help to create very interesting deformations. Examples of useful simple functions include bend, twist, and taper
which are shown below.

Popular examples of global deformations.


Bending and twist angles, as well as the degree of taper, can all be animated to achieve dynamic shape change.
(Source: Marschner, et al., 2016)

Animating shape change is very easy in this case by keyframing the parameters of the deformation function.
Disadvantages of this technique include difficulty of choosing the mathematical function for some nonstandard
deformations and the fact that the resulting deformation is global in the sense that the complete object, and
not just some part of it, is reshaped.

To deform an object locally while providing more direct control over the result, one can choose a single vertex,
move it to a new location and adjust vertices within some neighbourhood to follow the seed vertex. The area
affected by the deformation and the specific amount of displacement in different parts of the object are
controlled by an attenuation function which decreases with distance (typically computed over the object’s
surface) to the seed vertex. Seed vertex motion can be keyframed to produce animated shape change.
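
A minimal sketch of such a local deformation is shown below; for brevity it uses Euclidean distance to the seed vertex and an arbitrary smooth falloff, whereas, as noted above, distance is typically computed over the object's surface.

# Local deformation: the seed vertex is displaced and nearby vertices follow,
# scaled by an attenuation function of distance to the seed.
def attenuation(d, radius):
    if d >= radius:
        return 0.0
    x = d / radius
    return (1.0 - x * x) ** 2          # smooth falloff from 1 at the seed to 0

def deform(vertices, seed_index, displacement, radius):
    sx, sy, sz = vertices[seed_index]
    out = []
    for (x, y, z) in vertices:
        d = ((x - sx) ** 2 + (y - sy) ** 2 + (z - sz) ** 2) ** 0.5
        a = attenuation(d, radius)
        out.append((x + a * displacement[0],
                    y + a * displacement[1],
                    z + a * displacement[2]))
    return out

verts = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(deform(verts, 0, (0.0, 1.0, 0.0), radius=1.0))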

6.2.7 Groups of Objects


To animate multiple objects one can, of course, simply apply standard techniques described in the unit so far
to each of them. This works reasonably well for a moderate number of independent objects whose desired
motion is known in advance. However, in many cases, some kind of coordinated action in a dynamic
environment is necessary. If only a few objects are involved, the animator can use an artificial intelligence
(AI)-based system to automatically determine immediate tasks for each object based on some high-level
goal, plan necessary motion, and execute the plan. Many modern games use such autonomous objects to
create smart monsters or player’s collaborators.

Interestingly, as the number of objects in a group grows from just a few to several dozens, hundreds, and
thousands, individual members of a group must have only very limited “intelligence” in order for the group as
a whole to exhibit what looks like coordinated goal-driven motion. It turns out that this flocking is emergent
behaviour which can arise as a result of limited interaction of group members with just a few of their closest
neighbours (Reynolds, 1987). Flocking should be familiar to anyone who has observed the fascinatingly
synchronized motion of a flock of birds or a school of fish. The technique can also be used to control groups
of animals moving over terrain or even a human crowd.
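
The sketch below gives a rough 2D flavour of such local rules (separation, alignment, and cohesion applied only to nearby neighbours); all weights and radii are illustrative and not taken from Reynolds' paper.

# One update step of a tiny 2D flock: each member reacts only to neighbours
# within a small radius, via cohesion, alignment, and separation.
def dist(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def flock_step(positions, velocities, radius=2.0, dt=0.1):
    new_vel = []
    for i, (p, v) in enumerate(zip(positions, velocities)):
        neigh = [j for j, q in enumerate(positions) if j != i and dist(p, q) < radius]
        if neigh:
            # cohesion: steer toward the neighbours' centre
            cx = sum(positions[j][0] for j in neigh) / len(neigh)
            cy = sum(positions[j][1] for j in neigh) / len(neigh)
            coh = ((cx - p[0]) * 0.05, (cy - p[1]) * 0.05)
            # alignment: match the neighbours' average velocity
            ax = sum(velocities[j][0] for j in neigh) / len(neigh)
            ay = sum(velocities[j][1] for j in neigh) / len(neigh)
            ali = ((ax - v[0]) * 0.05, (ay - v[1]) * 0.05)
            # separation: move away from neighbours that are too close
            sep = (sum(p[0] - positions[j][0] for j in neigh) * 0.02,
                   sum(p[1] - positions[j][1] for j in neigh) * 0.02)
            v = (v[0] + coh[0] + ali[0] + sep[0],
                 v[1] + coh[1] + ali[1] + sep[1])
        new_vel.append(v)
    new_pos = [(p[0] + v[0] * dt, p[1] + v[1] * dt)
               for p, v in zip(positions, new_vel)]
    return new_pos, new_vel

pos = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0)]
vel = [(1.0, 0.0), (0.8, 0.1), (1.1, -0.1)]
pos, vel = flock_step(pos, vel)
print(pos)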

6.3 Visualisation
Visualization is used when the goal is to augment human capabilities in situations where the problem is not
sufficiently well defined for a computer to handle algorithmically. If a totally automatic solution can completely
replace human judgment, then visualization is not typically required. Visualization can be used to generate
new hypotheses when exploring a completely unfamiliar dataset, to confirm existing hypotheses in a partially
understood dataset, or to present information about a known dataset to another audience.

Visualization allows people to offload cognition to the perceptual system, using carefully designed images as
a form of external memory. The human visual system is a very high-bandwidth channel to the brain, with a
significant amount of processing occurring in parallel and at the pre-conscious level. We can thus use external
images as a substitute for keeping track of things inside our own heads. For an example, let us consider the
task of understanding the relationships between a subset of the topics in the splendid book Gödel, Escher,
Bach: An Eternal Golden Braid (Hofstadter, 1979); see the Figure below:

Infinity – Lewis Carroll
Infinity – Zeno
Infinity – Paradoxes
Infinity – Halting problem
Zeno – Lewis Carroll
Paradoxes – Lewis Carroll
Paradoxes – Epimenides
Paradoxes – Self-ref
Epimenides – Self-ref
Epimenides – Tarski
Tarski – Epimenides
Tarski – Truth vs. provability
Tarski – Undecidability
Halting problem – Decision procedures
Halting problem – Turing
Lewis Carroll – Wordplay

Keeping track of relationships between topics is difficult using a text list.


(Source: Marschner, et al., 2016)

When we see the dataset as a text list, at the low level we must read words and compare them to memories
of previously read words. It is hard to keep track of just these dozen topics using cognition and memory alone,
let alone the hundreds of topics in the full book. The higher-level problem of identifying neighbourhoods, for
instance finding all the topics two hops away from the target topic Paradoxes, is very difficult.

Substituting perception for cognition and memory allows us to understand relationships between book topics quickly.
(Source: Marschner, et al., 2016)

We call the mapping of dataset attributes to a visual representation a visual encoding. One of the central
problems in visualization is choosing appropriate encodings from the enormous space of possible visual
representations, taking into account the characteristics of the human perceptual system, the dataset in
question, and the task at hand.

When designing a visualization system, we must consider three different kinds of limitations: computational
capacity, human perceptual and cognitive capacity, and display capacity. As with any application of computer
graphics, computer time and memory are limited resources and we often have hard constraints. If the
visualization system needs to deliver interactive response, then it must use algorithms that can run in a fraction
of a second rather than minutes or hours.

6.3.1 Data Types


Many aspects of a visualization design are driven by the type of the data that we need to look at. For example,
is it a table of numbers, or a set of relations between items, or inherently spatial data such as a location on the
Earth’s surface or a collection of documents? We start by considering a table of data. We call the rows items
of data and the columns are dimensions, also known as attributes. For example, the rows might represent
people, and the columns might be names, age, height, shirt size, and favourite fruit. We distinguish between
three types of dimensions: quantitative, ordered, and categorical. Quantitative data, such as age or height, is
numerical and we can do arithmetic on it. For example, the quantity of 68 inches minus 42 inches is 26 inches.
With ordered data, such as shirt size, we cannot do full-fledged arithmetic, but there is a well-defined ordering.
For example, large minus medium is not a meaningful concept, but we know that medium falls between small
and large. Categorical data, such as favourite fruit or names, does not have an implicit ordering. We can only
distinguish whether two things are the same (apples) or different (apples vs. bananas).

Relational data, or graphs, are another data type where nodes are connected by links. One specific kind of
graph is a tree, which is typically used for hierarchical data. Both nodes and edges can have associated
attributes. The word graph is unfortunately overloaded in visualization. The node-link graphs we discuss here,
following the terminology of graph drawing and graph theory, could also be called networks. In the field of
statistical graphics, graph is often used for chart, as in the line charts for time-series data.

Some data is inherently spatial, such as geographic location or a field of measurements at positions in three-
dimensional space as in the MRI or CT scans used by doctors to see the internal structure of a person’s body.
The information associated with each point in space may be an unordered set of scalar quantities, or indexed
vectors, or tensors. In contrast, non-spatial data can be visually encoded using spatial position, but that
encoding is chosen by the designer rather than given implicitly in the semantics of the dataset itself. This
choice is one of the most central and difficult problems of visualization design. Other notable data
considerations in visualization are dimension and item count, and data transformation into derived dimensions.

6.3.2 Human-Centred Design Process


The visualization design process can be split into a cascading set of layers, as shown below.

Four nested layers of validation for visualization.


(Source: Marschner, et al., 2016)

These layers all depend on each other; the output of the level above is input into the level below.

Task Characterization: A given dataset has many possible visual encodings. Choosing which visual encoding
to use can be guided by the specific needs of some intended user. Different questions, or tasks, require very
different visual encodings. For example, consider the domain of software engineering.

Abstraction: Problems from very different domains can map to the same visualization abstraction. That is,
the domain problem is recast as a more generic set of operations. These generic operations include sorting,
filtering, characterizing trends and distributions, finding anomalies and outliers, and finding correlation. They
also include operations that are specific to a particular data type, for example following a path for relational
data in the form of graphs or trees. This abstraction step often involves data transformations from the original
raw data into derived dimensions. These derived dimensions are often of a different type than the original data:
a graph may be converted into a tree, tabular data may be converted into a graph by using a threshold to
decide whether a link should exist based on the field values, and so on.

Techniques and Algorithm Design: Once an abstraction has been chosen, the next layer is to design
appropriate visual encoding and interaction techniques. Section 6.3.3 covers the principles of visual encoding,
and we discuss interaction principles in Section 6.3.4.

Validation: Each of the four layers has different validation requirements. The first layer is designed to
determine whether the problem is correctly characterized. The next layer is used to determine whether the
abstraction from the domain problem into operations on specific data types actually solves the desired problem.
The purpose of the third layer is to verify that the visual encoding and interaction techniques chosen by the
designer effectively communicate the chosen abstraction to the users. A fourth layer is employed to verify that
the algorithm designed to carry out the encoding and interaction choices is faster or takes less memory than
previous algorithms.

6.3.3 Visual Encoding Principles


We can describe visual encodings as graphical elements, called marks, that convey information through
visual channels. A zero-dimensional mark is a point, a one-dimensional mark is a line, a two-dimensional
mark is an area, and a three-dimensional mark is a volume. Many visual channels can encode information,
including spatial position, colour, size, shape, orientation, and direction of motion. Multiple visual channels
can be used to simultaneously encode different data dimensions; for example, Figure below shows the use
of horizontal and vertical spatial position, colour, and size to display four data dimensions. More than one
channel can be used to redundantly code the same dimension, for a design that displays less information but
shows it more clearly.

The four visual channels of horizontal and vertical spatial position, colour, and size are used to encode information in this scatterplot.
(Source: Marschner, et al., 2016)
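
A small sketch of such an encoding is given below, mapping four made-up data dimensions to horizontal position, vertical position, colour, and size with matplotlib.

import matplotlib.pyplot as plt

# Encode four data dimensions in one scatterplot: x and y spatial position,
# colour, and marker size. The data values are invented purely to illustrate
# the mapping.
height = [150, 160, 170, 180, 190]        # x position
weight = [50, 60, 70, 80, 90]             # y position
age    = [10, 20, 30, 40, 50]             # colour channel
income = [100, 400, 900, 1600, 2500]      # size channel (marker area)

sc = plt.scatter(height, weight, c=age, s=income, cmap="viridis")
plt.xlabel("height (cm)")
plt.ylabel("weight (kg)")
plt.colorbar(sc, label="age (years)")
plt.title("Four dimensions encoded with position, colour, and size")
plt.show()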

Visual Channel Characteristics


Important characteristics of visual channels are distinguishability, separability, and popout. Channels are not
all equally distinguishable. Many psychophysical experiments have been carried out to measure the ability of
people to make precise distinctions about information encoded by the different visual channels. Our abilities
depend on whether the data type is quantitative, ordered, or categorical.

Colour
Colour can be a very powerful channel, but many people do not understand its properties and use it improperly.
We can consider colour in terms of three separate visual channels: hue, saturation, and lightness. Region size
strongly affects our ability to sense colour. Colour in small regions is relatively difficult to perceive, and
designers should use bright, highly saturated colours to ensure that the colour coding is distinguishable. The
inverse situation is true when coloured regions are large, as in backgrounds, where low-saturation pastel
colours should be used to avoid blinding the viewer.

2D vs. 3D Spatial Layouts


The question of whether to use two or three channels for spatial position has been extensively studied. When
computer-based visualization began in the late 1980s, and interactive 3D graphics was a new capability, there
was a lot of enthusiasm for 3D representations. As the field matured, researchers began to understand the
costs of 3D approaches when used for abstract datasets.

Occlusion, where some parts of the dataset are hidden behind others, is a major problem with 3D. Although
hidden surface removal algorithms such as z-buffers and BSP trees allow fast computation of a correct 2D
image, people must still synthesize many of these images into an internal mental map.

In contrast, if a dataset consists of inherently 3D spatial data, such as showing fluid flow over an airplane wing
or a medical imaging dataset from an MRI scan, then the costs of a 3D view are outweighed by its benefits in
helping the user construct a useful mental model of the dataset structure.

Text Labels
Text in the form of labels and legends is a very important factor in creating visualizations that are useful rather
than simply pretty. Axes and tick marks should be labelled. Legends should indicate the meaning of colours,
whether used as discrete patches or in continuous colour ramps. Individual items in a dataset typically have
meaningful text labels associated with them. In many cases showing all labels at all times would result in too
much visual clutter, so labels can be shown for a subset of the items using label positioning algorithms that
show labels at a desired density while avoiding overlap.

6.3.4 Interaction Principles


Several principles of interaction are important when designing a visualization. Low-latency visual feedback
allows users to explore more fluidly, for example by showing more detail when the cursor simply hovers over
an object rather than requiring the user to explicitly click. Selecting items is a fundamental operation when
interacting with large datasets, as is visually indicating the selected set with highlighting. Colour coding is a
common form of highlighting, but other channels can also be used. Many forms of interaction can be
considered in terms of what aspect of the display they change. Navigation can be considered a change of
viewport. Sorting is a change to the spatial ordering; that is, changing how data is mapped to the spatial position
visual channel. The entire visual encoding can also be changed. Other interaction considerations include overview
first, zoom and filter, details on demand, interactivity cost, and animation.

6.3.5 Composite and Adjacent Views


A very fundamental visual encoding choice is whether to have a single composite view showing everything in
the same frame or window, or to have multiple views adjacent to each other.

Single Drawing
When there are only one or two data dimensions to encode, then horizontal and vertical spatial position are
the obvious visual channels to use, because we perceive them most accurately and position has the strongest
influence on our internal mental model of the dataset. The traditional statistical graphics displays of line
charts, bar charts, and scatterplots all use spatial ordering of marks to encode information. These displays
can be augmented with additional visual channels, such as colour and size and shape.

Superimposing and Layering


Multiple items can be superimposed in the same frame when their spatial position is compatible. Several
lines can be shown in the same line chart, and many dots in the same scatterplot, when the axes are shared
across all items. One benefit of a single shared view is that comparing the position of different items is very
easy. If the number of items in the dataset is limited, then a single view will often suffice. Visual layering can
extend the usefulness of a single view when there are enough items that visual clutter becomes a concern.

Glyphs
We have been discussing the idea of visual encoding using simple marks, where a single mark can only have
one value for each visual channel used. With more complex marks, which we will call glyphs, there is internal
structure where sub-regions have different visual channel encodings.

Designing appropriate glyphs has the same challenges as designing visual encodings. The Figure below
shows a variety of glyphs, including the notorious faces originally proposed by Chernoff. The danger of using
faces to show abstract data dimensions is that our perceptual and emotional response to different facial
features is highly nonlinear in a way that is not fully understood, but the variability is greater than between the
visual channels that we have discussed so far. We are probably far more attuned to features that indicate
emotional state, such as eyebrow orientation, than other features, such as nose size or face shape.

Complex marks, which we call glyphs, have subsections that visually encode different data dimensions.
(Source: Marschner, et al., 2016)

Multiple Views
We now turn from approaches with only a single frame to those which use multiple views that are linked
together. The most common form of linkage is linked highlighting, where items selected in one view are
highlighted in all others. In linked navigation, movement in one view triggers movement in the others.

There are many kinds of multiple-view approaches. In what is usually called simply the multiple-view approach,
the same data is shown in several views, each of which has a different visual encoding that shows certain
aspects of the dataset most clearly. The power of linked highlighting across multiple visual encodings is that
items that fall in a contiguous region in one view are often distributed very differently in the other views. In the
small-multiples approach, each view has the same visual encoding for different datasets, usually with shared
axes between frames so that comparison of spatial position between them is meaningful. Side-by-side
comparison with small multiples is an alternative to the visual clutter of superimposing all the data in the same
view, and to the human memory limitations of remembering previously seen frames in an animation that
changes over time.

6.3.6 Data Reduction


The visual encoding techniques that we have discussed so far show all of the items in a dataset. However,
many datasets are so large that showing everything simultaneously would result in so much visual clutter that
the visual representation would be difficult or impossible for a viewer to understand. The main strategies to
reduce the amount of data shown are overviews and aggregation, filtering and navigation, the focus+context
techniques, and dimensionality reduction.

Overviews and Aggregation


With tiny datasets, a visual encoding can easily show all data dimensions for all items. For datasets of medium
size, an overview that shows information about all items can be constructed by showing less detail for each
item. Many datasets have internal or derivable structure at multiple scales. In these cases, a multiscale visual
representation can provide many levels of overview, rather than just a single level. Overviews are typically
used as a starting point to give users clues about where to drill down to inspect in more detail.

For larger datasets, creating an overview requires some kind of visual summarization. One approach to data
reduction is to use an aggregate representation where a single visual mark in the overview explicitly represents
many items. The challenge of aggregation is to avoid eliminating the interesting signals in the dataset in the
process of summarization.

Filtering and Navigation


Another approach to data reduction is to filter the data, showing only a subset of the items. Filtering is often
carried out by directly selecting ranges of interest in one or more of the data dimensions. Navigation is a
specific kind of filtering based on spatial position, where changing the viewpoint changes the visible set of
items. Both geometric and nongeometric zooming are used in visualization. With geometric zooming, the
camera position in 2D or 3D space can be changed with standard computer graphics controls. In a realistic
scene, items should be drawn at a size that depends on their distance from the camera, and only their apparent
size changes based on that distance. However, in a visual encoding of an abstract space, nongeometric
zooming can be useful, where the apparent size of items does not need to depend on their distance from the camera.

Focus+Context
Focus+context techniques are another approach to data reduction. A subset of the dataset items are
interactively chosen by the user to be the focus and are drawn in detail. The visual encoding also includes
information about some or all of the rest of the dataset shown for context, integrated into the same view that
shows the focus items. Many of these techniques use carefully chosen distortion to combine magnified focus
regions and minified context regions into a unified view.

Dimensionality Reduction
When there are many data dimensions, dimensionality reduction can also be effective. With slicing, a single
value is chosen from the dimension to eliminate, and only the items matching that value for the dimension are
extracted to include in the lower-dimensional slice. Slicing is particularly useful with 3D spatial data, for
example when inspecting slices through a CT scan of a human head at different heights along the skull. Slicing
can be used to eliminate multiple dimensions at once.

Examples

We conclude this chapter with several examples of visualizing specific types of data
using the techniques discussed above.

Tables
Tabular data is extremely common, as all spreadsheet users know. The goal in
visualization is to encode this information through easily perceivable visual channels
rather than forcing people to read through it as numbers and text. The figure below shows
the Table Lens, a focus+context approach where quantitative values are encoded as the
length of one-pixel high lines in the context regions, and shown as numbers in the focus
regions. Each dimension of the dataset is shown as a column, and the rows of items can
be resorted according to the values in that column with a single click in its header.

The Table Lens provides focus+context interaction with tabular data, immediately reorderable by the values in each
dimension column. (Source: Marschner, et al., 2016)

Graphs
The field of graph drawing is concerned with finding a spatial position for the nodes in a
graph in 2D or 3D space and routing the edges between these nodes. In many cases
the edge-routing problem is simplified by using only straight edges, or by only allowing
right angle bends for the class of orthogonal layouts, but some approaches handle true
curves. If the graph has directed edges, a layered approach can be used to show
hierarchical structure through the horizontal or vertical spatial ordering of nodes, as shown in the figure below.

Graph layout aesthetic criteria. Top: edge crossings should be minimized. Middle: angular resolution should be
maximized. Bottom: symmetry is maximized on the left, whereas crossings are minimized on the right, showing the
conflict between the individually NP-hard criteria. (Source: Marschner, et al., 2016)

Trees
Trees are a special case of graphs so common that a great deal of visualization research
has been devoted to them. A straightforward algorithm to lay out trees in the two-
dimensional plane works well for small trees, while a more complex but scalable
approach runs in linear time. Treemaps use containment rather than connection to show
the hierarchical relationship between parent and child nodes in a tree. That is, treemaps
show child nodes nested within the outlines of the parent node.

Treemap showing a filesystem of nearly one million files (Source: Marschner, et al., 2016)

The Figure above shows a hierarchical file system of nearly one million files, where file
size is encoded by rectangle size and file type is encoded by colour. The size of nodes
at the leaves of the tree can encode an additional data dimension, but the size of nodes
in the interior does not show the value of that dimension; it is dictated by the cumulative
size of their descendants.
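
To make the containment idea concrete, the sketch below outlines the classic slice-and-dice treemap layout in C.
It is an illustrative sketch rather than the algorithm used for the figure above: the Node structure, the field
names, and the alternating split direction are assumptions made only for this example.

/* Slice-and-dice treemap layout (illustrative sketch; types and names are assumptions). */
#include <stddef.h>

typedef struct Node {
    double size;             /* leaf weight (e.g. file size); for interior nodes, cumulative size of descendants */
    struct Node **children;  /* NULL or array of child pointers */
    size_t n_children;
    double x, y, w, h;       /* output: the rectangle assigned to this node */
} Node;

/* Recursively give each child a slice of its parent's rectangle, proportional to its
   size, alternating between vertical and horizontal splits at each level of the tree. */
static void treemap_layout(Node *n, double x, double y, double w, double h, int vertical)
{
    n->x = x; n->y = y; n->w = w; n->h = h;
    if (n->n_children == 0)
        return;

    double total = 0.0;
    for (size_t i = 0; i < n->n_children; i++)
        total += n->children[i]->size;
    if (total <= 0.0)
        return;

    double offset = 0.0;
    for (size_t i = 0; i < n->n_children; i++) {
        double frac = n->children[i]->size / total;
        if (vertical)   /* stack children top to bottom */
            treemap_layout(n->children[i], x, y + offset * h, w, frac * h, !vertical);
        else            /* place children left to right */
            treemap_layout(n->children[i], x + offset * w, y, frac * w, h, !vertical);
        offset += frac;
    }
}

Calling treemap_layout(root, 0, 0, width, height, 0) fills in a rectangle for every node; the nested rectangles
can then be drawn and coloured, as in the figure above.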

Unit 7 – Application Programming Interface with OpenGL and OpenGL Geometry

This unit is aligned to:
Learning outcomes
• Develop the skills to design simple video games

Assessment criteria:

• Examine different methods to create video games

7.1 Introduction
OpenGL is the premier environment for developing portable, interactive 2D and 3D graphics
applications. Since its introduction in 1992, OpenGL has become the industry's most widely used and
supported 2D and 3D graphics application programming interface (API), bringing thousands of
applications to a wide variety of computer platforms. OpenGL fosters innovation and speeds application
development by incorporating a broad set of rendering, texture mapping, special effects, and other
powerful visualization functions. Developers can leverage the power of OpenGL across all popular
desktop and workstation platforms, ensuring wide application deployment.

7.1.1 High Visual Quality and Performance


Any visual computing application requiring maximum performance, from 3D animation to CAD to
visual simulation, can exploit high-quality, high-performance OpenGL capabilities. These capabilities
allow developers in diverse markets such as broadcasting, CAD/CAM/CAE, entertainment, medical
imaging, and virtual reality to produce and display incredibly compelling 2D and 3D graphics.

7.1.2 Developer-Driven Advantages


Industry standard
An independent consortium, the OpenGL Architecture Review Board, guides the OpenGL specification.
With broad industry support, OpenGL is the only truly open, vendor-neutral, multiplatform graphics
standard.

Stable
OpenGL implementations have been available for more than seven years on a wide variety of
platforms. Additions to the specification are well controlled, and proposed updates are announced in
time for developers to adopt changes. Backward compatibility requirements ensure that existing
applications do not become obsolete.

Reliable and portable


All OpenGL applications produce consistent visual display results on any OpenGL API-compliant
hardware, regardless of operating system or windowing system.

Evolving
Because of its thorough and forward-looking design, OpenGL allows new hardware innovations to be
accessible through the API via the OpenGL extension mechanism. In this way, innovations appear in
the API in a timely fashion, letting application developers and hardware vendors incorporate new
features into their normal product release cycles.

Scalable
OpenGL API-based applications can run on systems ranging from consumer electronics to PCs,
workstations, and supercomputers. As a result, applications can scale to any class of machine that
the developer chooses to target.

Easy to use
OpenGL is well structured with an intuitive design and logical commands. Efficient OpenGL routines
typically result in applications with fewer lines of code than those that make up programs generated
using other graphics libraries or packages. In addition, OpenGL drivers encapsulate information about
the underlying hardware, freeing the application developer from having to design for specific hardware
features.

Well-documented
Numerous books have been published about OpenGL, and a great deal of sample code is readily
available, making information about OpenGL inexpensive and easy to obtain.

7.1.3 The OpenGL Visualization Programming Pipeline



OpenGL operates on image data as well as geometric primitives.


(Source: https://www.opengl.org/about/#12)
7.1.4 Simplifies Software Development, Speeds Time-to-Market
OpenGL routines simplify the development of graphics software—from rendering a simple geometric
point, line, or filled polygon to the creation of the most complex lighted and texture-mapped NURBS
curved surface. OpenGL gives software developers access to geometric and image primitives, display
lists, modelling transformations, lighting and texturing, anti-aliasing, blending, and many other features.

Every conforming OpenGL implementation includes the full complement of OpenGL functions. The well-
specified OpenGL standard has language bindings for C, C++, Fortran, Ada, and Java. All licensed
OpenGL implementations come from a single specification and language binding document and are
required to pass a set of conformance tests. Applications utilizing OpenGL functions are easily portable
across a wide array of platforms for maximized programmer productivity and shorter time-to-market.

All elements of the OpenGL state—even the contents of the texture memory and the frame buffer—can
be obtained by an OpenGL application. OpenGL also supports visualization applications with 2D images
treated as types of primitives that can be manipulated just like 3D geometric objects. As shown in the
OpenGL visualization programming pipeline diagram above, images and vertices defining geometric
primitives are passed through the OpenGL pipeline to the frame buffer.

7.1.5 Available Everywhere


Supported on all UNIX® workstations, and shipped standard with every Windows 95/98/2000/NT and
MacOS PC, no other graphics API operates on a wider range of hardware platforms and software
environments. OpenGL runs on every major operating system including Mac OS, OS/2, UNIX, Windows
95/98, Windows 2000, Windows NT, Linux, OPENStep, and BeOS; it also works with every major
windowing system, including Win32, MacOS, Presentation Manager, and X-Window System. OpenGL
is callable from Ada, C, C++, Fortran, Python, Perl and Java and offers complete independence from
network protocols and topologies.

7.1.6 Architected for Flexibility and Differentiation: Extensions


Although the OpenGL specification defines a particular graphics processing pipeline, platform vendors
have the freedom to tailor a particular OpenGL implementation to meet unique system cost and
performance objectives. Individual calls can be executed on dedicated hardware, run as software
routines on the standard system CPU, or implemented as a combination of both dedicated hardware
and software routines. This implementation flexibility means that OpenGL hardware acceleration can
range from simple rendering to full geometry and is widely available on everything from low-cost PCs
to high-end workstations and supercomputers. Application developers are assured consistent display
results regardless of the platform implementation of the OpenGL environment.

Using the OpenGL extension mechanism, hardware developers can differentiate their products by
developing extensions that allow software developers to access additional performance and
technological innovations.

Many OpenGL extensions, as well as extensions to related APIs like GLU, GLX, and WGL, have been
defined by vendors and groups of vendors. The OpenGL Extension Registry is maintained by SGI and
contains specifications for all known extensions, written as modifications to the appropriate specification
documents. The registry also defines naming conventions, guidelines for creating new extensions and
writing suitable extension specifications, and other related documentation.

7.1.7 API Hierarchy

Demonstration of the relationship between OpenGL, GLU, and windowing APIs


(Source: https://www.opengl.org/about/#12)

7.1.8 The Foundation for Advanced APIs


Leading software developers use OpenGL, with its robust rendering libraries, as the 2D/3D graphics
foundation for higher-level APIs. Developers leverage the capabilities of OpenGL to deliver highly
differentiated, yet widely supported vertical market solutions. For example, Open Inventor provides a
cross-platform user interface and flexible scene graph that makes it easy to create OpenGL
applications. IRIS Performer leverages OpenGL functionality and delivers additional features tailored
for demanding high-frame-rate markets such as visual simulation and virtual sets. OpenGL Optimizer
is a toolkit for real-time interaction, modification, and rendering of complex surface-based models such
as those found in CAD/CAM and special effects creation. OpenGL Volumizer is a high-level immediate
mode volume rendering API for the energy, medical and sciences markets. OpenGL Shader provides
a common interface to support realistic visual effects, bump mapping, multiple textures, environment
maps, volume shading and an unlimited array of new effects using hardware acceleration on standard
OpenGL graphics cards.

7.1.9 Governance
The OpenGL Architecture Review Board (ARB) was an independent consortium formed in 1992 that
governed the future of OpenGL, proposing and approving changes to the specification, new releases,
and conformance testing. In Sept 2006, the ARB became the OpenGL Working Group under
the Khronos Group consortium for open standard APIs.

The OpenGL Performance Characterization Committee, another independent organization, creates and
maintains OpenGL benchmarks and publishes the results of those benchmarks on its Web site:
http://www.specbench.org/benchmarks.html#gwpg.

Continued Innovation

The OpenGL standard is constantly evolving. Formal revisions occur at periodic intervals, and
extensions allowing application developers to access the latest hardware advances through OpenGL
are continuously being developed. As extensions become widely accepted, they are considered for
inclusion into the core OpenGL standard. This process allows OpenGL to evolve in a controlled yet
innovative manner.

OpenGL Applications & Games

OpenGL is the pervasive standard for 3D consumer and professional applications across all major OS
platforms.

7.2 Design
The OpenGL specification describes an abstract API for drawing 2D and 3D graphics. Although it is
possible for the API to be implemented entirely in software, it is designed to be implemented mostly or
entirely in hardware.

The API is defined as a set of functions which may be called by the client program, alongside a set of
named integer constants (for example, the constant GL_TEXTURE_2D, which corresponds to the
decimal number 3553). Although the function definitions are superficially similar to those of the
programming language C, they are language-independent. As such, OpenGL has many language
bindings, some of the most noteworthy being the JavaScript binding WebGL (API, based on OpenGL
ES 2.0, for 3D rendering from within a web browser); the C bindings WGL, GLX and CGL; the C binding
provided by iOS; and the Java and C bindings provided by Android.

In addition to being language-independent, OpenGL is also cross-platform. The specification says
nothing on the subject of obtaining and managing an OpenGL context, leaving this as a detail of the
underlying windowing system. For the same reason, OpenGL is purely concerned with rendering,
providing no APIs related to input, audio, or windowing.

An illustration of the graphics pipeline process


(Source: https://en.wikipedia.org/wiki/OpenGL)

7.3 Development
OpenGL is an evolving API. New versions of the OpenGL specifications are regularly released by the
Khronos Group, each of which extends the API to support various new features. The details of each
version are decided by consensus between the Group's members, including graphics card
manufacturers, operating system designers, and general technology companies such as Mozilla and
Google.

In addition to the features required by the core API, graphics processing unit (GPU) vendors may
provide additional functionality in the form of extensions. Extensions may introduce new functions and
new constants, and may relax or remove restrictions on existing OpenGL functions. Vendors can use
extensions to expose custom APIs without needing support from other vendors or the Khronos Group
as a whole, which greatly increases the flexibility of OpenGL. All extensions are collected in, and defined
by, the OpenGL Registry.

Each extension is associated with a short identifier, based on the name of the company which
developed it. For example, Nvidia's identifier is NV, which is part of the extension name
GL_NV_half_float, the constant GL_HALF_FLOAT_NV, and the function glVertex2hNV(). If multiple
vendors agree to implement the same functionality using the same API, a shared extension may be
released, using the identifier EXT. In such cases, it could also happen that the Khronos Group's
Architecture Review Board gives the extension their explicit approval, in which case the identifier ARB
is used.

The features introduced by each new version of OpenGL are typically formed from the combined
features of several widely implemented extensions, especially extensions of type ARB or EXT.
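
Since the set of available extensions varies between drivers, applications typically query the context at run
time before using extension functions. The helper below is a minimal sketch, assuming an OpenGL 3.0 or later
context whose entry points (such as glGetStringi) have already been loaded, for example through a loader
library like GLEW or GLAD; the function name has_extension is our own.

#include <string.h>

/* Returns 1 if the current context reports the named extension, 0 otherwise.
   Assumes an OpenGL 3.0+ context with glGetStringi available. */
static int has_extension(const char *name)
{
    GLint count = 0;
    glGetIntegerv(GL_NUM_EXTENSIONS, &count);
    for (GLint i = 0; i < count; i++) {
        const char *ext = (const char *)glGetStringi(GL_EXTENSIONS, (GLuint)i);
        if (ext != NULL && strcmp(ext, name) == 0)
            return 1;
    }
    return 0;
}

/* Example use:
   if (has_extension("GL_NV_half_float")) {
       // safe to call glVertex2hNV() and use GL_HALF_FLOAT_NV
   }
*/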

7.4 Documentation
The OpenGL Architecture Review Board released a series of manuals along with the specification
which have been updated to track changes in the API. These are commonly referred to by the colours
of their covers:

The Red Book


OpenGL Programming Guide, 9th Edition. ISBN 978-0-134-49549-1
The Official Guide to Learning OpenGL, Version 4.5 with SPIR-V

The Orange Book


OpenGL Shading Language, 3rd edition. ISBN 0-321-63763-1
A tutorial and reference book for GLSL.
Historic books (pre-OpenGL 2.0):

The Green Book


OpenGL Programming for the X Window System. ISBN 978-0-201-48359-8
A book about X11 interfacing and OpenGL Utility Toolkit (GLUT).
The Blue Book
OpenGL Reference manual, 4th edition. ISBN 0-321-17383-X
Essentially a hard-copy printout of the Unix manual (man) pages for OpenGL.
Includes a poster-sized fold-out diagram showing the structure of an idealised OpenGL
implementation.

The Alpha Book (white cover)


OpenGL Programming for Windows 95 and Windows NT. ISBN 0-201-40709-4
A book about interfacing OpenGL with Microsoft Windows.

7.5 Basic OpenGL Application Layout


A simple and basic OpenGL application has, at its heart, a display loop that is called either as fast as
possible, or at a rate that coincides with the refresh rate of the monitor or display device. The example
loop below uses the GLFW library, which supports OpenGL coding across multiple platforms.

Example

while (!glfwWindowShouldClose(window)) {
    // OpenGL code is called here,
    // each time this loop is executed.
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    // Drawing calls would normally be issued here.

    // Swap front and back buffers
    glfwSwapBuffers(window);

    // Poll for events
    glfwPollEvents();
    if (glfwGetKey(window, GLFW_KEY_ESCAPE) == GLFW_PRESS)
        glfwSetWindowShouldClose(window, 1);
}

The loop is tightly constrained to operate only while the window is open. This example loop resets the
colour buffer values and also resets the z-buffer depth values in the graphics hardware memory based
on previously set (or default) values. Input from devices such as the keyboard, mouse, network, or other
interaction mechanisms is processed at the end of the loop to change the state of data structures
associated with the program. The call to glfwSwapBuffers synchronizes the graphics context with the
display refresh, performing the pointer swap between the front and back buffers so that the updated
graphics state is displayed on the user’s screen. The call to swap the buffers occurs after all graphics
calls have been issued.

While conceptually separate, the depth and colour buffers are often collectively called the framebuffer.
By clearing the contents of the framebuffer, the application can proceed with additional OpenGL calls
to push geometry and fragments through the graphics pipeline. The framebuffer is directly related to
the size of the window that has been opened to contain the graphics context. The window, or viewport,
dimensions are needed by OpenGL to construct the Mvp matrix within the hardware. This is
accomplished through the following code, demonstrated again with the GLFW toolkit, which provides
functions for querying the requested window (or framebuffer) dimensions:

int nx, ny;
glfwGetFramebufferSize(window, &nx, &ny);
glViewport(0, 0, nx, ny);

In this example, glViewport sets the OpenGL state for the window dimension using nx and ny for the
width and height of the window and the viewport being specified to start at the origin. Technically,
OpenGL writes to the framebuffer memory as a result of operations that rasterize geometry, and
process fragments. These writes happen before the pixels are displayed on the user’s monitor.
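
Putting the pieces above together, a complete skeleton for a basic OpenGL application might look like the
sketch below. It is an illustrative outline rather than a required structure: the window size, title, and
error handling are assumptions, and a real application would add its drawing code inside the loop.

#include <GLFW/glfw3.h>   /* also pulls in the OpenGL header by default */

int main(void)
{
    if (!glfwInit())
        return -1;

    /* Hypothetical window dimensions and title. */
    GLFWwindow *window = glfwCreateWindow(640, 480, "Basic OpenGL Application", NULL, NULL);
    if (!window) {
        glfwTerminate();
        return -1;
    }
    glfwMakeContextCurrent(window);

    while (!glfwWindowShouldClose(window)) {
        /* Keep the viewport in step with the current framebuffer size. */
        int nx, ny;
        glfwGetFramebufferSize(window, &nx, &ny);
        glViewport(0, 0, nx, ny);

        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

        /* Drawing calls would be issued here. */

        glfwSwapBuffers(window);
        glfwPollEvents();
        if (glfwGetKey(window, GLFW_KEY_ESCAPE) == GLFW_PRESS)
            glfwSetWindowShouldClose(window, 1);
    }

    glfwTerminate();
    return 0;
}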

7.6 Projection and Viewing in OpenGL


In this section, we discuss projection and viewing in OpenGL.

7.6.1 Projection in OpenGL


The projection is represented in OpenGL as a matrix. OpenGL keeps track of the projection matrix
separately from the matrix that represents the modelview transformation. The same transform functions,
such as glRotatef, can be applied to both matrices, so OpenGL needs some way to know which matrix
those functions apply to. This is determined by an OpenGL state property called the matrix mode. The
value of the matrix mode is a constant such as GL_PROJECTION or GL_MODELVIEW. When a
function such as glRotatef is called, it modifies a matrix; which matrix is modified depends on the current
value of the matrix mode. The value is set by calling the function glMatrixMode. The initial value is
GL_MODELVIEW. This means that if you want to work on the projection matrix, you must first call

glMatrixMode(GL_PROJECTION);

If you want to go back to working on the modelview matrix, you must call

glMatrixMode(GL_MODELVIEW);

We generally set the matrix mode to GL_PROJECTION, set up the projection transformation, and
then immediately set the matrix mode back to GL_MODELVIEW. This means that anywhere else in
the program, we can be sure that the matrix mode is GL_MODELVIEW.

There are two general types of projection, perspective projection and orthographic projection.
Perspective projection is more physically realistic. That is, it shows what you would see if the OpenGL
display rectangle on your computer screen were a window into an actual 3D world (one that could
extend in front of the screen as well as behind it). It shows a view that you could get by taking a picture
of a 3D world with a camera. In a perspective view, the apparent size of an object depends on how far
it is away from the viewer. Only things that are in front of the viewer can be seen. In fact, ignoring
clipping in the z-direction for the moment, the part of the world that is in view is an infinite pyramid, with
the viewer at the apex of the pyramid, and with the sides of the pyramid passing through the sides of
the viewport rectangle.

However, OpenGL can’t actually show everything in this pyramid, because of its use of the depth test
to solve the hidden surface problem. Since the depth buffer can only store a finite range of depth
values, it can’t represent the entire range of depth values for the infinite pyramid that is theoretically in
view. Only objects in a certain range of distances from the viewer can be part of the image. That
range of distances is specified by two values, near and far. For a perspective transformation, both of
these values must be positive numbers, and far must be greater than near. Anything that is closer to
the viewer than the near distance or farther away than the far distance is discarded and does not
appear in the rendered image. The volume of space that is represented in the image is thus a
“truncated pyramid.” This pyramid is the view volume for a perspective projection:

View volume for a perspective projection:


(Source: shorturl.at/iwVY9)
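
As a concrete illustration of setting up such a truncated pyramid, a perspective projection might be configured
as in the sketch below. The particular frustum extents, field of view, and near/far distances are assumptions
chosen for the example, not values from the text; gluPerspective is a GLU convenience routine that builds an
equivalent matrix from a vertical field of view and an aspect ratio.

glMatrixMode(GL_PROJECTION);
glLoadIdentity();

/* Specify the truncated pyramid directly: left, right, bottom, top, near, far. */
glFrustum(-1.0, 1.0, -1.0, 1.0, 1.0, 100.0);

/* Or, equivalently, describe it by field of view and aspect ratio (GLU):
   gluPerspective(60.0, (double)nx / (double)ny, 1.0, 100.0);          */

glMatrixMode(GL_MODELVIEW);
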
Orthographic projections are easier to understand: In an orthographic projection, the 3D world is
projected onto a 2D image by discarding the z-coordinate of the eye-coordinate system. This type of
projection is unrealistic in that it is not what a viewer would see. For example, the apparent size of an
object does not depend on its distance from the viewer. Objects in back of the viewer as well as in front
of the viewer can be visible in the image. Orthographic projections are still useful, however, especially
in interactive modelling programs where it is useful to see true sizes and angles, undistorted by
perspective. In orthographic projection, the value of near is allowed to be negative, putting the “near”
clipping plane behind the viewer in the finite view volume, as shown in the lower section of this
illustration below:

The finite view volume in Orthographic projection


(Source: shorturl.at/iwVY9)

An orthographic projection can be set up in OpenGL using the glOrtho method, which has the
following form:

glOrtho( xmin, xmax, ymin, ymax, near, far );

The first four parameters specify the x- and y-coordinates of the left, right, bottom, and top of the view
volume. Note that the last two parameters are near and far, not zmin and zmax. In fact, the minimum
z-value for the view volume is −far and the maximum z-value is −near. However, it is often the case that
near = −far, and if that is true then the minimum and maximum z-values turn out to be near and far after
all!

As with glFrustum, glOrtho should be called when the matrix mode is GL_PROJECTION. As an
example, suppose that we want the view volume to be the box centred at the origin containing x, y, and
z values in the range from -10 to 10. This can be accomplished with

glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrtho( -10, 10, -10, 10, -10, 10 );
glMatrixMode(GL_MODELVIEW);

The call to glLoadIdentity ensures that the starting point is the identity transform. This is important
since glOrtho (like glFrustum) modifies the existing projection matrix rather than replacing it, and although
it is theoretically possible, you don’t even want to try to think about what would happen if you combined
several projection transformations into one.

7.6.2 Viewing in OpenGL


The default viewing conditions in computer image formation are similar to the settings on a basic
camera with a fixed lens.

The Orthographic view


Direction of Projection: when the image plane is fixed and the camera is moved far away from the plane, the
projectors become parallel and the centre of projection (COP) is replaced by a direction of projection.

OpenGL Camera
• OpenGL places a camera at the origin in object space pointing in the negative z direction
• The default viewing volume is a box centred at the origin with a side of length 2

In the default orthographic view, points are projected forward along the z axis onto the plane z = 0.

OpenGL Camera
(Source: shorturl.at/klrsF)

Transformations and Viewing


• The pipeline architecture depends on multiplying together a number of transformation
matrices to achieve the desired image of a primitive.
• Two important matrices:

Model-view
Projection
• The values of these matrices are part of the state of the system.

In OpenGL, projection is carried out by a projection matrix (transformation)


There is only one set of transformation functions so we must set the matrix mode first

glMatrixMode(GL_PROJECTION);

Transformation functions are incremental so we start with an identity matrix and alter it with a
projection matrix that gives the view volume

glLoadIdentity();
glOrtho(-1.0, 1.0, -1.0, 1.0, -1.0, 1.0);
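
To move the camera away from this default position, a viewing transformation is applied to the model-view
matrix. A common way to do this is the GLU routine gluLookAt; the eye position, look-at point, and up vector
in the sketch below are illustrative assumptions only.

glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
/* gluLookAt comes from the GLU library (GL/glu.h). */
gluLookAt(2.0, 2.0, 2.0,   /* eye position    */
          0.0, 0.0, 0.0,   /* point looked at */
          0.0, 1.0, 0.0);  /* up direction    */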

For more information on OpenGL viewing visit


http://info.ee.surrey.ac.uk/Teaching/Courses/CGI/lectures_pdf/opengl3.pdf and
https://www.khronos.org/opengl/wiki/Viewing_and_Transformations

Test your knowledge

1. Lab 1: Basic code setup for OpenGL applications. This includes installing the necessary drivers
and related software such as GLM and GLFW. Students can then write code to open a window
and clear the color buffers.

2. Lab 2: Creating a shader. Since a rudimentary shader is necessary to visualize the output in
modern OpenGL, starting with efforts to create a very basic shader will go a long way. In this
lab, or labs, students could build (or use provided) classes to load, compile, and link shaders
into shader programs (a starting-point sketch is given after this list).
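
As a hint for Lab 2, the sketch below shows the usual sequence of OpenGL calls for turning GLSL source strings
into a linked shader program. It assumes a modern OpenGL context with these entry points already loaded (for
example via GLEW or GLAD); the helper name compile_shader and the abbreviated error handling are our own.

#include <stdio.h>

/* Compile a single shader stage and print its info log on failure. */
static GLuint compile_shader(GLenum type, const char *source)
{
    GLuint shader = glCreateShader(type);
    glShaderSource(shader, 1, &source, NULL);
    glCompileShader(shader);

    GLint ok = 0;
    glGetShaderiv(shader, GL_COMPILE_STATUS, &ok);
    if (!ok) {
        char log[512];
        glGetShaderInfoLog(shader, sizeof log, NULL, log);
        fprintf(stderr, "shader compile error: %s\n", log);
    }
    return shader;
}

/* Typical usage (vertexSrc and fragmentSrc are hypothetical GLSL strings):
   GLuint vs   = compile_shader(GL_VERTEX_SHADER, vertexSrc);
   GLuint fs   = compile_shader(GL_FRAGMENT_SHADER, fragmentSrc);
   GLuint prog = glCreateProgram();
   glAttachShader(prog, vs);
   glAttachShader(prog, fs);
   glLinkProgram(prog);
   glUseProgram(prog);
*/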
