ArcPy and ArcGIS - Geospatial Analysis With Python - Sample Chapter
Silas Toms
Chapter 6, Working with ArcPy Geometry Objects, explores ArcPy Geometry objects and
how they are combined with cursors to perform spatial analysis. It demonstrates how to
buffer, clip, reproject, and more using the data cursors and the ArcPy geometry types
without using ArcToolbox.
Chapter 7, Creating a Script Tool, explains how to make scripts into tools that appear in
ArcToolbox and are dynamic in nature. It explains how the tools and scripts
communicate and how to set up the ArcTool dialog to correctly pass parameters to
the script.
Chapter 8, Introduction to ArcPy.Mapping, explores the powerful ArcPy.Mapping
module and how to fix broken layer links, turn layers on and off, and dynamically adjust
titles and text. It shows how to create dynamic map output based on a geospatial analysis.
Chapter 9, More ArcPy.Mapping Techniques, introduces Layer objects, and their
methods and properties. It demonstrates how to control map scales and extents for data
frames, and covers automated map export.
Chapter 10, Advanced Geometry Object Methods, expands on the ArcPy Geometry
object methods and properties. It also explains how to create a module to save code for
reuse in subsequent scripts, and demonstrates how to create Excel spreadsheets
containing results from a geospatial analysis.
Chapter 11, Network Analyst and Spatial Analyst with ArcPy, introduces the basics of
using ArcPy for advanced geospatial analysis using the ArcGIS for Desktop Network
Analyst and Spatial Analyst Extensions.
Chapter 12, The End of the Beginning, covers other important topics that need to be
understood to have a full grasp of ArcPy. These topics include the Environment Settings,
XY values and Z and M resolutions, Spatial Reference Systems (Projections), the
Describe functions, and more.
Introduction to Python for ArcGIS
In this chapter, we will discuss the development of Python as a programming
language, from its beginning in the late 1980s to its current state. We will discuss the
philosophy of design that spurred its development, and touch on important modules
that will be used throughout the book, especially focusing on the modules built into
the Python standard library. This overview of the language and its features will help
explain what makes Python a great language for ArcGIS automation.
This chapter will cover:
A quick overview of Python: What it is and does, who created it, and
where it is now
Overview of Python
Python, created by Guido van Rossum in 1989, was named after his favorite comedy
troupe, Monty Python. His work group at the time had a tradition of naming
programs after TV shows, and he wanted something irreverent and different from
its predecessors - ABC, Pascal, Ada, Eiffel, FORTRAN, and others. So he settled on
Python, feeling it was a bit edgy and catchy as well. It's certainly more fun to say
than C, the language in which Python's reference implementation is written.
Interpreted language
Python is an interpreted language. Its reference interpreter is written in C, a
compiled language; Python source code is first compiled into bytecode, and the
interpreter then executes that bytecode. Practically, this means that the code is
executed as soon as it is compiled, with no separate build step. While code interpretation
can have speed implications for the execution of Python-based programs, the faster
development time allowed by Python makes this drawback easy to ignore. Testing
of code snippets is much faster in an interpreted environment, and it is perfect for
creating scripts to automate basic, repeatable computing tasks. Python scripts have
the .py extension. When a script is imported as a module, a second file (with
the .pyc extension) is generated to save the compiled bytecode. The .pyc file will be
automatically regenerated when changes are made in the original .py script.
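As a rough illustration of this compile-then-execute cycle, the built-in compile() function turns a string of source code into a code object holding bytecode, and exec() then runs it. A .pyc file stores a code object of this kind on disk. The variable names here are illustrative only:

```python
# A small piece of Python source code held in a string (illustrative only).
source = "result = 2 + 3"

# compile() converts the source text into a code object containing bytecode.
code_object = compile(source, "<example>", "exec")

# The raw bytecode is available on the code object.
bytecode = code_object.co_code

# exec() runs the compiled code; names it creates are stored in namespace.
namespace = {}
exec(code_object, namespace)

print(namespace["result"])
```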
Chapter 1
Wrapper modules
The ArcPy module is a wrapper module. Wrapper modules are common in Python,
and are so named because they wrap Python onto the tools we will need. They allow
us to use Python to interface with other programs written in C or other programming
languages, using the Application Programming Interface (API) of those programs.
For example, wrappers make it possible to extract data from an Excel spreadsheet and
transform or load the data into another program, such as ArcGIS. Not all modules are
wrappers; some modules are written in pure Python and perform their analysis and
computations using the Python syntax. Either way, the end result is that a computer
and its programs are available to be manipulated and controlled using Python.
Go to https://www.python.org/doc/humor/
for more information.
Import statements
Import statements are used to augment the power of Python by calling other
modules for use in the script. These modules can be part of the standard Python
library of modules, such as the math module (used to do higher mathematical
calculations) or, importantly, ArcPy, which will allow us to interact with ArcGIS.
Import statements can be located anywhere before the
module is used, but by convention, they are located at the
top of a script.
There are three ways to create an import statement. The first, and most standard, is
to import the whole module as follows:
import arcpy
Using this method, we can even import more than one module on the same
line. The following imports three modules: arcpy, os (the operating system
module), and sys (the Python system module):
import arcpy, os, sys
The second method, the from <module> import <submodule> form, is used when
only a portion of the code from ArcPy will be needed; it makes just the named
portion available in the script, although the parent module is still loaded
when it is called. We can also import multiple portions of the
module in the same fashion:
from arcpy import mapping, da
The third method imports everything in a module at once using an asterisk, as in
from arcpy import *. This last method is still used but it is discouraged as it can
have unforeseen consequences. For instance, the names of the variables in the module
might conflict with the names of variables in the script or in another module, because
they are not explicitly imported. For this reason, it is best to avoid this third method.
However, lots of existing scripts include import statements of this type, so be aware
of these consequences.
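The import forms can be tried without ArcGIS installed. The following is a minimal sketch that uses only standard library modules (math, os, and cmath stand in for ArcPy here), and it ends with the kind of name conflict that makes wildcard imports risky:

```python
# Import whole modules (several can share one line); names stay behind
# the module prefix.
import math, os

# Import only the named portion of a module.
from os import path

whole_module_result = math.sqrt(16.0)                # 4.0
current_folder = os.getcwd()
file_name = path.basename('C:/Data/shapefile.shp')   # 'shapefile.shp'

# A name clash: both math and cmath define sqrt. Whichever import runs
# last silently replaces the earlier name. A wildcard import
# (from cmath import *) would do this for every name in the module.
from math import sqrt
from cmath import sqrt

clash_result = sqrt(4)   # now the complex-number version: (2+0j)
```

Keeping the module prefix (math.sqrt rather than sqrt) avoids this kind of silent replacement.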
Variables
Variables are a part of all programming languages. They are used to reference data
and store it in memory for use later in a script. There are a lot of arguments over the
best method to name variables. No standard has been developed for Python scripting
for ArcGIS. The following are some best practices to use when naming variables.
Make them descriptive: Don't just name a variable x; that variable will
be useless later when the script is reviewed and there is no way to know
what it is used for, or why. They should be longer rather than shorter, and
should hint at the data they reference or even the data type of the object they
reference:
shapefilePath = 'C:/Data/shapefile.shp'
Use camel case to make the variable readable: Camel case is a term used for
variables that start with a lower case letter but have upper case letters in the
middle, resembling a camel's hump:
camelCase = 'this is a string'
Include the data type in the variable name: If the variable contains a string,
call it variableString. This is not required, and will not be used dogmatically
in this book, but it can help organize the script and is helpful for others
who will read these scripts. Python is dynamically typed rather than statically
typed, a programming language distinction that means a variable does not have
to be declared with a type before it can be used, unlike Visual Basic or other
statically typed languages. This improves the speed of writing a script, but it can be
problematic in long scripts as the data type of a variable will not be obvious.
ArcGIS does not use camel case when it exports Python
scripts, and many examples will not include it; nevertheless,
it is recommended when writing new scripts. Also, variables
cannot start with a number.
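To make the point about dynamic typing concrete, here is a small sketch (the variable names are our own, chosen only for illustration) in which the same name refers first to a string and then to an integer, with no declaration in either case:

```python
# No declaration is needed; the name is simply bound to a value.
featureCount = '25'
typeBefore = type(featureCount).__name__   # 'str'

# The same name can later be bound to a value of a different type.
featureCount = int(featureCount)
typeAfter = type(featureCount).__name__    # 'int'

# When the type is not obvious from the name, it can be checked.
isWholeNumber = isinstance(featureCount, int)
```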
For loops
Built into programming languages is the ability to iterate, or perform a repeating
process, over a dataset to transform or extract data that meets specific criteria.
Python's main iteration tool is known as a for loop. The term for loop means that
an operation will loop, or iterate, over the items in a dataset to perform the operation
on each item. The dataset must be iterable to be used in a for loop, a distinction
discussed further ahead.
We will be using for loops throughout this book. Here is a simple example that uses
the Python Interpreter to take string values and print them in an uppercase format,
using a for loop:
>>> newlist = [ 'a' , 'b' , 'c' , 'd' ]
>>> for item in newlist:
        print item.upper()
The variable item is a generic variable assigned to each object as it is entered into
the for loop, and not a term required by Python. It could have been x or value
instead. Within the loop, the first object (a) is assigned to the generic variable item
and has the upper string function applied to it to produce the output A. Once this
action has been performed, the next object (b) is assigned to the generic variable
to produce an output. This loop is repeated for all members of the dataset newlist;
once completed, the variable item will still carry the value of the last member of the
dataset (d in this case).
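A variation on the same loop, sketched here as a script rather than at the interpreter prompt, collects the results in a second list instead of printing them. It also shows that the loop variable keeps its last value once the loop has finished. The name upperList is our own:

```python
newlist = ['a', 'b', 'c', 'd']
upperList = []

# Each member of newlist is assigned to item in turn.
for item in newlist:
    upperList.append(item.upper())

# After the loop, item still refers to the last member of newlist.
lastItem = item
```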
Downloading the example code
You can download the example code files for all Packt books you have
purchased from your account at http://www.packtpub.com. If you
purchased this book elsewhere, you can visit http://www.packtpub.com/support
and register to have the files e-mailed directly to you.
If/Elif/Else statements
Conditional statements, called if/else statements in Python, are also standard
in programming languages. They are used when evaluating data: when a certain
condition is met, one action is taken (the initial if statement); if another
condition is met, another action is taken (an elif statement); and if the data
meets none of the conditions, a final action is assigned to deal with those cases (the
else statement). These are similar to a where conditional in a SQL statement used
with the Select tool in ArcToolbox. Here is an example of how to use an if/else
statement to evaluate data in a list (a data type discussed further ahead), testing the
remainder after division using the modulus operator (%) and Python's is equal to
operator (==):
>>> data = [1,2,4,5,6,7,10]
>>> for val in data:
        if val % 2 == 0:
            print val,"no remainder"
        elif val % 3 == 2:
            print val, "remainder of two"
        else:
            print "final case"
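The same tests can be sketched as a script that stores each outcome instead of printing it, which makes the order of evaluation easier to check: the elif test is only reached when the if test fails, and else catches everything that remains. The function name describe is our own, and functions are covered later in this chapter:

```python
def describe(val):
    # Only the first condition that is true has its action performed.
    if val % 2 == 0:
        return "no remainder"
    elif val % 3 == 2:
        return "remainder of two"
    else:
        return "final case"

data = [1, 2, 4, 5, 6, 7, 10]
results = []
for val in data:
    results.append(describe(val))
```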
While statements
Another important evaluation tool is the while statement. It is used to perform an
action while a condition is true; when the condition is false, the evaluation will stop.
Note that the condition must become false, or the action will always be performed,
creating an infinite loop that will not stop until the Python interpreter is shut off
externally. Here is an example of using a while loop to perform an action until a true
condition becomes false:
>>> x = 0
>>> while x < 5:
        print x
        x+=1
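A while loop is also handy when the number of repetitions is not known in advance. In this sketch, the file names are hypothetical and no files are actually opened; the loop works through a list until it is empty, which guarantees that the condition eventually becomes false:

```python
# Hypothetical file names waiting to be processed.
queue = ['roads.shp', 'parcels.shp', 'rivers.shp']
processed = []

# The loop ends once every name has been removed from queue.
while len(queue) > 0:
    current = queue.pop(0)      # remove and return the first member
    processed.append(current)
```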
Comments
Comments in Python are used to add notes within a script. They are marked by
a pound sign, and are ignored by the Python interpreter when the script is run.
Comments are useful to explain what a code block does when it is executed, or to
add helpful notes that script authors would like future script users to read:
# This is a comment
Data types
GIS uses points, lines, polygons, coverages, and rasters to store data. Each of these
GIS data types can be used in different ways when performing an analysis and have
different attributes and traits. Python, similar to GIS, has data types that organize
data. The main data types in Python are strings, integers, floats, lists, tuples, and
dictionaries. They each have their own attributes and traits (or properties), and
are used for specific parts of code automation. There are also built-in functions
that allow for data types to be converted (or cast) from one type to another; for
instance, the integer 1 can be converted to the string '1' using the str() function:
>>> variable = 1
>>> newvar = str(variable)
>>> newvar
'1'
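The other conversion functions work in the same way. The following sketch uses str(), int(), and float(), and shows the error raised when a string that is not a number is passed to int(). The example values are our own:

```python
numberString = str(1)            # '1'
wholeNumber = int('25')          # 25
truncated = int(3.9)             # 3, the decimal part is dropped
decimalNumber = float('2.5')     # 2.5
fromInteger = float(7)           # 7.0

# A string that is not a number cannot be converted to an integer.
try:
    int('Main St')
    conversionFailed = False
except ValueError:
    conversionFailed = True
```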
Strings
Strings are used to contain any kind of character. They begin and end with quotation
marks, with either single or double quotes used, though the string must begin and
end with the same type of quotation marks. Within a string, quoted text can appear;
it must use the opposite quotation marks to avoid conflicting with the string. Check
the following example:
>>> quote = 'This string contains a quote: "Here is the quote" '
A third type of string is also employed, a multiple line string that starts and ends
with three single quote marks:
>>> multiString = '''This string has
multiple lines and can go for
as long as I want it to'''
Integers
Integers are whole numbers that do not have any decimal places. There is a special
consequence to the use of integers in mathematical operations; if integers are used
for division, an integer result will be returned. Check out this code snippet below to
see an example of this:
>>> 5 / 2
2
Instead of an accurate result of 2.5, Python 2 will return the floor value, that is, the
largest whole number that does not exceed the true result, for any division of one
integer by another. This can obviously be problematic
and can cause small bugs in scripts that can have major consequences.
Please be aware of this issue when writing scripts and use floats to
avoid it as described in the following section.
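The behavior of the / operator depends on the Python version: with two integers it floors the result in Python 2, but performs true division in Python 3. The following sketch sticks to forms that give the same answer in both versions, using // when a floored result is wanted and a float operand when it is not:

```python
floorResult = 5 // 2            # 2, floor division is always explicit
floatResult = 5 / 2.0           # 2.5, one float operand is enough
convertedResult = float(5) / 2  # 2.5, converting an integer first

# Flooring rounds down, not toward zero, so negative values can surprise.
negativeFloor = -5 // 2         # -3
```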
Floats
Floating point values, or floats, are used by Python to represent decimal values. The
use of floats when performing division is recommended:
>>> 5.0 / 2
2.5
Because computers store values in a base 2 binary system, there can be issues
representing a floating value that would normally be represented in a base 10
system. Read docs.python.org/2/tutorial/floatingpoint.html for a further
discussion of the ramifications of this limitation.
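A short sketch of this limitation: neither 0.1 nor 0.2 can be stored exactly in binary, so their sum is not exactly 0.3. Comparing within a small tolerance, or rounding before comparing, avoids the problem. The tolerance of 1e-9 is an arbitrary choice for this example:

```python
total = 0.1 + 0.2

# A direct comparison fails because of the tiny representation error.
exactlyEqual = (total == 0.3)              # False

# Comparing within a tolerance is more reliable.
closeEnough = abs(total - 0.3) < 1e-9      # True

# Rounding to a fixed number of decimal places also works here.
roundedTotal = round(total, 2)             # 0.3
```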
Lists
Lists are ordered collections of data that are contained in square brackets ([]). Lists
can contain any other type of data, including other lists. Data types can be mixed
within a single list. Lists also have a set of methods that allow them to be extended,
reversed, or sorted, along with many other methods, and built-in functions such as
sum(), max(), and min() can total a list or extract its maximum or minimum value.
Data pieces within a list are separated by commas.
List members are referenced by their index, or position in the list, and the index
always starts at zero. Look at the following example to understand this better:
>>> alist = ['a','b','c','d']
>>> alist[0]
'a'
This example shows us how to extract the first value (at the index 0) from the list
called alist. Once a list has been populated, the data within it is referenced by its
index, which is passed to the list in square brackets. To get the second value in a list
(the value at index 1), the same method is used:
>>> alist[1]
'b'
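A few of the list operations mentioned earlier are sketched here. Note that append(), extend(), sort(), and reverse() are methods that change the list in place, while sum(), max(), and min() are built-in functions that take a list as input. The values are arbitrary:

```python
alist = ['a', 'b', 'c', 'd']
alist.append('e')           # add one member to the end
alist.extend(['g', 'f'])    # add every member of another list
alist.sort()                # ['a', 'b', 'c', 'd', 'e', 'f', 'g']
alist.reverse()             # ['g', 'f', 'e', 'd', 'c', 'b', 'a']

numbers = [3, 10, 7]
total = sum(numbers)        # 20
largest = max(numbers)      # 10
smallest = min(numbers)     # 3

# Data types can be mixed, and a list can contain another list.
mixed = ['a', 1, 2.5, ['nested', 'list']]
nestedValue = mixed[3][0]   # 'nested'
```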
Tuples
Tuples are related to lists and are denoted by parentheses (()). Unlike lists, tuples are
immutable; they cannot be adjusted or extended once they have been created. Data
within a tuple is referenced in the same way as a list, using index references starting
at zero:
>>> atuple = ('e','d','k')
>>> atuple[0]
'e'
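The following sketch shows what immutability means in practice: an attempt to change a member of a tuple raises a TypeError, so the only way to get a different tuple is to build a new one:

```python
atuple = ('e', 'd', 'k')

# Assigning to a position in a tuple is not allowed.
try:
    atuple[0] = 'x'
    changeAllowed = True
except TypeError:
    changeAllowed = False

# A new tuple can be built from an existing one; atuple is unchanged.
longerTuple = atuple + ('m',)
```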
Dictionaries
Dictionaries are denoted by curly brackets ({}) and are used to create key:value
pairs. This allows us to map a key to a value, so that the key can be used to
look up the value and the data in the value can be used in processing. Here is a
simple example:
>>> adic = {'key':'value'}
>>> adic['key']
'value'
Note that instead of referring to an index position, as with lists or tuples, the values
are referenced using a key. Also, keys can be any type of data that cannot be changed
in place, such as strings, numbers, or tuples of these; lists cannot be used as keys
(because lists are mutable).
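The restriction on keys can be checked directly. In this sketch, a tuple of coordinates works as a key, while the equivalent list raises a TypeError. The names are illustrative only:

```python
# A tuple is immutable, so it can be used as a key.
coordinateNames = {(0, 0): 'origin'}
originName = coordinateNames[(0, 0)]

# A list is mutable, so Python refuses to use it as a key.
try:
    badDict = {[0, 0]: 'origin'}
    listKeyAllowed = True
except TypeError:
    listKeyAllowed = False
```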
This can be very valuable when reading a shapefile or feature class. Using an
ObjectID as a key, the value would be a list of row attributes associated with
ObjectID. Look at the following example to better understand this behavior:
>>> objectIDdic = { 1 : [ '100' , 'Main' , 'St' ] }
>>> objectIDdic[1]
['100', 'Main', 'St']
Dictionaries are very valuable for reading in feature classes and easily parsing
through the data by calling only the rows of interest, among other operations. They
are great for ordering and reordering data for use later in a script, so be sure to pay
attention to them moving forward.
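As a sketch of this pattern, the rows below are hypothetical stand-ins for records read from a feature class (no ArcPy cursor is used here). Each row begins with its ObjectID, which becomes the key, and the remaining attributes become the value:

```python
# Hypothetical rows: (ObjectID, house number, street name, street type).
rows = [
    (1, '100', 'Main', 'St'),
    (2, '250', 'Oak', 'Ave'),
    (3, '17', 'Pine', 'Rd'),
]

objectIDdic = {}
for row in rows:
    objectID = row[0]
    objectIDdic[objectID] = list(row[1:])   # the rest of the row

# Only the row of interest needs to be called later in the script.
wanted = objectIDdic[2]
```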
Dictionaries are also iterable, but with a specific implementation that will only allow
direct access to the keys of the dictionary (which can then be used to access the
values). Also, the keys are not returned in a specific order:
>>> aDict = {"key1":"value1",
"key2":"value2"}
>>> for key in aDict:
        print key, aDict[key]
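When the order matters, one common approach is to sort the keys before looping. This sketch also uses the items() method, which returns each key together with its value:

```python
aDict = {"key2": "value2", "key1": "value1"}

# sorted() returns the keys in a predictable order.
pairs = []
for key in sorted(aDict):
    pairs.append((key, aDict[key]))

# items() gives the key and the value together.
sortedItems = sorted(aDict.items())
```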
Indentation
Python, unlike most other programming languages, enforces strict rules on indenting
lines of code. This concept is derived again from Guido's desire to produce clean,
readable code. When creating functions or using for loops, or if/else statements,
indentation is required on the succeeding lines of code. If a for loop is included inside
an if/else statement, there will be two levels of indentation. Veteran programmers of
other languages have complained about the strict nature of Python's indentation. New
programmers generally find it to be helpful as it makes it easy to organize code. Note
that a lot of programmers new to Python will create an indentation error at some point,
so make sure to pay attention to the indentation levels.
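The nesting described above looks like this in practice. In this sketch, each new level of indentation (four spaces here, a common convention) marks a block that belongs to the statement above it:

```python
data = [1, 2, 3, 4]
evens = []

if len(data) > 0:                # no indentation
    for val in data:             # first level, inside the if statement
        if val % 2 == 0:         # second level, inside the for loop
            evens.append(val)    # third level, inside the inner if
```

Removing or adding spaces on any of these lines would either raise an IndentationError or change which block the line belongs to.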
Functions
Functions are used to take code that is repeated over and over within a script, or
across scripts, and make formal tools out of them. Using the keyword def, short for
define, functions are created with defined inputs and outputs. The idea
of a function in computing is that it takes data in one state and converts it into data in
another state, without affecting any other part of the script. This can be very valuable
to automate a GIS analysis.
Here is an example of a function that returns the square of any number supplied:
def square(inVal):
    return inVal ** 2
>>> square(3)
9
While this of course duplicates a similar function built into the math module, it
shows the basics of a function. A function (generally) accepts data, transforms it as
needed, and then returns the new state of the data using the return keyword.
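A function closer to GIS work might convert units before a distance is passed to a tool. The function name feetToMeters and the distances below are hypothetical; the sketch only shows a function being defined once and then reused inside a loop:

```python
def feetToMeters(feet):
    """Return the supplied distance in feet converted to meters."""
    return feet * 0.3048

# Hypothetical buffer distances, in feet.
bufferDistances = [100, 250, 500]

metricDistances = []
for distance in bufferDistances:
    metricDistances.append(feetToMeters(distance))
```

The original list is left untouched; the function only returns new values.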
Keywords
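There are a number of keywords built into Python that cannot be used when
naming variables. These include return, def, del, from, not, in, as, if, else,
elif, or, while, and, with, among many others. Using these keywords as variable
names will result in a syntax error. The names of built-in functions and types,
such as max, min, sum, list, and tuple, should also be avoided: reusing them does
not raise an error, but it hides the built-in for the rest of the script.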
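The standard library's keyword module can report whether a name is reserved. This sketch checks two names and then shows, inside a function so that nothing else is affected, what happens when the name of a built-in function is reused. The function name shadowDemo is our own:

```python
import keyword

defIsKeyword = keyword.iskeyword('def')    # True, def is reserved
maxIsKeyword = keyword.iskeyword('max')    # False, max is a built-in function

def shadowDemo():
    max = 5                     # hides the built-in inside this function
    try:
        max([1, 2, 3])          # max is now an integer, so this call fails
        return 'built-in still works'
    except TypeError:
        return 'built-in is hidden'

shadowResult = shadowDemo()
```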
Namespaces
Namespaces are a logical way to organize variable names when a variable inside a
function (a local variable) shares the same name as a variable outside of the function
(a global variable). Local variables contained within a function (either in the script or
within an imported module) and global variables can share a name as long as they
do not share a namespace.
This issue often arises when a variable within an imported module unexpectedly
has the same name as a variable in the script. The Python interpreter will use namespace
rules to decide which variable has been called, which can lead to undesirable results.
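A minimal sketch of two variables sharing a name without interfering with each other (the names are hypothetical): the assignment inside the function creates a local variable, and the variable outside the function keeps its own value:

```python
distance = 100              # a variable outside the function

def bufferedDistance():
    distance = 500          # a separate, local variable with the same name
    return distance

localValue = bufferedDistance()   # 500
outerValue = distance             # still 100
```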
Zero-based indexing
As mentioned in the preceding section that describes lists and tuples, Python
indexing and counting starts at zero, instead of one. This means that the first member
of a group of data is at the zero position, and the second member is at the first
position, and so on till the last position.
This rule also applies when there is a for loop iteration within a script. When the
iteration starts, the first member of the data being iterated is in the zero position.
Also, indexing can be performed when counting from the last member of an iterable
object. In this case, the index of the last member is -1, and the second to last is -2, and
so on back to the first member of the object.
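These indexing rules can be sketched with a short list. The built-in enumerate() function is used here only to expose the position of each member during a for loop:

```python
alist = ['a', 'b', 'c', 'd']

first = alist[0]             # 'a', the zero position
last = alist[-1]             # 'd', counting from the end
secondToLast = alist[-2]     # 'c'

# The last member is also at the position len(alist) - 1.
sameMember = alist[len(alist) - 1] == alist[-1]

# During iteration, counting also starts at zero.
positions = []
for position, member in enumerate(alist):
    positions.append(position)
```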
The ability to control geospatial analyses using ArcPy allows for the integration of
ArcGIS tools into workflows that contain other powerful Python modules. Python's
glue language abilities increase the usefulness of ArcGIS by reducing the need to
treat geospatial data in a special manner.
str: The string function is used to convert any other type of data into a string.
int: The integer function is used to convert a string or float into an integer.
To not create an error, any string passed to the integer function must be a
number such as 1.
float: The float function is used to convert a string or an integer into a float,
much like the integer function.
datetime: The datetime module is used to get information about the date
and time, and convert string dates into Python dates.
math: The math module is used for higher level math functions that are
necessary at times, such as getting a value for Pi or calculating the square
of a number.
csv: The CSV module is used to create and edit comma-separated value
type files.
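Two of these modules are sketched here. The date string and the radius are made-up values; strptime() converts a string date into a Python date using a format pattern, and the math module supplies pi and a power function:

```python
import datetime
import math

# Convert a string date into a Python datetime (hypothetical date).
surveyDate = datetime.datetime.strptime('2015-02-26', '%Y-%m-%d')
surveyYear = surveyDate.year

# Area of a circle with a radius of 10, using pi and the square of 10.
circleArea = math.pi * math.pow(10, 2)
```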
Summary
In this chapter, we discussed the Zen of Python and covered the basics of
programming using Python. We began our exploration of ArcPy and how it can
be integrated with other Python modules to produce complete workflows. We also
discussed the Python standard library and the basic data types of Python.
Next, we will discuss how to configure Python for use with ArcGIS, and explore how
to use Integrated Development Environments (IDEs) to write scripts.