PythonInEarthScience
A Brief Introduction
by
Version 1.0
February, 2017.
This document is a summary of our experiences in learning to use Python over the last
several years. It is not intended to be a standalone document that will help the user
solve every problem. Rather, we hope to encourage new users to delve into a wonderful
programming language.
Contents
3 Input/Output of files
3.1 Read Text File
3.1.1 Plain Text
3.1.2 Comma Separated Text
3.1.3 Unstructured Text
3.2 Save Text File
3.3 Read Binary Data
3.4 Write Binary Data
3.5 Read NetCDF Data
3.6 Write NetCDF Data
3.7 Read MatLab Data
3.8 Read Excel Data
5.5.2 os Module
5.5.3 Errors and Exceptions
1 Installation of Python and Package Management
1.1. Introduction
If you are currently using a recent Mac or Linux operating system, open a terminal and
type,
:∼ $ python
and you should see something like,
Python 2.7.12
Type "help", "copyright", "credits" or "license" for more information.
>>>
You have just entered the native installation of Python on your computer; no extra
steps are needed. This is because Python, though a great tool for earth science and data
analytics, is a general-purpose language that is used by all sorts of programs
and utilities. While it is nice that Python is such an open and widely used tool, one should
take care that this native installation is not modified to the point that other
useful and essential utilities that depend on it are disrupted. For instance, a package or
command may no longer be installed where the operating system originally put it.
For this reason, this chapter outlines how to install a modern version of Python, as
well as many packages useful for data science, in a tidy environment all its own.
already built into a package called NumPy (Numerical Python), which gives you access
to a whole collection of ready-made code. Not only that, there are entire package managers that
will take care of downloading and installing the package, as well as making sure it
plays nicely with all the other packages you are using. All you have to do is tell it which
package!
• Anaconda will handle not only the Python packages, but also non-Python things such
as HDF5 (which allows us to read some data files) and the Math Kernel Library.
It will even manage an R installation.
Keep separate collections of packages in case some don’t work well together.
which one you mean. In this way, by using these tools, you keep everything nice and
tidy.
the installation is complete, the installer will ask "You may wish to edit your .bashrc
or prepend the Anaconda3 install location:", followed by a suggested command that
looks something like,
export PATH=/YOUR/PATH/TO/anaconda3/bin:$PATH
In order to make Anaconda work, you need to add the path to Anaconda to a
variable the operating system uses called $PATH. To do this, you can add a modified
version of this line to a file called .bashrc in your home folder. Simply go to your home
folder, open the file .bashrc with a text editor, and add the following line at the end of
the file,
export PATH=$PATH:/YOUR/PATH/TO/anaconda3/bin
where /YOUR/PATH/TO/anaconda3/bin is the same path that Anaconda suggested
at the end of installation. If you forgot it, it should be something like
/home/YOURNAME/anaconda3/bin
You may notice that we switched the Anaconda path and $PATH around. This is because
we want to add the Anaconda location to the end of $PATH, meaning that the operating
system looks in this folder last instead of first. This ensures that you don’t cause any
problems with the native Python installation.
out for you. So now you have your nice new environment, and you can activate it by
entering
:∼ $ source activate CoursePy
on Mac or Linux and
:∼ $ activate CoursePy
on Windows.
Your command line should now indicate that you are in the CoursePy environment.
If you now open a Python console by typing python in the command line, the
version should be 3.6.0. In this same manner, you can do things like duplicate and
export your environments, or make new environments with different packages or even
different Python versions.
This chapter provides information on the basic data types in Python. It also
introduces the basic operations used to access and manipulate the data.
2.1. Basic Data Types
In Python, there are various types of data. Every piece of data has a type and a value.
Every value has a fixed data type, but the type does not need to be declared beforehand. The most
basic data types in Python are:
1. Boolean: These are data which have only two values: True or False.
Furthermore, these data types can be combined to produce the following types of
datasets:
Out[2]: False
Out[3]: True
2.1.2. Numbers
Python supports both integers and floating point numbers. There is no type declaration
to distinguish them; Python tells them apart by the presence
or absence of a decimal point.
• You can use the type() function to check the type of any value or variable.
In [4]: type(1)
Out[4]: int
Out[5]: float
Out[6]: 2
In [7]: 1+1.
Out[7]: 2.0
✓ Adding an int to a float yields a float. Python coerces the int into a float
to perform the addition, then returns a float as the result.
• An integer can be converted to a float using float(), and a float can be converted to
an integer using int().
In [8]: float(2)
Out[8]: 2.0
Out[9]: 2
Out[10]: 3.0
Numerical Operations
• The / operator performs division. In Python 2, dividing two integers with / gives an
integer (floored) result; in Python 3, / always returns a float.
In [11]: 1/2
Out[11]: 0
In [12]: 1/2.
Out[12]: 0.5
✓ Be careful with float and integer data types, as the result can differ, as
shown above.
Out[13]: 0.0
In [14]: -1.//2
Out[14]: -1.0
• The ‘**’ operator means “raised to the power of”. 11² is 121.
In [15]: 11**2
Out[15]: 121
In [16]: 11**2.
Out[16]: 121.0
✓ Be careful with float and integer data types, as the result can differ, as
shown above.
• The ‘%’ operator gives the remainder after performing integer division.
In [17]: 11%2
Out[17]: 1
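The integer result of 1/2 shown above is Python 2 behaviour. As a minimal sketch, assuming a Python 3 interpreter such as the one in the CoursePy environment, the same operators behave as follows (the variable names are only illustrative):

```python
# Python 3: "/" is always true division, "//" is floor division.
true_div = 1 / 2          # 0.5, even though both operands are integers
floor_div = 1 // 2        # 0, the floor of 0.5
neg_floor = -1.0 // 2     # -1.0, floor division rounds towards minus infinity
power_int = 11 ** 2       # 121, int ** int stays an int
power_float = 11 ** 2.0   # 121.0, a float operand gives a float result
remainder = 11 % 2        # 1, the remainder of the integer division

print(true_div, floor_div, neg_floor, power_int, power_float, remainder)
```

If the same integer result is needed in both Python 2 and Python 3, writing // explicitly avoids the ambiguity.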
Fractions
To start using fractions, import the fractions module. To define a fraction, create a
Fraction object as
In [18]: import fractions
         fractions.Fraction(1, 2)
Out[18]: Fraction(1, 2)
You can perform all the usual mathematical operations with fractions as
In [19]: fractions.Fraction(1, 2)*2
Out[19]: Fraction(1, 1)
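A short sketch of why fractions can be useful: they are stored exactly, whereas floats carry rounding error. The values below are arbitrary illustrations, not taken from the text.

```python
from fractions import Fraction

half = Fraction(1, 2)
third = Fraction(1, 3)

total = half + third        # Fraction(5, 6), computed exactly
doubled = half * 2          # Fraction(1, 1), reduced automatically
as_float = float(total)     # convert to a float only when needed

# Floats accumulate rounding error; fractions do not.
float_sum = 0.1 + 0.2                            # not exactly 0.3
exact_sum = Fraction(1, 10) + Fraction(2, 10)    # exactly 3/10

print(total, doubled, float_sum == 0.3, exact_sum == Fraction(3, 10))
```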
Trigonometry
You can also do basic trigonometry in Python.
In [20]: import math
         math.pi
Out[20]: 3.1415926535897931
Out[21]: 1.0
2.1.3. Strings
In Python, all strings are sequences of Unicode characters. A string is an immutable sequence
and cannot be modified.
• To create a string, enclose it in quotes. Python strings can be defined with either
single quotes (' ') or double quotes (" ").
In [22]: s = 'sujan'
• The built-in len() function returns the length of the string, i.e. the number of
characters.
In [24]: len(s)
Out[24]: 5
• You can get individual characters out of a string using index notation.
In [25]: s[1]
Out[25]: 'u'
2.2. Combined Data Types 14
2.1.4. Bytes
An immutable sequence of numbers between 0 and 255 is called a bytes object. Each
byte within the bytes object can be an ASCII character or an encoded hexadecimal
number from \x00 to \xff (0–255).
• To define a bytes object, use the b'' syntax. This is commonly known as “byte
literal” syntax.
In [27]: by = b'abcd\x65'
         by
✓ \x65 is 'e'.
• Just like strings, bytes objects support the len() function and the + operator for
concatenation. However, you cannot join strings and bytes.
In [28]: len(by)
Out[28]: 5
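The relationship between bytes and strings can be sketched as follows, assuming Python 3, where the two types are strictly separate. Apart from by, the names are illustrative.

```python
by = b'abcd\x65'              # \x65 is the ASCII code for 'e'

text = by.decode('ascii')     # bytes -> str
back = text.encode('ascii')   # str -> bytes

first = by[0]                 # indexing bytes gives an int in Python 3
joined = by + b'fg'           # bytes + bytes works

try:
    by + 'fg'                 # bytes + str is not allowed
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(text, first, joined, mixed_ok)
```

To combine a string with a bytes object, one of the two has to be converted first with encode() or decode().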
2.2.1. Lists
A list is a sequence of data stored in order. It can hold different types
of data (strings, numbers, etc.), and it can be modified to add new data or remove old
data.
Creating a List
To create a list, use square brackets “[ ]” to wrap a comma-separated list of values of
any data type.
In [30]: a_list = ['a', 'b', 'mpilgrim', 'z', 'example', 2]
         a_list
Out[30]: ['a', 'b', 'mpilgrim', 'z', 'example', 2]
✓ All items except the last are strings. The last one is an integer.
In [31]: a_list[0]
Out[32]: str
Out[33]: int
Slicing a List
Once a list has been created, a part of it can be taken as a new list. This is called
slicing the list. A slice can be extracted using indices. Let’s consider the same list as
above:
In [34]: a_list = ['a', 'b', 'mpilgrim', 'z', 'example', 2]
Out[35]: 6
1. ‘+’ operator: The + operator concatenates lists to create a new list. A list
can contain any number of items; there is no size limit.
In [38]: b_list = a_list + ['Hydro', 'Aqua']
         b_list
Out[38]: ['a', 'b', 'mpilgrim', 'z', 'example', 2, 'Hydro', 'Aqua']
2. append(): The append() method adds a single item to the end of the list. Even
if the added item is a list, the whole list is added as a single item in the old list.
In [39]: b_list.append(True)
         b_list
Out[39]: ['a', 'b', 'mpilgrim', 'z', 'example', 2, 'Hydro', 'Aqua', True]
Out[40]: 9
Out[41]: ['a', 'b', 'mpilgrim', 'z', 'example', 2, 'Hydro', 'Aqua', True, ['d', 'e']]
Out[42]: 10
✓ The length of b_list has increased by only one even though two items,
['d', 'e'], were added.
3. extend(): Similar to append(), but each item is added separately. For example,
consider the list
In [43]: b_list = ['a', 'b', 'mpilgrim', 'z', 'example', 2, 'Hydro', 'Aqua', True]
         len(b_list)
Out[43]: 9
Out[44]: ['a', 'b', 'mpilgrim', 'z', 'example', 2, 'Hydro', 'Aqua', True, 'd', 'e']
Out[45]: 11
✓ The length of b_list has increased by two because the two items in the list ['d',
'e'] were added separately.
4. insert(): The insert() method inserts a single item into a list. The first argument
is the index of the first item in the list that will get bumped out of position.
In [46]: b_list = ['a', 'b', 'mpilgrim', 'z', 'example', 2, 'Hydro', 'Aqua', True]
         b_list.insert(0, 'd')
Out[47]: ['d', 'a', 'b', 'mpilgrim', 'z', 'example', 2, 'Hydro', 'Aqua', True]
In [49]: b_list
Out[49]: [['x', 'y'], 'd', 'a', 'b', 'mpilgrim', 'z', 'example', 2, 'Hydro', 'Aqua', True]
✓ The list ['x', 'y'] is added as one item, as in the case of append().
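The difference between the methods above can be seen side by side in a small sketch. The list base and the other names are hypothetical, chosen only to keep the example short.

```python
base = ['a', 'b', 'mpilgrim']

appended = list(base)             # copy, so each method starts from the same list
appended.append(['d', 'e'])       # the whole list becomes one new item

extended = list(base)
extended.extend(['d', 'e'])       # each item is added separately

inserted = list(base)
inserted.insert(0, ['x', 'y'])    # goes to index 0; the other items shift right

combined = base + ['Hydro']       # "+" builds a new list and leaves base unchanged

print(len(appended), len(extended), len(inserted), len(combined), len(base))
```

Note that append(), extend() and insert() change the list in place and return None, whereas + returns a new list.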
Out[51]: 2
Out[52]: True
Out[53]: False
Out[54]: 0
Out[55]: 1
✓ Even though there are two 'b' items, the index of the first 'b' is returned.
• Suppose we want to remove the element 'mpilgrim' from the list. Its index is 2.
In [57]: b_list[2]
Out[58]: ['a', 'b', 'z', 'example', 2, 'Hydro', 'Aqua', 'b']
The pop() method can also remove an item at a specified index. It is even
more versatile, as it can be called without any argument to remove the last item of a
list.
• Suppose we want to remove the element 'mpilgrim' from the list. Its index is 2.
In [60]: b_list[2]
In [65]: b_list
Out[65]: ['a', 'z', 'example', 2, 'Hydro', 'Aqua']
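As a hedged sketch of the removal options described above, the following applies del, remove() and pop() in turn to the list used in this section. The exact sequence of calls is an assumption; the intermediate cells are not shown in the text.

```python
b_list = ['a', 'b', 'mpilgrim', 'z', 'example', 2, 'Hydro', 'Aqua', 'b']

del b_list[2]            # delete by index: removes 'mpilgrim'
b_list.remove('b')       # delete by value: removes only the first 'b'
last = b_list.pop()      # no argument: removes and returns the last item
first = b_list.pop(0)    # with an index: removes and returns that item

print(b_list, last, first)
```

Unlike del and remove(), pop() returns the removed item, which is convenient when the value is still needed.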
2.2.2. Tuples
A tuple is an immutable list. A tuple cannot be changed or modified in any way once it
is created.
• A tuple is defined in the same way as a list, except that the whole set of elements
is enclosed in parentheses instead of square brackets.
• The elements of a tuple have a defined order, just like a list. Tuple indices are
zero-based, just like a list, so the first element of a non-empty tuple is always
t[0].
• Negative indices count from the end of the tuple, just as with a list.
• Slicing works too, just like a list. Note that when you slice a list, you get a new
list; when you slice a tuple, you get a new tuple.
• A tuple is used because operations on a tuple are faster than the same operations on a list.
If you do not need to modify a set of items, a tuple can be used instead of a list.
Creating Tuples
A tuple can be created just like a list, but parentheses “( )” have to be used instead
of square brackets “[ ]”. For example,
In [66]: a_tuple = ('a', 'b', 'mpilgrim', 'z', 'example', 2, 'Hydro', 'Aqua', 'b')
Tuple Operations
All the list operations except the ones that modify the list itself can be used for tuples
too. For example, you cannot use append(), extend(), insert(), del, remove(), and pop() for
tuples. For other operations, please follow the same steps as explained in the previous
section. Here are some examples of tuple operations.
Out[68]: 3
✓ Item 'z' is at index 3, i.e., it is the fourth element of the tuple.
In [69]: b_tuple = a_tuple[0:4]
In [70]: b_tuple
Out[70]: ('a', 'b', 'mpilgrim', 'z')
✓ A new tuple can be created by slicing a tuple; the original tuple does not
change.
In [71]: a_tuple
Out[71]: ('a', 'b', 'mpilgrim', 'z', 'example', 2, 'Hydro', 'Aqua', 'b')
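A minimal sketch of what immutability means in practice: reading operations work as for lists, while any attempt to modify the tuple raises a TypeError. The names position, count_b, modified and single are illustrative.

```python
a_tuple = ('a', 'b', 'mpilgrim', 'z', 'example', 2, 'Hydro', 'Aqua', 'b')

position = a_tuple.index('z')   # index of the first 'z'
count_b = a_tuple.count('b')    # number of times 'b' occurs
b_tuple = a_tuple[0:4]          # slicing returns a new tuple

try:
    a_tuple[0] = 'changed'      # item assignment is not supported
    modified = True
except TypeError:
    modified = False

single = ('only',)              # a one-item tuple needs a trailing comma

print(position, count_b, b_tuple, modified, type(single).__name__)
```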
2.2.3. Sets
A set is an unordered collection of unique values. A single set can contain values of
any data type.
Creating Set
There are basically two ways of creating a set.
1. From scratch: Sets can be created like lists, but curly brackets “{}” have to be
used instead of square brackets “[ ]”. For example,
In [72]: a_set = {'a', 'b', 'mpilgrim', 'z', 'example', 2, 'Hydro', 'Aqua', 'b'}
Out[73]: set
In [74]: a_set
Out[74]: {2, 'Aqua', 'Hydro', 'a', 'b', 'example', 'mpilgrim', 'z'}
✓ The items in the set appear in a different order than the values given inside {} because
a set is unordered and the original order is ignored. Also, there is only one 'b' in the set
even though two were given, because a set is a collection of unique values.
Duplicate values are taken as one.
2. From list or tuple: A set can be created from a list or tuple as,
In [75]: set(a_list)
         set(a_tuple)
Modifying Set
A set can be modified by adding an item or another set to it. Items can also be
removed from a set.
Adding Elements
• Consider a set as follows,
In [76]: a_set = {2, 'Aqua', 'Hydro', 'a', 'b', 'example', 'mpilgrim', 'z'}
In [78]: a_set
Out[78]: {2, 'Aqua', 'Hydro', 'a', 'b', 'c', 'example', 'mpilgrim', 'z'}
In [80]: a_set
Out[80]: {2, 'Aqua', 'Hydro', 'Koirala', 'Sujan', 'a', 'b', 'c', 'example', 'mpilgrim', 'z'}
Removing Elements
• Consider a set as follows,
In [81]: a_set = {2, 'Aqua', 'Hydro', 'a', 'b', 'example', 'mpilgrim', 'z'}
• Using remove() and discard(): These are used to remove an item from a set.
In [82]: a_set.remove('b')
In [83]: a_set
Out[83]: {2, 'Aqua', 'Hydro', 'Koirala', 'Sujan', 'a', 'c', 'example', 'mpilgrim', 'z'}
In [85]: a_set
Out[85]: {2, 'Aqua', 'Koirala', 'Sujan', 'a', 'c', 'example', 'mpilgrim', 'z'}
• Using pop() and clear(): pop() works as for a list, except that it does not remove the last
item. Since a set is unordered, pop() removes an arbitrary item. clear() is used to remove
all items, leaving an empty set.
In [86]: a_set.pop()
In [87]: a_set
Out[87]: {2, 'Koirala', 'Sujan', 'a', 'c', 'example', 'mpilgrim', 'z'}
Set Operations
Two sets can be combined, or the elements common to two sets can be collected, to form
a new set. These functions are also useful for combining two or more lists.
In [90]: c_set
Out[90]: {1, 2, 195, 4, 5, 6, 8, 12, 76, 15, 17, 18, 3, 21, 30, 51, 9, 127}
• Intersection: Can be used to create a set with the elements common to two sets.
In [91]: d_set = a_set.intersection(b_set)
In [92]: d_set
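The set operations can be sketched as below. The contents of a_set and b_set are assumptions, chosen so that their union reproduces the c_set shown above; the text does not show the original definitions.

```python
# Assumed inputs: chosen so that the union matches the c_set shown in the text.
a_set = {2, 4, 5, 9, 12, 21, 30, 51, 76, 127, 195}
b_set = {1, 2, 3, 5, 6, 8, 9, 12, 15, 17, 18, 21}

c_set = a_set.union(b_set)           # every element found in either set
d_set = a_set.intersection(b_set)    # only the elements found in both sets
only_a = a_set.difference(b_set)     # elements of a_set that are not in b_set

# Merging two lists without duplicates by going through sets.
merged = sorted(set([1, 2, 3]) | set([3, 4]))

print(sorted(c_set), sorted(d_set), sorted(only_a), merged)
```

The operators | and & are shorthand for union() and intersection().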
2.2.4. Dictionaries
A dictionary is an unordered set of key-value pairs. A value can be retrieved for a
known key, but not the other way around.
Creating Dictionary
Creating a dictionary is similar to creating a set in that curly brackets “{}” are used, but key:value pairs
In [94]: a_dict
Out[94]: {'Aqua': '192.168.1.154', 'Hydro': '131.112.42.40'
Modifying Dictionary
Since the size of a dictionary is not fixed, new key:value pairs can be freely added to
it. The value for an existing key can also be modified.
In [98]: a_dict
Out[98]: {'Aqua': '192.168.1.154', 'Hydro': '131.112.42.40'
In [100]: a_dict
Out[100]: {'Aqua': '192.168.1.154', 'Hydro': '131.112.42.40', 'Lab': 'Kanae'}
• Dictionary values can also be lists instead of single values. For example,
In [101]: k_lab = {'Female': ['Yoshikawa', 'Imada', 'Yamada', 'Sasaki', 'Watanabe', 'Sato'],
                   'Male': ['Sujan', 'Iseri', 'Hagiwara', 'Shiraha', 'Ishida', 'Kusuhara', 'Hirochi', 'Endo']}
Out[102]: ['Yoshikawa', 'Imada', 'Yamada', 'Sasaki', 'Watanabe', 'Sato']
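A short sketch of creating, extending and querying a dictionary. The replacement address '127.0.0.1' and the key 'Terra' are hypothetical values used only for illustration.

```python
a_dict = {'Hydro': '131.112.42.40', 'Aqua': '192.168.1.154'}

a_dict['Lab'] = 'Kanae'            # a new key adds a new key:value pair
a_dict['Hydro'] = '127.0.0.1'      # an existing key gets its value replaced

value = a_dict['Aqua']                     # look up a value by its key
missing = a_dict.get('Terra', 'unknown')   # get() avoids a KeyError for absent keys

# There is no direct value -> key lookup; all items have to be searched.
keys_for_value = [k for k, v in a_dict.items() if v == 'Kanae']

print(sorted(a_dict), value, missing, keys_for_value)
```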
2.2.5. Arrays
Arrays are similar to lists, but they contain homogeneous data, i.e., data of the same type
only. Arrays are commonly used to store numbers and are hence used in mathematical
calculations.
Creating Arrays
Python arrays can be created in many ways. They can also be read from a data file in
text or binary format, as explained in later chapters of this guide. Here, some
commonly used methods are explained. For a detailed tutorial on python arrays, refer
here.
✓ The list has mixed data types. The first two items are strings and the last two are
numbers.
In [104]: b_array = array(b_list)
array(['a', 'b', '1', '2'], dtype='|S8')
✓ Since the first two elements are strings, the numbers are also converted to strings
when the array is created.
In [105]: b_list2 = [1, 2, 3, 4]
In [107]: b_array2
(b) From arange(number): Creates an array from a range of values. Examples
are provided below. For details of arange, see chapter 4.
In [109]: yy = arange(2, 5, 1)
In [110]: yy
✓ Creates an array from the lower value (2) to the upper value (5) in the specified
interval (1), excluding the last value (5).
In [111]: yy = arange(5)
In [112]: yy
✓ If the lower value and interval are not specified, they are taken as 0
and 1, respectively.
In [113]: yy = arange(5, 2, -1)
In [114]: yy
✓ will create an array with 20 blocks, each block having 20 rows
and 20 columns (total 20*20*20=8000 elements), with all elements set to zero.
Array Operations
Arithmetic operators on arrays apply elementwise. A new array is created and filled
with the result.
In [118]: a = array([20, 30, 40, 50])
          b = arange(4)
In [119]: b
In [120]: c = a-b
In [121]: c
Read and write data from/to files in commonly used data formats, such as
text (csv), binary, excel, netCDF, R data frame and Matlab.
This chapter explains the method to read and write data from/to commonly used
data formats, such as text (csv), binary, excel, netCDF, and Matlab.
Reads the data in the text file as an 'array'. Will raise an error if the data is
non-numeric (neither float nor integer).
In [129]: type(a)
Out[129]: file
✓ readlines() reads the contents (each line) of the file object 'a' and puts them in a
list, a_list.
✓ read() reads the contents of the file object 'a' and stores them as a string.
ASCII files contain special characters, such as line endings. These characters need to be removed
from each line/item of the data read with read or readlines.
• Drop the '\n' or '\r\n' sign at the end of each line:
• type code: can be defined as a type code (e.g., 'f') or a Python type (e.g., 'float') as
shown in Table 3.1. It determines the size and byte-order of items in the binary
file.
In [136]: dat = fromfile('example_binary.float32', 'f')
          dat
In [142]: a.keys()
Out[143]: [u'SimpBM', u'SimpBM2L', u'SimpBMtH', u'SimpGWoneTfC', u'SimpGWvD']
In [144]: dat = a['Results/SimpGWvD/Default/ModelOutput/actET'][:]
          dat = a['Results']['SimpGWvD']['Default']['ModelOutput']['actET'][:]
4.1. Size and Shape
Out[147]: 10
Out[148]: (10,)
✓ for a list.
In [149]: array(a).shape
Out[149]: (10,)
✓ for an array.
✓ can be used for both arrays and lists. A list is converted to an array by using this
function.
In [153]: b = a.reshape(-1, 5)
✓ By using the ‘-1’ flag, the first dimension is automatically set to match the
total size of the array. For example, if there are 10 elements in an array/list and 5
columns are specified during reshape, the number of rows is automatically calculated
as 2. The shape will be (2,5).
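The effect of the ‘-1’ flag can be checked with a small sketch. It assumes NumPy is installed and imported as np, whereas the cells in this guide call array functions without a prefix.

```python
import numpy as np

a = np.arange(10)         # 10 elements, shape (10,)

b = a.reshape(-1, 5)      # number of rows inferred: 10 / 5 = 2
c = a.reshape(5, -1)      # number of columns inferred: 10 / 5 = 2

try:
    a.reshape(-1, 3)      # 10 elements cannot be split into rows of 3
    ok = True
except ValueError:
    ok = False

print(a.shape, b.shape, c.shape, ok)
```

Only one dimension can be given as -1, and the total number of elements must be divisible by the product of the other dimensions.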
Index Basics
Indexing is done in two ways:
1. Positive Index: The counting order is from left to right. The index for the first
element is 0 (not 1).
4.2. Slicing and Dicing 39
In [158]: a[0]
Out[158]: 1
In [159]: a[1]
Out[159]: 2
In [160]: a[4]
Out[160]: 5
2. Negative Index: The counting order is from right to left. The index for the last
item is -1. In some cases, the list is very long and it is much easier to count
from the end rather than the beginning.
In [161]: a[-1]
Out[161]: 5
Out[162]: 4
Data Extraction
Data extraction is carried out by using indices. In this section, some examples of using
indices are provided. Details of array indexing and slicing can be found here.
In [164]: a[0:2]
Out[164]: [1, 2]
In [165]: a[3:4]
Out[165]: [4]
Out[166]: [1, 2]
✓ same as a[0:2].
In [167]: a[2:]
Out[167]: [3, 4, 5]
✓ same as a[2:5].
3. Consider a 2-D list and a 2-D array. The method differs between arrays and lists, because
indexing works differently in the two cases, as explained below.
In [168]: a_list = [[1, 2, 3], [4, 5, 6]]
          a_array = array([[1, 2, 3], [4, 5, 6]])
Out[169]: (2, 3)
Out[170]: (2, 3)
Out[171]: [1, 2, 3]
✓ which is a list.
In [172]: a_array[0]
✓ which is an array.
Out[173]: 2
Out[174]: [4, 5]
✓ The index has to be provided in two different sets of square brackets “[ ]”.
Out[175]: 2
Out[176]: [4, 5]
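The indexing and slicing rules above can be tried out on plain lists with the following sketch. The variable names other than a and a_list are illustrative, and the values assume the five-element list a = [1, 2, 3, 4, 5] implied by the outputs above.

```python
a = [1, 2, 3, 4, 5]

first_two = a[0:2]        # the stop index is excluded
also_first_two = a[:2]    # an omitted start means 0
one_item_slice = a[3:4]   # a slice is always a list, even with one item
one_item = a[3]           # plain indexing returns the item itself
tail = a[2:]              # an omitted stop means "to the end"
last = a[-1]              # negative indices count from the end

# Nested (2-D) list: one pair of square brackets per level.
a_list = [[1, 2, 3], [4, 5, 6]]
item = a_list[0][1]       # row 0, then element 1
part = a_list[1][0:2]     # row 1, then its first two elements

print(first_two, one_item_slice, one_item, tail, last, item, part)
```

NumPy arrays additionally accept both indices inside a single pair of brackets, for example a_array[1, 0:2].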
Out[179]: 6
Out[181]: 6
Out[183]: 100
In [184]: A.max()
Out[184]: 5
Out[185]: -5
In [186]: A.min()
Out[186]: -5
4.3. Built-in Mathematical Functions 43
3. mean(iterable): Returns the average of the array elements. The average is taken
over the flattened array by default, otherwise over the specified axis. For details,
click here.
In [187]: mean([0, 10, 15, 30, 100, -5])
Out[187]: 25.0
In [188]: A.mean()
Out[188]: 0.0
Out[189]: 12.5
In [190]: median(A)
Out[190]: 0.0
5. sum(iterable): Returns the sum of the array elements. If an axis is specified, it returns
the sum over that axis; otherwise it returns the sum of all elements. For details,
click here.
In [191]: sum([1, 2, 3, 4])
Out[191]: 10
In [192]: A.sum()
Out[192]: 0
In [194]: abs(B)
7. divmod(x,y): Returns the quotient and remainder resulting from dividing the
first argument (some number x or an array) by the second (some number y or
an array).
In [195]: divmod(2, 3)
Out[195]: (0, 2)
Out[196]: (2, 0)
Out[198]: 1
10. round(x,n): Returns the floating point value of x rounded to n digits after the
decimal point.
In [200]: round(2.675, 2)
Out[200]: 2.67
11. around(A,n): Returns the floating point array A rounded to n digits after the
decimal point.
In [201]: around(C, 2)
• If step (z) is negative, the last element is the ‘start (x) + i * step (z)’ just
greater than ‘y’.
Out[202]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Out[203]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Out[205]: [0, -1, -2, -3, -4]
Out[206]: []
14. zip(A,B): Returns a list of tuples, where each tuple contains the i-th element
of each argument sequence. The returned list is truncated to the length of the shortest
sequence. For a single sequence argument, it returns a list of 1-tuples. With
no arguments, it returns an empty list.
In [212]: zip(A, B)
Out[212]: [(array([-2, 2]), array([2, 2])), (array([-5, 5]), array([5, 5]))]
4.4. Matrix operations 47
In [214]: D.sort()
In [215]: D
16. ravel(): Returns a flattened array. A 2-D array is converted to a 1-D array.
In [216]: A.ravel()
17. transpose(): Returns the transpose of an array (matrix) by permuting the
dimensions.
In [217]: A.transpose()
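A few of the built-in functions listed in this section can be explored with the standard library alone. The sketch below assumes Python 3, where zip() returns an iterator rather than a list; the example values are illustrative.

```python
from decimal import Decimal

quotient, remainder = divmod(7, 2)     # 7 = 3 * 2 + 1

# round(2.675, 2) gives 2.67 rather than 2.68 because the float nearest to
# 2.675 is slightly smaller than 2.675.
rounded = round(2.675, 2)
stored = Decimal(2.675)                # the exact value held in memory

# In Python 3, zip() is lazy, so wrap it in list() to see the pairs.
pairs = list(zip([1, 2, 3], ['a', 'b']))   # truncated to the shorter input
singles = list(zip([1, 2]))                # one sequence gives 1-tuples

print(quotient, remainder, rounded, stored < Decimal('2.675'), pairs, singles)
```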
1. Dot product:
In [219]: a = rand(3, 3)
          b = rand(3, 3)
          dot_p = dot(a, b)
2. Cross product:
In [220]: a = rand(3, 3)
          b = rand(3, 3)
          cro_p = cross(a, b)
3. Matrix multiplication:
In [221]: a = rand(2, 3)
          b = rand(3, 2)
          mult_ab = matmul(a, b)
Out[222]: (2, 2)
1. split(): Splits a string. It takes a delimiter as its argument and
splits the string into a list of strings based on that delimiter.
In [224]: s.split()
Out[225]: ['suj', 'n koir', 'l', '']
2. lower() and upper(): Change the string to lower case and upper case,
respectively.
4.6. Other Useful Functions 49
In [227]: s.lower()
In [228]: s.upper()
Out[229]: 3
4. Replace a substring:
In [230]: s2 = s.replace("Su", "Tsu")
5. List to String:
In [231]: a_list = ['a', 'b', 'c']
          a_str = " and ".join(str(x) for x in a_list)
          a_str
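The string methods above can be tried together in one sketch. The value of s is an assumption ('Sujan Koirala'), picked because it is consistent with the outputs shown in this section; the original definition is not shown.

```python
s = 'Sujan Koirala'                 # assumed value of s

words = s.split()                   # no argument: split on whitespace
pieces = s.lower().split('a')       # split on the delimiter 'a'
upper = s.upper()
count_a = s.count('a')              # number of occurrences of 'a'
s2 = s.replace('Su', 'Tsu')         # returns a new string; s is unchanged

a_list = ['a', 'b', 'c']
a_str = ' and '.join(str(x) for x in a_list)

print(words, pieces, upper, count_a, s2, a_str)
```

Because strings are immutable, every one of these methods returns a new object and leaves s as it was.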
2. tolist(): Converts the array to an ordinary list with the same items.
In [233]: A.tolist()
Out[233]: [[-2, 2], [-5, 5]]
3. byteswap(): Swaps the bytes in an array and returns the byteswapped array. If
the first argument is 'True', it byteswaps all items of the array in-place and returns
them. Supported byte sizes are 1, 2, 4, or 8. It is useful when reading data from
a file written on a machine with a different byte order. For details on machine
dependency, refer this. To convert data from big endian to little endian or vice-versa,
add byteswap() in the same line where ‘fromfile’ is used. If your data is made
by big endian.
5 Essential Python Scripting
5.1. Control Flow Tools
5.1.1. if Statement
The if statement is used to test a condition, which can have True or False values. An
example if block is:
In [234]: if x < 0:
              print x, 'is a negative number'
          elif x > 0:
              print x, 'is a positive number'
          else:
              print 'x is zero'
✓ An if block can have zero or more elif clauses, and the else clause is optional.
The if statement can also be used to check whether a value exists within an iterable such as a list,
tuple, array or string.
In [235]: a_list = ['a', 'd', 'v', 2, 4]
          if 'd' in a_list:
              print a_list.index('d')
Out[235]: 1
In [238]: words = {1: 'cat', 2: 'window', 3: 'defenestrate'}
          for _wor in words.items():
              print _wor, len(_wor)
The break statement breaks out of the smallest enclosing for or while loop. The
continue statement continues with the next iteration of the same loop.
In [240]: for n in range(2, 10):
              for x in range(2, n):
                  if n % x == 0:
                      print n, 'equals', x, '*', n/x
                      break
              else:
                  # loop fell through without finding a factor
                  print n, 'is a prime number'
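The same loop can be written for Python 3, collecting the results instead of printing them so that they are easy to inspect. This is a sketch; the containers primes, factorisations and odds are illustrative names. A second loop shows continue.

```python
primes = []
factorisations = {}

for n in range(2, 10):
    for x in range(2, n):
        if n % x == 0:
            factorisations[n] = (x, n // x)   # first factor found
            break                             # leave the inner loop early
    else:
        # The else clause of a for loop runs only if no break occurred.
        primes.append(n)

odds = []
for n in range(2, 10):
    if n % 2 == 0:
        continue                              # skip the rest of this iteration
    odds.append(n)

print(primes, factorisations, odds)
```

Note that the else here belongs to the inner for loop, not to the if statement, which is why its indentation matters.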
5.1.5. Range
As shown in previous chapters and examples, range is used to generate a sequence of numbers
from start to end at an interval step. In Python 2, range builds the whole list object in memory,
whereas in Python 3 it returns a lazy range object that produces the numbers on demand
instead of storing them all.
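A brief sketch of the Python 3 behaviour described above: a range object has to be wrapped in list() to see its values, and even a very large range costs almost no memory. The variable names are illustrative.

```python
r = range(0, 10)                      # a lazy range object, not a list
as_list = list(r)                     # materialise the values when needed

countdown = list(range(0, -5, -1))    # negative step counts downwards
empty = list(range(5, 2))             # a positive step cannot go from 5 down to 2

big = range(10**9)                    # created instantly; no billion-item list is built
length = len(big)
member = (10**9 - 1) in big           # membership is computed, not searched

print(as_list, countdown, empty, length, member)
```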
Out[243]: 6
          NoneType
• return: Returns the result of the function. In the above example, the returned value
is None, an object of type NoneType.
If the return statement includes a value, the result is passed on to the
statement that calls the function. Default values can also be set for the
parameters. If the function is called without any arguments, the default values are used for
the calculation. Here is an example.
In [244]: def funcname(param1=2, param2=3):
              prod = param1 * param2
              return prod
In [245]: funcname()
Out[245]: 6
Out[246]: 12
Out[247]: int
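The difference between printing and returning a value can be sketched as follows. funcname follows the definition above; no_return is a hypothetical counterpart added for contrast, and the argument values are only illustrative.

```python
def funcname(param1=2, param2=3):
    """Return the product of the two parameters."""
    prod = param1 * param2
    return prod


def no_return(param1=2, param2=3):
    """Hypothetical variant that prints the product but returns nothing."""
    prod = param1 * param2
    print(prod)


default_result = funcname()          # both defaults are used
positional = funcname(4)             # param1=4, param2 keeps its default
keyword = funcname(param2=10)        # param1 keeps its default
nothing = no_return()                # prints 6, but the result is None

print(default_result, positional, keyword, type(nothing).__name__)
```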
26 return hcf
• After a file which includes a function has been created and saved, the function can be
used as a module in an interactive shell started in the same directory, or in other files in the
same directory.
✓ If the saved filename is samp_func.py, the function can be called from
another program file in the same directory.
In [249]: import samp_func as sf
          print sf.lcommon(3, 29)
Out[249]: 87.0
✓ If the program is run, you get the number 87, which is the least
common multiple of 3 and 29.
5.4. Python Classes 57
The module file can also be run as a standalone program if the following block of
code is added to the end.
In [250]: if __name__ == "__main__":
              import sys
              lcommon(int(sys.argv[1]), int(sys.argv[2]))
              computeHCF(int(sys.argv[1]), int(sys.argv[2]))
Also, the variables defined in the module file can be accessed as long as they are not
defined within functions of the module.
In [251]: somevariable = ['1', 2, 4]
A list of all the objects in the module can be obtained by using dir() as
In [253]: dir(samp_func)
    def distance_from_origin(self):
        return math.sqrt(self.x**2 + self.y**2)
• self is the object that is created when the class is called.
• __init__ initializes the newly created object self and assigns the attributes x and y to it.
Out[255]: 1
In [256]: p1.distance_from_origin()
Out[256]: 4.123105625617661
In [257]: p2.distance_from_origin()
Out[257]: 3.605551275463989
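Since only part of the class listing survives above, here is a minimal self-contained sketch of such a class. The class name Point and the coordinates of p1 and p2 are assumptions, chosen so that the distances match the outputs shown.

```python
import math


class Point:
    """A point in the plane (the class name is an assumption)."""

    def __init__(self, x, y):
        # __init__ receives the new object as self and stores the attributes.
        self.x = x
        self.y = y

    def distance_from_origin(self):
        return math.sqrt(self.x**2 + self.y**2)


# Coordinates chosen to reproduce the distances shown in the text.
p1 = Point(1, 4)
p2 = Point(2, 3)

print(p1.x, p1.distance_from_origin(), p2.distance_from_origin())
```

Each call such as Point(1, 4) creates a separate object, so p1 and p2 hold their own x and y values.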
Some simple and easy to understand examples of classes are provided in:
• http://www.jesshamrick.com/2011/05/18/an-introduction-to-classes-and-inheritance-in-python/
• https://jeffknupp.com/blog/2014/06/18/improve-your-python-python-classes-and-object-oriented-programming/
• argv: Probably the most useful of the sys attributes. sys.argv is a list containing
the arguments given when running a Python script from the command line. The first
element is always the program name. Any number of arguments can be passed
into the program as strings, e.g., sys.argv[1] is the first argument after the program name
(the second element of the list), and so on.
5.5. Additional Relevant Modules 59
• path: The default path that Python searches is stored in sys.path. If you have
written modules and classes, and want to access them from anywhere, you can add
their directory to sys.path as,
In [258]: sys.path.append('path to your directory')
5.5.2. os Module
This module provides a unified interface to a number of operating system functions.
There are lots of useful functions for process management and file object creation in
this module. Among them, the functions for manipulating files and directories are
especially useful; they are briefly introduced below. For details on the ‘os module’, click
here.
Before using file and directory commands, it is necessary to import the os module as,
In [259]: import os
          os.getcwd()
✓ same as pwd in UNIX. pwd stands for print working directory and displays the
absolute path to the current directory.
In [260]: os.mkdir('dirname')
✓ A very useful os function that checks if a file exists. Returns True if it exists
and False if not.
If you want to know more about these functions, follow this.
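A small sketch that exercises these os functions without leaving anything behind. It works inside a temporary directory created with the standard tempfile module, which is an addition for safety and not part of the text above.

```python
import os
import tempfile

start = os.getcwd()                              # absolute path of the current directory

with tempfile.TemporaryDirectory() as tmp:
    new_dir = os.path.join(tmp, 'dirname')       # build the path portably
    existed_before = os.path.exists(new_dir)     # nothing there yet
    os.mkdir(new_dir)                            # create the directory
    existed_after = os.path.exists(new_dir)
    contents = os.listdir(tmp)                   # names inside the temporary folder

# The temporary directory and everything in it are removed on exit.
exists_after_cleanup = os.path.exists(new_dir)

print(start, existed_before, existed_after, contents, exists_after_cleanup)
```

Checking os.path.exists() before os.mkdir() is a common pattern, since mkdir raises an error if the directory is already there.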
6.1. Quick overview
This exercise will require the following packages (all should be available via "conda
install..."):
• numpy
• scipy
• pandas
• scikit-learn
• statsmodels
So to build our dictionary, we can start with an empty dictionary (remember "{}")
called "df". Then we can loop through our IncludedVars and use each item in the list
as a key for df, and pair each key with a numpy array from the netCDF:
    df[var] = np.array(ncdf[var]).flatten()[48*365:]
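The loop described above might look like the following sketch. The real ncdf object comes from the netCDF file, so here a plain dictionary of made-up numpy arrays stands in for it, and only one day is skipped instead of 48*365 steps to keep the example small. The variable skip is a name introduced for this illustration.

```python
import numpy as np

# Stand-in for the netCDF file: three days of data at 48 steps per day,
# with dimensions (time, lat, lon) for a single site. Values are made up.
n_time = 48 * 3
ncdf = {
    "LE": np.arange(n_time, dtype=float).reshape(n_time, 1, 1),
    "Tair_f": np.full((n_time, 1, 1), 15.0),
}
IncludedVars = ["LE", "Tair_f"]
skip = 48 * 1   # skip the first day here; the text skips 48*365 steps

df = {}   # start with an empty dictionary
for var in IncludedVars:
    # flatten() collapses (time, 1, 1) to (time,), then we slice.
    df[var] = np.array(ncdf[var]).flatten()[skip:]

print(df["LE"].shape)
```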
You may notice two things: first is that we not only turn our netCDF variable
into a numpy array, but we also call "flatten". This is because the netCDF has three
dimensions (time, lat, lon), but as this is only one site, the lat and lon dimensions
don’t change, so we can just flatten the array to one dimension. Second is that we are
already slicing the data from 48*365 onwards. This is because the first year is only a
partial year, so we have gaps not only in the fluxes but in all the data, which
will mess us up a bit. Thankfully for you, I have been through this dataset and can tell
you to skip the first year. Now, this netCDF is fairly well annotated, so if you would
like more information on a variable, simply ask:
    ncdf[var]
Some highlights are that we will be trying to gap-fill the "LE" variable (Latent
Energy, a measure of the water flux), which we can compare to the professionally filled
version "LE_f".
As this is a regression problem, we need to get things into an "X" vs "Y" format.
For the X variables we will use the following:
    XvarList = ['Tair_f', 'Rg_f', 'VPD_f', 'year', 'month', 'day', 'hour']
With our list, we can then create a 2 dimensional array in the form number-of-
samples by number-of-features. We can do this by first creating a list of the arrays,
then calling np.array and transposing. If we want to be fancy, we can do this in one
line as:
    X = np.array([df[var] for var in XvarList]).T
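To see why the transpose is needed, here is a small sketch with made-up columns standing in for the entries of df. The intermediate name stacked is introduced only for this illustration.

```python
import numpy as np

# Made-up columns standing in for the entries of df.
df = {
    "Tair_f": np.array([10.0, 11.0, 12.0, 13.0]),
    "Rg_f": np.array([0.0, 50.0, 200.0, 100.0]),
    "hour": np.array([0.0, 0.5, 1.0, 1.5]),
}
XvarList = ["Tair_f", "Rg_f", "hour"]

stacked = np.array([df[var] for var in XvarList])  # one row per variable
X = stacked.T                                      # one row per sample

print(stacked.shape, X.shape)
```

The list of arrays stacks into features-by-samples, so the transpose is what gives the samples-by-features layout the regressions expect.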
and like magic we are all ready to go with the Xvars. The Y variable is also easy,
it is just equal to LE, which if we remember is stored in our dictionary as df["LE"].
However, we will do a little trick that will seem a bit silly, but will make sense later.
Let’s first store our Y variable name as a string, then set Y as:
    yvarname = "LE"
    Y = df[yvarname]
I promise this will come in handy. One final task is to figure out where the gaps
are, but we will come to that in the next section, which is...
The only package to import will be numpy. After the import, we can make a very simple
function called "GetMask" that will find our gaps for us. As we extracted the data
from the netCDF, all gaps are given the value -9999, so our function will simply return
a boolean array where all gapped values are True. I tend to be a bit cautious, so I
usually look for things such as:
    mask = (Y < -9000)
but you could easily say (Y==-9999). Don’t forget to return our mask at the end
of the function!
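Putting the pieces of this description together, "GetMask" could look roughly like the sketch below. The threshold argument with its default of -9000 is an assumption added for illustration; the text only fixes the comparison itself.

```python
import numpy as np


def GetMask(Y, threshold=-9000):
    """Return a boolean array that is True wherever Y is a gap.

    Gaps are stored as -9999, so anything below the (assumed)
    threshold of -9000 is treated as missing.
    """
    mask = (Y < threshold)
    return mask


Y = np.array([120.5, -9999.0, 87.2, -9999.0, 3.1])
mask = GetMask(Y)
print(mask)
```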
Now, so we don’t forget, we can go ahead and use this function in our "Calc.py"
file right away. First we need to tell "Calc.py" where to find "GetMask", so in
"Calc.py" we simply
    import Regs
Easy as that! Now, we will want to keep everything tidy, so go ahead and also save
our mask into our dictionary (df) as something like "GapMask".
Now, let’s go back to "Regs.py" and make a second function. This function will
take all the machine learning algorithms that we will use from the SKLearn package
and gap fill our dataset, so let’s call it "GapFillerSKLearn" and it will take four input
variables: X, Y, GapMask, and model. As this function will be a bit abstract, let’s add
some documentation, which will be a string right after we define the function. I have
made an example documentation for our function here:
6.3. Setting up the gapfillers 66
        Parameters
        ----------
        X : numpy array
            Predictor variables
        Y : numpy array
            Training set
        GapMask : numpy boolean array
            array indicating where gaps are with True

        Returns
        -------
        Y_hat
            Gap filled Y as numpy array
        """
Now that the function is documented, we will never forget what this function does.
So we can now move on to the actual function. The reason we can write this
function is that the SKLearn module organizes all of its regressions in the same
way, so the object we pass in will be called "model" whether it is a random forest or a
neural net. In all cases we fit the model as:
    model.fit(X[~GapMask], Y[~GapMask])
where we are fitting only where we don’t have gaps. In this case the tilde (~) inverts
the boolean array, making all Trues False and all Falses True, which in our case now
gives True to all indices where we have original data. Next we can build our Y_hat
variable as an array of -9999 values by first creating an array of zeros and subtracting
9999. This way, if we mess up somewhere, we can see the final values as a -9999.
Now, we can fill the gaps by making a prediction of the model with the Xvars as
    Y_hat[GapMask] = model.predict(X[GapMask])
where we are no longer using the tilde (~) because we want the gap indices. We can
return our Y_hat at the end of our function and move back to our "Calc.py" file.
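Assembled from the steps above, the whole function might look like this sketch. To keep it runnable without SKLearn, a tiny hypothetical class called MeanModel stands in for a regressor; it only mimics the fit/predict interface and always predicts the training mean.

```python
import numpy as np


def GapFillerSKLearn(X, Y, GapMask, model):
    """Fit model where Y is observed and predict Y inside the gaps."""
    model.fit(X[~GapMask], Y[~GapMask])       # train on non-gap samples
    Y_hat = np.zeros(Y.shape) - 9999          # sentinel in every slot
    Y_hat[GapMask] = model.predict(X[GapMask])
    return Y_hat


class MeanModel:
    """Hypothetical stand-in with the fit/predict interface of SKLearn."""

    def fit(self, X, Y):
        self.mean_ = Y.mean()
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)


X = np.arange(10, dtype=float).reshape(5, 2)
Y = np.array([1.0, -9999.0, 3.0, -9999.0, 5.0])
GapMask = Y < -9000
Y_hat = GapFillerSKLearn(X, Y, GapMask, MeanModel())
print(Y_hat)
```

Note that, following the description, only the gap positions receive predictions; everywhere else Y_hat keeps the -9999 sentinel.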
6.4. Actually gapfilling 67
and likewise for the MLPRegressor (just remember to change the df key!). Note
that there are many, many options for both RandomForestRegressor and MLPRegressor
that should likely be changed, but as this is a quick overview, we will just use the
defaults. If you were to add the options, such as increasing to 50 trees in the random
forest, it would look like this
    df[yvarname + '_RF'] = Regs.GapFillerSKLearn(X, Y, mask,
                                                 RandomForestRegressor(n_estimators=50))
Unfortunately we cannot use the same function for the linear model, as statsmodels
uses a slightly different syntax (note that SKLearn also has an implementation for linear
models, but it’s good to be well rounded). The statsmodels portion will look strikingly
similar to our "GapFillerSKLearn" function, but with some key differences:
    X_ols = sm.add_constant(X)
    df[yvarname + '_OLS'] = Y.copy()   # copy, so filling the gaps leaves Y untouched
    model = sm.OLS(Y[~mask], X_ols[~mask])
    results = model.fit()
    df[yvarname + '_OLS'][mask] = results.predict(X_ols[mask])
Basically, we have to add another column to our array that acts as the intercept variable,
then we run the same set of commands, but the pesky X’s and Y’s are switched when the
model is created, making it too different to adapt for our "GapFillerSKLearn" function.
Now, our script is basically done, and we can actually run it (in Spyder, just press
F5).
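To see what the extra constant column does, here is a numpy-only sketch of the same idea. It swaps in np.linalg.lstsq for the statsmodels OLS fit, so it is an illustration of the mechanics rather than the code used in "Calc.py", and the data are made up to follow y = 2x + 1.

```python
import numpy as np

X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
Y = np.array([1.0, 3.0, -9999.0, 7.0, 9.0])    # made-up data: y = 2x + 1
mask = Y < -9000

# Same idea as sm.add_constant: prepend a column of ones for the intercept.
X_ols = np.column_stack([np.ones(len(X)), X])

# Least-squares fit on the non-gap rows only.
coef, _, _, _ = np.linalg.lstsq(X_ols[~mask], Y[~mask], rcond=None)

Y_ols = Y.copy()                   # copy so the original Y keeps its gaps
Y_ols[mask] = X_ols[mask] @ coef   # predict only inside the gaps
print(coef, Y_ols)
```

The first coefficient belongs to the column of ones and is therefore the intercept; without that column the fitted line would be forced through the origin.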
6.5. And now the plots! 68
Depending on the speed of your computer, it may take a few seconds to run, more
than you might want to wait for over and over. Therefore, before we move on to the
"Plots.py" file, it would be a good idea to save the data so we don’t have to run it every
time. For this, we will use the "pickle" package. "pickle" does a nice job of saving
python objects as binary files, which Sujan loves, so after we import the package, we
can dump our pickle with:
    with open(yvarname + "_GapFills.pickle", "wb") as f:
        pickle.dump(df, f)
Notice that we save the file with our yvarname, which you will see can
come in handy.
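A round trip shows both halves of the pickle workflow: dumping in one script and loading in another. The sketch below uses made-up values and writes into a temporary directory; in the exercise the file would simply sit next to "Calc.py" and "Plots.py".

```python
import os
import pickle
import tempfile

yvarname = "LE"
df = {"LE": [1.0, -9999.0, 3.0], "LE_RF": [1.0, 2.1, 3.0]}  # made-up values

with tempfile.TemporaryDirectory() as scratch:
    filename = os.path.join(scratch, yvarname + "_GapFills.pickle")

    # "wb" = write binary; the with-block closes the file for us.
    with open(filename, "wb") as f:
        pickle.dump(df, f)

    # "rb" = read binary; this is what "Plots.py" would do.
    with open(filename, "rb") as f:
        restored = pickle.load(f)

print(restored == df)
```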
Now, in the python or ipython console, you can explore "df" a little bit and see
that it is a nice and orderly DataFrame, which R users will feel right at home in. And
with this DataFrame, we can do much of our initial plotting directly, so we didn’t even
have to import Matplotlib.
As we have three different methods to compare, we can write the plotting steps as a
function so we avoid doing all that copying and pasting. Let’s call our function "GapComp"
and it will take the input variables df, xvar, yvar, and GapMask. The first thing we will do
is make our scatter plot of the gap filled values. Pandas actually comes with much
of the plotting functionality built in, so the plot becomes one line:
    fig = df[GapMask].plot.scatter(x=xvar, y=yvar)
Notice that we will be using our boolean array "GapMask" to index the entire
DataFrame, this is the magic of Pandas. Now, we could call it a day, but what fun
is a scatter plot without some lines on it? So, we will add the results of a linear
regression between our gap filling and the "LE_f" using the "linregress" function from
"scipy.stats" (go ahead and add it to the import list). "linregress" gives a nice output
of a simple linear regression including all the standard stuff:
    slope, intercept, r_value, p_value, std_err = linregress(df[GapMask][xvar],
                                                             df[GapMask][yvar])
Now that we have fit a model to our models, we can plot our line. We will need
an x variable that spans our line, which we can create with the "numpy.linspace" command
as
    x = np.linspace(df[GapMask][xvar].min(), df[GapMask][xvar].max())
And finally, we can plot our line with a nice label showing both our equation and
the r^2 value with
    fig.plot(x, x*slope + intercept,
             label="y={0:0.4}*x+{1:0.4}, r^2={2:0.4}".format(slope, intercept, r_value**2))
    fig.legend()
And that finishes our function. We can now plot all of our models with a neat little
for loop:
    for var in ["_RF", '_NN', '_OLS']:
        GapComp(df, yvarname + var, yvarname + "_f", df.GapMask)
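For reference, the pieces of "GapComp" can be assembled into one function along the following lines. This is only a sketch under a few assumptions: the data are made up, the Agg backend is selected so the snippet runs without a display, the line spans the range of xvar, and the function returns the fit so it can be inspected (the text does not say what, if anything, is returned).

```python
import matplotlib
matplotlib.use("Agg")   # draw off-screen; drop this line for interactive use
import numpy as np
import pandas as pd
from scipy.stats import linregress


def GapComp(df, xvar, yvar, GapMask):
    """Scatter xvar against yvar inside the gaps and add a regression line."""
    gaps = df[GapMask]                       # keep only the gap rows
    ax = gaps.plot.scatter(x=xvar, y=yvar)   # pandas returns the axes
    slope, intercept, r_value, p_value, std_err = linregress(gaps[xvar], gaps[yvar])
    x = np.linspace(gaps[xvar].min(), gaps[xvar].max())
    label = "y={0:0.4}*x+{1:0.4}, r^2={2:0.4}".format(slope, intercept, r_value**2)
    ax.plot(x, x * slope + intercept, label=label)
    ax.legend()
    return ax, slope, intercept


# Made-up data: the reference column is exactly 0.5 * LE_RF + 2.
LE_RF = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0])
df = pd.DataFrame({"LE_RF": LE_RF, "LE_f": 0.5 * LE_RF + 2.0,
                   "GapMask": [True, True, False, True, True, False]})
ax, slope, intercept = GapComp(df, "LE_RF", "LE_f", df.GapMask)
print(slope, intercept)
```

The object returned by plot.scatter is a matplotlib axes, which is why both the line and the legend can be added to it directly.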
6.6. Bonus points! 70
Now we can pass this fancy list, either as a named variable, or in a one-liner if we
are even fancier, to the command
    KDEs = df[df.GapMask][ThisFancyList].plot.kde()
where our plot is saved as the variable KDEs. Now, we have to plot our final KDE
from the "LE" column, but we can no longer call it using "KDEs.plot" like we did for
our line in the "GapComp" function. What we have to do then is tell the "df.plot.kde"
command which plot we want it in. For this, we pass the "ax=" argument like so
    df[~df.GapMask][[yvarname]].plot.kde(ax=KDEs)
and voilà, our plotting is complete! There, some advanced statistics, easy as cake.
7.1. Plotting a simple figure 72
The first part of this chapter introduces plotting standard figures using matplotlib
and the second part introduces interactive plotting using Bokeh.
For a comprehensive set of examples with the source code used to plot the figures using
matplotlib, click here. For the same for Bokeh, click here.
First, a figure object can be defined. figsize is the figure size as a (width, height) tuple.
The unit is inches.
In [268]: from matplotlib import pyplot as plt
          plt.figure(figsize=(3, 4))
There are several keyword arguments such as color, style and so on that control
the appearance of the line object. They are listed here. The line and marker styles in
matplotlib are shown in Table 7.1.
For axis labels and figure title:
In [270]: plt.xlabel('time')
          plt.ylabel('Precip', color='k', fontsize=10)
          plt.title('One Figure')
The axis limits can be set by using xlim() and ylim() as:
In [271]: plt.xlim(0, 200)
          plt.ylim(0, 1e14)
The color and fontsize can be changed. For color, use color= some color name
such as 'red' or color= hexadecimal color code such as '#0000FF'. For font size, use
fontsize=number (number is > 0). Also, grid lines can be turned on by using
7.2. Multiple plots in a figure 73
Now, date objects can be created using the datetime module. In the current file, the
data are available from 1979-01-01 to 2007-12-31. Using these date instances, a range
of dates can be created with a step of dt, which is a timedelta object from
datetime.
In [277]: sdate = datetime.date(1979, 1, 1)
          edate = datetime.date(2008, 1, 1)
          dt = datetime.timedelta(days=30.5)
          dates_mo = dates.drange(sdate, edate, dt)
Using the functions within the tmop module, monthly and yearly data are created.
In [278]: dat_mo = np.array([np.mean(_m) for _m in tmop.day2month(dat1, sdate)])
          dat_y = np.array([np.mean(_y) for _y in tmop.day2year(dat1, sdate)])
Next up, we create axes instances on which the plots will be made. These axes
objects are the building blocks of all subplot-like objects in matplotlib and form the
basis for having as many subplots as one wants in a figure. An axes object is defined by
using the axes command with [lower left x, lower left y, width, height] as an argument. The
coordinates and sizes are given relative to the figure, and thus they vary from 0
to 1.
In [279]: ax1 = plt.axes([0.1, 0.1, 0.6, 0.8])
          ax1.plot_date(dates_mo, dat_mo, ls='-', marker=None)
✓ While plotting dates, the plot_date function is used with the date range as the
x variable and the data as the y variable. Note that the sizes of the x and y variables should
be the same. The axis is automatically formatted as years.
In [280]: ax2 = plt.axes([0.75, 0.1, 0.25, 0.8])
          ax2.plot(dat_mo.reshape(-1, 12).mean(0))
          ax2.set_xticks(range(12))
          ax2.set_xticklabels(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug',
                               'Sep', 'Oct', 'Nov', 'Dec'], rotation=90)
          plt.show()
✓ Sometimes, it is easier to set the ticks and labels manually. In this case, the
mean seasonal cycle is plotted normally, and the xticks are changed to look like dates.
Remember that with a proper date range object, this can be achieved automatically with
plot_date as well.
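The mean seasonal cycle in the listing above comes from the reshape(-1, 12).mean(0) idiom, which may be easier to follow on a small made-up series. The sketch assumes a monthly series that starts in January and covers whole years; the names by_year and msc are introduced only for this illustration.

```python
import numpy as np

# Made-up monthly series: 3 years in which every January is 1, every
# February is 2, and so on, plus a year-dependent offset.
n_years = 3
months = np.tile(np.arange(1, 13, dtype=float), n_years)
offset = np.repeat([0.0, 10.0, 20.0], 12)
dat_mo = months + offset

by_year = dat_mo.reshape(-1, 12)   # shape (n_years, 12): one row per year
msc = by_year.mean(0)              # average over years -> 12 monthly means

print(by_year.shape, msc)
```

If the series does not contain a whole number of years, the reshape raises an error, which is a useful early warning that the input is not what was expected.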
✓ Matplotlib has a dedicated ticker module that handles the location and formatting
of the ticks. Even though we don’t go through the details, we recommend
that everyone read through the ticker page.
Once the data is read, we can open a figure object and start adding things to it.
In [282]: plt.figure(figsize=(3, 4))
          plt.scatter(dat1, dat2, facecolor='blue', edgecolor=None)
          plt.scatter(dat1, dat3, marker='d', facecolor='red', alpha=0.4, linewidths=0.7)
          plt.xlabel(r'Precip ($kg\ d^{-1}$)')
          plt.ylabel('Runoff or ET ($\\frac{kg}{d})$', color='k', fontsize=10)
          plt.grid(which='major', axis='both', ls=':', lw=0.5)
          plt.title('A scatter')
✓ scatter has slightly different names for colors. The color of the marker and
that of the line around it can be set separately using facecolor and edgecolor, respectively. It
also allows changing the transparency using the alpha argument. Note that the width of
the line around the markers is set by linewidths; scatter has no edgewidth argument.
7.5. Playing with the Elements 76
In [283]: plt.legend(('Runoff', 'ET'), loc='best')
• The Ugly lines: The boxes around figures are stored as spines, which is actually
a dictionary-like object with information on each line and its properties. In the
rem_axLine function of plotTools, you can see that the linewidth of some of the
spines has been set to zero.
In [284]: import plotTools as pt
          pt.rem_axLine()
• Getting the limits of the axis from the figure: Use the gca() method of pyplot to get
the x and y limits.
In [285]: ymin, ymax = plt.gca().get_ylim()
          xmin, xmax = plt.gca().get_xlim()
• A legendary legend: Here is an example of how flexible a legend object can be. It
has a tonne of options and methods. Sometimes, setting it up becomes a manual calibration.
In [287]: leg = plt.legend(('Runoff', 'ET'), loc=(0.05, 0.914), markerscale=0.5,
                           scatterpoints=4, ncol=2, fancybox=True, handlelength=3.5,
                           handletextpad=0.8, borderpad=0.1, labelspacing=0.1,
                           columnspacing=0.25)
          leg.get_frame().set_linewidth(0)
          leg.get_frame().set_facecolor('firebrick')
          leg.legendPatch.set_alpha(0.25)
          texts = leg.get_texts()
          for t in texts:
              tI = texts.index(t)
              # t.set_color(cc[tI])
          plt.setp(texts, fontsize=7.83)
7.6. MAP MAP MAP!
This section explains the procedure to draw a map using basemap and matplotlib.
Once the data is read, first a map object should be created using basemap module.
In [289]: from mpl_toolkits.basemap import Basemap
          _map = Basemap(projection='cyl',
                         llcrnrlon=lonmin,
                         urcrnrlon=lonmax,
                         llcrnrlat=latmin,
                         urcrnrlat=latmax,
                         resolution='c')
✓ resolution: specifies the resolution of the map. 'c', 'l', 'i', 'h', 'f' or None
can be used: 'c' (crude), 'l' (low), 'i' (intermediate), 'h' (high) and 'f' (full).
✓ The longitude and latitude of the lower left corner and upper right corner can
be set with llcrnrlon, llcrnrlat, urcrnrlon and urcrnrlat.
In the current case, the latitudes and longitudes of the lower left and upper right
corners of the map are set at the following values:
In [290]: latmin = -90
          lonmin = -180
          latmax = 90
          lonmax = 180
In the example program, the lines and ticks around the map are also removed by
In [295]: import plotTools as pt
          pt.rem_axLine(['right', 'bottom', 'left', 'top'])
          pt.rem_ticks()
Now the data are plotted over the map object as:
In [296]: from matplotlib import pyplot as plt
          fig = plt.figure(figsize=(9, 7))
          ax1 = plt.subplot(211)
          _map.imshow(np.ma.masked_less(data.mean(0), 0.), cmap=plt.cm.jet,
                      interpolation='none', origin='upper', vmin=0, vmax=200)
          plt.colorbar(orientation='vertical', shrink=0.5)
          ax2 = plt.axes([0.18, 0.1, 0.45, 0.4])
          data_gm = np.array([np.ma.masked_less(_data, 0).mean() for _data in data])
          plt.plot(data_gm)
          data_gm_msc = data_gm.reshape(-1, 12).mean(0)
          pt.rem_axLine()
          ax3 = plt.axes([0.72, 0.1, 0.13, 0.4])
          plt.plot(data_gm_msc)
          pt.rem_axLine()
          plt.show()
✓ A subplot can be combined with axes in a figure. In this case, the global mean
of runoff and its mean seasonal cycle are plotted on axes ax2 and ax3, respectively.
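The global mean in the listing relies on masking before averaging. The sketch below uses a tiny made-up array with dimensions (time, lat, lon) and assumes, as the masked_less(..., 0) calls suggest, that negative values mark missing cells; the name naive is introduced only for comparison.

```python
import numpy as np

# Made-up data with dimensions (time, lat, lon); negative values mark
# missing cells, as assumed from the masked_less(..., 0) calls above.
data = np.array([
    [[1.0, 3.0], [-9999.0, 5.0]],
    [[2.0, 4.0], [-9999.0, 6.0]],
])

# Plain mean: the -9999 fill values drag the result far below zero.
naive = np.array([_data.mean() for _data in data])

# Masked mean: cells below 0 are ignored when averaging.
data_gm = np.array([np.ma.masked_less(_data, 0).mean() for _data in data])

print(naive, data_gm)
```

Note that this is an unweighted mean over grid cells; weighting by cell area is a separate step that neither the listing nor this sketch performs.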
In [297]: colorbar()
✓ length:width = 20:1.
Various other colormaps are available in Python. Fig. 7.1 shows some commonly
used colormaps and their names. More details of the options for colorbar can be
found here.