Numerical Python
Make sure you have the software
Material associated with these slides
Unpack TCSE3-3rd-examples.tar.gz in a directory and let scripting be an environment variable pointing to the top directory:
    tar xvzf TCSE3-3rd-examples.tar.gz
    export scripting=`pwd`
All paths in these slides are given relative to scripting, e.g., src/py/intro/hw.py is reached as
    $scripting/src/py/intro/hw.py
All computer language intros start with a program that prints "Hello, World!" to the screen
Scientific computing extension: read a number, compute its sine value, and print out
The script, called hw.py, should be run like this:
    python hw.py 3.4
or just (Unix)
    ./hw.py 3.4
Output:
    Hello, World! sin(3.4)=-0.255541102027
The first line specifies the interpreter of the script (here the first python program in your path)
    python hw.py 1.4   # first line is treated as a comment
    ./hw.py 1.4        # first line is used to specify an interpreter
Even simple scripts must load modules:
    import sys, math
Numbers and strings are two different types:
    r = sys.argv[1]          # r is string
    s = math.sin(float(r))
    # sin expects number, not string r
    # s becomes a floating-point number
Desired output:
    Hello, World! sin(3.4)=-0.255541102027
String concatenation:
    print "Hello, World! sin(" + str(r) + ")=" + str(s)
printf-like statement:
    print "Hello, World! sin(%g)=%g" % (r,s)
Variable interpolation:
    print "Hello, World! sin(%(r)g)=%(s)g" % vars()
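The fragments above can be assembled into one small function. This is a minimal sketch, not the course's hw.py: it uses Python 3 syntax (print as a function) whereas the slides use Python 2, and the function name hello is made up for illustration.

```python
import math


def hello(arg):
    """Return the greeting for a number given as text (e.g. from sys.argv)."""
    r = float(arg)     # command-line arguments are strings; convert first
    s = math.sin(r)    # s is a floating-point number
    return "Hello, World! sin(%g)=%g" % (r, s)


print(hello("3.4"))
```

Converting r to a float before formatting keeps the %g format valid for both numbers in the output string.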
Input file format: two columns with numbers
    0.1   1.4397
    0.2   4.325
    0.5   9.0
Read a line with x and y, transform y, write x and f(y):
    for line in ifile:
        pair = line.split()
        x = float(pair[0]); y = float(pair[1])
        fy = myfunc(y)   # transform y value
        ofile.write('%g %12.5e\n' % (x,fy))
This construction is more flexible and traditional in Python (and a bit strange...):
    while 1:
        line = ifile.readline()   # read a line
        if not line: break        # end of file: jump out of loop
        # process line
i.e., an 'infinite' loop with the termination criterion inside the loop
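A self-contained sketch of the read-transform-write loop follows, in Python 3 syntax. The body of myfunc is a hypothetical placeholder (the slides do not show the real one), and in-memory io.StringIO objects stand in for the real input and output files.

```python
import io


def myfunc(y):
    """Hypothetical transformation; the slides do not show the real myfunc."""
    return 2.0 * y


def transform(ifile, ofile):
    """Read 'x y' pairs line by line and write 'x f(y)' pairs."""
    for line in ifile:
        pair = line.split()
        if len(pair) < 2:          # skip blank or incomplete lines
            continue
        x = float(pair[0])
        y = float(pair[1])
        ofile.write("%g %12.5e\n" % (x, myfunc(y)))


# in-memory files stand in for the real infile and outfile
infile = io.StringIO("0.1 1.4397\n0.2 4.325\n0.5 9.0\n")
outfile = io.StringIO()
transform(infile, outfile)
print(outfile.getvalue(), end="")
```

Passing file objects instead of file names keeps the loop testable without touching the disk.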
Method 1: write just the name of the scriptfile:
    ./datatrans1.py infile outfile
    # or
    datatrans1.py infile outfile
if . (current working directory) or the directory containing datatrans1.py is in the path
Method 2: run an interpreter explicitly:
    python datatrans1.py infile outfile
Use the first python program found in the path
This works on Windows too (method 1 requires the right assoc/ftype bindings for .py files)
In method 1, the interpreter to be used is specified in the first line
Explicit path to the interpreter:
    #!/usr/local/bin/python
or perhaps your own Python interpreter:
    #!/home/hpl/projects/scripting/Linux/bin/python
Using env to find the first Python interpreter in the path:
    #!/usr/bin/env python
Yes and no, depending on how you see it
Python first compiles the script into bytecode
The bytecode is then interpreted
No linking with libraries; libraries are imported dynamically when needed
It appears as if there is no compilation
Quick development: just edit the script and run! (no time-consuming compilation and linking)
Extensive error checking at run time
Everything in Python is an object (number, function, list, file, module, class, socket, ...)
Objects are instances of a class – lots of classes are defined (float, int, list, file, ...) and the programmer can define new classes
Variables are names for (or “pointers” or “references” to) objects:
    A = 1        # make an int object with value 1 and name A
    A = 'Hi!'    # make a str object with value 'Hi!' and name A
    print A[1]   # A[1] is a str object 'i', print this object
    A = [-1,1]   # let A refer to a list object with 2 elements
    A[-1] = 2    # change the list A refers to in-place
    b = A        # let name b refer to the same object as A
    print b      # results in the string '[-1, 2]'
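The point that names refer to objects can be checked directly. The sketch below (Python 3 syntax, variable names chosen for illustration) contrasts a second name for the same list with an explicit copy.

```python
A = [-1, 1]
b = A                  # b is a second name for the same list object
A[-1] = 2              # in-place change, visible through both names
same_object = b is A   # identity test: one object, two names

c = list(A)            # an explicit copy is a new, independent object
A[0] = 100             # changes A (and b), but not c
print(b, c, same_object)
```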
Import (more on this later...):
    from numpy import *
    x = linspace(0, 1, 1001)   # 1001 values between 0 and 1
    x = sin(x)                 # computes sin(x[0]), sin(x[1]) etc.
x=sin(x) is 13 times faster than an explicit loop:
    for i in range(len(x)):
        x[i] = sin(x[i])
because sin(x) invokes an efficient loop in C
A special module loads tabular file data into NumPy arrays:
    import scitools.filetable
    f = open(infilename, 'r')
    x, y = scitools.filetable.read_columns(f)
    f.close()
Now we can compute with the NumPy arrays x and y:
    x = 10*x
    y = 2*y + 0.1*sin(x)
We can easily write x and y back to a file:
    f = open(outfilename, 'w')
    scitools.filetable.write_columns(f, x, y)
    f.close()
Multi-dimensional arrays can be constructed:
    x = zeros(n)         # array with indices 0,1,...,n-1
    x = zeros((m,n))     # two-dimensional array
    x[i,j] = 1.0         # indexing
    x = zeros((p,q,r))   # three-dimensional array
    x[i,j,k] = -2.1
    x = sin(x)*cos(x)
We can plot one-dimensional arrays:
    from scitools.easyviz import *   # plotting
    x = linspace(0, 2, 21)
    y = x + sin(10*x)
    plot(x, y)
NumPy has lots of math functions and operations
Python statements can be run interactively in a Python shell
The “best” shell is called IPython
Sample session with IPython:
    Unix/DOS> ipython
    ...
    In [1]:3*4-1
    Out[1]:11
    In [2]:from math import *
    In [3]:x = 1.2
    In [4]:y = sin(x)
    In [5]:x
    Out[5]:1.2
Up- and down-arrows: go through command history
Emacs key bindings for editing previous commands
The underscore variable holds the last output
    In [6]:y
    Out[6]:0.93203908596722629
    In [7]:_ + 1
    Out[7]:1.93203908596722629
IPython supports TAB completion: write a part of a command or name (variable, function, module), hit the TAB key, and IPython will complete the word or show different alternatives:
    In [1]: import math
    In [2]: math.<TABKEY>
    math.__class__    math.__str__   math.frexp
    math.__delattr__  math.acos      math.hypot
    math.__dict__     math.asin      math.ldexp
    ...
or
    In [2]: my_variable_with_a_very_long_name = True
    In [3]: my<TABKEY>
    In [3]: my_variable_with_a_very_long_name
You can increase your typing speed with TAB completion!
This happens when the infile name is wrong:
    /home/work/scripting/src/py/intro/datatrans2.py
          7     print "Usage:",sys.argv[0], "infile outfile"; sys.exit(1)
          8
    ----> 9 ifile = open(infilename, 'r')  # open file for reading
         10 lines = ifile.readlines()      # read file into list of lines
         11 ifile.close()
    IOError: [Errno 2] No such file or directory: 'infile'
    > /home/work/scripting/src/py/intro/datatrans2.py(9)?()
    -> ifile = open(infilename, 'r')  # open file for reading
    (Pdb) print infilename
    infile
Consider datatrans1.py: read 100 000 (x,y) data from a pure text (ASCII) file and write (x,f(y)) out again
    Pure Python:                  4s
    Pure Perl:                    3s
    Pure Tcl:                     11s
    Pure C (fscanf/fprintf):      1s
    Pure C++ (iostream):          3.6s
    Pure C++ (buffered streams):  2.5s
    Numerical Python modules:     2.2s (!)
(Computer: IBM X30, 1.2 GHz, 512 Mb RAM, Linux, gcc 3.3)
Simple, classical Unix shell scripts are widely used to replace sequences of manual steps in a terminal window
Such scripts are crucial for scientific reliability and human efficiency!
Shell script newbie? Wake up and adapt this example to your projects!
Typical situation in computer simulation:
    run a simulation program with some input
    run a visualization program and produce graphs
Programs are supposed to run from the command line, with input from files or from command-line arguments
We want to automate the manual steps by a Python script
Parsing command-line options:
    somescript -option1 value1 -option2 value2
Removing and creating directories
Writing data to file
Running stand-alone programs (applications)
[Figure: sketch of the oscillator, with labels m, b, c, func, y0 and the driving force Acos(wt)]
$$ m\frac{d^2 y}{dt^2} + b\frac{dy}{dt} + c f(y) = A\cos\omega t $$
$$ y(0) = y_0, \quad \frac{d}{dt}y(0) = 0 $$
Code: oscillator (written in Fortran 77)
Input: m, b, c, and so on read from standard input
How to run the code:
    oscillator < file
where file can be
    3.0
    0.04
    1.0
    ...
i.e., values of m, b, c, etc. -- in the right order!
The resulting time series y(t) is stored in a file sim.dat with t and y(t) in the 1st and 2nd column, respectively
Intro to Python programming – p. 41 Intro to Python programming – p. 42
[Figure: plot of y(t) titled "tmp2: m=2 b=0.7 c=5 f(y)=y A=5 w=6.28319 y0=0.2 dt=0.05"]
Commands:
    set title 'case: m=3 b=0.7 c=1 f(y)=y A=5 ...';
    # screen plot: (x,y) data are in the file sim.dat
    plot 'sim.dat' title 'y(t)' with lines;
    # hardcopies:
    set size ratio 0.3 1.5, 1.0;
    set term postscript eps mono dashed 'Times-Roman' 28;
    set output 'case.ps';
    plot 'sim.dat' title 'y(t)' with lines;
    # make a plot in PNG format as well:
    set term png small;
    set output 'case.png';
    plot 'sim.dat' title 'y(t)' with lines;
Set default values of m, b, c etc.
Parse command-line options (-m, -b etc.) and assign new values to m, b, c etc.
Create and move to subdirectory
Write input file for the simulator
Run simulator
Write Gnuplot commands in a file
Run Gnuplot
Set default values of the script’s input parameters:
    m = 1.0; b = 0.7; c = 5.0; func = 'y'; A = 5.0;
    w = 2*math.pi; y0 = 0.2; tstop = 30.0; dt = 0.05;
    case = 'tmp1'; screenplot = 1
Examine command-line options in sys.argv:
    # read variables from the command line, one by one:
    while len(sys.argv) >= 2:
        option = sys.argv[1];       del sys.argv[1]
        if   option == '-m':
            m = float(sys.argv[1]); del sys.argv[1]
        elif option == '-b':
            b = float(sys.argv[1]); del sys.argv[1]
        ...
Note: sys.argv[1] is text, but we may want a float for numerical operations
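The manual parsing loop can be packaged as a function, which makes it easy to try without a real command line. This is a sketch in Python 3 syntax: the function name parse_options and the choice to collect the parameters in a dictionary are assumptions made here, and only three of the options are handled.

```python
def parse_options(argv):
    """Manual parsing of '-name value' pairs; argv[0] is the program name."""
    prm = {"m": 1.0, "b": 0.7, "case": "tmp1"}   # defaults taken from the slide
    args = list(argv[1:])                         # work on a copy of the list
    while args:
        option = args.pop(0)
        if option in ("-m", "-b"):
            prm[option[1:]] = float(args.pop(0))  # text -> float
        elif option == "-case":
            prm["case"] = args.pop(0)             # stays a string
        else:
            raise ValueError("unknown option: %s" % option)
    return prm


print(parse_options(["simviz1.py", "-m", "2.5", "-case", "run1"]))
```

Working on a copy leaves sys.argv untouched, unlike the del statements in the original loop.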
Python offers two modules for command-line argument parsing: getopt and optparse
These accept short options (-m) and long options (--mass)
getopt examines the command line and returns pairs of options and values ((--mass, 2.3))
optparse is a bit more comprehensive to use and makes the command-line options available as attributes in an object
In this introductory example we rely on manual parsing since this exemplifies basic Python programming
Python has a rich cross-platform operating system (OS) interface
Skip Unix- or DOS-specific commands; do all OS operations in Python!
Safe creation of a subdirectory:
    dir = case                  # subdirectory name
    import os, shutil
    if os.path.isdir(dir):      # does dir exist?
        shutil.rmtree(dir)      # yes, remove old files
    os.mkdir(dir)               # make dir directory
    os.chdir(dir)               # move to dir
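The remove-then-create pattern for a subdirectory can be tried safely in a scratch area. The sketch below uses Python 3 syntax; the helper name fresh_dir is hypothetical, and tempfile.mkdtemp is used only so that the demo does not touch real data. Unlike the slide, it does not call os.chdir.

```python
import os
import shutil
import tempfile


def fresh_dir(path):
    """Remove path if it exists, then create it as an empty directory."""
    if os.path.isdir(path):
        shutil.rmtree(path)
    os.mkdir(path)
    return path


root = tempfile.mkdtemp()                  # scratch area for the demo
case_dir = os.path.join(root, "tmp1")
fresh_dir(case_dir)
open(os.path.join(case_dir, "old.dat"), "w").close()   # leftover from an earlier run
fresh_dir(case_dir)                        # second call wipes the leftover
leftover = os.listdir(case_dir)
shutil.rmtree(root)                        # clean up the scratch area
print(leftover)
```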
Make Gnuplot script:
    f = open(case + '.gnuplot', 'w')
    f.write("""
    set title '%s: m=%g b=%g c=%g f(y)=%s A=%g ...';
    ...
    ...
    """ % (case,m,b,c,func,A,w,y0,dt,case,case))
    ...
    f.close()
Our simviz1.py script is traditionally written as a Unix shell script
What are the advantages of using Python here?
Easier command-line parsing
Runs on Windows and Mac as well as Unix
Easier extensions (loops, storing data in arrays, analyzing results, etc.)
It is easy to replace Gnuplot by another plotting program
Matlab, for instance:
    f = open(case + '.m', 'w')  # write to Matlab M-file
    # (the character % must be written as %% in printf-like strings)
    f.write("""
    load sim.dat            %% read sim.dat into sim matrix
    plot(sim(:,1),sim(:,2)) %% plot 1st column as x, 2nd as y
    legend('y(t)')
    title('%s: m=%g b=%g c=%g f(y)=%s A=%g w=%g y0=%g dt=%g')
    outfile = '%s.ps';  print('-dps',  outfile)  %% ps BW plot
    outfile = '%s.png'; print('-dpng', outfile)  %% png color plot
    """ % (case,m,b,c,func,A,w,y0,dt,case,case))
    if screenplot: f.write('pause(30)\n')
    f.write('exit\n'); f.close()
    if screenplot:
        cmd = 'matlab -nodesktop -r ' + case + ' > /dev/null &'
    else:
        cmd = 'matlab -nodisplay -nojvm -r ' + case
    failure, output = commands.getstatusoutput(cmd)
Suppose we want to run a series of experiments with different m values
Put a script on top of simviz1.py,
    ./loop4simviz1.py m_min m_max dm \
        [options as for simviz1.py]
with a loop over m, which calls simviz1.py inside the loop
Each experiment is archived in a separate directory
That is, loop4simviz1.py controls the -m and -case options to simviz1.py
The first three arguments define the m values:
    try:
        m_min = float(sys.argv[1])
        m_max = float(sys.argv[2])
        dm    = float(sys.argv[3])
    except:
        print 'Usage:',sys.argv[0],\
        'm_min m_max m_increment [ simviz1.py options ]'
        sys.exit(1)
Pass the rest of the arguments, sys.argv[4:], to simviz1.py
Problem: sys.argv[4:] is a list, we need a string
    ['-b','5','-c','1.1'] -> '-b 5 -c 1.1'
' '.join(list) can make a string out of the list list, with a blank between each item
    simviz1_options = ' '.join(sys.argv[4:])
Example:
    ./loop4simviz1.py 0.5 2 0.5 -b 2.1 -A 3.6
results in the same as
    m_min = 0.5
    m_max = 2.0
    dm = 0.5
    simviz1_options = '-b 2.1 -A 3.6'
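The splitting of the argument list can be reproduced without a real command line. In this sketch (Python 3 syntax) the list argv is a hand-written stand-in for sys.argv.

```python
# stand-in for sys.argv in: ./loop4simviz1.py 0.5 2 0.5 -b 2.1 -A 3.6
argv = ["loop4simviz1.py", "0.5", "2", "0.5", "-b", "2.1", "-A", "3.6"]

# the first three arguments are numbers given as text
m_min, m_max, dm = [float(a) for a in argv[1:4]]

# the remaining arguments are joined back into one string for simviz1.py
simviz1_options = " ".join(argv[4:])
print(m_min, m_max, dm, repr(simviz1_options))
```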
Cannot use
    for m in range(m_min, m_max, dm):
because range works with integers only
A while-loop is appropriate:
    m = m_min
    while m <= m_max:
        case = 'tmp_m_%g' % m
        s = 'python simviz1.py %s -m %g -case %s' % \
            (simviz1_options, m, case)
        failure, output = commands.getstatusoutput(s)
        m += dm
(Note: our -m and -case will override any -m or -case option provided by the user)
Many runs of simviz1.py can be automated, many results are generated, and we need a way to browse the results
Idea: collect all plots in a common HTML file and let the script automate the writing of the HTML file
    html = open('tmp_mruns.html', 'w')
    html.write('<HTML><BODY BGCOLOR="white">\n')
    m = m_min
    while m <= m_max:
        case = 'tmp_m_%g' % m
        cmd = 'python simviz1.py %s -m %g -case %s' % \
              (simviz1_options, m, case)
        failure, output = commands.getstatusoutput(cmd)
        html.write('<H1>m=%g</H1> <IMG SRC="%s">\n' \
                   % (m,os.path.join(case,case+'.png')))
        m += dm
    html.write('</BODY></HTML>\n')
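The while loop over a floating-point parameter can be separated from the actual program runs. The sketch below (Python 3 syntax) only builds the command strings and does not execute them; the function name build_commands is an assumption of this example. Note that repeated addition of a step such as 0.1 accumulates rounding error, so the end point of the loop is not always hit exactly.

```python
def build_commands(m_min, m_max, dm, simviz1_options=""):
    """Return one simviz1.py command line per m value (nothing is executed)."""
    commands = []
    m = m_min
    while m <= m_max:
        case = "tmp_m_%g" % m              # one directory name per experiment
        commands.append("python simviz1.py %s -m %g -case %s"
                        % (simviz1_options, m, case))
        m += dm
    return commands


for cmd in build_commands(0.5, 2.0, 0.5, "-b 2.1 -A 3.6"):
    print(cmd)
```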
For compact printing a PostScript file with small-sized versions of all the plots is useful
epsmerge (Perl script) is an appropriate tool:
    # concatenate file1.ps, file2.ps, and so on to
    # one single file figs.ps, having pages with
    # 3 rows with 2 plots in each row (-par preserves
    # the aspect ratio of the plots)
    epsmerge -o figs.ps -x 2 -y 3 -par \
             file1.ps file2.ps file3.ps ...
    psfiles = []  # plot files in PostScript format
    ...
    while m <= m_max:
        case = 'tmp_m_%g' % m
        ...
        psfiles.append(os.path.join(case,case+'.ps'))
        ...
    ...
    s = 'epsmerge -o tmp_mruns.ps -x 2 -y 3 -par ' + \
        ' '.join(psfiles)
    failure, output = commands.getstatusoutput(s)
When we vary m, wouldn’t it be nice to see progressive plots put together in a movie?
Can combine the PNG files together in an animated GIF file:
    convert -delay 50 -loop 1000 -crop 0x0 \
            plot1.png plot2.png plot3.png plot4.png ... movie.gif
    animate movie.gif  # or display movie.gif
(convert and animate are ImageMagick tools)
Collect all PNG filenames in a list and join the list items to form the convert arguments
Run the convert program
Enable loops over an arbitrary parameter (not only m)
    # easy:
    '-m %g' % m
    # is replaced with
    '-%s %s' % (str(prm_name), str(prm_value))
    # prm_value plays the role of the m variable
    # prm_name ('m', 'b', 'c', ...) is read as input
New feature: keep the range of the y axis fixed (for movie)
Files:
    simviz1.py      : run simulation and visualization
    simviz2.py      : additional option for yaxis scale
    loop4simviz1.py : m loop calling simviz1.py
    loop4simviz2.py : loop over any parameter in
                      simviz2.py and make movie
Make a summary report with the equation, a picture of the system, the command-line arguments, and a movie of the solution
Make a link to a detailed report with plots of all the individual experiments
Demo:
    ./loop4simviz2_2html.py m 0.1 6.1 0.5 -yaxis -0.5 0.5 \
        -noscreenplot
    ls -d tmp_*
    firefox tmp_m_summary.html
Archiving of experiments and having a system for uniquely relating input data to visualizations or result files are fundamental for reliable scientific investigations
The experiments can easily be reproduced
New (large) sets of experiments can be generated
All these items contribute to increased quality and reliability of computer experiments
Input file with time series data:
    some comment line
    1.5
    measurements model1 model2
    0.0 0.1 1.0
    0.1 0.1 0.188
    0.2 0.2 0.25
Contents: comment line, time step, headings, time series data
Goal: split file into two-column files, one for each time series
Script: interpret input file, split text, extract data and write files
The model1.dat file, arising from column no 2, becomes
    0    0.1
    1.5  0.1
    3    0.2
The time step parameter, here 1.5, is used to generate the first column
Read inputfile name (1st command-line arg.)
Open input file
Read and skip the 1st (comment) line
Extract time step from the 2nd line
Read time series names from the 3rd line
Make a list of file objects, one for each time series
Read the rest of the file, line by line:
    split lines into y values
    write t and y value to file, for all series
File: src/py/intro/convert1.py
Reading and writing files
Sublists
List of file objects
Dictionaries
Arrays of numbers
List comprehension
Refactoring a flat script as functions in a module
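The steps listed above can be sketched in one function. This is not the course's convert1.py: it is a Python 3 illustration in which split_series and the open_output callback are hypothetical names, and in-memory io.StringIO objects replace the real input file and the name.dat output files.

```python
import io


def split_series(ifile, open_output):
    """Split a multi-column time series file into one t-y stream per series."""
    ifile.readline()                           # skip the comment line
    dt = float(ifile.readline())               # time step from the 2nd line
    ynames = ifile.readline().split()          # series names from the 3rd line
    outfiles = [open_output(name) for name in ynames]
    t = 0.0
    for line in ifile:
        yvalues = [float(v) for v in line.split()]
        if not yvalues:                        # ignore blank lines
            continue
        for ofile, y in zip(outfiles, yvalues):
            ofile.write("%12g %12.5e\n" % (t, y))
        t += dt
    return dict(zip(ynames, outfiles))


text = ("some comment line\n1.5\nmeasurements model1 model2\n"
        "0.0 0.1 1.0\n0.1 0.1 0.188\n0.2 0.2 0.25\n")
series = split_series(io.StringIO(text), lambda name: io.StringIO())
print(series["model1"].getvalue(), end="")
```

With real files, open_output could be something like `lambda name: open(name + '.dat', 'w')`, and the returned file objects would then need to be closed.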
Open file and read comment line:
    infilename = sys.argv[1]
    ifile = open(infilename, 'r')  # open for reading
    line = ifile.readline()
Make a list of file objects for output of each time series:
    outfiles = []
    for name in ynames:
        outfiles.append(open(name + '.dat', 'w'))
Read each line, split into y values, write to output files:
    t = 0.0  # t value
    # read the rest of the file line by line:
    while 1:
        line = ifile.readline()
        if not line: break
Dictionary = array with a text as index
Also called hash or associative array in other languages
Can store 'anything':
    prm['damping'] = 0.2  # number
Could store the time series in memory as a dictionary of lists; the list items are the y values and the y names are the keys
    y = {}  # declare empty dictionary
    # ynames: names of y curves
    for name in ynames:
        y[name] = []  # for each key, make empty list
    lines = ifile.readlines()  # list of all lines
    ...
    for line in lines[3:]:
        yvalues = [float(x) for x in line.split()]
        i = 0  # counter for yvalues
        for name in ynames:
            y[name].append(yvalues[i]); i += 1
File: src/py/intro/convert2.py
Specifying a sublist, e.g., the 4th line until the last line: lines[3:]
Transforming all words in a line to floats:
    yvalues = [float(x) for x in line.split()]
    # same as
    numbers = line.split()
    yvalues = []
    for s in numbers:
        yvalues.append(float(s))
How to call the load data function
Note: the function returns two (!) values; a dictionary of lists, plus a float
It is common that output data from a Python function are returned, and multiple data structures can be returned (actually packed as a tuple, a kind of “constant list”)
Here is how the function is called:
    y, dt = load_data('somedatafile.dat')
    print y
Output from print y:
    >>> y
    {'tmp-model2': [1.0, 0.188, 0.25],
     'tmp-model1': [0.10000000000000001, 0.10000000000000001,
                    0.20000000000000001],
     'tmp-measurements': [0.0, 0.10000000000000001,
                          0.20000000000000001]}
Iterating over several lists
C/C++/Java/Fortran-like iteration over two arrays/lists:
    for i in range(len(list1)):
        e1 = list1[i]; e2 = list2[i]
        # work with e1 and e2
Pythonic version:
    for e1, e2 in zip(list1, list2):
        # work with element e1 from list1 and e2 from list2
For example,
    for name, value in zip(ynames, yvalues):
        y[name].append(value)
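Returning two values and iterating with zip can be combined in a compact loader. The sketch below is an assumption-laden variant of load_data in Python 3 syntax: it takes a list of lines instead of a file name so that it runs without a data file, and it does not add the tmp- prefix seen in the output above.

```python
def load_data(lines):
    """Return (y, dt): a dictionary of lists keyed by series name, and the time step."""
    dt = float(lines[1])                     # 2nd line: time step
    ynames = lines[2].split()                # 3rd line: names of the y curves
    y = {name: [] for name in ynames}        # one empty list per key
    for line in lines[3:]:
        yvalues = [float(v) for v in line.split()]
        for name, value in zip(ynames, yvalues):
            y[name].append(value)
    return y, dt                             # packed as a tuple


lines = ["some comment line", "1.5", "measurements model1 model2",
         "0.0 0.1 1.0", "0.1 0.1 0.188", "0.2 0.2 0.25"]
y, dt = load_data(lines)                     # tuple unpacked into two names
print(dt, y["model2"])
```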
    def dump_data(y, dt):
        # write out 2-column files with t and y[name] for each name:
        for name in y.keys():
            ofile = open(name+'.dat', 'w')
            for k in range(len(y[name])):
                ofile.write('%12g %12.5e\n' % (k*dt, y[name][k]))
            ofile.close()
Our goal is to reuse load_data and dump_data, possibly with some operations on y in between:
    from convert3 import load_data, dump_data
    y, timestep = load_data('.convert_infile1')
    from math import fabs
    for name in y:  # run through keys in y
        maxabsy = max([fabs(yval) for yval in y[name]])
        print 'max abs(y[%s](t)) = %g' % (name, maxabsy)
    dump_data(y, timestep)
Collect the functions in the module in a file, here the file is called convert3.py
We have then made a module convert3
The usage is as exemplified on the previous slide
The scripts convert1.py and convert2.py load and dump data - this functionality can be reproduced by an application script using convert3
The application script can be included in the module:
    if __name__ == '__main__':
        import sys
        try:
            infilename = sys.argv[1]
        except:
            usage = 'Usage: %s infile' % sys.argv[0]
            print usage; sys.exit(1)
        y, dt = load_data(infilename)
        dump_data(y, dt)
If the module file is run as a script, the if test is true and the application script is run
If the module is imported in a script, the if test is false and no statements are executed
Why Python and C are two different worlds
Wrapper code
Wrapper tools
F2PY: wrapping Fortran (and C) code
SWIG: wrapping C and C++ code
Ch. 5 in the course book
F2PY manual
SWIG manual
Examples coming with the SWIG source code
Ch. 9 and 10 in the course book
Intro to mixed language programming – p. 101 Intro to mixed language programming – p. 102
Every object in Python is represented by C struct PyObject
Wrapper code converts between PyObject variables and plain C variables (from PyObject r1 and r2 to double, and double result to PyObject):
    static PyObject *_wrap_hw1(PyObject *self, PyObject *args) {
        PyObject *resultobj;
        double arg1, arg2, result;
        /* convert the two Python arguments to C doubles; give up on failure */
        if (!PyArg_ParseTuple(args, (char *)"dd:hw1", &arg1, &arg2))
            return NULL;
        result = hw1(arg1, arg2);
        resultobj = PyFloat_FromDouble(result);
        return resultobj;
    }
The wrapper function and hw1 must be compiled and linked to a shared library file
This file can be loaded in Python as module
Such modules written in other languages are called extension modules
Writing wrapper code
A wrapper function is needed for each C function we want to call from Python
Wrapper codes are tedious to write
There are tools for automating wrapper code development
We shall use SWIG (for C/C++) and F2PY (for Fortran)
Integration issues
Direct calls through wrapper code enable efficient data transfer; large arrays can be sent by pointers
COM, CORBA, ILU, .NET are different technologies; more complex, less efficient, but safer (data are copied)
Jython provides a seamless integration of Python and Java
Consider this Scientific Hello World module (hw):
    import math, sys
    def hw1(r1, r2):
        s = math.sin(r1 + r2)
        return s

    def hw2(r1, r2):
        s = math.sin(r1 + r2)
        print 'Hello, World! sin(%g+%g)=%g' % (r1,r2,s)
Usage:
    from hw import hw1, hw2
    print hw1(1.0, 0)
    hw2(1.0, 0)
We want to implement the module in Fortran 77, C and C++, and use it as if it were a pure Python module
We start with Fortran (F77)
F77 code in a file hw.f:
      real*8 function hw1(r1, r2)
      real*8 r1, r2
      hw1 = sin(r1 + r2)
      return
      end

      subroutine hw2(r1, r2)
      real*8 r1, r2, s
      s = sin(r1 + r2)
      write(*,1000) 'Hello, World! sin(',r1+r2,')=',s
 1000 format(A,F6.3,A,F8.6)
      return
      end
One-slide F77 course
Fortran is case insensitive (reAL is as good as real)
One statement per line, must start in column 7 or later
Comments on separate lines
All function arguments are input and output (as pointers in C, or references in C++)
A function returning one value is called function
A function returning no value is called subroutine
Types: real, double precision, real*4, real*8, integer, character (array)
Arrays: just add dimension, as in
      real*8 a(0:m, 0:n)
Format control of output requires FORMAT statements
Using F2PY
F2PY automates integration of Python and Fortran
Say the F77 code is in the file hw.f
Run F2PY (-m module name, -c for compile+link):
    f2py -m hw -c hw.f
Load module into Python and test:
    from hw import hw1, hw2
    print hw1(1.0, 0)
    hw2(1.0, 0)
In Python, hw appears as a module with Python code...
It cannot be simpler!
In Fortran (and C/C++) functions often modify arguments; here the result s is an output argument:
      subroutine hw3(r1, r2, s)
      real*8 r1, r2, s
      s = sin(r1 + r2)
      return
      end
Running F2PY results in a module with wrong behavior:
    >>> from hw import hw3
    >>> r1 = 1;  r2 = -1;  s = 10
    >>> hw3(r1, r2, s)
    >>> print s
    10   # should be 0
Why? F2PY assumes that all arguments are input arguments
Output arguments must be explicitly specified!
F2PY generates doc strings that document the interface:
    >>> import hw
    >>> print hw.__doc__   # brief module doc string
    Functions:
      hw1 = hw1(r1,r2)
      hw2(r1,r2)
      hw3(r1,r2,s)
    >>> print hw.hw3.__doc__   # more detailed function doc string
    hw3 - Function signature:
      hw3(r1,r2,s)
    Required arguments:
      r1 : input float
      r2 : input float
      s : input float
We see that hw3 assumes s is input argument!
Remedy: adjust the interface
Interface files
We can tailor the interface by editing an F2PY-generated interface file
Run F2PY in two steps: (i) generate interface file, (ii) generate wrapper code, compile and link
Generate interface file hw.pyf (-h option):
    f2py -m hw -h hw.pyf hw.f
Outline of the interface file
The interface applies a Fortran 90 module (class) syntax
Each function/subroutine, its arguments and its return value is specified:
    python module hw ! in
        interface  ! in :hw
            ...
            subroutine hw3(r1,r2,s) ! in :hw:hw.f
                real*8 :: r1
                real*8 :: r2
                real*8 :: s
            end subroutine hw3
        end interface
    end python module hw
(Fortran 90 syntax)
We may edit hw.pyf and specify s in hw3 as an output argument, using F90’s intent(out) keyword:
    python module hw ! in
        interface  ! in :hw
            ...
            subroutine hw3(r1,r2,s) ! in :hw:hw.f
                real*8 :: r1
                real*8 :: r2
                real*8, intent(out) :: s
            end subroutine hw3
        end interface
    end python module hw
Next step: run F2PY with the edited interface file:
    f2py -c hw.pyf hw.f
Load the module and print its doc string:
    >>> import hw
    >>> print hw.__doc__
    Functions:
      hw1 = hw1(r1,r2)
      hw2(r1,r2)
      s = hw3(r1,r2)
Oops! hw3 takes only two arguments and returns s!
This is the “Pythonic” function style; input data are arguments, output data are returned
By default, F2PY treats all arguments as input
F2PY generates Pythonic interfaces, different from the original Fortran interfaces, so check out the module’s doc string!
General adjustment of interfaces
Function with multiple input and output variables
      subroutine somef(i1, i2, o1, o2, o3, o4, io1)
input: i1, i2
output: o1, ..., o4
input and output: io1
Pythonic interface (as generated by F2PY):
    o1, o2, o3, o4, io1 = somef(i1, i2, io1)
Specification of input/output arguments; .pyf file
In the interface file:
    python module somemodule
        interface
            ...
            subroutine somef(i1, i2, o1, o2, o3, o4, io1)
                real*8, intent(in) :: i1
                real*8, intent(in) :: i2
                real*8, intent(out) :: o1
                real*8, intent(out) :: o2
                real*8, intent(out) :: o3
                real*8, intent(out) :: o4
                real*8, intent(in,out) :: io1
            end subroutine somef
            ...
        end interface
    end python module somemodule
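To make the calling convention concrete, here is a pure-Python stand-in with the same signature as the Pythonic interface above. It is not F2PY-generated code: the arithmetic in the body is an arbitrary placeholder chosen for this sketch, and only the argument and return layout mirrors the slide (Python 3 syntax).

```python
def somef(i1, i2, io1):
    """Stand-in with the Pythonic layout: inputs as arguments, outputs returned."""
    o1 = i1 + i2          # placeholder computations; the real somef is not shown
    o2 = i1 - i2
    o3 = i1 * i2
    o4 = max(i1, i2)
    io1 = io1 + 1.0       # an input/output argument appears on both sides
    return o1, o2, o3, o4, io1


io1 = 10.0
o1, o2, o3, o4, io1 = somef(2.0, 3.0, io1)
print(o1, o2, o3, o4, io1)
```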
Instead of editing the interface file, we can add special F2PY comments in the Fortran source code:
      subroutine somef(i1, i2, o1, o2, o3, o4, io1)
      real*8 i1, i2, o1, o2, o3, o4, io1
Cf2py intent(in) i1
Cf2py intent(in) i2
Cf2py intent(out) o1
Cf2py intent(out) o2
Cf2py intent(out) o3
Cf2py intent(out) o4
Cf2py intent(in,out) io1
Now a single F2PY command generates correct interface:
    f2py -m hw -c hw.f
Let us implement the hw module in C:
    #include <stdio.h>
    #include <math.h>
    #include <stdlib.h>

    double hw1(double r1, double r2)
    {
      double s;  s = sin(r1 + r2);  return s;
    }

    void hw2(double r1, double r2)
    {
      double s;  s = sin(r1 + r2);
      printf("Hello, World! sin(%g+%g)=%g\n", r1, r2, s);
    }

    /* special version of hw1 where the result is an argument: */
    void hw3(double r1, double r2, double *s)
    {
      *s = sin(r1 + r2);
    }
Using F2PY
F2PY can also wrap C code if we specify the function signatures as Fortran 90 modules
My procedure:
    write the C functions as empty Fortran 77 functions or subroutines
    run F2PY on the Fortran specification to generate an interface file
    run F2PY with the interface file and the C source code
Step 1: Write Fortran 77 signatures
C file signatures.f
      real*8 function hw1(r1, r2)
Cf2py intent(c) hw1
      real*8 r1, r2
Cf2py intent(c) r1, r2
      end
      subroutine hw2(r1, r2)
Cf2py intent(c) hw2
      real*8 r1, r2
Cf2py intent(c) r1, r2
      end
      subroutine hw3(r1, r2, s)
Cf2py intent(c) hw3
      real*8 r1, r2, s
Cf2py intent(c) r1, r2
Cf2py intent(out) s
      end
Step 2: Generate interface file
Run
    Unix/DOS> f2py -m hw -h hw.pyf signatures.f
Step 3: compile C code into extension module
Run
    Unix/DOS> f2py -c hw.pyf hw.c
Using SWIG
Wrappers to C and C++ codes can be automatically generated by SWIG
SWIG is more complicated to use than F2PY
First make a SWIG interface file
Then run SWIG to generate wrapper code
Then compile and link the C code and the wrapper code
SWIG interface file
The interface file contains C preprocessor directives and special SWIG directives:
    /* file: hw.i */
    %module hw
    %{
    /* include C header files necessary to compile the interface */
    #include "hw.h"
    %}

    /* list functions to be interfaced: */
    double hw1(double r1, double r2);
    void   hw2(double r1, double r2);
    void   hw3(double r1, double r2, double *s);
    # or
    %include "hw.h"  /* make interface to all funcs in hw.h */
Note the underscore prefix in _hw.so (these statements are found in make_module_1.sh)
The module consists of two files: hw.py (which loads) _hw.so
Building modules with Distutils (1)
Python has a tool, Distutils, for compiling and linking extension modules
First write a script setup.py:
    import os
    from distutils.core import setup, Extension
    name = 'hw'      # name of the module
    version = 1.0    # the module's version number
    swig_cmd = 'swig -python -I.. %s.i' % name
    print 'running SWIG:', swig_cmd
    os.system(swig_cmd)
    sources = ['../hw.c', 'hw_wrap.c']
    setup(name = name, version = version,
          ext_modules = [Extension('_' + name,  # SWIG requires _
                                   sources,
                                   include_dirs=[os.pardir])
                         ])
Building modules with Distutils (2)
Now run
    python setup.py build_ext
    python setup.py install --install-platlib=.
    python -c 'import hw'  # test
Can install resulting module files in any directory
Use Distutils for professional distribution!
Other tools Integrating Python with C++
This is like interfacing C functions, except that pointers are usually replaced by references
    void hw3(double r1, double r2, double *s)  // C style
    { *s = sin(r1 + r2); }
    void hw4(double r1, double r2, double& s)  // C++ style
    { s = sin(r1 + r2); }
Interface file (hw.i):
    %module hw
    %{
    #include "hw.h"
    %}
    %include "typemaps.i"
    %apply double *OUTPUT { double* s }
    %apply double *OUTPUT { double& s }
    %include "hw.h"
That’s it!
Interfacing C++ classes Function bodies and usage
Adding a class method
Calling message with standard output (std::cout) is tricky from Python so we add a print method for printing to standard output
print coincides with Python’s keyword print so we follow the convention of adding an underscore:
    %extend HelloWorld {
        void print_() { self->message(std::cout); }
    }
This is basically C++ syntax, but self is used instead of this and %extend HelloWorld is a SWIG directive
Make extension module:
    swig -python -c++ -I.. hw.i
    # compile HelloWorld.cpp HelloWorld2.cpp hw_wrap.cxx
    # link HelloWorld.o HelloWorld2.o hw_wrap.o to _hw.so
Using the module
    hw = HelloWorld()    # make class instance
    r1 = float(sys.argv[1]);  r2 = float(sys.argv[2])
    hw.set(r1, r2)       # call instance method
    s = hw.get()
    print "Hello, World! sin(%g + %g)=%g" % (r1, r2, s)
    hw.print_()

    hw2 = HelloWorld2()  # make subclass instance
    hw2.set(r1, r2)
    s = hw2.gets()       # original output arg. is now return value
    print "Hello, World2! sin(%g + %g)=%g" % (r1, r2, s)
Remark
SWIG also makes a proxy class in hw.py, mirroring the original C++ class:
import hw # use hw.py interface to _hw.so
c = hw.HelloWorld()
c.set(r1, r2) # calls _hw.HelloWorld_set(r1, r2)

Steering Fortran code from Python
Computational steering
[Figure: plot of y(t)]
F77/C and NumPy arrays share the same data
Result:
steer simulations through scripts
do low-level numerics efficiently in C/F77
send simulation data to a plotting program
The best of all worlds?

Example on computational steering
Consider the oscillator code. The following interactive features would be nice:
set parameter values
run the simulator for a number of steps and visualize
change a parameter
option: rewind a number of steps
continue simulation and visualization
About the F77 code
Physical and numerical parameters are in a common block
scan2 sets parameters in this common block:
      subroutine scan2(m_, b_, c_, A_, w_, y0_, tstop_, dt_, func_)
      real*8 m_, b_, c_, A_, w_, y0_, tstop_, dt_
      character func_*(*)
can use scan2 to send parameters from Python to F77
timeloop2 performs nsteps time steps:
      subroutine timeloop2(y, n, maxsteps, step, time, nsteps)
      integer n, step, nsteps, maxsteps
      real*8 time, y(n,0:maxsteps-1)
solution available in y

Creating a Python interface w/F2PY
scan2: trivial (only input arguments)
timeloop2: need to be careful with
output and input/output arguments
multi-dimensional arrays (y)
Note: multi-dimensional arrays are stored differently in Python (i.e. C) and Fortran!
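The note on storage order can be checked from the Python side. A minimal sketch, assuming a current NumPy and Python 3 print syntax (the slides use Python 2); the small sizes n=2 and maxsteps=5 are chosen only for illustration:

```python
import numpy as np

n, maxsteps = 2, 5

# C (row-major) layout is NumPy's default
y_c = np.zeros((n, maxsteps))
# Fortran (column-major) layout, the layout an F77 routine expects
y_f = np.zeros((n, maxsteps), order='F')

print(y_c.flags['C_CONTIGUOUS'], y_c.flags['F_CONTIGUOUS'])
print(y_f.flags['C_CONTIGUOUS'], y_f.flags['F_CONTIGUOUS'])

# strides = number of bytes to the next element along each axis
print(y_c.strides)   # last index varies fastest in memory
print(y_f.strides)   # first index varies fastest in memory

# an existing C-ordered array can be converted (this copies the data)
y_conv = np.asfortranarray(y_c)
```

Allocating the array in Fortran order up front avoids a copy of this kind each time the array crosses the Python/Fortran boundary.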
This is how we would like to write the Python code:
maxsteps = 10000; n = 2
y = zeros((n,maxsteps), order='Fortran')
step = 0; time = 0.0

def run(nsteps):
    global step, time, y
    y, step, time = \
       oscillator.timeloop2(y, step, time, nsteps)
    y1 = y[0,0:step+1]
    g.plot(Gnuplot.Data(t, y1, with='lines'))

Subroutine signature:
      subroutine timeloop2(y, n, maxsteps, step, time, nsteps)
      integer n, step, nsteps, maxsteps
      real*8 time, y(n,0:maxsteps-1)
Arguments:
y        : solution (all time steps), input and output
n        : no of solution components (2 in our example), input
maxsteps : max no of time steps, input
step     : no of current time step, input and output
time     : current value of time, input and output
nsteps   : no of time steps to advance the solution
Interfacing the timeloop2 routine
Use Cf2py comments to specify argument type:
Cf2py intent(in,out) step
Cf2py intent(in,out) time
Cf2py intent(in,out) y
Cf2py intent(in) nsteps
Run F2PY:
f2py -m oscillator -c --build-dir tmp1 --fcompiler='Gnu' \
     ../timeloop2.f \
     $scripting/src/app/oscillator/F77/oscillator.f \
     only: scan2 timeloop2 :

Testing the extension module
Import and print documentation:
>>> import oscillator
>>> print oscillator.__doc__
This module 'oscillator' is auto-generated with f2py
Functions:
  y,step,time = timeloop2(y,step,time,nsteps,
                          n=shape(y,0),maxsteps=shape(y,1))
  scan2(m_,b_,c_,a_,w_,y0_,tstop_,dt_,func_)
COMMON blocks:
  /data/ m,b,c,a,w,y0,tstop,dt,func(20)
Note: array dimensions (n, maxsteps) are moved to the end of the argument list and given default values!
Rule: always print and study the doc string since F2PY perturbs the argument list
Directory with Python interface to the oscillator code:
src/py/mixed/simviz/f2py/
Files:
simviz_steering.py : complete script running oscillator from Python by calling F77 routines
simvizGUI_steering.py : as simviz_steering.py, but with a GUI
make_module.sh : build extension module

The demonstrated functionality can be coded in Matlab
Why Python + F77?
We can define our own interface in a much more powerful language (Python) than Matlab
We can much more easily transfer data to and from our own F77 or C or C++ libraries
We can use any appropriate visualization tool
We can call up Matlab if we want
Python + F77 gives tailored interfaces and maximum flexibility
Contents
Python has interfaces to the GUI toolkits
Tk (Tkinter)
Qt (PyQt)
wxWidgets (wxPython)
Gtk (PyGtk)
Java Foundation Classes (JFC) (java.swing in Jython)
Microsoft Foundation Classes (PythonWin)

Tkinter has been the default Python GUI toolkit
Most Python installations support Tkinter
PyGtk, PyQt and wxPython are increasingly popular and more sophisticated toolkits
These toolkits require huge C/C++ libraries (Gtk, Qt, wxWindows) to be installed on the user's machine
Some prefer to generate GUIs using an interactive designer tool, which automatically generates calls to the GUI toolkit
Some prefer to program the GUI code (or automate that process)
It is very wise (and necessary) to learn some GUI programming even if you end up using a designer tool
We treat Tkinter (with extensions) here since it is so widely available and simpler to use than its competitors
See doc.html for links to literature on PyGtk, PyQt, wxPython and associated designer tools
Intro to GUI programming – p. 159 Intro to GUI programming – p. 160
More info
Ch. 6 in the course book
“Introduction to Tkinter” by Lundh (see doc.html)
Efficient working style: grab GUI code from examples

Tkinter, Pmw and Tix
Tkinter is an interface to the Tk package in C (for Tcl/Tk)
Megawidgets, built from basic Tkinter widgets, are available in Pmw (Python megawidgets) and Tix

Graphical user interface (GUI) for computing the sine of numbers
The complete window is made of widgets (also referred to as windows)
Widgets from left to right:
a label with "Hello, World! The sine of"
a text entry where the user can write a number
pressing the button "equals" computes the sine of the number
a label displays the sine value

#!/usr/bin/env python
from Tkinter import *
import math
root = Tk()               # root (main) window
top = Frame(root)         # create frame (good habit)
top.pack(side='top')      # pack frame in main window
hwtext = Label(top, text='Hello, World! The sine of')
hwtext.pack(side='left')
r = StringVar()  # special variable to be attached to widgets
r.set('1.2')     # default value
r_entry = Entry(top, width=6, relief='sunken', textvariable=r)
r_entry.pack(side='left')
The widgets define the event responses
One can bind any keyboard or mouse event to user-defined functions
We have also replaced the "equals" button by a straight label

The pack command determines the placement of the widgets:
widget.pack(side='left')
This results in stacking widgets from left to right
Packing from top to bottom:
widget.pack(side='top')
results in widgets stacked from top to bottom
# platform-independent font name: padx and pady adds space around widgets:
font = ’times 18 bold’
# or X11-style: hwtext.pack(side=’top’, pady=20)
font = ’-adobe-times-bold-r-normal-*-18-*-*-*-*-*-*-*’ rframe.pack(side=’top’, padx=10, pady=20)
hwtext = Label(hwframe, text=’Hello, World!’,
font=font)
sticky=’w’ means anchor=’w’ So far: variables tied to text entry and result label
(move to west) Another method:
sticky=’ew’ means fill=’x’ ask text entry about its content
(move to east and west) update result label with configure
sticky=’news’ means fill=’both’ Can use configure to update any widget property
(expand in all dirs)
With the basic knowledge of GUI programming, you may try out a
designer tool for interactive automatic generation of a GUI
Glade: designer tool for PyGtk
Gtk, PyGtk and Glade must be installed (not part of Python!)
No variable is tied to the entry: See doc.html for introductions to Glade
r_entry = Entry(rframe, width=6, relief=’sunken’)
r_entry.insert(’end’,’1.2’) # insert default value Working style: pick a widget, place it in the GUI window, open a
properties dialog, set packing parameters, set callbacks (signals in
r = float(r_entry.get()) PyGtk), etc.
s = math.sin(r)
s_label.configure(text=str(s))
Glade stores the GUI in an XML file
The GUI is hence separate from the application code
Other properties can be configured:
s_label.configure(background=’yellow’)
self.s = StringVar() # variable to be attached to s_label Event bindings call functions that take an event object as argument:
s_label = Label(rframe, textvariable=self.s, width=12)
s_label.pack(side=’left’) self.parent.bind(’<q>’, self.quit)
# finally, make a quit button: def quit(self,event): # the event arg is required!
quit_button = Button(top, text=’Goodbye, GUI World!’, self.parent.quit()
command=self.quit,
background=’yellow’, foreground=’blue’) Button must call a quit function without arguments:
quit_button.pack(side=’top’, pady=5, fill=’x’)
self.parent.bind(’<q>’, self.quit) def quit():
self.parent.quit()
def quit(self, event=None):
self.parent.quit() quit_button = Button(frame, text=’Goodbye, GUI World!’,
command=quit)
def comp_s(self, event=None):
self.s.set(’%g’ % math.sin(float(self.r.get())))
Here is a unified quit function that can be used with buttons and event bindings:
def quit(self, event=None):
    self.parent.quit()
Keyword arguments and None as default value make Python programming effective!

Label + entry + label + entry + button + label
# f_widget, x_widget are text entry widgets
f_txt = f_widget.get() # get function expression as string
x = float(x_widget.get()) # get x as float
#####
res = eval(f_txt) # turn f_txt expression into Python code
#####
label.configure(text='%g' % res) # display f(x)
Turn strings into code: eval and exec A GUI for simviz1.py
eval(s) evaluates a Python expression s Recall simviz1.py: automating simulation and visualization of an
eval(’sin(1.2) + 3.1**8’) oscillating system via a simple command-line interface
GUI interface:
exec(s) executes the string s as Python code
s = ’x = 3; y = sin(1.2*x) + x**8’
exec(s)
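The two calls can be tried in isolation. A minimal sketch in Python 3 syntax, where exec is a function; the explicit namespace dictionaries are an addition for illustration, so that it is clear where sin comes from and where x and y end up:

```python
from math import sin

# eval: evaluate one expression and return its value
value = eval('sin(1.2) + 3.1**8', {'sin': sin})

# exec: run statements; the variables they create land in the namespace dict
namespace = {'sin': sin}
exec('x = 3; y = sin(1.2*x) + x**8', namespace)

print(value)
print(namespace['x'], namespace['y'])
```

Passing a dictionary keeps the executed code from silently creating or overwriting variables in the calling program.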
Use three frames: left, middle, right Version 1 of creating a text field: straightforward packing of labels
Place sliders in the left frame and entries in frames:
def textentry(self, parent, variable, label):
Place text entry fields in the middle frame """make a textentry field tied to variable"""
f = Frame(parent)
Place a sketch of the system in the right frame f.pack(side=’top’, padx=2, pady=2)
l = Label(f, text=label)
l.pack(side=’left’)
widget = Entry(f, textvariable=variable, width=8)
widget.pack(side=’left’, anchor=’w’)
return widget
The text entry frames (f) get centered: Use the grid geometry manager to place labels and text entry fields
in a spreadsheet-like fashion:
def textentry(self, parent, variable, label):
"""make a textentry field tied to variable"""
l = Label(parent, text=label)
l.grid(column=0, row=self.row_counter, sticky=’w’)
widget = Entry(parent, textvariable=variable, width=8)
widget.grid(column=1, row=self.row_counter)
self.row_counter += 1
return widget
You can mix the use of grid and pack, but not within the same frame
Ugly!
Example: display a file in a text widget Solution: combine the expand and fill options to pack:
root = Tk() text.pack(expand=1, fill=’both’)
top = Frame(root); top.pack(side=’top’) # all parent widgets as well:
text = Pmw.ScrolledText(top, ... top.pack(side=’top’, expand=1, fill=’both’)
text.pack()
# insert file as a string in the text widget: expand allows the widget to expand, fill tells in which directions
text.insert(’end’, open(filename,’r’).read()) the widget is allowed to expand
Problem: the text widget is not resized when the main window is Try fileshow1.py and fileshow2.py!
resized Resizing is important for text, canvas and list widgets
Array computing and visualization – p. 209 Array computing and visualization – p. 210
Ch. 4 in the course book NumPy enables efficient numerical computing in Python
www.scipy.org NumPy is a package of modules, which offers efficient arrays
The NumPy manual (contiguous storage) with associated array operations coded in C or
Fortran
The SciPy tutorial
There are three implementations of Numerical Python
Numeric from the mid 90s (still widely used)
numarray from about 2000
numpy from 2006
We recommend to use numpy (by Travis Oliphant)
from numpy import *
A taste of NumPy: a least-squares procedure Resulting plot
Array with a sequence of numbers
linspace(a, b, n) generates n uniformly spaced coordinates, starting with a and ending with b
>>> x = linspace(-5, 5, 11)
>>> print x
[-5. -4. -3. -2. -1. 0. 1. 2. 3. 4. 5.]
A special compact syntax is also available:
>>> a = r_[-5:5:11j] # same as linspace(-5, 5, 11)
>>> print a
[-5. -4. -3. -2. -1. 0. 1. 2. 3. 4. 5.]

Warning: arange is dangerous
arange's upper limit may or may not be included (due to round-off errors)
Better to use a safer method: seq(start, stop, increment)
>>> from scitools.numpyutils import seq
>>> x = seq(-5, 5, 1)
>>> print x # upper limit always included
[-5. -4. -3. -2. -1. 0. 1. 2. 3. 4. 5.]
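The round-off issue can be explored with a short experiment. A sketch in Python 3 syntax; seq_sketch is a hypothetical stand-in written for this example and is not the scitools implementation of seq:

```python
import numpy as np

# linspace: the number of points is explicit and both end points are included
x = np.linspace(-5, 5, 11)
print(len(x), x[0], x[-1])

# arange: the length is derived from (stop - start)/step in floating point,
# so whether a value close to stop appears depends on round-off
r = np.arange(1, 1.3, 0.1)
print(len(r), r)

def seq_sketch(start, stop, step):
    """Hypothetical helper: always include stop by fixing the point count."""
    n = int(round((stop - start) / float(step))) + 1
    return np.linspace(start, stop, n)

s = seq_sketch(1, 1.3, 0.1)
print(len(s), s[-1])
```

Fixing the number of points first and letting linspace place them removes the dependence on how (stop - start)/step happens to round.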
Changing array dimensions Array initialization from a Python function
Note: all integer indices in Python start at 0!
a = linspace(-1, 1, 6)
a[2:4] = -1      # set a[2] and a[3] equal to -1
a[-1] = a[0]     # set last element equal to first one
a[:] = 0         # set all elements of a equal to 0
a.fill(0)        # set all elements of a equal to 0
a.shape = (2,3)  # turn a into a 2x3 matrix
print a[0,1]     # print element (0,1)
a[i,j] = 10      # assignment to element (i,j)
a[i][j] = 10     # equivalent syntax (slower)
print a[:,k]     # print column with index k
print a[1,:]     # print second row
a[:,:] = 0       # set all elements of a equal to 0

>>> a = linspace(0, 29, 30)
>>> a.shape = (5,6)
>>> a
array([[  0.,   1.,   2.,   3.,   4.,   5.],
       [  6.,   7.,   8.,   9.,  10.,  11.],
       [ 12.,  13.,  14.,  15.,  16.,  17.],
       [ 18.,  19.,  20.,  21.,  22.,  23.],
       [ 24.,  25.,  26.,  27.,  28.,  29.]])
>>> a[1:3,:-1:2] # a[i,j] for i=1,2 and j=0,2,4
array([[  6.,   8.,  10.],
       [ 12.,  14.,  16.]])
>>> a[::3,2:-1:2] # a[i,j] for i=0,3 and j=2,4
array([[  2.,   4.],
       [ 20.,  22.]])
>>> i = slice(None, None, 3); j = slice(2, -1, 2)
>>> a[i,j]
array([[  2.,   4.],
       [ 20.,  22.]])
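The slicing session above can be reproduced as a script. A small sketch in Python 3 syntax with an explicit numpy import (the slides rely on from numpy import *):

```python
import numpy as np

a = np.linspace(0, 29, 30).reshape(5, 6)

# rows 1 and 2; columns 0, 2, 4 (stop at the last column, step 2)
sub1 = a[1:3, :-1:2]

# every third row; columns 2 and 4
sub2 = a[::3, 2:-1:2]

# the same selection expressed with explicit slice objects
i = slice(None, None, 3)
j = slice(2, -1, 2)
sub3 = a[i, j]

print(sub1)
print(sub2)
print(np.array_equal(sub2, sub3))
```

Slice objects are handy when the index ranges are computed at run time and cannot be written literally inside the brackets.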
Slices refer to the array data
With a as list, a[:] makes a copy of the data
With a as array, a[:] is a reference to the data
>>> b = a[1,:] # extract 2nd row of a
>>> print a[1,1]
12.0
>>> b[1] = 2
>>> print a[1,1]
2.0 # change in b is reflected in a!
Take a copy to avoid referencing via slices:
>>> b = a[1,:].copy()
>>> print a[1,1]
12.0
>>> b[1] = 2 # b and a are two different arrays now
>>> print a[1,1]
12.0 # a is not affected by change in b

Loops over arrays (1)
Standard loop over each element:
for i in xrange(a.shape[0]):
    for j in xrange(a.shape[1]):
        a[i,j] = (i+1)*(j+1)*(j+2)
        print 'a[%d,%d]=%g ' % (i,j,a[i,j]),
    print # newline after each row
A standard for loop iterates over the first index:
>>> print a
[[  2.   6.  12.]
 [  4.  12.  24.]]
>>> for e in a:
...     print e
...
[  2.   6.  12.]
[  4.  12.  24.]
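The difference between a view and a copy can also be checked programmatically. A sketch in Python 3 syntax; np.shares_memory is used here only as a convenient check and does not appear in the slides:

```python
import numpy as np

a = np.array([[2., 6., 12.],
              [4., 12., 24.]])

b = a[1, :]           # slice of an array: a view on the same data
b[1] = 2              # also changes a[1,1]

c = a[1, :].copy()    # explicit copy: independent data
c[1] = -99            # a is not affected

lst = [1, 2, 3]
lst_copy = lst[:]     # slice of a list: a copy
lst_copy[0] = 100     # lst is not affected

print(a[1, 1], np.shares_memory(a, b), np.shares_memory(a, c), lst[0])
```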
View array as one-dimensional and iterate over all elements:
for e in a.ravel():
    print e
Use ravel() only when reading elements, for assigning it is better to use shape or reshape first!
For loop over all index tuples and values:
>>> for index, value in ndenumerate(a):
...     print index, value
...
(0, 0) 2.0
(0, 1) 6.0
(0, 2) 12.0
(1, 0) 4.0
(1, 1) 12.0
(1, 2) 24.0

Arithmetic operations can be used with arrays:
b = 3*a - 1 # a is array, b becomes array
1) compute t1 = 3*a, 2) compute t2 = t1 - 1, 3) set b = t2
Array operations are much faster than element-wise operations:
>>> import time # module for measuring CPU time
>>> a = linspace(0, 1, 1E+07) # create some array
>>> t0 = time.clock()
>>> b = 3*a - 1
>>> t1 = time.clock() # t1-t0 is the CPU time of 3*a-1
>>> for i in xrange(a.size): b[i] = 3*a[i] - 1
>>> t2 = time.clock()
>>> print '3*a-1: %g sec, loop: %g sec' % (t1-t0, t2-t1)
3*a-1: 2.09 sec, loop: 31.27 sec
Standard math functions can take array arguments Other useful array operations
More useful array methods and attributes Modules for curve plotting and 2D/3D visualization
Easyviz is a light-weight interface to many plotting packages, using a Matlab-like syntax
Goal: write your program using Easyviz (“Matlab”) syntax and postpone your choice of plotting package
Note: some powerful plotting packages (Vtk, R, matplotlib, ...) may be troublesome to install, while Gnuplot is easily installed on all platforms
Easyviz supports (only) the most common plotting commands
(imports all of numpy, all of easyviz, plus scitools)

from scitools.all import * # import numpy and plotting
t = linspace(0, 3, 51)     # 51 points between 0 and 3
y = t**2*exp(-t**2)        # vectorized expression
plot(t, y)
hardcopy('tmp1.eps') # make PostScript image for reports
hardcopy('tmp1.png') # make PNG image for web pages
[Figure: plot of y versus t for t in [0, 3]]
plot(t, y)
xlabel('t')
ylabel('y')
legend('t^2*exp(-t^2)')
axis([0, 3, -0.05, 0.6]) # [tmin, tmax, ymin, ymax]
title('My First Easyviz Demo')
# or
plot(t, y, xlabel='t', ylabel='y',
     legend='t^2*exp(-t^2)',
     axis=[0, 3, -0.05, 0.6],
     title='My First Easyviz Demo',
     hardcopy='tmp1.eps',
     show=True) # display on the screen (default)
[Figure: "My First Easyviz Demo", curve t^2*exp(-t^2), y versus t]
Plotting several curves in one plot
Compare $f_1(t) = t^2 e^{-t^2}$ and $f_2(t) = t^4 e^{-t^2}$ for $t \in [0, 3]$
from scitools.all import * # for curve plotting

def f1(t):
    return t**2*exp(-t**2)

def f2(t):
    return t**2*f1(t)

t = linspace(0, 3, 51)
y1 = f1(t)
y2 = f2(t)
plot(t, y1)
hold('on') # continue plotting in the same plot
plot(t, y2)
xlabel('t')
ylabel('y')
legend('t^2*exp(-t^2)', 't^4*exp(-t^2)')
title('Plotting two curves in the same plot')
hardcopy('tmp2.eps')

The resulting plot
[Figure: "Plotting two curves in the same plot", curves t^2*exp(-t^2) and t^4*exp(-t^2), y versus t]
Example: plot a function given on the command line
Task: plot (e.g.) $f(x) = e^{-0.2x}\sin(2\pi x)$ for $x \in [0, 4\pi]$
Specify f(x) and x interval as text on the command line:
Unix/DOS> python plotf.py "exp(-0.2*x)*sin(2*pi*x)" 0 4*pi
Program:
from scitools.all import *
formula = sys.argv[1]
xmin = eval(sys.argv[2])
xmax = eval(sys.argv[3])
x = linspace(xmin, xmax, 101)
y = eval(formula)
plot(x, y, title=formula)
Thanks to eval, input (text) with correct Python syntax can be turned to running code on the fly

Plotting 2D scalar fields
from scitools.all import *
x = y = linspace(-5, 5, 21)
xv, yv = ndgrid(x, y)
values = sin(sqrt(xv**2 + yv**2))
surf(xv, yv, values)
[Figure: surface plot of values over the (x, y) grid]
Adding plot features The resulting plot
Other commands for visualizing 2D scalar fields Commands for visualizing 3D fields
More info about Easyviz
Similar class concept as in Java and C++
All functions are virtual
There is no technical way of preventing users from manipulating data and methods in an object
Convention: attributes and methods starting with an underscore are treated as non-public (“protected”)
Names starting with a double underscore are considered strictly private (Python mangles the class name into the attribute name in this case: __some defined in class MyClass actually has the name _MyClass__some)
class MyClass:
    def __init__(self):
        self._a = False # non-public
        self.b = 0      # public
        self.__c = 0    # private

Declare a base class MyBase:
class MyBase:
i1 is MyBase, i2 is MySub
Dictionary of user-defined attributes:
>>> i1.__dict__ # dictionary of user-defined attributes
{'i': 5, 'j': 7}
>>> i2.__dict__
{'i': 7, 'k': 9, 'j': 8}
Name of class, name of method:
>>> i2.__class__.__name__ # name of class
'MySub'
>>> i2.write.__name__ # name of method
'write'
List names of all methods and attributes:
>>> dir(i2)
['__doc__', '__init__', '__module__', 'i', 'j', 'k', 'write']
Use isinstance for testing class type: Attributes can be added at run time (!)
if isinstance(i2, MySub): >>> class G: pass
# treat i2 as a MySub instance
>>> g = G()
>>> dir(g)
Can test if a class is a subclass of another: [’__doc__’, ’__module__’] # no user-defined attributes
if issubclass(MySub, MyBase):
... >>> # add instance attributes:
>>> g.xmin=0; g.xmax=4; g.ymin=0; g.ymax=1
Can test if two objects are of the same class: >>> dir(g)
[’__doc__’, ’__module__’, ’xmax’, ’xmin’, ’ymax’, ’ymin’]
if inst1.__class__ is inst2.__class__ >>> g.xmin, g.xmax, g.ymin, g.ymax
(0, 4, 0, 1)
(is checks object identity, == checks for equal contents)
>>> # add static variables:
a.__class__ refers the class object of instance a >>> G.xmin=0; G.xmax=2; G.ymin=-1; G.ymax=1
>>> g2 = G()
>>> g2.xmin, g2.xmax, g2.ymin, g2.ymax # static variables
(0, 2, -1, 1)
Can work with __dict__ directly: Special methods have leading and trailing double underscores (e.g.
>>> i2.__dict__[’q’] = ’some string’ __str__)
>>> i2.q Here are some operations defined by special methods:
’some string’
>>> dir(i2) len(a) # a.__len__()
[’__doc__’, ’__init__’, ’__module__’, c = a*b # c = a.__mul__(b)
’i’, ’j’, ’k’, ’q’, ’write’] a = a+b # a = a.__add__(b)
a += c # a.__iadd__(c)
d = a[3] # d = a.__getitem__(3)
a[3] = 0 # a.__setitem__(3, 0)
f = a(1.2, True) # f = a.__call__(1.2, True)
if a: # if a.__len__()>0: or if a.__nonzero__():
Suppose we need a function of x and y with three additional parameters a, b, and c:
def f(x, y, a, b, c):
    return a + b*x + c*y*y
Suppose we need to send this function to another function
def gridvalues(func, xcoor, ycoor, file):
    for i in range(len(xcoor)):
        for j in range(len(ycoor)):
            f = func(xcoor[i], ycoor[j])
            file.write('%g %g %g\n' % (xcoor[i], ycoor[j], f))
func is expected to be a function of x and y only (many libraries need to make such assumptions!)
How can we send our f function to gridvalues?

Solution 1: global parameters
global a, b, c
...
def f(x, y):
    return a + b*x + c*y*y
...
a = 0.5; b = 1; c = 0.01
gridvalues(f, xcoor, ycoor, somefile)
Global variables are usually considered evil
Solution 2: keyword arguments for parameters
def f(x, y, a=0.5, b=1, c=0.01):
    return a + b*x + c*y*y
...
gridvalues(f, xcoor, ycoor, somefile)
useless for other values of a, b, c
Make a class with function behavior instead of a pure function
The parameters are class attributes
Class instances can be called as ordinary functions, now with x and y as the only formal arguments
class F:
    def __init__(self, a=1, b=1, c=1):
        self.a = a; self.b = b; self.c = c

    def __call__(self, x, y): # special method!
        return self.a + self.b*x + self.c*y*y

f = F(a=0.5, c=0.01)
# can now call f as
v = f(0.1, 2)
...
gridvalues(f, xcoor, ycoor, somefile)

__init__(self [, args]): constructor
__del__(self): destructor (seldom needed since Python offers automatic garbage collection)
__str__(self): string representation for pretty printing of the object (called by print or str)
__repr__(self): string representation for initialization (a==eval(repr(a)) is true)
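The callable-class solution can be run end to end. A sketch in Python 3 syntax; io.StringIO stands in for the output file somefile, and the short coordinate lists are made up for the example:

```python
import io

class F:
    """Function of x and y with parameters a, b, c stored as attributes."""
    def __init__(self, a=1, b=1, c=1):
        self.a = a; self.b = b; self.c = c

    def __call__(self, x, y):
        return self.a + self.b*x + self.c*y*y

def gridvalues(func, xcoor, ycoor, file):
    # func is only ever called with two arguments, x and y
    for x in xcoor:
        for y in ycoor:
            file.write('%g %g %g\n' % (x, y, func(x, y)))

f = F(a=0.5, c=0.01)       # b keeps its default value 1
out = io.StringIO()        # in-memory stand-in for a file
gridvalues(f, [0.0, 0.1], [0.0, 2.0], out)
print(out.getvalue())
```

gridvalues never needs to know about a, b and c; a different parameter set only requires a new F instance.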
__eq__(self, x): for equality (a==b), should return True or False
__cmp__(self, x): for comparison (<, <=, >, >=, ==, !=); return negative integer, zero or positive integer if self is less than, equal or greater than x (resp.)
__len__(self): length of object (called by len(x))
__call__(self [, args]): calls like a(x,y) implies a.__call__(x,y)

__getitem__(self, i): used for subscripting: b = a[i]
__setitem__(self, i, v): used for subscripting: a[i] = v
__delitem__(self, i): used for deleting: del a[i]
These three functions are also used for slices: a[p:q:r] implies that i is a slice object with attributes start (p), stop (q) and step (r)
b = a[:-1]
# implies
b = a.__getitem__(i)
isinstance(i, slice) is True
i.start is None
i.stop is -1
i.step is None
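What __getitem__ actually receives can be inspected with a tiny class. A sketch in Python 3 syntax; Probe is a hypothetical helper that simply returns its index argument:

```python
class Probe:
    """Hypothetical helper: __getitem__ returns the index object it gets."""
    def __getitem__(self, i):
        return i

a = Probe()

i = a[:-1]        # slice syntax: a slice object is passed
print(isinstance(i, slice), i.start, i.stop, i.step)

k = a[1:10:2]     # start, stop and step all given
print(k.start, k.stop, k.step)

idx = a[3]        # plain subscript: the integer itself is passed
print(idx)
```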
__add__(self, b): used for self+b, i.e., x+y implies x.__add__(y)
__sub__(self, b): self-b
__mul__(self, b): self*b
__div__(self, b): self/b
__pow__(self, b): self**b or pow(self,b)

__iadd__(self, b): self += b
__isub__(self, b): self -= b
__imul__(self, b): self *= b
__idiv__(self, b): self /= b

__radd__(self, b): This method defines b+self, while __add__(self, b) defines self+b. If a+b is encountered and a does not have an __add__ method, b.__radd__(a) is called if it exists (otherwise a+b is not defined).
Similar methods: __rsub__, __rmul__, __rdiv__

__int__(self): conversion to integer (int(a) makes an a.__int__() call)
__float__(self): conversion to float
__hex__(self): conversion to hexadecimal number
Documentation of special methods: see the Python Reference Manual (not the Python Library Reference!), follow link from index “overloading - operator”
Static data (or class variables) are common to all instances
>>> class Point:
        counter = 0 # static variable, counts no of instances
        def __init__(self, x, y):
            self.x = x; self.y = y;
            Point.counter += 1
>>> for i in range(1000):
        p = Point(i*0.01, i*0.001)

New-style classes allow static methods (methods that can be called without having an instance)
class Point(object):
    _counter = 0
    def __init__(self, x, y):
        self.x = x; self.y = y; Point._counter += 1
    def ncopies(): return Point._counter
    ncopies = staticmethod(ncopies)
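The two Point variants can be combined into one runnable example. A sketch in Python 3 syntax, using the @staticmethod decorator instead of the explicit ncopies = staticmethod(ncopies) assignment; the two forms are equivalent:

```python
class Point:
    _counter = 0                 # class (static) variable shared by all instances

    def __init__(self, x, y):
        self.x = x; self.y = y
        Point._counter += 1      # update the class variable, not an instance one

    @staticmethod
    def ncopies():
        # no self argument: callable without an instance
        return Point._counter

print(Point.ncopies())           # callable before any instance exists
for i in range(1000):
    p = Point(i*0.01, i*0.001)
print(Point.ncopies(), p.ncopies())
```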
Use direct access if user is allowed to read and assign values to the Example: a is global, local, and class attribute
attribute a = 1 # global variable
Use properties to restrict access, with a corresponding underlying def f(x):
non-public class attribute a = 2 # local variable
Use properties when assignment or reading requires a set of class B:
associated operations def __init__(self):
self.a = 3 # class attribute
Never use get/set functions explicitly def scopes(self):
Attributes and functions are somewhat interchanged in this scheme a = 4 # local (method) variable
⇒ that’s why we use the same naming convention
Dictionaries with variable names as keys and variables as values:
myobj.compute_something()
myobj.my_special_variable = yourobj.find_values(x,y) locals() : local variables
globals() : global variables
vars() : local variables
vars(self) : class attributes
Variable interpolation with vars: exec and eval may take dictionaries for the global and local
class C(B): namespace:
def write(self): exec code in globals, locals
local_var = -1 eval(expr, globals, locals)
s = ’%(local_var)d %(global_var)d %(a)s’ % vars()
Example:
Problem: vars() returns dict with local variables and the string
needs global, local, and class variables a = 8; b = 9
d = {’a’:1, ’b’:2}
Primary solution: use printf-like formatting: eval(’a + b’, d) # yields 3
s = ’%d %d %d’ % (local_var, global_var, self.a) and
from math import *
More exotic solution: d[’b’] = pi
all = {} eval(’a+sin(b)’, globals(), d) # yields 1
for scope in (locals(), globals(), vars(self)):
all.update(scope) Creating such dictionaries can be handy
s = ’%(local_var)d %(global_var)d %(a)s’ % all
(but now we overwrite a...)
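Both the namespace dictionaries for eval and the merged dictionary for string interpolation can be tried directly. A sketch in Python 3 syntax; the three small dictionaries in the loop are made-up stand-ins for locals(), globals() and vars(self):

```python
from math import sin, pi

a = 8; b = 9                      # module-level variables, ignored below
d = {'a': 1, 'b': 2}
r1 = eval('a + b', d)             # a and b are looked up in d, not in the module

d['b'] = pi
d['sin'] = sin
r2 = eval('a + sin(b)', {}, d)    # empty globals, d as local namespace

# merge several scopes into one dictionary for %(name)s interpolation;
# later scopes overwrite earlier ones when names collide
merged = {}
for scope in ({'local_var': -1}, {'global_var': 1}, {'a': 3}):
    merged.update(scope)
s = '%(local_var)d %(global_var)d %(a)s' % merged

print(r1, r2, s)
```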
Recall the StringFunction-classes for turning string formulas into Idea: hold independent variable and “set parameters” code as strings
callable objects Exec these strings (to bring the variables into play) right before the
f = StringFunction(’1+sin(2*x)’) formula is evaluated
print f(1.2)
class StringFunction_v3:
def __init__(self, expression, independent_variable=’x’,
We would like: set_parameters=’’):
an arbitrary name of the independent variable self._f_compiled = compile(expression,
’<string>’, ’eval’)
parameters in the formula self._var = independent_variable # ’x’, ’t’ etc.
f = StringFunction_v3(’1+A*sin(w*t)’, self._code = set_parameters
independent_variable=’t’,
def set_parameters(self, code):
set_parameters=’A=0.1; w=3.14159’)
self._code = code
print f(1.2)
f.set_parameters(’A=0.2; w=3.14159’) def __call__(self, x):
print f(1.2) exec ’%s = %g’ % (self._var, x) # assign indep. var.
if self._code: exec(self._code) # parameters?
return eval(self._f_compiled)
The exec used in the __call__ method is slow! Ideas: hold parameters in a dictionary, set the independent variable
Think of a hardcoded function, into this dictionary, run eval with this dictionary as local namespace
def f1(x): Usage:
return sin(x) + x**3 + 2*x f = StringFunction_v4(’1+A*sin(w*t)’, A=0.1, w=3.14159)
and the corresponding StringFunction-like objects f.set_parameters(A=2) # can be done later
Test function: sin(x) + x**3 + 2*x Instead of eval in __call__ we may build a (lambda) function
f1 : 1 class StringFunction:
StringFunction_v1: 13 (because of uncompiled eval) def _build_lambda(self):
StringFunction_v2: 2.3 s = ’lambda ’ + ’, ’.join(self._var)
StringFunction_v3: 22 (because of exec in __call__) # add parameters as keyword arguments:
StringFunction_v4: 2.3 if self._prms:
StringFunction_v5: 3.1 (because of loop in __call__) s += ’, ’ + ’, ’.join([’%s=%s’ % (k, self._prms[k]) \
for k in self._prms])
s += ’: ’ + self._f
self.__call__ = eval(s, self._globals)
For a call
f = StringFunction(’A*sin(x)*exp(-b*t)’, A=0.1, b=1,
independent_variables=(’x’,’t’))
the s looks like
lambda x, t, A=0.1, b=1: A*sin(x)*exp(-b*t)
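The lambda-building idea can be sketched as a small self-contained class. This is a simplified illustration in Python 3 syntax, not the StringFunction class from the slides: the name StringFunctionSketch, the _source attribute and the use of the math module as the formula's namespace are assumptions. The lambda is stored in an attribute and called from __call__, since assigning to self.__call__ on an instance has no effect on f(...) in Python 3.

```python
import math

class StringFunctionSketch:
    """Hypothetical, simplified variant: turn a string formula into a lambda."""
    def __init__(self, expression, independent_variables=('x',), **prms):
        self._f = expression
        self._var = tuple(independent_variables)
        self._prms = prms
        self._globals = dict(vars(math))   # sin, exp, pi, ... for the formula
        self._build_lambda()

    def _build_lambda(self):
        s = 'lambda ' + ', '.join(self._var)
        # add parameters as keyword arguments with default values
        if self._prms:
            s += ', ' + ', '.join('%s=%r' % (k, v)
                                  for k, v in self._prms.items())
        s += ': ' + self._f
        self._source = s
        self._lambda = eval(s, self._globals)

    def set_parameters(self, **prms):
        self._prms.update(prms)
        self._build_lambda()               # rebuild with the new defaults

    def __call__(self, *args):
        return self._lambda(*args)

f = StringFunctionSketch('A*sin(x)*exp(-b*t)', A=0.1, b=1,
                         independent_variables=('x', 't'))
print(f._source)
print(f(1.0, 0.5))
f.set_parameters(A=2)
print(f(1.0, 0.5))
```

The string is evaluated only when the object is built or a parameter changes; each call then runs an ordinary compiled function.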
but there is some overhead associated with the __call__ op.
Trick: extract the underlying method and call it directly
f1 = F()
f2 = f1.__call__
# f2(x,y) is faster than f1(x,y)
Can typically reduce CPU time from 1.3 to 1.0
Conclusion: now we can grab formulas from command-line, GUI, Web, anywhere, and turn them into callable Python functions without any overhead

Reconstruction: a = eval(repr(a))
# StringFunction('1+x+a*y',
#                independent_variables=('x','y'),
#                a=1)
def __repr__(self):
    kwargs = ', '.join(['%s=%s' % (key, repr(value)) \
                        for key, value in self._prms.items()])
    return "StringFunction(%s, independent_variables=%s" \
           ", %s)" % (repr(self._f), repr(self._var), kwargs)
Implement a class for vectors in 3D
Application example:
>>> from Vec3D import Vec3D
>>> u = Vec3D(1, 0, 0) # (1,0,0) vector
>>> v = Vec3D(0, 1, 0)
>>> print u**v # cross product
(0, 0, 1)
>>> len(u) # Euclidean norm
1.0
>>> u[1] # subscripting
0
>>> v[2]=2.5 # subscripting w/assignment
>>> u+v # vector addition
(1, 1, 2.5)
>>> u-v # vector subtraction
(1, -1, -2.5)
>>> u*v # inner (scalar, dot) product
0
>>> str(u) # pretty print
'(1, 0, 0)'
>>> repr(u) # u = eval(repr(u))
'Vec3D(1, 0, 0)'

Make the arithmetic operators +, - and * more intelligent:
u = Vec3D(1, 0, 0)
v = Vec3D(0, -0.2, 8)
a = 1.2
u+v # vector addition
a+v # scalar plus vector, yields (1.2, 1, 9.2)
v+a # vector plus scalar, yields (1.2, 1, 9.2)
a-v # scalar minus vector
v-a # vector minus scalar
a*v # scalar times vector
v*a # vector times scalar
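A compact sketch of how such a class could be put together from the special methods listed earlier, in Python 3 syntax. This is an illustration under assumptions, not the Vec3D module used in the session above: the internal list self.v and the helper _other are made up, and the Euclidean norm is exposed through abs(u), since len() must return an integer in current Python.

```python
import math

class Vec3D:
    def __init__(self, x, y, z):
        self.v = [x, y, z]

    def _other(self, b):
        # treat a scalar b as the vector (b, b, b)
        return b.v if isinstance(b, Vec3D) else [b, b, b]

    def __add__(self, b):                    # v + w and v + a
        return Vec3D(*[p + q for p, q in zip(self.v, self._other(b))])
    __radd__ = __add__                       # a + v

    def __sub__(self, b):                    # v - w and v - a
        return Vec3D(*[p - q for p, q in zip(self.v, self._other(b))])

    def __rsub__(self, b):                   # a - v
        return Vec3D(*[q - p for p, q in zip(self.v, self._other(b))])

    def __mul__(self, b):
        if isinstance(b, Vec3D):             # inner product
            return sum(p*q for p, q in zip(self.v, b.v))
        return Vec3D(*[p*b for p in self.v])  # vector times scalar
    __rmul__ = __mul__                       # a * v

    def __pow__(self, b):                    # u**v is the cross product
        (x1, y1, z1), (x2, y2, z2) = self.v, b.v
        return Vec3D(y1*z2 - z1*y2, z1*x2 - x1*z2, x1*y2 - y1*x2)

    def __abs__(self):                       # Euclidean norm
        return math.sqrt(self*self)

    def __getitem__(self, i):
        return self.v[i]

    def __setitem__(self, i, value):
        self.v[i] = value

    def __str__(self):
        return '(%g, %g, %g)' % tuple(self.v)

    def __repr__(self):
        return 'Vec3D(%r, %r, %r)' % tuple(self.v)

u = Vec3D(1, 0, 0)
v = Vec3D(0, 1, 0)
print(u**v, abs(u), u*v)
print(1.2 + Vec3D(0, -0.2, 8))
```

The reflected methods (__radd__, __rsub__, __rmul__) are what make expressions with the scalar on the left, such as a+v, work.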
Class programming in Python – p. 295 Class programming in Python – p. 296
Integer arrays as indices
More about array computing – p. 297 More about array computing – p. 298
A root function Array type and data type
# Goal: compute roots of a parabola, return real when possible, >>> import numpy
# otherwise complex >>> a = numpy.zeros(5)
def roots(a, b, c): >>> type(a)
# compute roots of a*x^2 + b*x + c = 0 <type ’numpy.ndarray’>
from numpy.lib.scimath import sqrt >>> isinstance(a, ndarray) # is a of type ndarray?
q = sqrt(b**2 - 4*a*c) # q is real or complex True
r1 = (-b + q)/(2*a)
r2 = (-b - q)/(2*a) >>> a.dtype # data (element) type object
return r1, r2 dtype(’float64’)
>>> a.dtype.name
>>> a = 1; b = 2; c = 100 ’float64’
>>> roots(a, b, c) # complex roots >>> a.dtype.char # character code
((-1+9.94987437107j), (-1-9.94987437107j)) ’d’
>>> a.dtype.itemsize # no of bytes per array element
>>> a = 1; b = 4; c = 1 8
>>> roots(a, b, c) # real roots >>> b = zeros(6, float32)
(-0.267949192431, -3.73205080757) >>> a.dtype == b.dtype # do a and b have the same data type?
False
>>> c = zeros(2, float)
>>> a.dtype == c.dtype
True
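The same "real when possible, otherwise complex" behavior of roots can be sketched without NumPy. This variant is an assumption-labeled alternative to the numpy.lib.scimath.sqrt version above: it always computes a complex square root with the standard cmath module and strips the imaginary part when it is exactly zero.

```python
import cmath

def roots(a, b, c):
    """Roots of a*x^2 + b*x + c = 0; floats if real, complex otherwise."""
    q = cmath.sqrt(b**2 - 4*a*c)     # always a complex number
    r1 = (-b + q)/(2*a)
    r2 = (-b - q)/(2*a)
    if q.imag == 0:                  # non-negative discriminant: real roots
        return r1.real, r2.real
    return r1, r2

complex_pair = roots(1, 2, 100)
real_pair = roots(1, 4, 1)
```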
NumPy has an array type, matrix, much like Matlab’s array type For matrix objects, the * operator means matrix-matrix or
>>> x1 = array([1, 2, 3], float) matrix-vector multiplication (not elementwise multiplication)
>>> x2 = matrix(x1) # or just mat(x1) >>> A = eye(3) # identity matrix
>>> x2 # row vector >>> A = mat(A) # turn array to matrix
matrix([[ 1., 2., 3.]]) >>> A
>>> x3 = matrix(x1).transpose() # column vector matrix([[ 1., 0., 0.],
>>> x3 [ 0., 1., 0.],
matrix([[ 1.], [ 0., 0., 1.]])
[ 2.], >>> y2 = x2*A # vector-matrix product
[ 3.]]) >>> y2
>>> type(x3) matrix([[ 1., 2., 3.]])
<class ’numpy.core.defmatrix.matrix’> >>> y3 = A*x3 # matrix-vector product
>>> isinstance(x3, matrix) >>> y3
True matrix([[ 1.],
[ 2.],
[ 3.]])
Only 1- and 2-dimensional arrays can be matrix
Compound expressions generate temporary arrays In-place array arithmetics
Let us evaluate f1(x) for a vector x: Expressions like 3*a-1 generate temporary arrays
def f1(x): With in-place modifications of arrays, we can avoid temporary arrays
return exp(-x*x)*log(1+x*sin(x))
(to some extent)
Calling f1(x) is equivalent to the code b = a
b *= 3 # or multiply(b, 3, b)
temp1 = -x b -= 1 # or subtract(b, 1, b)
temp2 = temp1*x
temp3 = exp(temp2) Note: a is changed, use b = a.copy()
temp4 = sin(x)
temp5 = x*temp4 In-place operations:
temp6 = 1 + temp5 a *= 3.0 # multiply a's elements by 3
temp7 = log(temp6) a -= 1.0 # subtract 1 from each element
result = temp3*temp7 a /= 3.0 # divide each element by 3
a += 1.0 # add 1 to each element
a **= 2.0 # square all elements
Loops over an array run slowly A mathematical function written for scalar arguments can (normally) take
Vectorization = replace explicit loops by function calls such that the array arguments:
whole loop is implemented in C (or Fortran) >>> def f(x):
Explicit loops: ... return x**2 + sinh(x)*exp(-x) + 1
...
r = zeros(x.shape, x.dtype) >>> # scalar argument:
for i in xrange(x.size): >>> x = 2
>>> f(x)
r[i] = sin(x[i])
5.4908421805556333
Vectorized version: >>> # array argument:
r = sin(x) >>> y = array([2, -1, 0, 1.5])
>>> f(y)
array([ 5.49084218, -1.19452805, 1. , 3.72510647])
Arithmetic expressions work for both scalars and arrays
Many fundamental functions work for scalars and arrays
Ex: x**2 + abs(x) works for x scalar or array
Vectorization of functions with if tests; problem Vectorization of functions with if tests; solutions
Consider a function with an if test: Simplest remedy: use NumPy’s vectorize class to allow array
def somefunc(x): arguments to a function:
if x < 0:
return 0 >>> somefuncv = vectorize(somefunc, otypes=’d’)
else: >>> # test:
return sin(x) >>> x = linspace(-1, 1, 3); print x
# or [-1. 0. 1.]
def somefunc(x): return 0 if x < 0 else sin(x) >>> somefuncv(x)
array([ 0. , 0. , 0.84147098])
This function works with a scalar x but not an array Note: The data type must be specified as a character (’d’ for
Problem: x<0 results in a boolean array, not a boolean value that double)
can be used in the if test The speed of somefuncv is unfortunately quite slow
>>> x = linspace(-1, 1, 3); print x
[-1. 0. 1.] A better solution, using where:
>>> y = x < 0 def somefuncv2(x):
>>> y x1 = zeros(x.size, float)
array([ True, False, False], dtype=bool) x2 = sin(x)
>>> bool(y) # turn object into a scalar boolean value return where(x < 0, x1, x2)
...
ValueError: The truth value of an array with more than one
element is ambiguous. Use a.any() or a.all()
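What where(condition, x1, x2) computes can be modeled with plain lists. The where function below is a hypothetical stand-in written for illustration, not NumPy's implementation; it only shows the element-wise selection that replaces the scalar if test.

```python
from math import sin

def where(condition, x1, x2):
    """Plain-list model of element-wise selection: x1[i] if condition[i] else x2[i]."""
    return [u if c else v for c, u, v in zip(condition, x1, x2)]

def somefuncv2(x):
    x1 = [0.0]*len(x)                  # values used where x < 0
    x2 = [sin(xi) for xi in x]         # values used elsewhere
    return where([xi < 0 for xi in x], x1, x2)

result = somefuncv2([-1.0, 0.0, 1.0])
```

Note that both candidate sequences are computed in full before the selection takes place. The same holds for the array version, so an expression that is invalid for some elements (for example a logarithm of a negative number) is still evaluated for them.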
Random numbers Basic linear algebra
Drawing scalar random numbers: NumPy contains the linalg module for
import random solving linear systems
random.seed(2198) # control the seed
computing the determinant of a matrix
u = random.random() # uniform number on [0,1)
u = random.uniform(-1, 1) # uniform number on [-1,1) computing the inverse of a matrix
u = random.gauss(m, s) # number from N(m,s)
computing eigenvalues and eigenvectors of a matrix
Vectorized drawing of random numbers (arrays): solving least-squares problems
from numpy import random computing the singular value decomposition of a matrix
random.seed(12) # set seed
u = random.random(n) # n uniform numbers on (0,1)
computing the Cholesky decomposition of a matrix
u = random.uniform(-1, 1, n) # n uniform numbers on (-1,1)
u = random.normal(m, s, n) # n numbers from N(m,s)
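Fixing the seed makes a random sequence reproducible, which is convenient when debugging. A minimal sketch with the standard random module; draw is a hypothetical helper, and a private random.Random generator is used so that the global generator state is left untouched.

```python
import random

def draw(seed, n):
    """Return n uniform numbers on [-1, 1] from a generator with a fixed seed."""
    rng = random.Random(seed)          # private generator, independent of random.seed
    return [rng.uniform(-1, 1) for _ in range(n)]

u1 = draw(2198, 5)
u2 = draw(2198, 5)    # same seed: same numbers
u3 = draw(12, 5)      # different seed: different numbers
```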
File I/O with arrays; plain ASCII format File I/O with arrays; binary pickling
Plain text output to file (just dump repr(array)): Dump arrays with cPickle:
a = linspace(1, 20, 20); a.shape = (2,10) # a1 and a2 are two arrays
file = open(’tmp.dat’, ’w’) import cPickle
file.write(’Here is an array a:\n’) file = open(’tmp.dat’, ’wb’)
file.write(repr(a)) # dump string representation of a file.write(’This is the array a1:\n’)
file.close() cPickle.dump(a1, file)
file.write(’Here is another array a2:\n’)
Plain text input (just take eval on input line): cPickle.dump(a2, file)
file.close()
file = open(’tmp.dat’, ’r’)
file.readline() # load the first line (a comment) Read in the arrays again (in correct order):
b = eval(file.read())
file.close() file = open(’tmp.dat’, ’rb’)
file.readline() # swallow the initial comment line
b1 = cPickle.load(file)
file.readline() # swallow next comment line
b2 = cPickle.load(file)
file.close()
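The same dump/load pattern can be tried in Python 3, where cPickle has been merged into the pickle module and text written to a binary file must be bytes. This is a minimal sketch under assumptions: nested lists stand in for the arrays a1 and a2, and the file is created in a temporary directory.

```python
import os
import pickle
import tempfile

a1 = [0.0, 0.5, 1.0]            # stand-ins for two arrays
a2 = [[1, 2], [3, 4]]

path = os.path.join(tempfile.mkdtemp(), 'tmp.dat')
with open(path, 'wb') as f:
    f.write(b'This is the array a1:\n')
    pickle.dump(a1, f)
    f.write(b'Here is another array a2:\n')
    pickle.dump(a2, f)

with open(path, 'rb') as f:     # objects must be read in the order they were written
    comment1 = f.readline()     # swallow the initial comment line
    b1 = pickle.load(f)
    comment2 = f.readline()     # swallow next comment line
    b2 = pickle.load(f)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.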
SciPy SymPy: symbolic computing in Python
SciPy is a comprehensive package (by Eric Jones, Travis Oliphant, SymPy is a Python package for symbolic computing
Pearu Peterson) for scientific computing with Python Easy to install, easy to extend
Much overlap with ScientificPython Easy to use:
SciPy interfaces many classical Fortran packages from Netlib >>> from sympy import *
(QUADPACK, ODEPACK, MINPACK, ...) >>> x = Symbol(’x’)
>>> f = cos(acos(x))
Functionality: special functions, linear algebra, numerical integration, >>> f
ODEs, random variables and statistics, optimization, root finding, cos(acos(x))
interpolation, ... >>> sin(x).series(x, 4) # 4 terms of the Taylor series
x - 1/6*x**3 + O(x**4)
May require some installation efforts (applies ATLAS) >>> dcos = diff(cos(2*x), x)
>>> dcos
See www.scipy.org -2*sin(2*x)
>>> dcos.subs(x, pi).evalf() # x=pi, float evaluation
0
>>> I = integrate(log(x), x)
>>> print I
-x + x*log(x)
Migrating slow for loops over NumPy arrays to Fortran, C and C++ Ch. 5, 9 and 10 in the course book
F2PY handling of arrays F2PY manual
Handwritten C and C++ modules SWIG manual
C++ class for wrapping NumPy arrays Examples coming with the SWIG source code
C++ modules using SCXX Electronic Python documentation:
Extending and Embedding..., Python/C API
Pointer communication and SWIG
Efficiency considerations Python in a Nutshell
Python Essential Reference (Beazley)
Fill a NumPy array with function values: Python loops over arrays are extremely slow
n = 2000 NumPy vectorization may be sufficient
a = zeros((n,n))
xcoor = arange(0,1,1/float(n)) However, NumPy vectorization may be inconvenient
ycoor = arange(0,1,1/float(n)) - plain loops in Fortran/C/C++ are much easier
for i in range(n): Write administering code in Python
for j in range(n):
a[i,j] = f(xcoor[i], ycoor[j]) # f(x,y) = sin(x*y) + 8*x Identify bottlenecks (via profiling)
Fortran/C/C++ version: (normalized) time 1.0 Migrate slow Python code to Fortran, C, or C++
NumPy vectorized evaluation of f: time 3.0 Python-Fortran w/NumPy arrays via F2PY: easy
Python loop version (version): time 140 (math.sin) Python-C/C++ w/NumPy arrays via SWIG: not that easy,
handwritten wrapper code is most common
Python loop version (version): time 350 (numarray.sin)
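Bottlenecks such as the double loop above can be located with the standard profiler. The sketch below is an illustration under assumptions: nested lists replace the NumPy array, gridloop is a hypothetical function name, and n is reduced to 200 to keep the run short.

```python
import cProfile
import io
import pstats
from math import sin

def f(x, y):
    return sin(x*y) + 8*x

def gridloop(n):
    """Fill an n x n grid with f values using plain Python loops."""
    h = 1.0/n
    xcoor = [i*h for i in range(n)]
    ycoor = [j*h for j in range(n)]
    a = [[0.0]*n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            a[i][j] = f(xcoor[i], ycoor[j])
    return a

profiler = cProfile.Profile()
a = profiler.runcall(gridloop, 200)     # run gridloop(200) under the profiler

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)   # five most expensive entries
report = stream.getvalue()
```

The text in report lists the number of calls and the time spent per function; here the many calls to f from gridloop dominate, which is the kind of code worth migrating to Fortran, C, or C++.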
F2PY-generated modules treat storage schemes transparently Insert Cf2py comments to tell that a is an output variable:
If input array has C storage, a copy is taken, calculated with, and subroutine gridloop2(a, xcoor, ycoor, nx, ny, func1)
returned as output integer nx, ny
real*8 a(0:nx-1,0:ny-1), xcoor(0:nx-1), ycoor(0:ny-1), func1
F2PY needs to know whether arguments are input, output or both external func1
Cf2py intent(out) a
To monitor (hidden) array copying, turn on the flag Cf2py intent(in) xcoor
Cf2py intent(in) ycoor
f2py ... -DF2PY_REPORT_ON_ARRAY_COPY=1 Cf2py depend(nx,ny) a
In-place operations on NumPy arrays are possible in Fortran, but the
default is to work on a copy, that is why our gridloop1 function
does not work
F2PY generates this Python interface: Output arrays are returned and are not part of the argument list, as
>>> import ext_gridloop seen from Python
>>> print ext_gridloop.gridloop2.__doc__ Need depend(nx,ny) a to specify that a is to be created with
gridloop2 - Function signature: size nx, ny in the wrapper
a = gridloop2(xcoor,ycoor,func1,[nx,ny,func1_extra_args])
Required arguments: Array dimensions are optional arguments (!)
xcoor : input rank-1 array(’d’) with bounds (nx) class Grid2Deff(Grid2D):
ycoor : input rank-1 array(’d’) with bounds (ny) ...
func1 : call-back function def ext_gridloop2(self, f):
Optional arguments: a = ext_gridloop.gridloop2(self.xcoor, self.ycoor, f)
nx := len(xcoor) input int return a
ny := len(ycoor) input int
func1_extra_args := () input tuple The modified interface is well documented in the doc strings
Return objects: generated by F2PY
a : rank-2 array(’d’) with bounds (nx,ny)
What if we really want to send a as argument and let F77 modify it? F2PY generated modules have a function for checking if an array has
def ext_gridloop1(self, f): column major storage (i.e., Fortran storage):
lx = size(self.xcoor); ly = size(self.ycoor) >>> a = zeros((n,n), order=’Fortran’)
a = zeros((lx,ly)) >>> isfortran(a)
ext_gridloop.gridloop1(a, self.xcoor, self.ycoor, f) True
return a >>> a = asarray(a, order=’C’) # back to C storage
>>> isfortran(a)
This is not Pythonic code, but it can be realized False
Fortran function: Only when a has Fortran (column major) storage, the Fortran
subroutine gridloop1(a, xcoor, ycoor, nx, ny, func1) function works on a itself
integer nx, ny
real*8 a(0:nx-1,0:ny-1), xcoor(0:nx-1), ycoor(0:ny-1), func1 If we provide a plain NumPy array, it has C (row major) storage, and
C call this function with an array a that has the wrapper sends a copy to the Fortran function and transparently
C column major storage! transposes the result
Cf2py intent(inout) a
Cf2py intent(in) xcoor Hence, F2PY is very user-friendly, at a cost of some extra memory
Cf2py intent(in) ycoor
Cf2py depend(nx, ny) a The array returned from F2PY has Fortran (column major) storage
Python call:
def ext_gridloop1(self, f):
lx = size(self.xcoor); ly = size(self.ycoor)
a = zeros((lx,ly), order='Fortran') # column major storage
ext_gridloop.gridloop1(a, self.xcoor, self.ycoor, f)
return a
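The difference between C (row major) and Fortran (column major) storage comes down to how an index pair (i,j) is mapped to a position in memory. A minimal sketch with plain lists; c_offset and fortran_offset are hypothetical helper names introduced for this illustration.

```python
def c_offset(i, j, nx, ny):
    """Position of element (i,j) in row major (C) storage."""
    return i*ny + j

def fortran_offset(i, j, nx, ny):
    """Position of element (i,j) in column major (Fortran) storage."""
    return i + j*nx

nx, ny = 2, 3
a = [[10*i + j for j in range(ny)] for i in range(nx)]   # a[i][j] = 10*i + j

c_flat = [None]*(nx*ny)
f_flat = [None]*(nx*ny)
for i in range(nx):
    for j in range(ny):
        c_flat[c_offset(i, j, nx, ny)] = a[i][j]
        f_flat[fortran_offset(i, j, nx, ny)] = a[i][j]
```

The two flat sequences hold the same numbers in different order, which is why a wrapper has to copy and reorder the data whenever the array it receives is not already stored the way the Fortran code expects.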
intent(out) a is the right specification; a should not be an Find problems with this code (comp is a Fortran function in the
argument in the Python call extension module pde):
F2PY wrappers will work on copies, if needed, and hide problems x = arange(0, 1, 0.01)
with different storage scheme in Fortran and C/Python b = myfunc1(x) # compute b array of size (n,n)
u = myfunc2(x) # compute u array of size (n,n)
Python call: c = myfunc3(x) # compute c array of size (n,n)
a = ext_gridloop.gridloop2(self.xcoor, self.ycoor, f) dt = 0.05
for i in range(n)
u = pde.comp(u, b, c, i*dt)
subroutine gridloop_vec(a, xcoor, ycoor, nx, ny, func1) What about this Python callback:
integer nx, ny
real*8 a(0:nx-1,0:ny-1), xcoor(0:nx-1), ycoor(0:ny-1) def myfuncf77(a, xcoor, ycoor, nx, ny):
Cf2py intent(in,out) a """Vectorized function to be called from extension module."""
Cf2py intent(in) xcoor x = xcoor[:,NewAxis]; y = ycoor[NewAxis,:]
Cf2py intent(in) ycoor a = myfunc(x, y)
external func1
a now refers to a new NumPy array; no in-place modification of the
C fill array a with values taken from a Python function, input argument
C do that without loop and point-wise callback, do a
C vectorized callback instead:
call func1(a, xcoor, ycoor, nx, ny)
C could work further with array a here...
return
end
Callbacks are expensive Idea: if callback formula is a string, we could embed it in a Fortran
Even vectorized callback functions degrade performance a bit function and call Fortran instead of Python
Alternative: implement “callback” in F77 F2PY has a module for “inline” Fortran code specification and
building
Flexibility from the Python side: use a string to switch between the
source = """
“callback” (F77) functions real*8 function fcb(x, y)
a = ext_gridloop.gridloop2_str(self.xcoor, self.ycoor, ’myfunc’) real*8 x, y
fcb = %s
F77 wrapper: return
end
subroutine gridloop2_str(xcoor, ycoor, func_str) """ % fstr
character*(*) func_str import f2py2e
... f2py_args = "--fcompiler=’Gnu’ --build-dir tmp2 etc..."
if (func_str .eq. ’myfunc’) then f2py2e.compile(source, modulename=’callback’,
call gridloop2(a, xcoor, ycoor, nx, ny, myfunc) extra_args=f2py_args, verbose=True,
else if (func_str .eq. ’f2’) then source_fn=’sourcecodefile.f’)
call gridloop2(a, xcoor, ycoor, nx, ny, f2) import callback
... <work with the new extension module>
To glue F77 gridloop2 and the F77 callback function, we make a source = """
real*8 function fcb(x, y)
gridloop2 wrapper: ...
subroutine gridloop2_fcb(a, xcoor, ycoor, nx, ny)
subroutine gridloop2_fcb(a, xcoor, ycoor, nx, ny) ...
integer nx, ny """ % fstr
real*8 a(0:nx-1,0:ny-1), xcoor(0:nx-1), ycoor(0:ny-1)
Cf2py intent(out) a f2py_args = "--fcompiler=’Gnu’ --build-dir tmp2"\
Cf2py depend(nx,ny) a " -DF2PY_REPORT_ON_ARRAY_COPY=1 "\
real*8 fcb " ./ext_gridloop.so"
external fcb f2py2e.compile(source, modulename=’callback’,
extra_args=f2py_args, verbose=True,
call gridloop2(a, xcoor, ycoor, nx, ny, fcb) source_fn=’_cb.f’)
return
end import callback
a = callback.gridloop2_fcb(self.xcoor, self.ycoor)
This wrapper and the callback function fcb constitute the F77 source
code, stored in source
The source calls gridloop2 so the module must be linked with the
module containing gridloop2 (ext_gridloop.so)
gridloop2 could be generated on the fly Extracting a pointer to the callback function
def ext_gridloop2_compile(self, fstr): We can implement the callback function in Fortran, grab an
if not isinstance(fstr, str):
<error> F2PY-generated pointer to this function and feed that as the func1
# generate Fortran source for gridloop2: argument such that Fortran calls Fortran and not Python
import f2py2e
source = """ For a module m, the pointer to a function/subroutine f is reached as
subroutine gridloop2(a, xcoor, ycoor, nx, ny) m.f._cpointer
...
do j = 0, ny-1 def ext_gridloop2_fcb_ptr(self):
y = ycoor(j) from callback import fcb
do i = 0, nx-1 a = ext_gridloop.gridloop2(self.xcoor, self.ycoor,
x = xcoor(i) fcb._cpointer)
a(i,j) = %s return a
...
""" % fstr # no callback, the expression is hardcoded fcb is a Fortran implementation of the callback in an
f2py2e.compile(source, modulename=’ext_gridloop2’, ...) F2PY-generated extension module callback
def ext_gridloop2_v2(self):
import ext_gridloop2
return ext_gridloop2.gridloop2(self.xcoor, self.ycoor)
Let us write the gridloop1 and gridloop2 functions in C Use single-pointer arrays
Typical C code: Write C function signature with Fortran 77 syntax
void gridloop1(double** a, double* xcoor, double* ycoor, Use F2PY to generate an interface file
int nx, int ny, Fxy func1)
{ Use F2PY to compile the interface file and the C code
int i, j;
for (i=0; i<nx; i++) {
for (j=0; j<ny; j++) {
a[i][j] = func1(xcoor[i], ycoor[j]);
}
3: Run SWIG needs some non-trivial tweaking to handle NumPy arrays (i.e.,
Unix/DOS> f2py -m ext_gridloop -h ext_gridloop.pyf signatures.f the use of SWIG is much more complicated for array arguments than
running F2PY)
4: Run
We shall write a complete extension module by hand
Unix/DOS> f2py -c --fcompiler=Gnu --build-dir tmp1 \
-DF2PY_REPORT_ON_ARRAY_COPY=1 ext_gridloop.pyf gridloop.c We will need documentation of the Python C API (from Python’s
electronic doc.) and the NumPy C API (from the NumPy book)
See
src/py/mixed/Grid2D/C/f2py
Source code files in
src/py/mixed/Grid2D/C/plain
for all the involved files
Warning: manual writing of extension modules is very much more
complicated than using F2PY on Fortran or C code! You need to
know C quite well...
Wrap an existing memory segment (with array data) in a NumPy Turn any relevant Python sequence type (list, tuple, array) into a
array object: NumPy array:
PyObject * PyArray_FromDimsAndData(int n_dimensions, PyObject * PyArray_ContiguousFromObject(PyObject *object,
int dimensions[n_dimensions], int item_type,
int item_type, int min_dim,
char *data); int max_dim);
/* vec is a double* with 10*21 double entries */ Use min_dim and max_dim as 0 to preserve the original
PyArrayObject *a; int dims[2]; dimensions of object
dims[0] = 10; dims[1] = 21;
a = (PyArrayObject *) PyArray_FromDimsAndData(2, dims, Application: ensure that an object is a NumPy array,
PyArray_DOUBLE, (char *) vec);
/* a_ is a PyObject pointer, representing a sequence
Note: vec is a stream of numbers, now interpreted as a (NumPy array or list or tuple) */
two-dimensional array, stored row by row PyArrayObject *a;
a = (PyArrayObject *) PyArray_ContiguousFromObject(a_,
PyArray_DOUBLE, 0, 0);
a list, tuple or NumPy array a is now a NumPy array
for (i = 0; i < nx; i++) { There is a major problem with our loop:
for (j = 0; j < ny; j++) {
a_ij = (double *)(a->data+i*a->strides[0]+j*a->strides[1]); arglist = Py_BuildValue("(dd)", *x_i, *y_j);
x_i = (double *)(xcoor->data + i*xcoor->strides[0]); result = PyEval_CallObject(func1, arglist);
y_j = (double *)(ycoor->data + j*ycoor->strides[0]); *a_ij = PyFloat_AS_DOUBLE(result);
/* call Python function pointed to by func1: */ For each pass, arglist and result are dynamically allocated,
arglist = Py_BuildValue("(dd)", *x_i, *y_j);
result = PyEval_CallObject(func1, arglist); but not destroyed
*a_ij = PyFloat_AS_DOUBLE(result); From the Python side, memory management is automatic
}
} From the C side, we must do it ourselves
return Py_BuildValue(""); /* return None: */
} Python applies reference counting
Each object has a number of references, one for each usage
The object is destroyed when there are no references
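The reference count can be watched from Python with sys.getrefcount. A small sketch, specific to CPython; the absolute numbers include a temporary reference held by the call itself, so only the differences are meaningful.

```python
import sys

data = [1.0, 2.0, 3.0]              # a fresh object with one name bound to it
base = sys.getrefcount(data)        # count with the single name data

alias = data                        # a second usage: the count goes up by one
with_alias = sys.getrefcount(data)

del alias                           # the usage is dropped: the count goes back down
after_del = sys.getrefcount(data)
```

In C code the corresponding bookkeeping is manual: an object such as arglist or result that is created in each pass of a loop must have its count decreased with Py_DECREF when it is no longer needed, otherwise it is never destroyed.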
We should check that allocations work fine: gridloop2: as gridloop1, but array a is returned
arglist = Py_BuildValue("(dd)", *x_i, *y_j);
if (arglist == NULL) { /* out of memory */ static PyObject *gridloop2(PyObject *self, PyObject *args)
PyErr_Format(PyExc_MemoryError, {
"out of memory for 2-tuple"); PyArrayObject *a, *xcoor, *ycoor;
int a_dims[2];
PyObject *func1, *arglist, *result;
The C code becomes quite comprehensive; much more testing than int nx, ny, i, j;
“active” statements double *a_ij, *x_i, *y_j;
/* arguments: xcoor, ycoor, func1 */
if (!PyArg_ParseTuple(args, "O!O!O:gridloop2",
&PyArray_Type, &xcoor,
&PyArray_Type, &ycoor,
&func1)) {
return NULL; /* PyArg_ParseTuple has raised an exception */
}
nx = xcoor->dimensions[0]; ny = ycoor->dimensions[0];
NumPy array code in C can be simplified using macros Check the length of a specified dimension:
First, a smart macro wrapping an argument in quotes: #define DIMCHECK(a, dim, expected_length) \
if (a->dimensions[dim] != expected_length) { \
#define QUOTE(s) # s /* turn s into string "s" */ PyErr_Format(PyExc_ValueError, \
"%s array has wrong %d-dimension=%d (expected %d)", \
Check the type of the array data: QUOTE(a),dim,a->dimensions[dim],expected_length); \
return NULL; \
#define TYPECHECK(a, tp) \ }
if (a->descr->type_num != tp) { \
PyErr_Format(PyExc_TypeError, \
"%s array is not of correct type (%d)", QUOTE(a), tp); \
return NULL; \
}
Check the dimensions of a NumPy array: Macros can greatly simplify indexing:
#define NDIMCHECK(a, expected_ndim) \ #define IND1(a, i) *((double *)(a->data + i*a->strides[0]))
if (a->nd != expected_ndim) { \ #define IND2(a, i, j) \
PyErr_Format(PyExc_ValueError, \ *((double *)(a->data + i*a->strides[0] + j*a->strides[1]))
"%s array is %d-dimensional, expected to be %d-dimensional",\
QUOTE(a), a->nd, expected_ndim); \ Application:
return NULL; \
} for (i = 0; i < nx; i++) {
for (j = 0; j < ny; j++) {
Application: arglist = Py_BuildValue("(dd)", IND1(xcoor,i), IND1(ycoor,j));
result = PyEval_CallObject(func1, arglist);
NDIMCHECK(xcoor, 1); TYPECHECK(xcoor, PyArray_DOUBLE); Py_DECREF(arglist);
if (result == NULL) return NULL; /* exception in func1 */
If xcoor is 2-dimensional, an exception is raised by NDIMCHECK: IND2(a,i,j) = PyFloat_AS_DOUBLE(result);
exceptions.ValueError Py_DECREF(result);
xcoor array is 2-dimensional, expected to be 1-dimensional }
}
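The address arithmetic behind the IND2 macro (data pointer plus i times the first stride plus j times the second stride, all in bytes) can be mimicked in Python on a raw byte buffer. This is a sketch under assumptions: ind2 is a hypothetical helper, and the buffer holds a 2 x 3 array of doubles stored row by row.

```python
import struct
from array import array

nx, ny = 2, 3
values = array('d', [0.0, 1.0, 2.0, 10.0, 11.0, 12.0])   # element (i,j) = 10*i + j
data = values.tobytes()              # raw bytes, playing the role of a->data
itemsize = values.itemsize           # number of bytes per double
strides = (ny*itemsize, itemsize)    # bytes to step to the next row / next column

def ind2(data, strides, i, j):
    """Read element (i,j) from the byte offset i*strides[0] + j*strides[1]."""
    offset = i*strides[0] + j*strides[1]
    return struct.unpack_from('d', data, offset)[0]

corner = ind2(data, strides, 1, 2)
```

Because the strides are expressed in bytes, the same formula covers other memory layouts as well: swapping the two stride values addresses a column major buffer.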
Create return array: The method table must always be present - it lists the functions that
a_dims[0] = nx; a_dims[1] = ny; should be callable from Python:
a = (PyArrayObject *) PyArray_FromDims(2, a_dims, static PyMethodDef ext_gridloop_methods[] = {
PyArray_DOUBLE); {"gridloop1", /* name of func when called from Python */
if (a == NULL) { gridloop1, /* corresponding C function */
printf("creating a failed, dims=(%d,%d)\n", METH_VARARGS, /* ordinary (not keyword) arguments */
a_dims[0],a_dims[1]); gridloop1_doc}, /* doc string for gridloop1 function */
return NULL; /* PyArray_FromDims raises an exception */ {"gridloop2", /* name of func when called from Python */
} gridloop2, /* corresponding C function */
METH_VARARGS, /* ordinary (not keyword) arguments */
After the loop, return a: gridloop2_doc}, /* doc string for gridloop2 function */
return PyArray_Return(a); {NULL, NULL}
};
Usage:
python setup.py build_ext
python setup.py install --install-platlib=.
# test module:
python -c ’import ext_gridloop; print ext_gridloop.__doc__’
The usage is the same as in Fortran, when viewed from Python Things usually go wrong when you program...
No problems with storage formats and unintended copying of a in Errors in C normally show up as “segmentation faults” or “bus error”
gridloop1, or optional arguments; here we have full control of all - no nice exception with traceback
details Simple trick: run python under a debugger
gridloop2 is the “right” way to do it unix> gdb ‘which python‘
It is much simpler to use Fortran and F2PY (gdb) run test.py
When the script crashes, issue the gdb command where for a
traceback (if the extension module is compiled with -g you can see
the line number of the line that triggered the error)
You can only see the traceback, no breakpoints, prints etc., but a tool,
PyDebug, allows you to do this
In src/py/mixed/Grid2D/C/plain/debugdemo there are some C files Check that the extension module was compiled with debug mode on
with errors (usually the -g option to the C compiler)
Try Run python under a debugger:
./make_module_1.sh gridloop1 unix> gdb ‘which python‘
GNU gdb 6.0-debian
This script runs ...
(gdb) run ../../../Grid2Deff.py verify1
../../../Grid2Deff.py verify1 Starting program: /usr/bin/python ../../../Grid2Deff.py verify1
...
which leads to a segmentation fault, implying that something is wrong Program received signal SIGSEGV, Segmentation fault.
in the C code (errors in the Python script show up as exceptions 0x40cdfab3 in gridloop1 (self=0x0, args=0x1) at gridloop1.c:20
with traceback) 20 if (!PyArg_ParseTuple(args, "O!O!O!O:gridloop1",
This is the line where something goes wrong...
Try Try
./make_module_1.sh gridloop2 ./make_module_1.sh gridloop3
and experience that Most of the program seems to work, but a segmentation fault occurs
python -c ’import ext_gridloop; print dir(ext_gridloop); \ (according to gdb):
print ext_gridloop.__doc__’
(gdb) where
ends with an exception (gdb) #0 0x40115d1e in mallopt () from /lib/libc.so.6
#1 0x40114d33 in malloc () from /lib/libc.so.6
Traceback (most recent call last): #2 0x40449fb9 in PyArray_FromDimsAndDataAndDescr ()
File "<string>", line 1, in ? from /usr/lib/python2.3/site-packages/Numeric/_numpy.so
SystemError: dynamic module not initialized properly ...
#42 0x080d90db in PyRun_FileExFlags ()
This signifies that the module misses initialization #43 0x080d9d1f in PyRun_String ()
#44 0x08100c20 in _IO_stdin_used ()
Reason: no Py_InitModule3 call #45 0x401ee79c in ?? ()
#46 0x41096bdc in ?? ()
Hmmm...no sign of where in gridloop3.c the error occurs,
except that the Grid2Deff.py script successfully calls both
gridloop1 and gridloop2, it fails when printing the
returned array
Numerical mixed-language programming – p. 393 Numerical mixed-language programming – p. 394
Try Try
./make_module_1.sh gridloop4 ./make_module_1.sh gridloop5
and experience and experience
python -c 'import ext_gridloop; print dir(ext_gridloop); \ python -c 'import ext_gridloop; print dir(ext_gridloop); \
print ext_gridloop.__doc__' print ext_gridloop.__doc__'
Traceback (most recent call last): Traceback (most recent call last):
File "<string>", line 1, in ? File "<string>", line 1, in ?
ImportError: dynamic module does not define init function (initext_gridloo ImportError: ./ext_gridloop.so: undefined symbol: mydebug
Eventually we got a precise error message (the gridloop2 in gridloop5.c calls a function mydebug, but the
initext_gridloop was not implemented) function is not implemented (or linked)
Again, a precise ImportError helps detecting the problem
Check that import_array() is called if the NumPy C API is in Implement the computational loop in a traditional C function
use! Aim: pretend that we have this loop already in a C library
ImportError suggests wrong module initialization or missing Need to write a wrapper between this C function and Python
required/user functions
Could think of SWIG for generating the wrapper, but SWIG with
You need experience to track down errors in the C code NumPy arrays is a bit tricky - it is in fact simpler to write the wrapper
An error in one place often shows up as an error in another place by hand
(especially indexing out of bounds or wrong memory handling)
Use a debugger (gdb) and print statements in the C code and the
calling script
C++ modules are (almost) as error-prone as C modules
C functions taking a two-dimensional array as argument will normally How can we write wrapper code that sends NumPy array data to a C
represent the array as a double pointer: function as a double pointer?
void gridloop1_C(double **a, double *xcoor, double *ycoor, How can we make callbacks to Python when the C function expects
int nx, int ny, Fxy func1) callbacks to standard C functions, represented as function pointers?
{
int i, j; We need to cope with these problems to interface (numerical) C
for (i=0; i<nx; i++) { libraries!
for (j=0; j<ny; j++) {
a[i][j] = func1(xcoor[i], ycoor[j]);
} src/py/mixed/Grid2D/C/clibcall
}
}
From NumPy array to double pointer Callback via a function pointer (1)
The alternative gridloop1 code (2) gridloop1 with C++ array object
_pyfunc_ptr = func1; /* store func1 for use in _pycall */ Programming with NumPy arrays in C is much less convenient than
/* allocate help array for creating a double pointer: */ programming with C++ array objects
app = (double **) malloc(nx*sizeof(double*)); SomeArrayClass a(10, 21);
ap = (double *) a->data; a(1,2) = 3; // indexing
for (i = 0; i < nx; i++) { app[i] = &(ap[i*ny]); }
xp = (double *) xcoor->data;
yp = (double *) ycoor->data; Idea: wrap NumPy arrays in a C++ class
gridloop1_C(app, xp, yp, nx, ny, _pycall); Goal: use this class wrapper to simplify the gridloop1 wrapper
free(app);
return Py_BuildValue(""); /* return None */
} src/py/mixed/Grid2D/C++/plain
Thin C++ layer on top of the Python C API #include <PWONumber.h> // class for numbers
#include <PWOSequence.h> // class for tuples
Each Python type (number, tuple, list, ...) is represented as a C++ #include <PWOMSequence.h> // class for lists (mutable sequences)
class void test_scxx()
The resulting code is quite close to Python {
double a_ = 3.4;
SCXX objects perform reference counting automatically PWONumber a = a_; PWONumber b = 7;
PWONumber c; c = a + b;
PWOList list; list.append(a).append(c).append(b);
PWOTuple tp(list);
for (int i=0; i<tp.len(); i++) {
std::cout << "tp["<<i<<"]="<<double(PWONumber(tp[i]))<<" ";
}
std::cout << std::endl;
PyObject* py_a = (PyObject*) a; // convert to Python C struct
}
Weave is an easy-to-use tool for inlining C++ snippets in Python The loops: inline C++ with Blitz++ array syntax:
codes code = r"""
int i,j;
A quick demo shows its potential for (i=0; i<nx; i++) {
class Grid2Deff: for (j=0; j<ny; j++) {
... a(i,j) = cppcb(xcoor(i), ycoor(j));
def ext_gridloop1_weave(self, fstr): }
"""Migrate loop to C++ with aid of Weave.""" }
"""
from scipy import weave
# the callback function is now coded in C++
# (fstr must be valid C++ code):
extra_code = r"""
double cppcb(double x, double y) {
return %s;
}
""" % fstr
Compile and link the extra code extra_code and the main code When interfacing many libraries, data must be grabbed from one
(loop) code: code and fed into another
nx = size(self.xcoor); ny = size(self.ycoor) Example: NumPy array to/from some C++ data class
a = zeros((nx,ny))
xcoor = self.xcoor; ycoor = self.ycoor Idea: make filters, converting one data to another
err = weave.inline(code, [’a’, ’nx’, ’ny’, ’xcoor’, ’ycoor’],
type_converters=weave.converters.blitz, Data objects are represented by pointers
support_code=extra_code, compiler=’gcc’) SWIG can send pointers back and forth without needing to wrap the
return a
whole underlying data object
Note that we pass the names of the Python objects we want to Let’s illustrate with an example!
access in the C++ code
Weave is smart enough to avoid recompiling the code if it has not
changed since last compilation
Calling C++ from Python (1) Calling C++ from Python (2)
Instead of just calling In case we work with copied data, we must copy both ways:
ext_gridloop.gridloop1(a, self.xcoor, self.ycoor, func) a_p = self.c.py2my_copy(a)
return a x_p = self.c.py2my_copy(self.xcoor)
y_p = self.c.py2my_copy(self.ycoor)
as before, we need some explicit conversions: f_p = self.c.set_pyfunc(func)
# a is a NumPy array ext_gridloop.gridloop1(a_p, x_p, y_p, f_p)
# self.c is the conversion module (class Convert_MyArray) a = self.c.my2py_copy(a_p)
a_p = self.c.py2my(a) return a
x_p = self.c.py2my(self.xcoor)
y_p = self.c.py2my(self.ycoor) Note: final a is not the same a object as we started with
f_p = self.c.set_pyfunc(func)
ext_gridloop.gridloop1(a_p, x_p, y_p, f_p)
return a # a_p and a share data!
swig -python -c++ -I. ext_gridloop.i
root=`python -c 'import sys; print sys.prefix'`
ver=`python -c 'import sys; print sys.version[:3]'`
g++ -I. -O3 -g -I$root/include/python$ver \
    -c convert.cpp gridloop.cpp ext_gridloop_wrap.cxx
g++ -shared -o _ext_gridloop.so \
    convert.o gridloop.o ext_gridloop_wrap.o

We have implemented several versions of gridloop1 and gridloop2:
Fortran subroutines, working on Fortran arrays, automatically wrapped by F2PY
Hand-written C extension module, working directly on NumPy array structs in C
Hand-written C wrapper to a C function, working on standard C arrays (incl. double pointer)
Hand-written C++ wrapper, working on a C++ class wrapper for NumPy arrays
As last point, but simplified wrapper utilizing SCXX
C++ functions based on MyArray, plus C++ filter for pointer conversion, wrapped by SWIG
What is the most convenient approach in this case?
Fortran!
If we cannot use Fortran, which solution is attractive?
C++, with classes allowing higher-level programming
To interface a large existing library, the filter idea and exchanging pointers is attractive (no need to SWIG the whole library)
When using the Python C API extensively, SCXX simplifies life

Which alternative is computationally most efficient?
Fortran, but C/C++ is quite close – no significant difference between all the C/C++ versions
Too bad: the (point-wise) callback to Python destroys the efficiency of the extension module!
Pure Python script w/NumPy is much more efficient...
Nevertheless: this is a pedagogical case teaching you how to migrate/interface numerical code
language  function           func1 argument                CPU time
F77       gridloop1          F77 function with formula     1.0
C++       gridloop1          C++ function with formula     1.07
Python    Grid2D.__call__    vectorized numpy myfunc       1.5
Python    Grid2D.gridloop    myfunc w/math.sin             120
Python    Grid2D.gridloop    myfunc w/numpy.sin            220
F77       gridloop1          myfunc w/math.sin             40
F77       gridloop1          myfunc w/numpy.sin            180
F77       gridloop2          myfunc w/math.sin             40
F77       gridloop_vec2      vectorized myfunc             2.7
F77       gridloop2_str      F77 myfunc                    1.1
F77       gridloop_noalloc   (no alloc. as in pure C++)    1.0
C         gridloop1          myfunc w/math.sin             38
C         gridloop2          myfunc w/math.sin             38
C++ (with class NumPyArray) had the same numbers as C

math.sin is much faster than numpy.sin for scalar expressions
Callbacks to Python are extremely expensive
Python+NumPy is 1.5 times slower than pure Fortran
C and C++ run equally fast
C++ w/MyArray was only 7% slower than pure F77
Minimize the number of callbacks to Python!
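The cost of point-wise callbacks can be felt even in pure Python. The sketch below is only an analogy and does not reproduce the numbers in the table: it compares a grid loop that calls a Python function at every point with a loop where the formula is written inline. The formula in myfunc and the grid size are assumptions made for the demonstration, and the timings depend on the machine.

```python
import math
import timeit

def myfunc(x, y):
    # example formula; an assumption, not taken from the table
    return math.sin(x*y) + 8*x

def gridloop_callback(xcoor, ycoor, f):
    """Fill the grid with one Python-level call of f per point."""
    return [[f(x, y) for y in ycoor] for x in xcoor]

def gridloop_inline(xcoor, ycoor):
    """Same values, with the formula written directly in the loop."""
    sin = math.sin
    return [[sin(x*y) + 8*x for y in ycoor] for x in xcoor]

n = 200
xcoor = [i/(n - 1) for i in range(n)]
ycoor = list(xcoor)

t_cb = timeit.timeit(lambda: gridloop_callback(xcoor, ycoor, myfunc), number=5)
t_in = timeit.timeit(lambda: gridloop_inline(xcoor, ycoor), number=5)
print("callback per point: %.3f s, inlined formula: %.3f s" % (t_cb, t_in))
```

Both functions compute the same grid values; only the number of function calls differs.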
Hide work arrays (i.e., allocate in wrapper):
      subroutine myroutine(a, b, m, n, w1, w2)
      integer m, n
      real*8 a(m), b(n), w1(3*n), w2(m)
Cf2py intent(in,hide) w1
Cf2py intent(in,hide) w2
Cf2py intent(in,out) a
Python interface:
    a = myroutine(a, b)

Pyfort for Python-Fortran integration (does not handle F90/F95, not as simple as F2PY)
SIP: tool for wrapping C++ libraries
Boost.Python: tool for wrapping C++ libraries
CXX: C++ interface to Python (Boost is a replacement)
Note: SWIG can generate interfaces to most scripting languages (Perl, Ruby, Tcl, Java, Guile, Mzscheme, ...)
Quick Python review

Python info
doc.html is the resource portal for the course; load it into a web browser from
    http://www.ifi.uio.no/~inf3330/scripting/doc.html
and make a bookmark
doc.html has links to the electronic Python documentation, F2PY, SWIG, Numeric/numarray, and lots of things used in the course
The course book “Python scripting for computational science” (the PDF version is fine for searching)
Python in a Nutshell (by Martelli)
Programming Python 2nd ed. (by Lutz)
Python Essential Reference (Beazley)
Quick Python Book
Strings can be written with different types of quotes
    s = 'single quotes'
    s = "double quotes"
    s = """triple quotes are
    used for multi-line
    strings
    """
    s = r'raw strings start with r and backslash \ is preserved'
    s = '\t\n' # tab + newline
    s = r'\t\n' # a string with four characters: \t\n
Some useful operations:
    if sys.platform.startswith('win'): # Windows machine?
        ...
    file = infile[:-3] + '.gif' # string slice of infile
    answer = answer.lower() # lower case
    answer = answer.replace(' ', '_')
    words = line.split()

Efficient arrays for numerical computing
    from Numeric import * # classical, widely used module
    from numarray import * # alternative version

    a = array([[1, 4], [2, 1]], Float) # 2x2 array from list
    a = zeros((n,n), Float) # nxn array with 0
Indexing and slicing:
    for i in xrange(a.shape[0]):
        for j in xrange(a.shape[1]):
            a[i,j] = ...
    b = a[0,:] # reference to 1st row
    b = a[:,1] # reference to 2nd column
Avoid loops and indexing, use operations that compute with whole arrays at once (in efficient C code)
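Returning to the string examples: the difference between ordinary and raw strings, and the slice used to change a file extension, can be checked directly. A small sketch; the filename is hypothetical:

```python
s1 = '\t\n'     # escape sequences are interpreted: tab + newline
s2 = r'\t\n'    # raw string: the backslashes are kept as characters

print(len(s1), len(s2))

infile = 'plot.ps'                # hypothetical filename
outfile = infile[:-3] + '.gif'    # drop the last three characters ('.ps')
print(outfile)
```

Note that the slice [:-3] only fits three-character endings such as '.ps'; os.path.splitext is a more general way to split off an extension.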
Mutable types allow in-place modifications
    >>> a = [1, 9, 3.2, 0]
    >>> a[2] = 0
    >>> a
    [1, 9, 0, 0]
Types: list, dictionary, NumPy arrays, class instances
Immutable types do not allow in-place modifications
    >>> s = 'some string containing x'
    >>> s[-1] = 'y' # try to change last character - illegal!
    TypeError: object doesn't support item assignment
    >>> a = 5
    >>> b = a # b is a reference to a (integer 5)
    >>> a = 9 # a becomes a new reference
    >>> b # b still refers to the integer 5
    5
Types: numbers, strings

Run arbitrary operating system command:
    cmd = 'myprog -f -g 1.0 < input'
    failure, output = commands.getstatusoutput(cmd)
Use commands.getstatusoutput for running applications
Use Python (cross platform) functions for listing files, creating directories, traversing file trees, etc.
    psfiles = glob.glob('*.ps') + glob.glob('*.eps')
    allfiles = os.listdir(os.curdir)
    os.mkdir('tmp1'); os.chdir('tmp1')
    print os.getcwd() # current working dir.

    def size(arg, dir, files):
        for file in files:
            fullpath = os.path.join(dir,file)
            s = os.path.getsize(fullpath)
            arg.append((fullpath, s)) # save name and size
    name_and_size = []
    os.path.walk(os.curdir, size, name_and_size)
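The commands module and os.path.walk exist only in Python 2. A sketch of the same two tasks in Python 3, using subprocess.getstatusoutput and os.walk as replacements; the helper name collect_sizes, the echo command and the temporary file are made up for the demonstration:

```python
import os
import subprocess
import tempfile

def collect_sizes(root):
    """Return a list of (fullpath, size) for all files below root."""
    name_and_size = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            fullpath = os.path.join(dirpath, name)
            name_and_size.append((fullpath, os.path.getsize(fullpath)))
    return name_and_size

# Run an operating system command; failure is the exit status (0 = success)
failure, output = subprocess.getstatusoutput('echo hello')
if failure:
    print('command failed:', output)
else:
    print(output)

# Traverse a small temporary directory tree instead of the current dir.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, 'data.txt'), 'w') as f:
        f.write('abc')
    sizes = collect_sizes(tmp)
print(sizes)
```

os.walk returns the tree as a sequence of (directory, subdirectories, files) tuples, so no callback function with a user-supplied argument is needed.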
Files
Find a string in a series of files:
    grep.py 'Python' *.txt *.tmp
Python code:
    import re

    def grep_file(string, filename):
        res = {} # result: dict with key=line no. and value=line
        f = open(filename, 'r')
        line_no = 1
        for line in f:
            #if line.find(string) != -1:
            if re.search(string, line):
                res[line_no] = line
            line_no += 1
        return res

Functions
Let us put the previous function in a file grep.py
This file defines a module grep that we can import
Main program:
    import sys, re, glob, grep
    grep_res = {}
    string = sys.argv[1]
    for filespec in sys.argv[2:]:
        for filename in glob.glob(filespec):
            grep_res[filename] = grep.grep_file(string, filename)

    # report:
    for filename in grep_res:
        for line_no in grep_res[filename]:
            print '%-20s.%5d: %s' % (filename, line_no,
                                     grep_res[filename][line_no])
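The grep function can be tried in isolation. The following is a self-contained Python 3 variant, not the course's grep.py itself; the demo file and its contents are made up:

```python
import os
import re
import tempfile

def grep_file(string, filename):
    """Return a dict mapping line number to line for lines matching string."""
    res = {}
    with open(filename, 'r') as f:
        for line_no, line in enumerate(f, start=1):
            if re.search(string, line):
                res[line_no] = line
    return res

# Try it on a small temporary file (contents are made up for the demo)
with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, 'demo.txt')
    with open(filename, 'w') as f:
        f.write('Perl is fine\nPython is fun\nI like Python\n')
    matches = grep_file('Python', filename)

for line_no in sorted(matches):
    print('%5d: %s' % (line_no, matches[line_no]), end='')
```

Since re.search treats the first argument as a regular expression, characters such as . or * in the search string act as patterns; the commented-out line.find alternative in the slide matches the text literally.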
Just write python in a terminal window to get an interactive Python shell:
    >>> 1269*1.24
    1573.5599999999999
    >>> import os; os.getcwd()
    '/home/hpl/work/scripting/trunk/lectures'
    >>> len(os.listdir('modules'))
    60
We recommend using IPython as the interactive shell
    Unix/DOS> ipython
    In [1]: 1+1
    Out[1]: 2

Scripts can be run from IPython:
    In [1]:run scriptfile arg1 arg2 ...
e.g.,
    In [1]:run datatrans2.py .datatrans_infile tmp1
IPython is integrated with Python's pdb debugger
pdb can be automatically invoked when an exception occurs:
    In [29]:%pdb on # invoke pdb automatically
    In [30]:run datatrans2.py infile tmp2