MATLAB Notes Kevin Sheppard
Contents

1 Introduction to MATLAB
    1.1 The Interface
    1.2 The Editor
    1.3 Help
    1.4 Demos
    1.5 Exercises
2 Basic Input
    2.1 Variable Names
    2.2 Entering Vectors
    2.3 Entering Matrices
    2.4 Higher Dimension Arrays
    2.5 Empty Matrices ([])
    2.6 Concatenation
    2.7 Accessing Elements of Matrices
    2.8 Calling Functions
    2.9 Exercises
3
    3.1 Importing Data
    3.2 Robust Data Importing
    3.3 Reading Excel Files
    3.4 CSV Data
    3.5 Text
    3.6 MATLAB Data Files (.mat)
    3.7 Manually Reading Poorly Formatted Text
    3.8 Reading Poorly Formatted Text Using textscan
    3.9 Stat Transfer
    3.10 Exporting Data
    3.11 Exercises
4 Basic Math
    Matrix Multiplication (*)
    Matrix Division (/)
    Matrix Right Divide (\)
    Matrix Exponentiation (^)
    Parentheses
    . (dot) operator
    Transpose
5
    5.1 Exercises
6 Special Matrices
    6.1 Exercises
7 Matrix Functions
8
    8.1 Exercises
10 Flow Control
11
    11.1 for
12 Graphics
    12.1 Support Functions
    12.2 plot
    12.3 plot3
    12.4 scatter
    12.5 surf
    12.6 mesh
    12.7 contour
    12.8 subplot
    12.9 Advanced Graphics
    12.10 Exercises
13 Exporting Plots
16 Optimization
18
    18.1 Exercises
19
    19.1 Structures
        19.1.1 The Problem with Structures
    19.2 Cell Arrays
    19.3 Accessing Elements of Cell Arrays
    19.4 Considerations when Using Cell Arrays
20 File System and Navigation
    Addressing the File System Programmatically
    The MATLAB Path
    Using a Custom Path in a Shared Environment
    Exercises
21
    Pre-allocate Data Arrays
    Avoid Operations that Require Allocating New Memory
    Use Vector and Matrix Operations
    Use Pre-computed Values in Optimization Targets
    Use M-Lint
    Profile Code to Find Hot-Spots
22
    22.1 Core Random Number Generators
    22.2 Replicating Simulation Data
    22.3 Considerations when Running Simulations on Multiple Computers
23 Quick Function Reference
    23.1 General Math
    23.2 Rounding
    23.3 Statistics
    23.4 Random Numbers
    23.5 Logical
    23.6 Special Values
    23.7 Special Matrices
    23.8 Vector and Matrix Functions
    23.9 Matrix Manipulation
    23.10 Set Functions
    23.11 Flow Control
    Looping
    Optimization
    Graphics
    Date Functions
    String Function
    Trigonometric Functions
    File System
    MATLAB Specific
    Input/Output
Chapter 1
Introduction to MATLAB
These notes provide a brief introduction to MATLAB. All topics relevant to the MFE curriculum should be covered at some basic level, but if some important topic is missing or under-explained, please let me know and I'll add examples as necessary.

This set of notes follows a few conventions. Typewriter font is used to denote MATLAB commands and code snippets. MATLAB keywords such as if, for and break are highlighted in blue, and existing MATLAB functions such as sum, abs and plot are highlighted in cyan. In general, neither keywords nor function names should be used as variable names, although only keywords are formally excluded from being redefined. Strings are highlighted in purple, and comments are in green. The double arrow symbol >> is used to indicate the MATLAB command prompt and is the symbol used in the MATLAB command window. Math font is used to denote algebraic expressions.

MATLAB is available on the Economics department servers, mrb-studentlab.manor-road.ox.ac.uk, using Microsoft's Remote Desktop Client (RDC). For help using the RDC, consult the information that accompanied your username, or consult the IT help desk.

For more information on programming in MATLAB, see MATLAB: An Introduction with Applications by Amos Gilat (ISBN: 0470108770) or Mastering MATLAB 7 by Bruce L. Littlefield and Duane C. Hanselman (ISBN: 0131857142). The first book provides more examples for beginners while the second is comprehensive, ranging from basic concepts to advanced applications, and was the first book I used back when the title was Mastering MATLAB 5.
1.1 The Interface
Figure 1.1 contains an image of the main MATLAB window. There are four sub-windows visible. The command window, labeled 1, is where commands are entered, functions are called and m-files (batches of MATLAB commands) are run. The current directory window, labeled 2, shows the files located in the current directory. Normally these will include m-files and data. On the right side of the command window is the workspace, labeled 3, which lists the variables in memory, such as data loaded or variables declared. The final window, labeled 4, contains the command history (a display of commands recently executed). The history can be copied and pasted into the command window to re-run commands. The history can also be scrolled through in the command window by pressing the up arrow (↑) key.
Figure 1.1: Basic MATLAB Window. The standard setup has four panes. 1: The command window, 2: Current Directory, 3: Workspace, and 4: Command History.
1.2 The Editor
MATLAB contains a syntax-aware editor that highlights code to improve readability and provides limited error checking. The editor can be launched from the main window in one of two ways, either by clicking File>New>M-File or by entering edit into the command window directly. Figure 1.2 contains an example of the editor and syntax highlighting. M-files can contain batches of commands or complete functions. M-file names can include letters, numbers and underscores, although they must begin with a letter. Names should be distinct from reserved words (if, else, for, end, while, ...) and existing function names (mean, std, var, cov, sum, ...). To verify whether a name is already in use, the command which filename can be used to list the file which would be executed if filename was entered in the command window.

>> which for
built-in (C:\Program Files\MATLAB\R2010a\toolbox\matlab\lang\for)
>> which mean
C:\Program Files\MATLAB\R2010a\toolbox\matlab\datafun\mean.m

Figure 1.2: MATLAB editor. The editor is a useful programming tool. It can be used to create batch files or custom functions (both called m-files). Note the syntax highlighting.

To check whether a file that has already been created duplicates the name of another function, use the command which filename -all to produce a list of all matching files.

>> which mean -all
C:\Program Files\MATLAB\R2010a\toolbox\matlab\datafun\mean.m
C:\Program Files\MATLAB\R2010a\toolbox\matlab\timeseries\@timeseries\mean.m % timeseries method
C:\Program Files\MATLAB\R2010a\toolbox\finance\ftseries\@fints\mean.m % fints method
The semicolon (;) is used at the end of a line to suppress the display of the result of a command. Lines are still processed, but nothing is displayed in the command window. To understand the effect of a ;, examine the result of these two commands,

>> x=ones(3,1);
>> x=ones(3,1)
x =
     1
     1
     1
It is generally a good idea to suppress the output of commands, although in certain cases, such as debugging or examining the output of a particular command, it may be useful to leave the semicolon off until the code is functioning as expected.
Comments
Writing clear comments is an essential practice when coding. Comments assist in tracking completed tasks, documenting unique approaches to solving a difficult problem and are useful if the code needs to be shared. The percentage symbol (%) is used to identify a comment. When a % is encountered, processing stops on the current line and continues on the next line. The usual way to write a comment block is to place a % in front of each line; MATLAB 7 and later also accept block comments delimited by %{ and %}, each on a line of its own.

% This is the start of a
% comment block.
% Every line here begins with a %
% symbol before the comment
... (dot-dot-dot)
... is a special expression that can be used to break a long code expression across multiple lines in an m-file. ... concatenates the next line onto the end of the present line when processing. It exists purely to improve the readability of code. These two expressions are identical to the MATLAB interpreter.

x = 7;
x = x + x * x - x + exp(x) / log(x) * sqrt(2*pi);

x = 7;
x = x + x * x - x ...
    + exp(x) / log(x) * sqrt(2*pi);
1.3 Help
MATLAB contains a comprehensive help system which is available both in the command window and in a separate browser. The browser-based help is typically more complete and is both indexed and searchable. Two types of help are available from the command line: toolbox and function. Toolbox help returns a list of available functions in a toolbox. It can be called by help toolbox, where toolbox is the short name of the toolbox (e.g. stats, optim, etc.). help, without a second argument, will produce a list of toolboxes. Function-specific help can be accessed by calling help function, for example help mean.
The help browser can be accessed by hitting the F1 key, selecting Help>Full Product Family Help at the top of the command window, or entering doc in the command window. The documentation of a function can be directly accessed by entering doc function in the command window (e.g. doc mean).
1.4 Demos
MATLAB contains an extensive selection of demos. To access the list of available demos, enter demo in the command window.
1.5 Exercises
1. Become familiar with the MATLAB Command Window.
2. Launch the help browser and read the section MATLAB, Getting Started, Introduction.
3. Launch the editor and explore its interface.
4. Enter demo in the command window and play with some of the demos. The demos in the Graphics section are particularly entertaining.
Chapter 2
Basic Input
Users are not required to manage memory, and variables can be input with no setup. The generic form of an expression is

Variable Name = Expression

and expressions are processed by assigning the value on the right to the variable on the left. For instance,

x = 1;
x = y;
x = exp(y);

are all valid assignments for x. The first assigns 1 to x, the second assigns the value of another variable, y, to x, and the third assigns the output of exp(y) to x. Assigning one variable to another assigns the value of that variable, not the variable itself. Thus, after y = 1 and x = y, later changes to y will not be reflected in the value of x.

>> y = 1;
>> x = y;
>> x
x =
     1
>> y = 2;
>> x
x =
     1
>> y
y =
     2
2.1 Variable Names
Variable names can take many forms, although they can only contain numbers, letters (both upper and lower case), and underscores (_). They must begin with a letter and are CaSe SeNsItIve. For example,

x
X
X1

are all legal and distinct variable names, while names that break these rules, such as

1X
X-1
_x

are not.
2.2 Entering Vectors
Almost all of the data used in MATLAB are matrices by construction, even if they are 1 by 1 (scalar), K by 1 or 1 by K (vectors). Vectors, both row (1 by K) and column (K by 1), can be entered directly into the command window. The mathematical notation x = [1 2 3 4 5] is entered as

>> x=[1 2 3 4 5];

In the above input, [ and ] are reserved symbols which are interpreted as begin array and end array, respectively. The column vector,

\[ x = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \\ 5 \end{bmatrix} \]

is entered using a less intuitive structure

>> x=[1; 2; 3; 4; 5];

where ; is interpreted as new row when used inside square brackets ([ ]).
Note: An important exception to the "everything is a matrix" rule occurs in cell arrays, which are matrices composed of other matrices (formally arrays of arrays or ragged (jagged) arrays). See Chapter 19 for more on the use of, and caveats to, cell arrays.

2.3 Entering Matrices
Matrices are just column vectors of row vectors. For instance, to input

\[ x = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}, \]

enter the matrix one row at a time, separating the rows with semicolons,

>> x = [1 2 3 ; 4 5 6; 7 8 9];
2.4 Higher Dimension Arrays
Multi-dimensional (N-dimensional) arrays are available for N up to about 30, depending on the size of each matrix dimension. Unlike scalars, vectors and matrices, higher dimension arrays cannot be entered directly using square brackets and so are usually constructed by calling functions such as zeros(2, 2, 2). Higher dimensional arrays can be useful for tracking matrix values through time, such as time-varying covariance matrices.
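As a minimal sketch of the time-varying use mentioned above, the block below builds a 2 by 2 by 3 array and fills one 2 by 2 panel per period. The variable names (T, sigma, panel2) and the values stored in each panel are illustrative assumptions, not data from these notes.

```matlab
% Pre-allocate a 2 by 2 by 3 array: three 2 by 2 "panels"
T = 3;
sigma = zeros(2, 2, T);

% Fill each panel with a made-up matrix for period t
for t = 1:T
    sigma(:, :, t) = t * eye(2);
end

% size returns the number of elements along each dimension
dims = size(sigma);

% Extract the 2 by 2 matrix stored for period 2
panel2 = sigma(:, :, 2);
```

The third index selects the panel, so sigma(:, :, t) behaves like an ordinary matrix for each t.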
2.5 Empty Matrices ([])
An empty matrix contains no elements, x = []. Empty matrices may be returned from functions in certain cases (e.g. if some criterion is not met). Empty matrices often cause problems, occasionally in difficult to predict ways, although they do have some useful applications. First, they can be used for lazy vector construction using repeated concatenation. For example

>> x=[]
x =
     []
>> x=[x 1]
x =
     1
>> x=[x 2]
x =
     1     2
>> x=[x 3]
x =
     1     2     3

is a legal operation that builds a 3-element vector by concatenating the previous value with a new value. This type of concatenation is bad for code performance and so it should be avoided by pre-allocating the data array using zeros (see Chapter 6), if possible. Second, empty matrices are needed for calling functions when multiple inputs are required but some are not used, for example std(x,[],2), which passes x as the first argument, 2 as the third and leaves the second empty.
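The two construction styles can be compared side by side. The sketch below builds the same vector twice, once by repeated concatenation and once into a pre-allocated array; the length n and the values stored (squares of the loop index) are arbitrary choices for illustration.

```matlab
n = 5;

% Lazy construction: x is re-sized on every pass through the loop
x = [];
for i = 1:n
    x = [x i^2];
end

% Pre-allocated construction: memory is reserved once, then filled in
y = zeros(1, n);
for i = 1:n
    y(i) = i^2;
end

% Both approaches produce the same vector
same = isequal(x, y);
```

For a vector this short the difference is negligible; the cost of re-sizing grows with the number of elements appended.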
Note: An empty matrix is a placeholder for unused inputs only. To skip unwanted outputs, MATLAB R2009b and later provide a tilde (~) placeholder, for example [~, index] = min(x).

2.6 Concatenation
Concatenation is the process by which one vector or matrix is appended to another. Both horizontal and vertical concatenation are possible. For instance, suppose

\[ x = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \quad \text{and} \quad y = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} \]

and

\[ z = \begin{bmatrix} x \\ y \end{bmatrix} \]

needs to be constructed. This can be accomplished by treating x and y as elements of a new matrix.

>> x=[1 2; 3 4]
x =
     1     2
     3     4
>> y=[5 6; 7 8]
y =
     5     6
     7     8
>> z=[x; y]
z =
     1     2
     3     4
     5     6
     7     8

This is an example of vertical concatenation. x and y can be horizontally concatenated in a similar fashion:

>> z=[x y]
z =
     1     2     5     6
     3     4     7     8

Note: Concatenating is the code equivalent of block-matrix forms in standard matrix algebra.
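To illustrate the block-matrix analogy, the sketch below assembles a block-diagonal matrix from x, y and two blocks of zeros. The variable names b and dims are illustrative only.

```matlab
x = [1 2; 3 4];
y = [5 6; 7 8];

% Block-diagonal matrix: x and y on the diagonal, zeros elsewhere.
% Blocks placed side by side must have the same number of rows, and
% blocks stacked vertically must have the same number of columns.
b = [x zeros(2, 2); zeros(2, 2) y];

dims = size(b);
```

Here b is 4 by 4, with x in rows and columns 1 to 2 and y in rows and columns 3 to 4.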
2.7 Accessing Elements of Matrices
Once a vector or matrix has been constructed, it is important to be able to access the elements individually. Data in matrices is stored in column-major order. This means elements are indexed by first counting down rows and then across columns. For instance, in the matrix

\[ x = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \]

the first element of x is 1, the second element is 4, the third is 7, the fourth is 2, and so on. Elements can be accessed by element number using parentheses (x(#)). After defining x, the elements of x can be accessed

>> x=[1 2 3; 4 5 6; 7 8 9]
x =
     1     2     3
     4     5     6
     7     8     9
>> x(1)
ans =
     1
>> x(2)
ans =
     4
>> x(3)
ans =
     7
>> x(4)
ans =
     2
>> x(5)
ans =
     5
The single index notation works well if x is a vector, in which case the indices correspond directly to the order of the elements in x. However, single index notation is confusing when x is a matrix. Double indexing of matrices is available using the notation x(#,#).

>> x(1,1)
ans =
     1
>> x(1,2)
ans =
     2
>> x(1,3)
ans =
     3
>> x(2,1)
ans =
     4
>> x(3,3)
ans =
     9

Higher dimension matrices can also be accessed in a similar manner, x(#, #, #). For example, x(1,2,3) would return the element in the first row of the second column of the third panel of a 3-D matrix x.

The colon operator (:) plays a special role in accessing elements. When used, it is interpreted as all elements in that dimension. For instance, x(:,1) is interpreted as all elements from matrix x in column 1. Similarly, x(2,:) is interpreted as all elements from x in row 2. Double : notation produces all elements of the original matrix and so x(:,:) returns x. Finally, vectors can be used to access elements of x. For instance, x([1 2],[1 2]) will return the elements from x in rows 1 and 2 and columns 1 and 2, while x([1 2],:) will return all columns from rows 1 and 2 of x.

>> x(1,:)
ans =
     1     2     3
>> x(2,:)
ans =
     4     5     6
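The relationship between single and double indexing can be checked directly. The sketch below uses the built-in sub2ind to convert a (row, column) pair into the corresponding single index; the variable names a, b, k, col1 and block are illustrative.

```matlab
x = [1 2 3; 4 5 6; 7 8 9];

% Column-major order: the fourth element is in row 1, column 2
a = x(4);
b = x(1, 2);

% sub2ind converts (row, column) subscripts into the single index
k = sub2ind(size(x), 1, 2);

% The colon selects whole rows or columns; vectors select several at once
col1  = x(:, 1);          % first column, a 3 by 1 vector
block = x([1 2], [1 2]);  % upper-left 2 by 2 block
```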
2.8 Calling Functions
Function calls have different conventions from other expressions. The most important difference is that functions can take more than one input and return more than one output. The generic structure of a function call is [out1, out2, out3, ...]=functionname(in1, in2, in3, ...). The important aspects of this structure are

- If only one output is needed, brackets ([ ]) are optional, for example y=mean(x).
- If multiple outputs are required, the outputs must be encapsulated in brackets, such as in [y, index] = min(x).
- The number of output variables determines how many outputs will be returned. Asking for more outputs than the function provides will result in an error.
- Both inputs and outputs must be separated by commas (,).
- Inputs can be the result of other functions as long as only the first output is required. For example, the following are equivalent,

  y = var(x);
  mean(y)

  and

  mean(var(x))

- Inputs can contain only selected elements of a matrix or vector (e.g. mean(x([1 2] ,[1 2]))).

Details of important function calls will be clarified as they are encountered.
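The conventions listed above can be seen together in a short sketch. The matrix x and the variable names are made up for illustration; min, sum and std are standard MATLAB functions.

```matlab
x = [3 1 2; 9 7 8];

% One output: brackets are optional
m = min(x);             % minimum of each column

% Two outputs: brackets are required
[m2, index] = min(x);   % minimums and the rows in which they occur

% Nesting uses only the first output of the inner function
total = sum(min(x));

% An empty second input allows the third input (the dimension) to be passed
s = std(x, [], 2);      % standard deviation of each row
```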
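2.9 Exercises
1. Input the following mathematical expressions into MATLAB.

\[ v = \begin{bmatrix} 1 \\ 1 \\ 2 \\ 3 \\ 5 \\ 8 \end{bmatrix} \quad x = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad y = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \quad z = \begin{bmatrix} 1 & 2 & 1 & 2 \\ 3 & 4 & 3 & 4 \\ 1 & 2 & 1 & 2 \end{bmatrix} \quad w = \begin{bmatrix} x & x \\ y & y \end{bmatrix} \]

2. What command would pull x out of w? (Hint: w([?],[?]) is the same as x.)
3. What command would pull [x; y] out of w? Is there more than one? If there are, list all alternatives.
4. What command would pull y out of z? List all alternatives.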
Chapter 3
3.1 Importing Data
Importing data ranges from moderately challenging to very difficult, depending on the data size and format. A few principles can simplify this task:

- The file imported should contain numbers only, with the exception of the first row which should contain the variable names.
- Use another program, such as Microsoft Excel, to manipulate data before importing.
- Each column of the spreadsheet should contain a single variable.
- Dates should be imported as numbers by first formatting the columns as Excel Dates and then reformatting as a number (dates with base year 1900). For example, January 1, 2000 would be 36526.
3.2 Robust Data Importing
The simplest and most robust method to import data is to use a correctly formatted Excel file and the import wizard. The key to the import is to make certain the Excel file has a very particular structure:

- One variable per column
- A valid, distinct variable name for the column in the first row
- All data in the column must be numeric, especially dates.

As an example, consider importing a month of GE prices downloaded from Yahoo! Finance. The original data can be found in GEPrices.xls and is presented in Figure 3.1. This data file nearly fits the requirements, although the first column, containing the dates, is not numeric. The key step to ensure a smooth import is to convert the dates from Excel dates to numbers. To perform the conversion, select the dates, right click and choose format. Select Number from the dialog box that pops up. If the conversion was performed correctly, the output should be similar to Figure 3.2. This clean file can be found in GEPricesClean.xls.

Figure 3.1: The raw data as taken from Yahoo! Finance. Most of the columns are well formatted with variable names in the first row and numeric content. However, the date column contains Excel dates rather than numeric values. This structure will prevent the Excel file from being parsed correctly by MATLAB.

Once the Excel file has been formatted, the final step is to import it. First, change the Current Directory to the directory with the Excel file to be imported. Next, select the Current Directory browser in the upper left pane of the main window. The Excel file should be present in this view. To import the file, right click on the filename and select Import (see Figure 3.3). This will trigger the dialog in Figure 3.4. To complete the import, make sure Create vectors from each column using column names is chosen and click Finish. If the import fails, the most likely cause is the format of the Excel file. Make certain the file conforms to the rules above and try again.
Note: If the Current Directory pane is absent, it can be enabled in the Desktop tab along the top of the MATLAB window.

3.3 Reading Excel Files
Data in Excel sheets can be imported using the function xlsread from the command window. Accompanying this set of notes is an Excel file, deciles.xls, which contains returns for the 10 CRSP deciles from January 1, 2004 to December 31, 2007. The first column contains the dates while columns 2 through 11 contain the portfolio returns from decile 1 through decile 10 respectively. To load the data, use the command

>> data = xlsread('deciles.xls');

Figure 3.2: This correctly formatted file contains only the variables to import: date, close and volume. Note that the date column has been converted from Excel dates to numbers so January 3, 2007 appears as 39085.
This command will read the data in sheet1 of file deciles.xls and assign it to data. xlsread can handle a number of other situations, including reading sheets other than sheet1 or reading only specific blocks of cells. For more information, see doc xlsread. Data can be exported to an Excel file using xlswrite. Extended information about an Excel file, such as sheet names, can be read using the command xlsfinfo.

Note: MATLAB and Excel do not agree about dates. MATLAB dates are measured as days past January 0, 0000 while Excel dates are measured relative to December 31, 1899. In MATLAB serial date 1 corresponds to January 1, 0000 while in Excel day 1 corresponds to January 1, 1900. To convert imported Excel dates into MATLAB dates, datenum('30DEC1899') must be added to the column of data representing the dates. December 30 is used rather than December 31 because Excel counts a February 29, 1900 that never existed. Returning to the example above,

>> [A,finfo]=xlsfinfo('deciles2.xls')
A =
Microsoft Excel Spreadsheet
finfo =
    'deciles'
>> data = xlsread('deciles2.xls','deciles','A2:K1257');
>> dates = data(:,1);
>> datestr(dates(1))
ans =
03-Jan-0104
>> dates = dates + datenum('30DEC1899');

Figure 3.3: To import data, select the Current Directory view, right click on the Excel file to be imported, and select Import. This will trigger the import wizard in Figure 3.4.

Figure 3.4: As long as the data is correctly formatted (see Figure 3.2), the import wizard should import the data and create variables with the same names as the column headers. To complete this step, make sure the second radio button is selected (Create vectors from each column using column names) and then select Finish.

This example uses the file deciles2.xls, which contains the sheet deciles. Opening the file in Excel shows that deciles contains column labels as well as the data. To import data from this file, xlsread needs to know to take the data from deciles in cells A2:K1257 (upper left and lower right corners of the block). Running the command xlsread('deciles2.xls', 'deciles', 'A2:K1257') does this. Finally, the disagreement in the base date is illustrated and the correction is applied. For more on dates, see Chapter 17.
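The date correction can be verified with the two Excel serial numbers quoted in this chapter (36526 for January 1, 2000 and 39085 for January 3, 2007). The sketch below is only a check of the arithmetic; the variable names are illustrative, and the numeric form datenum(year, month, day) is used in place of a date string.

```matlab
% Excel serial dates quoted in the text
excelDates = [36526; 39085];

% Offset between the two calendars: the MATLAB date number of 30 December 1899
offset = datenum(1899, 12, 30);
matlabDates = excelDates + offset;

% datestr converts MATLAB date numbers back into readable strings
firstDate = datestr(matlabDates(1));
secondDate = datestr(matlabDates(2));

% Compare with the same dates entered directly
check = isequal(matlabDates, [datenum(2000, 1, 1); datenum(2007, 1, 3)]);
```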
3.4 CSV Data
Comma-separated value (CSV) data is similar to Excel data. Note that CSV files must contain only numeric values. If the file contains strings, such as variable names, the import will fail. The command to read CSV data is essentially identical to the command to read Excel files,

% This command fails since deciles.csv contains variable names in the first row
>> data = csvread('deciles.csv')
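One possible workaround, sketched here under the assumption that the only non-numeric content is a single header row, is to pass csvread a starting row and column so that reading begins below the header. The file written below is a made-up stand-in for deciles.csv, created in a temporary location so the example is self-contained.

```matlab
% Create a small CSV file with a header row (name and contents are hypothetical)
fname = [tempname '.csv'];
fid = fopen(fname, 'wt');
fprintf(fid, 'date,ret1,ret2\n');
fprintf(fid, '38355,0.01,0.02\n');
fprintf(fid, '38356,-0.03,0.04\n');
fclose(fid);

% Row and column offsets are zero-based: start at row 1, column 0,
% which skips the header line
data = csvread(fname, 1, 0);

% Remove the temporary file
delete(fname);
```

Files with text in other positions still need to be cleaned first or read with a more flexible function.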
3.5 Text
Reading in text is also simple if the file only contains numbers. The standard command is textread which has a syntax similar to xlsread,

data = textread('deciles.txt');

textread can handle a variety of data formats, but it is recommended to keep data files as simple as possible and to only use tab delimited text files (such as the example deciles.txt). See doc textread for further information. textread has been superseded by the function textscan, which can read both numeric and string data.
3.6 MATLAB Data Files (.mat)
The native file format is the MATLAB data file, or mat file. Data from a mat file is loaded by entering

load deciles.mat

There is no need to specify an input variable as the mat file contains both variable names and data. See below for saving data in mat format.
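A short round trip shows why no input variable is needed: variables are stored with their names and reappear under those names when loaded. This is a minimal sketch using a temporary file; the variables x and y are made up for illustration.

```matlab
x = [1 2 3];
y = 'hello';

fname = [tempname '.mat'];   % temporary file; the name is arbitrary
save(fname, 'x', 'y');       % store both variables together with their names

clear x y                    % remove them from the workspace
load(fname);                 % x and y reappear under their original names

delete(fname);               % remove the temporary file
```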
3.7
MATLAB can be programmed to read virtually any text format since it contains functions for parsing and interpreting arbitrary text containing numeric data. Reading poorly formatted data les is an advanced technique and should be avoided if possible. However, some data is only available in formats where reading in data line-by-line is the only option. For instance, the standard import method fails if the raw data is very large (too large for Excel) and is poorly formatted. In this case, the only possibility is to write a program to read the le line-by-line and to process each line separately. The le IBM_TAQ.txt contains a simple example of data that is difcult to import. This le was downloaded from WRDS and contains all prices for IBM from the TAQ database in the interval January 1,2001 through January 31, 2001. It is too large to use in Excel and has both numbers, dates and text on each line. The following code block shown how the data in this le can be parsed.
fid=fopen('IBM_TAQ.txt','rt');
%Count number of lines
count=0;
while 1
    line=fgetl(fid);
    if ~ischar(line)
        break
    end
    count=count+1;
end
%Close the file
fclose(fid);
%Pre-allocate the data
dates = zeros(count-1,1);
time  = zeros(count-1,1);
price = zeros(count-1,1);
%Reopen the file
fid=fopen('IBM_TAQ.txt','rt');
%Get one line to throw away since it contains the column labels
line=fgetl(fid);
%Use count to index the lines this pass
count=1;
%while 1 and break work well when reading text
while 1
    line=fgetl(fid);
    %If the line is not a character value we've reached the end of the file
    if ~ischar(line)
        break
    end
    %Find all the commas, they delimit the file
    commas = strfind(line,',');
    %Dates are placed between the first and second commas
    dates(count)=datenum(line(commas(1)+1:commas(2)-1),'yyyymmdd');
    %Times are between the second and third
    temptime=line(commas(2)+1:commas(3)-1);
    %Times are colon separated, so they need further parsing
    colons=strfind(temptime,':');
    %Convert the text representing the hours, minutes and seconds to numbers
    hour=str2double(temptime(1:colons(1)-1));
    minute=str2double(temptime(colons(1)+1:colons(2)-1));
    second=str2double(temptime(colons(2)+1:length(temptime)));
    %Convert these values to seconds past midnight
    time(count)=hour*3600+minute*60+second;
    %Read the price between the third and fourth commas and convert to number
    price(count)=str2double(line(commas(3)+1:commas(4)-1));
    %Increment the count
    count=count+1;
end
fclose(fid);
This block of code does a few things:
- Opens the file directly using fopen
- Reads the file line by line using fgetl
- Counts the number of lines in the file
- Pre-allocates the dates, time and price variables using zeros
- Rereads the file, parsing each line by the location of the commas, using strfind to locate the delimiting character
- Uses datenum to convert string dates to numerical dates
- Uses str2double to convert strings to numbers
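The parsing steps above can be tried on a single line without the data file. This is a minimal sketch, assuming a hypothetical line in the same comma-delimited layout (ticker, date, time, price, size); the sample values and variable names are illustrative and are not taken from IBM_TAQ.txt.

```matlab
% Hypothetical sample line: ticker, date, time, price, size
textLine = 'IBM,20010102,9:30:03,84.50,100';
% Locate the four commas that delimit the five fields
commas = strfind(textLine, ',');
dateText = textLine(commas(1)+1:commas(2)-1);
timeText = textLine(commas(2)+1:commas(3)-1);
price = str2double(textLine(commas(3)+1:commas(4)-1));
% Split the time at the colons and convert to seconds past midnight
colons = strfind(timeText, ':');
hour   = str2double(timeText(1:colons(1)-1));
minute = str2double(timeText(colons(1)+1:colons(2)-1));
second = str2double(timeText(colons(2)+1:end));
secondsPastMidnight = hour*3600 + minute*60 + second;
% Convert the date text to a MATLAB serial date number
dateNumber = datenum(dateText, 'yyyymmdd');
```

Working on one line first makes it easier to check the index arithmetic before looping over a large file.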
3.8
Reading Poorly Formatted Text Using textscan
textscan is a fast method to read files that contain mixed numeric and string data. A text file must satisfy some constraints in order for textscan to be useful. First, the file must be regular in that it has the same number of columns in every row, and second, each column must contain the same type of data in every row; that is, the file must not mix strings with numbers in a column. IBM_TAQ.txt satisfies these two constraints and so can be read using the command block below.
fid = fopen('IBM_TAQ.csv');
data = textscan(fid, '%s %f %s %f %f', 'delimiter', ',', 'HeaderLines', 1)
fclose(fid);
The arguments to textscan instruct the function that a line is formatted according to string-number-string-number-number, where %s indicates string and %f indicates number, that the columns are delimited by a comma, and that the first line is a header and so should be skipped. The data read in by textscan is returned as a cell array, where numeric columns are stored as vectors while string values (the ticker and the time in this example) are stored as cell arrays of strings. The use of curly braces, {}, indicates that cell arrays are being used. See chapter 19 for more on accessing the values in cell arrays.
>> data
data =
  Columns 1 through 3
    {558986x1 cell}    [558986x1 double]    {558986x1 cell}
  Columns 4 through 5
    [558986x1 double]    [558986x1 double]
>> data{1}(1)
ans =
    'IBM'
>> data{2}(1)
ans =
    20070103
>> data{3}(1)
ans =
    '9:30:03'
>> data{4}(1)
ans =
   97.1800
>> data{5}(1)
ans =
   100
Note that the time column would need further processing to be converted into a useful format. For more on reading poorly formatted data files, see the documentation for fopen, fscanf, fread, fgetl, dlmread, and textscan. See chapter 18 for more on string manipulation.
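One possible way to post-process a time column like the one returned by textscan is sketched below. It assumes the times are stored as a cell array of strings in H:MM:SS form; the sample values and the names times and toSeconds are hypothetical.

```matlab
% Hypothetical time strings in the same layout as the time column
times = {'9:30:03'; '9:30:05'; '16:00:00'};
% sscanf returns [hours; minutes; seconds] as a column vector, and the
% row vector [3600 60 1] converts it to seconds past midnight
toSeconds = @(s) [3600 60 1] * sscanf(s, '%d:%d:%d');
% Apply the conversion to every cell; the result is a numeric column vector
secondsPastMidnight = cellfun(toSeconds, times);
```

This yields the same seconds-past-midnight measure used in the line-by-line parser, without an explicit loop.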
3.9
Stat Transfer
StatTransfer is available on the servers and is capable of reading and writing approximately 20 different formats, including MATLAB, GAUSS, Stata, SAS, Excel, CSV and text files. It allows users to load data in one format and output some or all of the data in another. StatTransfer can make some hard-to-manage situations (e.g. poorly formatted data) substantially easier. StatTransfer has a comprehensive help file to provide assistance.
3.10
Exporting Data
Saving Data
Once the data has been loaded, save it and any changes in the native MATLAB data format using save
>> save filename
This will produce the file filename.mat containing all variables in memory. filename can be replaced with any valid filename. To save a subset of the variables in memory, use
>> save filename var1 var2 var3
Exporting Data
Data can be exported to a delimited text file using save with the arguments -ascii -double (adding -tabs produces tab delimiters). It is generally good practice to export only one variable at a time using this method. Exporting more than one variable results in a poorly formatted file that may be hard to import into another program. For example,
>> save filename var1 -ascii -double
would save the data in var1 in a delimited text file. The restriction to a single variable should not be seen as a severe limitation since var1 can always be constructed from other variables (e.g. var1=[var2 var3];). Alternative methods to export data include xlswrite, csvwrite and dlmwrite.
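The two ways of saving can be compared with a small round trip. This is a minimal sketch using the function form of save and load; the variables and the temporary file names (built with tempname) are illustrative assumptions.

```matlab
% Hypothetical variables used only for illustration
var2 = (1:5)';
var3 = (6:10)';
var1 = [var2 var3];
% Text export of a single variable with full double precision
textFile = [tempname '.txt'];
save(textFile, 'var1', '-ascii', '-double');
fromText = load(textFile);     % a text file loads as one numeric matrix
% Native format keeps the variable names alongside the data
matFile = [tempname '.mat'];
save(matFile, 'var1', 'var2');
fromMat = load(matFile);       % struct with fields var1 and var2
% Remove the temporary files
delete(textFile);
delete(matFile);
```

The text file carries only the numbers, while the mat file preserves the variable names, which is why a single variable per text file is the safer choice.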
3.11
Exercises
1. The file exercise3.xls contains three columns of data, the date, the return on the S&P 500, and the return on XOM (ExxonMobil). Using Excel, convert the date to a number and save the file. (Hint: Format the cells with dates as numbers. They should be around 35,000).
2. Use xlsread to read the file saved in the previous exercise. Load the three series into a new variable named returns.
3. Parse returns into three variables, dates, SP500 and XOM. (Hint: use the : operator).
4. Save a MATLAB data file exercise3 with all three variables.
5. Save a MATLAB data file dates with only the variable dates.
6. Construct a new variable, sum_returns, as the sum of SP500 and XOM. Create another new variable, output_data, as a horizontal concatenation of dates and sum_returns.
7. Export the variable output_data to a new .xls file using xlswrite. See the help available for xlswrite.
Chapter 4
Basic Math
Mathematical operations closely follow the rules of linear algebra. Operations legal in linear algebra are legal in MATLAB; operations that are not legal in linear algebra are not legal in MATLAB. For instance, matrices must conform along their inside dimensions to be multiplied; attempting to multiply nonconforming matrices produces an error.
4.1
Operators
These standard operators are available:
Operator   Meaning                   Example   Algebraic
+          Addition                  x + y     x + y
-          Subtraction               x - y     x - y
*          Multiplication            x * y     xy
/          Division (Left divide)    x / y     x/y
\          Right divide              x \ y     y/x
^          Exponentiation            x ^ y     x^y
When x and y are scalars, the behavior of these operators is obvious. When x and y are matrices, things are a bit more complex.
4.2
Addition and subtraction require x and y to have the same dimensions or to be scalar. If they are both matrices, z=x+y produces a matrix with z(i,j)=x(i,j)+y(i,j). If x is scalar and y is a matrix, z=x+y results in z(i,j)=x+y(i,j). Suppose z=x+y:
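            y Scalar                 y Matrix
x Scalar    Any                      Any
            z = x + y                z_ij = x + y_ij
x Matrix    Any                      Both dimensions match
            z_ij = x_ij + y          z_ij = x_ij + y_ij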
Note: These conform to the standard rules of matrix addition and subtraction. x_ij is the element from row i and column j of x.
4.3
Multiplication requires the inside dimensions to be the same or for one input to be scalar. If x is N by M and y is K by L and both are non-scalar matrices, x*y requires M = K . Similarly, y*x requires L = N . If x is scalar and y is a matrix, then z=x*y produces z(i,j)=x*y(i,j). Suppose z=x*y:
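            y Scalar                 y Matrix
x Scalar    Any                      Any
            z = x*y                  z_ij = x*y_ij
x Matrix    Any                      Inside dimensions match
            z_ij = y*x_ij            z_ij = sum over k of x_ik*y_kj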
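The conformability rule for multiplication can be checked directly. This is a minimal sketch with small illustrative matrices; the try and catch block is only there to show that a non-conforming product raises an error.

```matlab
x = [1 2 3; 4 5 6];      % 2 by 3
y = [1 0; 0 1; 1 1];     % 3 by 2
z1 = x*y;                % inside dimensions 3 and 3 match, result is 2 by 2
z2 = y*x;                % inside dimensions 2 and 2 match, result is 3 by 3
z3 = 2*x;                % scalar times matrix scales every element
% x*x is not conformable since the inside dimensions are 3 and 2
conformable = true;
try
    bad = x*x;
catch
    conformable = false;
end
```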
4.4
Matrix division is not generally defined in linear algebra and so is slightly trickier. The intuition for matrix division comes from thinking about a set of linear equations. Suppose there is some z, an L by M matrix, such that yz = x where x is N by M and y is N by L. Division finds z as the solution to this set of linear equations by least squares, and so $z = (y'y)^{-1}(y'x)$. Suppose z=x/y:
y
Scalar
x
Matrix
Note: Like linear regression, matrix left division is only well defined if y is nonsingular and thus has full rank.
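The least squares formula can be compared with MATLAB's built-in solver. In MATLAB the least squares solution of yz = x is returned by the backslash form y\x, while the forward slash form x/y solves zy = x. The sketch below uses small hypothetical data, with y of full column rank.

```matlab
% Hypothetical data: y is N by L with full column rank, x is N by M
y = [1 1; 1 2; 1 3; 1 4];
x = [2.1; 3.9; 6.2; 7.8];
% Least squares solution written out using the formula
zFormula = inv(y'*y)*(y'*x);
% The same solution from the built-in operator
zSolve = y\x;
% The two agree up to floating point rounding
gap = max(abs(zFormula - zSolve));
```

In practice the operator form is preferred to the explicit inverse since it is more accurate numerically.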
4.5
Matrix right division is simply the opposite of matrix left division. Suppose z=x\y:
Scalar
x
Matrix
4.6
Matrix Exponentiation (^)
Matrix exponentiation is only defined if at least one of x or y is a scalar. Suppose z=x^y:
            y Scalar                 y Matrix
x Scalar    Any                      y Square
            z = x^y                  (see the note below)
x Matrix    x Square                 Not allowed
            (see the note below)
Note: In the case where x is a matrix and y is an integer, z=x*x*. . .*x (y times). If y is not an integer, this function involves eigenvalues and eigenvectors (see footnote 1).
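Both cases can be verified numerically. This is a minimal sketch with illustrative 2 by 2 matrices; the scalar-base case uses the eigenvalue construction V*diag(x.^diag(D))*inv(V) with a symmetric y so that the eigenvalues are real.

```matlab
% Matrix raised to an integer power is repeated matrix multiplication
x = [2 1; 1 3];
powerGap = max(max(abs(x^3 - x*x*x)));
% Scalar raised to a square matrix, built from the eigendecomposition of y
y = [2 1; 1 2];
[V, D] = eig(y);
w = V*diag(2.^diag(D))*inv(V);
eigGap = max(max(abs(w - 2^y)));
```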
4.7
Parentheses
Parentheses can be used in the usual way to control the order in which mathematical expressions are evaluated, and can be nested to create complex expressions. See section 4.10 on Operator Precedence for more information on the order mathematical expressions are evaluated.
4.8
. (dot) operator
The . operator (read dot operator) changes matrix operations into element-by-element operations. Suppose x and y are N by N matrices. z=x*y results in usual matrix multiplication where z(i,j) = x(i,:) * y(:,j), while z = x .* y produces z(i,j) = x(i,j) * y(i,j). Multiplication (.*), division (./), right division (.\), and exponentiation (.^) all have dot forms.
z=x.*y    z(i,j)=x(i,j)*y(i,j)
z=x./y    z(i,j)=x(i,j)/y(i,j)
z=x.\y    z(i,j)=x(i,j)\y(i,j)
z=x.^y    z(i,j)=x(i,j)^y(i,j)
Note: These are sometimes called the Hadamard operators, especially .*.
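The difference between the matrix and dot forms is easy to see on a small example. This is a minimal sketch with illustrative matrices.

```matlab
x = [1 2; 3 4];
y = [5 6; 7 8];
matProd  = x*y;     % matrix product: rows of x times columns of y
elemProd = x.*y;    % element-by-element product
elemDiv  = x./y;    % element-by-element division
elemPow  = x.^2;    % each element squared
matPow   = x^2;     % matrix power, the same as x*x
```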
1 If x is a scalar and y is a square matrix, then x^y is defined as V * diag(x.^diag(D)) * inv(V) where V is the matrix of eigenvectors and D is a diagonal matrix containing the corresponding eigenvalues of y.
4.9
Transpose
Matrix transpose is expressed using the ' operator. For instance, if x is an M by N matrix, x' is its transpose with dimensions N by M.
4.10
Operator Precedence
Computer math, like standard math, has operator precedence which determines how mathematical expressions such as
2^3+3^2/7*13
are evaluated. The order of evaluation is:
Operator                   Name                               Rank
( )                        Parentheses                        1
', .', ^, .^               Transpose, All Exponentiation      2
~                          Negation (Logical)                 3
+, -                       Unary Plus, Unary Minus            3
*, .*, /, ./, \, .\        All multiplication and division    4
+, -                       Addition and subtraction           5
:                          Colon Operator                     6
<, <=, >, >=, ==, ~=       Logical operators                  7
&                          Element-by-Element AND             8
|                          Element-by-Element OR              9
&&                         Short Circuit AND                  10
||                         Short Circuit OR                   11
In the case of a tie, operations are executed left-to-right. For example, x^y^z is interpreted as (x^y)^z. Note: Unary operators are + or - operations that apply to a single element. For example, consider the expression (-4). This is an instance of a unary - since there is only 1 operand. (-4)^2 produces 16. -4^2 produces -16 since ^ has higher precedence than unary negation and so is interpreted as -(4^2). -4 * -4 produces 16 since it is interpreted as (-4) * (-4).
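These precedence rules can be confirmed at the command line. This is a minimal sketch; the variable names are illustrative.

```matlab
a = -4^2;           % exponentiation first, then unary minus: -(4^2)
b = (-4)^2;         % parentheses force the negation first
c = 2^3^2;          % ties run left-to-right: (2^3)^2
d = 2^3+3^2/7*13;   % powers, then / and * left-to-right, then +
e = -4 * -4;        % unary minus binds before multiplication
```

When in doubt, adding parentheses costs nothing and removes any ambiguity.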
4.11
Exercises
1. Using the matrices entered in exercise 1 of chapter 2, compute the values of u + v , v + u , v u , u v and xy
2. Is x\1 legal? If not, why not? What about x/1?
3. Compute the values (x+y)^2 and x^2+x*y+y*x+y^2. Are they the same?
4. Is x^2+2*x*y+y^2 the same as either above?
5. When will x^y and x.^y be the same?
6. Is a*b+a*c the same as a*b+c? If so, show it; if not, how can the second be changed so they are equal?
7. Suppose a command x^y*w+z was entered. What restrictions on the dimensions of w, x, y and z must be true for this to be a valid statement?
8. What is the value of -2^4? What about (-2)^4?
Chapter 5
Basic Functions
This section provides a reference for a set of commonly used functions and a discussion of how they behave.
length
To find the size of the maximum dimension of x, use z=length(x). If x is T by K, T > K, z = T. If K > T, z = K. Using length is risky since the value it returns can be the number of columns or the number of rows, depending on which is larger. It is better practice to use size(x,1) and size(x,2) depending on whether the number of rows or the number of columns is required.
>> x=[1 2 3; 4 5 6]
x =
     1     2     3
     4     5     6
>> length(x)
ans =
     3
>> length(x')
ans =
     3
size
To determine the size of one dimension of a matrix, use z=size(x,DIM), where DIM is the dimension. Note that dimension 1 is the number of rows while dimension 2 is the number of columns, so if x is T by K, z=size(x,1) returns T while z=size(x,2) returns K. Alternatively, s=size(x) returns a vector s with the size of each dimension.
>> x=[1 2 3; 4 5 6]
x =
     1     2     3
     4     5     6
>> size(x,1)
ans =
     2
sum
To compute the sum of a matrix,
$z = \sum_{t=1}^{T} x_t$
use the command sum(x). z=sum(x) returns a 1 by K vector containing the sum of each column, so z(i) = sum(x(:,i)) = x(1,i) + x(2,i) + . . . + x(T,i). Note: If x is a vector, sum will add all elements of x whether it is a row or column vector.
>> x=[1 2 3; 4 5 6]
x =
     1     2     3
     4     5     6
>> sum(x)
ans =
     5     7     9
>> sum(x')
ans =
     6    15
min
To find the minimum element of a vector or the columns of a matrix,
$\min_t x_{it}, \; i = 1, 2, \ldots, K$
use the command min(x). If x is a vector, min(x) is scalar. If x is a matrix, min(x) is a 1 by K vector containing the minimum values of each column.
>> x=[1 2 3; 4 5 6]
x =
     1     2     3
     4     5     6
>> min(x)
ans =
     1     2     3
>> min(x')
ans =
     1     4
max
To find the maximum element of a vector or the columns of a matrix,
$\max_t x_{it}, \; i = 1, 2, \ldots, K$
use the command max(x). If x is a vector, max(x) is scalar. If x is a matrix, max(x) is a 1 by K vector containing the maximum values of each column.
sort
To sort the values of a vector or the columns of a matrix from smallest to largest, use the command sort(x). If x is a vector, z=sort(x) is a vector where z(1)=min(x) and z(i)<=z(i+1). If x is a matrix, sort(x) is a matrix of the same size where each column is sorted from smallest to largest.
>> x=[1 5 2; 4 3 6]
x =
     1     5     2
     4     3     6
>> sort(x)
ans =
     1     3     2
     4     5     6
>> sort(x')
ans =
     1     3
     2     4
     5     6
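sort can also return the positions of the sorted values in the original vector, which is useful for reordering a second series in the same way. This is a minimal sketch with an illustrative vector.

```matlab
x = [3 1 2];
% s holds the sorted values and idx the original positions, so x(idx) equals s
[s, idx] = sort(x);
% An optional argument sorts from largest to smallest
desc = sort(x, 'descend');
```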
exp
To take the exponential of a vector or matrix (element-by-element),
$e^{x}$
use exp. z=exp(x) returns a vector or matrix the same size as x where z(i,j)=exp(x(i,j)).
>> x=[1 2 3; 4 5 6]
x =
     1     2     3
     4     5     6
>> exp(x)
ans =
    2.7183    7.3891   20.0855
   54.5982  148.4132  403.4288
log
To take the natural logarithm of a vector or matrix,
$\ln x$
use log. z=log(x) returns a vector or matrix the same size as x where z(i,j)=log(x(i,j)).
>> x=[1 2 3; 4 5 6]
x =
     1     2     3
     4     5     6
>> log(x)
ans =
         0    0.6931    1.0986
    1.3863    1.6094    1.7918
sqrt
To compute the element-by-element square root of a vector or matrix,
$\sqrt{x_{ij}}$
use sqrt. z=sqrt(x) returns a vector or matrix the same size as x where z(i,j)=sqrt(x(i,j)).
>> x=[1 2 3; 4 5 6]
x =
     1     2     3
     4     5     6
>> sqrt(x)
ans =
    1.0000    1.4142    1.7321
    2.0000    2.2361    2.4495
mean
To compute the mean of a vector or matrix,
$z = \frac{\sum_{t=1}^{T} x_t}{T}$
use the command mean(x). z=mean(x) is a 1 by K vector containing the means of each column, so z(i) = sum(x(:,i)) / T. Note: If x is a vector, mean will compute the mean of x whether it is a row or column vector.
>> x=[1 2 3; 4 5 6]
x =
     1     2     3
     4     5     6
var
To compute the sample variance of a vector or matrix,
$\hat{\sigma}^2 = \frac{\sum_{t=1}^{T} (x_t - \bar{x})^2}{T-1}$
use the command var(x). If x is a vector, var(x) is scalar. If x is a matrix, var(x) is a 1 by K vector containing the sample variances of each column. Note: This command uses T - 1 in the denominator unless an optional second argument is used.
>> x=[1 2 3; 4 5 6]
x =
     1     2     3
     4     5     6
>> var(x)
ans =
    4.5000    4.5000    4.5000
>> var(x')
ans =
     1     1
cov
To compute the sample covariance of a vector or matrix,
$\hat{\Sigma} = \frac{\sum_{t=1}^{T} (x_t - \bar{x})'(x_t - \bar{x})}{T-1}$
where $x_t$ is the t-th row of x, use the command cov(x). If x is a vector, cov(x) is scalar (and is identical to var(x)). If x is a matrix, cov(x) is a K by K matrix with sample variances in the diagonal elements and sample covariances in the off diagonal elements. Note: This command uses T - 1 in the denominator unless an optional second argument is used.
x =
     1     2     3
     4     5     6
>> cov(x)
ans =
    4.5000    4.5000    4.5000
    4.5000    4.5000    4.5000
    4.5000    4.5000    4.5000
std
To compute the sample standard deviation of a vector or matrix,
$\hat{\sigma} = \sqrt{\frac{\sum_{t=1}^{T} (x_t - \bar{x})^2}{T-1}}$
use the command std(x). If x is a vector, std(x) is scalar. If x is a matrix, std(x) is a 1 by K vector containing the sample standard deviations of each column. Note: This command uses T - 1 in the denominator unless an optional second argument is used, and is equivalent to sqrt(var(x)).
>> x=[1 2 3; 4 5 6]
x =
     1     2     3
     4     5     6
>> std(x)
ans =
    2.1213    2.1213    2.1213
>> std(x')
ans =
     1     1
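The relationships between var, std and cov can be checked on the same matrix. This is a minimal sketch; the second argument of 1 passed to var is the optional flag that switches the denominator from T - 1 to T.

```matlab
x = [1 2 3; 4 5 6];
v = var(x);          % column variances with T-1 in the denominator
s = std(x);          % column standard deviations
c = cov(x);          % K by K covariance matrix
vT = var(x, 1);      % same variances with T in the denominator
% std is the square root of var, and var sits on the diagonal of cov
stdGap = max(abs(s - sqrt(v)));
diagGap = max(abs(diag(c)' - v));
```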
skewness
To compute the sample skewness of a vector or matrix,
$\text{skew} = \frac{\frac{1}{T}\sum_{t=1}^{T} (x_t - \bar{x})^3}{\hat{\sigma}^3}$
use the command skewness(x). If x is a vector, skewness(x) is scalar. If x is a matrix, skewness(x) is a 1 by K vector containing the sample skewness of each column.
>> x=[1 2 3; 4 5 6]
x =
     1     2     3
     4     5     6
>> skewness(x)
ans =
     0     0     0
>> skewness(x')
ans =
     0     0
kurtosis
To compute the sample kurtosis of a vector or matrix,
$\kappa = \frac{\frac{1}{T}\sum_{t=1}^{T} (x_t - \bar{x})^4}{\hat{\sigma}^4}$
use the command kurtosis(x). If x is a vector, kurtosis(x) is scalar. If x is a matrix, kurtosis(x) is a 1 by K vector containing the sample kurtosis of each column.
>> x=[1 2 3; 4 5 6]
x =
     1     2     3
     4     5     6
>> kurtosis(x)
ans =
     1     1     1
>> kurtosis(x')
ans =
    1.5000    1.5000
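The skewness and kurtosis values shown above can be reproduced from the moment definitions using only basic functions. This is a minimal sketch for a single vector, assuming the variance in the denominator is computed with T (rather than T - 1), which is the convention that reproduces the output above.

```matlab
x = [1 2 3];
T = length(x);
xbar = sum(x)/T;
% Second central moment with T in the denominator (assumed convention)
sigma2 = sum((x - xbar).^2)/T;
% Third and fourth central moments scaled by powers of the standard deviation
skew = (sum((x - xbar).^3)/T) / sigma2^(3/2);
kurt = (sum((x - xbar).^4)/T) / sigma2^2;
```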
: operator
The : operator has two uses. The first allows elements in a matrix or vector (e.g. x(1,:)) to be accessed and has previously been described. The other is to create a row vector with evenly spaced points. In this context, the : operator has two forms, first:last and first:increment:last. The basic form, first:last, produces a row vector of the form [first, first + 1, . . . first + N] where N is the largest integer such that first+N <= last. In common usage, first and last will be integers and N=last-first. These examples show how this construction works:
>> x=1:5
x =
     1     2     3     4     5
>> x=1:3.5
x =
     1     2     3
>> x=-4:6
x =
    -4    -3    -2    -1     0     1     2     3     4     5     6
The second form for the : operator includes an increment. The resulting sequence will have the form [first, first + increment, first + 2(increment), . . . first + N(increment)] where N is the largest integer such that first+N(increment) <= last. Consider these two examples:
>> x=0:.1:.5
x =
         0    0.1000    0.2000    0.3000    0.4000    0.5000
>> x=0:pi:10
x =
         0    3.1416    6.2832    9.4248
The increment does not have to be positive. If a negative increment is used, the general form is unchanged but the stopping condition changes to N is the largest integer such that first+N(increment) >= last. For example,
>> x=-1:-1:-5
x =
    -1    -2    -3    -4    -5
>> x=0:-pi:-10
x =
         0   -3.1416   -6.2832   -9.4248
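The stopping rules for both forms can be confirmed with a few short sequences. This is a minimal sketch; the last line shows that an empty vector is returned when no value satisfies the stopping condition.

```matlab
a = 1:5;          % unit increment
b = 1:3.5;        % stops at 3 since 4 would exceed 3.5
c = 0:0.1:0.5;    % six points including both ends
d = 10:-3:0;      % negative increment stops at 1 since -2 is below 0
e = 5:1;          % first exceeds last with a positive increment, so e is empty
```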
linspace
linspace is similar to the : operator. Rather than producing a row vector with a predetermined increment, linspace produces a row vector with a predetermined number of nodes. The generic form is linspace(lower,upper,N) where lower and upper are the two bounds of the series and N is the number of points to produce. If inc is defined as inc=(upper-lower)/(N-1), the resulting sequence will have the form [lower, lower + inc, lower + 2inc, . . . lower + (N-1)inc] and the command linspace(lower,upper,N) will produce the same output, up to floating point rounding, as lower:(upper-lower)/(N-1):upper. Note: Remember : is a low precedence operator so operations involving : should always be enclosed in parentheses if there is anything else on the same line. Failure to do so can result in undesirable or unexpected behavior. For instance, consider:
>> N=4;
>> lower=0;
>> upper=1;
>> linspace(lower,upper,N)-(lower:(upper-lower)/(N-1):upper)
ans =
  1.0e-015 *
         0         0   -0.1110         0
>> linspace(lower,upper,N)-lower:(upper-lower)/(N-1):upper
ans =
         0    0.3333    0.6667    1.0000
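Since the two constructions can differ by rounding error, a comparison with a tolerance is safer than a test for exact equality. This is a minimal sketch using the same values as above; the tolerance of 1e-12 is an arbitrary illustrative choice.

```matlab
N = 4;
lower = 0;
upper = 1;
viaLinspace = linspace(lower, upper, N);
% Parentheses keep the whole colon expression together
viaColon = (lower:(upper-lower)/(N-1):upper);
% Largest absolute difference between the two sequences
gap = max(abs(viaLinspace - viaColon));
sameUpToRounding = gap < 1e-12;
```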
logspace
logspace produces points uniformly distributed in log10 space. logspace(lower, upper, N) is the same
5.1
Exercises
1. Load the MATLAB data file created in the Chapter 3 exercises and compute the mean, standard deviation, variance, skewness and kurtosis of both returns (SP500 and XOM).
2. Create a new matrix, returns = [SP500 XOM]. Repeat exercise 1 on this matrix.
3. Compute the mean of returns.
4. Using both the : operator and linspace, create the sequence 0, 0.01, 0.02, . . . , .99, 1.
5. Create a custom logspace using the natural log (base e) rather than the logspace created in base 10 (which is what logspace uses). Hint: Use linspace AND exp.
6. Find the max and min of the variable SP500 (see the Chapter 3 exercises). Create a new variable SP500sort which contains the sorted values of this series. Verify that the min corresponds to the first value of this sorted series and the max corresponds to the last. Hint: Use length or size.
Chapter 6
Special Matrices
Commands are available to produce a number of useful matrices.
ones
ones generates a matrix of 1s and is generally called with two arguments, the number of rows and the number of columns.
oneMatrix = ones(N,M)
will generate a matrix of 1s with N rows and M columns. Note: To use the function call above, N and M must have been previously defined (e.g. N=10;M=7).
eye
eye generates an identity matrix (matrix with ones on the diagonal, zeros everywhere else). An identity
zeros
zeros produces a matrix of 0s in the same way ones produces a matrix of 1s, and is useful for initializing a
6.1
Exercises
1. Produce two matrices, one containing all zeros and one containing only ones, of size 10 × 5.
2. Multiply these two matrices in both possible ways.
3. Produce an identity matrix of size 5. Take the exponential of this matrix, element-by-element.
4. How could these be replaced with repmat?
Chapter 7
Matrix Functions
Some functions operate exclusively on matrix inputs. Some are mathematical in nature, for instance computing the eigenvalues and eigenvectors, while others are functions for manipulating the elements of a matrix.
7.1
Matrix Manipulation
repmat
repmat, along with reshape, are two of the most useful non-mathematical functions. repmat replicates a matrix according to a specified size vector. To understand how repmat functions, imagine forming a matrix composed of blocks. The generic form of repmat is repmat(X,M,N) where X is the matrix to be replicated, M is the number of rows in the new block matrix, and N is the number of columns in the new block matrix. For example, suppose X was a matrix
X = [1 2; 3 4]
and the block matrix
Y = [X X X; X X X]
parameters determined at run-time, such as the number of explanatory variables in a model and second repmat can be used for arbitrary dimensions. Manual matrix construction becomes tedious and error prone with as few as 5 rows and columns.
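The block matrix in the example can be built either way, and the run-time case is where repmat pays off. This is a minimal sketch; the variable K stands in for a hypothetical dimension that is only known when the program runs.

```matlab
X = [1 2; 3 4];
% Two block rows and three block columns of X
Y = repmat(X, 2, 3);
% The same matrix constructed manually
manual = [X X X; X X X];
% Block counts can be variables, which manual construction cannot handle
K = 5;
wide = repmat(X, 1, K);
```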
reshape
reshape transforms a matrix with one set of dimensions into one with a different set, preserving the number of elements. reshape can transform an M by N matrix x into a K by L matrix y as long as MN = KL. Note that the number of elements cannot change. The most useful call to reshape switches a matrix into a vector or vice versa. For example
>> x = [1 2; 3 4];
>> y = reshape(x,4,1)
y =
     1
     3
     2
     4
>> z = reshape(y,1,4)
z =
     1     3     2     4
>> w = reshape(z,2,2)
w =
     1     2
     3     4
The crucial implementation detail of reshape is that matrices are stored using column-major ordering. Elements in matrices are counted first down and then across columns. reshape will place elements of the old matrix into the same position in the new matrix and so after calling reshape, x(1) = y(1), x(2) = y(2), and so on.
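Column-major ordering is easiest to see by comparing linear indices before and after reshaping. This is a minimal sketch; transposing before reshaping is one way to read the elements row by row instead.

```matlab
x = [1 2; 3 4];
y = reshape(x, 4, 1);          % elements are read down the columns
v = x(:);                      % the (:) shortcut produces the same column
% Linear indexing is unchanged by reshape
samePositions = all(x(1:4) == y(1:4)');
% Transposing first reads the original matrix across the rows
rowOrder = reshape(x', 1, 4);
```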
diag
diag can produce one of two results depending on the form of the input. If the input is a square matrix, it will return a column vector containing the elements of the diagonal. If the input is a vector, it will return a matrix containing the elements of the vector along the diagonal. Consider the following example:
>> x = [1 2; 3 4]
x =
     1     2
     3     4
>> y = diag(x)
y =
     1
     4
>> z=diag(y)
z =
     1     0
     0     4
7.2
chol
chol computes the Cholesky factor of a positive definite matrix. The Cholesky factor is a lower triangular
det
det computes the determinant of a square matrix.
eig
eig computes the eigenvalues and eigenvectors of a square matrix. Two output arguments are required in
inv
inv computes the inverse of a matrix. inv(x) can alternatively be computed using x^(-1).
kron
kron computes the Kronecker product of two matrices,
trace
trace computes the trace of a square matrix (sum of diagonal elements) and so trace(x) equals sum(diag(x)).
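A few of these identities can be verified on a small matrix. This is a minimal sketch with an illustrative symmetric 2 by 2 matrix; comparisons involving inverses use a tolerance since the results are subject to rounding.

```matlab
x = [2 1; 1 2];
t = trace(x);                  % sum of the diagonal elements
d = det(x);                    % 2*2 - 1*1
xi = inv(x);
% inv(x) and x^(-1) agree, and x times its inverse is the identity
invGap = max(max(abs(xi - x^(-1))));
identityGap = max(max(abs(x*xi - eye(2))));
```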
Chapter 8
In the first example, eps/2 < eps so it has no effect while 2*eps > eps so it does. However in the second example, 2*eps/10 < eps, so it has no effect when added. This is a very tricky concept to understand, but failure to understand numeric limits can result in errors in code that appears to be otherwise correct.
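The effect of adding quantities smaller and larger than eps can be reproduced directly. This is a minimal sketch that assumes the comparison is made relative to 1, where eps is the spacing to the next larger double precision number.

```matlab
% Adding less than eps to 1 is lost to rounding
tooSmall = (1 + eps/2) == 1;
% Adding eps or more changes the stored value
justEnough = (1 + eps) == 1;
larger = (1 + 2*eps) == 1;
```

Here tooSmall is true while justEnough and larger are false, which is why equality tests on floating point values should usually allow for a tolerance.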
8.1
Exercises
1. What is the value of log(exp(1000)) both analytically and in MATLAB? Why do these differ?
2. What is the value of eps/10?
3. Is .1 different from .1+eps/10?
4. Is 1e120 ($1 \times 10^{120}$) different from 1e120+1e102? (Hint: Test with ==)
Chapter 9
9.1
The core logical operators are
>     Greater than
>=    Greater than or equal to
<     Less than
<=    Less than or equal to
==    Equal to
~=    Not equal to
Logical operators can be used on scalars, vectors or matrices. All comparisons are done element-by-element and return either 1 (logical true) or 0 (logical false). For instance, suppose x and y are matrices. z=(x<y); will be a matrix of the same size as x and y composed of 0s and 1s. Alternatively, if one is scalar, say y, then the elements of z are z(i,j)=(x(i,j)<y);. Logical operators can be used to access elements of a vector or matrix. For instance, suppose z = x L y where L is one of the logical operators above such as < or ==. The following table examines the behavior when x and/or y are scalars or matrices. Suppose z=x < y:
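            y Scalar                 y Matrix
x Scalar    Any                      Any
            z = x < y                z_ij = x < y_ij
x Matrix    Any                      Same dimensions
            z_ij = x_ij < y          z_ij = x_ij < y_ij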
Logical operators are used in portions of programs known as flow control (such as if . . . else . . . end blocks) which will be discussed later. It is important to remember that vector or matrix logical operations return vector or matrix output and that flow control blocks require scalar logical expressions.
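The element-by-element behavior, and one way to reduce a matrix of comparisons to the scalar that flow control needs, can be sketched as follows. The matrices are illustrative; any and all are applied to z(:) so that the result is a single logical value.

```matlab
x = [1 5; 7 2];
y = [3 3; 3 3];
z = x < y;               % matrix against matrix, compared element by element
zs = x < 3;              % matrix against scalar gives the same result here
% Reduce the matrix of comparisons to scalar logical values
anyLess = any(z(:));     % true if at least one comparison holds
allLess = all(z(:));     % true only if every comparison holds
```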
9.2
Logical expressions can be combined using three logical devices,
&    AND
|    OR
~    NOT
Recall that ~ (NOT) has higher precedence than & (AND) and | (OR). Aside from the different level of precedence, these operators follow the same rules as other logical operators. If used on two matrices, the dimensions must be the same. If used on a scalar and a matrix, the effect is the same as calling the logical device on the scalar and each element of the matrix. Suppose x and y are logical variables (1s or 0s), and define z=x & y:
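            y Scalar                 y Matrix
x Scalar    Any                      Any
            z = x & y                z_ij = x & y_ij
x Matrix    Any                      Same dimensions
            z_ij = x_ij & y          z_ij = x_ij & y_ij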
9.3 logical
The command logical is used to convert non-logical elements to logical. Logical values and regular numbers are treated differently. Logical elements only take up 1 byte of memory (the smallest unit of memory MATLAB can address) while regular numbers require 8 bytes. In certain situations, a logical value is required. One such example is in indexing an array. As previously demonstrated, the elements of a matrix x can be accessed by x(#) where # can be a vector of indices. Since the elements of x are indexed 1,2,. . ., an attempt to retrieve x(0) will return an error. However, if # is not a number but a logical value, this behavior changes. Logical indices are interpreted as indicator functions. Consider the behavior in the following code:
>> x = [1 2 3 4];
>> y = [1 1];
>> x(y)
ans =
     1     1
>> y = logical([1 1]);
>> x(y)
ans =
     1     2
>> y = logical([1 0 1 0]);
>> x(y)
ans =
     1     3
Using logical indexing produces indices which are interpreted as indicator variables when deciding what to return. Logical indices behave like a series of light switches indicating which elements to select: 1 for on (selected) and 0 for off (not selected). For another example, consider the following block of code
Note: logical turns any non-zero value into logical true (1), although a warning is generated if the values differ from 0 or 1. For example
>> x=[0 1 2 3]
x =
     0     1     2     3
>> logical(x)
Warning: Values other than 0 or 1 converted to logical 1.
ans =
     0     1     1     1
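In practice logical indices usually come from a comparison rather than from an explicit call to logical. This is a minimal sketch with an illustrative vector, showing selection and counting with the same indicator.

```matlab
x = [-2 5 0 7 -1];
% The comparison returns a logical vector the same size as x
isPositive = x > 0;
% Used as an index, it selects only the elements flagged as true
positives = x(isPositive);
% Summing the indicator counts how many elements were selected
nPositive = sum(isPositive);
```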
9.5 find
find is a useful function for working with multiple data series. find is not logical itself, but it takes logical inputs and returns matrix indices where the logical statement is true. There are two primary ways to call find. indices = find (x < y) will return indices (1,2,. . .,numel(x)) while [i,j] = find (x < y) will return pairs of matrix indices that correspond to the places where x<y.
>> x = [1 2 3 4];
>> y = x<=2
y =
     1     1     0     0
>> find(y)
ans =
     1     2
>> x = [1 2 ; 3 4];
>> y = x<=3
y =
     1     1
     1     0
>> find(y)
ans =
     1
     2
     3
>> [i,j] = find(y)
i =
     1
     2
     1
j =
     1
     1
     2
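The indices returned by find can be used to pull out the matching values. This is a minimal sketch using the same matrix as above; the single-output form returns linear indices that count down the columns.

```matlab
x = [1 2; 3 4];
% Linear indices of the elements satisfying the condition
idx = find(x <= 3);
% Row and column subscripts of the same elements
[i, j] = find(x <= 3);
% Either form of index retrieves the matching values
vals = x(idx);
firstMatch = x(i(1), j(1));
```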
9.6 is*
A number of special purpose logical tests are provided to determine if a matrix has special characteristics. Some operate element-by-element and produce a matrix of the same dimension as the input matrix while others produce only scalars. These functions all begin with is.
isnan       1 if NaN                                    element-by-element
isinf       1 if Inf                                    element-by-element
isfinite    1 if not Inf or NaN                         element-by-element
isreal      1 if not complex valued                     scalar
ischar      1 if input is a character array             scalar
isempty     1 if empty                                  scalar
isequal     1 if all elements are equal                 scalar
islogical   1 if input is a logical matrix              scalar
isscalar    1 if scalar                                 scalar
isvector    1 if input is a vector (1 × K or K × 1)     scalar
There are a number of other special purpose is* expressions. For more details, search for is* in the help file.
>> x=[4 pi Inf Inf/Inf]
x =
    4.0000    3.1416       Inf       NaN
>> isnan(x)
ans =
     0     0     0     1
>> isinf(x)
ans =
     0     0     1     0
>> isfinite(x)
ans =
     1     1     0     0
Note: isnan(x)+isinf(x)+isfinite(x) always equals 1, implying any element falls into one (and only one) of these categories.
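The partition described in the note can be checked directly, and isfinite is a convenient filter for dropping problem values. This is a minimal sketch using the same vector as above.

```matlab
x = [4 pi Inf Inf/Inf];
% Each element falls into exactly one of the three categories
total = isnan(x) + isinf(x) + isfinite(x);
% Keep only the ordinary (finite) values
clean = x(isfinite(x));
```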
9.7
Exercises
1. Using the data file created in Chapter 3, count the number of negative returns in both the S&P 500 and ExxonMobil.
2. For both series, create an indicator variable that takes the value 1 if the return is larger than 2 standard deviations or smaller than -2 standard deviations. What is the average return conditional on falling into this range for both returns?
3. Construct an indicator variable that takes the value of 1 when both returns are negative. Compute the correlation of the returns conditional on this indicator variable. How does this compare to the correlation of all returns?
4. What is the correlation when at least 1 of the returns is negative?
5. What is the relationship between all and any? Write down a logical expression that allows one or the other to be avoided (i.e. write myany = ??? and myall = ????).
Chapter 10
Flow Control
The previous chapter explored one use of logical variables, selecting elements from a matrix. Logical variables have another important use: flow control. Flow control allows different code to be executed depending on whether certain conditions are met. Two flow control structures are available: if . . . elseif . . . else and switch . . . case . . . otherwise.
cal expression and must be terminated with end. elseif and else are optional and can always be replicated using nested if statements at the expense of more complex logic. The generic form of an if . . . elseif . . . else block is
if logical1
    Code to run if logical1 true
elseif logical2
    Code to run if logical2 true and logical1 false
elseif logical3
    Code to run if logical3 true and logicalj false, j < 3
...
...
else
    Code to run if all previous logicals are false
end
or
if logical
    Code to run if logical true
else
    Code to run if logical false
end
Note: Remember that all logicals must be scalar logical values. A few simple examples
x = 5;
if x<5
    x=x+1;
else
    x=x-1;
end
>> x
x =
     4
and
x = 5;
if x<5
    x=x+1;
elseif x>5
    x=x-1;
else
    x=2*x;
end
>> x
x =
    10
These examples have all used simple logical expressions. However, any scalar logical expressions, such as (x<0 || x>1) && (y<0 || y>1) or isinf(x) || isnan(x), can be used in if . . . elseif . . . else blocks.
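When the condition involves a vector, it can be reduced to a scalar before it reaches the if statement. This is a minimal sketch, assuming a hypothetical vector x and a check that every element lies in the interval from 0 to 1; the element-by-element | is used inside any, which then returns the scalar that if requires.

```matlab
% Hypothetical vector to be checked against the interval [0,1]
x = [0.5 -0.2 1.4];
% any reduces the element-by-element test to a single logical value
if any(x < 0 | x > 1)
    status = 'outside';
else
    status = 'inside';
end
```

The short circuit forms || and && only accept scalars, so the element-by-element operators are the ones to use inside any or all.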
Note: There is an equivalence between switch . . . case . . . otherwise and if . . . elseif . . . else blocks. However, if the logical expressions in the if . . . elseif . . . else block contain inequalities, variables must be created prior to using a switch . . . case . . . otherwise block. switch . . . case . . . otherwise blocks also differ from standard C behavior since only one case can be matched per block. The switch . . . case . . . otherwise block is exited after the first match and the program resumes with the next line after the block. A simple switch . . . case . . . otherwise example:
x=5;
switch x
    case 4
        x=x+1;
    case 5
        x=2*x;
    case 6
        x=x-2;
    otherwise
        x=0;
end
>> x
x =
    10
cases can include multiple values for the switch variable using the notation case {case1,case2,. . .}. For example,
x=5;
switch x
    case {4}
        x=x+1;
    case {1,2,5}
        x=2*x;
    otherwise
        x=0;
end
>> x
x =
    10
x = 4;
switch x
    case {4}
        x=x+1;
    case {1,2,5}
        x=2*x;
    otherwise
        x=0;   % the original listing is cut off after otherwise; completed to match the block above
end
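The note on inequalities can be made concrete with a small sketch: a condition such as r > 0 cannot appear directly in a case, so it is first recoded into a variable that switch can match exactly. The variable names r, direction and msg are illustrative.

```matlab
r = -0.02;             % hypothetical return
direction = sign(r);   % recode the inequality as -1, 0 or 1

switch direction
    case 1
        msg = 'gain';
    case -1
        msg = 'loss';
    otherwise
        msg = 'flat';
end
disp(msg)
```

The same logic written with if r > 0 . . . elseif r < 0 . . . else would not need the intermediate variable.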
10.3 Exercises
1. Write a code block that would take a different path depending on whether the returns on two series are simultaneously positive, both are negative, or they have different signs using an if . . . elseif . . . else block.
2. Construct a variable which takes the values 1, 2 or 3 depending on whether the returns in exercise 1 are both positive (1), both negative (2) or different signs (3). Repeat exercise 1 using a switch . . . case . . . otherwise block.
Chapter 11
Loops
Loops make many problems, particularly when combined with flow control blocks, simple and in many cases, possible. Two types of loop blocks are available: for . . . end and while . . . end. for blocks iterate over a predetermined set of values and while blocks loop as long as some logical expression is satisfied. All for loops can be expressed as while loops although the opposite is not true. They are nearly equivalent when break is used, although it is generally preferable to use a while loop rather than a for loop and a break statement.
11.1 for
for loops begin with for iterator=vector and end with end. The generic structure of a for loop is
for iterator=vector
    Code to run
end
iterator is the variable that the loop will iterate over. For example, i is a common name for an iterator. vector is a vector of data. It can be an existing vector or it can be generated on the fly using linspace or a:b:c syntax (e.g. 1:10). One subtle aspect of loops is that the iterator can contain any vector data, including non-integer and/or negative values. Consider these three examples:
count=0;
for i=1:100
    count=count+i;
end

count=0;
for i=linspace(0,5,50)
    count=count+i;
end

count=0;
x=linspace(-20,20,500);
for i=x
    count=count+i;
end
The first loop will iterate over i = 1, 2, . . . , 100. The second loops over the values produced by the function linspace which creates 50 uniform points between 0 and 5, inclusive. The final loops over x, a vector constructed from a call to linspace. Loops can also iterate over decreasing sequences:
count=0;
x=-1*linspace(0,20,500);
for i=x
    count=count+i;
end
The key to understanding for loop behavior is that for always iterates over the elements of vector in the order they are presented (i.e. vector(1), vector(2), . . .). Loops can also be nested:
count=0;
for i=1:10
    for j=1:10
        count=count+j;
    end
end
One particularly useful construct is to loop over the length of a vector, which allows each element to be modified one at a time.
trend=zeros(100,1);
for i=1:length(trend)
    trend(i)=i;
end
Finally, these ideas can be combined to produce nested loops with flow control.
matrix=zeros(10,10);
for i=1:size(matrix,1)
    for j=1:size(matrix,2)
        if i<j
            matrix(i,j)=i+j;
        else
            % the original listing is cut off here; this branch is assumed
            matrix(i,j)=i-j;
        end
    end
end
or loops containing nested loops that are executed based on a flow control statement.
matrix=zeros(10,10);
for i=1:size(matrix,1)
    if (i/2)==floor(i/2)
        for j=1:size(matrix,2)
            matrix(i,j)=i+j;
        end
    else
        for j=1:size(matrix,2)
            matrix(i,j)=i-j;
        end
    end
end
Note: The iterator variable must NOT be modified inside the for loop. Changing the iterator can produce undesirable results. For instance,
for i=1:100;
    i
    i=2*i;
    i
end
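What happens when the iterator is changed can be checked directly. The sketch below, which uses the illustrative names seen and k, records the value of i at the top of each pass; the for statement reassigns i from the vector at the start of every pass, so a change made inside the body only lasts until that pass ends.

```matlab
seen = zeros(1,3);
k = 0;
for i=1:3
    k = k + 1;
    seen(k) = i;   % value assigned by the for statement
    i = 10*i;      % visible only for the rest of this pass
end
disp(seen)         % values of i at the top of each pass
disp(i)            % value left in i after the loop
```

The loop still visits each element of 1:3 once, which is why modifying the iterator does not change the number of iterations and is easy to misread.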
11.2 while
while loops are useful when the number of iterations needed depends on the outcome of the loop contents. while loops are commonly used when a loop should only stop if a certain condition is met.
Two things are crucial when using a while loop: first, the logical expression should evaluate to true when the loop begins (or the loop will be ignored) and second, the inputs to the logical expression must be updated inside the loop. If they are not, the loop will continue indefinitely (hit CTRL+C to break an interminable loop). The simplest while loops are drop-in replacements for for loops:
count=0;
i=1;
while i<=10
    count=count+i;
    i=i+1;
end
In the block above, the number of iterations required is not known in advance and since randn is a standard normal pseudo-random number, it may take many iterations until this criterion is met. No finite for loop can be guaranteed to meet the criterion.
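A minimal sketch of a loop of this kind follows. It is not the listing from the notes: the stopping rule (draw until a standard normal exceeds a threshold) and the names threshold, draws and z are assumptions chosen only to show a loop whose length depends on randn.

```matlab
threshold = 2;      % assumed value, not from the text
draws = 0;
z = randn;
while z <= threshold
    z = randn;      % the input to the logical expression is updated here
    draws = draws + 1;
end
disp(draws)
```

The number of passes differs from run to run, which is exactly the situation where a for loop over a fixed vector is a poor fit.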
11.3 break
break can be used to terminate a for loop and, as a result, for loops can be constructed to behave similarly to while loops.
for iterator = vector
    Code to run
    if logical
        break
    end
end
The only difference between this loop and a standard while loop is that the while loop could potentially run for more iterations than iterator contains. break can also be used to end a while loop before running the code inside the loop. Consider this slightly strange loop:
The use of while 1 will produce a loop, if left alone, that will run indefinitely. However, the break command will stop the loop if some condition is met. More importantly, the break will prevent the code after it from being run, which is useful if the operations after the break will create errors if the logical condition is not true.
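A sketch of a while 1 loop of the kind described is given below. It is an illustration under assumed names (v, idx, total), not the original listing: the break guards an indexing operation that would raise an error if it were reached with idx equal to 0.

```matlab
v = [10 20 30];
total = 0;
while 1
    idx = floor(4*rand);     % integer in 0, 1, 2, 3
    if idx < 1
        break                % v(0) would be an indexing error
    end
    total = total + v(idx);  % only reached when idx is a valid index
end
disp(total)
```

Placing the check before the indexing line is the design point: the code after break never sees a value it cannot handle.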
11.4 continue
continue, when used inside a loop, has the effect of advancing the loop to the next iteration and skipping any remaining code in the body of the loop. While continue can always be avoided using if . . . else blocks, its use can result in tidier code. The effect of continue is best seen through a block of code,
for i=1:10
    if (i/2)==floor(i/2)
        continue
    end
    i
end
demonstrating that continue forces the loop to the next iteration whenever i is even (and (i/2)==floor(i/2) evaluates to logical true).
11.5 Exercises
1. Simulate 1000 observations from an ARMA(2,2) where $\epsilon_t$ are independent standard normal innovations. The process of an ARMA(2,2) is given by
$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \epsilon_t$
Use the values $\phi_1 = 1.4$, $\phi_2 = -.8$, $\theta_1 = .4$ and $\theta_2 = .8$. Note: When simulating a process, always simulate more data than needed and throw away the first block of observations to avoid start-up biases. This process is fairly persistent, so at least 100 extra observations should be computed. As a rule, always compute extra data points to throw away, even when simulating a short series.
2. Simulate a GARCH(1,1) process where $\epsilon_t$ are independent standard normal innovations. A GARCH(1,1) process is given by
$y_t = \epsilon_t \sqrt{h_t}$
$h_t = \omega + \alpha \epsilon_{t-1}^2 + \beta h_{t-1}$
Use the values $\omega = 0.05$, $\alpha = 0.05$ and $\beta = 0.9$.
3. Simulate a GJR-GARCH(1,1,1) process where $\epsilon_t$ are independent standard normal innovations. A GJR-GARCH(1,1,1) process is given by
$y_t = \epsilon_t \sqrt{h_t}$
$h_t = \omega + \alpha \epsilon_{t-1}^2 + \gamma \epsilon_{t-1}^2 I_{[\epsilon_{t-1}<0]} + \beta h_{t-1}$
Use the values $\omega = 0.05$, $\alpha = 0.02$, $\gamma = 0.07$ and $\beta = 0.9$. Hint: Some form of logical expression is needed in the loop. $I_{[\epsilon_{t-1}<0]}$ is an indicator variable that takes the value 1 if the expression inside the [ ] is true.
4. Simulate an ARMA(1,1)-GJR-GARCH(1,1)-in-mean process,
$y_t = \phi_1 y_{t-1} + \theta_1 \epsilon_{t-1} \sqrt{h_{t-1}} + \lambda h_t + \epsilon_t \sqrt{h_t}$
$h_t = \omega + \alpha \epsilon_{t-1}^2 + \gamma \epsilon_{t-1}^2 I_{[\epsilon_{t-1}<0]} + \beta h_{t-1}$
Use the values from Exercise 3 for the GJR-GARCH model and use $\phi_1 = 0.1$, $\theta_1 = 0.4$ and $\lambda = 0.03$.
5. Using a while loop, write a bit of code that will do a bisection search to invert a normal CDF. A bisection search cuts the interval in half repeatedly, only keeping the sub-interval with the target in it. Hint: keep track of the upper and lower bounds of the random variable value and use flow control. This problem requires normcdf.
6. Test out the loop by finding the inverse CDF of 0, -3 and pi. Verify it is working by taking the absolute value of the difference between the final value and the value produced by norminv.
Chapter 12
Graphics
Extensive plotting facilities capable of producing a virtually limitless range of graphical data representations are available. This chapter will emphasize the basics of the most useful graphing tools.
12.1 Support Functions
All plotting functions have a set of support functions which are useful for providing labels for various portions of the plot or making adjustments to the range. Remember to fully label plots so others can clearly tell which series are being plotted and the units of the plot.
legend labels the various elements on a graph. The specific behavior of legend depends on the type of plot and the order of the data. legend takes as many strings as unique plot elements. Standard usage is legend('Series 1','Series 2') where the number of series is figure dependent.
title places a title at the top of a figure. Standard usage is title('Figure Title').
xlabel, ylabel and zlabel produce text labels on the x, y and (if the plot is 3-D) z axes, respectively. Standard usage is xlabel('X Data Name').
axis can be used to both get the axis limits and set the axis limits. To retrieve the current axis limits, enter AX = axis();. AX will be a row vector of the form [xlow xhigh ylow yhigh (zlow) (zhigh)] where zlow and zhigh are only included if the figure is 3-D. The axis can be changed by calling axis([xlow xhigh ylow yhigh (zlow) (zhigh)]) where the z-variables are only allowed if the figure is 3-D. axis can also be used to tighten the axes to include only the minimum space required to express the data using the command axis tight.
These four are the most important support functions, but there are many additional functions available to tweak figures (see section 12.9).
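The support functions can be combined as in the following sketch. The data (a sine and a cosine) and all label text are illustrative choices, not taken from the notes.

```matlab
t = linspace(0,2*pi,100);
plot(t,sin(t),t,cos(t))
legend('sin(t)','cos(t)')        % one string per plotted series
title('Sine and cosine')
xlabel('t')
ylabel('Value')
axis tight                       % shrink the axes to the data
AX = axis();                     % [xlow xhigh ylow yhigh] for a 2-D plot
axis([AX(1) AX(2) -1.5 1.5])     % keep the x range, widen the y range
```

Retrieving the limits first and then changing only some of them avoids hard-coding the x range.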
12.2 plot
plot is the most basic plotting command. Like most commands, it can be used many ways. However, the most straightforward is
plot(x1,y1,format1,x2,y2,format2,. . .)
where xi and yi are vectors of the same size and formati is a format string of the form color shape linespec.
color can be any of
b   blue
g   green
r   red
c   cyan
m   magenta
y   yellow
k   black
shape can be any of
o   circle
x   x-mark
+   plus
*   star
s   square
d   diamond
v   triangle (down)
^   triangle (up)
<   triangle (left)
>   triangle (right)
p   pentagram
h   hexagram
linespec can be any of
-        solid
:        dotted
-.       dashdot
--       dashed
(none)   no line
The three arguments are combined to produce a format string. For instance 'gs-' will produce a green solid line with squares at every data point while 'r+ ' will produce a set of red + symbols at every data point (note that the string is r-plus-space). Arguments which are not needed can be left out. For instance, to produce a green dotted line with no symbol, use the format string 'g:'. If no format string is provided, an automatic color scheme will be used with marker-less solid lines. Suppose the following x and y data were created,
x = linspace(0,1,100);
y1 = 1-2*abs(x-0.5);
y2 = x;
y3 = 1-4*abs(x-0.5).^2;
Calling plot(x,y1,'rs:',x,y2,'bo-.',x,y3,'kp--') will produce the plot in figure 12.1. A line's color information is lost when documents are printed in black and white. Always use physical characteristics to distinguish multiple series: different line types, different markers, or both.
Figure 12.1: The three series plotted by the call to plot above.
All plots should be clearly labeled. The following code labels the axes, gives the figure a title, and provides a legend. The results of running the code along with the plot command above can be seen in figure 12.2.
xlabel('x');
ylabel('f(x)');
title('Plot of three series');
legend('f(x)=1-2|x-0.5|','f(x)=x','f(x)=1-4*abs(x-0.5).^2');
One final method for calling plot is worth mentioning. plot(y) will plot the data in vector y against a simple series which labels each observation 1, 2, . . ., length(y). plot(y) is equivalent to plot(1:length(y),y) when y is a vector. If y is a matrix, plot will draw each column of y as if it was a separate series and plot(y) is equivalent to plot(1:length(y(:,1)), y(:,1), 1:length(y(:,2)), y(:,2), . . .).
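The matrix form can be checked with a short sketch. The simulated data and legend text are illustrative; the handle vector h returned by plot has one entry per column.

```matlab
y = cumsum(randn(50,3));   % three illustrative series, one per column
figure
h = plot(y);               % one line per column, x values are 1:50
legend('Column 1','Column 2','Column 3')
disp(numel(h))             % number of lines drawn
```

Storing series in columns therefore makes it possible to plot an entire data matrix with a single call.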
12.3 plot3
plot3 behaves similarly to plot but plots a series against two other series in a 3-dimensional space.
Figure 12.2: Labeled plot of three lines. Clearly label axes and provide a title and legend so readers can easily comprehend the contents of a figure.
figure(2)
N=200;
x=linspace(0,8*pi,N);
x=sin(x);
y=linspace(0,8*pi,N);
y=cos(y);
z=linspace(0,1,N);
plot3(x,y,z,'rs:');
xlabel('x');
ylabel('y');
zlabel('z');
title('Spiral');
legend('Spiraling Line')
Figure 12.3: 3-D Spiral plot. 3-D lines can be plotted using the plot3 command. This line was plotted by calling plot3(x,y,z,'rs:');.
12.4 scatter
scatter, like most graphing functions, is self-descriptive. It produces a scatter plot of the elements of a vector x against the elements of a vector y. Formatting, such as color or marker shape, can only be changed by using handle graphics or by manually editing the plot. Simple examples of handle graphics are included at the end of this chapter. Consult scatter's help file for further information. The following code produces a scatter plot of 1000 pseudo-random numbers from a bivariate normal distribution with variances of 2 and 0.5 and correlation of 0.5. The output of this code can be seen in figure 12.4.
figure(4)
x=randn(1000,2);
Sigma=[2 .5;.5 0.5];
x=x*Sigma^(0.5);
scatter(x(:,1),x(:,2),'rs')
xlabel('x')
ylabel('y')
legend('Data point')
title('Scatter plot of correlated normal random variables')
Figure 12.4: Scatter plot. This plot contains a scatter plot of bivariate normal random deviates with variances of 2 and 0.5 and correlation of 0.5. This plot was produced by calling scatter(x(:,1),x(:,2),'rs');.
12.5 surf
The next three graphics tools all plot a matrix of z data against vectors of x and y data. All three use the results from a bivariate normal probability density function. The PDF of a bivariate normal with mean 0 is given by
$f_X(x) = \frac{1}{2\pi|\Sigma|^{1/2}} \exp\left(-\frac{1}{2} x'\Sigma^{-1}x\right)$
In this example, the covariance matrix, $\Sigma$, was chosen to be
$\Sigma = \begin{bmatrix} 2 & 0.5 \\ 0.5 & 0.5 \end{bmatrix}$
A matrix of PDF values, pdf, was created with the following code:
N = 100;
x = linspace(-3,3,N);
y = linspace(-2,2,N);
pdf=zeros(N,N);
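The listing above stops after pdf is initialized. One way the matrix could be filled is sketched below; this is an assumption consistent with the PDF formula and with the later remark that i iterates over rows (y) and j over columns (x), not the original code. The names Sigma and v are introduced here for illustration.

```matlab
N = 100;
x = linspace(-3,3,N);
y = linspace(-2,2,N);
Sigma = [2 0.5; 0.5 0.5];
pdf = zeros(N,N);
for i=1:length(y)          % rows of pdf correspond to y
    for j=1:length(x)      % columns of pdf correspond to x
        v = [x(j); y(i)];
        % bivariate normal density with mean 0 and covariance Sigma
        pdf(i,j) = exp(-0.5*(v'*(Sigma\v)))/(2*pi*sqrt(det(Sigma)));
    end
end
```

Using Sigma\v rather than inv(Sigma)*v is a stylistic choice; for a 2 by 2 matrix either gives the same values to numerical precision.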
Figure 12.5: Surface plot. surf plots a 3-D surface from vectors of x and y data and a matrix of z data. This surf contains the PDF of a bivariate normal, and was created using surf(x,y,pdf) where x, y and pdf are defined in the text.
The first two lines initialize the x and y values. Since x has a higher variance, it has a larger range. The surf (figure 12.5) was created by
surf(x,y,pdf)
xlabel('x')
ylabel('y')
zlabel('PDF')
title('Surf of normal PDF')
shading interp
The command shading interp changes how the colors are applied from a discrete grid to a continuous grid. Note: The x and y arguments of surf must match the dimensions of the z argument. If [M,N]=size(z), then length(y) must be M and length(x) must be N. This is true of all 3-D plotting functions that draw matrix data. In the code above, i is the row iterator which corresponds to y and j is the column iterator, corresponding to x.
12.6 mesh
mesh produces a graphic similar to surf but with empty space between grid points. Mesh has the advantage that the hidden side can be seen, potentially revealing more from a single graphic. It also produces much smaller files, which can be important when including multiple graphics in a presentation or report. Using the same bivariate normal setup, the following code produces the mesh plot evidenced in figure 12.6.
Figure 12.6: Mesh plot. mesh produces a figure similar to surf but with gaps between grid points, allowing the backside of a figure to be seen in a single view. This mesh contains the PDF of a bivariate normal, and was created using mesh(x,y,pdf) where x, y and pdf are defined in the text.
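A self-contained sketch of the mesh call follows. Only mesh(x,y,pdf) is taken from the caption; the vectorized construction of pdf with meshgrid, the names X, Y, Si and Q, and the label text are assumptions made so the block runs on its own.

```matlab
N = 100;
x = linspace(-3,3,N);
y = linspace(-2,2,N);
Sigma = [2 0.5; 0.5 0.5];
[X,Y] = meshgrid(x,y);           % rows follow y, columns follow x
Si = inv(Sigma);
% quadratic form x'*inv(Sigma)*x evaluated at every grid point
Q = Si(1,1)*X.^2 + 2*Si(1,2)*X.*Y + Si(2,2)*Y.^2;
pdf = exp(-0.5*Q)/(2*pi*sqrt(det(Sigma)));
figure
mesh(x,y,pdf)
xlabel('x')
ylabel('y')
zlabel('PDF')
title('Mesh of normal PDF')
```

Because meshgrid(x,y) returns matrices with length(y) rows and length(x) columns, pdf has the orientation that mesh and surf expect.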
12.7 contour
contour is similar to surf and mesh in that it takes three arguments, x, y and z. contour differs in that it produces a 2-D plot. contour plots, while not as eye-catching as surf or mesh plots, are often better at conveying meaningful information. Contour plots can be called as either contour(x,y,z) or contour(x,y,z,N) where N determines the number of contours drawn. If omitted, the number of contours is automatically determined based on the range of the z data. The code below and figure 12.7 demonstrate the use of contour.
Figure 12.7: Contour plot. A contour plot is a set of slices through a surf plot. This particular contour plot contains iso-probability lines from a bivariate normal distribution with mean 0, variances of 2 and 0.5, and correlation of 0.5.
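A sketch of both calling forms of contour is shown below, using the same bivariate normal setup. The grid construction, the choice of 8 contour levels and the label text are illustrative assumptions.

```matlab
N = 100;
x = linspace(-3,3,N);
y = linspace(-2,2,N);
Sigma = [2 0.5; 0.5 0.5];
[X,Y] = meshgrid(x,y);
Si = inv(Sigma);
Q = Si(1,1)*X.^2 + 2*Si(1,2)*X.*Y + Si(2,2)*Y.^2;
pdf = exp(-0.5*Q)/(2*pi*sqrt(det(Sigma)));

figure
subplot(1,2,1)
contour(x,y,pdf)               % number of contours chosen automatically
title('Automatic levels')
subplot(1,2,2)
[C,h] = contour(x,y,pdf,8);    % 8 contour levels (an arbitrary choice)
xlabel('x')
ylabel('y')
title('8 levels')
```

The first output C is the contour matrix, which holds the level and vertices of each contour line.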
12.8 subplot
Subplots allow for multiple plots to be placed in the same figure. All calls to subplot must specify three arguments: the number of rows, the number of columns, and which cell to place the graphic in. The generic form is
subplot(M,N,#)
where M is the number of rows, N is the number of columns, and # indicates the cell to place the graphic in. Cells in a subplot are counted across then down. For instance, in a call to subplot(3,2,#), the #s would be
1 2
3 4
5 6
A call to subplot should be immediately followed by some plotting function. In the simplest case, this would be a call to plot. However, any graphic function can be used in a subplot. The code below and output in figure 12.8 demonstrate how different data visualizations may be used in every cell. These also show a few of the available graphics functions that are not described in these notes.
subplot(2,2,1);
x = [5 3 0.5 2.5 2];
explode = [0 1 0 0 0];
pie(x,explode)
colormap jet
title('pie function')
axis tight
subplot(2,2,2);
Y = cool(7);
bar3(Y,'detached')
title('bar3, Detached')
axis tight
subplot(2,2,3)
bar3(Y,'grouped')
title('bar3, Grouped')
axis tight
subplot(2,2,4);
x = 1:10;
y = sin(x);
e = std(y)*ones(size(x));
errorbar(x,y,e)
title('errorbar')
axis tight
Note: The graphics code in each subplot was taken from the function's help file (see doc function). The help system is comprehensive and most functions are illustrated with example code.
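The across-then-down numbering of cells can be verified with a small loop. The plotted lines are placeholders; only the titles matter.

```matlab
figure
for k=1:6
    subplot(3,2,k)                 % cell k of a 3 by 2 grid
    plot(1:10,(1:10)*k)
    title(sprintf('Cell %d',k))
end
```

Cells 1 and 2 appear in the top row, 3 and 4 in the middle row and 5 and 6 in the bottom row.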
Figure 12.8: Subplot example. Subplots allow for more than one graphic to be included in a figure. This particular subplot contains three different types of graphics with two variants on the 3-D bar. The upper left contains a call to pie, the upper right contains a call to bar3 specifying the option detached, the lower left contains a call to bar3 specifying the option grouped and the lower right contains the results of a call to errorbar.
12.9 Advanced Graphics
While the standard graphics functions are powerful, these functions are not flexible enough to express all of the options available. For instance, it is often useful to change the thickness of a line in order to improve its appearance or to add an arrow to highlight a particular feature of a graph. Two mechanisms are provided to add elements to a plot. The first, which will be referred to as point-and-click, involves manually editing the plot in the figure window. The second, and more general of the two, is known as handle graphics. Handle graphics provides a mechanism to programmatically change anything about a graph.
Point-and-click
The simplest method to improve plots is to use the editing facilities of the figure windows directly. A number of buttons are available along the top edge of a plot. One of these is an arrow, (1) in figure 12.9. Clicking on the arrow will highlight it and allow any element, such as a line, to be selected. Double-clicking on a line will bring up a Property Editor (2) dialog which contains elements of the selected item that can be changed. These include color, line width, and marker (3). For more information on editing plots, search for Editing Plots in the help browser.
Figure 12.9: Point-and-click editing. Most features of a plot can be edited using the interactive editing tools of a figure window. Interactive editing can be started by first selecting the arrow icon along the top of the figure (1), then clicking on the element to be edited (e.g. the line, the axes, any text label). This will bring up the Property Editor (2) where the item-specific properties can be changed (3). Alternatively, the interactive editing features can be enabled by selecting Edit>Figure Properties.
Handle Graphics
The graphics system is fully programmable. Anything that can be accomplished through manual editing of a plot can be accomplished by using handle graphics. Every graphical element is assigned a handle. The handle contains everything there is to know about the particular graphic, such as colors or line widths. Once familiar with handle graphics, they can be used to create spectacularly complex data visualizations. The use of handle graphics will be illustrated through an example. The example will illustrate the use of handle graphics by showing both before and after plots using subplot.
e = randn(100,2);
y = cumsum(e);
subplot(2,1,1);
plot(y);
legend('Random Walk 1','Random Walk 2')
title('Two Random Walks')
xlabel('Day')
ylabel('Level')
subplot(2,1,2);
h = plot(y);
l = legend('Random Walk 1','Random Walk 2','Location','Southwest');
t = title('Two Random Walks')
xl = xlabel('Day')
yl = ylabel('Level')
set(h(1),'Color',[1 0 0],'LineWidth',3,'LineStyle',':')
set(h(2),'Color',[1 .6 0],'LineWidth',3,'LineStyle','-.')
set(t,'FontSize',14,'FontName','Bookman Old Style','FontWeight','demi')
set(l,'FontSize',14,'FontName','Bookman Old Style','FontWeight','demi')
set(xl,'FontSize',14,'FontName','Bookman Old Style','FontWeight','demi')
set(yl,'FontSize',14,'FontName','Bookman Old Style','FontWeight','demi')
parent = get(h(1),'Parent');
set(parent,'FontSize',14,'FontName','Bookman Old Style','FontWeight','demi')
Most modifications that can be made using handle graphics can be implemented using the point-and-click editing method previously outlined. The advantage of handle graphics is only apparent when a figure needs to be updated or redrawn. If handle graphics have been used, it is only necessary to change the data and then re-run the m-file. If using the point-and-click editing method, any change in the data or model requires manually reapplying the edits. For more on handle graphics, please consult the Handle Graphics Properties in the help file.
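The get and set pattern used above can be reduced to a few lines. The sketch below uses simulated data and illustrative property values; get reads a property from a handle and set changes one or more properties in a single call.

```matlab
figure
h = plot(cumsum(randn(100,1)));
oldWidth = get(h,'LineWidth');                  % query a single property
set(h,'LineWidth',2*oldWidth,'Color',[0 0 1])   % change two properties at once
ax = get(h,'Parent');                           % handle of the axes containing the line
set(ax,'FontSize',12)
```

Because the changes are made in code, re-running the script with new data reapplies them automatically.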
12.10 Exercises
1. Generate two random walks using a loop and randn. Plot these two on a figure and provide all of the necessary labels.
2. Generate a 3-D plot from
x = linspace(0,10*pi,300);
y = sin(x);
z = x.*y;
Label all axes, title the figure and provide a legend.
3. Generate 1000 draws from a normal. Plot a histogram with 50 bins of the data.
4. Using the ExxonMobil and S&P 500 data (see the Chapter 3 exercises), produce a 2 by 2 subplot containing:
A scatter plot of the two series
Two histograms of the series
One plot of the two series against the dates. Change the axis labels to text using datetick.
Figure 12.10: Handle graphics. The top subplot is a standard call to plot while the bottom highlights some of the possibilities when using handle graphics. It is worth noting that all of the changes evidenced in the bottom subplot can be reproduced using the point-and-click method.
Chapter 13
Exporting Plots
Once a plot has been finalized, it must be exported to be included in an assignment, report or project. Exporting is straightforward. On the figure, click File, Save As (1 in figure 13.1). In the Save as type box, select the desired format (TIFF for Microsoft Office uses, EPS file for LaTeX (2 in figure 13.2)), enter a file name (1 in figure 13.2) and save. Figures 13.1 and 13.2 contain representations of the steps needed to export from a figure box.
Figure 13.1: Steps to export a figure. To export a figure, click Save As. . . in the file menu of a figure (1). The dialog in figure 13.2 will appear.
Figure 13.2: Save as dialog. To export a figure, enter a file name and use the drop-down box to select a file type. Select TIFF image if using Microsoft Office or EPS File (Encapsulated Postscript) if using LaTeX.
If the exported figure does not appear as desired, it may be necessary to alter the shape of the figure's window. Exported figures are What-You-See-Is-What-You-Get (WYSIWYG). Figure 13.3 contains an example of a figure with reasonable proportions while the axes in Figures 13.4 and 13.5 are poorly scaled. The following code will produce the three figures.
fig = figure(1);
x = linspace(0,1,100);
y = 1-abs(x-0.5);
plot(x,y,'r')
xlabel('x');
ylabel('y=1-|x-0.5|');
title('Roof-top plot');
legend('f(x)=1-|x-0.5|');
set(fig,'Position',[445 -212 957 764]);
Figure 13.3: Exporting figures is What-You-See-Is-What-You-Get. The axes in this figure are appropriately scaled.
fig = figure(2);
x = linspace(0,1,100);
y = 1-abs(x-0.5);
plot(x,y,'r')
xlabel('x');
ylabel('y=1-|x-0.5|');
title('Roof-top plot');
legend('f(x)=1-|x-0.5|');
set(fig,'Position',[445 -212 461 764]);

fig = figure(3);
x = linspace(0,1,100);
y = 1-abs(x-0.5);
plot(x,y,'r')
xlabel('x');
ylabel('y=1-|x-0.5|');
title('Roof-top plot');
legend('f(x)=1-|x-0.5|');
set(fig,'Position',[445 216 957 336]);
13.1 print
Figures can also be exported programmatically using the print command. The basic structure of the command is print -dformat filename where format is epsc2 for color encapsulated postscript (LaTeX or Microsoft Office), pdf for portable document format (LaTeX) or tiff for TIFF (Microsoft Office). Figures exported in EPS or PDF formats are vector images and scale both up and down well. TIFF images are static and become blurry when scaled. Note: It is necessary to call set(gcf,'Color',[1 1 1],'InvertHardcopy','off') before print; this sets a white figure background and tells print to keep the colors shown on screen.
Figure 13.4: Exporting figures is What-You-See-Is-What-You-Get. The axes in this figure are poorly scaled and the height is too large for the width.
Figure 13.5: Exporting figures is What-You-See-Is-What-You-Get. The axes in this figure are poorly scaled and the width is too large for the height.
fig = figure(1);
x = linspace(0,1,100);
y = 1-abs(x-0.5);
plot(x,y,'r')
xlabel('x');
ylabel('y=1-|x-0.5|');
title('Roof-top plot');
legend('f(x)=1-|x-0.5|');
set(fig,'Position',[445 -212 957 764]);
set(gcf,'Color',[1 1 1],'InvertHardcopy','off')
print -depsc2 ExportedFigure.eps
print -dpdf ExportedFigure.pdf
print -dtiff ExportedFigure.tiff
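print can also be called in function form, which is convenient when the file name or format is stored in a variable. The sketch below assumes a hypothetical base file name, ExportedFigure, and loops over the three formats discussed in this chapter.

```matlab
fig = figure;
x = linspace(0,1,100);
plot(x,1-abs(x-0.5),'r')
set(fig,'Color',[1 1 1],'InvertHardcopy','off')

baseName = 'ExportedFigure';                 % hypothetical file name
formats = {'-depsc2','-dpdf','-dtiff'};
extensions = {'.eps','.pdf','.tiff'};
for k=1:numel(formats)
    % function form: figure handle, format flag, then file name
    print(fig,formats{k},[baseName extensions{k}])
end
```

Passing the figure handle explicitly avoids depending on which figure happens to be current when print is called.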
13.2 Exercises
1. Export the plot from exercise 1 of the previous chapter as a TIFF and an EPS. View the files created outside of MATLAB.
2. Use page setup to change the orientation and dimensions as described in this chapter. Re-export the figure as both a TIFF and EPS (using different names) and compare the new images to the old versions.
Chapter 14
Custom Functions
In addition to writing batch files and calling predefined functions, custom functions can be written to perform repeated tasks or to use as the objective of an optimization routine. All functions must begin with a line of the form
function [out1, out2, . . .] = functionname(in1,in2,. . .)
where out1, out2, . . . are variables the function returns to the command window, functionname is the name of the function (which should be unique and not a reserved word) and in1, in2, . . . are input variables. Functions can take multiple inputs and return multiple outputs. To begin, consider this simple function:
function y = func1(x)
x = x + 1;
y = x;
This function, which is not particularly well written [1], takes one input and returns one output, incrementing the input variable (whether a scalar, vector or matrix) by one. Functions have a few important differences relative to standard m-file scripts. Functions operate on a copy of the original data. Thus, the same variable names can be used inside and outside of a function without risking any data [2]. Any variables created when the function is running, or any copies of variables made for the function, are lost when the function completes. In the function above, this means that only the value of y is returned and everything else is lost. Specifically, changes in x do not persist.
[1] It has no comments and contains superfluous commands. The function should only contain y = x + 1; and a comment that describes the function's purpose.
[2] MATLAB uses a copy-on-change model where data is only copied if modified. If unmodified, variables passed to functions behave as if passed by reference.
For example, suppose the following was entered
>> x = 1;
>> y = 1;
>> z = func1(x);
>> x
x =
     1
>> y
y =
     1
>> z
z =
     2
Thus, despite the function using variables named x and y, the values of x and y in the workspace do not change when the function is called. Functions with multiple inputs and outputs can also be constructed. A simple example is given by
function [xpy, xmy] = func2(x,y)
xpy = x + y;
xmy = x - y;
This function takes two inputs and returns two outputs. It is important to note that despite the two outputs of this function, it does not need to be called using both. For example, consider the following use of this function
>> x = 1;
>> y = 1;
>> z1 = func2(x, y)
z1 =
     2
>> [z1, z2] = func2(x ,y)
z1 =
     2
z2 =
     0
There are a number of advanced function-specific variables and commands available to determine environmental parameters such as how many input variables were provided to the function (nargin) and how many outputs were requested (nargout), to allow variable numbers of inputs and outputs (varargin and varargout, respectively) and to allow for early termination of the function (return). This course can be completed without using any of these. However, they are available if needed in other research.
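As a small illustration of nargin and nargout, the sketch below extends func2 with a default value for the second input. The function name func2opt and the default of 1 are hypothetical; the code would be saved in a file named func2opt.m.

```matlab
function [xpy, xmy] = func2opt(x,y)
% FUNC2OPT  Sum and difference of x and y; y defaults to 1 if omitted.
if nargin < 2
    y = 1;              % default used when only one input is supplied
end
xpy = x + y;
if nargout > 1
    xmy = x - y;        % only computed when a second output is requested
end
```

With this file on the path, z = func2opt(3) uses the default for y, while [a, b] = func2opt(3,2) supplies both inputs and requests both outputs.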
14.1 Comments
Like batch m-files, comments in custom functions are made using the % symbol. However, comments have an additional purpose in custom functions. Whenever help function is entered in the command window, the first continuous block of comments is displayed in the command window. For instance, in the function func
function y = func(x)
% This function returns
% the value of the input squared.

% The next block of comments will not be returned when
% help func is entered in the Command Window

% This line does the actual work.
y=x.^2;
help func returns
>> help func
  This function returns
  the value of the input squared.
Initial comments usually contain the possible combinations of input and output arguments as well as a description of the function. While comments are optional, they should be included both to improve readability of the function and to assist others if the function is shared.
14.2 Debugging
Since the data modified in the function is not available when the function is run, debugging can be difficult. There are three basic strategies to debug a function:
1. Write the function as a script and then convert it to a proper function.
2. Leave off ; as needed to write out the value of variables to the command window. Alternatively, use disp.
3. Use keyboard and return to interrupt the function to inspect the values.
The first of these methods is often the easiest. Consider a script version of the function above,
x = 1;
y = 2;
%function [xpy, xmy] = func2(x,y)
xpy = x + y;
xmy = x - y;
Running this script would be equivalent to calling the function func2(1,2). However, when calling it as a script, variables can be examined as they change. The second method can be useful although clumsy. Often the output window becomes filled with numbers and locating the problematic code may be difficult. The third option is the most advanced. Adding keyboard to a function interrupts the function at the location of keyboard and returns control to the command window. When in this situation, the usual >> prompt changes to a K>>. When in keyboard mode, variables inside the function are treated as if they were script variables. Once finished inspecting the variables, enter return to continue the execution of the function. A simple example of keyboard can be adapted to the function above,
function [xpy, xmy] = func3(x,y)
keyboard
xpy = x + y;
xmy = x - y;
keyboard
Calling this function will result in an immediate keyboard session (note the K>>). Entering whos will list two variables, x and y. When return is entered, a second keyboard session opens. Entering whos will now list four variables, the original two and xpy and xmy. When a function has been debugged, either comment out or remove the keyboard commands.
14.3
Exercises
1. Write a function summstat that takes one input, a T by K matrix, and returns a matrix of summary statistics of the form
mean(x(:,1))      mean(x(:,2))      ...  mean(x(:,K))
std(x(:,1))       std(x(:,2))       ...  std(x(:,K))
skewness(x(:,1))  skewness(x(:,2))  ...  skewness(x(:,K))
kurtosis(x(:,1))  kurtosis(x(:,2))  ...  kurtosis(x(:,K))
2. Rewrite the function so that it outputs 4 vectors, one each for mean, std, skewness and kurtosis.
3. Write a function called normloglikihood that takes two arguments, params and data (in that order) and returns the log-likelihood of a vector of data. Note: params = [mu sigma2] consists of two elements, the mean and the variance.
4. Append to the previous function a second output that returns the score of the log-likelihood (a 2 by 1 vector) evaluated at params.
Chapter 15
15.1 quantile
quantile returns the empirical quantile of a vector. However, its function is simple and can easily be re-
15.2 prctile
prctile is identical to quantile except it expects an argument from 0 to 100 rather than an argument between 0 and 1.
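A minimal sketch of an empirical quantile computed by sorting, using only base MATLAB. The data and the index rule (smallest order statistic whose rank is at least q*n) are assumptions for illustration; the toolbox quantile function interpolates between order statistics and may return a slightly different value.

```matlab
% Empirical quantile by sorting (illustrative rule, not the toolbox algorithm)
x = [3 1 4 1 5 9 2 6 5 3]';      % example data
q = 0.5;                         % quantile level between 0 and 1
xs = sort(x);                    % order statistics
n = length(xs);
idx = max(1, ceil(q*n));         % rank of the selected order statistic
qhat = xs(idx);                  % empirical quantile
pct = 100*q;                     % the same level on the 0 to 100 scale used by prctile
```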
15.3 regress
regress performs basic regression and returns key regression statistics. The Statistics Toolbox implementation is not robust to many empirical realities in finance such as heteroskedasticity, and writing a custom regression function is a useful exercise.
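As a hedged sketch of what such a custom regression might compute, the code below estimates OLS coefficients and White (heteroskedasticity-robust) standard errors in base MATLAB. The data are made up for illustration, and the estimator shown omits any small-sample correction.

```matlab
% OLS with White heteroskedasticity-robust standard errors (illustrative data)
x = (1:10)';
noise = [0.3 -0.2 0.1 -0.4 0.2 -0.1 0.4 -0.3 0.1 -0.1]';
y = 1 + 2*x + noise;
n = length(y);
X = [ones(n,1) x];                            % regressors with a constant
b = X \ y;                                    % OLS coefficients
e = y - X*b;                                  % residuals
XtXinv = inv(X'*X);
meat = X' * (X .* repmat(e.^2, 1, size(X,2))); % sum of e_i^2 * x_i * x_i'
V = XtXinv * meat * XtXinv;                   % sandwich covariance estimator
se = sqrt(diag(V));                           % robust standard errors
```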
(gam-)
Lognormal (logn-)
Normal (Gaussian) (norm-)
Poisson (poiss-)
Student's t (t-)
Uniform (unif-)
15.5
The UCSD_GARCH toolbox, also referred to as the MFE toolbox, is a set of functions for making inference on many common problems in financial econometrics. It is available on the course website.
15.6
The JPL toolbox, available from http://www.spatial-econometrics.com, contains many econometric functions written by academics. Best of all, it is free. It also has a number of useful plotting functions such as pltdens, which plots a kernel-smoothed estimate of an empirical density. Try downloading this toolbox before writing a custom function to avoid needless replication.
15.7
Exercises
1. Have a look through the statistics toolbox in the help browser and explore the functions available.
2. Download the JPL toolbox and extract its contents. Have a look through the list of functions available.
Chapter 16
Optimization
The optimization toolbox contains a number of routines to find the extremum of a user-supplied objective function. Most of these implement a form of the Newton-Raphson algorithm which uses the gradient to find the minimum of a function. Note: The optimization routines can only find minima. However, if $f$ is a function to be maximized, $-f$ is a function with its minimum located at the same point as the maximum of $f$. To use one of the optimizers, a custom function that returns the function value at a set of parameters (for example a log-likelihood or a GMM quadratic form) must be constructed. All optimization targets must have the parameters as the first argument. For example, consider finding the minimum of $x^2$. A function which allows the optimizer to work correctly has the form
function x2 = optim_target1(x)
x2=x^2;
When multiple parameters (a parameter vector) are used, the objective function must take the form
function obj = optim_target2(params)
x=params(1);
y=params(2);
obj= x^2-3*x+3+y*x-3*y+y^2;
Optimization targets can have additional inputs that are not parameters of interest such as data or hyperparameters.
function obj = optim_target3(params,hyperparams)
x=params(1);
y=params(2);
c1=hyperparams(1);
c2=hyperparams(2);
c3=hyperparams(3);
obj= x^2+c1*x+c2+y*x+c3*y+y^2;
This form is useful when optimization targets require at least two inputs: parameters and data. Once an optimization target has been specified, the next step is to use one of the optimizers to find the minimum.
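The same ideas can be sketched with anonymous functions instead of m-files. This is an alternative style, not the form used in these notes; the names target, obj, f and negf are hypothetical. The first part fixes the hyperparameters so that only the parameters remain as an argument, and the second part shows the sign flip that turns a maximization problem into a minimization problem.

```matlab
% Objective with parameters first and hyperparameters second (same algebra as optim_target3)
target = @(params, hyperparams) params(1)^2 + hyperparams(1)*params(1) + hyperparams(2) ...
    + params(2)*params(1) + hyperparams(3)*params(2) + params(2)^2;
hyper = [-3 3 -3];
obj = @(params) target(params, hyper);   % hyperparameters fixed, parameters free
val_at_optimum = obj([1 1]);             % objective value at x = 1, y = 1

% Maximizing f is the same as minimizing -f
f = @(x) -(x - 2)^2 + 5;                 % illustrative function with maximum 5 at x = 2
negf = @(x) -f(x);                       % minimum -5 at the same point
```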
16.1 fminunc
fminunc performs gradient-based unconstrained minimization. Derivatives can be provided by the user or approximated numerically. The generic form of fminunc is
[p,fval,exitflag]=fminunc(fun,p0,options,var1,var2,...)
where fun is the optimization target, p0 is the vector of starting values, options is a user supplied optimization options structure (see 16.5), and var1, var2, ... are optional variables containing data or other constant values. Typically, three outputs are requested, the parameters at the optimum (p), the function value at the optimum (fval) and a flag to determine whether the optimization was successful (exitflag). For example, suppose
function obj = optim_target4(params,hyperparams)
x=params(1);
y=params(2);
c1=hyperparams(1);
c2=hyperparams(2);
c3=hyperparams(3);
obj= x^2+c1*x+c2+y*x+c3*y+y^2;
was our objective function and was saved as optim_target4.m. To minimize the function, call
>> options = optimset('fminunc');
>> options = optimset(options,'Display','iter');
>> p0 = [0 0];
>> hyper = [-3 3 -3];
>> [p,fval,exitflag]=fminunc('optim_target4',p0,options,hyper)
which produces
>> [p,fval,exitflag]=fminunc('optim_target4',p0,options,hyper)
                                                   First-order
 Iteration  Func-count       f(x)     Step-size     optimality
     0           3              3                           3
     1           6              0      0.333333     1.49e-008
p =
     1     1
fval =
     0
exitflag =
     1
fminunc has minimized this function and returns the optimum value of 0 at x = (1, 1), and the exitflag has the value 1, indicating the optimization was successful. Values less than or equal to 0 indicate that the optimization did not converge successfully.
16.2 fminsearch
fminsearch also performs unconstrained optimization but uses a derivative free method (using a simplex). fminsearch uses a virtual amoeba to crawl around in the parameter space that will always move to lower
objective function values. fminsearch has the same generic form as fminunc
[p,fval,exitflag]=fminsearch(fun,p0,options,var1,var2,...)
where fun is the optimization target, p0 is the vector of starting values, options is a user supplied optimization options structure (see 16.5), and var1, var2, ... are (optional) variables of data or other constant values. Returning to the previous example but using fminsearch,
>> options = optimset('fminsearch');
>> options = optimset(options,'Display','iter');
>> [x,fval,exitflag]=fminsearch('optim_target4',[0 0],options,hyper)
 Iteration   Func-count     min f(x)         Procedure
     0            1                3
     1            3          2.99925         initial simplex
     2            5          2.99775         expand
     3            6          2.99775         reflect
     4            8          2.99475         expand
   ...          ...              ...
    57          107     8.93657e-009         contract inside
    58          109     3.71526e-009         contract outside
    59          111     1.99798e-009         contract inside
    60          113     5.82712e-010         contract inside
Optimization terminated:
the current x satisfies the termination criteria using OPTIONS.TolX of 1.000000e-004
and F(X) satisfies the convergence criteria using OPTIONS.TolFun of 1.000000e-004
x =
    1.0000    1.0000
fval =
  5.8271e-010
exitflag =
     1
fminsearch requires more iterations and many more function evaluations and should not be used if fminunc works satisfactorily. However, for certain problems, such as objective functions which are not continuously differentiable, fminsearch may be the only option.
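The fminsearch example can also be sketched in a self-contained form that needs no m-file, with the objective written as an anonymous function that captures the hyperparameters. The function-handle style and the tighter tolerances are assumptions made for this illustration rather than the calling form used above.

```matlab
% Same quadratic objective as optim_target4, written inline
hyper = [-3 3 -3];
obj = @(p) p(1)^2 + hyper(1)*p(1) + hyper(2) + p(2)*p(1) + hyper(3)*p(2) + p(2)^2;
% Suppress iteration output and tighten the default tolerances
options = optimset('Display','off','TolX',1e-6,'TolFun',1e-6);
[x, fval, exitflag] = fminsearch(obj, [0 0], options);
```

The minimizer should be close to (1, 1) with an objective value near 0, in line with the output shown above.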
16.3 fminbnd
fminbnd performs minimization of single parameter problems over a bounded interval using a golden section search. The generic form of fminbnd is
[p,fval,exitflag]=fminbnd(fun,lb,ub,options,var1,var2,...)
where fun is the optimization target, lb and ub are the lower and upper bounds of the parameter, options is a user supplied optimization options structure (see 16.5), and var1, var2, ... are (optional) variables containing data or other constant values.
Since fminbnd only minimizes univariate objectives, consider finding the minimum of
function obj = optim_target5(params,hyperparams)
x=params(1);
c1=hyperparams(1);
c2=hyperparams(2);
c3=hyperparams(3);
obj= c1*x^2+c2*x+c3;
Optimization terminated:
the current x satisfies the termination criteria using OPTIONS.TolX of 1.000000e-004
x =
     5
fval =
    -4
exitflag =
     1
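The call that produced this output is not shown. As a sketch only, the hyperparameters [1 -10 21] and the interval [0, 10] below are assumptions chosen so that the quadratic has its minimum of -4 at x = 5; they are not taken from the original example.

```matlab
% Univariate quadratic c1*x^2 + c2*x + c3 with assumed coefficients
hyper = [1 -10 21];                        % assumed values, not from the text
obj = @(x) hyper(1)*x^2 + hyper(2)*x + hyper(3);
options = optimset('Display','off');
% Minimize over the assumed interval [0, 10]
[x, fval, exitflag] = fminbnd(obj, 0, 10, options);
```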
16.4 fmincon
fmincon performs constrained optimization using linear and/or nonlinear constraints, which can be either equality or inequality constraints. fmincon minimizes $f(x)$ subject to any combination of

$$A_{EQ}\,x = b_{EQ}$$
$$A x \le b$$
$$C_{NEQ}(x) = d_{NEQ}$$
$$C(x) \le d$$

where $x$ is a K by 1 parameter vector, $A$ is a Q by K matrix and $b$ is a Q by 1 vector, $A_{EQ}$ is a P by K matrix and $b_{EQ}$ is a P by 1 vector. In the second set of constraints, $C(\cdot)$ is a function from $\mathbb{R}^K$ to $\mathbb{R}^M$ where M is the number of nonlinear inequality constraints, $d$ is an M by 1 vector, $C_{NEQ}(x)$ is a function from $\mathbb{R}^K$ to $\mathbb{R}^N$ and $d_{NEQ}$ is an N by 1 vector where N is the number of nonlinear equality constraints. Note: Any $\ge$ constraint can be transformed into a $\le$ constraint by multiplying by $-1$. The generic form of fmincon is
[p,fval,exitflag]=fmincon(fun,p0,A,b,AEQ,bEQ,LB,UB,nlcon,options,var1,var2,...)
where fun is the optimization target, p0 is the vector of starting values, A and AEQ are matrices for inequality and equality constraints, respectively, and b and bEQ are conformable vectors. LB and UB are vectors with the same size as p0 that contain lower and upper bounds, respectively. Note: LB and UB can always be represented in A and b. For instance, suppose the constraint was $-1 \le p \le 1$, then A and b would be

$$A = \begin{bmatrix} -1 \\ 1 \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

which are expressions for $-p \le 1$ (which is equivalent to $p \ge -1$) and $p \le 1$. nlcon is a nonlinear constraint function that returns the value of $C(x) - d$ and $C_{NEQ}(x) - d_{NEQ}$ (this is a tricky function; see doc fmincon for specifics). options is a user supplied optimization options structure (see 16.5), and var1, var2, ... are (optional) variables containing data or other constant values.
Consider the problem of optimizing a CRS Cobb-Douglas utility function of the form $U(x_1, x_2) = x_1^{\lambda} x_2^{1-\lambda}$ subject to a budget constraint $p_1 x_1 + p_2 x_2 \le 1$. This is a nonlinear function subject to a linear constraint (note that it must also be the case that $x_1 \ge 0$ and $x_2 \ge 0$). First, specify the optimization target
function u = crs_cobb_douglas(x,lambda)
x1=x(1);
x2=x(2);
u=x1^(lambda)*x2^(1-lambda);
u=-u %Must change max problem to min!!!
>> b=[0; 0; 1]
-0.529134
-4.14e-025
2.01e-009
Optimization terminated: first-order optimality measure less
than options.TolFun and maximum constraint violation is less
than options.TolCon.
Active inequalities (to within options.TolCon = 1e-006):
  lower      upper     ineqlin   ineqnonlin
                          3
x =
    0.3333    0.6667
fval =
   -0.5291
exitflag =
     1
the exitflag value of 1 indicates success. Suppose that the dual to the original problem, that of cost minimization, is used instead. In this alternative formulation, the optimization problem becomes

$$\min_{x_1, x_2} \; p_1 x_1 + p_2 x_2 \quad \text{subject to} \quad U(x_1, x_2) \ge \bar{U}$$
Since this problem has a nonlinear constraint, it is necessary to specify a nlcon function,
function [C, Ceq] = compensated_utility(x,prices,lambda,Ubar)
x1=x(1);
x2=x(2);
u=x1^(lambda)*x2^(1-lambda);
con=u-Ubar; %Note this is a >= constraint
C=-con; %This turns it into a <= constraint
Ceq = []; %No equality constraints
Note: The constraint function and the optimization target must take the same optional arguments in the same order, even if they do not need them. The solution to this problem can be found using
>> options = optimset('fmincon');
>> options = optimset(options,'Display','iter');
>> prices = [1 1]; %Change this set of parameters as needed
>> lambda = 1/3; %Change this parameter as needed
>> A = [-1 0; 0 -1] %Note, require x1>=0 and x2>=0
A =
    -1     0
     0    -1
>> b=[0; 0]
b =
     0
     0
>> Ubar = .5291;
>> x0 = [1.5;1.5]; %Start with all constraints satisfied, since -1.5+1<0 (-u+ubar).
>> [x,fval,exitflag]=fmincon('budget_line',x0,A,b,[],[],[],[],'compensated_utility',...
options,prices,lambda,Ubar)
                               Max     Line search  Directional  First-order
 Iter F-count      f(x)    constraint   steplength   derivative   optimality
    0      3          3       -0.9709
    1      6    1.05238    6.451e-005            1        -1.95        0.982
    2     10   0.952732       0.02503          0.5       -0.199        0.083
    3     13   0.999469     0.0004091            1       0.0467       0.0365
    4     16   0.999653     0.0001502            1     0.000184      0.00127
    5     19   0.999936    1.615e-007            1     0.000283    2.34e-005
    6     22   0.999936    2.535e-011            1    3.05e-007    1.31e-008
Optimization terminated: first-order optimality measure less
than options.TolFun and maximum constraint violation is less
than options.TolCon.
Active inequalities (to within options.TolCon = 1e-006):
  lower      upper     ineqlin   ineqnonlin
                                     1
x =
    0.3333
    0.6666
fval =
    0.9999
exitflag =
     1
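Both numerical solutions can be compared with the closed-form Cobb-Douglas demands, $x_1 = \lambda m / p_1$ and $x_2 = (1-\lambda) m / p_2$, where $m$ is income. The sketch below assumes the values used above (prices of 1, $\lambda = 1/3$, income of 1); the variable names are hypothetical.

```matlab
% Closed-form check of the Cobb-Douglas examples (assumed: prices [1 1], lambda 1/3, income 1)
lambda = 1/3;
prices = [1 1];
income = 1;
x_star = income * [lambda/prices(1), (1-lambda)/prices(2)];  % optimal bundle
u_star = x_star(1)^lambda * x_star(2)^(1-lambda);            % utility at the optimum
cost_at_ubar = prices * x_star';                             % cost of reaching u_star
```

The bundle is (1/3, 2/3), the utility is approximately 0.5291 (the negative of the fval reported for the utility maximization), and the cost of attaining that utility is 1, which the cost minimization approximates as 0.9999.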
These two examples are problems where the answers can be analytically verified. In many cases it is impossible to verify that the global optimum has been found if there are local minima. The standard practice for addressing the possibility of local minima is to start the optimization from different starting values and then to use the lowest fval. If the optimizer is working well on the specified problem, many of the starting values should produce similar parameter estimates and fvals. Note: Many aspects of constrained optimization (and optimization in general) are more black magic than science. Worse, most are problem class specific so general rules are hard to derive. Practice is the only way to become proficient at function minimization.
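The multiple starting value strategy can be sketched as a simple loop. The objective below is a made-up univariate function with two local minima, and fminsearch is used only because it is available without a toolbox; any of the optimizers could be substituted.

```matlab
% Multi-start minimization of a function with two local minima (illustrative objective)
f = @(x) x^4 - 3*x^2 + x;
starts = [-2 -1 0.5 2];                 % assumed starting values
options = optimset('Display','off');
best_fval = Inf;
best_x = NaN;
for k = 1:length(starts)
    [xk, fk] = fminsearch(f, starts(k), options);
    if fk < best_fval                   % keep the lowest objective value found so far
        best_fval = fk;
        best_x = xk;
    end
end
```

Starting values on the left converge to the global minimum near x = -1.30, while those on the right stop at the local minimum near x = 1.13, so keeping the lowest fval matters.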
16.5 optimset
optimset sets optimization options and has two distinct forms. The initial call to optimset should always be of the form options = optimset('fmintype'), which will return the default options for the selected optimizer. Once the options structure has been initialized, individual options can be changed by calling options = optimset(options,'option1',optionvalue1,'option2',optionvalue2,...). For example, to set options for fmincon,
>> options = optimset('fmincon');
>> options = optimset(options,'MaxFunEvals',1000,'MaxIter',1000);
>> options = optimset(options,'TolFun',1e-3);
For help on the available options or their specific meaning, see doc optimset.
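A short sketch of the two-step pattern using fminsearch, which does not require the Optimization toolbox, together with optimget to read back the values that were set. The particular option values are arbitrary.

```matlab
% Step 1: default options for the chosen optimizer
options = optimset('fminsearch');
% Step 2: override individual options by name
options = optimset(options, 'MaxIter', 1000, 'TolFun', 1e-3);
% Read the stored values back
max_iter = optimget(options, 'MaxIter');
tol_fun = optimget(options, 'TolFun');
```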
16.6
The Optimization toolbox contains a number of other optimization algorithms:
fseminf Multidimensional constrained minimization, semi-infinite constraints
fgoalattain Multidimensional goal attainment optimization
fminimax Multidimensional minimax optimization
lsqlin Linear least squares with linear constraints
lsqnonneg Linear least squares with nonnegativity constraints
lsqcurvefit Nonlinear curve fitting via least squares (with bounds)
lsqnonlin Nonlinear least squares with upper and lower bounds
bintprog Binary integer (linear) programming
linprog Linear programming
quadprog Quadratic programming
Chapter 17
17.1
datenum
datenum converts either string dates ('01JAN2000') or numeric dates ([2000 01 01]) into MATLAB serial dates. To call the function with string dates, use either datenum(stringdate) or datenum(stringdate,format) where format is composed of blocks from
yyyy Four digit year.
yy Two digit year, risky since it can assume the wrong century.
mmmm Full name of month (e.g. January)
mmm First three letters of month (e.g. JAN)
mm Numeric month of year
m Capitalized first letter of month
dddd Full name of weekday
ddd First three letters of weekday
dd Numeric day of month
d Capitalized first letter of weekday
HH Hour, should be 24 hour format and padded with 0 if single digit
MM Minutes, must be padded with extra 0 if single digit
SS Seconds, must be padded with extra 0 if single digit
Common string formats are automatically recognized. However, the format strings above can be used to handle non-standard cases. They are particularly useful if the arguments appear in a strange order, such as yyyyddmm (e.g. 20000101), or if the dates are delimited using nonstandard characters, such as a ; or , (e.g. 2000;01;01). A few examples
>> datenum('01JAN2000')
ans =
      730486
>> datenum('01JAN2000','ddmmmyyyy')
ans =
      730486
>> datenum('01;JAN;2000','dd;mmm;yyyy')
ans =
      730486
>> datenum('01012000','ddmmyyyy')
ans =
      730486
datenum also works on string arrays. For example
>> strdates=strvcat('01JAN2000','02JAN2000','03JAN2000')
strdates =
01JAN2000
02JAN2000
03JAN2000
>> datenum(strdates)
ans =
      730486
      730487
      730488
datenum can also be used to convert numeric dates, such as [2000 01 01] to MATLAB serial date format. For example,
>> datenum([2000 01 01])
ans =
      730486
>> years=[2000;2000;2000];
>> months=[01;01;01];
>> days=[01;02;03];
>> [years months days]
ans =
        2000           1           1
        2000           1           2
        2000           1           3
>> datenum(years,months,days)
ans =
      730486
      730487
      730488
datenum can also be used to translate hours, minutes and seconds to fractional days, although it should be
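Because serial dates are measured in days, hours, minutes and seconds appear as fractions of a day and ordinary arithmetic can be used to move between dates. A minimal sketch using the numeric calling form of datenum:

```matlab
% Noon is half a day after midnight
noon = datenum(2000, 1, 1, 12, 0, 0);
midnight = datenum(2000, 1, 1);
frac = noon - midnight;                       % fraction of a day
% Adding 1 to a serial date moves forward one calendar day
next_day = datestr(datenum(2000, 1, 31) + 1); % rolls over into February
```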
17.2
datestr
datestr is the inverse of datenum. It produces a human readable string from a MATLAB serial date. By default, it will return string dates of the form dd-mmm-yyyy. However, it also knows a number of standard formats such as mm/dd/yy or mmm.dd,yyyy. To produce one of these other predefined date formats, use datestr(serial_date,#) where # is the number corresponding to one of the format strings (see doc datestr for a list). datestr can also produce strings with arbitrary formats using the syntax detailed above (e.g. 'dd;mm;yyyy' to produce a date string with ; delimiters).
>> serial_date=datenum('01JAN2000')
serial_date =
      730486
>> datestr(serial_date)
ans =
01-Jan-2000
>> datestr(serial_date,0)
ans =
01-Jan-2000 00:00:00
>> datestr(serial_date,'mmm;dd;yyyy')
ans =
Jan;01;2000
Like datenum, datestr can take a vector input and return a vector output.
>> serial_date=datenum(strvcat('01JAN2000','02JAN2000','03JAN2000'))
serial_date =
      730486
      730487
      730488
>> datestr(serial_date)
ans =
01-Jan-2000
02-Jan-2000
03-Jan-2000
17.3
datevec
datevec converts MATLAB serial dates into human parsable numeric formats. Specifically, given a K by 1 vector containing MATLAB serial dates, datevec will produce a K by 6 matrix of the form [Year Month Day Hour Minute Second]. For example,
>> serial_date=datenum(strvcat('01JAN2000','02JAN2000','03JAN2000 12:00:00'))
serial_date =
      730486
      730487
    730488.5
>> datevec(serial_date)
ans =
        2000           1           1           0           0           0
        2000           1           2           0           0           0
        2000           1           3          12           0           0
which corresponds to 0:00 (midnight) on January 1 and January 2, 2000 and 12:00 (noon) on January 3, 2000.
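Since datenum also accepts a six-element date vector, datevec and datenum can be used as a round trip. A small sketch using the last serial date from the example:

```matlab
% Round trip: serial date -> date vector -> serial date
serial_date = 730488.5;          % noon on January 3, 2000
v = datevec(serial_date);        % [Year Month Day Hour Minute Second]
back = datenum(v);               % recovers the original serial date
```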
6 vector (same format as datevec) of the computer clock. datevec(now) produces the same output as clock.
17.6 etime
The elapsed time between two calls to clock can be computed using etime.
>> c=clock;
>> j=1; for i=1:10000000; j=j+1; end;
>> e=etime(clock,c)
e =
    0.4830
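etime works on any pair of date vectors in the [Year Month Day Hour Minute Second] format, not only on the output of clock, and returns the difference in seconds. A brief sketch with two made-up dates:

```matlab
% Elapsed time between two date vectors, in seconds and hours
t0 = [2000 1 1 0 0 0];           % midnight, January 1, 2000
t1 = [2000 1 2 12 0 0];          % noon, January 2, 2000
seconds_elapsed = etime(t1, t0); % later time goes first
hours_elapsed = seconds_elapsed / 3600;
```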
17.7
datetick
datetick is not a function for explicitly working with dates. datetick converts an axis of a plot expressed in
produces the two plots in figure 17.1. The top plot contains MATLAB serial dates along the x-axis while the bottom contains string dates. datetick also understands both standard formatting commands (see datestr) and custom formatting commands (see datenum). This function has an unfortunate tendency to produce few x-labels. The solution is to first choose the axis label points (in serial dates) and then use datetick('x','keepticks','keeplimits') as illustrated in figure 17.2.
>> h=plot(dates, rw);
>> axis tight
>> serial_dates=datenum(strvcat('01/01/2000','01/02/2000','01/03/2000','01/04/2000',...
'01/05/2000','01/06/2000','01/07/2000','01/08/2000',...
'01/09/2000','01/10/2000','01/11/2000','01/12/2000'),...
'dd/mm/yyyy');
>> parent=get(h,'Parent');
>> set(parent,'XTick',serial_dates);
>> datetick('x','dd/mm','keeplimits','keepticks');
>> xlabel('Date')
>> ylabel('Level')
>> title('Demo plot of datetick with keeplimits and keepticks')
Figure 17.1: Example of datetick. datetick converts MATLAB serial dates into text strings. Unfortunately, datetick changes the location of points and makes fairly bad choices. The solution is to use datetick('x','keepticks','keeplimits').
Figure 17.2: datetick with keepticks and keeplimits (x-axis: Date, y-axis: Level). These two arguments ensure datetick behaves sanely. To use them, set up the figure as it should look but with serial dates, and then call datetick('x','keepticks','keeplimits').
Chapter 18
String Manipulation
While manipulating text is not MATLAB's forte, the programming environment does provide a complete set of tools for working with strings. Strings are treated as matrices of character data. Simple strings can be input from the command line
str = 'Econometrics is my favorite subject.';
Since character data are contained in matrices, they respect the standard behavior of most commands (e.g. str(1:10)). However, using commands designed for numerical data is tedious and special purpose functions are provided to assist with string data. The primary application of string functions is to parse data. Chapter 3 contains an example of parsing a poorly formatted file. It uses a number of string functions to manipulate and parse the text of a file.
char
char changes integer numerical values between 1 and 255 into their ASCII equivalent characters. Other
double
double changes character strings into their numerical values.
>> double('Matlab')
ans =
    77    97   116   108    97    98
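char and double are inverses of one another for ASCII text, which makes simple character arithmetic possible. The sketch below round-trips a string and then shifts lowercase codes by 32 to produce uppercase letters; the variable names are hypothetical.

```matlab
% Round trip between characters and their numeric codes
codes = double('Matlab');                 % numeric codes of each character
txt = char(codes);                        % back to the original string
% Lowercase ASCII letters (97 to 122) sit 32 above their uppercase versions
is_lower = codes >= 97 & codes <= 122;
upper_txt = char(codes - 32*is_lower);    % shift only the lowercase letters
```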
strvcat
strvcat vertically concatenates two or more strings. In normal math mode, two matrices x and y can be vertically concatenated by [x;y]. However, strings often have different widths which makes vertical concatenation difficult. strvcat makes this easy.
>> strvcat('apple','banana','cherry')
ans =
apple
banana
cherry
>> x=strvcat('alpha','beta');
>> y=strvcat('delta','gamma');
>> strvcat(x,y)
ans =
alpha
beta
delta
gamma
strcat
strcat horizontally concatenates strings. z=strcat(x,y) is the same as z=[x y] when x and y have the
same number of rows. If one has a single row, strcat concatenates it to every row of the other vector.
>> strcat(strvcat('a','b'),strvcat('c','d'))
ans =
ac
bd
>> strcat(strvcat('a','b'),'c')
ans =
ac
bc
strfind
strfind returns the indices of all matching strings in a text block. It is useful for finding delimiting characters in a block of text. For example, consider a single line from WRDS TAQ output
>> str = 'IBM,02JAN2001,9:30:07,84.5';
>> strfind(str,',')
ans =
     4    14    22
strfind returns all of the locations of ','. If more than one character is searched for, strfind can produce overlapping blocks.
>> str = 'ababababa'
str =
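Returning to the comma-delimited TAQ line, the positions returned by strfind can be used to slice the line into fields. This is a minimal sketch; the variable names and the use of a cell array to hold the pieces are choices made for illustration. The last line shows the overlapping matches that occur when the search string is longer than one character.

```matlab
% Split a comma-delimited line using the delimiter positions from strfind
str = 'IBM,02JAN2001,9:30:07,84.5';
commas = strfind(str, ',');
edges = [0 commas length(str)+1];        % pretend there is a delimiter at each end
fields = cell(length(edges)-1, 1);
for k = 1:length(fields)
    fields{k} = str(edges(k)+1 : edges(k+1)-1);   % text between two delimiters
end
ticker = fields{1};
price = str2double(fields{4});
% Matches may overlap when searching for more than one character
overlaps = strfind('ababababa', 'aba');
```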
strmatch
strmatch compares rows of a character matrix with a string and returns the index of all rows that begin with
the string. To match only the entire row, use the optional argument 'exact'
>> str = strvcat('alpha','beta','alphabeta');
>> strmatch('alpha',str)
ans =
     1
str2num
str2num converts string values into numerical values. The input can be either vector or matrix valued.
>> strvcat('1','2','3')
ans =
1
2
3
>> str2num(strvcat('1','2','3'))
ans =
     1
     2
     3
>> str2num(['1 2 3';'4 5 6'])
ans =
     1     2     3
     4     5     6
str2double
str2double converts string values into numerical values. Unlike str2num it only operates on scalars or cell
num2str
num2str converts numerical values into strings. The input can be scalar, vector or matrix valued.
>> num2str([1;2;3])
ans =
1
2
3
>> num2str([1 2 3;4 5 6])
ans =
1  2  3
4  5  6
18.1
Exercises
1. Load the file hardtoparsetext.mat and inspect the variable string_data. The data in this file are ; delimited and contain stock name, date of observation, shares outstanding, and price. Write a program that will loop over the rows and parse the data into four variables: ticker, date, shares and price. Note: Ticker should be a string, date should be a MATLAB serial date, and shares outstanding and price should be numerical. For values of N/A, use NaN. For help converting the dates to serial dates, see chapter 17.
Chapter 19
19.1
Structures
Structures allow related pieces of data to be organized. Structures are constructed using variable_name.field_name syntax where both variable_name and field_name should satisfy the constraints on variable names. Consider the case of working with data that comes in triples which correspond to x-, y- and z-data. One alternative would be to store the data as a 3 by 1 vector. Alternatively, a structure could be used with field names x, y and z to provide added guidance on what is expected.
>> coord.x = 0.5
coord =
    x: 0.5000
>> coord.y = -1
coord =
    x: 0.5000
    y: -1
>> coord.z = 2
coord =
    x: 0.5000
    y: -1
    z: 2
Structures can also be used in arrays (array of structures). Structure arrays can be constructed using the command struct, or can be lazily initialized by concatenation. Continuing from the previous example,
>> coord(2).x = 3
coord =
1x2 struct array with fields:
    x
    y
    z
>> coord(2).y = 2
coord =
1x2 struct array with fields:
    x
    y
    z
>> coord(2).z = -1
coord =
1x2 struct array with fields:
    x
    y
    z
The elements of the array of structures can be accessed like any other array with the caveat that the assignment will itself be a structure.
>> newCoord = coord(1)
newCoord =
    x: 0.5000
    y: -1
    z: 2
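The same 1 by 2 structure array can be sketched in a single call to struct, where a cell array of values for a field creates one array element per value. The bracket syntax in the second line collects one field across all elements of the array.

```matlab
% Build the structure array in one call; each cell array supplies one value per element
coord = struct('x', {0.5, 3}, 'y', {-1, 2}, 'z', {2, -1});
all_x = [coord.x];        % gather the x field from every element
second = coord(2);        % a single element is itself a structure
```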
19.1.1
The fundamental problem with structures in MATLAB is that they are difficult to work with, and that operating on structures requires operating on the fields one at a time. Structures are also difficult to pre-allocate and so will produce performance issues if used in large arrays. Structures are still commonly used (for example, in optimset), although they have been supplanted by a more useful object, the cell array.
19.2
Cell Arrays
Cell arrays are a powerful alternative to the "everything is a matrix" model of classic MATLAB. Cell arrays are formally jagged (or ragged) arrays and are collections of other arrays. Cell arrays can be thought of as generic containers where each element is a matrix. They are most useful when handling either pure string data or mixed data which contains both string values and numbers. Working with cell arrays is similar to working with other arrays in MATLAB, although there are some important caveats. Cell arrays can be initialized using the cell command or directly using braces ({}). In either case, braces are used to access elements within a cell array. The example below shows how cell arrays can be pre-allocated using cell and then populated using braces.
% Initialize a cell array
>> cellArray = cell(2,1)
cellArray =
    []
    []
% Add an element using braces { }
>> cellArray{1} = 'cell'
cellArray =
    'cell'
    []
>> cellArray{2} = 'array'
cellArray =
    'cell'
    'array'
Initially the variable was an empty cell array. After the string 'cell' was added in the first position, only the second was empty. Finally, the string 'array' was placed into the second position. This simple example shows the ease with which cell arrays can be used to handle strings, as opposed to using matrices of characters, which become problematic when some of the rows do not have the same number of characters and so need to be padded with blank characters (and then often deblanked before actually being used in code). Cell arrays are also adept at handling mixed data, as the next example shows.
% Initialize a cell array
>> cellArray = cell(2,1);
>> cellArray{1} = 'string'
cellArray =
    'string'
    []
>> cellArray{2} = [1 2 3 4 5]
cellArray =
    'string'
    [1x5 double]
>> cellArray{2}
ans =
     1     2     3     4     5
The cell array above has a string in the first position and a 1 by 5 numeric vector in the second. Cell arrays can even contain other cell arrays, and so can be used to store virtually any data structure by nesting.
% Initialize a cell array
>> cellArray{3} = cell(2,1)
cellArray =
    'string'
    [1x5 double]
    {2x1 cell  }
19.3
The primary method for accessing cell arrays is through the use of braces ({}) as the two previous examples demonstrated. Selecting an element using braces returns the contents of the cell and can be used to assign the values for processing using functions that are not designed for cell arrays. Continuing from the previous example,
>> x = cellArray{1}
x =
string
>> y = cellArray{2}
y =
     1     2     3     4     5
Cell arrays can also be accessed using parentheses although this type of access is markedly different from accessing cell arrays with braces. Unlike braces which access the contents of a cell, parentheses access the cell itself and not its contents. The difference in behavior means that subsets of a cell array can be assigned to another variable without iterating across the contents of the cell array.
>> cellArray = cell(3,1);
>> cellArray{1} = 'one';
>> cellArray{2} = 'two';
>> cellArray{3} = 'three';
cellArray =
    'one'
    'two'
    'three'
% Correct method to reassign elements of a cell array to a new array using parentheses ( )
>> newCellArray = cellArray(1:2)
newCellArray =
    'one'
    'two'
% Incorrect method to reassign elements of a cell array to a new array using braces { }
>> newCellArray = cellArray{1:2}
newCellArray =
one
In the example above, the first newCellArray contains the first two elements of cellArray. Also note the incorrect attempt to assign the first two elements using braces, which only captures the contents of the first cell and so does not produce the desired result.
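The difference between the two access styles can be sketched compactly. Parentheses return a smaller cell array, while braces applied to several cells produce a comma-separated list of contents, which can be captured with multiple outputs.

```matlab
% Parentheses return cells; braces return contents
cellArray = {'one'; 'two'; 'three'};
subCell = cellArray(1:2);             % a 2 by 1 cell array
[first, second] = cellArray{1:2};     % contents of cells 1 and 2 as separate outputs
```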
19.4
Cell arrays, like structures, are useful data structures for working with strings or mixed data. Cell arrays are generally superior to structures and there are many functions which can operate directly on cell arrays of strings (e.g. sort, unique, ismember). They do come with some overhead and so are not appropriate for every use. For example, a two-element vector containing [1 2] requires 16 bytes of memory. A cell array with 1 in its first cell and 2 in its second requires 240 bytes of memory, a 15 fold increase. Due to this overhead cell arrays are undesirable in situations where data is highly regular and where the contents of each cell is small.
Chapter 20
Other standard DOS file navigation commands, such as dir and mkdir, are also available. Alternatively, the current directory can be changed by clicking the button with ... next to the Current Directory box at the top of the command window (see figure 1.1).
20.1
The file system can be used in MATLAB code. One common application of programmatic access to the file system is to perform some action on every file in a particular directory, which can be done by looping over the output of dir.
% Create some files
for i=1:3;
    fid = fopen(['file_' num2str(i) '.demotxt'],'wt');
    fprintf(fid,'Nothing to see');
    fclose(fid);
end
The example code below gets a list of files that have the extension demotxt and then loops across the files, first displaying the file name and then using type to print the contents of the file. This method is very useful for processing large numbers of data files.
>> d = dir('*.demotxt')
d =
3x1 struct array with fields:
    name
    date
    bytes
    isdir
    datenum
>> for i=1:length(d);
>> disp(d(i).name)
>> type(d(i).name)
>> end
file_1.demotxt
Nothing to see
file_2.demotxt
Nothing to see
file_3.demotxt
Nothing to see
20.2
While this section sounds like a Buddhist rite of passage, the path is an important set of locations. The path determines where MATLAB searches for files when running m-files. All of the toolbox directories are automatically on the path, but it may be necessary to add new directories to use custom functions or a non-standard toolbox. To see the current path, enter path in the command window. Alternatively, there is a GUI path browser available under File>Set Path.... The path is sorted from the most important directory to least, with the present working directory (what pwd returns in the command window) silently atop the list. The path determines which files MATLAB will use when evaluating a function or running a batch file. Suppose a custom function is accidentally titled mean. When mean is entered in the command window, MATLAB will find all occurrences of mean on the path and rank them based on the order the files appear. The highest ranked file will then be executed. Because of this, it is crucial that existing function names are avoided when writing m-files. which function -all will show all files that match function (functions, m-files and mat files), returning them in the order they appear on the path. This is useful for detecting duplicate file names. New directories can be appended to the path using addpath or File>Set Path.... The GUI tool can be used to re-rank directories on the path. To save any changes, use the command savepath or click on Save Path in the Path GUI.
20.3
In most shared environments, the MATLAB program directory will be read only and the original MATLAB path cannot be directly altered. To work around this issue, create and save a file named startup.m in U:\MATLAB. MATLAB will automatically look in the startup directory for this file and execute it when launched. This file should contain the following:
addpath('U:\Path to Add');
addpath('U:\Second Path to Add');
where Path to Add and Second Path to Add are directories to be added to the base path.
20.4
Exercises
2. Change into this directory using cd.
3. Create a new file named tobedeleted.m using the editor in this new directory (it can be empty).
4. Get the directory listing using dir.
5. Add this directory to the path using either addpath or the Path GUI. Save the changes using either savepath or the Path GUI.
6. Delete the newly created m-file, and then delete this directory from the command line.
7. Remove this folder from the path using either rmpath or the Path GUI.
Chapter 21
21.1
Pre-allocating data and pre-generating random numbers in large blocks is the most basic optimization. Preallocating data arrays allows expensive memory allocation to be avoided in the core of the program and pregenerating random numbers allows function overhead to be avoided. To see the effects of pre-allocating, consider the following code:
y = 0;
tic; for i=2:10000; y(i) = y(i-1) + randn; end; toc
Elapsed time is 0.145061 seconds.
y = zeros(10000,1);
tic; for i=2:10000; y(i) = y(i-1) + randn; end; toc
Elapsed time is 0.001615 seconds.
The second version with a pre-allocated y is about 90 times faster. To see the effects of pre-generating random numbers, consider the following code:
y = zeros(100000,1);
tic; for i=2:100000; y(i) = y(i-1) + randn; end; toc
Elapsed time is 0.014133 seconds.
y = zeros(100000,1); e=randn(100000,1);
tic; for i=2:100000; y(i) = y(i-1) + e(i); end; toc
Elapsed time is 0.011111 seconds.
21.2
One of the key advantages of using an environment such as MATLAB is that end-users are not required to manage memory. This abstraction comes at the cost of performance because memory allocation is slow. For an example of the penalty, consider the two implementations of the following recursion
$$y_t = 0.1 + 0.5 y_{t-1} - 0.2 y_{t-2} + 0.8 \epsilon_{t-1} + \epsilon_t$$
epsilon = randn(10000,1);
y = zeros(10000,1);
parameters = [.1 .5 -.2 .8 1];
tic
for t=3:10000
    y(t) = parameters * [1 y(t-1) y(t-2) epsilon(t-1) epsilon(t)]';
end
toc
Elapsed time is 0.023440 seconds.
tic
for t=3:10000
    y(t) = parameters(1);
    for i=1:2
        y(t) = y(t) + parameters(i+1)*y(t-i);
    end
    for i=0:1
        y(t) = y(t) + parameters(5-i)*epsilon(t-i);
    end
end
toc
The second implementation is 11 times faster because it avoids allocating memory inside the loop. In the first implementation, [1 y(t-1) y(t-2) epsilon(t-1) epsilon(t)] requires a new, empty 5 element vector to be allocated in memory and then the 5 elements to be copied into this vector on every iteration. The second implementation uses more loops but avoids costly memory allocation.
21.3
Vector and matrix operations are highly optimized and writing code in matrix-vector notation is faster than looping. Consider the problem of computing
$$X'X = \sum_{n=1}^{N} x_n x_n'$$
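A minimal sketch of this comparison follows. It assumes that X is an N by K matrix whose n-th row is the transpose of a K by 1 vector x_n; the sizes N = 1000 and K = 5 are arbitrary choices for illustration, and timings will vary by machine.

```matlab
N = 1000; K = 5;
X = randn(N,K);                 % assumed layout: row n of X is x_n'

% Loop version: accumulate the outer product of each observation
tic
XpX_loop = zeros(K,K);
for n = 1:N
    xn = X(n,:)';               % K by 1 column vector
    XpX_loop = XpX_loop + xn*xn';
end
toc

% Matrix version: a single matrix multiplication
tic
XpX_mat = X'*X;
toc

% The two results agree up to floating point rounding
maxDiff = max(abs(XpX_loop(:) - XpX_mat(:)));
```

Both versions compute the same K by K matrix; the matrix version avoids the interpreted loop entirely.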
21.4
Many optimization targets will depend on parameters, data and functions of data. In many cases the functions of the data do not depend on the parameter values, allowing them to be pre-computed. For example, if the optimization target is a likelihood that depends on the square of the data (e.g. the Gaussian log-likelihood), pre-computing the square of the data and passing it as one of the optional arguments avoids needlessly re-computing these values every time the objective function is called.
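The idea can be sketched for a zero-mean Gaussian log-likelihood with a single variance parameter. This is an illustration under stated assumptions rather than a general recipe: the data are simulated, the search interval [0.01, 10] is arbitrary, and the pre-computed value is captured by an anonymous function instead of being passed as an optional argument.

```matlab
x = randn(1000,1);              % simulated data (assumption for illustration)
T = length(x);
sumX2 = sum(x.^2);              % computed once, outside the objective function

% Negative Gaussian log-likelihood as a function of the variance only.
% sumX2 is reused on every call instead of recomputing x.^2.
negLL = @(sigma2) 0.5*(T*log(2*pi) + T*log(sigma2) + sumX2/sigma2);

sigma2Hat = fminbnd(negLL, 0.01, 10);   % numerical maximum likelihood estimate
sigma2Closed = sumX2/T;                 % closed-form maximum likelihood estimate
```

The numerical estimate should match the closed-form value up to the optimizer's tolerance, while the squared data are only computed once.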
21.5
Use M-Lint
The editor provides M-Lint guidance when available. This advice is almost always correct and should only be ignored if known to be wrong.
21.6
The final and most advanced optimization tool is the profiler. Running code through the profiler records every line executed and the time required to execute it. This allows hot-spots in code (code segments which require the most time) to be identified so that optimization can be focused on the code that spends the most time running. The profiler is run using
>> profile on
>> code_to_profile
>> profile viewer
The first command turns the profiler on. The second runs the code to be profiled. The final command turns the profiler off and opens the profile report viewer.
Chapter 22
All pseudo-random numbers are generated by two core random number generators:
rand: Uniform pseudo-random number generator on the interval (0,1)
randn: Standard Normal pseudo-random number generator
The distribution of the pseudo-random numbers generated will determine which of these are used. For example, Weibull pseudo-random numbers use rand. Normal pseudo-random numbers obviously call randn. Both Student's t and χ² pseudo-random numbers call both rand and randn.
22.2
The two core pseudo-random number generators each have a state. The state for randn is a 2 by 1 vector. The state for rand is either a 35 by 1 vector (old version) or a 625 by 1 vector (new version, known as twister). Saving and then restoring the state allows the same sequence of pseudo-random numbers to be generated. This allows Monte Carlo results to be replicated across different versions of code or on different computers. The state can be retrieved using state = randn('state') for the Normal pseudo-random number generator and state = rand('twister') for the uniform pseudo-random number generator.[1] The state can be restored using randn('state',state) or rand('twister',state).
>> state = randn('state')
state =
    3950588338
    2796772805
>> randn
ans =
    0.778806031346949
>> randn
ans =
    0.645328521657827
[1] If using an old version of MATLAB, 'twister' may not be available and the state can be saved using state = rand('state').
>> randn('state',state);
>> randn
ans =
    0.778806031346949
>> randn
ans =
    0.645328521657827
These two sequences are the same since the state was restored to its previous value. The same can be accomplished with rand using rand('twister') (new version) or rand('state') (old version).
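In MATLAB releases that provide the rng function (an assumption about the installed version; it is not part of the syntax used in these notes), the same save-and-restore pattern can be sketched with a single call that covers rand and randn together:

```matlab
s = rng;             % capture the current generator settings in a structure
a = randn(3,1);      % draw three standard normal values
rng(s);              % restore the saved settings
b = randn(3,1);      % draw again from the restored state
same = isequal(a,b); % true: both draws are the same sequence
```

Since rng controls the global stream, restoring it also reproduces subsequent calls to rand.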
22.3
The state of both rand and randn is reset each time MATLAB is opened. Thus two programs drawing pseudo-random numbers on different computers, or in two instances on the same computer, will produce identical sequences. To avoid this problem the state needs to be initialized using a random value such as the time the program began running. This can be accomplished in recent versions of MATLAB by
RandStream.setDefaultStream(RandStream('mt19937ar','seed',sum(100*clock)))
Warning: Do not over-initialize the pseudo-random number generators. The generators should be initialized once per session and then allowed to produce the sequence beginning with the state set by clock. Repeatedly re-initializing the pseudo-random number generators will produce a sequence that is much less random than the generator was designed to provide.
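The effect of over-initializing can be seen in a deliberately exaggerated sketch. It assumes the rng function is available and uses a fixed seed (42, chosen arbitrarily) in place of clock so that the outcome is deterministic.

```matlab
% Re-initializing inside the loop: every draw starts from the same state
bad = zeros(5,1);
for i = 1:5
    rng(42);             % do not do this
    bad(i) = randn;
end
allSame = all(bad == bad(1));        % true: the five draws are identical

% Initializing once and letting the generator run
rng(42);
good = randn(5,1);
nDistinct = numel(unique(good));     % 5 distinct values
```

Seeding from clock inside a fast loop has a similar, if less extreme, effect because consecutive clock readings differ very little.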
Chapter 23
23.1
abs
General Math
Returns the absolute value of the elements of a vector or matrix. If used on complex data, returns the complex modulus.
diff
Returns the difference between adjacent elements of a vector. If the original vector has length T, the vector returned has length T-1. If used on a matrix, returns a matrix of differences of each column. The matrix returned has one less row than the original matrix.
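A small illustration of diff on a vector and on a matrix, using made-up values:

```matlab
x = [1 4 9 16];
d = diff(x);             % [3 5 7], one element shorter than x

X = [1 2; 4 8; 9 32];
D = diff(X);             % [3 6; 5 24], differences down each column
```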
exp
log
Returns the natural logarithm of the elements of a vector or matrix. Returns complex values for negative elements.
log10
Returns the logarithm base 10 of the elements of a vector or matrix. Returns complex values for negative elements.
max
Returns the maximum of a vector. If used on a matrix, returns a row vector containing the maximum of each column.
mean
Returns the arithmetic mean of a vector. If used on a matrix, returns a row vector containing the mean of each column.
min
Returns the minimum of a vector. If used on a matrix, returns a row vector containing the minimum of each column.
mod
Returns the remainder of a division operation where the elements of a vector or matrix are divided by a scalar or conformable vector or matrix.
roots
sign
Returns the sign, defined as x/|x| and 0 if x = 0, of the elements of a vector or matrix. Operates element-by-element on vectors or matrices.
sum
Returns the sum of the elements of a vector. If used on a matrix, operates column-by-column.
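The column-by-column behavior shared by sum, mean, max and min can be checked on a small made-up matrix:

```matlab
X = [1 2; 3 4; 5 6];
s  = sum(X);     % [9 12]
m  = mean(X);    % [3 4]
mx = max(X);     % [5 6]
mn = min(X);     % [1 2]
```

Each result is a 1 by 2 row vector, one entry per column of X.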
23.2
ceil
Rounding
round
23.3
Statistics
Computes the correlation of a matrix. If a matrix x is N by M , returns the M by M correlation treating the columns of x as realizations from separate random variables.
cov
Computes the covariance of a matrix. If a matrix x is N by M , returns the M by M covariance treating the columns of x as realizations from separate random variables. If used on a vector, produces the same output as var.
kurtosis
Computes the kurtosis of a vector. If used on a matrix, a row vector containing the kurtosis of each column is returned.
median
Returns the median of a vector. If used on a matrix, a row vector containing the median of each column is returned.
prctile
Computes the percentiles of a vector. If used on a matrix, a row vector containing the percentiles of each column is returned.
regress
Estimates a classic linear regression. Does not compute White heteroskedasticity-robust standard errors.
quantile
Computes the quantiles of a vector. If used on a matrix, a row vector containing the quantiles of each column is returned.
skewness
Computes the skewness of a vector. If used on a matrix, a row vector containing the skewness of each column is returned.
std
Computes the standard deviation of a vector. If used on a matrix, a row vector containing the standard deviation of each column is returned.
var
Computes the variance of a vector. If used on a matrix, a row vector containing the variance of each column is returned.
DISTcdf
Returns the cumulative distribution function values for a given DIST, where DIST takes one of many forms such as t (tcdf), norm (normcdf), or gam (gamcdf). Inputs vary by distribution.
DISTinv
Returns the inverse cumulative distribution value for a given DIST, where DIST takes one of many forms such as t (tinv), norm (norminv), or gam (gaminv). Inputs vary by distribution.
DISTpdf
Returns the probability density function values for a given DIST, where DIST takes one of many forms such as t (tpdf), norm (normpdf), or gam (gampdf). Inputs vary by distribution.
DISTrnd
Produces pseudo-random numbers for a given DIST, where DIST takes one of many forms such as t (trnd), norm (normrnd), or gam (gamrnd). Inputs vary by distribution.
Note: DIST functions are available for the following distributions: Beta, Binomial, χ², Exponential, Extreme Value, F, Gamma, Generalized Extreme Value, Generalized Pareto, Geometric, Hypergeometric, Lognormal, Negative Binomial, Noncentral F, Noncentral t, Noncentral χ², Normal, Poisson, Rayleigh, t, Uniform, Discrete Uniform, Weibull.
23.4
rand
Random Numbers
Uniform pseudo-random number generator. One of three core random number generators that are used to produce pseudo-random numbers from other distributions.
randg
Standard gamma pseudo-random number generator. One of three core random number generators that are used to produce pseudo-random numbers from other distributions.
randn
Standard normal pseudo-random number generator. One of three core random number generators that are used to produce pseudo-random numbers from other distributions.
random
Generic pseudo-random number generator. Can generate random numbers for the following distributions: Beta, Binomial, χ², Exponential, Extreme Value, F, Gamma, Generalized Extreme Value, Generalized Pareto, Geometric, Hypergeometric, Lognormal, Negative Binomial, Noncentral F, Noncentral t, Noncentral χ², Normal, Poisson, Rayleigh, t, Uniform, Discrete Uniform, Weibull.
23.5
all
Logical
Returns logical true (1) if all elements of a vector are logical true. If used on a matrix, returns a row vector containing logical true if all elements of each column are logical true.
any
Returns logical true (1) if any elements of a vector are logical true. If used on a matrix, returns a row vector containing logical true if any elements of each column are logical true.
find
Returns the indices of the elements of a vector or matrix which satisfy a logical condition.
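A short sketch of all, any and find applied to a logical condition on a made-up vector:

```matlab
x = [1 -2 3 -4];
neg = x < 0;             % logical vector [0 1 0 1]
anyNeg = any(neg);       % true: at least one element is negative
allNeg = all(neg);       % false: not every element is negative
idx = find(neg);         % [2 4], positions of the negative elements
```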
ischar
isfinite
Returns logical true if the argument is finite. Operates element-by-element on vectors or matrices.
isinf
Returns logical true if the argument is infinite. Operates element-by-element on vectors or matrices.
isnan
Returns logical true if the argument is not a number (NaN). Operates element-by-element on vectors or matrices.
isreal
logical
23.6
ans
Special Values
ans is a special variable that contains the value of the last unassigned operation.
eps
eps is the numerical precision of MATLAB. Relative differences smaller than eps generally cannot be resolved, so such numbers are treated as numerically identical.
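The role of eps can be illustrated around the number 1, where eps is the gap to the next larger double precision value:

```matlab
a = (1 + eps/2) == 1;    % true: eps/2 is too small to change 1
b = (1 + eps) == 1;      % false: 1 + eps is the next representable number
```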
Inf
Inf represents infinity.
NaN
NaN represents not-a-number. It occurs as a result of performing an operation which produces an indefinite result.
23.7
eye
Special Matrices
linspace
z=linspace(L,U,N) returns a 1 by N vector of points uniformly spaced between L and U (inclusive).
logspace
z=logspace(L,U,N) returns a 1 by N vector of points logarithmically spaced between 10^L and 10^U (inclusive).
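A quick check of linspace and logspace with small, arbitrary inputs:

```matlab
z1 = linspace(0,1,5);    % 0, 0.25, 0.5, 0.75, 1
z2 = logspace(0,2,3);    % 10^0, 10^1, 10^2
```

Both return row vectors that include the end points.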
ones
z=ones(N,M) returns an N by M matrix of ones.
toeplitz
z=toeplitz(x) returns a Toeplitz matrix constructed from a vector x.
zeros
z=zeros(N,M) returns an N by M matrix of zeros.
23.8
chol
diag
Returns the elements along the diagonal of a square matrix. If the input to diag is a vector, returns a matrix with the elements of the vector along the diagonal.
eig
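cumprod
Returns the cumulative product of the elements of a vector. Element $i$ of the output is $\prod_{j=1}^{i} x_j$. If used on a matrix, operates column-by-column.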
cumsum
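Returns the cumulative sum of the elements of a vector. Element $i$ of the output is $\sum_{j=1}^{i} x_j$. If used on a matrix, operates column-by-column.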
23.9
cat
Matrix Manipulation
Concatenates two matrices along some dimension. If x and y are conformable matrices, cat(1,x,y) is the same as [x; y] and cat(2,x,y) is the same as [x y].
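The equivalence between cat and bracket concatenation can be verified on two small made-up matrices:

```matlab
x = [1 2; 3 4];
y = [5 6; 7 8];
v = cat(1,x,y);                  % 4 by 2, same as [x; y]
h = cat(2,x,y);                  % 2 by 4, same as [x y]
sameV = isequal(v, [x; y]);      % true
sameH = isequal(h, [x y]);       % true
```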
length
numel
Returns the number of elements in a matrix. If the matrix is 2D with dimensions N and M, numel returns NM.
repmat
reshape
Reshapes a matrix to have a different size. The product of the dimensions must be the same before and after, hence the number of elements cannot change.
size
Returns the dimension of a matrix. Dimension 1 is the number of rows and dimension 2 is the number of columns.
23.10
Set Functions
intersect
Returns the intersection of two vectors. Can be used with the optional 'rows' argument and same-sized matrices to produce an intersection of the rows of the two matrices.
setdiff
Returns the difference between the elements of two vectors. Can be used with the optional 'rows' argument and same-sized matrices to produce a matrix containing the difference of the rows of the two matrices.
sort
Produces a sorted vector from smallest to largest. If used on a matrix, operates column-by-column.
sortrows
Sorts the rows of a matrix using lexicographic ordering (similar to alphabetizing words).
union
Returns the union of two vectors. Can be used with the optional 'rows' argument and same-sized matrices to produce a union of the rows of the two matrices.
unique
Returns the unique elements of a vector. Can be used with the optional 'rows' argument on a matrix to select the set of unique rows.
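The set functions in this section can be illustrated with two small made-up vectors. All of them return sorted output.

```matlab
a = [1 2 3 4];
b = [3 4 5];
c = intersect(a,b);          % [3 4]
d = setdiff(a,b);            % [1 2], elements of a that are not in b
u = union(a,b);              % [1 2 3 4 5]
q = unique([3 1 3 2 1]);     % [1 2 3]
```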
23.11
case
Flow Control
Command which can be evaluated to logical true or false in a switch . . . case . . . otherwise flow control block.
else
Command that is the default in if . . . elseif . . . else flow control blocks. If none of the if or elseif statements are evaluated to logical true, the else path is followed.
elseif
Command that is used to continue an if . . . elseif . . . else flow control block. Should be immediately followed by a statement that can be evaluated to logical true or false.
end
Command indicating the end of a flow control block. Both if . . . elseif . . . else and switch . . . case . . . otherwise must be terminated with an end. Also ends loops.
if
Command that is used to begin an if . . . elseif . . . else flow control block. Should be immediately followed by a statement that can be evaluated to logical true or false.
switch
Command signalling the beginning of a switch . . . case . . . otherwise flow control block. switch should be followed by a variable to be used by case.
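A minimal sketch of the two flow control blocks described in this section; the variable names and values are made up for illustration.

```matlab
x = 2;

% if ... elseif ... else
if x < 0
    signLabel = 'negative';
elseif x == 0
    signLabel = 'zero';
else
    signLabel = 'positive';
end

% switch ... case ... otherwise
switch x
    case 1
        label = 'one';
    case {2,3}               % a cell array matches any of its elements
        label = 'two or three';
    otherwise
        label = 'other';
end
```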
23.12
Looping
continue
Forces a loop to proceed to the next iteration while bypassing any code occurring after the continue statement.
break
Prematurely breaks out of a loop before all iterations have completed.
end
All loop blocks must be terminated by an end command. Also ends flow control blocks.
for
One of two types of loops. for loops iterate over a predefined vector unless prematurely ended by break.
while
One of two types of loops. while loops continue until some logical condition is evaluated to logical false (0) unless prematurely ended by a break command.
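The looping commands can be combined in a short sketch; the numbers are arbitrary and chosen so the result is easy to verify by hand.

```matlab
% for loop with continue and break
total = 0;
for i = 1:10
    if mod(i,2) == 0
        continue             % skip even values of i
    end
    if i > 7
        break                % leave the loop once i exceeds 7
    end
    total = total + i;       % adds 1, 3, 5 and 7
end

% while loop: smallest k such that 2^k is at least 100
k = 0;
while 2^k < 100
    k = k + 1;
end
```

After running, total is 16 and k is 7.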
23.13
fminbnd
Optimization
Function minimization with bounds. Find the minimum of a function that exists between L and U .
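As a small illustration, the quadratic below (a made-up objective) has its minimum at x = 2, which lies inside the search interval [0, 5]:

```matlab
f = @(x) (x - 2).^2;         % hypothetical objective function
xMin = fminbnd(f, 0, 5);     % approximately 2
```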
fmincon
Constrained function minimization using a gradient based search. Constraints can be linear or non-linear and equality or inequality.
fminsearch
optimget
23.14
axis
Graphics
Sets or gets the current axis limits of the active figure. Can also be used to tighten limits using the command axis tight.
bar
contour
Produces a contour plot of the levels of z data against vectors of x and y data.
errorbar
Produces a plot of x data against y data with error bars (confidence sets) around each point.
figure
Opens a new figure window. When used with a number, for example figure(XX), opens a window with label Figure XX, where XX is some integer. If a window with label Figure XX is already open, that figure is set as the active figure and any subsequent plot commands will operate on Figure XX.
gcf
get
Gets a list of properties from a graphics handle, or the value of a property if used with an optional second argument.
hist
Produces a histogram of data. Can also be used to compute bin centers and height.
legend
mesh
Produces a 3-D mesh plot of a matrix of z data against vectors of x and y data.
pie
subplot
Command that allows for multiple plots to be graphed on the same figure. Used in conjunction with other plotting commands, such as subplot(2,1,1); plot(x,y); subplot(2,1,2); plot(y,x);
surf
Produces a 3-D surface plot of a matrix of z data against vectors of x and y data.
title
23.15
clock
Date Functions
Returns the current date and time as a 1 by 6 numeric vector of the form [YEAR MONTH DATE HOUR MIN SEC].
date
datevec
Parses date numbers and date strings and returns date vectors of the form [YEAR MONTH DATE HOUR MIN SEC].
etime
Can be used to compute the elapsed time between two readings from clock.
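A sketch of etime using two hand-written date vectors in the same [YEAR MONTH DATE HOUR MIN SEC] form that clock returns; the dates are made up for illustration.

```matlab
t0 = [2020 1 1 0 0 0];       % start: midnight
t1 = [2020 1 1 0 1 30];      % end: one minute and thirty seconds later
elapsed = etime(t1, t0);     % 90 seconds

% The same pattern with live readings
tStart = clock;
pause(0.1);
tElapsed = etime(clock, tStart);   % roughly 0.1 seconds
```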
now
tic
Begins a tic-toc timing loop. Useful for determining the amount of time required to run a section of code.
toc
23.16
char
String Function
strcat
Horizontally concatenates two or more strings. Equivalent to [string1 string2] for strings with the same number of rows.
strcmp
strfind
strvcat
Vertically concatenates two or more strings. If the strings have different numbers of columns, right pads the shorter string with blanks.
23.17
cos
Trigonometric Functions
Computes the cosine of a scalar, vector or matrix. Operates element-by-element on vectors or matrices.
sin
Computes the sine of a scalar, vector or matrix. Operates element-by-element on vectors or matrices.
23.18
cd
File System
Change directory. When used with a directory, changes the working directory to that directory. When called as cd .., changes the working directory to its parent. If the desired directory has a space, use the function version cd('c:\dir with space\dir2\dir3').
delete
Deletes a file from the present working directory. Warning: This command is dangerous; files deleted are permanently gone and not in the Recycle Bin.
dir
mkdir
pwd
rmdir
Removes a child directory in the present working directory. Child directory must be empty.
23.19
clc
MATLAB Specific
clear
Clears variables from memory. clear and clear all remove all variables from memory, while clear var1 var2 . . . removes only those variables listed.
clf
close
Closes figure windows. Can be used to close all figure windows by calling close all.
doc
When used as doc function, opens the help browser to the documentation of function. When used alone (doc) opens the help browser.
edit
Launches the built-in editor. If called using edit filename, opens the editor with filename.m or, if filename.m does not exist on the MATLAB path, creates the file in the current directory.
format
Changes how numbers are represented in the command window. format long shows all decimal places while format short only shows up to 5. format short is the default.
help
Displays inline help for calling a function (help function). Also can be used to list the function in a toolbox (help toolbox) or to list toolboxes (help).
helpbrowser
Opens the integrated help system for MATLAB at the last viewed page.
helpdesk
Opens the integrated help system for MATLAB at the home page.
keyboard
Allows functions to be interrupted for debugging. After verifying function operation, use return to continue running.
profile
Built-in MATLAB profiler. Reports code dependencies, timing of executed code and provides tips for improving the performance of m-files. Has four important variants:
profile on turns the profiler on.
profile off turns the profiler off.
profile report opens the profiling report, which contains statistics on the performance of code executed since profile on was called. Does not stop the profiler.
profile viewer turns the profiler off and opens the profiling report, which contains statistics on the performance of code executed since profile on was called.
realmax
Returns the largest number MATLAB is capable of representing. Larger numbers are Inf.
realmin
Returns the smallest positive normalized number MATLAB is capable of representing. Smaller positive numbers are denormalized and lose precision, eventually becoming 0.
which
When used in combination with a function name, returns full path to function. Useful if there may be multiple functions with same name on the MATLAB path.
whos
Returns a list of all variables in memory along with a description of type and information on size and memory requirements.
23.20
csvread
Input/Output
fgetl
Reads the current file until an end-of-line character is encountered, returning a string representing the line without the end-of-line character.
fopen
Opens a file for low level reading (using e.g. fgetl) or writing (using e.g. fprintf).
fprintf
load
Loads the contents of a MATLAB data file (.mat) into the current workspace. Can also be used to load simple text files.
save
Saves variables to a MATLAB data file (.mat). Can also be used to save tab delimited text files. Can be combined with -ascii -double to produce a tab delimited text file.
textread
Reads formatted text. Can read into cell arrays and from specific points in a file.
xlsfinfo
xlsread
Reads variables in .xls files. All data should be numeric, although it does contain methods which allow for text to be read.
xlswrite
Index
. . ., 12 *cdf, 95 *inv, 95 *pdf, 95 *rnd, 95 ;, 11 %, 12
abs, 133 all, 57, 137 AND, 56 ans, 138 any, 57, 137 axis, 143 bar, 143 bar3, 143 break, 68, 142 case, 62, 141 cat, 140 cd, 147 cdf, 136 ceil, 134 cell, 118 corrcoef, 135 cos, 147 cov, 135 csvread, 27, 150 csvwrite, 27, 31, 150 cumprod, 139 cumsum, 140 date, 145 datenum, 29, 105, 145 datestr, 106, 145 datetick, 108, 145 datevec, 107, 145 delete, 147 det, 139 diag, 139 diff, 133 dir, 123, 147 disp, 93 dlmwrite, 31 doc, 13, 148 double, 111, 146 edit, 10, 148 eig, 139 else, 61, 141 elseif, 61, 141 end, 141, 142 eps, 138 errorbar, 143 etime, 108, 145 exp, 133 eye, 138 fclose, 30, 150 fgetl, 29, 150 figure, 143 find, 137
Cell Arrays, 118-121 char, 111, 146 chol, 139 clc, 148 clear, 148 clf, 148 clock, 108, 145 close, 148 colormap, 143 Comments, 12 continue, 69, 142 contour, 79, 143 corr, 135
floor, 134 fminbnd, 99, 142 fmincon, 100, 142 fminsearch, 99, 142 fminunc, 98, 142 fopen, 29, 150 for, 65, 142 format, 148 fprintf, 150 gcf, 143 get, 143 help, 12, 149 helpbrowser, 149 helpdesk, 149 hist, 144 if, 61, 141
min, 38, 134 mkdir, 148 mod, 134 NaN, 138 NOT, 56 now, 108, 145 num2str, 114, 146 numel, 140 ones, 138 optimget, 143 optimset, 104, 143 OR, 56 otherwise, 62 pdf, 136
Importing Data, 23-31 Inf, 138 intersect, 140 inv, 136, 139 ischar, 137 isfinite, 137 isinf, 137 isnan, 137 isreal, 137
keyboard, 93, 149 kron, 139 kurtosis, 135 legend, 71, 73-75, 83, 144 length, 37, 140 linspace, 138 load, 28, 150 log, 133 log10, 133 logical, 56, 138 logspace, 138
Performance, 127 pi, 138 pie, 144 plot, 71, 144 plot3, 73, 144 prctile, 95, 135 print, 87, 144 profile, 130, 149 pwd, 148
quantile, 95, 135 rand, 131, 136 randg, 136 randn, 131, 137 random, 137 realmax, 149 realmin, 149 regexp, 114 regexpi, 114 regress, 95, 135 repmat, 140 research, 93 reshape, 140 rmdir, 148 rnd, 136 roots, 134 round, 135
save, 31, 150 scatter, 75, 144 set, 144 setdiff, 140 shading, 144 sign, 134 sin, 147 size, 37, 140 skewness, 135 sort, 39, 141 sortrows, 141 sqrt, 40, 134 std, 136 str2double, 29, 114, 146 str2num, 114, 146 strcat, 112, 146 strcmp, 113, 146 strcmpi, 113, 146 strfind, 29, 112, 147 strmatch, 113, 147 strncmp, 113, 147 strncmpi, 113, 147 struct, 117
which, 10, 149 while, 67, 142 whos, 149 x2mdate, 26, 146 xlabel, 71, 145 xlsfinfo, 150 xlsflinfo, 25 xlsread, 24, 151 xlswrite, 25, 31, 151 ylabel, 71, 145 zeros, 47 zlabel, 71, 145
Structures, 117-118 strvcat, 112, 147 subplot, 79, 144 sum, 38, 134 surf, 76, 144 switch, 62, 142
textread, 28, 150 textscan, 28, 30, 150 tic, 108, 146 title, 71, 145 toc, 108, 146 toeplitz, 138 trace, 139 tril, 139 triu, 139 type, 123 union, 141 unique, 141 var, 136