EECS 204002

Data Structures 資料結構


Prof. REN-SONG TSAY 蔡仁松 教授
NTHU

CH. 1
BASIC CONCEPTS

2023/9/8 © Ren-Song Tsay, NTHU, Taiwan


1.5

Algorithm



1.5 What is an Algorithm?
An algorithm is a finite set of instructions that
accomplishes a particular task (problem) and satisfies
the following criteria:
— Input
◦ Zero or more quantities are externally supplied (an algorithm may have no input).
— Output
◦ At least one quantity is produced (an algorithm must have output).
— Definiteness
◦ Each instruction is clear and unambiguous.
— Finiteness
◦ The algorithm terminates after a finite number of steps.
— Effectiveness
◦ Every instruction must be basic and feasible to compute.
1.5.1 Representation of Algorithms
— Natural languages
◦ English, etc.
— Graphic representation
◦ Flowchart.
◦ Feasible only if the algorithm is small and simple.
— Programming language
◦ C++
◦ Concise and effective!
1.5.1 Example: Binary Search
Problem statement: Assume we have n ≥ 1 distinct
integers sorted in ascending order in array
A[0], …, A[n-1]. Determine whether an integer x is
present: if x = A[j], return index j;
otherwise return -1.

A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7]
A 1 3 5 8 9 17 32 50

E.g., for x=9, return index 4;
for x=10, return -1.
1.5.1 BS in Plain English
1. Let left and right denote the two ends of the search
range, with initial values 0 and n-1.
2. Let middle = (left+right) / 2 be the middle position of
the range.
3. Compare A[middle] with x; there are three possible results:
a. x < A[middle]: x must be somewhere between left and
middle-1. We set right to middle-1.
b. x == A[middle]: We return middle.
c. x > A[middle]: x must be somewhere between middle+1
and right. We set left to middle+1.
4. If x is not found and there are still integers to check,
we recalculate middle and repeat the above comparison.
1.5.1 BS in Pseudo C++ Code
// Returns the index of x in A[0..n-1], or -1 if not found
int BinarySearch(int *A, const int x, const int n)
{ int left=0, right=n-1;
  while (left <= right)
  { // more integers to check
    int middle = (left+right)/2;
    if (x < A[middle]) right = middle-1;
    else if (x > A[middle]) left = middle+1;
    else return middle;
  } // end of while
  return -1; // not found
}

1.5.2 Recursive Algorithm
(Note: recursion is not always efficient, because each call uses a stack frame to store its local variables.)

— A powerful mechanism to make your
algorithm or code clearer.
— Direct recursion:
◦ A function calls itself directly.
◦ E.g. funcA → funcA.
— Indirect recursion:
◦ Function A calls another function B that in turn
invokes function A.
◦ E.g. funcA → funcB → funcA.
A Recursively Defined Problem
The binomial coefficient

C(n, m) = n! / (m! (n-m)!)

can be computed by the recursive formula

C(n, m) = C(n-1, m) + C(n-1, m-1)   (decreased parameters)

where C(0, 0) = C(n, n) = 1   (termination conditions)
Principles for Feasible Recursive
Algorithms
— Termination conditions:
◦ The function must return a value or stop
calling itself under certain conditions.
— Decreased parameters:
◦ So that each call is one step closer to a
termination condition.
◦ E.g.
C(n, m) = C(n-1, m) + C(n-1, m-1)
For the “While” Statement
— Replace the while loop with if-else and recursion.
— In the Binary Search problem:
int BinarySearch(int *A, const int x, const int n)
{ int left=0, right=n-1;
  while (left <= right)
  {
    // body to be replaced by a recursive call
  }
  return -1;
}
Recursive Binary Search
int BinarySearch(int *A, const int x, const int
left, const int right )
{ // Search the A[left],..,A[right] for x
if (left <= right) { // more integers to check
int middle = (left+right)/2;
if (x < A[middle])
return BinarySearch(A, x, left, middle-1);
else if (x > A[middle])
return BinarySearch(A, x, middle+1, right);
return middle;
} // end of if
return -1; // not found
}
1.5.2 Example
— Search for x=9 in array A[0]…A[7]:

A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7]
A 1 3 5 8 9 17 32 50

— 1st call: BinarySearch(A, 9, 0, 7) examines A[3]
2nd call: BinarySearch(A, 9, 4, 7) examines A[5]
3rd call: BinarySearch(A, 9, 4, 4) examines A[4]
return index 4.
Quiz
Write down the recursive version of the
binomial coefficient in pseudo code.
Recursive form:
C(n, m) = C(n-1, m) + C(n-1, m-1)
Termination conditions:
C(0, 0) = C(n, n) = 1

1.5.2 Criteria of a “Good” Program
— Does it do what you want to do?
— Does it work correctly?
— Any documentation about how to use it?
— Are functions created logically?
— Is the code readable?
— However, the above criteria are HARD to
achieve (at least while only data structures are taught).
— So, we focus on the “performance” of the
program.
1.7
Performance
Analysis and
Measurement



1.7 Performance Evaluation
— Two aspects:
◦ Space Complexity
– How much memory space is used?
◦ Time Complexity
– How much execution time is needed?
— Two approaches:
◦ Performance Analysis
– machine independent
– an a priori estimate
◦ Performance Measurement
– machine dependent
– an a posteriori measure
1.7.1 Uses of Performance Analysis
Ø Determine the practicality of an algorithm
Ø Predict run time on large instances
Ø Compare algorithms of different complexity
Ø e.g., O(n) vs. O(n²)
1.7.1 Performance Analysis
— Space complexity: S(P) = C + SP(I)
— C is a fixed part:
◦ Independent of the size of the input and output.
◦ Space for program instructions, static variables,
fixed-size structured variables, and constants.
— SP(I) is a variable part:
◦ Depends on the specific problem instance I.
◦ Space for referenced variables and the recursion
stack (instance characteristics).
Instance Characteristics (I)
— Commonly used characteristics I include
the size of the input and output of the
problem.
— We shall concentrate solely on estimating
the second part, SP(I).
— Ex1. sorting(A[], n), where n is the array size:
then I = number of integers = n.
— Ex2. Summation of 1 to n, i.e., 1+2+3+…+n:
then I = value of n = n (the number of values to be added).
Space Complexity: Simple
Function
float Abc(float a, float b, float c)
{
  return a+b+b*c+(a+b-c)/(a+b)+4.0;
}

— I = a, b, c
— C = space for the program + space for
variables a, b, c, and the result of Abc = constant
— SAbc(I) = 0
— S(Abc) = C + SAbc(I) = constant
Space Complexity: Iterative
Summation (compare with the recursive version on the next slide)
float Sum(float *A, const int n)
{ float s = 0;
  for(int i=0; i<n; i++)
    s += A[i];
  return s;
}

— I = n (number of elements to be summed)
— C = constant
— SSum(I) = 0 (no recursive call; A stores only the
address of the array, i.e., it is just a pointer)
— S(Sum) = C + SSum(I) = constant
Space Complexity: Recursive
Summation
float Rsum(float *A, const int n)
{
  if (n<=0) return 0;
  else return (Rsum(A, n-1) + A[n-1]); // recursive call
}

— I = n (number of elements to be summed)
— C = constant
— Each recursive call of Rsum requires 4 × (1+1+1) =
12 bytes: 4 bytes each for the pointer A, the integer n,
and the return address.
— Number of calls: Rsum(A, n) → Rsum(A, n-1)
→ … → Rsum(A, 0) ⇒ n+1 calls, i.e., 12 bytes × (n+1) calls
— S(Rsum) = C + SRsum(n) = const + 12(n+1)
1.7.1.2 Time Complexity
T(P) = C + TP(I)
— C is a constant:
◦ Compile time.
— TP(I) is variable:
◦ Execution time.
Performance Analysis
— How to evaluate TP(I)?
◦ Count every Add, Sub, Multiply, etc.?
◦ Practically infeasible, because each instruction
takes a different amount of time on different
machines.
— Use “program steps” to estimate TP(I):
◦ A “program step” is a statement whose
execution time is independent of the instance
characteristics I.
abc=a+b+b*c; → one program step
a=2; → one program step
Time Complexity: Iterative
Summation
— I = n (number of elements to be summed)
— TSum(I) = 1 + (n+1) + n + 1 = 2n + 3
— T(Sum) = C + TSum(n) = const + (2n+3)

float Sum(float *A, const int n)
{ float s = 0;              // 1 step
  for(int i=0; i<n; i++)    // n+1 steps
    s += A[i];              // n steps
  return s;                 // 1 step
}
Time Complexity: Recursive
Summation
float Rsum(float *A, const int n)
{
  if (n<=0)       // 1 step
    return 0;     // 1 step
  else return (Rsum(A,n-1)+A[n-1]); // 1 step
}

— I = n (number of elements for summation)
— TRsum(n) = ?
Time Complexity: Recursive
Summation
float Rsum(float *A, const int n)
{
  if (n<=0)       // 1 step
    return 0;     // 1 step
  else return (Rsum(A, n-1) + A[n-1]); // 1 step
}
— I = n (number of elements for summation)
— TRsum(0) = 2
— TRsum(n) = 2 + TRsum(n-1)
           = 2 + (2 + TRsum(n-2))
           = …
           = 2n + TRsum(0) = 2n + 2
(Note: computing the sum recursively has the same time
complexity as using a for loop; the key difference is that
the recursive version's space complexity is not constant,
while the for-loop version's is.)
Time Complexity: Matrix
Addition
void Add(int **a, int **b, int **c, int m, int n)
{
  for(int i=0; i<m; i++)          // m+1 steps
    for(int j=0; j<n; j++)        // m*(n+1) steps
      c[i][j] = a[i][j]+b[i][j];  // m*n steps
}

— I = m (rows), n (columns)
— TAdd(I) = (m+1) + m(n+1) + mn
          = 2mn + 2m + 1
— T(Add) = C + TAdd(I)
         = const + (2mn + 2m + 1)
Observation on Step Counts
— In the previous examples:
TSum(n) = 2n + 3 steps
TRsum(n) = 2n + 2 steps
— So, is Rsum faster than Sum?
◦ No!
◦ ∵ The execution time of each step is different.
— “Growth rate” is more critical:
◦ How does the execution time change as the instance
characteristics grow?
Program Growth Rate
(Growth rate is the most important factor in evaluating complexity.)

— In the Sum program, TSum(n) = 2n + 3 means that
when n grows tenfold (10×), the execution time
TSum(n) also grows roughly tenfold (10×).

— Therefore, we say that the Sum program runs in linear time.

— TRsum(n) = 2n + 2 also runs in linear time.

— We say TSum(n) and TRsum(n) have the same
growth rate, and are equal in time complexity!
1.7.1.3 Asymptotic Notation
— To make meaningful (but inexact)
statements about the time and space
complexities of a program:
◦ Predict the growth rate.
— Consider two programs with time complexities
◦ P1: c1·n² + c2·n
◦ P2: c3·n
◦ Which one runs faster?
1.7.1.3 Asymptotic Notation
— Scenario 1: c1 = 1, c2 = 2, and c3 = 100
◦ P1 (n² + 2n) ≤ P2 (100n) for n ≤ 98.
— Scenario 2: c1 = 1, c2 = 2, and c3 = 1000
◦ P1 (n² + 2n) ≤ P2 (1000n) for n ≤ 998.
• No matter what the values of c1, c2, and c3 are, there
is an n beyond which c1·n² + c2·n > c3·n.
• Therefore, we should compare complexities
for a sufficiently large value of n.
Notation: Big-O (O)
— Definition:
f(n) = O(g(n)) iff there exist c,
n0 > 0 such that f(n) ≤ c·g(n) for
all n ≥ n0. (Note: the definition uses ≤, i.e., equality is allowed.)
— Ex1. 3n + 2 = O(n)
◦ 3n + 2 ≤ 4n for all n ≥ 2
— Ex2. 100n + 6 = O(n)
◦ 100n + 6 ≤ 101n for all n ≥ 6
— Ex3. 10n² + 4n + 2 = O(n²)
◦ 10n² + 4n + 2 ≤ 11n² for all n ≥ 5
The upper bound or worst-case running time
Notation: Omega (Ω)
— Definition: f(n) = Ω(g(n)) iff
there exist c, n0 > 0 such that
f(n) ≥ c·g(n) for all n ≥ n0.
— Ex1. 3n + 2 = Ω(n)
◦ since 3n + 2 ≥ 3n for all n ≥ 1
— Ex2. 100n + 6 = Ω(n)
◦ since 100n + 6 ≥ 100n for all n ≥ 1
— Ex3. 10n² + 4n + 2 = Ω(n²)
◦ since 10n² + 4n + 2 ≥ n² for all n ≥ 1
The lower bound or best-case running time
Notation: Theta (Θ)
— Definition: f(n) = Θ(g(n)) iff
f(n) = O(g(n)) and f(n) = Ω(g(n)).
— Ex1. 3n + 2 = Θ(n)
— Ex2. 100n + 6 = Θ(n)
— Ex3. 10n² + 4n + 2 = Θ(n²)

The tight bound on the running time
Theorem 1.2
If f(n) = a_m·n^m + … + a1·n + a0, with a_m > 0,
then f(n) = O(n^m).
◦ 3n + 2 = O(n)
◦ 100n + 6 = O(n)
◦ 10n² + 4n + 2 = O(n²)
◦ 6n⁴ + 1000n³ + n² = O(n⁴)
— Leading constants and lower-order terms do
not matter.
Theorem 1.2 Proof
f(n) = a_m·n^m + … + a1·n + a0
     ≤ |a_m|·n^m + … + |a1|·n + |a0|
     ≤ n^m·(|a_m| + … + |a1| + |a0|)
     = n^m·c for n ≥ 1
So, f(n) = O(n^m).
Quiz
— n² − 10n − 6 = O(?)
— n + log n = O(?)
— n + n log n = O(?)
— n² + log n = O(?)
— 2ⁿ + n¹⁰⁰⁰⁰ = O(?)
— n⁴ + 1000n³ + n² = O(n⁴), true or false?
— n⁴ + 1000n³ + n² = O(n⁵), true or false?
Naming Common Functions
Complexity   Naming
O(1)         Constant time
O(log n)     Logarithmic time
O(n log n)   between O(n) and O(n²)
O(n²)        Quadratic time
O(n³)        Cubic time
O(n¹⁰⁰)      Polynomial time
O(2ⁿ)        Exponential time
When n is large enough, the later entries
take more time than the earlier ones.
1.7.1 Fig. 1.4: Plot of Common Function Values
[Figure: f versus n (0 to 10) for log n, n, n log n, n², and 2ⁿ;
2ⁿ grows fastest, log n slowest.]
1.7.1 Table 1.8: Execution Time Comparison
f(n)
n     n       n log2 n  n²       n³       n⁴         n¹⁰          2ⁿ
10    .01 µs  .03 µs    .1 µs    1 µs     10 µs      10 s         1 µs
20    .02 µs  .09 µs    .4 µs    8 µs     160 µs     2.84 h       1 ms
30    .03 µs  .15 µs    .9 µs    27 µs    810 µs     6.83 d       1 s
40    .04 µs  .21 µs    1.6 µs   64 µs    2.56 ms    121 d        18 m
50    .05 µs  .28 µs    2.5 µs   125 µs   6.25 ms    3.1 y        13 d
100   .10 µs  .66 µs    10 µs    1 ms     100 ms     3171 y       4×10¹³ y
10³   1 µs    9.96 µs   1 ms     1 s      16.67 m    3.17×10¹³ y  32×10²⁸³ y
10⁴   10 µs   130 µs    100 ms   16.67 m  115.7 d    3.17×10²³ y  …
10⁵   100 µs  1.66 ms   10 s     11.57 d  3171 y     3.17×10³³ y  …
10⁶   1 ms    19.92 ms  16.67 m  31.71 y  3.17×10⁷ y 3.17×10⁴³ y  …
µs = microsecond = 10⁻⁶ second; ms = millisecond = 10⁻³ second;
s = seconds; m = minutes; h = hours; d = days; y = years
Compute Execution Time in
Big-O
— Two approaches to compute the time
complexity of a program in big-O notation.
— Approach 1:
Step 1: Compute the total step count.
Step 2: Take big-O using Theorem 1.2.
— Approach 2:
Step 1: Take big-O of each step.
Step 2: Sum up the big-O of all steps.
Rule of Sum
— If f1(n) = O(g1(n)) and f2(n) = O(g2(n)), then
f1(n) + f2(n) = O(max(g1(n), g2(n))).
◦ Ex. f1(n) = O(n), f2(n) = O(n²):
then f1(n) + f2(n) = O(n²).
◦ Ex. f1(n) = O(n), f2(n) = O(n):
then f1(n) + f2(n) = O(n).
— Good for computing the time complexity
of a sequential program.
Rule of Product
for (i=0; i<n; i++) {   // O(n)
  for (j=0; j<n; j++)   // O(n)
    sum = sum + 1;      // O(1)
}

f(n) = O(n · n · 1) = O(n²).

— If f1(n) = O(g1(n)) and f2(n) = O(g2(n)),
then f1(n) · f2(n) = O(g1(n) · g2(n)).
◦ Ex. f1(n) = O(n), f2(n) = O(n):
then f1(n) · f2(n) = O(n²).
— Applicable to nested loops.
Complexity of Binary Search
int BinarySearch(int *A, const int x, const int n)
{ int left=0, right=n-1;
  while (left <= right)                       // O(?)
  { // more integers to check
    int middle = (left+right)/2;              // O(1)
    if (x < A[middle]) right = middle-1;      // O(1)
    else if (x > A[middle]) left = middle+1;  // O(1)
    else return middle;                       // O(1)
  } // end of while
  return -1; // not found
}
Complexity of Binary Search
— Analysis of the while loop:
◦ Iteration 1: n values to be searched
◦ Iteration 2: n/2 values left to search
◦ Iteration 3: n/4 values left to search
◦ …
◦ Iteration k+1: n/2^k values left to search
When n/2^k = 1, the search must finish,
i.e., n = 2^k ⇒ k = log2 n.
— Hence, the worst-case execution time of binary
search is O(log2 n).
1.7.2 Performance Measurement
— Obtain actual space and time requirement
when running a program.
— How to do time measurement in code?
◦ Method 1: Use clock(), measured in clock
ticks
◦ Method 2: Use time(), measured in seconds
— To time a short program, it is necessary
to repeat it many times and then take the
average.
Performance Measurement
Method 1: Use clock(), measured in clock
ticks
#include <time.h>

int main()
{
  clock_t start = clock();
  // main body of program comes here!
  clock_t stop = clock();
  double duration = ((double) (stop-start))
                    / CLOCKS_PER_SEC;
  return 0;
}

Performance Measurement
Method 2: Use time(), measured in seconds
#include <time.h>

int main()
{
  time_t start = time(NULL);

  // main body of program comes here!

  time_t stop = time(NULL);

  double duration = (double) difftime(stop, start);
  return 0;
}
