DAA - Non-Deterministic Algorithms
Non-deterministic algorithms
Non-deterministic algorithms (treated here in the sense of randomized algorithms) are algorithms that use randomness in their computation. Unlike deterministic algorithms, which produce the same output for a given input every time they are executed, randomized algorithms can produce different outputs or behaviors each time they are run, even for the same input.
The design and analysis of non-deterministic algorithms involve considering the average-case
behavior or expected performance of the algorithm, rather than focusing solely on the worst-case
scenario. Non-deterministic algorithms make use of random numbers or random choices during
their execution to achieve certain properties, such as improved efficiency, simplicity, or
probabilistic correctness.
Here are a few examples of non-deterministic algorithms and their applications:
➢ Randomized Quicksort: Quicksort is a widely used sorting algorithm. Randomized
Quicksort improves upon the worst-case performance of the deterministic version by
randomizing the choice of the pivot element. By choosing a random pivot, the algorithm
achieves expected O(n log n) time complexity, even for certain inputs that would
otherwise result in worst-case behavior.
➢ Monte Carlo Algorithms: These algorithms use random numbers to approximate solutions
to problems that are otherwise computationally expensive to solve exactly. Monte Carlo
methods are used in various fields, such as computational physics, finance, and
optimization problems. The accuracy of the approximation improves as more random
samples are taken.
➢ Randomized Prim's Algorithm: Prim's algorithm is used to find a minimum spanning tree
in a connected weighted graph. In its randomized version, the algorithm selects edges
randomly instead of always choosing the minimum-weight edge. Although the
deterministic version guarantees correctness, the randomized version provides a faster
average-case running time.
➢ Las Vegas Algorithms: These algorithms use randomness to improve efficiency by
allowing the algorithm to terminate early in certain cases. Las Vegas algorithms guarantee
correctness but have a variable running time. An example is the randomized algorithm for
solving the k-SAT problem, which is a well-known NP-complete problem.
➢ Skip Lists: Skip lists are a probabilistic data structure that allows for efficient searching,
insertion, and deletion operations in sorted lists. They are an alternative to balanced
search trees and offer expected O(log n) time complexity for these operations.
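As a concrete illustration of the first example above, here is a minimal sketch of randomized Quicksort in Python (the function name and the list-comprehension partitioning are our own illustrative choices, not a canonical implementation):

```python
import random

def randomized_quicksort(arr):
    """Quicksort with a randomly chosen pivot: expected O(n log n)
    comparisons on every input, since no fixed input is adversarial."""
    if len(arr) <= 1:
        return list(arr)
    pivot = random.choice(arr)  # the random pivot choice is the whole trick
    less = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    greater = [x for x in arr if x > pivot]
    return randomized_quicksort(less) + equal + randomized_quicksort(greater)
```

Because the pivot is random, an already-sorted input no longer triggers the O(n^2) worst case that a fixed first-element pivot would.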
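A standard Monte Carlo illustration is estimating π by sampling random points in the unit square (the function name and seeding are our own choices; the method itself is classical):

```python
import random

def estimate_pi(num_samples, seed=None):
    """Estimate pi: the fraction of random points in the unit square
    that fall inside the quarter circle approaches pi/4."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # point lies inside the quarter circle
            inside += 1
    return 4.0 * inside / num_samples
```

As the text says, the approximation improves as more samples are taken: the error shrinks roughly like 1/sqrt(num_samples).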
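A toy Las Vegas algorithm in the same spirit: probe random positions of a list until the target is found. The answer is always correct; only the number of probes varies between runs. The upfront membership check is an assumption we add so the loop is guaranteed to terminate:

```python
import random

def las_vegas_find(arr, target):
    """Return an index of target in arr by probing positions at random.
    Correctness is guaranteed; the running time is a random variable."""
    if target not in arr:  # guard (an added assumption) so the loop terminates
        return None
    while True:
        i = random.randrange(len(arr))
        if arr[i] == target:
            return i
```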
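The skip-list idea can be sketched compactly as well; the fixed maximum level and the promotion probability of 1/2 below are standard but arbitrary parameter choices, and deletion is omitted for brevity:

```python
import random

class SkipNode:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * (level + 1)  # one next-pointer per level

class SkipList:
    """Sorted-set sketch: expected O(log n) search and insert."""
    MAX_LEVEL = 16

    def __init__(self):
        self.head = SkipNode(None, self.MAX_LEVEL)
        self.level = 0

    def _random_level(self):
        # coin flips: a node is promoted one level with probability 1/2
        lvl = 0
        while random.random() < 0.5 and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def search(self, key):
        node = self.head
        for i in range(self.level, -1, -1):   # descend level by level
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

    def insert(self, key):
        update = [None] * (self.MAX_LEVEL + 1)
        node = self.head
        for i in range(self.level, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node                  # last node before key on level i
        lvl = self._random_level()
        if lvl > self.level:
            for i in range(self.level + 1, lvl + 1):
                update[i] = self.head
            self.level = lvl
        new = SkipNode(key, lvl)
        for i in range(lvl + 1):              # splice the node into each level
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new
```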
When analyzing non-deterministic algorithms, a common approach is to consider their expected
behavior, average-case complexity, or probabilistic guarantees. This involves analyzing the
algorithm's performance over a range of possible inputs and taking into account the probabilities
of different outcomes. Randomized algorithms provide a powerful toolset for solving complex
problems efficiently and often yield practical solutions in real-world scenarios.
What are the major differences between deterministic and non-deterministic algorithms?
• For a given input, a deterministic algorithm follows the same execution path and produces the same output every time it is run; a non-deterministic algorithm may follow different paths and can produce different outputs on the same input.
• In a deterministic algorithm, each step has exactly one well-defined successor; in a non-deterministic algorithm, a step may have several possible next moves (for example, a random choice).
• The running time of a deterministic algorithm is fixed for a given input; the running time of a non-deterministic algorithm can vary from run to run.
In the study of computational complexity, several classes of problems are defined based on the
resources required to solve them. Here are the definitions of the classes P, NP, NP-complete, and
NP-hard:
1. P (Polynomial Time): The class P represents the set of decision problems that can be
solved by a deterministic Turing machine in polynomial time. In other words, these are
the problems for which an algorithm exists that can solve them efficiently, with a time
complexity bound of O(n^k), where n is the size of the input and k is a constant.
Examples of problems in P include sorting, searching, and basic arithmetic operations.
2. NP (Non-deterministic Polynomial Time): The class NP represents the set of decision
problems for which a potential solution can be verified in polynomial time by a
deterministic Turing machine. In other words, given a solution, it can be verified
efficiently. However, finding the solution itself may not be efficient. This class includes
many important problems, such as the traveling salesman problem, the knapsack
problem, and the graph coloring problem.
3. NP-complete (Nondeterministic Polynomial-time complete): A problem is NP-complete
if it is both in NP and every problem in NP can be reduced to it in polynomial time. In
simpler terms, an NP-complete problem is one where a solution can be verified in
polynomial time and any other problem in NP can be transformed into it efficiently. The
classic example of an NP-complete problem is the Boolean satisfiability problem (SAT).
4. NP-hard (Nondeterministic Polynomial-time hard): The class NP-hard contains problems
that are at least as hard as the hardest problems in NP. Unlike NP-complete problems,
NP-hard problems may not be in NP themselves. These problems do not necessarily have
to be decision problems. Informally, an NP-hard problem is one that is "as hard as" or
"harder than" the NP-complete problems in terms of computational complexity.
In summary, the class P represents problems that can be solved efficiently, NP represents
problems that can be verified efficiently, NP-complete represents the hardest problems in NP, and
NP-hard represents problems that are at least as hard as the hardest problems in NP. The
relationship between these classes is that all NP-complete problems are NP-hard, but it is
unknown whether P = NP (i.e., whether NP-complete problems can be solved in polynomial time
or not).
For measuring the complexity of an algorithm, we use the input length as the parameter. For example, an algorithm A is of polynomial complexity if there is a polynomial p() such that the computing time of A is O(p(n)) for every input of size n.
Decision problem/ Decision algorithm: Any problem for which the answer is either zero or one is a decision problem. Any algorithm for a decision problem is termed a decision algorithm.
Optimization problem/ Optimization algorithm: Any problem that involves the identification
of an optimal (either minimum or maximum) value of a given cost function is known as an
optimization problem. An optimization algorithm is used to solve an optimization problem.
P is the set of all decision problems solvable by deterministic algorithms in polynomial time.
NP is the set of all decision problems solvable by non-deterministic algorithms in polynomial time. Since deterministic algorithms are just a special case of non-deterministic ones, we conclude that P ⊆ NP.
The most famous open problem in Computer Science is whether P = NP or P ≠ NP. In considering this problem, S. Cook formulated the following question:
Is there any single problem in NP such that, if we showed it to be in P, then that would imply that P = NP?
Let L1 and L2 be problems. Problem L1 reduces to L2 (written L1 ∝ L2) if and only if there is a deterministic polynomial-time algorithm that solves L1 using, as a subroutine, a deterministic polynomial-time algorithm that solves L2.
This implies that if we have a polynomial-time algorithm for L2, then we can solve L1 in polynomial time.
▪ A Boolean formula f(a1, a2, …, an) is satisfiable if there is a way to assign values to its variables such that the formula evaluates to true.
▪ On the other hand, if no such assignment is possible, the formula is unsatisfiable.
▪ The expression (A ∧ ¬B) is satisfiable for A = 1 and B = 0, but the expression (A ∧ ¬A) is unsatisfiable.
▪ No known algorithm solves the satisfiability problem efficiently.
Example: f(x, y, z) = (x ∨ (y ∧ z)) ∧ (x ∧ z)

x  y  z | x ∨ (y ∧ z) | x ∧ z | f(x, y, z)
0  0  0 |      0      |   0   |     0
0  0  1 |      0      |   0   |     0
0  1  0 |      0      |   0   |     0
0  1  1 |      1      |   0   |     0
1  0  0 |      1      |   0   |     0
1  0  1 |      1      |   1   |     1
1  1  0 |      1      |   0   |     0
1  1  1 |      1      |   1   |     1
▪ The function is true for the combinations (x = 1, y = 0, z = 1) and (x = y = z = 1), hence it
is satisfiable.
▪ Given an assignment of the variables, the formula can be verified in linear time; this polynomial-time verifiability is what shows SAT ∈ NP. Finding a satisfying assignment, however, may require testing all 2^n assignments over n variables, which takes O(n·2^n) time.
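The verify-versus-find gap can be made concrete with a brute-force check of the example formula (the function names are ours; the enumeration is exactly the 2^n testing described above):

```python
from itertools import product

def f(x, y, z):
    """The example formula (x OR (y AND z)) AND (x AND z)."""
    return bool((x or (y and z)) and (x and z))

def satisfiable(formula, num_vars):
    """Try all 2^n assignments (exponential); evaluating any single
    assignment, the 'verify' step, is linear in the formula size."""
    for assignment in product([0, 1], repeat=num_vars):
        if formula(*assignment):
            return assignment  # a satisfying assignment
    return None
```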
Clique:
A clique is a subgraph of a graph such that all the vertices in the subgraph are connected to each other; that is, the subgraph is a complete graph. The Maximal Clique Problem is to find the maximum-sized clique of a given graph G, i.e., a complete subgraph of G containing the maximum number of vertices. This is an optimization problem. Correspondingly, the Clique Decision Problem is to determine whether a clique of size k exists in the given graph.
To prove that a problem is NP-Complete, we have to show that it belongs to both NP and NP-
Hard Classes. (Since NP-Complete problems are NP-Hard problems which also belong to NP)
The Clique Decision Problem belongs to NP – If a problem belongs to the NP class, then it
should have polynomial-time verifiability, that is given a certificate, we should be able to verify
in polynomial time if it is a solution to the problem.
Proof:
1. Certificate – Let the certificate be a set S consisting of the nodes of the claimed clique, where S is a subset of the vertices of G.
2. Verification – We have to check whether there exists a clique of size k in the graph. First, verifying that the number of nodes in S equals k takes O(1) time. Verifying that each vertex has degree (k−1) within S takes O(k^2) time, since in a complete graph each vertex is connected to every other vertex through an edge, so the total number of edges in a complete graph on k vertices is C(k, 2) = k(k−1)/2. Therefore, checking whether the graph formed by the k nodes in S is complete takes O(k^2) = O(n^2) time (since k ≤ n, where n is the number of vertices in G).
Therefore, the Clique Decision Problem has polynomial time verifiability and hence belongs to
the NP Class.
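The verification above can be sketched directly in code; the adjacency-set graph representation is our own choice:

```python
def verify_clique(graph, S, k):
    """Polynomial-time certificate check for the Clique Decision Problem.
    graph: dict mapping each vertex to the set of its neighbours."""
    if len(S) != k:                         # O(1) size check
        return False
    nodes = list(S)
    for i in range(len(nodes)):             # O(k^2) pairwise adjacency check
        for j in range(i + 1, len(nodes)):
            if nodes[j] not in graph.get(nodes[i], set()):
                return False
    return True
```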
The Clique Decision Problem belongs to NP-Hard – A problem L belongs to NP-Hard if every NP problem is reducible to L in polynomial time. Now, let the Clique Decision Problem be denoted by C. To prove that C is NP-Hard, we take an already known NP-Hard problem, say S, and reduce it to C for a particular instance. If this reduction can be done in polynomial time, then C is also an NP-Hard problem. The Boolean Satisfiability Problem (S) is an NP-Complete problem, as proved by Cook's theorem. Therefore, every problem in NP can be reduced to S in polynomial time.
Thus, if S is reducible to C in polynomial time, every NP problem can be reduced to C in
polynomial time, thereby proving C to be NP-Hard.
Proof that the Boolean Satisfiability problem reduces to the Clique Decision Problem
Let the boolean expression be – F = (x1 v x2) ^ (x1' v x2') ^ (x1 v x3) where x1, x2, x3 are the variables, '^' denotes logical 'and', 'v' denotes logical 'or' and x' denotes the complement of x.
Let the expression within each parentheses be a clause. Hence we have three clauses – C1, C2 and
C3. Consider the vertices as – <x1, 1>; <x2, 1>; <x1’, 2>; <x2’, 2>; <x1, 3>; <x3, 3> where the
second term in each vertex denotes the clause number they belong to. We connect these vertices
such that –
1. No two vertices belonging to the same clause are connected.
2. No variable is connected to its complement
Thus, the graph G (V, E) is constructed such that – V = { <a, i> | a belongs to Ci } and E = { ( <a,
i>, <b, j> ) | i is not equal to j ; b is not equal to a’ } Consider the subgraph of G with the vertices
<x2, 1>; <x1’, 2>; <x3, 3>. It forms a clique of size 3 (Depicted by dotted line in above figure) .
Corresponding to this, for the assignment – <x1, x2, x3> = <0, 1, 1> F evaluates to true.
Therefore, if we have k clauses in our satisfiability expression, we get a max clique of size k and
for the corresponding assignment of values, the satisfiability expression evaluates to true. Hence,
for a particular instance, the satisfiability problem is reduced to the clique decision problem.
Therefore, the Clique Decision Problem is NP-Hard.
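The construction can be sketched in code. The trailing-apostrophe encoding of complements mirrors the text's notation; the brute-force clique search is only meant for toy instances like this one:

```python
from itertools import combinations

def complement(lit):
    """x1 <-> x1' under the text's notation."""
    return lit[:-1] if lit.endswith("'") else lit + "'"

def sat_to_clique(clauses):
    """One vertex per (literal, clause index); edges join vertices from
    different clauses whose literals are not complementary."""
    vertices = [(lit, i) for i, clause in enumerate(clauses) for lit in clause]
    edges = set()
    for (a, i), (b, j) in combinations(vertices, 2):
        if i != j and b != complement(a):
            edges.add(frozenset([(a, i), (b, j)]))
    return vertices, edges

def has_clique(vertices, edges, k):
    """Exponential brute-force search for a k-clique (toy sizes only)."""
    for subset in combinations(vertices, k):
        if all(frozenset([u, v]) in edges
               for u, v in combinations(subset, 2)):
            return subset
    return None

# F = (x1 v x2) ^ (x1' v x2') ^ (x1 v x3) is satisfiable, so a 3-clique exists
clauses = [["x1", "x2"], ["x1'", "x2'"], ["x1", "x3"]]
```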
Vertex cover:
A vertex cover of a graph G is a subset of vertices such that every edge of G has at least one endpoint in the subset; the Vertex Cover (VC) decision problem asks whether G has a vertex cover of size at most k. To prove that VC is NP-complete, we take the clique problem, which is NP-complete, and reduce it to VC. Recall that a clique of a graph G is a subset of vertices that forms a complete subgraph of G.
The two graphs, titled (a) and (b), for the VC problem are given below.
[Figure omitted: example graphs (a) and (b) for the VC problem]
Parallel Algorithms:
Parallel algorithms play a crucial role in the field of Design and Analysis of Algorithms (DAA)
as they focus on designing algorithms that can be executed on parallel computing architectures,
where multiple processors or cores work together to solve a problem more efficiently. Parallel
algorithms aim to exploit the available parallelism to achieve faster execution times and
improved scalability.
Here are some common techniques and examples of parallel algorithms in DAA:
1. Parallel Divide and Conquer: The divide and conquer technique can be parallelized by
dividing the problem into independent subproblems that can be solved concurrently on
different processors. The results from the subproblems are then combined to obtain the
final solution. Examples include parallel merge sort, parallel quicksort, and parallel
matrix multiplication.
2. Parallel Prefix Sum: The prefix sum, also known as the scan operation, calculates the
cumulative sum of elements in an array. The parallel prefix sum algorithm divides the
array into smaller segments and performs prefix sum computations concurrently. This
technique is useful in parallelizing algorithms like parallel sorting, parallel graph
algorithms, and parallel dynamic programming.
3. Parallel Graph Algorithms: Many graph algorithms can be parallelized to take advantage
of the inherent parallelism in graphs. For example, parallel breadth-first search (BFS) and
parallel depth-first search (DFS) algorithms can explore different parts of the graph
simultaneously on different processors. Other graph algorithms, such as minimum
spanning tree, shortest path, and maximum flow algorithms, can also be parallelized.
4. Parallel Dynamic Programming: Dynamic programming algorithms, which solve
problems by breaking them down into overlapping subproblems, can be parallelized to
exploit parallelism. The key is to identify independent subproblems that can be solved
concurrently. Parallel dynamic programming can be applied to problems such as
sequence alignment, matrix chain multiplication, and knapsack problems.
5. Parallel Monte Carlo Algorithms: Monte Carlo methods involve generating random
samples to approximate solutions to complex problems. Parallelization can be achieved
by distributing the sampling and computation across multiple processors. Applications of
parallel Monte Carlo algorithms include simulations, optimization problems, and
computational physics.
6. Parallel Sorting: Sorting algorithms, such as parallel merge sort and parallel quicksort,
can be designed to utilize multiple processors. The idea is to divide the input into smaller
subproblems that can be sorted independently and then merge the sorted subproblems in
parallel to obtain the final sorted result.
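The divide-and-conquer and parallel-sorting items above can be sketched together: split the input into chunks, sort them concurrently, then merge pairwise. A thread pool keeps the sketch short; for CPU-bound speedups in CPython one would use a process pool instead:

```python
from concurrent.futures import ThreadPoolExecutor

def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def parallel_merge_sort(arr, workers=4):
    """Sort chunks concurrently (independent subproblems), then combine."""
    if len(arr) <= 1:
        return list(arr)
    size = max(1, len(arr) // workers)
    chunks = [arr[i:i + size] for i in range(0, len(arr), size)]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        sorted_chunks = list(ex.map(sorted, chunks))  # concurrent conquer step
    while len(sorted_chunks) > 1:                     # pairwise combine step
        sorted_chunks = [merge(sorted_chunks[i], sorted_chunks[i + 1])
                         if i + 1 < len(sorted_chunks) else sorted_chunks[i]
                         for i in range(0, len(sorted_chunks), 2)]
    return sorted_chunks[0]
```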
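The prefix-sum (scan) item can be illustrated with a Hillis-Steele-style sketch; each pass is written as a whole-array update because, on a parallel machine, every element in a pass can be updated simultaneously, giving O(log n) passes:

```python
def parallel_prefix_sum(values):
    """Inclusive scan. Pass d adds, to each element, the element d
    positions to its left; all updates within a pass are independent."""
    result = list(values)
    d = 1
    while d < len(result):
        # on parallel hardware this whole comprehension is one O(1) step
        result = [result[i] + (result[i - d] if i >= d else 0)
                  for i in range(len(result))]
        d *= 2
    return result
```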
These are just a few examples of parallel algorithms in DAA. The design and analysis of parallel
algorithms require careful consideration of load balancing, communication overhead,
synchronization, and scalability issues. Performance metrics, such as speedup, efficiency, and
scalability, are often used to evaluate the effectiveness of parallel algorithms in exploiting
parallel computing resources.
The model of a parallel algorithm is developed by considering a strategy for dividing the data
and processing method and applying a suitable strategy to reduce interactions. In this chapter, we
will discuss the following Parallel Algorithm Models −
• Data parallel model
• Task graph model
• Work pool model
• Master slave model
• Producer consumer or pipeline model
• Hybrid model
Data Parallel Model
In the data parallel model, tasks are assigned to processes, and each task performs similar operations on different data. Data parallelism is a consequence of a single operation being applied to multiple data items.
Task Graph Model
In the task graph model, parallelism is expressed by a task graph, which can be either trivial or nontrivial. In this model, the correlation among the tasks is exploited to promote locality or to minimize interaction costs. This model is employed to solve problems in which the quantity of data associated with the tasks is large compared to the amount of computation associated with them. The tasks are assigned so as to reduce the cost of data movement among them.
Examples − Parallel quick sort, sparse matrix factorization, and parallel algorithms derived via
divide-and-conquer approach.
Here, problems are divided into atomic tasks and implemented as a graph. Each task is an independent unit of work that may depend on one or more antecedent tasks. After an antecedent task completes, its output is passed to the dependent tasks. A task starts execution only when all of its antecedent tasks have completed. The final output of the graph is produced when the last dependent task completes (Task 6 in the above figure).
Work Pool Model
In the work pool model, tasks are dynamically assigned to processes to balance the load. Therefore, any process may potentially execute any task. This model is used when the quantity of data associated with tasks is comparatively smaller than the computation associated with them.
There is no pre-assignment of tasks onto the processes. The assignment of tasks can be centralized or decentralized. Pointers to the tasks are saved in a physically shared list, in a priority queue, or in a hash table or tree, or they may be saved in a physically distributed data structure.
Tasks may be available at the beginning, or may be generated dynamically. If tasks are generated dynamically and assignment is decentralized, then a termination detection algorithm is required so that all the processes can detect the completion of the entire program and stop looking for more tasks.
Example − Parallel tree search
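A minimal work-pool sketch: tasks sit in a shared queue and idle workers pull the next one, balancing the load dynamically. The squaring step is a placeholder computation:

```python
import queue
import threading

def work_pool(tasks, num_workers=4):
    """Run tasks from a shared queue across worker threads."""
    task_q = queue.Queue()
    for t in tasks:
        task_q.put(t)
    results, lock = [], threading.Lock()

    def worker():
        while True:
            try:
                t = task_q.get_nowait()   # dynamic assignment: grab next task
            except queue.Empty:
                return                    # queue drained: simple termination
            r = t * t                     # placeholder computation
            with lock:
                results.append(r)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return sorted(results)
```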
Master-Slave Model
In the master-slave model, one or more master processes generate tasks and allocate them to slave processes. The tasks may be allocated beforehand if, for example, the master can estimate the volume of the tasks.
Hybrid Models
A hybrid algorithm model is required when more than one model may be needed to solve a
problem.
A hybrid model may be composed of either multiple models applied hierarchically or multiple
models applied sequentially to different phases of a parallel algorithm.
Example − Parallel quick sort