Unit-2(Query Optimization and Processing)

This document discusses query optimization and processing in DBMS, detailing measures of query cost such as execution time, CPU time, memory usage, disk I/O, and network bandwidth. It covers various operations including selection, sorting, joining, and other data manipulation techniques, along with the evaluation of expressions and the transformation of relational expressions for improved query execution efficiency. Additionally, it highlights the importance of estimating statistics and choosing evaluation plans to optimize query performance, as well as the benefits of using materialized views.

Uploaded by

Thutta Sony swaroop

UNIT-2 (QUERY OPTIMIZATION AND PROCESSING)

Measures of Query Cost in DBMS:

In a DBMS, the cost of a query is typically measured in terms of the resources it
consumes, such as CPU time, memory, I/O operations, and network bandwidth. Common
measures of query cost include:

1. Execution time: Execution time is the time it takes for a query to execute and
return results. This measure is typically expressed in seconds or milliseconds and
is a useful measure of the performance of a query.
2. CPU time: CPU time is the amount of time that the query spends executing on
the CPU. This measure is typically expressed in seconds or milliseconds and is a
useful measure of the amount of processing power consumed by a query.
3. Memory usage: Memory usage is the amount of memory that the query
consumes while executing. This measure is typically expressed in bytes or
megabytes and is a useful measure of the amount of memory required to execute
a query.
4. Disk I/O: Disk I/O is the number of times that the query reads from or writes to
disk. This measure is typically expressed as the number of I/O operations (block
transfers and seeks) or the amount of data read or written. For disk-resident
data it usually dominates the other costs.
5. Network bandwidth: Network usage is the amount of data transferred
between the client and the server during the execution of the query. This
measure is typically expressed in bytes or megabytes transferred, or as a rate in
bytes per second, and is a useful measure of the network resources consumed by
a query.

Measuring the cost of a query in a DBMS is important for optimizing query performance
and improving system scalability. By understanding the resources consumed by a query,
DBMS professionals can make informed decisions about query optimization, indexing,
partitioning, and other performance tuning techniques.
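
As a rough illustration of how disk I/O can be turned into a number, the sketch below
estimates disk time as (block transfers x transfer time) + (seeks x seek time). The class
name DiskCostModel and the timing values are assumptions chosen for the example, not
measurements from any real device.

```python
from dataclasses import dataclass


@dataclass
class DiskCostModel:
    # Assumed, illustrative timings; real values depend on the storage device.
    transfer_ms: float = 0.1  # time to transfer one block
    seek_ms: float = 4.0      # time for one seek

    def cost_ms(self, block_transfers: int, seeks: int) -> float:
        """Estimated disk time in milliseconds."""
        return block_transfers * self.transfer_ms + seeks * self.seek_ms


model = DiskCostModel()
# Hypothetical plans: scan 1000 consecutive blocks, or follow an index
# that needs 4 random block reads.
full_scan = model.cost_ms(block_transfers=1000, seeks=1)
index_lookup = model.cost_ms(block_transfers=4, seeks=4)
print(f"full scan: {full_scan:.1f} ms, index lookup: {index_lookup:.1f} ms")
```

Under these assumed timings the index lookup is cheaper even though each of its block
reads pays for a seek, because it touches far fewer blocks.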

Selection Operation in DBMS:

In a DBMS, the selection operation retrieves the subset of rows from a table that
meet a specified condition. In relational algebra it is written σ; in SQL it corresponds
to the WHERE clause.
The syntax for the selection operation in SQL is as follows:

SELECT column1, column2, ...
FROM table
WHERE condition;

In this syntax, SELECT is used to specify the columns to be retrieved, FROM is used to
specify the table to be queried, and WHERE is used to specify the condition that the rows
must meet in order to be retrieved.

The condition in the WHERE clause is typically a comparison between a column in the
table and a constant value, a comparison between two columns in the table, or a logical
combination of multiple conditions using logical operators such as AND, OR, and NOT.

For example, the following SQL statement retrieves all rows from the "customers" table
where the "city" column is equal to "New York":

SELECT *
FROM customers
WHERE city = 'New York';

The selection operation is an important part of querying data in a DBMS, and it can be
used to filter and retrieve specific subsets of data based on a wide range of conditions.
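
One way to picture how a selection can be carried out is to compare a linear scan with a
lookup through an index. The following is a minimal in-memory sketch, assuming a small
hypothetical customers list; the helper names select_scan, build_index, and
select_indexed are illustrative and not part of any DBMS API.

```python
from collections import defaultdict

# Hypothetical sample data.
customers = [
    {"id": 1, "name": "Ana", "city": "New York"},
    {"id": 2, "name": "Ben", "city": "Boston"},
    {"id": 3, "name": "Cho", "city": "New York"},
]


def select_scan(rows, predicate):
    """Linear scan: test the predicate against every row."""
    return [row for row in rows if predicate(row)]


def build_index(rows, column):
    """Hash index: maps each value of `column` to the rows holding it."""
    index = defaultdict(list)
    for row in rows:
        index[row[column]].append(row)
    return index


def select_indexed(index, value):
    """Equality selection through the index; only matching rows are touched."""
    return index.get(value, [])


scan_result = select_scan(customers, lambda r: r["city"] == "New York")
city_index = build_index(customers, "city")
index_result = select_indexed(city_index, "New York")
```

Both calls return the same rows; the scan examines every row, while the index lookup
goes directly to the matching ones at the price of building and storing the index.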

Sorting in DBMS:
Sorting is the process of arranging data in a specific order, typically in ascending or
descending order, based on the values of one or more columns. Sorting is an important
operation in a DBMS and is commonly used to display query results in a meaningful way
or to prepare data for further analysis.

In SQL, the ORDER BY clause is used to sort the results of a query based on one or more
columns. The syntax for the ORDER BY clause is as follows:

SELECT column1, column2, ...
FROM table
WHERE condition
ORDER BY column1 [ASC|DESC], column2 [ASC|DESC], ...;

In this syntax, the ORDER BY clause is used to specify one or more columns by which the
results should be sorted. The optional ASC or DESC keyword is used to specify whether the
sort should be in ascending or descending order; ascending is the default.

For example, the following SQL statement retrieves all rows from the "employees" table
where the "department" column is equal to "Sales", and sorts the results by the "salary"
column in descending order:

SELECT *
FROM employees
WHERE department = 'Sales'
ORDER BY salary DESC;

Sorting can be an expensive operation, especially on large datasets that do not fit in
memory. To improve query performance, it is common to create indexes on the columns used
in the ORDER BY clause, so that rows can be read in sorted order, and to limit the SELECT
clause to only the columns needed, which reduces the amount of data being sorted.
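
The notes do not describe a sorting algorithm. A standard technique for data larger than
memory is external sort-merge: sort fixed-size runs, then merge the sorted runs. The
sketch below assumes that approach and keeps the runs in memory for brevity; the function
name external_sort and the run_size parameter are illustrative.

```python
import heapq
from itertools import islice


def external_sort(rows, key, run_size):
    """Sort `rows` by building sorted runs of at most `run_size` rows and merging them."""
    iterator = iter(rows)
    runs = []
    while True:
        run = list(islice(iterator, run_size))
        if not run:
            break
        run.sort(key=key)  # a real system would write this run to disk
        runs.append(run)
    # k-way merge of the sorted runs
    return list(heapq.merge(*runs, key=key))


# Hypothetical salary values, sorted in descending order (as in ORDER BY salary DESC).
salaries = [52000, 48000, 75000, 61000, 39000, 88000, 45000]
sorted_desc = external_sort(salaries, key=lambda s: -s, run_size=3)
```

Here run_size stands in for the amount of data that fits in memory at once; only one run
needs to be held while it is being sorted.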

Join Operation in DBMS:

The join operation is used in a DBMS to combine rows from two or more tables based
on a related column between them. The join operation is an important part of querying
data in a relational database and allows users to combine data from multiple tables into
a single result set.

In SQL, there are several types of join operations, including:

1. Inner join: The inner join operation returns only the rows from both tables where
the join condition is true.

SELECT *
FROM table1
INNER JOIN table2
ON table1.column = table2.column;

2. Left join: The left join operation returns all the rows from the left table and
matching rows from the right table. If there is no matching row in the right table,
the result set will contain NULL values for the right table columns.

SELECT *
FROM table1
LEFT JOIN table2
ON table1.column = table2.column;

3. Right join: The right join operation returns all the rows from the right table and
matching rows from the left table. If there is no matching row in the left table, the
result set will contain NULL values for the left table columns.

SELECT *
FROM table1
RIGHT JOIN table2
ON table1.column = table2.column;

4. Full outer join: The full outer join operation returns all the rows from both tables,
including those with no matching rows in the other table. If there is no matching row in
one of the tables, the result set will contain NULL values for the missing columns.

SELECT *
FROM table1
FULL OUTER JOIN table2
ON table1.column = table2.column;

The join operation is an important part of querying data in a DBMS, and it can be used
to combine data from multiple tables based on a wide range of conditions. Joining tables
can be an expensive operation, especially on large datasets, so it is important to optimize
queries by using appropriate indexes and limiting the number of columns included in the
SELECT clause.
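
The difference between an inner join and a left join is easiest to see on a row that has
no match. The sketch below runs both against an in-memory SQLite database through
Python's sqlite3 module; the employees and departments tables and their contents are
hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Research');
    INSERT INTO employees VALUES (1, 'Ana', 1), (2, 'Ben', 1), (3, 'Cho', NULL);
""")

# Inner join: only employees with a matching department.
inner_rows = conn.execute("""
    SELECT e.name, d.name
    FROM employees AS e
    INNER JOIN departments AS d ON e.department_id = d.id
    ORDER BY e.id
""").fetchall()

# Left join: every employee; department columns are NULL when there is no match.
left_rows = conn.execute("""
    SELECT e.name, d.name
    FROM employees AS e
    LEFT JOIN departments AS d ON e.department_id = d.id
    ORDER BY e.id
""").fetchall()
conn.close()
```

The employee without a department is dropped by the inner join and appears in the left
join with None (SQL NULL) in the department column.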

Other Operations in DBMS:

Apart from the selection and join operations, there are several other operations that are
commonly used in a DBMS to manipulate data. These include:

1. Projection: The projection operation is used to select specific columns from a
table, while discarding the rest. The syntax for the projection operation in SQL is:

SELECT column1, column2, ...
FROM table;

2. Aggregation: The aggregation operation is used to calculate summary statistics
for groups of rows in a table. Common aggregation functions include COUNT,
SUM, AVG, MIN, and MAX. The syntax for the aggregation operation in SQL is:

SELECT column1, aggregate_function(column2)
FROM table
GROUP BY column1;

3. Subquery: A subquery is a query that is embedded within another query.
Subqueries can be used to retrieve data that will be used in the main query, or to
perform complex filtering or aggregation operations. The syntax for a subquery in
SQL is:

SELECT column1, column2, ...
FROM table1
WHERE column1 IN (SELECT column1 FROM table2 WHERE condition);

4. Set operations: Set operations are used to combine the results of two or more queries
into a single result set. The common set operations include UNION, INTERSECT, and
EXCEPT. The syntax for the UNION set operation in SQL is:

SELECT column1, column2, ...
FROM table1
UNION
SELECT column1, column2, ...
FROM table2;

5. Modification operations: Modification operations are used to modify the data in a
table. The common modification operations include INSERT, UPDATE, and
DELETE. The syntax for the INSERT operation in SQL is:

INSERT INTO table (column1, column2, ...)
VALUES (value1, value2, ...);
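
The operations above can be tried together on a small example. This sketch uses Python's
sqlite3 module with an in-memory database; the orders and vip tables and their rows are
hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER);
    CREATE TABLE vip (customer TEXT);
    INSERT INTO orders (customer, amount) VALUES
        ('Ana', 100), ('Ana', 50), ('Ben', 70), ('Cho', 20);
    INSERT INTO vip VALUES ('Ana'), ('Dee');
""")

# Aggregation: total order amount per customer.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()

# Subquery: orders placed by customers listed in the vip table.
vip_orders = conn.execute(
    "SELECT id FROM orders WHERE customer IN (SELECT customer FROM vip) ORDER BY id"
).fetchall()

# Set operation: UNION removes duplicates across both inputs.
all_customers = conn.execute(
    "SELECT customer FROM orders UNION SELECT customer FROM vip ORDER BY customer"
).fetchall()

# Modification: UPDATE changes existing rows in place.
conn.execute("UPDATE orders SET amount = amount + 5 WHERE customer = 'Cho'")
cho_amount = conn.execute(
    "SELECT amount FROM orders WHERE customer = 'Cho'"
).fetchone()[0]
conn.close()
```

Note that 'Ana' appears in both tables but only once in the UNION result.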

Evaluation of Expressions in DBMS:

In a DBMS, expressions are used to perform calculations or comparisons on data. The
evaluation of expressions is an important part of processing queries and retrieving data
from a database. The steps involved in evaluating expressions in a DBMS are:

1. Parsing: The query parser first reads the query, breaks it down into smaller
units, such as keywords, identifiers, operators, and constants, and checks that
the query is syntactically correct. This process is called parsing.
2. Semantic analysis: After parsing, the DBMS performs semantic analysis to
check that the query is meaningful. It checks for errors such as undefined
tables or columns, ambiguous column names, and operators applied to
incompatible types.
3. Optimization: Once the query is parsed and validated, the query optimizer
identifies the most efficient way to execute the query. This involves selecting the
most appropriate access path, join order, and join method.
4. Execution: After optimization, the query engine executes the query by evaluating
each expression in the query. Expressions can be evaluated using either tuple-at-
a-time or block-at-a-time processing.
• Tuple-at-a-time processing: In tuple-at-a-time processing, each row of data is
processed one at a time. This approach is best suited for small result sets or
queries with complex expressions.
• Block-at-a-time processing: In block-at-a-time processing, a block of rows is
processed at once. This approach is best suited for large result sets or queries
with simple expressions.
5. Result generation: Finally, the query engine generates the result set by combining
the rows of data that meet the query conditions. The result set is returned to the
user or application that issued the query.

The evaluation of expressions in a DBMS is a complex process that involves several
steps, including parsing, semantic analysis, optimization, execution, and result
generation. By optimizing the evaluation of expressions, a DBMS can efficiently process
large amounts of data and retrieve results quickly.
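
The two execution styles named in step 4 can be sketched in a few lines. In this
illustrative example, tuple-at-a-time processing is modelled with Python generators that
pass one row at a time between operators, and block-at-a-time processing handles a fixed
number of rows per step. The employees data, the operator functions, and block_size are
all assumptions made for the example.

```python
from itertools import islice

# Hypothetical sample data.
employees = [
    {"name": "Ana", "department": "Sales", "salary": 5200},
    {"name": "Ben", "department": "Sales", "salary": 4800},
    {"name": "Cho", "department": "Research", "salary": 7500},
    {"name": "Dee", "department": "Sales", "salary": 6100},
]


def scan(rows):
    for row in rows:
        yield row


def select(rows, predicate):
    for row in rows:
        if predicate(row):
            yield row


def project(rows, columns):
    for row in rows:
        yield tuple(row[c] for c in columns)


def tuple_at_a_time(rows):
    """Each row flows through scan, select, and project before the next one is read."""
    plan = project(select(scan(rows), lambda r: r["department"] == "Sales"), ["name"])
    return list(plan)


def block_at_a_time(rows, block_size):
    """Rows are read and processed in blocks of `block_size`."""
    iterator = iter(rows)
    result = []
    while True:
        block = list(islice(iterator, block_size))
        if not block:
            break
        kept = [r for r in block if r["department"] == "Sales"]
        result.extend((r["name"],) for r in kept)
    return result


pipelined = tuple_at_a_time(employees)
blocked = block_at_a_time(employees, block_size=2)
```

Both styles produce the same result; they differ in how many rows are in flight at once
and how often control passes between operators.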

Transformation of Relational Expressions in DBMS:

In a relational database, queries are expressed as relational expressions using operators
such as SELECT, PROJECT, JOIN, and UNION. These expressions are then transformed or
optimized by the query optimizer to improve the efficiency of the query execution. The
transformation of relational expressions in a DBMS involves the following steps:

1. Algebraic simplification: In this step, the query optimizer simplifies the relational
expression using algebraic identities and properties. For example, because
selection distributes over union, the optimizer can rewrite σp(R ∪ S) as
σp(R) ∪ σp(S), filtering each input before the two are combined.
2. Predicate push-down: In this step, the optimizer pushes down predicates
(conditions) in a query to the lowest possible level in the expression tree. This
reduces the number of rows that need to be processed and improves query
performance.
3. Join reordering: In this step, the optimizer reorders the join operations in a query
to reduce the number of intermediate results that need to be stored and
processed. This can significantly improve query performance for queries with
multiple joins.
4. Subquery optimization: In this step, the optimizer optimizes subqueries by
selecting the most efficient access path and join method. This can reduce the
overall cost of the query execution.
5. Index selection: In this step, the optimizer selects the most appropriate indexes to
use for the query. This can improve query performance by reducing the number
of disk accesses needed to retrieve the data.
6. View merging: In this step, the optimizer combines views in a query to eliminate
redundant computations and reduce the number of intermediate results that
need to be stored and processed.
7. Query rewrite: In this step, the optimizer rewrites the query using alternative
expressions that have the same meaning but are more efficient to execute.

The transformation of relational expressions in a DBMS is an important aspect of query
optimization. By applying these transformations, the optimizer can improve the
efficiency of query execution and reduce the overall cost of processing queries.
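
Predicate push-down from the list above can be checked on a toy example. The sketch below
evaluates the same query two ways: filtering after the join, and filtering one input
before the join. The orders and customers data and the join helper are hypothetical; the
point is that both plans return the same rows while the second joins against fewer rows.

```python
def join(left, right, left_key, right_key):
    """Nested-loop equi-join; returns the merged rows."""
    return [{**l, **r} for l in left for r in right if l[left_key] == r[right_key]]


# Hypothetical sample data: 8 orders spread over 4 customers.
orders = [{"order_id": i, "customer_id": i % 4, "amount": 10 * i} for i in range(1, 9)]
customers = [
    {"customer_id": c, "city": city}
    for c, city in enumerate(["New York", "Boston", "Austin", "Boston"])
]


def is_boston(row):
    return row["city"] == "Boston"


# Plan A: join everything, then apply the selection.
joined = join(orders, customers, "customer_id", "customer_id")
plan_a = [row for row in joined if is_boston(row)]

# Plan B: push the selection down to customers, then join.
boston_customers = [c for c in customers if is_boston(c)]
plan_b = join(orders, boston_customers, "customer_id", "customer_id")
```

Plan A builds an 8-row intermediate result and discards half of it; plan B joins against
only the two Boston customers and produces the 4 result rows directly.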

Estimating Statistics of Expression Results in DBMS:

In a DBMS, statistics are used to estimate the size and selectivity of query expressions.
The estimation of statistics is an important part of query optimization, as it helps the
query optimizer to choose the most efficient query plan.

The following are some of the techniques used in a DBMS to estimate statistics of
expression results:

1. Sampling: In this technique, a subset of the data is randomly selected and
analyzed to estimate the statistics of the full dataset. This is a commonly used
technique as it is fast and can provide reasonably accurate estimates.
2. Histograms: In this technique, the range of values of a column is divided into
several buckets, and the number of rows falling into each bucket is recorded.
The bucket counts are then used to estimate the selectivity of a predicate on
that column.
3. Index statistics: In this technique, the optimizer uses statistics from the database
index to estimate the selectivity of the query. This technique is often used for
queries that involve indexed columns.
4. Cost-based estimation: In this technique, the query optimizer uses a cost-based
model to estimate the statistics of the expression results. The cost model
considers factors such as the size of the data, the number of join operations, and
the selectivity of the query to estimate the cost of executing the query.

The accuracy of the statistics estimation depends on the quality and quantity of the data
available. To improve the accuracy of the estimation, the DBMS may use a combination
of techniques and may continuously update the statistics based on the changes in the
database. By accurately estimating the statistics of expression results, the DBMS can
optimize query plans and improve the efficiency of query processing.
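
As an illustration of the histogram technique, the sketch below builds an equi-width
histogram over a hypothetical age column and uses it to estimate how many rows satisfy
age < 35, assuming values are spread uniformly inside each bucket. The function names and
the data are assumptions for the example.

```python
def build_histogram(values, low, high, buckets):
    """Equi-width histogram: number of values in each of `buckets` ranges over [low, high)."""
    width = (high - low) / buckets
    counts = [0] * buckets
    for v in values:
        index = min(int((v - low) / width), buckets - 1)
        counts[index] += 1
    return counts, width


def estimate_less_than(counts, width, low, bound):
    """Estimate rows with value < bound, assuming uniform spread within each bucket."""
    estimate = 0.0
    for i, count in enumerate(counts):
        start = low + i * width
        end = start + width
        if bound >= end:
            estimate += count                            # whole bucket qualifies
        elif bound > start:
            estimate += count * (bound - start) / width  # part of the bucket qualifies
    return estimate


# Hypothetical column values.
ages = [21, 22, 25, 29, 31, 33, 38, 41, 47, 52, 58, 59]
counts, width = build_histogram(ages, low=20, high=60, buckets=4)
estimated = estimate_less_than(counts, width, low=20, bound=35)
actual = sum(1 for a in ages if a < 35)
selectivity = estimated / len(ages)
```

The estimate (5.5 rows) is close to, but not equal to, the true count (6 rows); the gap
comes from the uniformity assumption inside the bucket that the bound cuts through.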

Choice of Evaluation Plans in DBMS:

The choice of evaluation plans in a DBMS depends on several factors, such as the size of
the database, the complexity of the query, and the available hardware resources. The
query optimizer in a DBMS selects the most efficient evaluation plan based on these
factors. The following are some of the factors that affect the choice of evaluation plans:

1. Index selection: The query optimizer may choose to use an index for a particular
query if it can significantly reduce the number of rows that need to be processed.
However, the use of an index may not always be the most efficient option,
especially if the index is not selective enough or if the cost of accessing the index
is too high.
2. Join order selection: In a query with multiple join operations, the order in which
the joins are performed can significantly affect the query performance. The query
optimizer may try different join orders and choose the one that results in the
lowest cost.
3. Join algorithm selection: There are several algorithms for performing joins, such
as nested loop join, hash join, and sort-merge join. The choice of join algorithm
depends on the size of the input tables, the available memory, and the available
CPU resources.
4. Parallelism: The query optimizer may choose to execute parts of a query in
parallel if there are multiple CPU cores available. Parallelism can significantly
improve query performance by allowing multiple operations to be executed
simultaneously.
5. Materialization: The query optimizer may choose to materialize intermediate
results if it can improve query performance. Materialization involves storing
intermediate results in a temporary table, which can be used later in the query
execution.
6. Subquery optimization: The query optimizer may choose to optimize subqueries
by selecting the most efficient access path and join method. This can reduce the
overall cost of the query execution.
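
To make the join algorithm choice in item 3 concrete, here is a minimal sketch of two of
the algorithms named there, nested loop join and hash join, on hypothetical in-memory
data. It is a simplification: real implementations work on disk blocks and must handle
inputs that do not fit in memory.

```python
from collections import defaultdict


def nested_loop_join(outer, inner, key):
    """Compare every outer row with every inner row."""
    result = []
    comparisons = 0
    for o in outer:
        for i in inner:
            comparisons += 1
            if o[key] == i[key]:
                result.append({**o, **i})
    return result, comparisons


def hash_join(build, probe, key):
    """Build a hash table on one input, then probe it once per row of the other."""
    table = defaultdict(list)
    for b in build:
        table[b[key]].append(b)
    result = []
    for p in probe:
        for b in table.get(p[key], []):
            result.append({**b, **p})
    return result


# Hypothetical sample data: 3 departments, 6 employees.
departments = [
    {"dept_id": d, "dept": name}
    for d, name in enumerate(["Sales", "Research", "Support"])
]
employees = [{"emp": f"e{n}", "dept_id": n % 3} for n in range(6)]

nl_rows, nl_comparisons = nested_loop_join(employees, departments, "dept_id")
hj_rows = hash_join(departments, employees, "dept_id")
```

The nested loop performs 6 x 3 = 18 comparisons, while the hash join reads each input
once; the trade-off is the memory needed for the hash table, which is why available
memory appears among the selection factors.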

The choice of evaluation plans in a DBMS is a complex process that involves considering
various factors and selecting the plan that results in the lowest cost. By selecting the
most efficient evaluation plan, the DBMS can improve the performance of query
processing and reduce the overall cost of executing queries.
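
Join order selection can be sketched as a search over candidate orders, each scored with
estimated intermediate result sizes. The example below is a deliberately simple model,
not how any particular optimizer works: the table cardinalities, the join selectivities,
and the cost function (sum of intermediate sizes for a left-deep plan) are all assumptions
made for illustration.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical statistics: number of rows per table.
cardinality = {"orders": 10000, "customers": 500, "regions": 10}

# Hypothetical selectivity of the join predicate between two tables.
# Pairs with no entry have no join predicate (a cross product).
selectivity = {
    frozenset({"orders", "customers"}): Fraction(1, 500),
    frozenset({"customers", "regions"}): Fraction(1, 10),
}


def plan_cost(order):
    """Sum of estimated intermediate result sizes for a left-deep join order."""
    joined = {order[0]}
    size = Fraction(cardinality[order[0]])
    cost = Fraction(0)
    for table in order[1:]:
        factor = Fraction(1)
        for other in joined:
            factor *= selectivity.get(frozenset({table, other}), Fraction(1))
        size = size * cardinality[table] * factor
        cost += size
        joined.add(table)
    return cost


costs = {order: plan_cost(order) for order in permutations(cardinality)}
best_order = min(costs, key=costs.get)
worst_order = max(costs, key=costs.get)
```

With these numbers the cheapest orders join customers with regions first (estimated cost
10500), while orders that start with the orders x regions cross product cost 110000.
Exhaustive enumeration is only feasible for a handful of tables.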

Materialized Views in DBMS:

A materialized view in a DBMS is a precomputed table that stores the results of a query.
Materialized views are used to improve the performance of queries by reducing the
amount of work that needs to be done at query execution time. The following are some
of the benefits of using materialized views in a DBMS:

1. Faster query performance: Materialized views can significantly improve the
performance of queries that access large amounts of data. By precomputing the
results of a query and storing them in a materialized view, the DBMS can avoid
expensive join and aggregation operations at query execution time.
2. Reduced workload on the database: Materialized views can reduce the workload
on the database by precomputing the results of a query and storing them in a
table. This can reduce the amount of work that needs to be done at query
execution time and improve the overall performance of the database.
3. Better scalability: Materialized views can improve the scalability of a database by
reducing the amount of work that needs to be done at query execution time. This
can allow the database to handle larger amounts of data and more complex
queries.
4. Query optimization: Materialized views can be used by the query optimizer to
improve query plans. By using materialized views, the query optimizer can choose
a more efficient plan that involves accessing the precomputed results rather than
performing expensive join and aggregation operations.
5. Consistent results across queries: Every query that reads the materialized
view sees the same precomputed result, so reports built on it agree with one
another. The result matches the underlying data only as of the last refresh,
unless the DBMS maintains the view immediately when the base tables change.

However, materialized views also have some disadvantages. The precomputed results
may become stale if the underlying data changes, which can lead to inconsistent query
results. To address this issue, the DBMS may need to refresh the materialized views
periodically or in response to changes in the underlying data. Materialized views can
also consume significant storage space, which can be a concern for large databases.

Overall, materialized views can be a useful tool for improving query performance in a
DBMS, but their use should be carefully considered and balanced against the potential
disadvantages.
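
The staleness trade-off can be demonstrated with a small emulation. SQLite has no
materialized views, so the sketch below stores the query result in an ordinary table and
refreshes it by hand; it is an assumption-laden stand-in for what a DBMS with native
materialized views does internally. The table names and the
refresh_sales_by_region helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('East', 100), ('East', 50), ('West', 70);
    -- Plays the role of the materialized view: a stored aggregation result.
    CREATE TABLE sales_by_region (region TEXT PRIMARY KEY, total INTEGER);
""")


def refresh_sales_by_region(connection):
    """Full refresh: recompute the stored result from the base table."""
    with connection:
        connection.execute("DELETE FROM sales_by_region")
        connection.execute(
            "INSERT INTO sales_by_region "
            "SELECT region, SUM(amount) FROM sales GROUP BY region"
        )


def read_view(connection):
    return connection.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"
    ).fetchall()


refresh_sales_by_region(conn)
fresh = read_view(conn)

conn.execute("INSERT INTO sales VALUES ('West', 30)")
stale = read_view(conn)  # the base table changed, the stored result did not

refresh_sales_by_region(conn)
refreshed = read_view(conn)
conn.close()
```

Reading the stored table avoids the aggregation at query time, but between the insert and
the second refresh it reports the old West total, which is the staleness described above.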
