Ivunit Query Processing

Query processing involves compiling and executing database queries. It has two phases: compile-time and runtime. During compile-time, the query is optimized and an execution plan is generated. During runtime, the execution plan is carried out and results are returned. The goal of query optimization is to find the most efficient execution plan with the lowest estimated cost in terms of resources like disk access, CPU usage, and memory usage. Common techniques for query optimization include exhaustive search, heuristics-based rules, and query rewriting.

Uploaded by

Keshava Varma


IV UNIT

QUERY PROCESSING

Definition. Query processing denotes the compilation and execution of a query specification, usually expressed in a declarative database query language such as the Structured Query Language (SQL). Query processing consists of a compile-time phase and a runtime phase.

Introduction to Query Processing

• Query processing is the translation of high-level queries into low-level expressions.
• It is a stepwise process that covers translation to the physical level of the file system, query optimization, and the actual execution of the query to get the result.
• It requires the basic concepts of relational algebra and file structure.
• It refers to the range of activities that are involved in extracting data from the database.
• It includes the translation of queries in high-level database languages into expressions that can be implemented at the physical level of the file system.
• In studying query processing, we see how these queries are processed and how they are optimized.

In the above diagram,

• The first step is to transform the query into a standard form.
• A query written in SQL is translated into a relational algebra expression. During this process, the parser checks the syntax and verifies the relations and the attributes that are used in the query.
• The second step is the query optimizer. It transforms the query into equivalent expressions that are more efficient to execute.
• The third step is query evaluation. It executes the query execution plan produced by the optimizer and returns the result.

Translating SQL Queries into Relational Algebra

Example

SELECT Ename FROM Employee

WHERE Salary > 5000;

Translated into Relational Algebra Expression

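π Ename (σ Salary > 5000 (Employee))

OR
π Ename (σ Salary > 5000 (π Ename, Salary (Employee)))

Note that the selection on Salary cannot be applied after a projection that keeps only Ename, because the Salary attribute would no longer be available.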

A sequence of primitive operations that can be used to evaluate a query is called a Query Execution Plan or Query Evaluation Plan.
• The above diagram indicates that the query execution engine takes a query execution plan and returns the answers to the query.
• The optimizer chooses the query execution plan that minimizes the estimated cost of query evaluation.
MEASURES OF QUERY COST

Although a system can create multiple evaluation plans for a query, the plan that is chosen should be the best of them. It is found by comparing the possible plans in terms of their estimated cost. To calculate the net estimated cost of a plan, the cost of each operation within the plan is determined, and these costs are combined to get the net estimated cost of the query evaluation plan.

The cost estimation of a query evaluation plan is calculated in terms of various resources that
include:

o Number of disk accesses

o Execution time taken by the CPU to execute a query
o Communication costs in distributed or parallel database systems.

To estimate the cost of a query evaluation plan, we use the number of blocks transferred from the disk and the number of disk seeks. Suppose the disk has an average block access time of tS seconds and takes an average of tT seconds to transfer one data block. The block access time is the sum of the disk seek time and the rotational latency. A plan that transfers b blocks and performs S seeks then takes b*tT + S*tS seconds. Typical values are tS = 4 ms and tT = 0.1 ms, assuming a block size of 4 KB and a transfer rate of 40 MB per second. With these values, we can easily calculate the estimated cost of a given query evaluation plan.

The response time for a query-evaluation plan (that is, the wall-clock time required to execute
the plan), assuming no other activity is going on in the computer, would account for all these
costs, and could be used as a measure of the cost of the plan. Unfortunately, the response time of
a plan is very hard to estimate without actually executing the plan, for the following reasons:

1. The response time depends on the contents of the buffer when the query begins execution; this
information is not available when the query is optimized, and is hard to account for even if it
were available.
2. In a system with multiple disks, the response time depends on how accesses are distributed
among disks, which is hard to estimate without detailed knowledge of data layout on disk.

tT – time to transfer one block
tS – time for one seek

Cost for b block transfers plus S seeks:
b * tT + S * tS

We ignore CPU costs for simplicity. Real systems do take CPU cost into account.
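
The cost formula above can be tried out with a few lines of Python. This is only a minimal sketch: the function names are hypothetical, the plan with 100 block transfers and 1 seek is an invented example, and decimal units (4 KB = 4,000 bytes, 40 MB = 40,000,000 bytes) are assumed so that tT comes out at 0.1 ms as in the text.

```python
def transfer_time(block_bytes, bytes_per_second):
    """Time in seconds to transfer one block (tT)."""
    return block_bytes / bytes_per_second


def plan_cost(b, S, t_T, t_S):
    """Estimated cost in seconds of b block transfers plus S seeks."""
    return b * t_T + S * t_S


# Values from the text: 4 KB blocks, 40 MB/s transfer rate, 4 ms per seek.
t_T = transfer_time(4_000, 40_000_000)  # decimal units assumed
t_S = 4e-3

# Hypothetical plan: read 100 consecutive blocks after a single seek.
cost = plan_cost(b=100, S=1, t_T=t_T, t_S=t_S)
print(f"tT = {t_T * 1000:.1f} ms, estimated cost = {cost * 1000:.1f} ms")
```

For this invented plan the estimate is about 14 ms, of which 10 ms is transfer time and 4 ms is the single seek. CPU cost is ignored here, as in the notes.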
SELECTION OPERATION

Unary Relational Operations

• SELECT (symbol: σ)
• PROJECT (symbol: π)
• RENAME (symbol: ρ)

Relational Algebra Operations From Set Theory

• UNION (∪)
• INTERSECTION (∩)
• DIFFERENCE (−)
• CARTESIAN PRODUCT (×)

Binary Relational Operations

• JOIN
• DIVISION

Evaluation of Expressions

The obvious way to evaluate a relational algebra expression that contains multiple operations is to evaluate one operation at a time, in an appropriate order.

The result of each evaluation is materialized in a temporary relation for subsequent use. A
disadvantage to this approach is the need to construct the temporary relations, which (unless they
are small) must be written to disk. An alternative approach is to evaluate several operations
simultaneously in a pipeline, with the results of one operation passed on to the next, without the
need to store a temporary relation.

The two approaches are:

1. Materialization

2. Pipelining

Let us discuss these methods briefly.

Materialization

In this method, the expression is evaluated one relational operation at a time, in an appropriate order. The output of each operation is materialized in a temporary relation for subsequent use. This is the disadvantage of the materialization method: temporary relations must be constructed to hold the results of the evaluated operations, and unless they are small in size they must be written to disk.

Pipelining

Pipelining is an alternative to the materialization method. In pipelining, several relational operations of the expression are evaluated simultaneously in a pipeline. As soon as one operation produces output, that output is passed on to the next operation, and the chain continues until all the relational operations have been evaluated. Thus, there is no need to store a temporary relation in pipelining.
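
The difference between the two methods can be sketched in Python for the expression π Ename (σ Salary > 5000 (Employee)). This is only an illustration, not how a real DBMS is implemented: the Employee rows and all function names are hypothetical. The materialized version builds a complete temporary list after each operation, while the pipelined version uses generators so that each tuple flows through both operations without a temporary relation.

```python
# Hypothetical Employee relation, used only for this illustration.
employee = [
    {"Ename": "Asha", "Salary": 7000},
    {"Ename": "Ravi", "Salary": 4000},
    {"Ename": "Meena", "Salary": 9000},
]


# Materialization: every operation returns a complete temporary relation.
def select_materialized(rows, predicate):
    return [row for row in rows if predicate(row)]


def project_materialized(rows, attributes):
    return [{a: row[a] for a in attributes} for row in rows]


# Pipelining: every operation yields one tuple at a time to the next one.
def select_pipelined(rows, predicate):
    for row in rows:
        if predicate(row):
            yield row


def project_pipelined(rows, attributes):
    for row in rows:
        yield {a: row[a] for a in attributes}


def high_salary(row):
    return row["Salary"] > 5000


temp = select_materialized(employee, high_salary)        # temporary relation
materialized_result = project_materialized(temp, ["Ename"])

pipeline = project_pipelined(select_pipelined(employee, high_salary), ["Ename"])
pipelined_result = list(pipeline)  # tuples are pulled through one by one

print(materialized_result == pipelined_result)
```

Both versions return the same tuples. The only difference is that the pipelined version never holds the intermediate result of the selection as a whole.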
Query Optimization

A query optimizer translates a query into a sequence of physical operators that can be directly carried out by the query execution engine. ... The goal of query optimization is to derive an efficient execution plan in terms of relevant performance measures, such as memory usage and query response time.


Optimizer Components
Query optimization is done with the following aims:
• Minimize the response time of a query (the time taken to produce the results of the user's query).
• Maximize system throughput (the number of requests that are processed in a given amount of time).
• Reduce the amount of memory and storage required for processing.
• Increase parallelism.

Query Transformer

For some statements, the query transformer determines whether it is advantageous to rewrite the
original SQL statement into a semantically equivalent SQL statement with a lower cost.

When a viable alternative exists, the database calculates the cost of the alternatives separately
and chooses the lowest-cost alternative. The following graphic shows the query transformer
rewriting an input query that uses OR into an output query that uses UNION ALL.

Query Transformer
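
The kind of rewrite described above can be checked on a small example. The sketch below uses Python's built-in sqlite3 module only to show that a query with OR and a hand-written UNION ALL form return the same rows; the emp table, its columns and its data are hypothetical, and the rewrite is done by hand here, not by any optimizer. The second branch excludes the rows already returned by the first branch, so no row appears twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_id INTEGER, dept_id INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, 10, 4000), (2, 20, 6000), (3, 10, 7000), (4, 30, 3000), (5, None, 8000)],
)

# Original form: a single query with OR.
or_query = "SELECT emp_id FROM emp WHERE dept_id = 10 OR salary > 5000"

# Hand-written equivalent using UNION ALL. The extra condition in the second
# branch removes rows that the first branch has already produced.
union_all_query = """
    SELECT emp_id FROM emp WHERE dept_id = 10
    UNION ALL
    SELECT emp_id FROM emp
    WHERE salary > 5000 AND (dept_id <> 10 OR dept_id IS NULL)
"""

or_rows = sorted(row[0] for row in conn.execute(or_query))
union_rows = sorted(row[0] for row in conn.execute(union_all_query))
print(or_rows, union_rows)
```

Whether the rewritten form is actually cheaper depends on the available indexes and statistics, which is why the transformer compares the estimated costs of the alternatives before choosing one.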
Approaches to Query Optimization

Among the approaches to query optimization, exhaustive search and heuristics-based algorithms are the most widely used.
Exhaustive Search Optimization
In these techniques, all possible query plans for a query are generated first, and then the best plan is selected. Although these techniques find the best solution, they have exponential time and space complexity owing to the large solution space. An example is the dynamic programming technique.
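
The idea of exhaustive search can be sketched as a brute-force enumeration of join orders. Note that this is plain enumeration, not the dynamic programming technique named above, and that everything in it is an assumption made for illustration: the relation sizes, the single join selectivity applied to every join, and the toy cost model that simply adds up the sizes of the intermediate results.

```python
from itertools import permutations

# Hypothetical relation sizes (number of tuples).
sizes = {"EMP": 500, "DEPT": 5, "PROJECT": 50}

# Hypothetical selectivity, assumed to be the same for every join.
SELECTIVITY = 0.01


def plan_cost(order):
    """Toy cost: total number of tuples in all intermediate join results."""
    rows = sizes[order[0]]
    cost = 0.0
    for name in order[1:]:
        rows = rows * sizes[name] * SELECTIVITY  # estimated size after the join
        cost += rows
    return cost


def exhaustive_search():
    """Generate every left-deep join order and return the cheapest one."""
    plans = list(permutations(sizes))
    best = min(plans, key=plan_cost)
    return best, plan_cost(best), len(plans)


best_order, best_cost, plans_examined = exhaustive_search()
print(best_order, best_cost, plans_examined)
```

With 3 relations only 6 orders have to be examined, but the number of orders grows as n! with the number of relations n, which is the large solution space mentioned above.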
Heuristic Based Optimization
Heuristic based optimization uses rule-based optimization approaches for query optimization.
These algorithms have polynomial time and space complexity, which is lower than the
exponential complexity of exhaustive search-based algorithms. However, these algorithms do
not necessarily produce the best query plan.
Some of the common heuristic rules are:
• Perform select and project operations before join operations. This is done by moving the select and project operations down the query tree. This reduces the number of tuples available for the join.
• Perform the most restrictive select/project operations first, before the other operations.

EXAMPLE:
Estimating Statistics of Expression results in DBMS

In order to determine the ideal plan for evaluating a query, the optimizer checks various details about the tables that are stored in the data dictionary. This information is collected when a table is created and when DDL / DML operations are performed on it. The optimizer checks the data dictionary for:

• Total number of records in a table, nr. This helps to determine which table needs to be accessed first. Usually smaller tables are processed first to reduce the size of the intermediate tables. Hence it is one of the important factors to be checked.
• Total number of records in each block, fr. This is the blocking factor, and it is required to determine whether the table fits in memory or not.
• Total number of blocks assigned to a table, br. Together with nr, this gives the number of records stored in each block. Suppose we have 100 records in a table and the total number of blocks is 20; then fr can be calculated as nr/br = 100/20 = 5.
• Total length of the records in the table, lr. This is an important factor when the size of the records varies significantly between any two tables in the query. If the record length is fixed, there is no significant effect. But when variable-length records are involved in the query, the average length or the actual length needs to be used, depending upon the type of operation.
• Number of unique values for a column, dAr. This is useful when a query uses an aggregation operation or a projection. It provides an estimate of the number of distinct values produced by a projection on the column. When an aggregation operation (e.g., SUM, MAX, MIN, COUNT) is used in the query, it gives the number of groups of records.
• Levels of index, x. This tells whether a single-level index (such as a primary key index or a secondary key index) or a multi-level index (such as a B+ tree index) is used. The number of index levels indicates the number of block accesses required to retrieve the data.
• Selection cardinality of a column, sA. This is the average number of records that share the same value of column A. It is calculated as nr/dAr, i.e., the total number of records divided by the number of distinct values of A. For example, suppose the EMP table has 500 records and DEPT_ID has 5 distinct values. Then the selection cardinality of DEPT_ID in the EMP table is 500/5 = 100. That means that, on average, each department has 100 employees.
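
The two worked calculations above (fr = nr/br and sA = nr/dAr) can be written as small Python functions. This is just a sketch of the arithmetic; the function and parameter names are hypothetical and simply mirror the symbols used in the list.

```python
def blocking_factor(n_r, b_r):
    """fr: average number of records per block, computed as nr / br."""
    return n_r / b_r


def selection_cardinality(n_r, d_Ar):
    """sA: average number of records per distinct value of column A."""
    return n_r / d_Ar


# Figures from the text: 100 records stored in 20 blocks.
print(blocking_factor(n_r=100, b_r=20))

# Figures from the text: EMP has 500 records and DEPT_ID has 5 distinct values.
print(selection_cardinality(n_r=500, d_Ar=5))
```

The second value is the number of EMP records the optimizer would expect an equality condition such as DEPT_ID = 10 to return, under the assumption that values are evenly distributed.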

Choice of Evaluation Plans

One way to choose an evaluation plan for a query expression is simply to choose, for each operation, the cheapest algorithm for evaluating it. We can choose any ordering of the operations that ensures that operations lower in the tree are executed before operations higher in the tree.

The optimizer generates the alternative evaluation plans for a query, and statistics for them are collected based on cost-based evaluation and heuristic methods. It checks the costs based on the different techniques that we have seen so far. It checks the operators, the join types, the indexes, the number of records, the selectivity of records, the distinct values, etc. from the data dictionary. Once all this information is collected, the best evaluation plan is chosen.

EXAMPLE:

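Consider the relations EMP and DEPT. The same query can be written as any of the following equivalent expressions:

∏ EMP_ID, DEPT_NAME (σ DEPT_ID = 10 AND EMP_LAST_NAME = 'Joseph' (EMP) ⋈ DEPT)

Or
∏ EMP_ID, DEPT_NAME (σ DEPT_ID = 10 AND EMP_LAST_NAME = 'Joseph' (EMP ⋈ DEPT))

Or
∏ EMP_ID, DEPT_NAME (σ DEPT_ID = 10 AND EMP_LAST_NAME = 'Joseph' (∏ EMP_ID, DEPT_NAME, DEPT_ID, EMP_LAST_NAME (EMP ⋈ DEPT)))

In the third form the inner projection must keep EMP_LAST_NAME and DEPT_ID, because the selection that follows it refers to both attributes.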
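
To see why the choice among such equivalent expressions matters, the first two forms can be compared on a tiny data set. The sketch below is only an illustration: the EMP and DEPT rows are invented, the join is a simple nested-loop join on DEPT_ID, and the "cost" is just the number of tuple pairs the join has to compare.

```python
# Hypothetical sample data for the EMP and DEPT relations.
emp = [
    {"EMP_ID": 1, "EMP_LAST_NAME": "Joseph", "DEPT_ID": 10},
    {"EMP_ID": 2, "EMP_LAST_NAME": "Mathew", "DEPT_ID": 10},
    {"EMP_ID": 3, "EMP_LAST_NAME": "Joseph", "DEPT_ID": 20},
    {"EMP_ID": 4, "EMP_LAST_NAME": "Rose", "DEPT_ID": 30},
]
dept = [
    {"DEPT_ID": 10, "DEPT_NAME": "Design"},
    {"DEPT_ID": 20, "DEPT_NAME": "Testing"},
    {"DEPT_ID": 30, "DEPT_NAME": "Support"},
]


def nested_loop_join(left, right):
    """Join on DEPT_ID; also return how many tuple pairs were compared."""
    result, compared = [], 0
    for l_row in left:
        for r_row in right:
            compared += 1
            if l_row["DEPT_ID"] == r_row["DEPT_ID"]:
                result.append({**l_row, **r_row})
    return result, compared


def wanted(row):
    return row["DEPT_ID"] == 10 and row["EMP_LAST_NAME"] == "Joseph"


def project(rows):
    return [(row["EMP_ID"], row["DEPT_NAME"]) for row in rows]


def plan_select_first():
    """First form: apply the selection to EMP, then join with DEPT."""
    joined, compared = nested_loop_join([e for e in emp if wanted(e)], dept)
    return project(joined), compared


def plan_join_first():
    """Second form: join EMP with DEPT, then apply the selection."""
    joined, compared = nested_loop_join(emp, dept)
    return project([row for row in joined if wanted(row)]), compared


print(plan_select_first())
print(plan_join_first())
```

Both plans return the same answer, but on this sample data the select-first plan compares 3 tuple pairs in the join while the join-first plan compares 12. A real optimizer estimates such differences from the statistics in the data dictionary instead of running the plans.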
MATERIALIZED VIEW

A materialized view is a database object that contains the results of a query. For example, it may be a local copy of data located remotely, or a subset of the rows and/or columns of a table or join result, or a summary using an aggregate function.

The basic difference between a View and a Materialized View is that Views are not stored physically on the disk. ... A View can be defined as a virtual table created as a result of the query expression. However, a Materialized View is a physical copy, picture or snapshot of the base table.
DIFFERENCES BETWEEN VIEWS AND MATERIALIZED VIEWS:

| Views | Materialized Views |
| --- | --- |
| The query expression is stored in the database system, and not the resulting tuples of the query expression. | The resulting tuples of the query expression are stored in the database system. |
| Views need not be updated every time the relation on which the view is defined is updated, as the tuples of the view are computed every time the view is accessed. | Materialized views must be updated, as the tuples are stored in the database system. They can be updated in one of three ways, depending on the database system. |
| It does not have any storage cost associated with it. | It does have a storage cost associated with it. |
| It does not have any update cost associated with it. | It does have an update cost associated with it. |
| There is an SQL standard for defining a view. | There is no SQL standard for defining a materialized view, and the functionality is provided by some database systems as an extension. |
| Views are useful when the view is accessed infrequently. | Materialized views are efficient when the view is accessed frequently, as they save computation time by storing the results beforehand. |
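
The difference in update behaviour can be demonstrated with Python's built-in sqlite3 module. SQLite has no materialized views, so in this sketch the materialized view is only emulated by an ordinary table filled from the query and refreshed by hand; the table names, the data and the refresh_mv helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_id INTEGER, dept_id INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, 10, 4000), (2, 10, 6000), (3, 20, 5000)],
)

summary_query = "SELECT dept_id, SUM(salary) AS total FROM emp GROUP BY dept_id"

# View: only the query expression is stored.
conn.execute(f"CREATE VIEW dept_total_v AS {summary_query}")

# Emulated materialized view: the resulting tuples are stored in a table.
conn.execute(f"CREATE TABLE dept_total_mv AS {summary_query}")


def refresh_mv():
    """Recompute the stored tuples from the base table (complete refresh)."""
    conn.execute("DELETE FROM dept_total_mv")
    conn.execute(f"INSERT INTO dept_total_mv {summary_query}")


def totals(name):
    query = f"SELECT dept_id, total FROM {name} ORDER BY dept_id"
    return conn.execute(query).fetchall()


# Update the base table after both objects have been created.
conn.execute("INSERT INTO emp VALUES (4, 20, 1000)")

view_rows = totals("dept_total_v")       # recomputed on access
stale_rows = totals("dept_total_mv")     # still the old snapshot
refresh_mv()
fresh_rows = totals("dept_total_mv")     # up to date after the refresh

print(view_rows, stale_rows, fresh_rows)
```

The view shows the new total for department 20 immediately, because it is recomputed on every access. The stored copy shows it only after the refresh, which is the update cost listed in the table above.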
