Relational Algebra

The document discusses relational algebra, a procedural query language used in relational databases, and its importance in query optimization and as a theoretical foundation for SQL. It outlines basic and derived operators of relational algebra, such as select, project, union, and set difference, and provides examples of their application. Additionally, it explains the concepts of union compatibility, set operations, and how to work with long expressions in relational algebra.

Relational Algebra

Relational Query Languages

• Languages for describing queries on a relational database
• Structured Query Language (SQL)
  • Commercial query language
  • Declarative
• Relational Algebra
  • Intermediate language used within DBMS
  • Procedural
Why is Relational Algebra Important?

• Provides a framework for query optimization


• It is the theoretical foundation of relational databases and SQL
• SQL queries are internally translated into RA expressions
• Relational algebra takes relation instances as arguments and returns a relation instance as output
• Relational algebra consists of a set of operators
Relational Algebra in a DBMS
[Figure: inside the DBMS, SQL query → parser → relational algebra expression → query optimizer → optimized relational algebra expression (query execution plan) → code generator → executable code]
Relational Algebra Operators
• Six basic operators
• Select
• Project
• Union
• Set difference
• Cartesian product
• Rename
• Derived operators
• Join
• Intersection
• Division

• The operators take one or more relations as inputs and give a new relation as a
result
Select Operation
• Notation: σ_p(r)
σ: select operator
p: selection predicate (formula)
r: relation name or relational algebra expression

• Defined as: σ_p(r) = { t | t ∈ r and p(t) }

where p is a formula in propositional calculus consisting of terms connected by ∧ (and), ∨ (or), ¬ (not)
Each term is one of: <attribute> op <attribute> or <attribute> op <constant>
where op is one of: =, ≠, >, ≥, <, ≤
• Used to select the subset of tuples of a relation that satisfy a selection predicate
• The result has the same schema (the same attributes) as r and consists of the tuples of r that satisfy the selection predicate
• The predicate is evaluated on each tuple individually; it cannot refer to more than one tuple at a time. If the condition is true, the tuple is selected
• Relational model is set-based (no duplicate tuples)
• Relation r has no duplicates, therefore selection cannot produce duplicates

Student (SID, Sname, Sex, Dept, Supervisor).


Find the details about the CSE students.

SQL: Select * from Student Where Dept='CSE';

Corresponding RA expression is: σ_{Dept='CSE'}(Student)
Example of Select
Student

SID  SName   Sex  Dept  Supervisor
101  Saurav  M    CSE   4P01
102  Richa   F    CSE   4P01
103  Dip     M    BT    4P03
104  David   M    IT    4P05
105  Reema   F    BT    4P03
106  Saurav  M    IT    4P05

Query: Find the details of the students whose Dept is CSE

σ_{Dept='CSE'}(Student)

SID  SName   Sex  Dept  Supervisor
101  Saurav  M    CSE   4P01
102  Richa   F    CSE   4P01
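
The selection above can be sketched in Python by modelling a relation as a set of tuples. This is only an illustration: the `Student` named tuple and the `select` helper are hypothetical names, not something defined in the slides.

```python
from typing import NamedTuple

class Student(NamedTuple):
    sid: int
    sname: str
    sex: str
    dept: str
    supervisor: str

# The Student relation from the example, as a set of tuples (no duplicates).
STUDENT = {
    Student(101, "Saurav", "M", "CSE", "4P01"),
    Student(102, "Richa", "F", "CSE", "4P01"),
    Student(103, "Dip", "M", "BT", "4P03"),
    Student(104, "David", "M", "IT", "4P05"),
    Student(105, "Reema", "F", "BT", "4P03"),
    Student(106, "Saurav", "M", "IT", "4P05"),
}

def select(relation, predicate):
    """sigma_predicate(relation): keep the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}

# sigma_{Dept='CSE'}(Student)
cse = select(STUDENT, lambda t: t.dept == "CSE")
for row in sorted(cse):
    print(row)
```

The predicate sees one tuple at a time, and the result keeps every attribute of the input, which mirrors the definition of σ.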
Example of Select

Find the information about the students with the name "David"

σ_{SName='David'}(Student)

SID  SName  Sex  Dept  Supervisor
104  David  M    IT    4P05

Find the information about the students whose roll number is between 102 and 105

σ_{Rollno>101 AND Rollno<106}(Student)

Rollno  Name   Sex  Dept  Supervisor
102     Richa  F    CSE   4P01
103     Dip    M    BT    4P03
104     David  M    IT    4P05
105     Reema  F    BT    4P03
Properties of Select Condition

• The number of tuples resulting from a select operation is less than or equal to the number of tuples in R
• Equivalences
Select operation is commutative:
σ_{c1}(σ_{c2}(R)) = σ_{c2}(σ_{c1}(R))
A sequence of selects can be applied in any order
We can combine a cascade of select operations into a single select with an AND condition
σ_{c1}(σ_{c2}(R)) = σ_{c1 AND c2}(R)
SID SName Sex Dept Supervisor
101 Saurav M CSE 4P01
102 Richa F CSE 4P01
103 Dip M BT 4P03
104 David M IT 4P05
105 Reema F BT 4P03
106 Saurav M IT 4P05
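
As a quick check of the select equivalences, the following sketch applies two conditions in both orders and as a single AND condition. The conditions `c1` (Dept is IT) and `c2` (Sex is M) are chosen here only for illustration.

```python
STUDENT = {
    (101, "Saurav", "M", "CSE", "4P01"),
    (102, "Richa", "F", "CSE", "4P01"),
    (103, "Dip", "M", "BT", "4P03"),
    (104, "David", "M", "IT", "4P05"),
    (105, "Reema", "F", "BT", "4P03"),
    (106, "Saurav", "M", "IT", "4P05"),
}
SEX, DEPT = 2, 3  # column positions of Sex and Dept

def select(relation, predicate):
    """sigma_predicate(relation)."""
    return {t for t in relation if predicate(t)}

def c1(t):
    return t[DEPT] == "IT"

def c2(t):
    return t[SEX] == "M"

c1_after_c2 = select(select(STUDENT, c2), c1)           # sigma_c1(sigma_c2(R))
c2_after_c1 = select(select(STUDENT, c1), c2)           # sigma_c2(sigma_c1(R))
combined = select(STUDENT, lambda t: c1(t) and c2(t))   # sigma_{c1 AND c2}(R)

print(c1_after_c2 == c2_after_c1 == combined)
print(len(combined) <= len(STUDENT))
```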
Project Operation
• If we are interested in only certain attributes of a relation, we use the project operator to project the relation over these attributes
• It keeps only the required attributes of a relation and discards the attributes that are not in the projection list
• Notation: π_{A1,A2,A3,…,Ai}(R) or π_{<attribute list>}(R)
where A1, A2, A3, …, Ai is the list of required attributes of the relation R

• Project is a unary operator (one relation as operand)

• Duplicate elimination: the result is a set, so duplicate tuples/rows are removed; duplicates can arise when the attribute list contains only non-key attributes
• Example: to eliminate the Supervisor attribute of Student
π_{SID,SName,Sex,Dept}(Student)

SQL: Select SID, SName, Sex, Dept from Student


Student
SID SName Sex Dept Supervisor
101 Saurav M CSE 4P01
102 Richa F CSE 4P01
103 Dip M BT 4P03
104 David M IT 4P05
105 Reema F BT 4P03
106 Saurav M IT 4P05
• Commutativity does not hold for Project
• Equivalences
π_{list1}(π_{list2}(R)) = π_{list1}(R)
as long as <list2> contains <list1>; otherwise the LHS is not a valid expression
π_{list}(σ_C(R)) = σ_C(π_{list}(R))
as long as all attributes used by C are in list
• Degree of the result
• Number of attributes in the projected attribute list
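
The two project equivalences can be checked on the Student relation with a small sketch. A relation is modelled here as a pair (attribute names, set of rows); the helpers `project` and `select` and the lists `list1` and `list2` are illustrative assumptions.

```python
ATTRS = ("SID", "SName", "Sex", "Dept", "Supervisor")
ROWS = {
    (101, "Saurav", "M", "CSE", "4P01"),
    (102, "Richa", "F", "CSE", "4P01"),
    (103, "Dip", "M", "BT", "4P03"),
    (104, "David", "M", "IT", "4P05"),
    (105, "Reema", "F", "BT", "4P03"),
    (106, "Saurav", "M", "IT", "4P05"),
}

def project(attrs, rows, wanted):
    """pi_wanted: keep only the wanted columns; the set drops duplicates."""
    idx = [attrs.index(a) for a in wanted]
    return tuple(wanted), {tuple(r[i] for i in idx) for r in rows}

def select(attrs, rows, predicate):
    """sigma_predicate: the predicate receives a dict attribute -> value."""
    return attrs, {r for r in rows if predicate(dict(zip(attrs, r)))}

list1 = ("SName",)
list2 = ("SName", "Dept")  # list2 contains list1

# pi_list1(pi_list2(R)) = pi_list1(R)
lhs = project(*project(ATTRS, ROWS, list2), list1)
rhs = project(ATTRS, ROWS, list1)

# pi_list2(sigma_C(R)) = sigma_C(pi_list2(R)); C uses only Dept, which is in list2
def is_cse(t):
    return t["Dept"] == "CSE"

project_after_select = project(*select(ATTRS, ROWS, is_cse), list2)
select_after_project = select(*project(ATTRS, ROWS, list2), is_cse)

print(lhs == rhs, project_after_select == select_after_project)
```

If C used an attribute outside the list (for example Sex with `list2`), the right-hand side could not even be evaluated, which is why the side condition is needed.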
Example of Project
Student

SID  SName   Sex  Dept  Supervisor
101  Saurav  M    CSE   4P01
102  Richa   F    CSE   4P01
103  Dip     M    BT    4P03
104  David   M    IT    4P05
105  Reema   F    BT    4P03
106  Saurav  M    IT    4P05

Find the roll no and name of students
π_{SID,SName}(Student)

SID  SName
101  Saurav
102  Richa
103  Dip
104  David
105  Reema
106  Saurav

Find the name of students
π_{SName}(Student)

SName
Saurav
Richa
Dip
David
Reema

Saurav is displayed once, as project removes duplicate tuples. Name is a non-key attribute.
Size of Project Expression Result

• What about the number of tuples in the new relation R1?

R1 = π_{<List>}(R)
• Two cases are possible:
  • Projection list <List> contains some key of the relation R
    • |R1| = |R|
  • Projection list <List> does not contain any key of R
    • |R1| ≤ |R|
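
Both cases can be seen on the Student relation, where SID is the key. The sketch below is illustrative; `project` is a hypothetical helper that returns a set, so duplicates disappear automatically.

```python
ATTRS = ("SID", "SName", "Sex", "Dept", "Supervisor")
ROWS = {
    (101, "Saurav", "M", "CSE", "4P01"),
    (102, "Richa", "F", "CSE", "4P01"),
    (103, "Dip", "M", "BT", "4P03"),
    (104, "David", "M", "IT", "4P05"),
    (105, "Reema", "F", "BT", "4P03"),
    (106, "Saurav", "M", "IT", "4P05"),
}

def project(attrs, rows, wanted):
    """pi_wanted(R) as a set of tuples."""
    idx = [attrs.index(a) for a in wanted]
    return {tuple(r[i] for i in idx) for r in rows}

with_key = project(ATTRS, ROWS, ["SID", "SName"])  # list contains the key SID
without_key = project(ATTRS, ROWS, ["SName"])      # list contains no key

print(len(ROWS), len(with_key), len(without_key))  # 6 6 5
```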
Project in SQL

• SQL
SELECT DISTINCT <attribute list> FROM R
• Note the need for DISTINCT in SQL to eliminate duplicates; a plain SELECT keeps them
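
The effect of DISTINCT can be observed with Python's built-in `sqlite3` module. The sketch below loads the Student rows from the earlier examples into an in-memory database; the column types are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (SID INTEGER PRIMARY KEY, SName TEXT, "
    "Sex TEXT, Dept TEXT, Supervisor TEXT)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?, ?)",
    [
        (101, "Saurav", "M", "CSE", "4P01"),
        (102, "Richa", "F", "CSE", "4P01"),
        (103, "Dip", "M", "BT", "4P03"),
        (104, "David", "M", "IT", "4P05"),
        (105, "Reema", "F", "BT", "4P03"),
        (106, "Saurav", "M", "IT", "4P05"),
    ],
)

# Plain SELECT returns one output row per input row, duplicates included.
plain = conn.execute("SELECT SName FROM Student").fetchall()
# SELECT DISTINCT matches the set semantics of pi_SName(Student).
distinct = conn.execute("SELECT DISTINCT SName FROM Student").fetchall()

print(len(plain), len(distinct))  # 6 5
```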
Project and Select Example

Obtain name of the students whose department is either CSE or BT

π_{SName}(σ_{Dept='CSE' ∨ Dept='BT'}(Student))

SName
Saurav
Richa
Dip
Reema
Working With Long Expressions

• Sometimes it is easier to write expressions a piece at a time
  • Incremental development
  • Documentation of steps involved
• Consider the in-line expression: π_{RollNo,Name,Sex}(σ_{Dept='CSE'}(Student))
• Equivalent sequence of operations:
T1 ← σ_{Dept='CSE'}(Student)
RESULT ← π_{RollNo,Name,Sex}(T1)
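
The in-line form and the step-by-step form give the same relation, as the following sketch suggests. The helpers `select` and `project` are hypothetical, and the attribute names follow the expression above.

```python
ATTRS = ("RollNo", "Name", "Sex", "Dept", "Supervisor")
STUDENT = {
    (101, "Saurav", "M", "CSE", "4P01"),
    (102, "Richa", "F", "CSE", "4P01"),
    (103, "Dip", "M", "BT", "4P03"),
    (104, "David", "M", "IT", "4P05"),
    (105, "Reema", "F", "BT", "4P03"),
    (106, "Saurav", "M", "IT", "4P05"),
}

def select(rows, predicate):
    """sigma: the predicate receives a dict attribute -> value."""
    return {r for r in rows if predicate(dict(zip(ATTRS, r)))}

def project(rows, wanted):
    """pi over rows that still carry all columns of ATTRS."""
    idx = [ATTRS.index(a) for a in wanted]
    return {tuple(r[i] for i in idx) for r in rows}

def is_cse(t):
    return t["Dept"] == "CSE"

# In-line expression
inline = project(select(STUDENT, is_cse), ["RollNo", "Name", "Sex"])

# Equivalent sequence of operations
t1 = select(STUDENT, is_cse)
result = project(t1, ["RollNo", "Name", "Sex"])

print(inline == result)
```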
Set Theoretic Operations

• A relation is a set of tuples => set operations should apply
• The result of combining two relations with a set operator is a relation => all its elements must be tuples having the same structure
• Hence, the scope of set operations is limited to union compatible relations
• Union compatible relations can be combined using union, intersection, and set difference
• Union, intersection and set difference are binary operators
Union
• Notation: R ∪ S
• Defined as:
R ∪ S = {t | t ∈ R or t ∈ S}
• R ∪ S is valid if R and S are union compatible
• Union operation is commutative: R ∪ S = S ∪ R
• Union is associative: R ∪ (S ∪ T) = (R ∪ S) ∪ T
Union Compatibility
Consider two relations R, S where R = (A1, A2, …, Ak); S = (B1, B2, …, Bm)
R and S are called union-compatible if k = m and dom(Ai) = dom(Bi) for 1 ≤ i ≤ k
This means that both relations have the same number of attributes and that each pair of corresponding attributes has the same domain.
The domain of the first attribute of R must be the same as the domain of the first attribute of S, then the second attribute, and so on. Here, a domain can be char, float, int, date, …; the kind of content the attribute holds is also considered.

• R(SID, NAME) and S(SID, NAME, MARKS) are not union compatible, as S has 3 attributes
• R(SID, NAME) and S(SID, MARKS) are not union compatible, as dom(MARKS) is not equal to dom(NAME)
• R(SID, NAME) and S(SID, STUD_NAME) are union compatible: the attributes have different names but the same domains
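
The union compatibility test can be sketched as a small function. A schema is modelled as a list of (attribute name, domain) pairs; the concrete domains "int" and "char" assigned below are assumptions, since the slides do not state them.

```python
def union_compatible(r_schema, s_schema):
    """True if both schemas have the same number of attributes and the
    domains of corresponding attributes match position by position."""
    if len(r_schema) != len(s_schema):
        return False
    return all(r_dom == s_dom
               for (_, r_dom), (_, s_dom) in zip(r_schema, s_schema))

# Assumed domains: SID and MARKS are int, names are char.
R = [("SID", "int"), ("NAME", "char")]
S_three_attrs = [("SID", "int"), ("NAME", "char"), ("MARKS", "int")]
S_marks = [("SID", "int"), ("MARKS", "int")]
S_renamed = [("SID", "int"), ("STUD_NAME", "char")]

print(union_compatible(R, S_three_attrs))  # False: different number of attributes
print(union_compatible(R, S_marks))        # False: dom(MARKS) differs from dom(NAME)
print(union_compatible(R, S_renamed))      # True: names differ, domains match
```

Attribute names are deliberately ignored by the check; only the count and the positional domains matter.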
Union Operation – Example

R          S
A  B       A  B
a  1       a  2
a  2       b  3
c  1

R ∪ S:
A  B
a  1
a  2
c  1
b  3
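
Python's built-in sets behave like relations here, so the union example and the commutativity and associativity properties can be checked directly. The third relation `T` is made up only to exercise associativity.

```python
R = {("a", 1), ("a", 2), ("c", 1)}
S = {("a", 2), ("b", 3)}
T = {("d", 4)}  # hypothetical extra relation, used only for associativity

union = R | S  # the tuple ("a", 2) occurs in both inputs but only once in the result

print(sorted(union))
print(R | S == S | R)              # commutative
print(R | (S | T) == (R | S) | T)  # associative
```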
Meaningful expression, but union not possible

Father  SID        Mother  SID
Dev     101        Rita    101
Rohit   102        Nita    102
Dev     103        Rita    1033
Shah    104        Disha   104
Shaan   105        Dimple  106
Renaming and Union

• But renaming can overcome the limitations of set operators
• Rename changes the attribute name in a relation without changing its value
• ρ_{father→parent}(R1), where R1 is the Father relation
• ρ_{mother→parent}(R2), where R2 is the Mother relation
• ρ_{father→parent}(R1) ∪ ρ_{mother→parent}(R2)

Parent  SID
Dev     101
Rohit   102
Dev     103
Shah    104
Shaan   105
Rita    101
Nita    102
Rita    1033
Disha   104
Dimple  106

• The practical problem here is that we lose the information about who is the father and who is the mother
• A better solution is to join R1 and R2
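
A minimal sketch of rename followed by union is given below. A relation is modelled as a pair (attribute names, set of rows). The `union` helper insists on identical attribute names, which mirrors the premise of this example and is an assumption of the sketch, not a general rule of relational algebra.

```python
def rename(relation, old, new):
    """rho_{old -> new}: change an attribute name, leave the rows untouched."""
    attrs, rows = relation
    return tuple(new if a == old else a for a in attrs), rows

def union(r, s):
    """Union that requires both inputs to carry the same attribute names."""
    (r_attrs, r_rows), (s_attrs, s_rows) = r, s
    if r_attrs != s_attrs:
        raise ValueError("attribute lists differ; rename first")
    return r_attrs, r_rows | s_rows

R1 = (("father", "SID"),
      {("Dev", 101), ("Rohit", 102), ("Dev", 103), ("Shah", 104), ("Shaan", 105)})
R2 = (("mother", "SID"),
      {("Rita", 101), ("Nita", 102), ("Rita", 1033), ("Disha", 104), ("Dimple", 106)})

parent = union(rename(R1, "father", "parent"), rename(R2, "mother", "parent"))
print(parent[0])
print(len(parent[1]))
```

The result no longer records whether a row came from R1 or R2, which is exactly the information loss the last two bullets point out.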
Set Difference

• Notation: R − S
• Defined as:
R − S = {t | t ∈ R and t ∉ S}
• Set differences must be taken between compatible relations.
1. r and s must have the same number of attributes
2. attribute domains of r and s must be compatible

• Set difference is not commutative: in general R − S ≠ S − R

Set Difference Operation – Example

R          S
A  B       A  B
a  1       a  2
a  2       b  3
c  1

R − S:
A  B
a  1
c  1
Example

Relations:
Person (SSN, Name, Address, Hobby)
Professor (Id, Name, Office, Phone)
are not union compatible. However,

π_{Name}(Person) and π_{Name}(Professor)

are union compatible and
π_{Name}(Person) − π_{Name}(Professor)
makes sense.
Intersection
• Notation: R ∩ S
• Defined as:
R ∩ S = { t | t ∈ R and t ∈ S }
• R ∩ S is valid if R and S are union compatible
• Note: R ∩ S = (R ∪ S) − ((R − S) ∪ (S − R))
OR
R ∩ S = R − (R − S)
• Thus ∩ is not a basic operation and does not add any power to RA. It is simply more convenient to write R ∩ S than R − (R − S)
Intersection Example

R          S
A  B       A  B
a  1       a  2
a  2       b  3
c  1

R ∩ S:
A  B
a  2
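
That intersection can be derived from the basic operators is easy to verify on the example relations. The sketch below computes R ∩ S three ways: directly from the definition, as R − (R − S), and as (R ∪ S) − ((R − S) ∪ (S − R)).

```python
R = {("a", 1), ("a", 2), ("c", 1)}
S = {("a", 2), ("b", 3)}

direct = {t for t in R if t in S}          # definition of R ∩ S
via_difference = R - (R - S)               # R − (R − S)
via_union = (R | S) - ((R - S) | (S - R))  # (R ∪ S) − ((R − S) ∪ (S − R))

print(direct == via_difference == via_union)
print(R - S == S - R)  # False: set difference is not commutative
```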
• Supplier(SID, Sname, City, State)
• Customer(CID, Cname, City, State)

π_{City}(Customer) − π_{City}(Supplier) returns the cities that have customers but no suppliers.

π_{City}(Customer) ∪ π_{City}(Supplier) returns the cities (distinct values) of customers and suppliers.

π_{City}(σ_{State='WB'}(Customer)) − π_{City}(σ_{State='WB'}(Supplier)) returns the cities of WB where customers reside but no supplier does.
Find name of courses which were offered in 2019 but not in 2018

ENROLMENT
KEY = (ROLLNO, CID)

ROLLNO  CID    SEMESTER  YEAR  GRADE
101     CSC01  1         2018  F
101     PH01   1         2018  D
102     CSC01  1         2018  B
102     PH01   1         2018  C
103     CSC01  1         2019  F
103     PH01   1         2019  D
104     ES01   1         2019  C
105     CSC01  1         2017  F
106     ES01   1         2017  F

SQL:

SELECT CID FROM ENROLMENT WHERE YEAR=2019 MINUS
SELECT CID FROM ENROLMENT WHERE YEAR=2018;
Find the name of courses which were offered both in 2018 and 2019

SQL:

SELECT CID FROM ENROLMENT WHERE YEAR=2018 INTERSECT SELECT CID FROM ENROLMENT
WHERE YEAR=2019;
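
The two course queries can be tried out with Python's built-in `sqlite3` module. MINUS is Oracle's spelling of set difference; SQLite, like standard SQL, uses EXCEPT, so the sketch below substitutes EXCEPT. The column types in the CREATE TABLE statement are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ENROLMENT (ROLLNO INTEGER, CID TEXT, SEMESTER INTEGER, "
    "YEAR INTEGER, GRADE TEXT, PRIMARY KEY (ROLLNO, CID))"
)
conn.executemany(
    "INSERT INTO ENROLMENT VALUES (?, ?, ?, ?, ?)",
    [
        (101, "CSC01", 1, 2018, "F"), (101, "PH01", 1, 2018, "D"),
        (102, "CSC01", 1, 2018, "B"), (102, "PH01", 1, 2018, "C"),
        (103, "CSC01", 1, 2019, "F"), (103, "PH01", 1, 2019, "D"),
        (104, "ES01", 1, 2019, "C"), (105, "CSC01", 1, 2017, "F"),
        (106, "ES01", 1, 2017, "F"),
    ],
)

# Courses offered in 2019 but not in 2018 (set difference)
only_2019 = conn.execute(
    "SELECT CID FROM ENROLMENT WHERE YEAR = 2019 "
    "EXCEPT SELECT CID FROM ENROLMENT WHERE YEAR = 2018"
).fetchall()

# Courses offered in both 2018 and 2019 (intersection)
both_years = conn.execute(
    "SELECT CID FROM ENROLMENT WHERE YEAR = 2018 "
    "INTERSECT SELECT CID FROM ENROLMENT WHERE YEAR = 2019"
).fetchall()

print(only_2019)
print(sorted(both_years))
```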
Examples on Union, Set Difference and Intersection

Find the roll number of students who have never failed in any subject in 1st semester

ENROLMENT
ROLLNO  CID    SEMESTER  YEAR  GRADE
101     CSC01  1         2018  F
101     PH01   1         2018  D
102     CSC01  1         2018  B
102     PH01   1         2018  C
103     CSC01  1         2019  F
103     PH01   1         2019  D
104     ES01   1         2019  C
105     CSC01  1         2017  F
106     PH01   1         2017  F

T1 = π_{ROLLNO}(σ_{SEMESTER=1 AND GRADE='F'}(ENROLMENT))
R2 = π_{ROLLNO}(ENROLMENT) − T1

R2
ROLLNO
102
104
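
The same two steps can be traced in Python with sets: first collect the roll numbers with at least one F in semester 1, then subtract them from all roll numbers. The variable names `t1` and `r2` follow the expressions above.

```python
ENROLMENT = {
    (101, "CSC01", 1, 2018, "F"), (101, "PH01", 1, 2018, "D"),
    (102, "CSC01", 1, 2018, "B"), (102, "PH01", 1, 2018, "C"),
    (103, "CSC01", 1, 2019, "F"), (103, "PH01", 1, 2019, "D"),
    (104, "ES01", 1, 2019, "C"), (105, "CSC01", 1, 2017, "F"),
    (106, "PH01", 1, 2017, "F"),
}

# T1 = pi_ROLLNO(sigma_{SEMESTER=1 AND GRADE='F'}(ENROLMENT))
t1 = {roll for roll, _cid, sem, _year, grade in ENROLMENT
      if sem == 1 and grade == "F"}

# R2 = pi_ROLLNO(ENROLMENT) − T1
r2 = {row[0] for row in ENROLMENT} - t1

print(sorted(r2))  # [102, 104]
```

A plain selection with GRADE ≠ 'F' would not work here: student 101 has a D in PH01 and would wrongly qualify, which is why the query is phrased as a set difference.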
Find the roll number of students who have enrolled for CSC01 either in 2018 or 2019 or in both

π_{ROLLNO}(σ_{CID='CSC01' ∧ YEAR=2018}(ENROLMENT)) ∪ π_{ROLLNO}(σ_{CID='CSC01' ∧ YEAR=2019}(ENROLMENT))

ROLLNO
101
102
103
Find the roll number of the female students who have scored B grade in CSC01

π_{ROLLNO}(σ_{SEX='F'}(Student)) ∩ π_{ROLLNO}(σ_{GRADE='B' AND CID='CSC01'}(ENROLMENT))

ROLLNO
102
Cartesian Product

• Binary operator
• Denoted by R × S
R: one relation
S: another relation
×: cross, to denote the Cartesian product
• The relations on which it is applied do not have to be union compatible
• × is used to combine the tuples from two relations in a combinatorial fashion
• Every tuple from R is combined with every tuple from S
Example

R            S            R × S
A   B        C   D        A   B   C   D
x1  x2       y1  y2       x1  x2  y1  y2
x3  x4       y3  y4       x1  x2  y3  y4
                          x3  x4  y1  y2
                          x3  x4  y3  y4

If R has n_R tuples and S has n_S tuples, then R × S will have (n_R * n_S) tuples

If R has n attributes and S has m attributes, then R × S will have (n + m) attributes
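
The size rules can be confirmed with a short sketch built on `itertools.product`. The helper name `cartesian_product` is an illustrative choice.

```python
from itertools import product

R = {("x1", "x2"), ("x3", "x4")}  # attributes A, B
S = {("y1", "y2"), ("y3", "y4")}  # attributes C, D

def cartesian_product(r, s):
    """R × S: concatenate every tuple of r with every tuple of s."""
    return {rt + st for rt, st in product(r, s)}

RxS = cartesian_product(R, S)

for row in sorted(RxS):
    print(row)
print(len(RxS) == len(R) * len(S))         # n_R * n_S tuples
print(all(len(row) == 2 + 2 for row in RxS))  # n + m attributes
```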
Example of Cartesian Product
• R × S is expensive to compute
  • Factor of two in the size of each row
  • Quadratic in the number of rows
• The × operation is generally meaningless when it is used alone. It becomes useful when it is followed by a selection that matches values of attributes coming from the component relations
• Example:
Faculty (FID, Fname, DOJ, Address, Dno) [PK = FID, FK = Dno]
Department (DID, Dname, Year, HOD) [PK = DID]
Retrieve the faculty ID, Fname and their HOD
Now, HOD belongs to Department, and FID and Fname are from Faculty
So, we have to combine both relations to get the required information.
Faculty
FID   Fname   DOJ       Address  Dno
4P01  David   10102010  Delhi    CSE
4P02  Anuj    01042008  Delhi    CSE
4P03  Anita   01042008  Kolkata  ME
4P04  Alia    02052006  Mumbai   ECE
4P05  Virat   02052006  Mumbai   ME
4P06  Sachin  01062004  Indore   EE
4P07  Deewan  01042008  Kolkata  ME
4P08  Dinesh  02052006  Mumbai   ECE
4P09  Rina    02052006  Mumbai   ME
4P11  Shiva   01062004  Indore   EE

Dept
DID  Dname        Year  HOD
CSE  Computer     1994  Anuj
ME   Mechanical   1980  Virat
ECE  Electronics  1993  Alia
EE   Electrical   1990  Sachin

T1 = π_{FID,Fname,Dno}(Faculty)
T2 = π_{DID,HOD}(Dept)
T3 = T1 × T2
T4 (result) = π_{FID,Fname,HOD}(σ_{Dno=DID}(T3))
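
The steps T1 to T4 can be followed in a short Python sketch. To keep it compact, `T1` and `T2` are written out directly as the already projected relations; everything else follows the expressions above.

```python
from itertools import product

# T1 = pi_{FID,Fname,Dno}(Faculty)
T1 = {
    ("4P01", "David", "CSE"), ("4P02", "Anuj", "CSE"), ("4P03", "Anita", "ME"),
    ("4P04", "Alia", "ECE"), ("4P05", "Virat", "ME"), ("4P06", "Sachin", "EE"),
    ("4P07", "Deewan", "ME"), ("4P08", "Dinesh", "ECE"), ("4P09", "Rina", "ME"),
    ("4P11", "Shiva", "EE"),
}
# T2 = pi_{DID,HOD}(Dept)
T2 = {("CSE", "Anuj"), ("ME", "Virat"), ("ECE", "Alia"), ("EE", "Sachin")}

# T3 = T1 × T2, with columns (FID, Fname, Dno, DID, HOD)
T3 = {f + d for f, d in product(T1, T2)}

# T4 = pi_{FID,Fname,HOD}(sigma_{Dno=DID}(T3))
T4 = {(fid, fname, hod) for fid, fname, dno, did, hod in T3 if dno == did}

print(len(T3), len(T4))
for row in sorted(T4):
    print(row)
```

The product produces 40 tuples, of which the selection keeps only the 10 where the faculty member's Dno matches the department's DID.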
Student( SID, Sname, Dno, Sex)
Dept (DID, Dname, Location)

Find name and ID of female students along with their departments:


SQL: Basic Query Structure

• SQL is based on set and relational operations with certain modifications and enhancements
• A typical SQL query has the form:
select A1, A2, ..., An from r1, r2, ..., rm where P
• Ai represents an attribute
• ri represents a relation
• P is a predicate
• This query is equivalent to the relational algebra expression
π_{A1,A2,...,An}(σ_P(r1 × r2 × ... × rm))
The select Clause

• The select clause lists the attributes desired in the result of a query
• Corresponds to the projection operation of the relational algebra
• Example: find the names of all departments:
select Dname from Department
• In the relational algebra, the query would be:
π_{Dname}(Department)
The where Clause

• The where clause specifies conditions that the result must satisfy
• Corresponds to the selection predicate of the relational algebra.
• To find the roll number and name of the female students whose department is BT:
select Rollnumber, Name
from Student
where Sex = 'F' and Deptno = 'BT'
The from Clause

• The from clause lists the relations involved in the query
• Corresponds to the Cartesian product operation of the relational algebra.
• Find the Cartesian product Student × Department
select * from Student, Department

• Find the Name, Dname of all students.

select Name, Dname from Student, Department where [Link] = Department.DID
Division
• Division is typically required when you want to find the entities that are interacting with all entities of a set of entities of a different type
• Not supported as a basic operator, but useful for expressing queries which contain the keyword all:
Find name of the students who have taken all courses necessary for graduation
Find customers who have a bank account in all the banks in a city
• R ÷ S can be applied if and only if the attributes of S are a proper subset of the attributes of R
• The relation returned by ÷ has attributes = (all attributes of R − all attributes of S)
• The relation returned by ÷ contains those tuples of R, restricted to these attributes, that are associated with every tuple in S
• R ÷ S = { x | ∀ y ∈ S, <x, y> ∈ R }
i.e., R ÷ S contains all x tuples such that for every y tuple in S, there is an <x, y> tuple in R
Division Example

Q. Find name and roll number of the students who have played all sports in the college

Hobby
SID  Sname   Sports    Instructor
101  Joy     Cricket   4P106
101  Joy     Football  4P107
102  Dinesh  Football  4P107
103  Romit   Cricket   4P106
103  Romit   Football  4P107

Sports
Sports    Instructor
Cricket   4P106
Football  4P107

Hobby ÷ Sports
SID  Sname
101  Joy
103  Romit
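
The definition of division translates almost word for word into Python. In the sketch below, `divide` is a hypothetical helper that assumes the attributes of S are the last `k` columns of R, which holds for Hobby and Sports.

```python
HOBBY = {
    (101, "Joy", "Cricket", "4P106"),
    (101, "Joy", "Football", "4P107"),
    (102, "Dinesh", "Football", "4P107"),
    (103, "Romit", "Cricket", "4P106"),
    (103, "Romit", "Football", "4P107"),
}
SPORTS = {("Cricket", "4P106"), ("Football", "4P107")}

def divide(r, s, k):
    """R ÷ S, assuming the last k columns of R are the attributes of S."""
    candidates = {t[:-k] for t in r}  # the x part of every tuple in R
    # keep x only if <x, y> is in R for every y in S
    return {x for x in candidates if all(x + y in r for y in s)}

result = divide(HOBBY, SPORTS, 2)
print(sorted(result))
```

Dinesh is dropped because the tuple (102, Dinesh, Cricket, 4P106) is missing from Hobby.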
Division Example

A             B1      B2      B3
sno  pno      pno     pno     pno
s1   p1       p2      p2      p1
s1   p2               p4      p2
s1   p3                       p4
s1   p4
s2   p1
s2   p2
s3   p2
s4   p2
s4   p4

A ÷ B1      A ÷ B2      A ÷ B3
sno         sno         sno
s1          s1          s1
s2          s4
s3
s4
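
Since division is a derived operator, it can also be written with project, Cartesian product and set difference. The identity used below, R ÷ S = π_x(R) − π_x((π_x(R) × S) − R), is the standard textbook formulation and is not stated in the slides; the sketch checks it against the three results above.

```python
A = {
    ("s1", "p1"), ("s1", "p2"), ("s1", "p3"), ("s1", "p4"),
    ("s2", "p1"), ("s2", "p2"),
    ("s3", "p2"),
    ("s4", "p2"), ("s4", "p4"),
}
B1 = {("p2",)}
B2 = {("p2",), ("p4",)}
B3 = {("p1",), ("p2",), ("p4",)}

def divide(r, s):
    """r ÷ s for a binary relation r(sno, pno) and a unary relation s(pno)."""
    xs = {(sno,) for sno, _ in r}                   # pi_sno(r)
    all_pairs = {x + y for x in xs for y in s}      # pi_sno(r) × s
    missing = all_pairs - r                         # required pairs that r lacks
    disqualified = {(sno,) for sno, _ in missing}   # suppliers missing some part
    return xs - disqualified

print(sorted(divide(A, B1)))
print(sorted(divide(A, B2)))
print(sorted(divide(A, B3)))
```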
