
Relational Algebra and SQL Overview

This document covers the concepts of Relational Algebra and SQL as part of a Database Management System lecture. It explains the mathematical definitions of relational databases, various operations in relational algebra, and their SQL equivalents, including selection, projection, and join operations. Additionally, it discusses complex SQL query constructs and aggregate functions, providing examples for better understanding.

Uploaded by

mirain242
Copyright
© All Rights Reserved

Database Management System

Lecture 2

Relational Algebra and SQL

* Some materials adapted from R. Ramakrishnan, J. Gehrke


Today’s Agenda
• Relational Algebra
• Complex SQL

Database Management System 2


Relational Algebra

Database Management System 3


Relational DB and Algebra
• SQL
  • Practical definition of a relational DB
  • Operates on tables (bags)
  • Operations
    • Keywords
    • Statements: SELECT, FROM, WHERE, …
  • The default is to produce a bag of rows as a query result
  • Want a set? Use DISTINCT
• Relational Algebra
  • Mathematical definition of a relational DB
  • Operates on relations (sets)
  • Operations
    • Set-based operations
    • Intersection, Union, ...

Database Management System 4


Describing a relational DB mathematically
• Two ingredients
• A relation is a set of tuples
• Define query operators as functions on sets

Database Management System 5


Recap: Cross product with Set
• Let A = {a, b, c} and B = {1, 2}
• Cross product in set theory is defined as the set of ordered pairs (2-tuples) where each
pair consists of one element from A and one element from B

A × B = {(a, 1), (b, 1), (c, 1), (a, 2), (b, 2), (c, 2)}

• How about A = {a, b, c}, B = {1, 2}, and C = {α, β}?
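
The cross product above, and its three-set extension, can be checked with a minimal Python sketch. It assumes the standard-library `itertools.product`; the variable names are illustrative and not part of the lecture.

```python
from itertools import product

A = {"a", "b", "c"}
B = {1, 2}
C = {"α", "β"}

# A × B: every ordered pair (x, y) with x taken from A and y taken from B
a_cross_b = set(product(A, B))

# A × B × C: ordered triples, one element from each of the three sets
a_cross_b_cross_c = set(product(A, B, C))

print(sorted(a_cross_b))
print(len(a_cross_b_cross_c))  # |A| * |B| * |C|
```

The size of a cross product is the product of the sizes of its operands, which is why adding C doubles the number of tuples.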

Database Management System 6


Defining Relations
Person(name, salary, num, status)
name = {all possible strings of 30 characters}
salary = {real numbers between 0 and 100,000,000}
num = {integer between 0 and 9999}
status = {“a”, “b”}

• Any instance of the relation is always a subset (⊆) of the cross product of its attribute domains
• Person ⊆ name × salary × num × status

• One element of a relation is called a tuple
• A relation is always a set by definition

Database Management System 7


Recap: Set Theory

A = {1, 3, 5, 7} B = {1, 2, 3, 4}

• What do these return?


• A∩B
• A∪B
• A–B
• A×B
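
One way to check the four results is with Python's built-in `set` type. This is only a sketch; the variable names are chosen here for illustration.

```python
from itertools import product

A = {1, 3, 5, 7}
B = {1, 2, 3, 4}

intersection = A & B        # A ∩ B: elements in both sets
union = A | B               # A ∪ B: elements in either set
difference = A - B          # A – B: elements of A that are not in B
cross = set(product(A, B))  # A × B: all ordered pairs

print(intersection, union, difference, len(cross))
```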

Database Management System 8


Relational Algebra has Additional Operations

A = {1, 3, 5, 7} B = {1, 2, 3, 4}

• Introducing new operators
(C for condition, L for attribute list, R for renaming specification)
• A ⋈C B (join)
• A ÷ B (divide)
• 𝝈C(A) (select)
• 𝜋L(A) (project)
• 𝜌R(A) (rename)

Database Management System 9


Relational Algebra as a Query Language
• We don’t normally use relational algebra directly
• DBMS products don’t let you write relational algebra queries directly

• But, it is used internally in a DBMS to represent a query plan


• It is also often used in theoretical work on databases
• (although fragments of first order logic are frequently used as well ... )

Database Management System 10


Relational Algebra Queries w/out Operators
• What does the following SQL query return?

SELECT *
FROM Student;

Relation Student
Student
John Cusack
Will Smith

• Answer: Student
(this is the identity function)

• A relation name by itself is a valid relational algebra query


• Listing the relation name just returns the tuples in the relation

Database Management System 11


Relational Algebra: Selection operator (𝝈)
Account
Number Owner Balance Type
7003001 Jane Smith 1,000,000 Savings
7003003 Alfred Hitchcock 4,400,200 Savings
7003005 Takumi Fujiwara 2,230,000 Checking
7003007 Brian Mills 1,200,000 Savings

• The relational algebra query
𝝈 Balance<3,000,000 (Account)
• is similar to the SQL query
SELECT *
FROM Account
WHERE Balance < 3000000;
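
As a rough illustration, selection can be modelled as filtering a set of tuples with a predicate. The `Account` class and the `select` helper below are assumptions made for this sketch, not names from the lecture; the data is the Account table above.

```python
from typing import NamedTuple

class Account(NamedTuple):
    number: int
    owner: str
    balance: int
    acct_type: str

ACCOUNT = {
    Account(7003001, "Jane Smith", 1_000_000, "Savings"),
    Account(7003003, "Alfred Hitchcock", 4_400_200, "Savings"),
    Account(7003005, "Takumi Fujiwara", 2_230_000, "Checking"),
    Account(7003007, "Brian Mills", 1_200_000, "Savings"),
}

def select(relation, predicate):
    """σ_predicate(relation): keep only the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}

# σ Balance<3,000,000 (Account)
low_balance = select(ACCOUNT, lambda t: t.balance < 3_000_000)
print(sorted(t.owner for t in low_balance))
```

Because the result is again a set of tuples with the same attributes, the output of one selection can be fed into another operator.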

Database Management System 12


Relational Algebra: Selection operator (𝝈)
• Select (𝝈) is a unary operator:
𝝈:R→R
• It is always applied to a single relation

𝝈 Balance<3,000,000 (Account)

• 𝝈: the select operator
• Balance<3,000,000: the predicate (condition), of the form
  Attribute Comparator (≥, >, =, ≠, <, ≤) Attribute|Constant
• (Account): a relation or relational algebra expression

Database Management System 13


Exercises
• 𝝈 Balance<3,000,000 (Account)
• 𝝈 Number<7003005 (Account)
• 𝝈 Balance=Number (Account)
• 𝝈 Type=“checking” (𝝈 Balance<3,000,000 (Account))

Account
Number Owner Balance Type
7003001 Jane Smith 1,000,000 Savings
7003003 Alfred Hitchcock 4,400,200 Savings
7003005 Takumi Fujiwara 2,230,000 Checking
7003007 Brian Mills 1,200,000 Savings

Database Management System 14


Relational Algebra: Projection Operator(𝜋)
Account
Number Owner Balance Type
7003001 Jane Smith 1,000,000 Savings
7003003 Alfred Hitchcock 4,400,200 Savings
7003005 Takumi Fujiwara 2,230,000 Checking
7003007 Brian Mills 1,200,000 Savings

• The relational algebra query:


𝜋 Number, Owner (Account)
• Is similar to the SQL query
SELECT Number, Owner
FROM Account;

Database Management System 15


Relational Algebra: Projection operator (𝜋)
• Projection (𝜋) is a unary operator:
𝜋:R→R
• It is always applied to a single relation

𝜋 Number, Owner (Account)

Projection operator Relation or relational


algebra expression

List of attributes to keep

Database Management System 16


Example
𝜋 Owner (Account)
Vs.
SELECT Number
FROM Account;

Account
Number Owner Balance Type
7003001 Jane Smith 1,000,000 Savings
7003003 Alfred Hitchcock 4,400,200 Savings
7003005 Takumi Fujiwara 2,230,000 Checking
7003007 Brian Mills 1,200,000 Savings
7003009 Alfred Hitchcock 3,400,200 Checking

Result of 𝜋 Owner (Account)
Owner
Jane Smith
Alfred Hitchcock
Takumi Fujiwara
Brian Mills

Result of the SQL query
Number
7003001
7003003
7003005
7003007
7003009

• Relations are always sets
• Query answer is a set of names
• and Alfred Hitchcock, who owns two accounts, appears just once in the answer
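
The set-versus-bag distinction in projection can be sketched in a few lines of Python. Here a set comprehension stands in for 𝜋 and a list comprehension stands in for an SQL `SELECT Owner` without DISTINCT; both are modelling assumptions for this example.

```python
ACCOUNT = [
    (7003001, "Jane Smith", 1_000_000, "Savings"),
    (7003003, "Alfred Hitchcock", 4_400_200, "Savings"),
    (7003005, "Takumi Fujiwara", 2_230_000, "Checking"),
    (7003007, "Brian Mills", 1_200_000, "Savings"),
    (7003009, "Alfred Hitchcock", 3_400_200, "Checking"),
]
OWNER = 1  # position of the Owner attribute in each tuple

# 𝜋 Owner (Account): a set, so duplicates collapse
owners_set = {row[OWNER] for row in ACCOUNT}

# Bag semantics (like SELECT Owner FROM Account): duplicates are kept
owners_bag = [row[OWNER] for row in ACCOUNT]

print(len(owners_set), len(owners_bag))
```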

Database Management System 17


Combining Select and Project
• Are any of these equivalent ?

𝜋 Owner(𝝈 Balance < 3,000,000 (Account))


𝝈 Balance<3,000,000(𝜋 Owner, Balance (Account))
𝜋 Owner(𝝈 Balance<3,000,000(𝜋 Owner, Balance(Account)))
𝝈 Type = “checking” (𝝈 Balance<3,000,000(𝜋 Owner, Balance(Account)))

Account
Number Owner Balance Type
7003001 Jane Smith 1,000,000 Savings
7003003 Alfred Hitchcock 4,400,200 Savings
7003005 Takumi Fujiwara 2,230,000 Checking
7003007 Brian Mills 1,200,000 Savings
7003009 Alfred Hitchcock 3,400,200 Checking

Database Management System 18


Relational Algebra: Cross Product operator (×)
• Used in the basic definition of a relation
• “An instance of a relation is a subset of the cross product of its domains”

• Is also an operator in the relational algebra

Database Management System 19


Example
• Suppose we have the following two relations
Teacher(TID, Tname) Course(CID, Cname)

Teacher
TID Tname
101 Emma Thompson
105 Billy Elliot
110 John Waine

Course
CID Cname
346 How to Act
491 How to Think

• The cross product produces every possible combination of teachers and courses

Teacher × Course
SELECT * FROM Teacher, Course;

TID Tname CID Cname
101 Emma Thompson 346 How to Act
101 Emma Thompson 491 How to Think
105 Billy Elliot 346 How to Act
105 Billy Elliot 491 How to Think
110 John Waine 346 How to Act
110 John Waine 491 How to Think
Database Management System 20
Relational Algebra: Join operator (⋈)
• Join (⋈) is a binary operator
⋈:R×R→R
• It is always applied to two relations and returns one

Account ⋈ Number=Accnt Deposit

• Account, Deposit: relations or relational algebra expressions
• Number=Accnt: the join predicate (condition), of the form
  Attribute Comparator (≥, >, =, ≠, <, ≤) Attribute

Database Management System 21


Relational Algebra: Join operator (⋈)
Account
Number Owner Balance Type

Deposit
Accnt TxID Date Amount

• The relational algebra query

Account ⋈ Number=Accnt(Deposit)

• is equivalent to

𝝈 Number = Accnt (Account × Deposit)

Database Management System 22


Relational Algebra: Join operator (⋈)
• The join operator is defined for convenience

R1 ⋈ a1=a2R2 ≡ 𝝈 a1=a2 (R1 × R2)

• Any query with a join can always be rewritten into cross product followed by
selection
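
The rewrite rule can be made concrete with a small Python sketch in which a join is literally a filter over a cross product. The `theta_join` helper and the Deposit rows are hypothetical and exist only for this illustration.

```python
from itertools import product

# (Number, Owner), taken from the Account table
ACCOUNT = {(7003001, "Jane Smith"), (7003003, "Alfred Hitchcock")}
# (Accnt, TxID, Amount): hypothetical rows, not from the lecture
DEPOSIT = {(7003001, 1, 500), (7003001, 2, 250), (7003005, 3, 100)}

def theta_join(r1, r2, condition):
    """R1 ⋈_c R2, computed as σ_c(R1 × R2)."""
    return {t1 + t2 for t1, t2 in product(r1, r2) if condition(t1, t2)}

# Account ⋈ Number=Accnt Deposit
joined = theta_join(ACCOUNT, DEPOSIT, lambda a, d: a[0] == d[0])

# A join whose condition is always true is just the cross product
cross = theta_join(ACCOUNT, DEPOSIT, lambda a, d: True)

print(sorted(joined))
print(len(cross))
```

A real DBMS avoids materialising the full cross product, but the result is the same as this definition.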

Database Management System 23


Notes on Join
• Each simple Boolean predicate in the join condition must compare an
attribute from one relation to an attribute in the other relation

Account ⋈ Number = Accnt ∧ Type = “checking” Deposit

• Type = “checking” is not a join condition

• If you have a join with NO condition, then it is just a cross product

Database Management System 24


Examples
S instance of Student
sid name advisor age
101 Bill 301 20
102 John 302 20
103 Edward 301 19
104 Albert 301 19
105 Thompson 302 19

F instance of Faculty
fid name age
301 Morrison 45
302 Groot 37

• S ⋈ advisor=fid (F)
select * from Student as s, Faculty as f where s.advisor = f.fid;

• S ⋈ [Link] < [Link] (F)
select * from Student as s, Faculty as f where [Link] < [Link];

• The most common join is called an equi-join (for equality condition)

R1 ⋈ A1 = A2 R2
Database Management System 25
From an SQL statement to a relational algebra expression

SELECT DISTINCT attributes
FROM T1, T2, …
WHERE conditions

?

𝜋 attributes(𝝈 conditions (T1 × T2 × … ))

• SELECT-FROM-WHERE queries are sometimes described as equivalent to the
Select-Project-Join (SPJ) subset of relational algebra

Database Management System 26


Complex SQL

Database Management System 27


More SQL query constructs

Query skeleton
1. SELECT …
2. FROM …
3. WHERE …
4. (SELECT … FROM … WHERE …)
   UNION
   (SELECT … FROM … WHERE …)
5. ORDER BY …
   GROUP BY …
   HAVING …

Extensions
1. Extensions: SUM, COUNT, MIN, AVG, etc.
2. Extensions include various kinds of JOINs
3. Additional comparators, e.g., EXISTS, IN, ANY
4. Operators that take two or more complete SQL queries as arguments, e.g., UNION and INTERSECT
5. Several additional clauses, e.g., ORDER BY, GROUP BY, and HAVING

Database Management System 28



Database Management System 29


Sample Database
• Let’s consider the following DB for the examples

Customer(Number, Name, Address, Crating, Camount, Cbalance, Salesperson)
Salesperson(Number, Name, Address, Office)
foreign key: Customer.Salesperson -> Salesperson.Number

• We will also use other DBs from time to time

Database Management System 30


SELECT (1/4)
• Aggregate Operators: COUNT, SUM, MIN, MAX, and AVG

SELECT MIN(Cbalance), MAX(Cbalance), AVG(Cbalance)
FROM Customer;

SELECT MIN(Cbalance), MAX(Cbalance), AVG(Cbalance)
FROM Customer
WHERE age > 35;

• If one aggregate operator appears in the SELECT clause
• ALL OF THE ENTRIES in the SELECT clause MUST BE AGGREGATE OPERATORS
• Unless the query includes a GROUP BY clause (more on this later)
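
The first aggregate query can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the table is cut down to three columns, and the three balances are the ones used in the Customer table later in the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (Number INTEGER, Name TEXT, Cbalance INTEGER)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [(1, "Smith", 9000), (2, "Jones", 4000), (3, "Mills", 8000)],
)

# Every entry in the SELECT clause is an aggregate, so a single row comes back
row = conn.execute(
    "SELECT MIN(Cbalance), MAX(Cbalance), AVG(Cbalance) FROM Customer"
).fetchone()
print(row)
```

An aggregate query without GROUP BY collapses the whole table into one row, which is why a plain attribute cannot sensibly sit next to the aggregates.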

Database Management System 31


Stop to think
• What would/should the query result be?
• Is it allowed?

SELECT Name, Crating, AVG(Cbalance)


FROM Customer;

Database Management System 32


SELECT (2/4)
• What is the difference between these two queries?

SELECT COUNT(Name)
FROM Customer;

Vs.

SELECT DISTINCT Name
FROM Customer;

• When will these two queries return the same answer?
• Or, what are the conditions for it to happen?

Database Management System 33


SELECT (3/4)
• What is the implication of using DISTINCT
• When computing the SUM or AVG of an attribute?

SUM(DISTINCT age) Vs. SUM(age)

The SUM or AVG is computed over the distinct values only

• When computing the MIN or MAX of an attribute?

MIN(DISTINCT age) Vs. MIN(age)

No difference: the result does not depend on whether or not duplicates are removed
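
A quick way to see both effects is to run the aggregates in SQLite. The sketch below assumes a two-column Student table holding the ages from the earlier Student instance; the column choice is made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (sid INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [(101, 20), (102, 20), (103, 19), (104, 19), (105, 19)],
)

total, total_distinct, lowest, lowest_distinct = conn.execute(
    "SELECT SUM(age), SUM(DISTINCT age), MIN(age), MIN(DISTINCT age) FROM Student"
).fetchone()

# SUM changes when duplicates are removed; MIN does not
print(total, total_distinct, lowest, lowest_distinct)
```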

Database Management System 34


SELECT (4/4)
• SELECT clause list can also include simple arithmetic expressions using
+, -, *, /

SELECT (Camount - Cbalance) AS AvailableCredit, Name
FROM Customer
WHERE Camount > 0;

Database Management System 35



Database Management System 36


FROM: Syntactic Sugars and new operators
• There are a number of join types that can be expressed in the FROM clause
• Syntactic sugar that can be expressed using SELECT-FROM-WHERE queries
  • Inner join (the regular join)
  • Cross join
  • Natural join
• New operators
  • Left outer join
  • Right outer join
  • Full outer join

Database Management System 37


FROM
• These two queries are equivalent

1. SELECT [Link], [Link]


FROM Customer C JOIN Salesperson S ON [Link] = [Link]
WHERE [Link] < 6;

𝜋 [Link], [Link](𝝈[Link] < 6(Customer ⋈[Link] = [Link] Salesperson))

2. SELECT [Link], [Link]


FROM Customer C, Salesperson S
WHERE [Link] = [Link] AND [Link] < 6;

𝜋 [Link], [Link](𝝈[Link] < 6^[Link] = [Link](Customer × Salesperson))

Database Management System 38


FROM: JOIN with USING clause
• JOIN with USING clause when attributes in the 2 tables have the same name
Course(CNumber, CName, Description)
Teacher(TNumber, TName, Phone)
Offering(CNumber, TNumber, Time, Days, Room)

• These Two queries are equivalent


SELECT [Link], [Link], Room
FROM Course C JOIN Offering USING(CNumber);

SELECT [Link], [Link], Room


FROM Course C JOIN Offering O ON [Link]=[Link];

• USING clause doesn’t need (and can’t have) a correlation name

Database Management System 39


FROM: Basic Join ≡ (INNER) JOIN
• For the INNER JOIN
SELECT [Link], [Link]
FROM Customer C INNER JOIN Salesperson S ON [Link] = [Link];

• The query result includes all “matches” but excludes


• customer rows that do not have a Salesperson
• Salesperson rows that are not assigned to any customers

• The keyword “INNER” is optional


• above query is equivalent to

SELECT [Link], [Link]


FROM Customer C JOIN Salesperson S ON [Link] = [Link];

Database Management System 40


FROM: cross product ≡ CROSS JOIN
• The following queries are equivalent
SELECT *
FROM Customer, Salesperson;

SELECT *
FROM Customer CROSS JOIN Salesperson;

Database Management System 41


FROM: Equi-Join vs. Natural Join (1/3)
• When the join is based on equality of attributes, we always have two
identical attributes in the result

Faculty
Name DeptID
Smith 1
James 2
Brown 3
Johnson 1
Robert

Department
DeptID DeptName
1 Engineering
2 Communications
3 Marketing

SELECT *
FROM Faculty F INNER JOIN Department D
ON F.DeptID = D.DeptID;

Equi-Join
F.Name F.DeptID D.DeptID D.DeptName
Smith 1 1 Engineering
Johnson 1 1 Engineering
James 2 2 Communications
Brown 3 3 Marketing

Database Management System 42


FROM: Equi-Join vs. Natural Join (2/3)
• Equi-Join with the USING construct: applicable to columns that have the same
name

Faculty
Name DeptID
Smith 1
James 2
Brown 3
Johnson 1

Department
DeptID DeptName
1 Engineering
2 Communications
3 Marketing

SELECT *
FROM Faculty F INNER JOIN Department D
USING (DeptID);

Equi-Join with USING construct
Name DeptID DeptName
Smith 1 Engineering
Johnson 1 Engineering
James 2 Communications
Brown 3 Marketing

Database Management System 43


FROM: Equi-Join vs. Natural Join (3/3)
• NATURAL JOIN: Equi-Join with only one column for each pair of equally named
columns

Faculty
Name DeptID
Smith 1
James 2
Brown 3
Johnson 1

Department
DeptID DeptName
1 Engineering
2 Communications
3 Marketing

SELECT *
FROM Faculty NATURAL JOIN Department;

NATURAL JOIN
Name DeptID DeptName
Smith 1 Engineering
Johnson 1 Engineering
James 2 Communications
Brown 3 Marketing

If you don’t specify which attributes to join on, natural join will join on
all attributes with the same name

Database Management System 44
FROM: more on NATURAL JOIN (1/2)
• NATURAL JOIN is like a “macro” that joins tables with an equality condition
for all attributes with the same name

Course(CNumber, CName, Description)

Teacher(TNumber, TName, Phone)

Offering(CNumber, TNumber, Time, Days, Room)

• NATURAL JOIN automatically drops one column from each pair of duplicate columns
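
The column-dropping behaviour can be observed directly in SQLite by comparing the output columns of the three join forms. This sketch reuses the Faculty and Department tables from the equi-join slides; the `columns` helper is a hypothetical name introduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Faculty (Name TEXT, DeptID INTEGER);
    CREATE TABLE Department (DeptID INTEGER, DeptName TEXT);
    INSERT INTO Faculty VALUES ('Smith', 1), ('James', 2), ('Brown', 3), ('Johnson', 1);
    INSERT INTO Department VALUES (1, 'Engineering'), (2, 'Communications'), (3, 'Marketing');
""")

def columns(sql):
    """Return the output column names of a query."""
    return [d[0] for d in conn.execute(sql).description]

on_cols = columns(
    "SELECT * FROM Faculty F INNER JOIN Department D ON F.DeptID = D.DeptID"
)
using_cols = columns("SELECT * FROM Faculty INNER JOIN Department USING (DeptID)")
natural_cols = columns("SELECT * FROM Faculty NATURAL JOIN Department")

print(on_cols)       # DeptID appears twice
print(using_cols)    # DeptID appears once
print(natural_cols)  # DeptID appears once
```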

Database Management System 45


FROM: more on NATURAL JOIN (2/2)
• List the course and teacher name for all course offerings
• This query can be expressed with the NATURAL JOIN or with an INNER JOIN
• These two queries are equivalent
SELECT CName, TName
FROM Course C, Offering O, Teacher T
WHERE C.CNumber = O.CNumber AND O.TNumber = T.TNumber;

SELECT CName, TName
FROM Course NATURAL JOIN Offering NATURAL JOIN Teacher;

• They are equivalent because the join attributes have the same attribute names

• But is it always useful?

Database Management System 46


FROM: INNER JOIN Vs. OUTER JOIN (1/2)
• For the INNER JOIN
SELECT [Link], [Link]
FROM Customer C INNER JOIN Salesperson S ON [Link] = [Link]
• the query result does not include (p.40)
• a customer that does not have a salesperson
• a salesperson that is not assigned to any customers

Customer
Number  Name   Address   Crating  Camount  Cbalance  Salesperson
1       Smith  1st Str.  700      10,000   9,000     55
2       Jones  2nd Str.  700      8,000    4,000     77
3       Mills  3rd Str.  700      11,000   8,000     NULL

Salesperson
Number  Name    Address   Office
55      Miller  5th Str.  101
77      Khan    7th Str.  102
83      Dunham  8th Str.  103


Database Management System 47
FROM: INNER JOIN Vs. OUTER JOIN (2/2)
• An INNER (regular) JOIN includes only those customers that have
salespersons (only the matches)
SELECT [Link], [Link]
FROM Customer as C INNER JOIN Salesperson as S
ON [Link] = [Link];
• A LEFT OUTER JOIN will include all matches plus all customers that do not
have a salesperson
• A RIGHT OUTER JOIN will include all matches plus all salespersons that are
not assigned to any customers
• A FULL OUTER JOIN will include all of these
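
The difference between the join types can be reproduced with SQLite on the sample rows. This is a sketch under assumptions: the tables are reduced to three and two columns, the join condition is taken to be `C.Salesperson = S.Number` (following the foreign key in the sample database), and the right outer join is emulated by swapping the operands of a LEFT OUTER JOIN so the code does not depend on newer SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (Number INTEGER, Name TEXT, Salesperson INTEGER);
    CREATE TABLE Salesperson (Number INTEGER, Name TEXT);
    INSERT INTO Customer VALUES (1, 'Smith', 55), (2, 'Jones', 77), (3, 'Mills', NULL);
    INSERT INTO Salesperson VALUES (55, 'Miller'), (77, 'Khan'), (83, 'Dunham');
""")

inner = conn.execute("""
    SELECT C.Name, S.Name
    FROM Customer C INNER JOIN Salesperson S ON C.Salesperson = S.Number
    ORDER BY C.Number
""").fetchall()

left = conn.execute("""
    SELECT C.Name, S.Name
    FROM Customer C LEFT OUTER JOIN Salesperson S ON C.Salesperson = S.Number
    ORDER BY C.Number
""").fetchall()

# Same rows as Customer RIGHT OUTER JOIN Salesperson, with the operands swapped
right = conn.execute("""
    SELECT C.Name, S.Name
    FROM Salesperson S LEFT OUTER JOIN Customer C ON C.Salesperson = S.Number
    ORDER BY S.Number
""").fetchall()

print(inner)
print(left)   # adds Mills, who has no salesperson
print(right)  # adds Dunham, who has no customers
```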

Database Management System 48


FROM: LEFT OUTER JOIN
INNER JOIN on [Link] = [Link] gives:
1 Smith 1st Str. 700 10,000 9,000 55 55 Miller 5th Str. 101
2 Jones 2nd Str. 700 8,000 4,000 77 77 Khan 7th Str. 102

LEFT OUTER JOIN on [Link] = [Link] gives:


1 Smith 1st Str. 700 10,000 9,000 55 55 Miller 5th Str. 101
2 Jones 2nd Str. 700 8,000 4,000 77 77 Khan 7th Str. 102
3 Mills 3rd Str. 700 11,000 8,000 NULL NULL NULL NULL NULL

Customer
Number  Name   Address   Crating  Camount  Cbalance  Salesperson
1       Smith  1st Str.  700      10,000   9,000     55
2       Jones  2nd Str.  700      8,000    4,000     77
3       Mills  3rd Str.  700      11,000   8,000     NULL

Salesperson
Number  Name    Address   Office
55      Miller  5th Str.  101
77      Khan    7th Str.  102
83      Dunham  8th Str.  103


Database Management System 49
FROM: RIGHT OUTER JOIN
INNER JOIN on [Link] = [Link] gives:
1 Smith 1st Str. 700 10,000 9,000 55 55 Miller 5th Str. 101
2 Jones 2nd Str. 700 8,000 4,000 77 77 Khan 7th Str. 102

RIGHT OUTER JOIN on [Link] = [Link] gives:


1 Smith 1st Str. 700 10,000 9,000 55 55 Miller 5th Str. 101
2 Jones 2nd Str. 700 8,000 4,000 77 77 Khan 7th Str. 102
NULL NULL NULL NULL NULL NULL NULL 83 Dunham 8th Str. 103

Customer
Number  Name   Address   Crating  Camount  Cbalance  Salesperson
1       Smith  1st Str.  700      10,000   9,000     55
2       Jones  2nd Str.  700      8,000    4,000     77
3       Mills  3rd Str.  700      11,000   8,000     NULL

Salesperson
Number  Name    Address   Office
55      Miller  5th Str.  101
77      Khan    7th Str.  102
83      Dunham  8th Str.  103


Database Management System 50
FROM: FULL OUTER JOIN (* not supported in MySQL)

INNER JOIN on [Link] = [Link] gives:
1 Smith 1st Str. 700 10,000 9,000 55 55 Miller 5th Str. 101
2 Jones 2nd Str. 700 8,000 4,000 77 77 Khan 7th Str. 102

FULL OUTER JOIN on [Link] = [Link] gives:
1 Smith 1st Str. 700 10,000 9,000 55 55 Miller 5th Str. 101
2 Jones 2nd Str. 700 8,000 4,000 77 77 Khan 7th Str. 102
3 Mills 3rd Str. 700 11,000 8,000 NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL 83 Dunham 8th Str. 103

Customer
Number  Name   Address   Crating  Camount  Cbalance  Salesperson
1       Smith  1st Str.  700      10,000   9,000     55
2       Jones  2nd Str.  700      8,000    4,000     77
3       Mills  3rd Str.  700      11,000   8,000     NULL

Salesperson
Number  Name    Address   Office
55      Miller  5th Str.  101
77      Khan    7th Str.  102
83      Dunham  8th Str.  103


Database Management System 51
FROM: a form of subquery
• You can put a complete query expression in the FROM clause
• also known as nested queries or subqueries
• Parentheses are important

SELECT ...
FROM Employee E, (SELECT ... FROM ... WHERE ...)
WHERE ...

Database Management System 52


Relational Algebra Operators

Database Management System 53


Eight standard relational algebra operators
• 𝜋 project: we have seen already
• 𝝈 select: we have seen already
• ∪ union: from set theory (can only be used with union-compatible relations)
• ∩ intersect: from set theory (can only be used with union-compatible relations)
• – difference: from set theory (can only be used with union-compatible relations)
• × cross product: we have seen already
• ⋈ join: we have seen already
• ÷ divide
• 𝜌 renaming

Database Management System 54


Union-compatible relations
• Two relations are union-compatible if they
• have the same number of attributes
• have the same domains for corresponding attributes

• Example
Checking(CNum: int, COwner: string, CBalance: int)

Savings(SNum: int, SOwner: string, SBalance: int)

Database Management System 55


Example: ∪ union
Checking
Cnum Cowner Cbalance
101 Smith 1000
102 Mills 2000
104 Jones 1000
105 Schwab 3000

Savings
Snum Sowner Sbalance
103 Smith 5000

Checking ∪ Savings
Cnum Cowner Cbalance
101 Smith 1000
102 Mills 2000
104 Jones 1000
105 Schwab 3000
103 Smith 5000

Note that the attribute names are taken from the first relation in the query

SELECT CNum, COwner, CBalance
FROM Checking
UNION
SELECT SNum, SOwner, SBalance
FROM Savings;

Database Management System 56


Example: ∩ intersection

Checking ∩ Savings
It is empty: no tuples appear in both relations

𝜋 Cowner(Checking) ∩ 𝜋 Sowner(Savings)
Smith: the only owner in Savings, and also an owner in Checking

Checking
Cnum Cowner Cbalance
101 Smith 1000
102 Mills 2000
104 Jones 1000
105 Schwab 3000

Savings
Snum Sowner Sbalance
103 Smith 5000
Database Management System 57
Example: – difference (* not supported in MySQL before version 8.0.31)

• Find all tuples that are in the Checking relation but are not in the Savings
relation

Checking − Savings

• Find all owners in Checking who are not owners in Savings

𝜋 COwner(Checking) − 𝜋 SOwner(Savings)

• Result: everyone in Checking except Smith

• Workaround for the difference operation (example query)

SELECT * FROM p LEFT OUTER JOIN q ON [Link] = [Link] WHERE [Link] IS NULL
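
To see that the outer-join workaround matches the difference operator, both forms can be run side by side in SQLite, which supports EXCEPT. This is a sketch using the Checking and Savings rows from the union example; the join column (owner name) is an assumption made here, since the column names in the workaround query above are not recoverable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Checking (CNum INTEGER, COwner TEXT, CBalance INTEGER);
    CREATE TABLE Savings (SNum INTEGER, SOwner TEXT, SBalance INTEGER);
    INSERT INTO Checking VALUES
        (101, 'Smith', 1000), (102, 'Mills', 2000),
        (104, 'Jones', 1000), (105, 'Schwab', 3000);
    INSERT INTO Savings VALUES (103, 'Smith', 5000);
""")

# Difference written directly with EXCEPT
with_except = conn.execute("""
    SELECT COwner FROM Checking
    EXCEPT
    SELECT SOwner FROM Savings
    ORDER BY COwner
""").fetchall()

# The same difference as an anti-join: keep Checking rows with no match in Savings
with_outer_join = conn.execute("""
    SELECT DISTINCT C.COwner
    FROM Checking C LEFT OUTER JOIN Savings S ON C.COwner = S.SOwner
    WHERE S.SOwner IS NULL
    ORDER BY C.COwner
""").fetchall()

print(with_except)
print(with_outer_join)
```

DISTINCT is needed in the join form because EXCEPT returns a set while the join returns a bag.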

Database Management System 58



Database Management System 59


UNION and INTERSECTION
• Two complete queries with UNION in between

(SELECT [Link]
FROM Customer C
WHERE [Link] LIKE 'B%')
UNION
(SELECT [Link]
FROM Salesperson S
WHERE [Link] LIKE 'B%');

• Two complete queries with INTERSECT in between

(SELECT [Link]
FROM Customer C)
INTERSECT
(SELECT [Link]
FROM Salesperson S);

• Two complete queries with EXCEPT (i.e., DIFFERENCE) in between
• MySQL did not support EXCEPT before version 8.0.31

(SELECT [Link]
FROM Customer C)
EXCEPT
(SELECT [Link]
FROM Salesperson S);

Database Management System 60


ALL in UNION, INTERSECT, and EXCEPT
• If you don’t specify ALL, the result is computed on sets
• Eliminate duplicates from first operand
• Eliminate duplicates from second operand
• Compute operation
• Eliminate duplicates from result

• Note the difference and choose wisely


• UNION Vs. UNION ALL
• INTERSECT Vs. INTERSECT ALL
• EXCEPT Vs. EXCEPT ALL
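
The effect of ALL is easy to observe with UNION in SQLite. The sketch below uses the owner names from the Checking and Savings example, where Smith appears in both tables; only UNION ALL is shown because SQLite does not provide INTERSECT ALL or EXCEPT ALL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Checking (COwner TEXT);
    CREATE TABLE Savings (SOwner TEXT);
    INSERT INTO Checking VALUES ('Smith'), ('Mills'), ('Jones'), ('Schwab');
    INSERT INTO Savings VALUES ('Smith');
""")

# Set semantics: duplicates are eliminated
union_rows = conn.execute(
    "SELECT COwner FROM Checking UNION SELECT SOwner FROM Savings"
).fetchall()

# Bag semantics: every row from both operands is kept
union_all_rows = conn.execute(
    "SELECT COwner FROM Checking UNION ALL SELECT SOwner FROM Savings"
).fetchall()

print(len(union_rows), len(union_all_rows))
```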

Database Management System 61



Database Management System 62


GROUP BY
• Any SQL query can have the answer “grouped”
• one output row for each group

SELECT Salesperson, COUNT(*)
FROM Customer;

Vs.

SELECT Salesperson, COUNT(*)
FROM Customer
GROUP BY Salesperson;

Customer
Number  Name   Address   Crating  Camount  Cbalance  Salesperson
1       Smith  1st Str.  700      10,000   9,000     55
2       Jones  2nd Str.  700      8,000    4,000     77
3       Mills  3rd Str.  700      11,000   8,000     NULL

Result of the GROUP BY query
Salesperson  COUNT(*)
55           1
77           1
NULL         1

Database Management System 63


GROUP BY
SELECT Salesperson, COUNT(*)
FROM Customer
GROUP BY Salesperson;

Customer
Number Name Address Crating Camount Cbalance Salesperson

1 Smith 1st Str. 700 10,000 9,000 55

2 Jones 2nd Str. 700 8,000 4,000 77

3 Mills 3rd Str. 700 11,000 8,000 NULL

4 Bill 4th Str. 700 13,000 5,000 55

5 Jane 5th Str. 800 3,000 3,000 55

6 Harley 8th Str. 700 2,000 8,000 20

7 Khale 9th Str. 900 6,000 1,000 77

Database Management System 64


Example: GROUP BY
SELECT Salesperson, COUNT(*)
FROM Customer
GROUP BY Salesperson;

1. Make groups, resulting in 4 groups
2. Evaluate “SELECT Salesperson, COUNT(*)” for each group

Customer
Number  Name    Address   Crating  Camount  Cbalance  Salesperson
1       Smith   1st Str.  700      10,000   9,000     55
2       Jones   2nd Str.  700      8,000    4,000     77
3       Mills   3rd Str.  700      11,000   8,000     NULL
4       Bill    4th Str.  700      13,000   5,000     55
5       Jane    5th Str.  800      3,000    3,000     55
6       Harley  8th Str.  700      2,000    8,000     20
7       Khale   9th Str.  900      6,000    1,000     77

Result
Salesperson  COUNT(*)
55           3
NULL         1
77           2
20           1

Database Management System 65


SQL HAVING
• HAVING clause specifies a predicate evaluated against each group
• A group is in the result if it satisfies the HAVING condition

SELECT Salesperson, COUNT(*)
FROM Customer
GROUP BY Salesperson
HAVING COUNT(*) > 1;

Customer
Number  Name   Address   Crating  Camount  Cbalance  Salesperson
1       Smith  1st Str.  700      10,000   9,000     55
2       Jones  2nd Str.  700      8,000    4,000     55
3       Mills  3rd Str.  700      11,000   8,000     NULL

Result
Salesperson  COUNT(*)
55           2

Database Management System 66


Example: GROUP BY
SELECT Salesperson, COUNT(*)
FROM Customer
GROUP BY Salesperson
HAVING COUNT(*) > 1;

1. Make groups, resulting in 4 groups
2. Check if COUNT(*) > 1 holds
3. Evaluate “SELECT Salesperson, COUNT(*)” for each group

Customer
Number  Name    Address   Crating  Camount  Cbalance  Salesperson
1       Smith   1st Str.  700      10,000   9,000     55
2       Jones   2nd Str.  700      8,000    4,000     77
3       Mills   3rd Str.  700      11,000   8,000     NULL
4       Bill    4th Str.  700      13,000   5,000     55
5       Jane    5th Str.  800      3,000    3,000     55
6       Harley  8th Str.  700      2,000    8,000     20
7       Khale   9th Str.  900      6,000    1,000     77

Groups before HAVING
Salesperson  COUNT(*)
55           3
NULL         1
77           2
20           1

Result after HAVING
Salesperson  COUNT(*)
55           3
77           2
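
The grouping and filtering steps can be reproduced with SQLite on the seven-customer table. This sketch keeps only the three columns the query needs; the ORDER BY on the HAVING query is added here just to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (Number INTEGER, Name TEXT, Salesperson INTEGER)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [
        (1, "Smith", 55), (2, "Jones", 77), (3, "Mills", None), (4, "Bill", 55),
        (5, "Jane", 55), (6, "Harley", 20), (7, "Khale", 77),
    ],
)

# One output row per group; NULL forms its own group
grouped = conn.execute(
    "SELECT Salesperson, COUNT(*) FROM Customer GROUP BY Salesperson"
).fetchall()

# HAVING filters whole groups after they have been formed
having = conn.execute("""
    SELECT Salesperson, COUNT(*)
    FROM Customer
    GROUP BY Salesperson
    HAVING COUNT(*) > 1
    ORDER BY Salesperson
""").fetchall()

print(grouped)
print(having)
```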

Database Management System 67


Note on GROUP BY, HAVING
• The only items that can appear in the SELECT clause of a “grouped” query are
• the grouping attributes
• aggregate operators that are applied to the group

• Thus, the following is not legal

SELECT Name
FROM Customer GROUP BY Salesperson;

• Because there can be more than one name for each group

Database Management System 68


Exercise
Team(Name, Games, Wins, Losses, Conference)
Player(Name, Hits, AtBats, HomeRuns, Team)
foreign key: Player.Team -> Team.Name

• Write SQL queries for the following


• Average number of wins and losses across teams
• Average number of wins and losses per conference
• Batting average for each player, where batting average is the number of hits divided
by at bats

Database Management System 69


ORDER BY
• Sort the result of a query

SELECT Number, Name, Salesperson


FROM Customer
ORDER BY Name;
Customer (input)              | Result (sorted by Name)
Number  Name    … Salesperson | Number  Name    … Salesperson
1       Smith   … 55          | 4       Bill    … 55
2       Jones   … 77          | 6       Harley  … 20
3       Mills   … NULL        | 5       Jane    … 55
4       Bill    … 55          | 2       Jones   … 77
5       Jane    … 55          | 7       Khale   … 77
6       Harley  … 20          | 3       Mills   … NULL
7       Khale   … 77          | 1       Smith   … 55

Database Management System 70


ORDER BY
• Sort the result of a query

SELECT Number, Name, Salesperson


FROM Customer
ORDER BY Name DESC;
Customer (input)              | Result (sorted by Name, descending)
Number  Name    … Salesperson | Number  Name    … Salesperson
1       Smith   … 55          | 1       Smith   … 55
2       Jones   … 77          | 3       Mills   … NULL
3       Mills   … NULL        | 7       Khale   … 77
4       Bill    … 55          | 2       Jones   … 77
5       Jane    … 55          | 5       Jane    … 55
6       Harley  … 20          | 6       Harley  … 20
7       Khale   … 77          | 4       Bill    … 55

Database Management System 71


ORDER BY
• Sort the result of a query

SELECT Number, Name, Salesperson


FROM Customer
ORDER BY Name, Salesperson;
Customer (input)              | Result (sorted by Name, then Salesperson)
Number  Name    … Salesperson | Number  Name    … Salesperson
1       Smith   … 55          | 8       Bill    … 20
2       Jones   … 77          | 4       Bill    … 55
3       Mills   … NULL        | 6       Harley  … 20
4       Bill    … 55          | 5       Jane    … 55
5       Jane    … 55          | 2       Jones   … 77
6       Harley  … 20          | 7       Khale   … 77
7       Khale   … 77          | 3       Mills   … NULL
8       Bill    … 20          | 1       Smith   … 55

Database Management System 72


Subqueries
• It can be used in the where clause (in addition to the FROM clause)

SELECT [Link], [Link]                      -- outer query
FROM Customer C1
WHERE [Link] = (SELECT MAX([Link])         -- inner query
FROM Customer C2);

• Inner query returns
• A single value that represents the max credit rating

• Outer query returns
• The name and number of the customer(s) with the highest credit rating

Database Management System 73


Example
SELECT [Link], [Link]
FROM Customer C1
WHERE [Link] = (SELECT MAX([Link])
FROM Customer C2);

1. FROM clause in outer query


2. Take a row from the Customer table
3. Check if the row satisfies the WHERE clause
4. Evaluate the inner query (result: 800)
5. Evaluate if Crating is equal to the result

Customer
Number Name Address Crating Camount Cbalance Salesperson

1 Smith 1st Str. 200 10,000 9,000 55

2 Jones 2nd Str. 800 8,000 4,000 55

3 Mills 3rd Str. 700 11,000 8,000 NULL

Database Management System 74


Subqueries
• Subqueries can be used in the where clause (in addition to the from clause)

SELECT [Link], [Link]


FROM Customer C1
WHERE [Link] = (SELECT MAX([Link])
FROM Customer C2);

• Six comparators: =, >, <, >=, <=, <> (not equal)
• With these comparators, the inner query must return a single value

• If the inner query does not mention any attributes from the outer query (C1
not mentioned in the inner query)
• Then you only need to evaluate the inner query once
• The inner (sub) query is NOT correlated
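
An uncorrelated scalar subquery of this kind can be run in SQLite as follows. This is a sketch under assumptions: the column names `Number`, `Name`, and `Crating` are filled in from the sample schema, and the three rows are the ones shown in the step-by-step example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (Number INTEGER, Name TEXT, Crating INTEGER)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [(1, "Smith", 200), (2, "Jones", 800), (3, "Mills", 700)],
)

# The inner query does not mention C1, so it can be evaluated once
max_rating = conn.execute("SELECT MAX(Crating) FROM Customer").fetchone()[0]

best = conn.execute("""
    SELECT C1.Number, C1.Name
    FROM Customer C1
    WHERE C1.Crating = (SELECT MAX(C2.Crating) FROM Customer C2)
""").fetchall()

print(max_rating, best)
```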

Database Management System 75


Subqueries: SOME/ALL comparison
SELECT [Link]
FROM Salesperson S
WHERE [Link] = SOME (SELECT [Link]
FROM Customer C
WHERE [Link] = 700);
• For SOME, the expression must be true for at least one row in the subquery
answer
• “ANY” is equivalent to SOME

• What does this query return?

The names of the salespeople who have at least one customer with a credit rating of 700

Database Management System 76


Subqueries: SOME/ALL comparison
SELECT [Link]
FROM Salesperson S
WHERE [Link] = ALL (SELECT [Link]
FROM Customer C
WHERE [Link] = 700);
• For ALL, the expression must be true for all rows in the subquery answer

• What does this query return?

The name of the salesperson who has all the customers
with a rating of 700 (if such a salesperson exists)
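
The difference between SOME and ALL can be sketched with Python's `any` and `all`, since SQLite does not accept these quantifiers. Everything here is a simplified model: `eq_some` and `eq_all` are hypothetical helpers, the subquery answer is a hand-written list, and NULL handling is ignored.

```python
def eq_some(value, rows):
    """value = SOME (rows): true if value equals at least one row."""
    return any(value == r for r in rows)

def eq_all(value, rows):
    """value = ALL (rows): true if value equals every row (also true when rows is empty)."""
    return all(value == r for r in rows)

salespersons = {55: "Miller", 77: "Khan", 83: "Dunham"}
rated_700 = [55, 77]  # hypothetical subquery answer: salespersons of customers rated 700

some_names = sorted(n for num, n in salespersons.items() if eq_some(num, rated_700))
all_names = sorted(n for num, n in salespersons.items() if eq_all(num, rated_700))
all_names_empty = sorted(n for num, n in salespersons.items() if eq_all(num, []))

print(some_names)       # salespersons matching at least one row
print(all_names)        # nobody equals both 55 and 77
print(all_names_empty)  # with an empty subquery answer, ALL holds for everyone
```

The last line shows a subtle point: a comparison with ALL over an empty subquery answer is satisfied by every outer row.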

Database Management System 77


Subqueries: IN/NOT IN comparison (1/4)
SELECT [Link], [Link]
FROM Customer C1
WHERE [Link] IN (SELECT Name
FROM Salesperson);

• With IN, the attribute matches at least one value returned from the
subquery
• Same as “= SOME”

Database Management System 78


Subqueries: IN/NOT IN comparison (2/4)
SELECT [Link], [Link]
FROM Customer C1
WHERE [Link] NOT IN (SELECT Name
FROM Salesperson);

• With NOT IN, the attribute matches none of the values returned from the
subquery
• Same as “<> ALL”

Database Management System 79


Subqueries: IN/NOT IN comparison (3/4)
• Are these equivalent?
• Do we need to use DISTINCT for these to be equivalent?
• Is the subquery correlated?

SELECT [Link], [Link]


FROM Salesperson S
WHERE [Link] IN (SELECT [Link]
FROM Customer C);
SELECT DISTINCT [Link], [Link]
FROM Salesperson S, Customer C
WHERE [Link] = [Link];

Database Management System 80


Subqueries: IN/NOT IN comparison (4/4)
SELECT [Link], [Link]
FROM Salesperson S
WHERE [Link] IN (SELECT [Link]
FROM Customer C
WHERE [Link] = [Link]);

• Because the subquery mentions an attribute from a table in the outer query
• The subquery must be (re-)evaluated for each row in the outer query (each time the
WHERE clause is evaluated)
• Correlated subqueries can be very expensive!

Database Management System 81


Subqueries: EXISTS/NOT EXISTS (1/2)
SELECT [Link]
FROM Customer C
WHERE EXISTS (SELECT *
FROM Salesperson S
WHERE [Link] = [Link] AND
[Link] = [Link]);

• If the answer to the subquery is not empty ... then the EXISTS predicate
returns TRUE
• Is this subquery correlated?
• What does this query return?

Database Management System 82


Subqueries: EXISTS/NOT EXISTS (2/2)
SELECT [Link]
FROM Customer C
WHERE EXISTS (SELECT *
FROM Salesperson S
WHERE [Link] = [Link] AND
[Link] = [Link]);
• Four predicates can be applied to a subquery
• EXISTS : is the subquery answer non-empty?
• NOT EXISTS : is the subquery answer empty?
• UNIQUE : does the subquery answer contain no duplicate rows?
• NOT UNIQUE : does the subquery answer contain duplicate rows?
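
EXISTS and NOT EXISTS can be tried in SQLite with a correlated subquery. The condition below, `S.Number = C.Salesperson`, is an assumption chosen for this sketch and is not necessarily the condition used in the query above; the rows come from the sample Customer and Salesperson tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (Number INTEGER, Name TEXT, Salesperson INTEGER);
    CREATE TABLE Salesperson (Number INTEGER, Name TEXT);
    INSERT INTO Customer VALUES (1, 'Smith', 55), (2, 'Jones', 77), (3, 'Mills', NULL);
    INSERT INTO Salesperson VALUES (55, 'Miller'), (77, 'Khan'), (83, 'Dunham');
""")

# The subquery mentions C from the outer query, so it is correlated
with_salesperson = conn.execute("""
    SELECT C.Name FROM Customer C
    WHERE EXISTS (SELECT * FROM Salesperson S WHERE S.Number = C.Salesperson)
    ORDER BY C.Name
""").fetchall()

without_salesperson = conn.execute("""
    SELECT C.Name FROM Customer C
    WHERE NOT EXISTS (SELECT * FROM Salesperson S WHERE S.Number = C.Salesperson)
    ORDER BY C.Name
""").fetchall()

print(with_salesperson)
print(without_salesperson)
```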

Database Management System 83


Missing Relational Algebra Operator: Divide

Database Management System 84


Divide Operator (p. 54)
• Suppose we have an extra table in our database

Account
Number Owner Balance Type
7003001 Jane Smith 1,000,000 Savings
7003003 Alfred Hitchcock 4,400,200 Savings
7003005 Takumi Fujiwara 2,230,000 Checking
7003007 Brian Mills 1,200,000 Savings

AccountTypes
Type
Checking
Savings

• How do we find customers that have at least one account of each account
type?

𝜋Owner,Type(Account) ÷ AccountTypes
Find account owners who have ALL types of accounts
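
Division can be sketched in Python as "keep the owners that are paired with every type". The `divide` helper is a hypothetical name, and it handles only the two-attribute case needed here. With the four accounts above the result is empty; the second call adds the Checking account of Alfred Hitchcock that appears in the five-row version of Account used earlier in the lecture.

```python
def divide(r, s):
    """r ÷ s for r a set of (x, y) pairs and s a set of y values:
    the x values that are paired in r with every y in s."""
    candidates = {x for x, _ in r}
    return {x for x in candidates if all((x, y) in r for y in s)}

ACCOUNT_TYPES = {"Checking", "Savings"}

# 𝜋 Owner,Type (Account) for the four-row Account table
owner_type = {
    ("Jane Smith", "Savings"),
    ("Alfred Hitchcock", "Savings"),
    ("Takumi Fujiwara", "Checking"),
    ("Brian Mills", "Savings"),
}

four_rows = divide(owner_type, ACCOUNT_TYPES)
five_rows = divide(owner_type | {("Alfred Hitchcock", "Checking")}, ACCOUNT_TYPES)

print(four_rows)  # nobody holds both account types
print(five_rows)
```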

Database Management System 85


For Next Week
• Review – Quiz on the material
• Ch. 4 to 4.2
• Ch. 5.5

• Reading assignments
• Ch. 2-2.5
• Ch. 3.5

• Be sure you understand


• Aggregate operations
• how join operates
• set operators
• GROUP BY, HAVING, ORDER BY, Subqueries

Database Management System 86
