
International University, VNU-HCMC

School of Computer Science and Engineering

Lecture 4: Relational Model and Algebra

Instructor: Nguyen Thi Thuy Loan


nttloan@[Link], nthithuyloan@[Link]
[Link]


Course Website
• Blackboard IU
• Please check frequently for updates!
Assoc. Prof. Nguyen Thi Thuy Loan, PhD

Duke CS, Fall 2022

Outline
• Relational model
• Relational algebra

Acknowledgement
• The following slides are referenced from Dr. Sudeepa
Roy, Duke University.

Edgar F. Codd (1923-2003)


• Pilot in the Royal Air Force in WW2
• Inventor of the relational model and algebra while at IBM


• Turing Award, 1981

RDBMS = Relational DBMS

[Link]

The famous “Beers” database


• Drinkers frequent bars “X” times a week
• Bars serve beers at price “Y”
• Drinkers like beers
• Each bar has an address, each drinker has an address, and each beer has a brewer

(Later, in ER diagrams: how to design a relational database)
“Beers” as a Relational Database

See the online database for more tuples.

Bar
name            address
The Edge        108 Morris Street
Satisfaction    905 W. Main Street

Beer
name         brewer
Budweiser    Anheuser-Busch Inc.
Corona       Grupo Modelo
Dixie        Dixie Brewing

Drinker
name    address
Amy     100 W. Main Street
Ben     101 W. Main Street
Dan     300 N. Duke Street

Serves
bar             beer         price
The Edge        Budweiser    2.50
The Edge        Corona       3.00
Satisfaction    Budweiser    2.25

Frequents
drinker    bar             times_a_week
Ben        Satisfaction    2
Dan        The Edge        1
Dan        Satisfaction    2

Likes
drinker    beer
Amy        Corona
Dan        Budweiser
Dan        Corona
Ben        Budweiser

Relational data model


• A database is a collection of relations (or tables)
• Each relation has a set of attributes (or columns)
• Each attribute has a name and a domain (or type)
  • Set-valued attributes are not allowed
• Each relation contains a “set” of tuples (or rows)
  • Each tuple has a value for each attribute of the relation
  • Duplicate tuples are not allowed (two tuples are duplicates if they agree on all attributes)
• Ordering of rows doesn’t matter (even though output is always in some order)
• However, SQL supports “bags”, i.e., duplicate tuples (why?)

Simplicity is a virtue!

Serves
bar             beer         price
The Edge        Budweiser    2.50
The Edge        Corona       3.00
Satisfaction    Budweiser    2.25
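The set semantics described for relations can be sketched in a few lines of Python. This is only an illustration, assuming a relation is modelled as a Python set of tuples; the variable names and the (bar, beer, price) tuple layout are choices made here, not part of the slides.

```python
# Assumption: a relation instance is modelled as a set of (bar, beer, price) tuples.
serves = {
    ("The Edge", "Budweiser", 2.50),
    ("The Edge", "Corona", 3.00),
    ("Satisfaction", "Budweiser", 2.25),
}

# Adding a tuple that is already present changes nothing: no duplicates in a set.
serves_with_dup = serves | {("The Edge", "Corona", 3.00)}

# Listing the same tuples in a different order yields the same relation.
reordered = {
    ("Satisfaction", "Budweiser", 2.25),
    ("The Edge", "Corona", 3.00),
    ("The Edge", "Budweiser", 2.50),
}

print(len(serves_with_dup))  # number of distinct tuples
print(serves == reordered)   # set equality ignores ordering
```

A Python list would behave like a SQL “bag” instead: appending the duplicate row would increase its length.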

Schema vs. instance


Serves
bar             beer         price
The Edge        Budweiser    2.50
The Edge        Corona       3.00
Satisfaction    Budweiser    2.25

Beer
name         brewer
Budweiser    Anheuser-Busch Inc.
Corona       Grupo Modelo
Dixie        Dixie Brewing

Frequents
drinker    bar             times_a_week
Ben        Satisfaction    2
Dan        The Edge        1
Dan        Satisfaction    2

• Ordering of rows doesn’t matter (even though output is always in some order)

Schema vs. instance


• Schema (metadata)
  • Specifies the logical structure of data
  • Is defined at setup time
  • Rarely changes
• Instance
  • Represents the data content
  • Changes rapidly, but always conforms to the schema

Compare to types vs. collections of objects of these types in a programming language

Example
• Schema (metadata)
  • Beer (name string, brewer string)
  • Serves (bar string, beer string, price float)
  • Frequents (drinker string, bar string, times_a_week int)
• Instance
  • Beer {<Budweiser, Anheuser-Busch Inc.>, <Corona, Grupo Modelo>, …}
  • Serves {<The Edge, Budweiser, 2.50>, <The Edge, Corona, 3.00>, …}
  • Frequents {<Ben, Satisfaction, 2>, <Dan, The Edge, 1>, …}
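The “types vs. collections of objects” analogy can be made concrete with a small Python sketch. It assumes a schema is written as a tuple of (attribute name, Python type) pairs; `SERVES_SCHEMA` and `conforms` are hypothetical names introduced only for this illustration.

```python
# Assumption: a schema is a tuple of (attribute name, Python type) pairs.
SERVES_SCHEMA = (("bar", str), ("beer", str), ("price", float))


def conforms(row, schema):
    """Return True if the row has exactly one value of the right type per attribute."""
    return len(row) == len(schema) and all(
        isinstance(value, typ) for value, (_, typ) in zip(row, schema)
    )


# The instance is the data content; it may change often.
serves_instance = {
    ("The Edge", "Budweiser", 2.50),
    ("The Edge", "Corona", 3.00),
}

# Every tuple of the instance must conform to the (rarely changing) schema.
all_ok = all(conforms(row, SERVES_SCHEMA) for row in serves_instance)

# A tuple with a missing price does not conform.
bad_row_ok = conforms(("The Edge", "Corona"), SERVES_SCHEMA)
```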

Relational algebra
A language for querying relational data based on “operators”

[Figure: operators (RelOp) chained together]

• Core operators:
  • Selection, projection, cross product, union, difference, and renaming
• Additional, derived operators:
  • Join, natural join, intersection, etc.
• Compose operators to make complex queries

Selection
• Input: a table $R$
• Notation: $\sigma_p R$
  • $p$ is called a selection condition (or predicate)
• Purpose: filter rows according to some criteria
• Output: same columns as $R$, but only rows of $R$ that satisfy $p$ (set!)

Selection example
Find beers with price < 2.75

$\sigma_{price < 2.75}\, Serves$

Serves
bar             beer         price
The Edge        Budweiser    2.50
The Edge        Corona       3.00
Satisfaction    Budweiser    2.25

Result
bar             beer         price
The Edge        Budweiser    2.50
Satisfaction    Budweiser    2.25
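Selection amounts to filtering rows with a predicate that is evaluated on one row at a time. A minimal sketch, assuming relations are sets of (bar, beer, price) tuples and using a hypothetical helper named `select`:

```python
def select(rows, predicate):
    """sigma_p(R): keep exactly the rows of R for which predicate(row) holds."""
    return {row for row in rows if predicate(row)}


# Rows are (bar, beer, price) tuples, as in the Serves table.
serves = {
    ("The Edge", "Budweiser", 2.50),
    ("The Edge", "Corona", 3.00),
    ("Satisfaction", "Budweiser", 2.25),
}

# sigma_{price < 2.75} Serves
cheap = select(serves, lambda row: row[2] < 2.75)
```

The predicate only receives a single row, which is why a condition such as “price ≥ every price in Serves” cannot be written as a selection condition.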

More on selection
• Selection condition can include any column of $R$, constants, comparisons (=, ≤, etc.), and Boolean connectives (∧: and, ∨: or, ¬: not)
  • Example: Serves tuples for “The Edge” or price ≥ 2.75
    $\sigma_{bar = \text{'The Edge'} \lor price \ge 2.75}\, Serves$
• You must be able to evaluate the condition over each single row of the input table!
  • Example: the most expensive beer at any bar
    WRONG! $\sigma_{price \ge \text{every price in Serves}}\, Serves$

Serves
bar             beer         price
The Edge        Budweiser    2.50
The Edge        Corona       3.00
Satisfaction    Budweiser    2.25

Projection
• Input: a table $R$
• Notation: $\pi_L R$
  • $L$ is a list of columns in $R$
• Purpose: output chosen columns
• Output: same rows, but only the columns in $L$ (set!)

Projection
Example: Find all the prices for each beer

$\pi_{beer, price}\, Serves$

Serves
bar             beer         price
The Edge        Budweiser    2.50
The Edge        Corona       3.00
Satisfaction    Budweiser    2.25

Result
beer         price
Budweiser    2.50
Corona       3.00
Budweiser    2.25

Output of $\pi_{beer}\, Serves$?

More on Projection

• Duplicate output rows are removed (by definition)
• Example: beers in Serves

$\pi_{beer}\, Serves$

Serves
bar             beer         price
The Edge        Budweiser    2.50
The Edge        Corona       3.00
Satisfaction    Budweiser    2.25

Result
beer
Budweiser
Corona

(The second Budweiser row is a duplicate and is removed.)
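Projection with duplicate elimination can be sketched as follows, assuming rows are (bar, beer, price) tuples and columns are addressed by position; `project` is a hypothetical helper, not an API from the slides.

```python
def project(rows, positions):
    """pi_L(R): keep only the columns at the given positions.

    Building a set removes duplicate output rows automatically.
    """
    return {tuple(row[i] for i in positions) for row in rows}


serves = {
    ("The Edge", "Budweiser", 2.50),
    ("The Edge", "Corona", 3.00),
    ("Satisfaction", "Budweiser", 2.25),
}

beer_price = project(serves, (1, 2))  # pi_{beer, price} Serves
beers = project(serves, (1,))         # pi_{beer} Serves
```

Here `beer_price` keeps three rows because the prices differ, while `beers` collapses the two Budweiser rows into one.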

Cross product
• Input: two tables $R$ and $S$
• Notation: $R \times S$
• Purpose: pairs rows from two tables
• Output: for each row $r$ in $R$ and each row $s$ in $S$, output a row $rs$ (concatenation of $r$ and $s$)

Cross product
Bar
name            address
The Edge        108 Morris Street
Satisfaction    905 W. Main Street

Frequents
drinker    bar             times_a_week
Ben        Satisfaction    2
Dan        The Edge        1
Dan        Satisfaction    2

Bar × Frequents
name            address               drinker    bar             times_a_week
The Edge        108 Morris Street     Ben        Satisfaction    2
The Edge        108 Morris Street     Dan        The Edge        1
The Edge        108 Morris Street     Dan        Satisfaction    2
Satisfaction    905 W. Main Street    Ben        Satisfaction    2
Satisfaction    905 W. Main Street    Dan        The Edge        1
Satisfaction    905 W. Main Street    Dan        Satisfaction    2

Cross product
• Ordering of columns is unimportant as far as contents are concerned.

name            address               drinker    bar             times_a_week
The Edge        108 Morris Street     Ben        Satisfaction    2
The Edge        108 Morris Street     Dan        The Edge        1
The Edge        108 Morris Street     Dan        Satisfaction    2
Satisfaction    905 W. Main Street    Ben        Satisfaction    2
Satisfaction    905 W. Main Street    Dan        The Edge        1
Satisfaction    905 W. Main Street    Dan        Satisfaction    2

=

drinker    bar             times_a_week    name            address
Ben        Satisfaction    2               The Edge        108 Morris Street
Dan        The Edge        1               The Edge        108 Morris Street
Dan        Satisfaction    2               The Edge        108 Morris Street
Ben        Satisfaction    2               Satisfaction    905 W. Main Street
Dan        The Edge        1               Satisfaction    905 W. Main Street
Dan        Satisfaction    2               Satisfaction    905 W. Main Street

• So cross product is commutative, i.e., for any $R$ and $S$, $R \times S = S \times R$ (up to the ordering of columns)
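The cross product and its commutativity up to column order can be checked with a short Python sketch. It assumes relations are sets of tuples; `cross` is a hypothetical helper name.

```python
from itertools import product


def cross(r, s):
    """R x S: concatenate every row of R with every row of S."""
    return {a + b for a, b in product(r, s)}


bar = {
    ("The Edge", "108 Morris Street"),
    ("Satisfaction", "905 W. Main Street"),
}
frequents = {
    ("Ben", "Satisfaction", 2),
    ("Dan", "The Edge", 1),
    ("Dan", "Satisfaction", 2),
}

bar_x_frequents = cross(bar, frequents)  # columns: name, address, drinker, bar, times
frequents_x_bar = cross(frequents, bar)  # columns: drinker, bar, times, name, address

# Move the last two columns (name, address) to the front to align the column order.
reordered = {row[3:] + row[:3] for row in frequents_x_bar}
```

After the columns are reordered, both products contain the same six rows.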

Derived operator: join


(Also known as “theta-join”: the most general form of join. One of the most important operations!)
• Input: two tables $R$ and $S$
• Notation: $R \bowtie_p S$
  • $p$ is called a join condition (or predicate)
• Purpose: relate rows from two tables according to some criteria
• Output: for each row $r$ in $R$ and each row $s$ in $S$, output a row $rs$ if $r$ and $s$ satisfy $p$
• Shorthand for $\sigma_p(R \times S)$
• If the predicate $p$ uses only equality (e.g., $A = 5 \land B = 7$), the join is called an equijoin

Join example
• Extend Frequents relation with addresses of the bars
  $Frequents \bowtie_{bar = name} Bar$

Ambiguous attribute? Prefix a column reference with the table name and “.” to disambiguate identically named columns from different tables. Ex. Use [Link]

Bar
name            address
The Edge        108 Morris Street
Satisfaction    905 W. Main Street

Frequents
drinker    bar             times_a_week
Ben        Satisfaction    2
Dan        The Edge        1
Dan        Satisfaction    2

Result (only the pairs of rows with bar = name survive; the other three pairs of the cross product are discarded)
name            address               drinker    bar             times_a_week
The Edge        108 Morris Street     Dan        The Edge        1
Satisfaction    905 W. Main Street    Ben        Satisfaction    2
Satisfaction    905 W. Main Street    Dan        Satisfaction    2
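Since a theta-join is shorthand for a selection over a cross product, it can be sketched directly from that definition. The helper name `theta_join` and the tuple layouts are assumptions made for this illustration.

```python
from itertools import product


def theta_join(r, s, predicate):
    """R join_p S, computed as sigma_p(R x S)."""
    return {a + b for a, b in product(r, s) if predicate(a, b)}


bar = {
    ("The Edge", "108 Morris Street"),       # (name, address)
    ("Satisfaction", "905 W. Main Street"),
}
frequents = {
    ("Ben", "Satisfaction", 2),               # (drinker, bar, times_a_week)
    ("Dan", "The Edge", 1),
    ("Dan", "Satisfaction", 2),
}

# Join condition bar = name: compare Frequents.bar (f[1]) with Bar.name (b[0]).
joined = theta_join(bar, frequents, lambda b, f: f[1] == b[0])
```

A real DBMS would typically avoid materializing the full cross product; this sketch follows the definition rather than an efficient algorithm.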

Join Types
• Theta Join
• Equi-Join
• Natural Join
• Later: (left/right) outer join, semi-join

Derived operator: natural join


• Input: two tables $R$ and $S$
• Notation: $R \bowtie S$ (i.e., no subscript)
• Purpose: relate rows from two tables, and
  • Enforce equality between identically named columns
  • Eliminate one copy of identically named columns
• Shorthand for $\pi_L(R \bowtie_p S)$, where
  • $p$ equates each pair of columns common to $R$ and $S$
  • $L$ is the union of column names from $R$ and $S$ (with duplicate columns removed)

Natural join example


$Serves \bowtie Likes$
$= \pi_{?}(Serves \bowtie_{?} Likes)$
$= \pi_{bar, beer, price, drinker}(Serves \bowtie_{\text{[Link]} = \text{[Link]}} Likes)$

Serves
bar             beer         price
The Edge        Budweiser    2.50
The Edge        Corona       3.00
Satisfaction    Budweiser    2.25

Likes
drinker    beer
Amy        Corona
Dan        Budweiser
Dan        Corona
Ben        Budweiser

$Serves \bowtie Likes$
bar         beer         price    drinker
The Edge    Budweiser    2.50     Dan
The Edge    Budweiser    2.50     Ben
The Edge    Corona       3.00     Amy
The Edge    Corona       3.00     Dan
...         ...          ...      ...

Natural join is on beer. Only one column for beer in the output.
What happens if the tables have two or more common columns?
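The natural join definition (equate all common columns, keep one copy of each) can be sketched in Python. This is a rough illustration that assumes a relation is given as a tuple of column names plus a set of rows; `natural_join` is a hypothetical helper.

```python
def natural_join(r_cols, r_rows, s_cols, s_rows):
    """Natural join: equate identically named columns and keep one copy of each."""
    common = [c for c in r_cols if c in s_cols]
    s_extra = [c for c in s_cols if c not in common]
    out_cols = tuple(r_cols) + tuple(s_extra)
    out_rows = set()
    for r in r_rows:
        r_map = dict(zip(r_cols, r))
        for s in s_rows:
            s_map = dict(zip(s_cols, s))
            # Every common column must agree, however many there are.
            if all(r_map[c] == s_map[c] for c in common):
                out_rows.add(tuple(r) + tuple(s_map[c] for c in s_extra))
    return out_cols, out_rows


serves_cols = ("bar", "beer", "price")
serves = {
    ("The Edge", "Budweiser", 2.50),
    ("The Edge", "Corona", 3.00),
    ("Satisfaction", "Budweiser", 2.25),
}
likes_cols = ("drinker", "beer")
likes = {
    ("Amy", "Corona"),
    ("Dan", "Budweiser"),
    ("Dan", "Corona"),
    ("Ben", "Budweiser"),
}

cols, rows = natural_join(serves_cols, serves, likes_cols, likes)
```

With two or more common columns, the same code requires all of them to match, which is one way to approach the question posed on the slide.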

Union
• Input: two tables $R$ and $S$
• Notation: $R \cup S$
  • $R$ and $S$ must have identical schema (important for set operations: union compatibility)
• Output:
  • Has the same schema as $R$ and $S$
  • Contains all rows in $R$ and all rows in $S$ (with duplicate rows removed)

Example on board

Difference
• Input: two tables $R$ and $S$
• Notation: $R - S$
  • $R$ and $S$ must have identical schema (important for set operations: union compatibility)
• Output:
  • Has the same schema as $R$ and $S$
  • Contains all rows in $R$ that are not in $S$

Example on board

Derived operator: intersection


• Input: two tables $R$ and $S$
• Notation: $R \cap S$
  • $R$ and $S$ must have identical schema (important for set operations: union compatibility)
• Output:
  • Has the same schema as $R$ and $S$
  • Contains all rows that are in both $R$ and $S$
• How can you write it using other operators?
  • Shorthand for $R - (R - S)$
  • Also equivalent to $S - (S - R)$
  • And to $R \bowtie S$
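The identities for intersection can be checked on small examples with Python sets. The two relations below are hypothetical (drinker, beer) tables with identical schema, chosen only for illustration.

```python
# Hypothetical union-compatible relations over (drinker, beer).
r = {("Amy", "Corona"), ("Dan", "Budweiser"), ("Dan", "Corona")}
s = {("Dan", "Corona"), ("Ben", "Budweiser"), ("Amy", "Corona")}

union = r | s        # R ∪ S, duplicates removed
difference = r - s   # R − S

# Intersection expressed with difference only.
inter_via_r = r - (r - s)
inter_via_s = s - (s - r)

# With identical schemas, a natural join equates every column,
# so a row survives only if it appears in both relations.
inter_via_join = {a for a in r for b in s if a == b}
```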

Renaming
• Input: a table $R$
• Notation: $\rho_S R$, $\rho_{(A_1, A_2, \ldots)} R$, or $\rho_{S(A_1, A_2, \ldots)} R$
• Purpose: “rename” a table and/or its columns
• Output: a table with the same rows as $R$, but called differently
• Used to
  • Avoid confusion caused by identical column names
  • Create identical column names for natural joins
• As with all other relational operators, it doesn’t modify the database
  • Think of the renamed table as a copy of the original

Renaming example
• Find drinkers who frequent both “The Edge” and “Satisfaction”

Frequents
drinker    bar             times_a_week
Ben        Satisfaction    2
Dan        The Edge        1
Dan        Satisfaction    2

WRONG!
$\pi_{drinker}(Frequents \bowtie_{bar = \text{'The Edge'} \land bar = \text{'Satisfaction'} \land drinker = drinker} Frequents)$

Rename!
$\pi_{d1}(\rho_{(d1, b1, t1)} Frequents \bowtie_{b1 = \text{'The Edge'} \land b2 = \text{'Satisfaction'} \land d1 = d2} \rho_{(d2, b2, t2)} Frequents)$
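The renamed self-join can be sketched in Python, where tuple unpacking plays the role of the two renamings. The variable names d1, b1, t1 and d2, b2, t2 mirror the renamed columns; everything else is an assumption of this illustration.

```python
from itertools import product

frequents = {
    ("Ben", "Satisfaction", 2),  # (drinker, bar, times_a_week)
    ("Dan", "The Edge", 1),
    ("Dan", "Satisfaction", 2),
}

# Two "copies" of Frequents; unpacking each row under different names
# corresponds to rho_(d1,b1,t1) and rho_(d2,b2,t2).
both_bars = {
    (d1,)
    for (d1, b1, t1), (d2, b2, t2) in product(frequents, frequents)
    if b1 == "The Edge" and b2 == "Satisfaction" and d1 == d2
}
```

Without distinct names for the two copies, conditions such as bar = 'The Edge' and bar = 'Satisfaction' would not say which copy each refers to.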

Expression tree notation


• Find addresses of all bars that ‘Dan’ frequents

Bar
name            address
The Edge        108 Morris Street
Satisfaction    905 W. Main Street

Frequents
drinker    bar             times_a_week
Ben        Satisfaction    2
Dan        The Edge        1
Dan        Satisfaction    2

Expression tree (also called a logical plan tree):

$\pi_{address}$
  $\bowtie_{bar = name}$
    $Bar$
    $\sigma_{drinker = \text{'Dan'}}$
      $Frequents$

Equivalent to
$\pi_{address}(Bar \bowtie_{bar = name} (\sigma_{drinker = \text{'Dan'}} Frequents))$

What if you move $\sigma$ to the top? Still correct? More or less efficient?
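The question about moving the selection to the top can be explored with a small Python experiment. This is a sketch under the assumption that relations are sets of tuples; `join_on_bar_name` is a hypothetical helper.

```python
from itertools import product

bar = {
    ("The Edge", "108 Morris Street"),       # (name, address)
    ("Satisfaction", "905 W. Main Street"),
}
frequents = {
    ("Ben", "Satisfaction", 2),               # (drinker, bar, times_a_week)
    ("Dan", "The Edge", 1),
    ("Dan", "Satisfaction", 2),
}


def join_on_bar_name(bars, freqs):
    """Bar join_{bar = name} Frequents; output rows are (name, address, drinker, bar, times)."""
    return {b + f for b, f in product(bars, freqs) if f[1] == b[0]}


# Plan from the slide: select first, then join, then project on address.
dan_rows = {f for f in frequents if f[0] == "Dan"}
plan_select_first = {(row[1],) for row in join_on_bar_name(bar, dan_rows)}

# Alternative plan: join first, select at the top, then project on address.
joined_all = join_on_bar_name(bar, frequents)
plan_select_last = {(row[1],) for row in joined_all if row[2] == "Dan"}
```

Both plans return the same addresses, but selecting first feeds fewer rows into the join (two Frequents rows instead of three in this example).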

Summary of core operators


• Selection: $\sigma_p R$
• Projection: $\pi_L R$
• Cross product: $R \times S$
• Union: $R \cup S$
• Difference: $R - S$
• Renaming: $\rho_{S(A_1, A_2, \ldots)} R$
  • Does not really add “processing” power

Summary of derived operators


• Join: $R \bowtie_p S$
• Natural join: $R \bowtie S$
• Intersection: $R \cap S$
• Many more
  • Semijoin, anti-semijoin, quotient, …
Exercise

Frequents(drinker, bar, times_of_week)
Bar(name, address)
Drinker(name, address)

• Bars that drinkers at address “300 N. Duke Street” do not frequent

Answer: (all bars) $-$ (bars that the drinkers at this address frequent)

$-$
  $\rho_{bar}$    (all bars)
    $\pi_{name}$
      $Bar$
  $\pi_{bar}$    (bars that the drinkers at this address frequent)
    $\bowtie_{drinker = name}$
      $Frequents$
      $\sigma_{address = \text{'300 N. Duke Street'}}$
        $Drinker$
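The exercise answer (all bars minus the bars frequented by drinkers at that address) can be traced step by step in Python. The third bar, “Hypothetical Pub”, is not in the lecture data; it is added here only so that the result is non-empty.

```python
bar = {
    ("The Edge", "108 Morris Street"),
    ("Satisfaction", "905 W. Main Street"),
    ("Hypothetical Pub", "1 Example Road"),  # hypothetical row, not from the slides
}
drinker = {
    ("Amy", "100 W. Main Street"),
    ("Ben", "101 W. Main Street"),
    ("Dan", "300 N. Duke Street"),
}
frequents = {
    ("Ben", "Satisfaction", 2),
    ("Dan", "The Edge", 1),
    ("Dan", "Satisfaction", 2),
}

# rho_bar(pi_name Bar): all bars, as one-column rows.
all_bars = {(name,) for (name, _address) in bar}

# sigma_{address = '300 N. Duke Street'} Drinker
at_address = {d for d in drinker if d[1] == "300 N. Duke Street"}

# pi_bar(Frequents join_{drinker = name} selected drinkers)
frequented = {
    (b,)
    for (d, b, _times) in frequents
    for (name, _address) in at_address
    if d == name
}

not_frequented = all_bars - frequented
```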
A trickier Exercise

Frequents(drinker, bar, times_of_week)
Bar(name, address)
Drinker(name, address)

• For each bar, find the drinkers who frequent it the maximum number of times a week
• Who do NOT visit a bar the maximum number of times?
  • Those whose times_of_week is lower than somebody else’s for a given bar

$-$
  $\pi_{bar, drinker}$
    $Frequents$
  $\pi_{\text{[Link]}, \text{[Link]}}$
    $\bowtie_{\text{[Link]-of-week} < \text{[Link]-of-week} \land \text{[Link]} = \text{[Link]}}$
      $\rho_{F1}\, Frequents$
      $\rho_{F2}\, Frequents$

A deeper question: when (and why) is “−” needed?
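The “everything minus the non-maximal ones” idea can be sketched in Python. The sketch assumes the elided column references in the plan are the bar, drinker, and times_of_week columns of the two renamed copies F1 and F2; the row for Amy is hypothetical and added only so that one drinker is actually eliminated.

```python
from itertools import product

frequents = {
    ("Ben", "Satisfaction", 2),  # (drinker, bar, times_of_week)
    ("Dan", "The Edge", 1),
    ("Dan", "Satisfaction", 2),
    ("Amy", "The Edge", 3),      # hypothetical row, not from the slides
}

# pi_{bar, drinker} Frequents: every (bar, drinker) pair.
all_pairs = {(b, d) for (d, b, _t) in frequents}

# Self-join of two copies F1, F2: F1 visits the same bar fewer times than F2.
not_max = {
    (b1, d1)
    for (d1, b1, t1), (d2, b2, t2) in product(frequents, frequents)
    if b1 == b2 and t1 < t2
}

# Difference: pairs that nobody beats at the same bar.
max_visitors = all_pairs - not_max
```

Ties are kept: Ben and Dan both visit Satisfaction twice, so both remain in the answer.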

Expressions in a Single Assignment


• Example: the theta-join $R3 = R1 \bowtie_C R2$ can be written: $R3 := \sigma_C(R1 \times R2)$
• Precedence of relational operators:
  1. [$\sigma$, $\pi$, $\rho$] (highest)
  2. [$\times$, $\bowtie$]
  3. $\cap$
  4. [$\cup$, $-$]

Find names of sailors who’ve reserved boat #103

Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
Reserves(sid, bid, day)
Find sailors who’ve reserved a red or a green boat

Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
Reserves(sid, bid, day)

Use of rename operation

• Can identify all red or green boats, then find sailors who’ve reserved one of these boats:

$\rho(Tempboats, (\sigma_{color = \text{'red'} \lor color = \text{'green'}} Boats))$

$\pi_{sname}(Tempboats \bowtie Reserves \bowtie Sailors)$

Can also define Tempboats using union. Try the “AND” version yourself.
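The two-step query (name the red or green boats, then join with Reserves and Sailors) can be followed in a Python sketch. All rows below are hypothetical sample data invented for this illustration; only the schemas come from the slide.

```python
# Hypothetical sample data; only the schemas are taken from the slide.
sailors = {   # (sid, sname, rating, age)
    (22, "Dustin", 7, 45.0),
    (31, "Lubber", 8, 55.5),
    (58, "Rusty", 10, 35.0),
}
boats = {     # (bid, bname, color)
    (101, "Interlake", "blue"),
    (102, "Interlake", "red"),
    (103, "Clipper", "green"),
}
reserves = {  # (sid, bid, day)
    (22, 102, "day1"),
    (31, 103, "day2"),
    (58, 101, "day3"),
}

# rho(Tempboats, sigma_{color = 'red' or color = 'green'} Boats)
tempboats = {b for b in boats if b[2] == "red" or b[2] == "green"}

# pi_sname(Tempboats join Reserves join Sailors): the natural joins
# equate bid between Tempboats and Reserves, and sid between Reserves and Sailors.
names = {
    (sname,)
    for (bid, _bname, _color) in tempboats
    for (r_sid, r_bid, _day) in reserves
    for (sid, sname, _rating, _age) in sailors
    if bid == r_bid and r_sid == sid
}
```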

Division
• Not supported as a primitive operator, but useful for expressing queries like: Find sailors who have reserved all boats.
• Let $A$ have 2 fields, $x$ and $y$; $B$ have only field $y$:

$A/B = \pi_x(A) - \pi_x((\pi_x(A) \times B) - A)$

– i.e., $A/B$ contains all $x$ tuples (sailors) such that for every $y$ tuple (boat) in $B$, there is an $xy$ tuple in $A$.
– Or: if the set of $y$ values (boats) associated with an $x$ value (sailor) in $A$ contains all $y$ values in $B$, the $x$ value is in $A/B$.

Examples of Division A/B


$A/B = \pi_x(A) - \pi_x((\pi_x(A) \times B) - A)$

A
sno    pno
s1     p1
s1     p2
s1     p3
s1     p4
s2     p1
s2     p2
s3     p2
s4     p2
s4     p4

B1 (pno): p2
B2 (pno): p2, p4
B3 (pno): p1, p2, p4

A/B1 (sno): s1, s2, s3, s4
A/B2 (sno): s1, s4
A/B3 (sno): s1
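The division formula can be executed directly on the example tables. This sketch assumes $A$ is a set of (sno, pno) pairs and each $B$ is a set of one-column rows; `divide` is a hypothetical helper that returns the qualifying sno values.

```python
def divide(a, b):
    """A/B for A(x, y) and B(y), following pi_x(A) - pi_x((pi_x(A) x B) - A)."""
    xs = {x for (x, _y) in a}                           # pi_x(A)
    all_pairs = {(x, y) for x in xs for (y,) in b}      # pi_x(A) x B
    disqualified = {x for (x, _y) in all_pairs - a}     # pi_x((pi_x(A) x B) - A)
    return xs - disqualified


a = {
    ("s1", "p1"), ("s1", "p2"), ("s1", "p3"), ("s1", "p4"),
    ("s2", "p1"), ("s2", "p2"),
    ("s3", "p2"),
    ("s4", "p2"), ("s4", "p4"),
}
b1 = {("p2",)}
b2 = {("p2",), ("p4",)}
b3 = {("p1",), ("p2",), ("p4",)}

q1 = divide(a, b1)
q2 = divide(a, b2)
q3 = divide(a, b3)
```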

Expressing A/B Using Basic Operators


• Division is not an essential operator; just a useful shorthand.
  – (Also true of joins, but joins are so common that systems implement joins specially)
• Idea: for $A/B$, compute all $x$ values that are not “disqualified” by some $y$ value in $B$.
  – An $x$ value is disqualified if, by attaching a $y$ value from $B$, we obtain an $xy$ tuple that is not in $A$.

Disqualified $x$ values: $\pi_x((\pi_x(A) \times B) - A)$

$A/B$: $\pi_x(A) -$ all disqualified tuples, i.e., $\pi_x(A) - \pi_x((\pi_x(A) \times B) - A)$
Find the name of sailors who’ve reserved all boats

Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
Reserves(sid, bid, day)

• Uses division; schemas of the input relations to / must be carefully chosen:

$\rho(Tempsids, (\pi_{sid, bid} Reserves) / (\pi_{bid} Boats))$
$\pi_{sname}(Tempsids \bowtie Sailors)$

• To find sailors who’ve reserved all ‘Interlake’ boats:

..... $/\ \pi_{bid}(\sigma_{bname = \text{'Interlake'}} Boats)$

Monotone operators
Add more rows to the input of a RelOp... What happens to the output?

• If some old output rows may need to be removed
  • Then the operator is non-monotone
• Otherwise the operator is monotone
  • That is, old output rows always remain “correct” when more rows are added to the input
  • Formally, for a monotone operator $op$: $R \subseteq R'$ implies $op(R) \subseteq op(R')$ for any $R, R'$

Which operators are non-monotone?

• Selection, $\sigma_p R$: monotone
• Projection, $\pi_L R$: monotone
• Cross product, $R \times S$: monotone
• Join, $R \bowtie_p S$: monotone
• Natural join, $R \bowtie S$: monotone
• Union, $R \cup S$: monotone
• Difference, $R - S$: monotone w.r.t. $R$; non-monotone w.r.t. $S$
• Intersection, $R \cap S$: monotone
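The monotonicity claims for difference can be spot-checked with Python sets. The one-column relations below are hypothetical; a single example can refute monotonicity but only illustrates, rather than proves, the monotone cases.

```python
# Hypothetical one-column relations of beer names.
r = {("Budweiser",), ("Corona",)}
s = {("Corona",)}

bigger_r = r | {("Dixie",)}       # R' with R ⊆ R'
bigger_s = s | {("Budweiser",)}   # S' with S ⊆ S'

# Growing R never removes old output rows of R − S.
monotone_in_r = (r - s) <= (bigger_r - s)

# Growing S can remove old output rows of R − S.
monotone_in_s = (r - s) <= (r - bigger_s)

# Union, by contrast, keeps all old output rows when an input grows.
union_ok = (r | s) <= (bigger_r | s) and (r | s) <= (r | bigger_s)
```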

Why is “−” needed for “highest”?


• Composition of monotone operators produces a monotone query
  • Old output rows remain “correct” when more rows are added to the input
• Is the “highest” query monotone?
  • No!
  • Current highest price 3.0
  • Add another row with price 3.01
  • Old answer is invalidated

So it must use difference!

Why do we need core operator X?


• Difference: the only non-monotone operator
• Projection: the only operator that removes columns
• Cross product: the only operator that adds columns
• Union: the only operator that allows you to add rows? A more rigorous argument?
• Selection: homework problem

Extensions to relational algebra


• Duplicate handling (“bag algebra”)
• Grouping and aggregation
• “Extension” (or “extended projection”) to allow new column values to be computed

☞ All these will come up when we talk about SQL
☞ But for now we will stick to standard relational algebra without these extensions

Why is RA a good query language?


• Simple:
  – A small set of core operators
  – Semantics are easy to grasp
• Declarative?
  – Yes, compared with older languages like CODASYL
  – Though assembling operators into a query does feel somewhat “procedural”
• Complete?
  – With respect to what?

Operations of Relational Algebra



Thank you for your attention!
