Sample Midterm Solutions
• Good luck!
Name:
NEU ID (optional):
1 Multiple-Choice Questions (40 points)
Circle ALL the correct choices: there may be more than one correct choice, but there is always
at least one correct choice. NO partial credit: the set of all the correct answers must be checked.
There are 10 multiple choice questions worth 4 points each.
1. Which of the following describes the role of a Data Definition Language (DDL) in a DBMS?
(c). DDL is used for defining and managing the structure of the database objects, like creating,
altering, and deleting tables, indexes, and schemas.
2. The following ERD describes a relation between employees, managers, and departments:
3. Suppose that S(A, ...) is a relation containing m rows and T(B, ...) is a relation containing
n rows. Assume that A is the primary key of S and B is a foreign key reference to S(A).
Suppose that we perform an inner join between S and T on S.A = T.B. The maximum
number of rows that can be in the output is:
(a) m
(b) n
(c) m + n
(d) mn
(e) max(m, n)
(b). When we perform an inner join S.A = T.B, each row in T can match at most one row
of S (since the values of the primary key A are unique), therefore the maximum number of
rows we could have in the result is n.
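The bound can be checked on a small instance. The following is a minimal sketch using Python's sqlite3 module; the concrete rows (m = 3, n = 5) are made up for illustration and are not part of the exam.

```python
import sqlite3

# In-memory database with hypothetical sample data: m = 3 rows in S, n = 5 rows in T.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S (A INTEGER PRIMARY KEY);
    CREATE TABLE T (B INTEGER REFERENCES S(A));
    INSERT INTO S (A) VALUES (1), (2), (3);
    INSERT INTO T (B) VALUES (1), (1), (2), (2), (3);
""")

m = conn.execute("SELECT COUNT(*) FROM S").fetchone()[0]
n = conn.execute("SELECT COUNT(*) FROM T").fetchone()[0]

# Each row of T matches at most one row of S, so the join has at most n rows.
joined = conn.execute("SELECT COUNT(*) FROM S JOIN T ON S.A = T.B").fetchone()[0]
print(m, n, joined)
```

Here every row of T has a non-NULL B, so the join reaches the maximum of n rows. A row of T with a NULL in B would match nothing, and the result would be smaller than n.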
(a) 1
(b) 3
(c) 5
(d) 10
(b)
(b) and (c).
(a): While it is common for a foreign key to reference a primary key in another table, it can
also reference a unique key in another table.
(b): A table can have only one primary key, but it can have multiple foreign keys referencing
different tables.
(c): Foreign keys establish referential integrity constraints, ensuring that relationships between
tables are maintained. They prevent actions that would create inconsistencies in linked
data, such as deleting a record that is referenced by another table.
(d): A foreign key can have NULL values unless it is explicitly defined as NOT NULL.
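Points (a), (c), and (d) can be observed in a small experiment. The sketch below uses Python's sqlite3 module; the dept and emp tables and their columns are hypothetical names chosen for illustration. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, code TEXT UNIQUE);
    CREATE TABLE emp (
        emp_id    INTEGER PRIMARY KEY,
        dept_id   INTEGER REFERENCES dept(dept_id),  -- references a primary key
        dept_code TEXT REFERENCES dept(code)         -- references a unique key, point (a)
    );
    INSERT INTO dept VALUES (1, 'HR');
    INSERT INTO emp VALUES (10, 1, 'HR');
    INSERT INTO emp VALUES (11, NULL, NULL);         -- NULL foreign keys are accepted, point (d)
""")

def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Point (c): a dangling reference and a delete of a referenced row are both rejected.
dangling = violates("INSERT INTO emp VALUES (12, 99, NULL)")
delete_blocked = violates("DELETE FROM dept WHERE dept_id = 1")
print(dangling, delete_blocked)
```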
You are asked to find the entire lineage (all superiors) of employee with id 100 up to the
highest level in the organization. Which SQL statement will retrieve this information?
(d) WITH RecursiveCTE AS (
SELECT EmployeeID, FirstName, LastName, ManagerID
FROM Employees
WHERE EmployeeID = 100
UNION ALL
SELECT e.EmployeeID, e.FirstName, e.LastName, e.ManagerID
FROM Employees e
JOIN RecursiveCTE r ON e.ManagerID = r.ManagerID
)
SELECT FirstName, LastName FROM RecursiveCTE;
(a). In this option, the recursive CTE starts with the specific employee (WHERE EmployeeID =
100) and then, in each subsequent iteration, fetches the direct manager of the last retrieved
employee by joining on e.EmployeeID = r.ManagerID. The recursion stops when the last
retrieved employee has no manager, so the join returns no new rows.
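The behavior of the join condition e.EmployeeID = r.ManagerID can be traced on a tiny hierarchy. This is a minimal sketch with Python's sqlite3 module; the four sample employees are invented for illustration, and the query is written with WITH RECURSIVE, which SQLite accepts (some systems, such as SQL Server, omit the RECURSIVE keyword).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        FirstName  TEXT,
        LastName   TEXT,
        ManagerID  INTEGER REFERENCES Employees(EmployeeID)
    );
    -- Hypothetical data: 1 manages 20, and 20 manages both 100 and 101.
    INSERT INTO Employees VALUES
        (1,   'Ada', 'Chief',    NULL),
        (20,  'Ben', 'Director', 1),
        (100, 'Cy',  'Engineer', 20),
        (101, 'Di',  'Engineer', 20);
""")

rows = conn.execute("""
    WITH RECURSIVE RecursiveCTE AS (
        SELECT EmployeeID, FirstName, LastName, ManagerID
        FROM Employees
        WHERE EmployeeID = 100
        UNION ALL
        SELECT e.EmployeeID, e.FirstName, e.LastName, e.ManagerID
        FROM Employees e
        JOIN RecursiveCTE r ON e.EmployeeID = r.ManagerID
    )
    SELECT EmployeeID FROM RecursiveCTE
""").fetchall()

# Employee 100 plus every superior; the sibling 101 is not part of the lineage.
lineage = sorted(r[0] for r in rows)
print(lineage)
```

The result also contains employee 100 itself, because the anchor member of the CTE selects that row.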
(d).
(a): The dependencies AD → B and E → G are not preserved.
(b): The dependency AD → B is not preserved.
(c): The dependency B → E is not preserved.
(a) 1NF
(b) 2NF
(c) 3NF
(d) BCNF
(a) and (b). The table is in 1NF since every attribute in the table is atomic, and it is also in
2NF since no non-prime attribute is functionally dependent on a proper subset of the primary
key (the primary key consists of the single column OrderId, so partial dependencies are
impossible). However, it is not in 3NF or BCNF: in the functional dependency ProductId →
ProductName, the determinant ProductId is not a candidate key of the table, and
ProductName is not part of any candidate key.
9. Which of the following is true about foreign keys?
11. What is the purpose of a cursor object when working with databases in Python?
(c)
2 ERD (15 points)
Design a system for movie rentals that manages movie collections, customer rentals, payments, and
employee interactions. Draw an ERD using the provided entities and their relationships. Mark
primary keys, and indicate the cardinality and participation (total/partial) of relationships.
System description:
• A director has an id, first name, last name, and birth date.
• Each movie has one director, but a director can direct multiple movies.
• A customer has id, first name, last name, address, and email.
• A movie can be rented multiple times by different customers at different times. For each
rental, the system stores the rental date, due date, and return date.
• Each rental is associated with a payment. A payment has amount and date.
• An employee has id, first name, last name, and hire date.
• A rental can be processed by only one employee, but an employee can process multiple rentals.
3 Data Normalization (15 points)
Consider the relational schema R(A, B, C, D, E, G) with the functional dependencies:
F = {AB → C, C → A, BC → D, ACD → B, D → EG, BE → C, CG → BD, CE → AG}.
2. The candidate keys are AB, BC, BD, BE, CD, CE, CG.
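The list of candidate keys can be verified mechanically with attribute closures. The following Python sketch is one brute-force way to do it, not necessarily the method intended in the course; the helper names closure and candidate_keys are chosen here for illustration.

```python
from itertools import combinations

ATTRS = "ABCDEG"
# The functional dependencies F from the problem statement, as (lhs, rhs) pairs.
FDS = [("AB", "C"), ("C", "A"), ("BC", "D"), ("ACD", "B"),
       ("D", "EG"), ("BE", "C"), ("CG", "BD"), ("CE", "AG")]

def closure(attrs, fds=FDS):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs=ATTRS, fds=FDS):
    """Enumerate attribute sets by increasing size and keep the minimal superkeys."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = set(combo)
            if any(set(k) <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attrs):
                keys.append("".join(combo))
    return keys

keys = candidate_keys()
print(keys)
```

For example, closure("BD") first adds E and G via D → EG, then C via BE → C, then A via C → A, so BD determines every attribute of R.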
4. We first split on C → A, and get the relations R1 = (A, C) and R2 = (B, C, D, E, G). The
relation R2 is not in BCNF, because the dependency D → EG violates BCNF (D is not
a superkey of R2). Therefore, we split it into R21 = (D, E, G) and R22 = (B, C, D).
Both R21 and R22 are in BCNF. The final relations are: R1 = (A, C), R21 = (D, E, G), R22 =
(B, C, D).
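The BCNF claims in this decomposition can be checked by brute force: a relation is in BCNF if every subset X of its attributes either determines nothing new inside the relation or determines all of it. The Python sketch below applies this test using closures under the original F; the function name bcnf_violations is a hypothetical helper introduced for illustration.

```python
from itertools import combinations

# The functional dependencies F from the problem statement, as (lhs, rhs) pairs.
FDS = [("AB", "C"), ("C", "A"), ("BC", "D"), ("ACD", "B"),
       ("D", "EG"), ("BE", "C"), ("CG", "BD"), ("CE", "AG")]

def closure(attrs):
    """Return the set of attributes functionally determined by attrs under F."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in FDS:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(relation):
    """List (X, Y) pairs where X -> Y holds inside relation but X is not a superkey of it."""
    rel = set(relation)
    bad = []
    for size in range(1, len(rel)):
        for combo in combinations(sorted(rel), size):
            x = set(combo)
            determined = closure(x) & rel  # projection of the closure onto the relation
            if determined != x and determined != rel:
                bad.append(("".join(combo), "".join(sorted(determined - x))))
    return bad

for name, rel in [("R1", "AC"), ("R2", "BCDEG"), ("R21", "DEG"), ("R22", "BCD")]:
    print(name, bcnf_violations(rel))
```

The intermediate relation R2 reports D → EG among its violations, while R1, R21, and R22 report none.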
4 SQL (30 points)
You are given the following schemas of a database for a coaching club.
• The relation coach contains data on coaches that work in the club. Each coach has an id,
name, e-mail address, date of starting his/her job as a coach (from_date) and hourly rate.
• The relation types contains data on the coaching types offered in the club, including the type
name (e.g., "life coaching", "career coaching", etc.) and description.
• The relation coaches describes which coaching types are offered by each coach. Each coaching
type has at least one coach who offers it.
• The relation clients contains data on clients of the club. Each client has an id, name,
address and mobile phone. Each client has at least one training program.
• The relation training_program contains data on the training programs of the clients. For
each client, it stores the starting date of the program (start_date), id of the coach, the
coaching type (type_name), and the number of total hours planned for the program. The
cost of the program is calculated based on the number of hours and the hourly rate of the
selected coach.
1. Find all the clients that have been trained in a coaching type whose description contains the
word "life".
SELECT DISTINCT tp.client_id
FROM training_program AS tp
JOIN types AS t
ON tp.type_name = t.type_name
WHERE t.description LIKE '%life%';
2. Find pairs of different clients who started their training on the same day with the same coach.
The result columns should include the ids of the clients and the starting date of the
training. Each such pair should appear only once in the result.
SELECT t1.client_id, t2.client_id, t1.start_date
FROM training_program AS t1
JOIN training_program AS t2
ON t1.client_id < t2.client_id
WHERE t1.start_date = t2.start_date AND t1.coach_id = t2.coach_id;
3. Find all the clients who have never been trained by a coach named Levi.
SELECT client_id
FROM clients
WHERE client_id NOT IN (
SELECT tp.client_id
FROM training_program AS tp
JOIN coach AS c
ON tp.coach_id = c.coach_id
WHERE c.name = 'Levi'
);
4. Find for each client the total amount he/she has to pay for all his/her training programs.
SELECT tp.client_id, SUM(tp.hours * c.hourly_rate) AS total_pay
FROM training_program AS tp
JOIN coach AS c
ON tp.coach_id = c.coach_id
GROUP BY tp.client_id;
5. Find clients who have been trained in all the coaching types offered by the club.
SELECT c.client_id
FROM clients AS c
WHERE NOT EXISTS (
SELECT *
FROM types
WHERE type_name NOT IN (
SELECT tp.type_name
FROM training_program AS tp
WHERE tp.client_id = c.client_id
)
);
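The double negation above ("there is no coaching type that the client has not been trained in") is the standard way to express relational division in SQL. The sketch below runs it with Python's sqlite3 module on a reduced, hypothetical version of the schema (only the columns the query needs, with invented rows) and compares it with a counting formulation, which is an alternative and not the solution given above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced schema with hypothetical data: client 1 covers every type, client 2 does not.
    CREATE TABLE clients (client_id INTEGER PRIMARY KEY);
    CREATE TABLE types (type_name TEXT PRIMARY KEY);
    CREATE TABLE training_program (client_id INTEGER, type_name TEXT);
    INSERT INTO clients VALUES (1), (2);
    INSERT INTO types VALUES ('life coaching'), ('career coaching');
    INSERT INTO training_program VALUES
        (1, 'life coaching'), (1, 'career coaching'), (2, 'life coaching');
""")

# Division by double negation, as in the solution.
NOT_EXISTS_QUERY = """
    SELECT c.client_id
    FROM clients AS c
    WHERE NOT EXISTS (
        SELECT *
        FROM types
        WHERE type_name NOT IN (
            SELECT tp.type_name
            FROM training_program AS tp
            WHERE tp.client_id = c.client_id
        )
    )
"""

# Alternative: compare the number of distinct types per client with the total number of types.
COUNT_QUERY = """
    SELECT client_id
    FROM training_program
    GROUP BY client_id
    HAVING COUNT(DISTINCT type_name) = (SELECT COUNT(*) FROM types)
"""

by_not_exists = [r[0] for r in conn.execute(NOT_EXISTS_QUERY)]
by_count = [r[0] for r in conn.execute(COUNT_QUERY)]
print(by_not_exists, by_count)
```

The counting version assumes that every type_name in training_program also appears in types; the NOT EXISTS version does not depend on that assumption.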
6. Find clients who have had at least 3 different training programs, and all their training
programs have been carried out by the same coach.
SELECT client_id
FROM training_program
WHERE client_id NOT IN (
SELECT t1.client_id
FROM training_program AS t1
JOIN training_program AS t2
ON t1.client_id = t2.client_id
WHERE t1.coach_id <> t2.coach_id
)
GROUP BY client_id
HAVING COUNT(*) >= 3;
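The self-join filters out every client who appears with two different coaches, and the HAVING clause then keeps those with at least three programs. The sketch below runs this query with Python's sqlite3 module on invented rows and compares it with a purely aggregate formulation, which is an alternative and not the solution given above; the table is reduced to the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE training_program (client_id INTEGER, coach_id INTEGER, type_name TEXT);
    -- Hypothetical data: client 1 has 3 programs with one coach,
    -- client 2 has 3 programs with two coaches, client 3 has only 2 programs.
    INSERT INTO training_program VALUES
        (1, 7, 'life'), (1, 7, 'career'), (1, 7, 'health'),
        (2, 7, 'life'), (2, 7, 'career'), (2, 8, 'health'),
        (3, 7, 'life'), (3, 7, 'career');
""")

# The solution's approach: exclude clients seen with two different coaches.
SELF_JOIN_QUERY = """
    SELECT client_id
    FROM training_program
    WHERE client_id NOT IN (
        SELECT t1.client_id
        FROM training_program AS t1
        JOIN training_program AS t2
        ON t1.client_id = t2.client_id
        WHERE t1.coach_id <> t2.coach_id
    )
    GROUP BY client_id
    HAVING COUNT(*) >= 3
"""

# Alternative: count programs and distinct coaches in a single grouping pass.
AGGREGATE_QUERY = """
    SELECT client_id
    FROM training_program
    GROUP BY client_id
    HAVING COUNT(*) >= 3 AND COUNT(DISTINCT coach_id) = 1
"""

by_self_join = [r[0] for r in conn.execute(SELF_JOIN_QUERY)]
by_aggregate = [r[0] for r in conn.execute(AGGREGATE_QUERY)]
print(by_self_join, by_aggregate)
```

Both formulations assume that coach_id is never NULL in training_program.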