DATABASE MANAGEMENT SYSTEM
1. Data and Information
Data refers to raw, unprocessed facts (e.g., numbers, text, or images).
Information is data that has been processed or given context so that it has
meaning (e.g., the values "John" and "25" become information when read as
"John is 25 years old").
2. Database & Database Management System (DBMS)
A database is an organized collection of data stored electronically.
A DBMS is software that helps store, retrieve, update, and manage data
efficiently (e.g., MySQL, PostgreSQL).
3. DBMS vs RDBMS
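DBMS: The general term for any database software. A non-relational DBMS may
store data as files or hierarchical structures without enforcing relationships
between records (e.g., file-based systems, XML stores).
RDBMS: A DBMS based on the relational model. It stores data in tables linked
by keys and typically supports ACID transactions (e.g., MySQL, Oracle).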
4. Database Models
Flat Model: Data is stored in a single table with no relationships (like a
spreadsheet).
Hierarchical Model: Organizes data in a tree structure with parent-child
relationships (e.g., XML, Windows Registry).
Network Model: Uses graph-like connections in which a child record can have
more than one parent.
Relational Model: Stores data in tables with rows and columns,
supporting relationships using keys.
5. Codd’s 12 Rules (Any five)
1. Information Rule: All information must be represented as values in tables.
2. Guaranteed Access: Every data element is accessible using a table name,
primary key, and column name.
3. Systematic Treatment of NULLs: NULL values should be treated
consistently.
4. Dynamic Online Catalog: Metadata should be stored as tables and
accessed using the same query language as ordinary data (e.g., SQL).
5. Comprehensive Data Language: The database must support a language
for defining, accessing, and managing data (like SQL).
6. Entity
An entity is any real-world object or concept that can be
distinctly identified in a database. It represents something that has
attributes (characteristics) and can be stored as a record in a table.
Types of Entities
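Strong Entity: Has its own primary key and can exist independently (e.g.,
Student).
Weak Entity: Cannot be uniquely identified by its own attributes and depends
on an owner entity (e.g., a Dependent of an Employee).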
7. Attributes
An attribute is a characteristic or property of an entity that helps
describe it. In a database, attributes are represented as columns in a
table.
Types of Attributes
Single-valued vs Multi-valued: A single-valued attribute has only one
value (e.g., Age), whereas a multi-valued attribute has multiple values
(e.g., Contact Numbers).
Simple vs Composite: A simple attribute cannot be broken down (e.g.,
Age), while a composite attribute has multiple parts (e.g., Name = First
Name + Last Name).
Stored vs Derived: A stored attribute is saved in the database (e.g., Date
of Birth), while a derived attribute is calculated (e.g., Age = Current Date
- DOB).
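The stored vs derived distinction can be sketched in a few lines of Python. The function name age_on and the sample dates are made up for illustration; only the date of birth would be stored, and the age is computed when needed.

```python
from datetime import date

def age_on(dob: date, today: date) -> int:
    """Derive age in whole years from a stored date of birth."""
    years = today.year - dob.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years

# Stored attribute: date of birth. Derived attribute: age.
dob = date(2000, 6, 15)
print(age_on(dob, date(2025, 6, 14)))  # 24 (day before the birthday)
print(age_on(dob, date(2025, 6, 15)))  # 25 (on the birthday)
```

Because the age is derived, it never goes stale the way a stored Age column would.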
8. Relationships
One-to-One: Each entity in Table A relates to only one entity in Table B
(e.g., One country has one capital).
One-to-Many: One entity in Table A relates to multiple entities in Table
B (e.g., one department has many employees, but each employee belongs to
one department).
Many-to-Many: Multiple entities in Table A relate to multiple entities in
Table B (e.g., Students enroll in multiple courses, and courses have
multiple students).
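A many-to-many relationship is usually implemented with a third (junction) table. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table names, column names, and sample rows are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses  (course_id  INTEGER PRIMARY KEY, title TEXT);
    -- Junction table: one row per (student, course) pair.
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(student_id),
        course_id  INTEGER REFERENCES courses(course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO students VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO courses  VALUES (10, 'DBMS'), (20, 'Networks');
    INSERT INTO enrollments VALUES (1, 10), (1, 20), (2, 10);
""")

def courses_of(student_name):
    """Titles of all courses one student is enrolled in."""
    rows = conn.execute("""
        SELECT c.title FROM students s
        JOIN enrollments e ON e.student_id = s.student_id
        JOIN courses c     ON c.course_id  = e.course_id
        WHERE s.name = ? ORDER BY c.title
    """, (student_name,)).fetchall()
    return [title for (title,) in rows]

def students_in(course_title):
    """Names of all students enrolled in one course."""
    rows = conn.execute("""
        SELECT s.name FROM courses c
        JOIN enrollments e ON e.course_id  = c.course_id
        JOIN students s    ON s.student_id = e.student_id
        WHERE c.title = ? ORDER BY s.name
    """, (course_title,)).fetchall()
    return [name for (name,) in rows]

print(courses_of("Asha"))   # ['DBMS', 'Networks']
print(students_in("DBMS"))  # ['Asha', 'Ben']
```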
9. Simple ER Diagram
An Entity-Relationship (ER) Diagram visually represents entities,
attributes, and relationships.
Example: A Student (Entity) has Name (Attribute) and is enrolled in a
Course (Entity).
10. Normalization
1NF (First Normal Form): Ensures every column holds atomic values (no
repeating groups or multi-valued attributes).
2NF (Second Normal Form): Ensures full functional dependency (no
partial dependencies on a composite primary key).
3NF (Third Normal Form): Removes transitive dependencies (non-key attributes
should depend only on the key, not on other non-key attributes).
BCNF (Boyce-Codd Normal Form): Ensures that every determinant is a
candidate key (stronger than 3NF).
4NF (Fourth Normal Form): Removes non-trivial multi-valued dependencies, so
that independent multi-valued facts are not stored in the same table.
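As a small illustration of the first step (1NF), the sketch below splits a multi-valued phones field into one row per value. It is plain Python rather than SQL, and the field names and sample records are hypothetical.

```python
# Unnormalized: the "phones" field holds several values in one cell.
unnormalized = [
    {"student_id": 1, "name": "Asha", "phones": "555-0101, 555-0102"},
    {"student_id": 2, "name": "Ben", "phones": "555-0199"},
]

def to_1nf(rows):
    """Split the rows into two tables whose cells are all atomic."""
    students = [(r["student_id"], r["name"]) for r in rows]
    phones = [
        (r["student_id"], phone.strip())
        for r in rows
        for phone in r["phones"].split(",")
    ]
    return students, phones

students, phones = to_1nf(unnormalized)
print(students)  # [(1, 'Asha'), (2, 'Ben')]
print(phones)    # [(1, '555-0101'), (1, '555-0102'), (2, '555-0199')]
```

Each phone number now sits in its own row, keyed by student_id, so it can be searched or updated individually.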
11. Candidate Key, Primary Key, Foreign Key, Super Key
Candidate Key: A minimal set of attributes that uniquely identifies a row
(e.g., Roll No in a Student table).
Primary Key: The candidate key chosen to uniquely identify records (e.g.,
Employee ID).
Foreign Key: An attribute in one table that refers to the primary key of
another table (e.g., Student_ID in the Enrollments table referring to
the Student table).
Super Key: A set of attributes that uniquely identifies a row but may
contain extra attributes (e.g., (Roll No, Name) when Roll No is enough).
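The sketch below shows how a database enforces these keys, using Python's sqlite3 module. The schema and the helper try_sql are illustrative assumptions; note that SQLite checks foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY,      -- primary key
        roll_no    TEXT NOT NULL UNIQUE,     -- another candidate key
        name       TEXT NOT NULL
    );
    CREATE TABLE enrollments (
        enrollment_id INTEGER PRIMARY KEY,
        student_id    INTEGER NOT NULL REFERENCES students(student_id),  -- foreign key
        course        TEXT NOT NULL
    );
    INSERT INTO students VALUES (1, 'R001', 'Asha');
""")

def try_sql(sql):
    """Run one statement; report whether a key constraint rejected it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

print(try_sql("INSERT INTO students VALUES (1, 'R002', 'Ben')"))  # rejected: duplicate primary key
print(try_sql("INSERT INTO students VALUES (2, 'R001', 'Ben')"))  # rejected: duplicate roll_no
print(try_sql("INSERT INTO enrollments VALUES (1, 99, 'DBMS')"))  # rejected: no student 99
print(try_sql("INSERT INTO enrollments VALUES (1, 1, 'DBMS')"))   # ok
```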
12. What is a Transaction?
A transaction is a sequence of database operations performed as a
single unit of work.
Example: A bank transfer includes debit from one account and credit to
another; both must happen together.
13. What is ACID?
Atomicity: Ensures that a transaction is either fully completed or not at
all.
Consistency: Ensures the database moves from one valid state to another, so
that all constraints hold before and after a transaction.
Isolation: Ensures transactions do not interfere with each other.
Durability: Ensures committed transactions are permanently stored,
even after a system failure.
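Atomicity can be demonstrated with the bank transfer example from the previous section. This is a minimal sketch using Python's sqlite3 module; the accounts table, the CHECK constraint, and the function names are assumptions. The credit is applied first on purpose, so that a failed debit shows the rollback undoing it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)  -- no overdrafts
    );
    INSERT INTO accounts VALUES (1, 100), (2, 50);
""")

def transfer(src, dst, amount):
    """Move amount from src to dst; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    """Current balances as {account_id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))

print(transfer(1, 2, 30), balances())   # True {1: 70, 2: 80}
print(transfer(1, 2, 500), balances())  # False {1: 70, 2: 80}
```

In the second call the debit violates the CHECK constraint, so the credit that had already been applied is rolled back with it.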
14. Denormalization
Denormalization is the process of deliberately storing redundant data (for
example, by merging tables) to reduce joins and improve read performance.
Example: Instead of storing customer address separately, it can be
stored in the orders table to avoid joins.
15. DDL, DML, DCL
DDL (Data Definition Language): Commands used to define or modify
database structures (e.g., CREATE, ALTER, DROP).
DML (Data Manipulation Language): Commands used to manipulate
data in tables (e.g., INSERT, UPDATE, DELETE).
DCL (Data Control Language): Commands used to control database
access and permissions (e.g., GRANT, REVOKE).
16. Difference between DROP, TRUNCATE, and DELETE
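Operation   Removes Data?            Removes Structure?   Can be Rolled Back?
DROP        Removes table and data   Yes                  No
TRUNCATE    Removes all data         No                   No
DELETE      Removes selected rows    No                   Yes
Note: Rollback behavior for DROP and TRUNCATE varies by system. Oracle and
MySQL commit them implicitly, while PostgreSQL and SQL Server can roll them
back inside a transaction. DELETE can be rolled back only until the
transaction is committed.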
17. When to use WHERE and HAVING?
WHERE: Used to filter rows before aggregation (e.g., SELECT * FROM
students WHERE age > 18).
HAVING: Used to filter after aggregation (e.g., SELECT dept,
COUNT(*) FROM employees GROUP BY dept HAVING
COUNT(*) > 5).
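The two clauses can also be combined in one query. The sketch below runs such a query with Python's sqlite3 module; the employees table and its rows are sample data invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Asha', 'Sales', 40000), ('Ben', 'Sales', 60000), ('Chen', 'Sales', 70000),
        ('Dev',  'IT',    80000), ('Eli', 'IT',    30000);
""")

# WHERE removes rows with salary <= 50000 before grouping;
# HAVING then removes groups with fewer than 2 remaining rows.
rows = conn.execute("""
    SELECT dept, COUNT(*) FROM employees
    WHERE salary > 50000
    GROUP BY dept
    HAVING COUNT(*) >= 2
    ORDER BY dept
""").fetchall()

print(rows)  # [('Sales', 2)]
```

IT has only one employee above 50000 after the WHERE filter, so HAVING drops that group.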
18. Example for Self Join, Left Join, and Right Join
Self Join: Joining a table with itself.
SELECT A.name, B.name AS manager_name
FROM employees A
JOIN employees B ON A.manager_id = B.emp_id;
Left Join: Returns all records from the left table and the matching records
from the right table; where there is no match, the right-side columns are NULL.
SELECT employees.name, departments.dept_name
FROM employees
LEFT JOIN departments ON employees.dept_id = departments.dept_id;
Right Join: Returns all records from the right table and the matching records
from the left table; where there is no match, the left-side columns are NULL.
SELECT employees.name, departments.dept_name
FROM employees
RIGHT JOIN departments ON employees.dept_id = departments.dept_id;
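The joins above can be tried on a small data set with Python's sqlite3 module. The rows are made-up sample data. Because older SQLite versions do not support RIGHT JOIN, the last query expresses the right join as a LEFT JOIN with the table order swapped, which returns the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employees (
        emp_id INTEGER PRIMARY KEY, name TEXT,
        manager_id INTEGER, dept_id INTEGER
    );
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'IT');
    INSERT INTO employees VALUES
        (1, 'Asha', NULL, 1),    -- Asha has no manager
        (2, 'Ben',  1,    1),
        (3, 'Chen', 1,    NULL); -- Chen has no department
""")

# Self join: pair each employee with the row of their manager.
self_join = conn.execute("""
    SELECT e.name, m.name FROM employees e
    JOIN employees m ON e.manager_id = m.emp_id
    ORDER BY e.emp_id
""").fetchall()

# Left join: every employee, with NULL (None) where no department matches.
left_join = conn.execute("""
    SELECT e.name, d.dept_name FROM employees e
    LEFT JOIN departments d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

# Right join equivalent: every department, with None where no employee matches.
right_join = conn.execute("""
    SELECT e.name, d.dept_name FROM departments d
    LEFT JOIN employees e ON e.dept_id = d.dept_id
    ORDER BY d.dept_id, e.emp_id
""").fetchall()

print(self_join)   # [('Ben', 'Asha'), ('Chen', 'Asha')]
print(left_join)   # [('Asha', 'Sales'), ('Ben', 'Sales'), ('Chen', None)]
print(right_join)  # [('Asha', 'Sales'), ('Ben', 'Sales'), (None, 'IT')]
```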
19. Cartesian Product
The result of joining two tables without any join condition: every row of
one table is paired with every row of the other.
If Table A has 3 rows and Table B has 4 rows, Cartesian Product results in
3 × 4 = 12 rows.
Example: SELECT * FROM students, courses;
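The 3 × 4 = 12 arithmetic can be checked directly. The sketch below uses Python's sqlite3 module with invented sample rows and the explicit CROSS JOIN keyword, which produces the same result as listing the tables with a comma.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (name TEXT);
    CREATE TABLE courses  (title TEXT);
    INSERT INTO students VALUES ('Asha'), ('Ben'), ('Chen');
    INSERT INTO courses  VALUES ('DBMS'), ('Networks'), ('OS'), ('Maths');
""")

# No join condition: every student is paired with every course.
pairs = conn.execute(
    "SELECT s.name, c.title FROM students s CROSS JOIN courses c"
).fetchall()

print(len(pairs))  # 12
```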
20. Correlated Query
A subquery that depends on the outer query for each row processed.
Example: Find employees earning more than the average salary of their
department.
SELECT name, salary
FROM employees E1
WHERE salary > (SELECT AVG(salary)
                FROM employees E2
                WHERE E2.dept_id = E1.dept_id);
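Running the query on sample data makes the per-row evaluation visible. This sketch uses Python's sqlite3 module; the rows are assumptions, chosen so that both departments have an average salary of 60000.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, dept_id INTEGER, salary INTEGER);
    INSERT INTO employees VALUES
        ('Asha', 1, 40000), ('Ben', 1, 60000), ('Chen', 1, 80000),  -- dept 1 average: 60000
        ('Dev',  2, 50000), ('Eli', 2, 70000);                      -- dept 2 average: 60000
""")

# The inner query is re-evaluated for each outer row, using that row's dept_id.
above_avg = conn.execute("""
    SELECT name, salary FROM employees E1
    WHERE salary > (SELECT AVG(salary) FROM employees E2
                    WHERE E2.dept_id = E1.dept_id)
    ORDER BY name
""").fetchall()

print(above_avg)  # [('Chen', 80000), ('Eli', 70000)]
```

Ben earns exactly the department average, so the strict > comparison excludes him.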
21. What is a Stored Procedure?
A precompiled SQL script stored in the database that executes business
logic.
Example:
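-- MySQL syntax. SQL Server instead writes:
--   CREATE PROCEDURE GetAllEmployees AS BEGIN ... END;
DELIMITER //
CREATE PROCEDURE GetAllEmployees()
BEGIN
    SELECT * FROM employees;
END //
DELIMITER ;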
22. What is a Trigger?
A trigger is an automatic database action that executes before or after
an event (INSERT, UPDATE, DELETE).
Example: Log changes to an employee table.
CREATE TRIGGER LogChanges
AFTER UPDATE ON employees
FOR EACH ROW
INSERT INTO log_table (emp_id, action_time)
VALUES (NEW.emp_id, NOW());
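The same idea can be run in SQLite through Python's sqlite3 module. SQLite's trigger syntax differs slightly from the example above: the body is wrapped in BEGIN ... END and datetime('now') replaces NOW(). The table layouts and rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
    CREATE TABLE log_table (emp_id INTEGER, action_time TEXT);

    -- Fires once for every row changed by an UPDATE on employees.
    CREATE TRIGGER log_changes AFTER UPDATE ON employees
    FOR EACH ROW
    BEGIN
        INSERT INTO log_table (emp_id, action_time)
        VALUES (NEW.emp_id, datetime('now'));
    END;

    INSERT INTO employees VALUES (1, 'Asha', 40000), (2, 'Ben', 60000);
""")

# Only the UPDATE fires the trigger; the INSERTs above did not.
conn.execute("UPDATE employees SET salary = salary + 1000 WHERE emp_id = 1")
logged = [emp_id for (emp_id,) in conn.execute("SELECT emp_id FROM log_table")]

print(logged)  # [1]
```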
23. What is a View?
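A virtual table defined by a stored SQL query. The view stores the query,
not its result, so it always reflects the current data in the underlying
tables. Views are often used to simplify complex queries.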
Example:
CREATE VIEW HighSalary AS
SELECT name, salary FROM employees WHERE salary > 50000;
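A short sketch with Python's sqlite3 module shows how the HighSalary view behaves when the base table changes. The sample rows and the helper high_earners are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, salary INTEGER);
    INSERT INTO employees VALUES ('Asha', 40000), ('Ben', 60000);
    CREATE VIEW HighSalary AS
        SELECT name, salary FROM employees WHERE salary > 50000;
""")

def high_earners():
    """Names currently visible through the view."""
    return [name for (name,) in
            conn.execute("SELECT name FROM HighSalary ORDER BY name")]

before = high_earners()
# Change the base table only; the view itself is not touched.
conn.execute("UPDATE employees SET salary = 55000 WHERE name = 'Asha'")
after = high_earners()

print(before)  # ['Ben']
print(after)   # ['Asha', 'Ben']
```

The view's query is re-run against employees each time it is read, so Asha's raise shows up without the view being recreated.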
24. RAID – Levels
RAID (Redundant Array of Independent Disks) combines multiple disks to
improve storage reliability, performance, or both.
Levels:
o RAID 0: Data striping (faster access, no redundancy).
o RAID 1: Mirroring (full redundancy).
o RAID 5: Striping with distributed parity (survives the failure of one disk).
o RAID 10: Mirroring + Striping (high performance and redundancy).
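The parity idea behind RAID 5 can be sketched with XOR in Python. This is a simplified model, not a real RAID implementation: each disk is represented by a short byte string, and the parity is kept in one block rather than rotated across disks as RAID 5 does. All names and data are illustrative.

```python
from functools import reduce

def parity(blocks):
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks, one per (simulated) disk, plus a parity block.
data = [b"ABCD", b"EFGH", b"IJKL"]
p = parity(data)

# Simulate losing the second disk: XOR the survivors with the parity block.
rebuilt = parity([data[0], data[2], p])

print(rebuilt)             # b'EFGH'
print(rebuilt == data[1])  # True
```

XOR-ing the surviving blocks with the parity cancels them out and leaves the missing block, which is why one failed disk can be rebuilt.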
25. What is a Distributed Database?
A database spread across multiple locations or servers, but appearing
as a single database.
Used for load balancing, fault tolerance, and scalability (e.g., Google
Cloud Spanner).
26. Char, Varchar2, Blob – Data Type Uses
Data Type     Usage
CHAR(n)       Fixed-length string (e.g., CHAR(10) always stores 10 characters).
VARCHAR2(n)   Variable-length string (e.g., VARCHAR2(10) stores up to 10 characters, saving space).
BLOB          Stores large binary objects like images, videos, and documents.
27. What is Big Data?
Large, complex datasets that traditional databases cannot handle
efficiently.
Defined by 3Vs:
o Volume (huge data size),
o Velocity (the speed at which data is generated and must be processed),
o Variety (structured, semi-structured, unstructured data).
28. What is MapReduce?
A data processing model in Hadoop that processes large-scale data in
parallel.
Steps:
1. Map: Processes input splits in parallel and emits intermediate key-value
pairs.
2. Reduce: Aggregates the values collected for each key into the final result.
Example: Counting words in a dataset.
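The word count example can be sketched in plain Python. This runs on a single machine and only imitates the model: Hadoop would run the map and reduce functions on many nodes and perform the grouping (shuffle) step itself. The function names and input lines are assumptions.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)

lines = ["big data big ideas", "data moves fast"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())

print(counts)  # {'big': 2, 'data': 2, 'ideas': 1, 'moves': 1, 'fast': 1}
```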
29. What is NoSQL?
A category of non-relational databases designed for flexibility,
scalability, and high-speed operations.
Types:
o Key-Value Stores (e.g., Redis),
o Document Databases (e.g., MongoDB),
o Column-Family Stores (e.g., Cassandra),
o Graph Databases (e.g., Neo4j).
30. What is Hadoop?
An open-source framework for distributed storage and processing of big
data.
Components:
o HDFS (Hadoop Distributed File System) → Stores large data files.
o MapReduce → Processes data in parallel.
o YARN → Manages resources.