Database Query Using SQL
CHAPTER-4
SQL (Structured Query Language) is a standard language for managing and manipulating
databases.
Key features of SQL:
o Retrieve data from tables.
o Insert new data into tables.
o Update or delete existing data.
o Create and modify database structures (tables, indexes, etc.).
Functions in MySQL: Introduction
Functions simplify operations by performing pre-defined tasks.
Single-row functions: Operate on each row independently, e.g., mathematical calculations,
text manipulation, and date/time handling.
Aggregate functions: Summarize data from multiple rows (covered under Multiple-Row Functions).
Single-Row Functions
Mathematical/Numeric Functions
These functions perform mathematical calculations on numeric data.
1. POW(x, y):
o Raises x to the power of y.
o Example: POW(3, 2) → 9 (i.e., 3^2).
2. ROUND(x, d):
o Rounds x to d decimal places.
o Example: ROUND(3.678, 1) → 3.7.
3. TRUNCATE(x, d):
o Truncates x to d decimal places without rounding.
o Example: TRUNCATE(3.678, 1) → 3.6.
4. MOD(x, y):
o Returns the remainder of x divided by y.
o Example: MOD(10, 3) → 1.
5. SQRT(x):
o Returns the square root of x.
o Example: SQRT(25) → 5.
6. ABS(x):
o Returns the absolute value of x.
o Example: ABS(-7) → 7.
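The examples in this list can be checked in a single statement. The following is a minimal sketch, assuming a MySQL client session; no table is needed, and the column aliases are illustrative names only. The last two columns go slightly beyond the list: they use a negative d, which MySQL accepts to round or truncate to the left of the decimal point.

```sql
-- Each alias names the function being demonstrated (aliases are illustrative).
SELECT POW(3, 2)          AS power_result,        -- 9
       ROUND(3.678, 1)    AS rounded,             -- 3.7
       TRUNCATE(3.678, 1) AS truncated,           -- 3.6
       MOD(10, 3)         AS remainder,           -- 1
       SQRT(25)           AS square_root,         -- 5
       ABS(-7)            AS absolute_value,      -- 7
       ROUND(23.298, -1)  AS rounded_tens,        -- 20
       TRUNCATE(122, -2)  AS truncated_hundreds;  -- 100
```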
String (Text) Functions
String functions are used to manipulate text.
1. ASCII(string):
o Returns the ASCII value of the first character.
o Example: ASCII('A') → 65.
2. LOWER(string)/LCASE(string):
o Converts text to lowercase.
o Example: LOWER('HELLO') → hello.
3. UPPER(string)/UCASE(string):
o Converts text to uppercase.
o Example: UPPER('hello') → HELLO.
4. LENGTH(string):
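o Returns the length of the string in bytes. For single-byte text this equals the number of characters; CHAR_LENGTH(string) counts characters regardless of encoding.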
o Example: LENGTH('hello') → 5.
5. REPLACE(string, old, new):
o Replaces all occurrences of old with new.
o Example: REPLACE('hello world', 'world', 'SQL') → hello SQL.
6. LEFT(string, n):
o Extracts the first n characters.
o Example: LEFT('hello', 2) → he.
7. RIGHT(string, n):
o Extracts the last n characters.
o Example: RIGHT('hello', 3) → llo.
8. LTRIM(string):
o Removes leading spaces from the string.
o Example: LTRIM(' hello') → hello.
9. RTRIM(string):
o Removes trailing spaces from the string.
o Example: RTRIM('hello ') → hello.
10. TRIM(string):
o Removes both leading and trailing spaces.
o Example: TRIM(' hello ') → hello.
11. REVERSE(string):
o Reverses the string.
o Example: REVERSE('hello') → olleh.
12. REPEAT(string, n):
o Repeats the string n times.
o Example: REPEAT('Hi', 3) → HiHiHi.
13. SUBSTRING(string, start, length)/MID(string, start, length)/SUBSTR(string, start, length):
o Extracts a substring of length characters beginning at position start (positions are counted from 1).
o Example: SUBSTRING('database', 2, 4) → atab.
14. INSTR(string, substring):
o Returns the position of the first occurrence of substring.
o Example: INSTR('hello world', 'world') → 7.
15. CONCAT(string1, string2, ...):
o Concatenates multiple strings into one.
o Example: CONCAT('hello', ' ', 'world') → hello world.
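String functions can be nested, with the result of one passed as the argument of another. The sketch below is illustrative only; the literals and aliases are assumptions chosen for the demonstration, and SUBSTRING is called in its two-argument form, which returns everything from the given position to the end of the string.

```sql
-- Nesting single-row string functions (aliases are illustrative).
SELECT CONCAT(UPPER(LEFT('hello', 1)), SUBSTRING('hello', 2)) AS capitalised,    -- Hello
       LENGTH(TRIM('  hello  '))                              AS trimmed_length, -- 5
       INSTR('hello world', 'o')                              AS first_o;        -- 5
```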
Date and Time Functions
Date and time functions are used to work with temporal data.
1. CURDATE():
o Returns the current date.
o Example: CURDATE() → 2024-12-07.
2. NOW():
o Returns the current date and time.
o Example: NOW() → 2024-12-07 hh:mm:ss (the current date followed by the current time).
3. SYSDATE():
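o Returns the current system date and time. Similar to NOW(), except that NOW() returns the time at which the statement began, while SYSDATE() returns the time at which the function itself executes.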
4. DATE(expression):
o Extracts the date part from a datetime value.
o Example: DATE('2024-12-07 hh:mm:ss') → 2024-12-07, for any time part hh:mm:ss.
5. MONTH(date):
o Returns the month as a number (1-12).
o Example: MONTH('2024-12-07') → 12.
6. YEAR(date):
o Returns the year part of the date.
o Example: YEAR('2024-12-07') → 2024.
7. DAYNAME(date):
o Returns the name of the day.
o Example: DAYNAME('2024-12-07') → Saturday.
8. DAYOFMONTH(date):
o Returns the day of the month (1-31).
o Example: DAYOFMONTH('2024-12-07') → 7.
9. MONTHNAME(date):
o Returns the name of the month.
o Example: MONTHNAME('2024-12-07') → December.
10. DAYOFWEEK(date):
o Returns the index of the weekday (1 = Sunday).
o Example: DAYOFWEEK('2024-12-07') → 7.
11. DAYOFYEAR(date):
o Returns the day of the year (1-366).
o Example: DAYOFYEAR('2024-12-07') → 342.
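Several of these functions can be combined to build a readable date. This is a sketch rather than a definitive recipe: the time 10:30:00 is an assumed value used only for illustration, and the day and month names assume the server's default English locale.

```sql
-- Date-part functions applied to a fixed literal (time value is assumed).
SELECT DATE('2024-12-07 10:30:00') AS date_part,    -- 2024-12-07
       DAYOFWEEK('2024-12-07')     AS weekday_index, -- 7 (1 = Sunday, so 7 = Saturday)
       CONCAT(DAYNAME('2024-12-07'), ', ',
              DAYOFMONTH('2024-12-07'), ' ',
              MONTHNAME('2024-12-07'), ' ',
              YEAR('2024-12-07'))  AS long_date;     -- Saturday, 7 December 2024
```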
Multiple-Row Functions
Introduction
Multiple-row functions (also called aggregate functions) operate on a group of rows and
return a single value for the entire group.
They are typically used with the GROUP BY clause to summarize data or perform
calculations.
These functions ignore NULL values. The exception is COUNT(*), which counts every row, including rows that contain NULL.
Common Aggregate Functions
1. SUM(expression):
o Calculates the total (sum) of numeric values in a column or expression.
o Example:
SELECT SUM(salary) AS TotalSalary FROM employees;
Output: Total of all salaries in the salary column.
2. AVG(expression):
o Calculates the average of numeric values in a column or expression.
o Example:
SELECT AVG(salary) AS AverageSalary FROM employees;
Output: Average of all salaries.
3. MAX(expression):
o Returns the maximum value from the specified column or expression.
o Example:
SELECT MAX(salary) AS HighestSalary FROM employees;
Output: Highest salary in the salary column.
4. MIN(expression):
o Returns the minimum value from the specified column or expression.
o Example:
SELECT MIN(salary) AS LowestSalary FROM employees;
Output: Lowest salary in the salary column.
5. COUNT(expression):
o Counts the number of non-NULL values in a column or expression.
o Example:
SELECT COUNT(employee_id) AS TotalEmployees FROM employees;
Output: Number of rows in which employee_id is not NULL.
6. COUNT(DISTINCT expression):
o Counts the number of unique, non-NULL values in a column.
o Example:
SELECT COUNT(DISTINCT department_id) AS UniqueDepartments FROM employees;
Output: Number of unique department IDs.
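To see how the aggregate functions treat NULL, it helps to run them on a few rows with known values. The table demo_staff and its data are hypothetical, created only for this sketch; they are not part of the employees table used in the examples.

```sql
-- Hypothetical demo table with one missing salary.
CREATE TABLE demo_staff (
    employee_id   INT,
    department_id INT,
    salary        DECIMAL(10, 2)
);

INSERT INTO demo_staff (employee_id, department_id, salary) VALUES
    (1, 10, 3000.00),
    (2, 10, 5000.00),
    (3, 20, NULL),      -- salary not recorded
    (4, 20, 4000.00);

SELECT COUNT(*)                      AS all_rows,        -- 4 (every row)
       COUNT(salary)                 AS known_salaries,  -- 3 (NULL skipped)
       SUM(salary)                   AS total_salary,    -- 12000.00
       AVG(salary)                   AS average_salary,  -- 4000 (12000 / 3, not 12000 / 4)
       MAX(salary)                   AS highest_salary,  -- 5000.00
       MIN(salary)                   AS lowest_salary,   -- 3000.00
       COUNT(DISTINCT department_id) AS departments      -- 2
FROM demo_staff;
```

Because AVG divides by the number of non-NULL values, a missing salary does not pull the average down.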
Example with GROUP BY
Aggregate functions are often used with the GROUP BY clause to group data based on
specific criteria.
Example:
SELECT department_id, AVG(salary) AS AverageSalary
FROM employees
GROUP BY department_id;
Output: The average salary for each department.
Sorting in SQL: ORDER BY
Introduction
The ORDER BY clause is used to sort the rows in the result set of a query.
Sorting can be done in ascending order (default) or descending order.
Syntax
SELECT column1, column2
FROM table_name
ORDER BY column1 [ASC|DESC], column2 [ASC|DESC];
Key Points
1. Default Sorting:
o Rows are sorted in ascending order (ASC) by default.
o To sort in descending order, use the DESC keyword.
2. Multiple Columns:
o You can sort by multiple columns. Sorting is applied in the order specified.
3. Alias in Sorting:
o You can sort using column aliases defined in the SELECT clause.
Examples
1. Sort by One Column (Ascending):
SELECT name, age
FROM students
ORDER BY age;
Output: Rows sorted by age in ascending order.
2. Sort by One Column (Descending):
SELECT name, age
FROM students
ORDER BY age DESC;
Output: Rows sorted by age in descending order.
3. Sort by Multiple Columns:
SELECT name, age, marks
FROM students
ORDER BY marks DESC, age ASC;
Output: Rows sorted by marks (descending), and for equal marks, by age (ascending).
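Sorting by an alias, mentioned in the key points, is not shown in the examples. A minimal sketch follows, assuming the students table has name and marks columns and, purely for illustration, that marks are out of 500.

```sql
-- ORDER BY may refer to an alias defined in the SELECT list.
-- The maximum of 500 marks is an assumed value.
SELECT name, marks * 100 / 500 AS percentage
FROM students
ORDER BY percentage DESC, name;   -- ties on percentage are sorted by name (ascending)
```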
6. GROUP BY Clause
Introduction
The GROUP BY clause groups rows that have the same values in specified columns into
summary rows.
It is typically used with aggregate functions like SUM(), AVG(), COUNT(), etc.
Syntax
SELECT column1, aggregate_function(column2)
FROM table_name
GROUP BY column1;
Key Points
1. Column in SELECT Clause:
o Any column in the SELECT list that is not inside an aggregate function must be included in the GROUP BY clause.
2. Order of Execution:
o GROUP BY is applied after WHERE and before ORDER BY.
Examples
1. Basic GROUP BY:
SELECT department_id, COUNT(employee_id) AS TotalEmployees
FROM employees
GROUP BY department_id;
Output: Number of employees in each department.
2. Using GROUP BY with Multiple Columns:
SELECT department_id, job_id, AVG(salary) AS AvgSalary
FROM employees
GROUP BY department_id, job_id;
Output: Average salary for each job in each department.
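The order of execution described in the key points can be traced through one query. This is a sketch against the employees table used above; the salary threshold of 3000 is an assumed value chosen for illustration.

```sql
-- Comments are numbered in the order the clauses are applied.
SELECT department_id, COUNT(employee_id) AS TotalEmployees  -- 3. count the rows in each group
FROM employees
WHERE salary > 3000            -- 1. keep only rows with salary above 3000
GROUP BY department_id         -- 2. form one group per department
ORDER BY TotalEmployees DESC;  -- 4. sort the summary rows
```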
6.1 HAVING Clause
The HAVING clause is used to filter the grouped data returned by the GROUP BY clause.
It is similar to the WHERE clause but operates on aggregated/grouped data.
Syntax
SELECT column1, aggregate_function(column2)
FROM table_name
GROUP BY column1
HAVING condition;
Key Points
1. Difference Between WHERE and HAVING:
o WHERE filters rows before grouping.
o HAVING filters groups after grouping.
2. Aggregate Functions in HAVING:
o Conditions on aggregate functions (e.g., SUM, AVG, COUNT) must be written in the HAVING clause.
Aggregate functions cannot be used in the WHERE clause.
Examples
1. Using HAVING to Filter Groups:
SELECT department_id, SUM(salary) AS TotalSalary
FROM employees
GROUP BY department_id
HAVING SUM(salary) > 100000;
Output: Departments where the total salary exceeds 100,000.
2. Combining WHERE and HAVING:
SELECT department_id, AVG(salary) AS AvgSalary
FROM employees
WHERE job_id = 'IT_PROG'
GROUP BY department_id
HAVING AVG(salary) > 5000;
Output: Average salary of IT programmers grouped by department, where the average
exceeds 5,000.
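The difference between WHERE and HAVING is easiest to see by attempting the wrong form. In the sketch below the invalid query is left commented out so that the block still runs; it uses the same employees table as the examples.

```sql
-- Not valid: WHERE is evaluated before any groups exist, so MySQL rejects
-- this with "Invalid use of group function".
-- SELECT department_id, SUM(salary) AS TotalSalary
-- FROM employees
-- WHERE SUM(salary) > 100000
-- GROUP BY department_id;

-- Valid: the condition on the aggregate moves to HAVING.
SELECT department_id, SUM(salary) AS TotalSalary
FROM employees
GROUP BY department_id
HAVING SUM(salary) > 100000;
```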
Aggregate Functions and Conditions on Groups: HAVING Clause
The HAVING clause is used to filter groups after the GROUP BY operation. It is similar to the WHERE
clause, but while WHERE filters rows before grouping, HAVING filters groups after they are formed. The
HAVING clause is often used in conjunction with aggregate functions.
Key Points
WHERE vs HAVING:
o WHERE filters rows before grouping (applies to individual rows).
o HAVING filters the result of GROUP BY (applies to groups).
Using Aggregate Functions in HAVING:
o You can use aggregate functions (like SUM, COUNT, AVG) in the HAVING clause
to apply conditions to groups.
Syntax for HAVING Clause
SELECT column1, aggregate_function(column2)
FROM table_name
WHERE condition
GROUP BY column1
HAVING aggregate_function(column2) condition
ORDER BY column1 [ASC|DESC];
Examples of Conditions on Groups with HAVING Clause
1. Basic Example with HAVING Clause:
o Filter departments where the total salary exceeds 50,000.
SELECT department_id, SUM(salary) AS TotalSalary
FROM employees
GROUP BY department_id
HAVING SUM(salary) > 50000;
Explanation:
o This groups employees by department_id and calculates the total salary for each
department.
o The HAVING clause filters the groups where the total salary is greater than 50,000.
2. Combining WHERE and HAVING:
o Filter employees by job type, and then group by department and find average salary,
filtering only those departments where the average salary is above 6,000.
SELECT department_id, AVG(salary) AS AverageSalary
FROM employees
WHERE job_id = 'SA_REP'
GROUP BY department_id
HAVING AVG(salary) > 6000;
Explanation:
o First, the WHERE clause filters employees who have the job type 'SA_REP'.
o The query then groups the remaining employees by department_id and calculates the average
salary for each department.
o The HAVING clause filters the groups where the average salary is above 6,000.
3. Using Multiple Conditions in HAVING:
o Filter departments where the number of employees is greater than 5 and the highest salary is
greater than 10,000.
SELECT department_id, COUNT(employee_id) AS EmployeeCount, MAX(salary) AS HighestSalary
FROM employees
GROUP BY department_id
HAVING COUNT(employee_id) > 5 AND MAX(salary) > 10000;
Explanation:
o Groups employees by department_id.
o Counts the number of employees and finds the maximum salary for each department.
o The HAVING clause filters the groups where there are more than 5 employees and the
maximum salary is greater than 10,000.
4. Using HAVING with COUNT(DISTINCT):
o Find departments with more than 3 unique job types.
SELECT department_id, COUNT(DISTINCT job_id) AS UniqueJobs
FROM employees
GROUP BY department_id
HAVING COUNT(DISTINCT job_id) > 3;
Explanation:
o This groups employees by department_id.
o It counts the distinct job types (job_id) for each department.
o The HAVING clause filters departments with more than 3 distinct job types.
5. Sorting the Grouped Results with HAVING:
o Find departments where the total salary is greater than 50,000, and then order the results by
total salary in descending order.
SELECT department_id, SUM(salary) AS TotalSalary
FROM employees
GROUP BY department_id
HAVING SUM(salary) > 50000
ORDER BY TotalSalary DESC;
Explanation:
o First, groups employees by department_id and calculates the total salary.
o The HAVING clause filters departments where the total salary is greater than 50,000.
o Finally, the result is sorted by TotalSalary in descending order.
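Running the last query on a handful of rows shows which groups survive the HAVING filter. The table demo_pay and its figures are hypothetical, introduced only for this sketch.

```sql
-- Hypothetical data: three departments with known salary totals.
CREATE TABLE demo_pay (department_id INT, salary INT);

INSERT INTO demo_pay (department_id, salary) VALUES
    (10, 30000), (10, 25000),   -- department 10 totals 55000
    (20, 20000), (20, 15000),   -- department 20 totals 35000
    (30, 60000);                -- department 30 totals 60000

SELECT department_id, SUM(salary) AS TotalSalary
FROM demo_pay
GROUP BY department_id
HAVING SUM(salary) > 50000
ORDER BY TotalSalary DESC;
-- department_id | TotalSalary
-- 30            | 60000
-- 10            | 55000
-- Department 20 (35000) is removed by HAVING.
```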
Operations on Relations in SQL
In relational databases, we often work with relations (tables) that can be manipulated using
set operations.
These operations allow us to combine, compare, and subtract data from tables, similar to
operations in set theory.
1. UNION Operation
Description:
The UNION operation combines the results of two or more SELECT statements and returns distinct rows
from both queries. It removes duplicate rows from the result set.
Syntax:
SELECT column1, column2, ...
FROM table1
UNION
SELECT column1, column2, ...
FROM table2;
Rules for UNION:
o The number of columns and their data types must be the same in both SELECT statements.
o It returns only distinct values by default (no duplicates).
Example:
SELECT employee_id FROM employees_2022
UNION
SELECT employee_id FROM employees_2023;
Explanation:
o This query combines the employee_id from the employees_2022 and employees_2023 tables,
ensuring that the same employee ID is not repeated in the result.
2. INTERSECTION Operation
Description:
The INTERSECTION operation, written with the INTERSECT keyword, returns only the rows that are
present in both SELECT queries. It finds the common elements between two result sets. MySQL
supports INTERSECT from version 8.0.31 onward.
Syntax:
SELECT column1, column2, ...
FROM table1
INTERSECT
SELECT column1, column2, ...
FROM table2;
Rules for INTERSECTION:
o The number of columns and their data types must be the same in both SELECT statements.
o Only rows that appear in both queries will be included in the result.
Example:
SELECT employee_id FROM employees_2022
INTERSECT
SELECT employee_id FROM employees_2023;
Explanation:
o This query returns the employee_id values that exist in both the employees_2022 and
employees_2023 tables.
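On a MySQL server that does not accept INTERSECT (versions before 8.0.31), the same result can usually be obtained with a subquery. The following is a sketch of one common rewrite, not the only one, and it assumes employee_id is never NULL in either table.

```sql
-- Equivalent of the INTERSECT example using IN.
-- DISTINCT mirrors INTERSECT, which returns each common value once.
SELECT DISTINCT employee_id
FROM employees_2022
WHERE employee_id IN (SELECT employee_id FROM employees_2023);
```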
3. SET DIFFERENCE (MINUS) Operation
Description:
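The MINUS (also known as SET DIFFERENCE) operation returns the rows from the first SELECT
statement that are not present in the second SELECT statement. MINUS is the keyword used by
Oracle; standard SQL and MySQL (version 8.0.31 onward) write the same operation as EXCEPT.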
Syntax:
SELECT column1, column2, ...
FROM table1
MINUS
SELECT column1, column2, ...
FROM table2;
Rules for MINUS:
o The number of columns and their data types must be the same in both SELECT statements.
o The result includes only the rows that are in the first set but not in the second.
Example:
SELECT employee_id FROM employees_2022
MINUS
SELECT employee_id FROM employees_2023;
Explanation:
o This query returns the employee_id values that exist in the employees_2022 table but
not in the employees_2023 table.
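In MySQL the set difference is written with EXCEPT rather than MINUS, and on older servers it can be emulated with NOT EXISTS. Both forms below are sketches against the same two tables; the aliases e22 and e23 are illustrative, and the second form assumes employee_id is never NULL.

```sql
-- MySQL 8.0.31 and later: EXCEPT is the standard spelling of MINUS.
SELECT employee_id FROM employees_2022
EXCEPT
SELECT employee_id FROM employees_2023;

-- Earlier versions: keep the 2022 IDs that have no matching row in 2023.
SELECT DISTINCT e22.employee_id
FROM employees_2022 AS e22
WHERE NOT EXISTS (
    SELECT 1
    FROM employees_2023 AS e23
    WHERE e23.employee_id = e22.employee_id
);
```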
Key Differences Between the Operations
Operation | Description | Example
UNION | Combines results from two queries, returns distinct rows. | SELECT employee_id FROM employees_2022 UNION SELECT ...
INTERSECTION | Returns common rows that exist in both queries. | SELECT employee_id FROM employees_2022 INTERSECT SELECT ...
MINUS | Returns rows from the first query not found in the second. | SELECT employee_id FROM employees_2022 MINUS SELECT ...
Additional Points
UNION ALL:
Unlike UNION, UNION ALL does not remove duplicates. It returns all rows, including duplicates,
from both queries.
SELECT employee_id FROM employees_2022
UNION ALL
SELECT employee_id FROM employees_2023;
Performance Considerations:
o UNION may take longer to execute because it eliminates duplicates.
o UNION ALL is generally faster since it does not check for duplicates.
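The effect of duplicate removal is clear on a small data set. The tables demo_2022 and demo_2023 are hypothetical stand-ins for the employees_2022 and employees_2023 tables, with two IDs in common.

```sql
-- Hypothetical data: IDs 2 and 3 appear in both years.
CREATE TABLE demo_2022 (employee_id INT);
CREATE TABLE demo_2023 (employee_id INT);

INSERT INTO demo_2022 (employee_id) VALUES (1), (2), (3);
INSERT INTO demo_2023 (employee_id) VALUES (2), (3), (4);

SELECT employee_id FROM demo_2022
UNION
SELECT employee_id FROM demo_2023;   -- 4 rows: 1, 2, 3, 4 (order not guaranteed)

SELECT employee_id FROM demo_2022
UNION ALL
SELECT employee_id FROM demo_2023;   -- 6 rows: 2 and 3 each appear twice
```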
SQL JOINS
SQL joins are used to combine rows from two or more tables based on a related column. These operations
allow us to retrieve data that spans multiple tables and define relationships between the tables.
1. Cartesian Product (CROSS JOIN)
Description:
The CROSS JOIN produces a Cartesian product of two tables. It returns every combination of rows
from both tables. If table 1 has m rows and table 2 has n rows, the result of the CROSS JOIN will
have m * n rows.
Syntax:
SELECT column1, column2, ...
FROM table1
CROSS JOIN table2;
Example:
SELECT employees.employee_id, departments.department_id
FROM employees
CROSS JOIN departments;
Explanation:
o This query will return every combination of employee_id from the employees table and
department_id from the departments table.
Note: This operation can result in a large result set, especially when working with large
tables.
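The m * n row count can be confirmed on two very small tables. The tables demo_sizes and demo_colors and their contents are hypothetical, created only for this sketch.

```sql
-- Hypothetical data: 3 sizes and 2 colours.
CREATE TABLE demo_sizes  (size_name  VARCHAR(10));
CREATE TABLE demo_colors (color_name VARCHAR(10));

INSERT INTO demo_sizes  (size_name)  VALUES ('S'), ('M'), ('L');
INSERT INTO demo_colors (color_name) VALUES ('Red'), ('Blue');

-- Every size paired with every colour.
SELECT size_name, color_name
FROM demo_sizes
CROSS JOIN demo_colors;

-- Row count of the Cartesian product: 3 * 2.
SELECT COUNT(*) AS combinations
FROM demo_sizes
CROSS JOIN demo_colors;   -- 6
```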
2. Equi Join
Description:
An Equi Join is a type of join that combines rows from two tables based on the equality of a specified
column in both tables. It is the most common type of join, and it is a type of INNER JOIN. It uses the
equality operator (=) to match columns.
Syntax:
SELECT column1, column2, ...
FROM table1
JOIN table2
ON table1.column_name = table2.column_name;
Example:
SELECT employees.employee_id, employees.name, departments.department_id
FROM employees
JOIN departments
ON employees.department_id = departments.department_id;
Explanation:
o This query joins the employees and departments tables on the department_id column, returning
the employee_id, name, and department_id for each employee.
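An equi join is also often written in the older comma style, with the equality condition placed in WHERE. The sketch below assumes the employees table has a name column, as the explanation above indicates; the aliases e and d are illustrative.

```sql
-- Comma-style equi join: list both tables, then match them in WHERE.
SELECT e.employee_id, e.name, d.department_id
FROM employees AS e, departments AS d
WHERE e.department_id = d.department_id;
```

Leaving out the WHERE condition in this form produces the full Cartesian product, which is one reason the explicit JOIN ... ON syntax is generally easier to read and check.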
3. Inner Join
Description:
The INNER JOIN returns only the rows that have matching values in both tables. If there is no match,
the row will not be included in the result set. It is the most commonly used type of join.
Syntax:
SELECT column1, column2, ...
FROM table1
INNER JOIN table2
ON table1.column_name = table2.column_name;
Example:
SELECT employees.employee_id, employees.name, departments.department_name
FROM employees
INNER JOIN departments
ON employees.department_id = departments.department_id;
Explanation:
o This query retrieves the employee ID, name, and department name by matching rows
from the employees and departments tables based on the department_id.
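A join can feed directly into GROUP BY, which ties this section to the aggregate functions covered earlier. This is a sketch using the same employees and departments tables; the aliases e and d are illustrative.

```sql
-- Number of employees per department name.
SELECT d.department_name, COUNT(e.employee_id) AS TotalEmployees
FROM employees AS e
INNER JOIN departments AS d
    ON e.department_id = d.department_id
GROUP BY d.department_name;
```

Because this is an inner join, a department with no employees produces no matching rows and therefore does not appear in the result at all.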
4. Right Outer Join (RIGHT JOIN)
Description:
A RIGHT OUTER JOIN (or simply RIGHT JOIN) returns all the rows from the right table and the
matching rows from the left table. If there is no match, the result will contain NULL for columns of
the left table.
Syntax:
SELECT column1, column2, ...
FROM table1
RIGHT JOIN table2
ON table1.column_name = table2.column_name;
Example:
SELECT employees.employee_id, employees.name, departments.department_name
FROM employees
RIGHT JOIN departments
ON employees.department_id = departments.department_id;
Explanation:
o This query retrieves the employee ID, name, and department name. Every department appears in the
result; if a department has no employees, the employee_id and name will be NULL for that row.
Note:
o If the RIGHT JOIN is used with no matching rows in the left table, it still includes the rows
from the right table with NULL values for the left table’s columns.
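The NULL-filled rows are easiest to recognise on sample data. The tables demo_departments and demo_employees and their contents are hypothetical, created only for this sketch; the Legal department deliberately has no employees.

```sql
-- Hypothetical data: three departments, two employees.
CREATE TABLE demo_departments (department_id INT, department_name VARCHAR(20));
CREATE TABLE demo_employees   (employee_id INT, name VARCHAR(20), department_id INT);

INSERT INTO demo_departments (department_id, department_name) VALUES
    (10, 'Sales'), (20, 'IT'), (30, 'Legal');
INSERT INTO demo_employees (employee_id, name, department_id) VALUES
    (1, 'Asha', 10), (2, 'Ravi', 20);

SELECT e.employee_id, e.name, d.department_name
FROM demo_employees AS e
RIGHT JOIN demo_departments AS d
    ON e.department_id = d.department_id;
-- employee_id | name | department_name   (row order not guaranteed)
-- 1           | Asha | Sales
-- 2           | Ravi | IT
-- NULL        | NULL | Legal             <- department with no employees
```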
5. Natural Join
Description:
A NATURAL JOIN automatically joins tables based on columns with the same name and data type
in both tables. It eliminates the need to explicitly specify the join condition (i.e., the ON clause) for
matching columns.
Syntax:
SELECT column1, column2, ...
FROM table1
NATURAL JOIN table2;
Example:
SELECT employee_id, department_name
FROM employees
NATURAL JOIN departments;
Explanation:
o This query assumes that both the employees and departments tables have a column with the
same name (e.g., department_id). The NATURAL JOIN will automatically match rows
where the department_id column is equal in both tables.
Note:
o It’s important to ensure that the columns with the same name in both tables are meant to be
matched. The NATURAL JOIN can sometimes lead to unexpected results if there are multiple
columns with the same name but different meanings.
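When only one shared column should be matched, the USING clause names it explicitly and avoids the accidental matches described in the note. This is a sketch of that alternative, assuming department_id is the intended join column in both tables.

```sql
-- Same idea as the NATURAL JOIN example, but the join column is stated.
SELECT employee_id, department_name
FROM employees
JOIN departments USING (department_id);
```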
Key Differences Between Joins
INNER JOIN returns only matching rows from both tables.
CROSS JOIN returns the Cartesian product of the tables (every combination of rows).
RIGHT JOIN includes all rows from the right table and matched rows from the left.
NATURAL JOIN automatically joins tables based on common column names.