SQL
• Non-Relational:
• A non-relational database, also known as a NoSQL database, is a type of database that stores
and manages data in a flexible, non-tabular format, unlike the structured tables used in relational
databases. Non-relational databases are designed to handle unstructured or semi-structured data,
allowing for a more flexible schema or even schema-less data storage, making them ideal for
applications where data formats vary, change frequently, or don't fit well in a rigid table structure.
• Example: Storing product information
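A minimal sketch of the product example, assuming two hypothetical product records with different fields. Plain Python dictionaries serialized as JSON stand in for a document store; no real NoSQL database is involved.

```python
import json

# Hypothetical product documents: each record carries only the fields it needs,
# so no shared table schema is required.
products = [
    {"sku": "P-100", "name": "Laptop", "specs": {"ram_gb": 16, "cpu": "8-core"}},
    {"sku": "P-200", "name": "T-shirt", "sizes": ["S", "M", "L"], "color": "blue"},
]

# Serialize each record to JSON, the way a document store would typically hold it.
stored = [json.dumps(p) for p in products]

# Fields differ per document, so read optional ones defensively.
loaded = [json.loads(s) for s in stored]
sizes_per_product = {p["sku"]: p.get("sizes", []) for p in loaded}
print(sizes_per_product)
```

In a relational table, the laptop's "specs" and the T-shirt's "sizes" would each need their own columns or child tables.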
BASIC SQL OPERATIONS
• DDL (Data Definition Language)
• DDL commands define and modify the structure of database objects rather than the data stored
in them. They create, alter, and delete objects such as tables, views, indexes, and users.
• Examples of DDL statements include CREATE, ALTER, DROP and TRUNCATE.
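A minimal sketch of CREATE, ALTER, and DROP, run through Python's built-in sqlite3 module so it is self-contained. The Patients table and its columns are assumptions made for illustration. SQLite has no TRUNCATE statement, so that command is left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE: define a new table (hypothetical schema).
conn.execute("CREATE TABLE Patients (patient_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# ALTER: change the structure of the existing table.
conn.execute("ALTER TABLE Patients ADD COLUMN phone TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(Patients)")]
print(columns)

# DROP: remove the table definition together with its data.
conn.execute("DROP TABLE Patients")
remaining = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
print(remaining)
```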
• FROM: The table(s) the query operates on must be identified before anything else can happen.
So, the FROM clause (together with any JOINs) is evaluated first in an SQL query.
• WHERE: Once the FROM and JOIN clauses have produced the working set of rows, the WHERE
clause is evaluated. It filters those rows by the given conditions and discards the rows
that don't satisfy them, reducing the data that the later clauses need to process.
• GROUP BY: If the query has a GROUP BY clause, it is executed next. Rows that share the
same value in the column(s) named in GROUP BY are collected into one group, so the result
has one row per distinct value (or combination of values). Aggregate functions are then
calculated per group.
SQL ORDER OF EXECUTION
• HAVING: If the query has a GROUP BY clause, the HAVING clause is evaluated immediately
after it. HAVING is optional. Like WHERE, it is a filter, but it acts on the groups
produced by GROUP BY rather than on individual rows: groups that don't satisfy the
conditions are discarded, reducing the data passed to the remaining clauses.
• SELECT: SELECT is executed after GROUP BY and HAVING. It computes the expressions
(arithmetic, aggregate, etc.) and aliases given in the SELECT clause. By this point the
earlier clauses have already filtered and grouped the data, so the computation runs on
the reduced dataset.
• DISTINCT: DISTINCT is executed after the expressions and aliases of the previous step
have been evaluated. It removes duplicate rows and returns only unique rows.
• ORDER BY: After all the clauses above have run, the data to be
displayed or processed is known. ORDER BY then sorts it by the given
column(s) in ascending or descending order. Sort keys are applied left
to right: rows are sorted by the first specified column, ties are
broken by the second, and so on.
• LIMIT/OFFSET: Finally, once the rows are ordered, LIMIT and OFFSET are
evaluated so that only the rows within the requested range are returned.
Because this happens last in the logical order, LIMIT alone does not
necessarily reduce the work done by the earlier clauses. Where possible,
filter rows early with WHERE instead of computing many rows and then
discarding most of them.
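The logical order described above can be traced in a single query. This is a minimal sketch, assuming a hypothetical Appointments table with doctor_id, fee, and status columns, and run through Python's sqlite3 module. The numbered SQL comments mark the logical evaluation order, not the order in which the clauses are written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Appointments (doctor_id INTEGER, fee INTEGER, status TEXT);
    INSERT INTO Appointments VALUES
        (1, 100, 'completed'), (1, 150, 'completed'),
        (2,  80, 'completed'), (2,  50, 'cancelled'),
        (3, 200, 'completed'), (3, 300, 'cancelled'),
        (4, 120, 'completed'), (4,  10, 'completed');
""")

query = """
    SELECT doctor_id, SUM(fee) AS total_fee   -- 5. compute expressions and aliases
    FROM Appointments                         -- 1. identify the source rows
    WHERE status = 'completed'                -- 2. discard rows that fail the filter
    GROUP BY doctor_id                        -- 3. form one group per doctor
    HAVING SUM(fee) > 100                     -- 4. discard groups that fail the filter
    ORDER BY total_fee DESC                   -- 6. sort what is left
    LIMIT 2                                   -- 7. keep only the first two rows
"""
top_doctors = conn.execute(query).fetchall()
print(top_doctors)
```

Cancelled appointments are removed by WHERE before any sums are computed, which is why they never contribute to total_fee.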
EXAMPLE: SQL QUERY EXAMPLE FOR ORDER OF EXECUTION
• Scenario: Find all appointments where the notes field is not provided
(i.e., NULL).
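A minimal sketch of this scenario, assuming a hypothetical Appointments table with a nullable notes column. It also shows why the test must be written as IS NULL: a comparison with = NULL never evaluates to true, so it matches no rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Appointments (appointment_id INTEGER, patient_id INTEGER, notes TEXT)")
conn.executemany(
    "INSERT INTO Appointments VALUES (?, ?, ?)",
    [(1, 10, "Follow-up"), (2, 11, None), (3, 12, ""), (4, 13, None)],
)

# IS NULL finds rows where notes was never provided.
missing_notes = conn.execute(
    "SELECT appointment_id FROM Appointments WHERE notes IS NULL ORDER BY appointment_id"
).fetchall()

# '= NULL' is never true, so this returns nothing.
wrong_test = conn.execute(
    "SELECT appointment_id FROM Appointments WHERE notes = NULL"
).fetchall()

print(missing_notes, wrong_test)
```

Appointment 3 has an empty string, which is a value and therefore not NULL.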
AGGREGATE FUNCTIONS AND GROUPING
• Aggregate functions perform a calculation over a set of rows and return a single value:
COUNT, SUM, AVG, MIN, MAX
• COUNT: Counts the number of rows or non-NULL values.
• SUM: Calculates the total of numeric values.
• AVG: Returns the average of numeric values.
• MIN: Finds the smallest value.
• MAX: Finds the largest value.
• GROUP BY clause
• GROUP BY organizes rows into groups based on a column’s value.
• Used with aggregate functions to perform calculations for each group separately
• HAVING clause
• HAVING filters groups created by GROUP BY, based on aggregate values.
• Similar to WHERE, but works on grouped (aggregated) data.
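The five aggregates, GROUP BY, and HAVING can be seen together in one query. This is a minimal sketch, assuming a hypothetical Appointments table in which one fee is NULL, so the difference between COUNT(*) and COUNT(fee) is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Appointments (doctor_id INTEGER, fee INTEGER)")
conn.executemany(
    "INSERT INTO Appointments VALUES (?, ?)",
    [(1, 100), (1, 200), (1, None), (2, 50)],
)

per_doctor = conn.execute("""
    SELECT doctor_id,
           COUNT(*)   AS appointments,    -- counts rows
           COUNT(fee) AS fees_recorded,   -- counts non-NULL fees only
           SUM(fee), AVG(fee), MIN(fee), MAX(fee)
    FROM Appointments
    GROUP BY doctor_id                    -- one result row per doctor
    HAVING COUNT(*) > 1                   -- keep doctors with several appointments
    ORDER BY doctor_id
""").fetchall()
print(per_doctor)
```

SUM, AVG, MIN, and MAX skip the NULL fee, so the average for doctor 1 is taken over two values, not three.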
BUSINESS SCENARIOS
• SSMS Examples
WHEN TO USE THESE CLAUSES:
2. GROUP BY Clause:
   • To break data into categories or groups (e.g., by doctor, department, or patient
     demographics).
3. HAVING Clause:
   • To filter results of grouped data based on aggregate values, such as identifying
     departments with high revenue or doctors with many appointments.
JOINING THE TABLES
• INNER JOIN
• Definition: An INNER JOIN retrieves rows that have matching values in both tables based on the
specified condition.
• Scenario : The hospital wants a list of patients who have scheduled appointments, including their
names, appointment dates, and fees.
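A minimal sketch of this scenario, assuming hypothetical Patients and Appointments tables linked by patient_id. The patient without an appointment does not appear in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patients (patient_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Appointments (appointment_id INTEGER PRIMARY KEY, patient_id INTEGER,
                               appointment_date TEXT, fee INTEGER);
    INSERT INTO Patients VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    INSERT INTO Appointments VALUES
        (101, 1, '2024-01-10', 500),
        (102, 1, '2024-02-05', 300),
        (103, 3, '2024-01-12', 450);
""")

scheduled = conn.execute("""
    SELECT p.name, a.appointment_date, a.fee
    FROM Patients p
    INNER JOIN Appointments a ON a.patient_id = p.patient_id
    ORDER BY a.appointment_id
""").fetchall()
print(scheduled)
```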
• FULL OUTER JOIN: A FULL OUTER JOIN retrieves all rows from both tables, matching rows
where possible. Unmatched rows from either table will have NULL values for missing columns.
• Scenario: The hospital wants a complete list of all patients and all appointments, showing
matches where possible.
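A sketch of this scenario under stated assumptions. SQL Server accepts FULL OUTER JOIN directly, but older SQLite builds do not, so this example swaps in an equivalent emulation: a LEFT JOIN for all patients, plus the appointments that match no patient. The tables and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patients (patient_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Appointments (appointment_id INTEGER PRIMARY KEY, patient_id INTEGER);
    INSERT INTO Patients VALUES (1, 'Asha'), (2, 'Ben');
    -- Appointment 102 points at a patient_id that has no Patients row.
    INSERT INTO Appointments VALUES (101, 1), (102, 9);
""")

full_list = conn.execute("""
    SELECT p.name, a.appointment_id              -- every patient, matched or not
    FROM Patients p
    LEFT JOIN Appointments a ON a.patient_id = p.patient_id
    UNION ALL
    SELECT NULL, a.appointment_id                -- appointments with no patient
    FROM Appointments a
    WHERE NOT EXISTS (SELECT 1 FROM Patients p WHERE p.patient_id = a.patient_id)
""").fetchall()
print(full_list)
```

Unmatched rows from either side carry NULL (None in Python) in the columns of the other table.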
• SELF JOIN: A SELF-JOIN is when a table is joined to itself. Often used to compare rows within the
same table.
• Scenario: The hospital wants to know which staff members report to whom.
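A minimal sketch of the reporting scenario, assuming a hypothetical Staff table where manager_id refers back to staff_id in the same table. The table is joined to itself under two aliases, one for the employee and one for the manager.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (staff_id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO Staff VALUES
        (1, 'Dr. Rao', NULL),
        (2, 'Nina', 1),
        (3, 'Omar', 1),
        (4, 'Pia', 2);
""")

reports_to = conn.execute("""
    SELECT e.name AS employee, m.name AS manager
    FROM Staff e
    JOIN Staff m ON e.manager_id = m.staff_id
    ORDER BY e.staff_id
""").fetchall()
print(reports_to)
```

Dr. Rao has no manager and is dropped by the inner join; a LEFT JOIN would keep that row with a NULL manager.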
• Cross Join: A CROSS JOIN produces a Cartesian product of two tables, pairing every row from the
first table with every row from the second.
• Scenario : The hospital wants to see all possible combinations of doctors and departments to
explore possible rotations.
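A minimal sketch of the rotation scenario, assuming hypothetical Doctors and Departments tables. With 2 doctors and 3 departments the Cartesian product has 2 x 3 = 6 rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Doctors (name TEXT);
    CREATE TABLE Departments (name TEXT);
    INSERT INTO Doctors VALUES ('Dr. Lee'), ('Dr. Kim');
    INSERT INTO Departments VALUES ('Cardiology'), ('Oncology'), ('Neurology');
""")

rotations = conn.execute("""
    SELECT d.name, dept.name
    FROM Doctors d
    CROSS JOIN Departments dept        -- no ON condition: every pair is produced
    ORDER BY d.name, dept.name
""").fetchall()
print(len(rotations), rotations[0])
```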
JOINS SUMMARY
• INNER JOIN: Match rows in both tables. Only include rows with matches.
• LEFT JOIN: Include all rows from the left table. Include unmatched rows from the left table.
• RIGHT JOIN: Include all rows from the right table. Include unmatched rows from the right table.
• Scenario : Find the department of the doctor with the highest experience.
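This scenario can be answered with a subquery that returns a single value, which the outer query then compares against. A minimal sketch, assuming a hypothetical Doctors table with an experience_years column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Doctors (doctor_id INTEGER PRIMARY KEY, name TEXT,
                          department TEXT, experience_years INTEGER);
    INSERT INTO Doctors VALUES
        (1, 'Dr. Lee', 'Cardiology', 12),
        (2, 'Dr. Kim', 'Oncology', 20),
        (3, 'Dr. Roy', 'Neurology', 7);
""")

most_experienced_dept = conn.execute("""
    SELECT department
    FROM Doctors
    WHERE experience_years = (SELECT MAX(experience_years) FROM Doctors)
""").fetchall()
print(most_experienced_dept)
```

If two doctors tied for the highest experience, the outer query would return both departments.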
• Multiple-Row Subqueries
• A multiple-row subquery returns one or more rows to the outer SQL statement. Use the IN, ANY, or ALL operator
in the outer query to handle a subquery that returns multiple rows.
• Scenario: List the names of patients who have scheduled appointments with doctors in the "Oncology" department.
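A minimal sketch of the Oncology scenario using nested IN subqueries. The Patients, Doctors, and Appointments tables and their rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patients (patient_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Doctors (doctor_id INTEGER PRIMARY KEY, name TEXT, department TEXT);
    CREATE TABLE Appointments (patient_id INTEGER, doctor_id INTEGER);
    INSERT INTO Patients VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    INSERT INTO Doctors VALUES
        (1, 'Dr. Kim', 'Oncology'), (2, 'Dr. Lee', 'Cardiology'), (3, 'Dr. Sen', 'Oncology');
    INSERT INTO Appointments VALUES (1, 1), (2, 2), (3, 3), (1, 3);
""")

oncology_patients = conn.execute("""
    SELECT name
    FROM Patients
    WHERE patient_id IN (
        SELECT patient_id                 -- may return many rows
        FROM Appointments
        WHERE doctor_id IN (
            SELECT doctor_id FROM Doctors WHERE department = 'Oncology'
        )
    )
    ORDER BY name
""").fetchall()
print(oncology_patients)
```

Asha has two Oncology appointments but is listed once, because IN only tests membership.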
SUBQUERIES AND NESTED SUBQUERIES
• Correlated Subqueries
• In a correlated subquery, the inner query refers to columns from the outer
query. This subquery runs once for each row of the outer query.
• Scenario : Find doctors who have handled more than one appointment.
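A minimal sketch of this scenario, assuming hypothetical Doctors and Appointments tables. The inner query refers to d.doctor_id from the outer query, so it is evaluated for each doctor row in turn.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Doctors (doctor_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Appointments (appointment_id INTEGER PRIMARY KEY, doctor_id INTEGER);
    INSERT INTO Doctors VALUES (1, 'Dr. Kim'), (2, 'Dr. Lee');
    INSERT INTO Appointments VALUES (101, 1), (102, 1), (103, 2);
""")

busy_doctors = conn.execute("""
    SELECT d.name
    FROM Doctors d
    WHERE (
        SELECT COUNT(*)
        FROM Appointments a
        WHERE a.doctor_id = d.doctor_id   -- correlation with the outer row
    ) > 1
""").fetchall()
print(busy_doctors)
```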
• 2. INTERSECT : INTERSECT retrieves only the rows that are common between two queries.
• Scenario: Identify names that are common between the Staff and Doctors tables (e.g., a staff
member who is also a doctor).
SET OPERATIONS
• 3. EXCEPT: EXCEPT retrieves rows from the first query that are not present in the second query.
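INTERSECT and EXCEPT can be compared on the same pair of queries. This is a minimal sketch, assuming hypothetical Staff and Doctors tables that each have a name column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (name TEXT);
    CREATE TABLE Doctors (name TEXT);
    INSERT INTO Staff VALUES ('Asha'), ('Ben'), ('Chen');
    INSERT INTO Doctors VALUES ('Ben'), ('Chen'), ('Dev');
""")

# Names present in both tables.
in_both = conn.execute(
    "SELECT name FROM Staff INTERSECT SELECT name FROM Doctors ORDER BY name"
).fetchall()

# Names in Staff that do not appear in Doctors.
staff_only = conn.execute(
    "SELECT name FROM Staff EXCEPT SELECT name FROM Doctors ORDER BY name"
).fetchall()

print(in_both, staff_only)
```

EXCEPT is not symmetric: swapping the two queries would return 'Dev' instead of 'Asha'.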
• You want to know the total fees collected by each doctor for their
appointments while keeping row-level data.
AGGREGATED WINDOW FUNCTIONS
• AVG
• The AVG window function calculates the average of a column across rows in
a group while keeping the individual rows intact.
• Scenario: You want to find the average fee collected by each doctor for
their appointments.
• COUNT
• The COUNT window function calculates the number of rows in a group or
partition while maintaining row-level data.
• Scenario: You want to know how many appointments each doctor has.
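The total, average, and count scenarios can be combined in one query, since each aggregate window function adds a per-doctor value to every row without collapsing the rows. A minimal sketch, assuming a hypothetical Appointments table; window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Appointments (appointment_id INTEGER PRIMARY KEY, doctor_id INTEGER, fee INTEGER);
    INSERT INTO Appointments VALUES (1, 1, 100), (2, 1, 300), (3, 2, 50);
""")

with_totals = conn.execute("""
    SELECT appointment_id, doctor_id, fee,
           SUM(fee) OVER (PARTITION BY doctor_id) AS total_fee,
           AVG(fee) OVER (PARTITION BY doctor_id) AS avg_fee,
           COUNT(*) OVER (PARTITION BY doctor_id) AS appointment_count
    FROM Appointments
    ORDER BY appointment_id
""").fetchall()
print(with_totals)
```

Unlike GROUP BY, all three appointment rows are still present in the result.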
WINDOW FUNCTIONS: ANALYTIC FUNCTIONS
• Analytic functions provide advanced insights by analyzing and
comparing data within partitions or rows. These include LEAD, LAG,
FIRST_VALUE, LAST_VALUE, and NTILE. They help in scenarios like trend
analysis, identifying first or last records, or dividing data into equal
parts.
• LEAD Function: The LEAD function retrieves the value of a column from
the next row in the same partition. It's used to compare current rows
with future rows.
• Scenario: You want to know the fee of the next appointment for each
doctor
• LAG Function : The LAG function retrieves the value of a column from
the previous row in the same partition. It is often used to compare
current rows with past rows.
• Scenario: You want to know the fee of the previous appointment for
each doctor.
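A minimal sketch of the LEAD and LAG scenarios side by side, assuming a hypothetical Appointments table ordered by appointment_date within each doctor. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Appointments (doctor_id INTEGER, appointment_date TEXT, fee INTEGER);
    INSERT INTO Appointments VALUES
        (1, '2024-01-01', 100), (1, '2024-01-05', 150),
        (1, '2024-01-09', 120), (2, '2024-01-02', 80);
""")

neighbours = conn.execute("""
    SELECT doctor_id, appointment_date, fee,
           LAG(fee)  OVER (PARTITION BY doctor_id ORDER BY appointment_date) AS prev_fee,
           LEAD(fee) OVER (PARTITION BY doctor_id ORDER BY appointment_date) AS next_fee
    FROM Appointments
    ORDER BY doctor_id, appointment_date
""").fetchall()
for row in neighbours:
    print(row)
```

The first row of each partition has no previous row and the last has no next row, so those positions hold NULL (None in Python).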
• FIRST_VALUE Function
• The FIRST_VALUE function retrieves the first value of a column in the
window or partition.
• Scenario : You want to know the fee of the first appointment for each
doctor.
• NTILE Function
• The NTILE function divides rows in a partition into a specified number of
groups and assigns a rank (bucket number) to each row.
• Scenario: You want to divide appointments for each doctor into two equal
groups based on their fees.
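A minimal sketch of the FIRST_VALUE and NTILE scenarios, assuming a hypothetical Appointments table. FIRST_VALUE is ordered by appointment_date, while NTILE(2) is ordered by fee, so the two functions use different windows. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Appointments (doctor_id INTEGER, appointment_date TEXT, fee INTEGER);
    INSERT INTO Appointments VALUES
        (1, '2024-01-01', 100), (1, '2024-01-05', 150),
        (1, '2024-01-09', 120), (1, '2024-01-12', 90),
        (2, '2024-01-02', 80);
""")

buckets = conn.execute("""
    SELECT doctor_id, appointment_date, fee,
           FIRST_VALUE(fee) OVER (PARTITION BY doctor_id ORDER BY appointment_date) AS first_fee,
           NTILE(2)         OVER (PARTITION BY doctor_id ORDER BY fee)              AS fee_bucket
    FROM Appointments
    ORDER BY doctor_id, appointment_date
""").fetchall()
for row in buckets:
    print(row)
```

For doctor 1 the fees 90 and 100 fall into bucket 1 and the fees 120 and 150 into bucket 2.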
• The frame clause specifies a subset of rows relative to the current row within a
window. It is used in window functions to define which rows are included in the
computation. The two primary clauses are:
• RANGE BETWEEN: Operates on the logical value of rows (e.g., based on ORDER BY values).
• ROWS BETWEEN: Operates on the physical row positions within the result set.
• RANGE BETWEEN
• Defines a range of rows based on values in the ORDER BY clause. It is used for scenarios
like cumulative sums or averages over a specific value range.
• Scenario: You want to calculate the cumulative sum of fees for each doctor up to the
current appointment
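A minimal sketch of the cumulative-sum scenario, assuming a hypothetical Appointments table in which two appointments share a date. It puts RANGE and ROWS side by side: RANGE treats rows with the same ORDER BY value as peers and includes them all, while ROWS counts physical rows one at a time. The ROWS window adds appointment_id as a tie-breaker so its result is deterministic. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Appointments (appointment_id INTEGER PRIMARY KEY, doctor_id INTEGER,
                               appointment_date TEXT, fee INTEGER);
    INSERT INTO Appointments VALUES
        (1, 1, '2024-01-01', 100),
        (2, 1, '2024-01-05', 150),
        (3, 1, '2024-01-05',  50),
        (4, 1, '2024-01-09', 120);
""")

running = conn.execute("""
    SELECT appointment_id, fee,
           SUM(fee) OVER (PARTITION BY doctor_id ORDER BY appointment_date
                          RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_total,
           SUM(fee) OVER (PARTITION BY doctor_id ORDER BY appointment_date, appointment_id
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)  AS rows_total
    FROM Appointments
    ORDER BY appointment_id
""").fetchall()
for row in running:
    print(row)
```

Appointments 2 and 3 share a date, so RANGE gives both the same running total (300), while ROWS gives 250 and then 300.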
BEST EXAMPLE FOR WINDOW FUNCTIONS
• https://www.sqlshack.com/use-window-functions-sql-server/
• https://www.sqlshack.com/overview-of-mysql-window-functions/
• https://datalemur.com/sql-tutorial/sql-union-intercept-except