SQL Interview Questions
Here are the answers for the SQL and PySpark join operations based on the provided tables
and the specified considerations:
Table Structure:
Here’s a complete summary of the results for all join types assuming case insensitivity in
MS SQL. This includes the correct row counts and reasoning for each join type.
When considering case sensitivity in PySpark (the default behavior), the output for each
join type changes since "A" and "a" are treated as different values, and "c" and "C" are also
different. Below is a detailed explanation for each join type, assuming case sensitivity:
b
C
NULL
NULL
Explanation:
o "b" has no match.
o "C" has no match because "c" is different.
o The two NULL values from Table 2 have no match with the NULL value in Table
1 (PySpark does not consider NULL values as equal by default).
• Number of Rows: 4 rows
1. Full Name Display:
o How can you display the full name of each employee by combining their first and last names?
o How could you extract the last three characters of a job title?
o How can you pull a department name starting from the second character?
4. Case Conversion:
o How can you convert first names to uppercase for consistency?
o How would you convert last names to lowercase, perhaps for generating email addresses?
5. Text Replacement:
o How would you update all occurrences of the word "Manager" to "Lead" in job titles?
o How could you replace "IT" with "Technology" in department names?
o How would you replace spaces with hyphens in full names for URL-friendly formatting?
o How can you break down the hire date into year, month, and day components?
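The string and date questions above can be tried out with a small runnable sketch. It uses Python's built-in sqlite3 module rather than SQL Server, so the function names are SQLite's; the usual T-SQL counterparts are CONCAT, RIGHT, SUBSTRING, UPPER, LOWER, REPLACE, and YEAR/MONTH/DAY. The employees table, its column names, and the sample row are assumptions made for illustration, since the questions do not show the real schema.

```python
import sqlite3


def string_function_demo():
    """Apply each string/date operation from the questions to one sample row."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    # Hypothetical table and columns: the original schema is not shown.
    conn.execute(
        "CREATE TABLE employees ("
        "first_name TEXT, last_name TEXT, job_title TEXT, "
        "department TEXT, hire_date TEXT)"
    )
    conn.execute(
        "INSERT INTO employees VALUES (?, ?, ?, ?, ?)",
        ("Ada", "Lovelace", "Senior Manager", "IT", "2020-03-15"),
    )
    row = conn.execute(
        """
        SELECT
            first_name || ' ' || last_name                    AS full_name,
            substr(job_title, -3)                             AS job_last3,
            substr(department, 2)                             AS dept_from_2nd,
            upper(first_name)                                 AS first_upper,
            lower(last_name)                                  AS last_lower,
            replace(job_title, 'Manager', 'Lead')             AS job_replaced,
            replace(department, 'IT', 'Technology')           AS dept_replaced,
            replace(first_name || ' ' || last_name, ' ', '-') AS url_name,
            CAST(strftime('%Y', hire_date) AS INTEGER)        AS hire_year,
            CAST(strftime('%m', hire_date) AS INTEGER)        AS hire_month,
            CAST(strftime('%d', hire_date) AS INTEGER)        AS hire_day
        FROM employees
        """
    ).fetchone()
    result = dict(row)  # copy the values out before closing the connection
    conn.close()
    return result


if __name__ == "__main__":
    for column, value in string_function_demo().items():
        print(f"{column}: {value}")
```

Each output column corresponds to one question, so the query can be trimmed to whichever operation is being practised.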
o You have two tables, table1 and table2, representing data collected from
two different sources. You want to combine this data to analyse how the
entries from these sources relate to each other.
o The goal is to understand various types of joins and how they impact the
data returned from the tables.
table1    table2
col1      col2
1         1
2         2
3         3
1         Null
1         4
Null
2
Null
Inner Join
Returns only the rows with matching values in both tables.
SELECT t1.col1, t2.col2
FROM table1 t1
INNER JOIN table2 t2 ON t1.col1 = t2.col2;
Result:
col1 col2
1 1
2 2
3 3
1 1
1 1
2 2
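Left Join
Returns all rows from the left table (table1), and the matched rows from the right table (table2). If there is no match, NULL values are returned for columns from the right table.
Result: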
col1 col2
1 1
2 2
3 3
1 1
1 1
Null Null
2 2
Null Null
Right Join
Returns all rows from the right table (table2), and the matched rows from
the left table (table1). If there is no match, NULL values are returned for
columns from the left table.
Result:
col1 col2
1 1
1 1
1 1
2 2
2 2
3 3
Null Null
Null 4
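The row counts for each join type can be checked with a short sketch. It loads table1 (1, 2, 3, 1, 1, Null, 2, Null) and table2 (1, 2, 3, Null, 4) into an in-memory SQLite database through Python's sqlite3 module, which is an assumption made here for convenience; the same queries run on SQL Server. Because older SQLite builds lack RIGHT JOIN, the right join is written as a LEFT JOIN with the tables swapped, which returns the same rows.

```python
import sqlite3

TABLE1 = [1, 2, 3, 1, 1, None, 2, None]  # col1 values; None stands for NULL
TABLE2 = [1, 2, 3, None, 4]              # col2 values

JOIN_QUERIES = {
    "inner": "SELECT t1.col1, t2.col2 FROM table1 t1 "
             "INNER JOIN table2 t2 ON t1.col1 = t2.col2",
    "left": "SELECT t1.col1, t2.col2 FROM table1 t1 "
            "LEFT JOIN table2 t2 ON t1.col1 = t2.col2",
    # Equivalent to table1 RIGHT JOIN table2: keep every table2 row.
    "right": "SELECT t1.col1, t2.col2 FROM table2 t2 "
             "LEFT JOIN table1 t1 ON t1.col1 = t2.col2",
}


def join_rows(kind):
    """Return the rows produced by the named join type."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (col1 INTEGER)")
    conn.execute("CREATE TABLE table2 (col2 INTEGER)")
    conn.executemany("INSERT INTO table1 VALUES (?)", [(v,) for v in TABLE1])
    conn.executemany("INSERT INTO table2 VALUES (?)", [(v,) for v in TABLE2])
    rows = conn.execute(JOIN_QUERIES[kind]).fetchall()
    conn.close()
    return rows


def join_row_counts():
    """Return the number of rows each join type produces."""
    return {kind: len(join_rows(kind)) for kind in JOIN_QUERIES}


if __name__ == "__main__":
    print(join_row_counts())
```

NULL never equals NULL in a join condition, so the NULL rows survive only in the outer joins, and always with NULL on the other side.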
Self Join
A Self Join is a regular join, but the table is joined with itself. This can be
useful when comparing rows within the same table.
Let's compare the values in table1 with each other to find rows where the
value in col1 is the same:
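One possible form of such a self join is sketched below, again using Python's sqlite3 module as a stand-in for SQL Server. The aliases a and b are illustrative names, and the data is the table1 column from the join scenario (1, 2, 3, 1, 1, Null, 2, Null).

```python
import sqlite3
from collections import Counter

TABLE1 = [1, 2, 3, 1, 1, None, 2, None]  # None stands for NULL


def self_join_pairs():
    """Pair every table1 row with every table1 row holding the same col1 value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (col1 INTEGER)")
    conn.executemany("INSERT INTO table1 VALUES (?)", [(v,) for v in TABLE1])
    rows = conn.execute(
        "SELECT a.col1, b.col1 "
        "FROM table1 a "
        "JOIN table1 b ON a.col1 = b.col1"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    pairs = self_join_pairs()
    # Number of pairs produced for each value of col1.
    print(len(pairs), Counter(left for left, _ in pairs))
```

A value that appears n times yields n * n pairs (each row also matches itself), and the two NULL rows yield none because NULL = NULL is not true.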
SQL Showdown: WHERE vs. HAVING – Which One to Use and When?
Ever found yourself puzzled over when to use WHERE and when to reach for HAVING? Let’s
dive into the epic battle of these two SQL clauses with a simple example!
The Setup:
This gives us the total sales for each product. But what if you only want to see products with
total sales greater than 1000?
Oops! WHERE can’t be used with aggregate functions because it filters before the
aggregation. Lesson learned!
Using WHERE: You want to calculate total sales for just 'iPhone' and 'Speakers':
GROUP BY PRODUCT
Using HAVING: To calculate total sales of iPhone and Speakers with a HAVING clause, the query reads all
rows from the Sales table, computes the sum for every product, and only then discards every product except iPhone and Speakers.
Final Verdict:
HAVING is the hero when you need to filter groups after aggregation.
Speed Tip: WHERE is usually faster since it narrows down the data set first.
Use WHERE in SELECT, UPDATE, and DELETE statements; save HAVING for filtering grouped
data in SELECT statements.
Level up your SQL game by mastering these clauses and use them wisely to optimize your
queries!
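The difference can be seen end to end in a small sketch run through Python's sqlite3 module. The Sales table and PRODUCT column follow the text; the amount column and the sample rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample data: (product, amount)
SALES_ROWS = [
    ("iPhone", 700), ("iPhone", 600),
    ("Speakers", 300), ("Speakers", 200),
    ("Laptop", 1500), ("Laptop", 400),
]


def where_vs_having():
    """Run the WHERE and HAVING variants and collect their results."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (product TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", SALES_ROWS)
    results = {}

    # HAVING filters groups after SUM() has been computed.
    results["total_over_1000"] = dict(conn.execute(
        "SELECT product, SUM(amount) FROM sales "
        "GROUP BY product HAVING SUM(amount) > 1000"))

    # WHERE removes rows before grouping, so fewer rows are aggregated.
    results["where_filter"] = dict(conn.execute(
        "SELECT product, SUM(amount) FROM sales "
        "WHERE product IN ('iPhone', 'Speakers') GROUP BY product"))

    # HAVING gives the same answer but aggregates every product first.
    results["having_filter"] = dict(conn.execute(
        "SELECT product, SUM(amount) FROM sales "
        "GROUP BY product HAVING product IN ('iPhone', 'Speakers')"))

    # An aggregate inside WHERE is rejected by the database.
    try:
        conn.execute(
            "SELECT product, SUM(amount) FROM sales "
            "WHERE SUM(amount) > 1000 GROUP BY product")
        results["where_on_aggregate_fails"] = False
    except sqlite3.Error:
        results["where_on_aggregate_fails"] = True

    conn.close()
    return results


if __name__ == "__main__":
    for name, value in where_vs_having().items():
        print(name, value)
```

The WHERE and HAVING filters on product return identical totals here; the practical difference is how many rows are aggregated before the filter applies.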
Result: Flexible and checks multiple sources before settling on 'NO MANAGER'
SELECT E.NAME AS EMPNAME, CASE WHEN M.NAME IS NULL THEN 'NO MANAGER'
ELSE M.NAME END AS MANAGER
FROM EMPDETAILS E
LEFT JOIN EMPDETAILS M
ON E.MANAGERID=M.ID
DEPT
Salgrade
JobHistory
-- 3) Display employee names, salaries, and total salaries for each department.
SELECT E.ENAME, E.SAL, SUM(E.SAL) OVER (PARTITION BY E.DEPTNO) AS
TOTAL_DEPT_SALARY
FROM EMP E;
-- 9) Display the names of employees who have the highest salary in their department.
SELECT ENAME
FROM EMP E
WHERE SAL = (
SELECT MAX(SAL)
FROM EMP
WHERE DEPTNO = E.DEPTNO
);
-- 10) Display the department name and total salary for each department.
SELECT D.DNAME, SUM(E.SAL) AS TOTAL_SALARY
FROM EMP E
JOIN DEPT D ON E.DEPTNO = D.DEPTNO
GROUP BY D.DNAME;
--department Table
CREATE TABLE DEPT (
DEPTNO INT PRIMARY KEY, -- Department Number
DNAME VARCHAR(50), -- Department Name
LOC VARCHAR(50) -- Location
);
-- Salary Grade
CREATE TABLE SALGRADE (
GRADE INT, -- Salary Grade
LOSAL DECIMAL(10, 2), -- Lowest Salary for Grade
HISAL DECIMAL(10, 2) -- Highest Salary for Grade
);
-- Employee data
INSERT INTO EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO) VALUES
(7369, 'SMITH', 'CLERK', 7902, '1980-12-17', 800, NULL, 20),
(7499, 'ALLEN', 'SALESMAN', 7698, '1981-02-20', 1600, 300, 30),
(7521, 'WARD', 'SALESMAN', 7698, '1981-02-22', 1250, 500, 30),
(7566, 'JONES', 'MANAGER', 7839, '1981-04-02', 2975, NULL, 20),
(7698, 'BLAKE', 'MANAGER', 7839, '1981-05-01', 2850, NULL, 30),
(7782, 'CLARK', 'MANAGER', 7839, '1981-06-09', 2450, NULL, 10),
(7839, 'KING', 'PRESIDENT', NULL, '1981-11-17', 5000, NULL, 10),
(7902, 'FORD', 'ANALYST', 7566, '1981-12-03', 3000, NULL, 20),
(7934, 'MILLER', 'CLERK', 7782, '1982-01-23', 1300, NULL, 10);
-- 11) Display the names of employees who are working in the SALES department.
SELECT ENAME
FROM EMP E
JOIN DEPT D ON E.DEPTNO = D.DEPTNO
WHERE D.DNAME = 'SALES';
-- 12) Find the employee with the second highest salary in the company.
SELECT TOP 1 ENAME
FROM (
SELECT ENAME, RANK() OVER (ORDER BY SAL DESC) AS SAL_RANK
FROM EMP
) AS SAL_RANKED
WHERE SAL_RANK = 2;
-- More robust way: DENSE_RANK() still returns a row when several employees tie for the highest salary
SELECT ENAME
FROM (
SELECT ENAME, SAL, DENSE_RANK() OVER (ORDER BY SAL DESC) AS SAL_RANK
FROM EMP
) AS SAL_RANKED
WHERE SAL_RANK = 2;
--Using Joins
SELECT E1.ENAME
FROM EMP E1
JOIN EMP E2 ON E1.SAL < E2.SAL
GROUP BY E1.ENAME, E1.SAL
HAVING COUNT(DISTINCT E2.SAL) = 1;
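Why the choice between RANK() and DENSE_RANK() matters for this question shows up only when salaries tie. The sketch below re-implements both ranking rules in plain Python as an illustration, not as the database's own implementation. The employee list, including the tied top earner QUEEN, is hypothetical.

```python
def rank(values):
    """RANK(): ties share a rank and the following rank is skipped."""
    ordered = sorted(values, reverse=True)
    # Position of the first occurrence, counting from 1.
    return {v: ordered.index(v) + 1 for v in set(values)}


def dense_rank(values):
    """DENSE_RANK(): ties share a rank and no rank is skipped."""
    distinct = sorted(set(values), reverse=True)
    return {v: position + 1 for position, v in enumerate(distinct)}


def names_with_rank(employees, ranking, wanted):
    """Return the names whose salary has the wanted rank under the given rule."""
    ranks = ranking([salary for _, salary in employees])
    return [name for name, salary in employees if ranks[salary] == wanted]


# Hypothetical data with a tie for the highest salary.
EMPLOYEES = [("KING", 5000), ("QUEEN", 5000), ("FORD", 3000), ("JONES", 2975)]

if __name__ == "__main__":
    print(names_with_rank(EMPLOYEES, rank, 2))        # no salary has RANK 2
    print(names_with_rank(EMPLOYEES, dense_rank, 2))  # FORD
```

With the tie, RANK() assigns 1, 1, 3, 4, so filtering on rank 2 returns nothing, while DENSE_RANK() assigns 1, 1, 2, 3 and finds the second highest distinct salary.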
-- 13) Display the average salary for each job title, rounded to 2 decimal places.
SELECT JOB, ROUND(CAST(AVG(SAL) AS DECIMAL(10, 2)), 2) AS AVERAGE_SALARY
FROM EMP
GROUP BY JOB;
-- 14) Display the names of employees who joined after the employee 'SMITH'.
SELECT ENAME
FROM EMP
WHERE HIREDATE > (
SELECT HIREDATE
FROM EMP
WHERE ENAME = 'SMITH'
);
--Using CTE
WITH SMITH_HIREDATE AS (
SELECT HIREDATE
FROM EMP
WHERE ENAME = 'SMITH'
)
SELECT ENAME
FROM EMP, SMITH_HIREDATE
WHERE EMP.HIREDATE > SMITH_HIREDATE.HIREDATE;
-- 15) Display employee details along with their commission, but show '0' if no commission is given.
SELECT ENAME, SAL, ISNULL(COMM, 0) AS COMMISSION
FROM EMP;
Notes on ISNULL:
1. Definition:
o ISNULL is a SQL Server (T-SQL) function that replaces a NULL value
with a specified value. COALESCE is the standard SQL equivalent.
2. Syntax:
ISNULL(expression, replacement_value)
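The effect of replacing NULL can be demonstrated with a small sketch run through Python's sqlite3 module. SQLite has no ISNULL(expression, replacement_value) function, so the sketch uses COALESCE, which behaves the same way for two arguments and is also available in SQL Server. The three rows are taken from the EMP sample data; the reduced three-column table is an assumption made to keep the example short.

```python
import sqlite3


def commission_report():
    """Show commission and net pay with and without NULL replacement."""
    conn = sqlite3.connect(":memory:")
    # Reduced version of EMP: only the columns needed here.
    conn.execute("CREATE TABLE EMP (ENAME TEXT, SAL INTEGER, COMM INTEGER)")
    conn.executemany(
        "INSERT INTO EMP VALUES (?, ?, ?)",
        [("SMITH", 800, None), ("ALLEN", 1600, 300), ("WARD", 1250, 500)],
    )
    rows = conn.execute(
        "SELECT ENAME, "
        "       COALESCE(COMM, 0)       AS COMMISSION, "
        "       SAL + COMM              AS RAW_NET_PAY, "  # NULL if COMM is NULL
        "       SAL + COALESCE(COMM, 0) AS NET_PAY "
        "FROM EMP ORDER BY ENAME"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in commission_report():
        print(row)
```

The RAW_NET_PAY column shows why the replacement is needed: any arithmetic involving NULL yields NULL, so SMITH's net pay would be lost without it.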
Check out 200+ scenario based databricks and pyspark Scenario based Question in
Topmate: https://topmate.io/shivakiran_kotur/1376452
Check out 200+ Python Question and Answer for Data Engineer asked in Interview in
Topmate: https://topmate.io/shivakiran_kotur/1337666
SQL Mastery Series – 110 Question using 4 table – Set 3
-- 17) Display the names of employees who work in the same department as 'SMITH', excluding SMITH.
SELECT ENAME
FROM EMP
WHERE DEPTNO = (
SELECT DEPTNO
FROM EMP
WHERE ENAME = 'SMITH'
)
AND ENAME != 'SMITH';
--Using CTE
WITH SmithDept AS (
SELECT DEPTNO
FROM EMP
WHERE ENAME = 'SMITH'
)
SELECT ENAME
FROM EMP
WHERE DEPTNO = (SELECT DEPTNO FROM SmithDept)
AND ENAME != 'SMITH';
-- 19) Find employees whose job title contains the letter ‘M’.
SELECT ENAME, JOB
FROM EMP
WHERE JOB LIKE '%M%';
--Using CTE
WITH ManagerSalaries AS (
SELECT EMPNO, SAL
FROM EMP
)
SELECT E.ENAME
FROM EMP E
JOIN ManagerSalaries M ON E.MGR = M.EMPNO
WHERE E.SAL > M.SAL;
SQL Mastery Series – 110 Question using 4 table – Set 5
-- 31) Display those employees whose salary is less than their manager's salary but more than the salary of any other manager.
SELECT E.ENAME
FROM EMP E
JOIN EMP M ON E.MGR = M.EMPNO
WHERE E.SAL < M.SAL
AND E.SAL > ANY (SELECT SAL FROM EMP WHERE EMPNO IN (SELECT DISTINCT MGR FROM EMP WHERE MGR IS NOT
NULL));
-- 32) Display all employee names with the total salary of the company for each employee.
SELECT ENAME, (SELECT SUM(SAL) FROM EMP) AS TOTAL_COMPANY_SALARY
FROM EMP;
-- 37) Delete records from the employee table where the department number is not available in the department table.
DELETE FROM EMP
WHERE DEPTNO NOT IN (SELECT DEPTNO FROM DEPT);
-- 39) Display employee name, salary, commission, and net pay where the net pay is greater than any other employee's salary in the company.
SELECT ENAME, SAL, COMM, (SAL + ISNULL(COMM, 0)) AS NET_PAY
FROM EMP
WHERE (SAL + ISNULL(COMM, 0)) > ANY (SELECT SAL FROM EMP);
-- 40) Display the names of those employees who are going to retire on 31-Dec-99, if the maximum job period is 30 years.
SELECT ENAME
FROM EMP
WHERE DATEADD(YEAR, 30, HIREDATE) = '1999-12-31';
SQL Mastery Series – 110 Question using 4 table – Set 3
-- 23) Display employee details where the third character of the name is 'A'.
SELECT *
FROM EMP
WHERE ENAME LIKE '__A%';
--Using CTE
WITH AvgSalary AS (
SELECT AVG(SAL) AS avg_sal
FROM EMP
)
SELECT M.ENAME
FROM EMP M
WHERE M.EMPNO IN (SELECT DISTINCT MGR FROM EMP WHERE MGR IS NOT
NULL)
AND M.SAL > (SELECT avg_sal FROM AvgSalary);
-- 29) Display names of managers whose salary is more than the minimum salary of employees. (Duplicate question)
SELECT M.ENAME
FROM EMP M
WHERE M.EMPNO IN (SELECT DISTINCT MGR FROM EMP WHERE MGR IS NOT NULL)
AND M.SAL > (SELECT MIN(SAL) FROM EMP);
-- 30) Display employee name, salary, commission, and net pay for those employees whose net pay is greater than or equal to any other employee's salary.
SELECT ENAME, SAL, COMM, (SAL + ISNULL(COMM, 0)) AS NET_PAY
FROM EMP
WHERE (SAL + ISNULL(COMM, 0)) >= ALL (SELECT SAL FROM EMP);
--using joins
SELECT E.ENAME, E.SAL, E.COMM, (E.SAL + ISNULL(E.COMM, 0)) AS
NET_PAY
FROM EMP E
JOIN (
SELECT MAX(SAL) AS MAX_SAL
FROM EMP
) AS MaxSalary
ON (E.SAL + ISNULL(E.COMM, 0)) >= MaxSalary.MAX_SAL;
--Using CTE
WITH MaxSalary AS (
SELECT MAX(SAL) AS MAX_SAL
FROM EMP
)
SELECT E.ENAME, E.SAL, E.COMM, (E.SAL + ISNULL(E.COMM, 0)) AS
NET_PAY
FROM EMP E
WHERE (E.SAL + ISNULL(E.COMM, 0)) >= (SELECT MAX_SAL FROM
MaxSalary);
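The queries for questions 39 and 30 differ mainly in the quantifier: > ANY holds when the comparison is true for at least one row of the subquery, while >= ALL holds only when it is true for every row. The sketch below mirrors that logic with Python's any() and all() over the EMP sample rows. It is an illustration of the semantics, not how a database evaluates the query, and like the SQL it compares each employee against every salary, including their own.

```python
# (ENAME, SAL, COMM) taken from the EMP sample data; None stands for NULL.
EMP = [
    ("SMITH", 800, None), ("ALLEN", 1600, 300), ("WARD", 1250, 500),
    ("JONES", 2975, None), ("BLAKE", 2850, None), ("CLARK", 2450, None),
    ("KING", 5000, None), ("FORD", 3000, None), ("MILLER", 1300, None),
]


def net_pay(sal, comm):
    """SAL + ISNULL(COMM, 0): treat a missing commission as zero."""
    return sal + (comm if comm is not None else 0)


def net_pay_greater_than_any():
    """WHERE net_pay > ANY (SELECT SAL FROM EMP)."""
    salaries = [sal for _, sal, _ in EMP]
    return [name for name, sal, comm in EMP
            if any(net_pay(sal, comm) > s for s in salaries)]


def net_pay_at_least_all():
    """WHERE net_pay >= ALL (SELECT SAL FROM EMP)."""
    salaries = [sal for _, sal, _ in EMP]
    return [name for name, sal, comm in EMP
            if all(net_pay(sal, comm) >= s for s in salaries)]


if __name__ == "__main__":
    print(net_pay_greater_than_any())
    print(net_pay_at_least_all())
```

In effect, > ANY compares against the minimum salary and >= ALL against the maximum, which is why the join and CTE variants of question 30 can use MAX(SAL).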
SQL Mastery Series – 110 Question using 4 table – Set 6
-- 41) Display those employees whose salary is an odd value.
SELECT ENAME
FROM EMP
WHERE SAL % 2 = 1;
-- 43) Display those employees who joined the company in the month of December.
SELECT ENAME
FROM EMP
WHERE MONTH(HIREDATE) = 12;
-- 48) Display the details of employees who are getting the same salary as the minimum salary of any department.
SELECT *
FROM EMP
WHERE SAL IN (
SELECT MIN(SAL)
FROM EMP
GROUP BY DEPTNO
);
-- 49) Display the names of employees who joined in the year 1981 and are not getting any commission.
SELECT ENAME
FROM EMP
WHERE YEAR(HIREDATE) = 1981
AND COMM IS NULL;
SQL Mastery Series – 110 Question using 4 table – Set 7
-- 51) Display employee names whose total earnings (salary + commission) are greater than the average earnings of the company.
SELECT ENAME
FROM EMP
WHERE (SAL + ISNULL(COMM, 0)) > (
SELECT AVG(SAL + ISNULL(COMM, 0))
FROM EMP
);
-- 54) Display those employees who joined on the 10th of any month.
SELECT ENAME
FROM EMP
WHERE DAY(HIREDATE) = 10;
-- 55) Display the names of employees who have not joined in the year 1981.
SELECT ENAME
FROM EMP
WHERE YEAR(HIREDATE) <> 1981;
-- 56) Display the names of employees who are not clerks and whose salary is not more than 3000.
SELECT ENAME
FROM EMP
WHERE JOB <> 'CLERK'
AND SAL <= 3000;
-- 57) Display the names of employees who have not joined in the month of December.
SELECT ENAME
FROM EMP
WHERE MONTH(HIREDATE) <> 12;
-- 59) Display the names of employees who earn more than the minimum salary of their department.
SELECT ENAME
FROM EMP E
WHERE SAL > (
SELECT MIN(SAL)
FROM EMP
WHERE DEPTNO = E.DEPTNO
);
-- 64) Display the names of employees who joined in the first half of the year.
SELECT ENAME
FROM EMP
WHERE MONTH(HIREDATE) <= 6;
-- 65) Display the names of employees who have a higher salary than any clerk.
SELECT ENAME
FROM EMP
WHERE SAL > ANY (SELECT SAL FROM EMP WHERE JOB = 'CLERK');