Advanced SQL

The document discusses various advanced SQL techniques for ranking, finding the median, calculating running totals, percentages, and cumulative percentages in SQL. It explains that ranking and running totals require a self-join and ordering results to count or sum preceding values. Finding the median involves determining the middle rank. Percentages are calculated by dividing individual values by a subquery total. Cumulative percentages divide running totals by the subquery total.

Uploaded by

Raghav Prabhu
Copyright
© Attribution Non-Commercial (BY-NC)

SQL > Advanced SQL > Rank

Displaying the rank associated with each row is a common request, and there is no straightforward way to do so in basic SQL (that is, without window functions). To display rank, the idea is to do a self-join, list out the results in order, and count the number of records that are listed ahead of (and including) the record of interest. Let's use an example to illustrate. Say we have the following table,

Table Total_Sales
Name      Sales
John      10
Stella    20
Sophia    40
Greg      50
Jeff      20
Jennifer  15

we would type,

SELECT a1.Name, a1.Sales, COUNT(a2.Sales) Sales_Rank
FROM Total_Sales a1, Total_Sales a2
WHERE a1.Sales < a2.Sales OR (a1.Sales = a2.Sales AND a1.Name = a2.Name)
GROUP BY a1.Name, a1.Sales
ORDER BY a1.Sales DESC, a1.Name DESC;

Result:

Name      Sales  Sales_Rank
Greg      50     1
Sophia    40     2
Stella    20     3
Jeff      20     3
Jennifer  15     5
John      10     6

Let's focus on the WHERE clause. The first part of the clause, (a1.Sales < a2.Sales), counts the rows whose Sales value is strictly higher than that of the row of interest. The second part, (a1.Sales = a2.Sales AND a1.Name = a2.Name), adds the row itself to the count. If there were no duplicate values in the Sales column, the single condition (a1.Sales <= a2.Sales) would be sufficient to generate the correct ranking. With duplicates, however, that condition would make tied rows count each other, so Stella and Jeff would both get rank 4. Restricting the equal-Sales match to the row itself ensures that tied rows share the correct rank, 3, and that the next rank is 5.
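The ranking query can be checked end to end with a small script. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for the database (the tutorial itself targets MySQL); the variable names are illustrative only.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Total_Sales (Name TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO Total_Sales VALUES (?, ?)",
    [("John", 10), ("Stella", 20), ("Sophia", 40),
     ("Greg", 50), ("Jeff", 20), ("Jennifer", 15)],
)

# Self-join: count rows with strictly higher Sales, plus the row itself.
# A strict < is used so that tied rows do not count each other.
rank_rows = conn.execute("""
    SELECT a1.Name, a1.Sales, COUNT(a2.Sales) AS Sales_Rank
    FROM Total_Sales a1, Total_Sales a2
    WHERE a1.Sales < a2.Sales
       OR (a1.Sales = a2.Sales AND a1.Name = a2.Name)
    GROUP BY a1.Name, a1.Sales
    ORDER BY a1.Sales DESC, a1.Name DESC
""").fetchall()

for name, sales, sales_rank in rank_rows:
    print(f"{name:<9}{sales:>4}{sales_rank:>4}")
```

Stella and Jeff both come out with rank 3, and Jennifer follows with rank 5.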

SQL > Advanced SQL > Median

To get the median, we need to be able to accomplish the following:

1. Sort the rows in order and find the rank for each row.
2. Determine what is the "middle" rank. For example, if there are 9 rows, the middle rank would be 5.
3. Obtain the value for the middle-ranked row.

Let's use an example to illustrate. Say we have the following table,

Table Total_Sales
Name      Sales
John      10
Stella    20
Sophia    40
Greg      50
Jeff      20
Jennifer  15

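we would type,

SELECT Sales Median FROM
(SELECT a1.Name, a1.Sales, COUNT(a1.Sales) Rank
FROM Total_Sales a1, Total_Sales a2
WHERE a1.Sales < a2.Sales OR (a1.Sales = a2.Sales AND a1.Name <= a2.Name)
GROUP BY a1.Name, a1.Sales
ORDER BY a1.Sales DESC) a3
WHERE Rank = (SELECT (COUNT(*)+1) DIV 2 FROM Total_Sales);

Result:

Median
20

You will find that Lines 2-6 follow the same approach we used to find the rank of each row, except that ties are broken by Name so that every row gets a distinct rank. Line 7 finds the "middle" rank. DIV is the way to find the integer quotient in MySQL; the exact way to obtain the quotient may be different in other databases. Finally, Line 1 obtains the value for the middle-ranked row. With an even number of rows, as here, the query returns one of the two middle rows (rank 3 of 6) rather than their average. Note that RANK is a reserved word in MySQL 8.0 and later, where the alias must be quoted with backticks or renamed.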
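The three median steps can be sketched as a runnable script. This is an illustration under stated assumptions: Python's sqlite3 module stands in for MySQL, so integer division with / replaces DIV, and the alias is named Sales_Rank (a hypothetical rename) to avoid the reserved word RANK.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Total_Sales (Name TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO Total_Sales VALUES (?, ?)",
    [("John", 10), ("Stella", 20), ("Sophia", 40),
     ("Greg", 50), ("Jeff", 20), ("Jennifer", 15)],
)

# Inner query: distinct rank per row (ties broken by Name).
# Outer query: keep the row whose rank equals the middle rank.
# In SQLite, / between two integers is integer division, like MySQL's DIV.
median_row = conn.execute("""
    SELECT Sales AS Median
    FROM (SELECT a1.Name, a1.Sales, COUNT(a2.Sales) AS Sales_Rank
          FROM Total_Sales a1, Total_Sales a2
          WHERE a1.Sales < a2.Sales
             OR (a1.Sales = a2.Sales AND a1.Name <= a2.Name)
          GROUP BY a1.Name, a1.Sales) a3
    WHERE Sales_Rank = (SELECT (COUNT(*) + 1) / 2 FROM Total_Sales)
""").fetchone()

median_value = median_row[0]
print(median_value)
```

With six rows the middle rank is (6 + 1) / 2 = 3 under integer division, and the row ranked third has Sales = 20.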

SQL > Advanced SQL > Running Totals


Displaying running totals is a common request, and there is no straightforward way to do so in basic SQL (that is, without window functions). The idea for using SQL to display running totals is similar to that for displaying rank: first do a self-join, then list out the results in order. Whereas finding the rank requires counting the records that are listed ahead of (and including) the record of interest, finding the running total requires summing the values of those records. Let's use an example to illustrate. Say we have the following table,

Table Total_Sales
Name      Sales
John      10
Stella    20
Sophia    40
Greg      50
Jeff      20
Jennifer  15

we would type,

SELECT a1.Name, a1.Sales, SUM(a2.Sales) Running_Total
FROM Total_Sales a1, Total_Sales a2
WHERE a1.Sales < a2.Sales OR (a1.Sales = a2.Sales AND a1.Name <= a2.Name)
GROUP BY a1.Name, a1.Sales
ORDER BY a1.Sales DESC, a1.Name DESC;

Result:

Name      Sales  Running_Total
Greg      50     50
Sophia    40     90
Stella    20     110
Jeff      20     130
Jennifer  15     145
John      10     155

The combination of the WHERE clause and the ORDER BY clause ensures that the proper running totals are tabulated when there are duplicate values. The condition (a1.Name <= a2.Name) breaks ties by Name, so Stella's total includes only her own 20 while Jeff's includes both tied rows, and ORDER BY a1.Name DESC lists Stella before Jeff to match.
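The running-total technique can be verified with a short script. This is a minimal sketch assuming Python's sqlite3 module as a stand-in engine; it uses a strict < on Sales and breaks ties with a1.Name <= a2.Name, which is the form of the WHERE clause that reproduces the totals in the result table.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Total_Sales (Name TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO Total_Sales VALUES (?, ?)",
    [("John", 10), ("Stella", 20), ("Sophia", 40),
     ("Greg", 50), ("Jeff", 20), ("Jennifer", 15)],
)

# Sum the Sales of every row listed ahead of (and including) each row.
# Tied rows are ordered by Name so each one gets a distinct running total.
running_rows = conn.execute("""
    SELECT a1.Name, a1.Sales, SUM(a2.Sales) AS Running_Total
    FROM Total_Sales a1, Total_Sales a2
    WHERE a1.Sales < a2.Sales
       OR (a1.Sales = a2.Sales AND a1.Name <= a2.Name)
    GROUP BY a1.Name, a1.Sales
    ORDER BY a1.Sales DESC, a1.Name DESC
""").fetchall()

for name, sales, running_total in running_rows:
    print(f"{name:<9}{sales:>4}{running_total:>5}")
```

The last running total equals the grand total of the Sales column, 155, which is a quick sanity check for this kind of query.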

SQL > Advanced SQL > Percent To Total


To display percent to total in SQL, we want to leverage the ideas we used for rank/running total plus subquery. Different from what we saw in the SQL Subquery section, here we want to use the subquery as part of the SELECT. Let's use an example to illustrate. Say we have the following table,

Table Total_Sales
Name      Sales
John      10
Stella    20
Sophia    40
Greg      50
Jeff      20
Jennifer  15

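we would type,

SELECT a1.Name, a1.Sales, a1.Sales/(SELECT SUM(Sales) FROM Total_Sales) Pct_To_Total
FROM Total_Sales a1, Total_Sales a2
WHERE a1.Sales <= a2.Sales OR (a1.Sales = a2.Sales AND a1.Name = a2.Name)
GROUP BY a1.Name, a1.Sales
ORDER BY a1.Sales DESC, a1.Name DESC;

Result:

Name      Sales  Pct_To_Total
Greg      50     0.3226
Sophia    40     0.2581
Stella    20     0.1290
Jeff      20     0.1290
Jennifer  15     0.0968
John      10     0.0645

The subquery "SELECT SUM(Sales) FROM Total_Sales" calculates the sum. We can then divide the individual values by this sum to obtain the percent to total for each row. The self-join and GROUP BY are carried over from the rank query and do not affect these values, since every row is divided by the same grand total. In MySQL the / operator returns a decimal result even for integer operands; some other databases perform integer division here and need a cast or a multiplication by 1.0.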
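The subquery-in-SELECT idea can be tried out as follows. This is a simplified sketch, not the tutorial's exact query: it assumes Python's sqlite3 module as a stand-in engine, drops the self-join (which is not needed for a per-row percentage), and multiplies by 1.0 because SQLite divides integers as integers.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Total_Sales (Name TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO Total_Sales VALUES (?, ?)",
    [("John", 10), ("Stella", 20), ("Sophia", 40),
     ("Greg", 50), ("Jeff", 20), ("Jennifer", 15)],
)

# The scalar subquery computes the grand total once; each row's Sales is
# divided by it. "* 1.0" forces floating-point division in SQLite.
pct_rows = conn.execute("""
    SELECT Name, Sales,
           Sales * 1.0 / (SELECT SUM(Sales) FROM Total_Sales) AS Pct_To_Total
    FROM Total_Sales
    ORDER BY Sales DESC, Name DESC
""").fetchall()

for name, sales, pct in pct_rows:
    print(f"{name:<9}{sales:>4}  {pct:.4f}")
```

The percentages add up to 1, which is an easy way to confirm that the denominator covers the whole table.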

SQL > Advanced SQL > Cumulative Percent To Total


To display cumulative percent to total in SQL, we use the same idea as we saw in the Percent To Total section. The difference is that we want the cumulative percent to total, not the percentage contribution of each individual row. Let's use the following example to illustrate:

Table Total_Sales
Name      Sales
John      10
Stella    20
Sophia    40
Greg      50
Jeff      20
Jennifer  15

we would type,

SELECT a1.Name, a1.Sales, SUM(a2.Sales)/(SELECT SUM(Sales) FROM Total_Sales) Pct_To_Total
FROM Total_Sales a1, Total_Sales a2
WHERE a1.Sales < a2.Sales OR (a1.Sales = a2.Sales AND a1.Name <= a2.Name)
GROUP BY a1.Name, a1.Sales
ORDER BY a1.Sales DESC, a1.Name DESC;

Result:

Name      Sales  Pct_To_Total
Greg      50     0.3226
Sophia    40     0.5806
Stella    20     0.7097
Jeff      20     0.8387
Jennifer  15     0.9355
John      10     1.0000

The subquery "SELECT SUM(Sales) FROM Total_Sales" calculates the sum. We can then divide the running total, "SUM(a2.Sales)", by this sum to obtain the cumulative percent to total for each row. The condition (a1.Name <= a2.Name) breaks the tie between Stella and Jeff, so each of them gets a distinct cumulative value.
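The cumulative version combines the running total with the grand-total subquery. Below is a minimal sketch assuming Python's sqlite3 module as a stand-in engine; the "* 1.0" is there only because SQLite would otherwise perform integer division, and ties are broken by Name so that the figures match the result table.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Total_Sales (Name TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO Total_Sales VALUES (?, ?)",
    [("John", 10), ("Stella", 20), ("Sophia", 40),
     ("Greg", 50), ("Jeff", 20), ("Jennifer", 15)],
)

# Running total (self-join + SUM) divided by the grand total (subquery).
cum_rows = conn.execute("""
    SELECT a1.Name, a1.Sales,
           SUM(a2.Sales) * 1.0 / (SELECT SUM(Sales) FROM Total_Sales)
               AS Cum_Pct_To_Total
    FROM Total_Sales a1, Total_Sales a2
    WHERE a1.Sales < a2.Sales
       OR (a1.Sales = a2.Sales AND a1.Name <= a2.Name)
    GROUP BY a1.Name, a1.Sales
    ORDER BY a1.Sales DESC, a1.Name DESC
""").fetchall()

for name, sales, cum_pct in cum_rows:
    print(f"{name:<9}{sales:>4}  {cum_pct:.4f}")
```

The values increase from row to row, and the last row reaches exactly 1.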

http://www.1keydata.com/sql/sql-cumulative-percent-to-total.html

SQL > SQL Commands > NULL


In SQL, NULL means that data does not exist. NULL is not equal to 0 or an empty string. Both 0 and an empty string represent a value, while NULL has no value.

Any mathematical operation performed on NULL will result in NULL. For example,

10 + NULL = NULL

Aggregate functions such as SUM, COUNT, AVG, MAX, and MIN exclude NULL values (the exception is COUNT(*), which counts rows rather than values). This is not likely to cause any issues for SUM, MAX, and MIN. However, it can lead to confusion with AVG and COUNT. Let's take a look at the following example:

Table Sales_Data
store_name  Sales
Store A     300
Store B     200
Store C     100
Store D     NULL

Below are the results for each aggregate function:

SUM (Sales) = 600
AVG (Sales) = 200
MAX (Sales) = 300
MIN (Sales) = 100
COUNT (Sales) = 3

Note that the AVG function counts only 3 rows (the NULL row is excluded), so the average is 600 / 3 = 200, not 600 / 4 = 150. The COUNT function also ignores the NULL row, which is why COUNT (Sales) = 3.
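The aggregate results above can be reproduced with a short script. This is a minimal sketch assuming Python's sqlite3 module as a stand-in engine; COUNT(*) is included alongside COUNT(Sales) to show the difference between counting rows and counting non-NULL values.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales_Data (store_name TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO Sales_Data VALUES (?, ?)",
    [("Store A", 300), ("Store B", 200), ("Store C", 100), ("Store D", None)],
)

# Aggregates skip the NULL row; COUNT(*) counts every row regardless.
totals = conn.execute("""
    SELECT SUM(Sales), AVG(Sales), MAX(Sales), MIN(Sales),
           COUNT(Sales), COUNT(*)
    FROM Sales_Data
""").fetchone()

# Arithmetic on NULL yields NULL, which Python receives as None.
null_sum = conn.execute("SELECT 10 + NULL").fetchone()[0]

print(totals)
print(null_sum)
```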

SQL > SQL Commands > ISNULL Function


The ISNULL function is available in both SQL Server and MySQL. However, their uses are different:

SQL Server

In SQL Server, the ISNULL() function is used to replace a NULL value with another value. For example, if we have the following table,

Table Sales_Data
store_name  Sales
Store A     300
Store B     NULL

The following SQL,

SELECT SUM(ISNULL(Sales,100)) FROM Sales_Data;

returns 400. This is because NULL has been replaced by 100 via the ISNULL function.

MySQL

In MySQL, the ISNULL() function is used to test whether an expression is NULL. If the expression is NULL, this function returns 1. Otherwise, this function returns 0. For example,

ISNULL(3*3) returns 0
ISNULL(3/0) returns 1

SQL > SQL Commands > IFNULL Function


The IFNULL() function is available in MySQL, and not in SQL Server or Oracle. This function takes two arguments. If the first argument is not NULL, the function returns the first argument. Otherwise, the second argument is returned. This function is commonly used to replace a NULL value with another value. It is similar to the NVL function in Oracle and the ISNULL function in SQL Server.

For example, if we have the following table,

Table Sales_Data
store_name  Sales
Store A     300
Store B     NULL

The following SQL,

SELECT SUM(IFNULL(Sales,100)) FROM Sales_Data;

returns 400. This is because NULL has been replaced by 100 via the IFNULL function.
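The effect of IFNULL on an aggregate can be seen by running the query with and without it. This is a minimal sketch assuming Python's sqlite3 module as a stand-in engine (SQLite happens to provide IFNULL as well); the COALESCE column is added only for comparison with the standard function.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales_Data (store_name TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO Sales_Data VALUES (?, ?)",
    [("Store A", 300), ("Store B", None)],
)

# IFNULL substitutes 100 for the NULL before SUM sees the value.
# Plain SUM(Sales) simply skips the NULL row.
ifnull_sums = conn.execute("""
    SELECT SUM(IFNULL(Sales, 100)),
           SUM(COALESCE(Sales, 100)),
           SUM(Sales)
    FROM Sales_Data
""").fetchone()

print(ifnull_sums)
```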

SQL > SQL Commands > NVL Function


The NVL() function is available in Oracle, and not in MySQL or SQL Server. This function is used to replace a NULL value with another value. It is similar to the IFNULL function in MySQL and the ISNULL function in SQL Server. For example, if we have the following table,

Table Sales_Data
store_name  Sales
Store A     300
Store B     NULL
Store C     150

The following SQL,

SELECT SUM(NVL(Sales,100)) FROM Sales_Data;

returns 550. This is because NULL has been replaced by 100 via the NVL function, hence the sum of the 3 rows is 300 + 100 + 150 = 550.

SQL > SQL Commands > Coalesce Function


The COALESCE function in SQL returns the first non-NULL expression among its arguments. It is the same as the following CASE statement:

SELECT CASE
WHEN "expression 1" IS NOT NULL THEN "expression 1"
WHEN "expression 2" IS NOT NULL THEN "expression 2"
...
[ELSE NULL]
END
FROM "table_name"

For example, say we have the following table,

Table Contact_Info
Name   Business_Phone  Cell_Phone  Home_Phone
Jeff   531-2531        622-7813    565-9901
Laura  NULL            772-5588    312-4088
Peter  NULL            NULL        594-7477

and we want to find out the best way to contact each person according to the following rules:

1. If a person has a business phone, use the business phone number.
2. If a person does not have a business phone and has a cell phone, use the cell phone number.
3. If a person does not have a business phone, does not have a cell phone, and has a home phone, use the home phone number.

We can use the COALESCE function to achieve our goal:

SELECT Name, COALESCE(Business_Phone, Cell_Phone, Home_Phone) Contact_Phone
FROM Contact_Info;

Result:

Name   Contact_Phone
Jeff   531-2531
Laura  772-5588
Peter  594-7477
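The equivalence between COALESCE and a searched CASE expression can be checked directly. This is a minimal sketch assuming Python's sqlite3 module as a stand-in engine; the ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Contact_Info (
        Name TEXT, Business_Phone TEXT, Cell_Phone TEXT, Home_Phone TEXT
    )
""")
conn.executemany(
    "INSERT INTO Contact_Info VALUES (?, ?, ?, ?)",
    [("Jeff", "531-2531", "622-7813", "565-9901"),
     ("Laura", None, "772-5588", "312-4088"),
     ("Peter", None, None, "594-7477")],
)

# COALESCE returns the first non-NULL argument, left to right.
coalesce_rows = conn.execute("""
    SELECT Name, COALESCE(Business_Phone, Cell_Phone, Home_Phone)
    FROM Contact_Info
    ORDER BY Name
""").fetchall()

# The same rule written as a searched CASE expression.
case_rows = conn.execute("""
    SELECT Name,
           CASE
               WHEN Business_Phone IS NOT NULL THEN Business_Phone
               WHEN Cell_Phone IS NOT NULL THEN Cell_Phone
               ELSE Home_Phone
           END
    FROM Contact_Info
    ORDER BY Name
""").fetchall()

print(coalesce_rows)
print(coalesce_rows == case_rows)
```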

SQL > SQL Commands > NULLIF Function


The NULLIF function takes two arguments. If the two arguments are equal, then NULL is returned. Otherwise, the first argument is returned. It is the same as the following CASE statement:

SELECT CASE
WHEN "expression 1" = "expression 2" THEN NULL
[ELSE "expression 1"]
END
FROM "table_name"

For example, let's say we have a table that tracks actual sales and sales goal as below:

Table Sales_Data
Store_name  Actual  Goal
Store A     50      50
Store B     40      50
Store C     25      30

We want to show NULL if actual sales is equal to sales goal, and show actual sales if the two are different. To do this, we issue the following SQL statement:

SELECT Store_name, NULLIF(Actual,Goal) FROM Sales_Data;

The result is:

Store_name  NULLIF(Actual,Goal)
Store A     NULL
Store B     40
Store C     25
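NULLIF and its CASE equivalent can be compared side by side. This is a minimal sketch assuming Python's sqlite3 module as a stand-in engine; NULL values come back to Python as None.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales_Data (Store_name TEXT, Actual INTEGER, Goal INTEGER)"
)
conn.executemany(
    "INSERT INTO Sales_Data VALUES (?, ?, ?)",
    [("Store A", 50, 50), ("Store B", 40, 50), ("Store C", 25, 30)],
)

# NULLIF(a, b) yields NULL when a = b, otherwise a.
nullif_rows = conn.execute("""
    SELECT Store_name, NULLIF(Actual, Goal)
    FROM Sales_Data
    ORDER BY Store_name
""").fetchall()

# The same logic as a searched CASE expression.
case_rows = conn.execute("""
    SELECT Store_name,
           CASE WHEN Actual = Goal THEN NULL ELSE Actual END
    FROM Sales_Data
    ORDER BY Store_name
""").fetchall()

print(nullif_rows)
print(nullif_rows == case_rows)
```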

SQL > Advanced SQL > Limit


Sometimes we may not want to retrieve all the records that satisfy the criteria specified in the WHERE or HAVING clauses. In MySQL, this is accomplished using the LIMIT keyword. The syntax for LIMIT is as follows:

[SQL Statement 1]
LIMIT [N]

where [N] is the number of records to be returned. Please note that the ORDER BY clause is usually included in the SQL statement. Without the ORDER BY clause, the results we get would depend on the order in which the database happens to return the rows. For example, we may wish to show the two highest sales amounts in Table Store_Information

Table Store_Information
store_name     Sales  Date
Los Angeles    $1500  Jan-05-1999
San Diego      $250   Jan-07-1999
San Francisco  $300   Jan-08-1999
Boston         $700   Jan-08-1999

we key in,

SELECT store_name, Sales, Date
FROM Store_Information
ORDER BY Sales DESC
LIMIT 2;

Result:

store_name   Sales  Date
Los Angeles  $1500  Jan-05-1999
Boston       $700   Jan-08-1999
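The LIMIT example can be run as follows. This is a minimal sketch assuming Python's sqlite3 module as a stand-in engine (SQLite supports the same LIMIT syntax); it also assumes that Sales is stored as a plain number and that the dollar sign in the table is display formatting only, since sorting text such as '$700' would not give a numeric order.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Store_Information (store_name TEXT, Sales INTEGER, Date TEXT)"
)
conn.executemany(
    "INSERT INTO Store_Information VALUES (?, ?, ?)",
    [("Los Angeles", 1500, "Jan-05-1999"),
     ("San Diego", 250, "Jan-07-1999"),
     ("San Francisco", 300, "Jan-08-1999"),
     ("Boston", 700, "Jan-08-1999")],
)

# ORDER BY fixes the order; LIMIT 2 then keeps only the first two rows.
top_two = conn.execute("""
    SELECT store_name, Sales, Date
    FROM Store_Information
    ORDER BY Sales DESC
    LIMIT 2
""").fetchall()

for store_name, sales, sale_date in top_two:
    print(f"{store_name:<13}${sales:<6}{sale_date}")
```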
