Advanced SQL
Displaying the rank associated with each row is a common request, and there is no straightforward way to do so in SQL. To display rank in SQL, the idea is to do a self-join, list out the results in order, and count the number of records that are listed ahead of (and including) the record of interest. Let's use an example to illustrate. Say we have the following table,

Table Total_Sales
Name      Sales
John      10
Stella    20
Sophia    40
Greg      50
Jeff      20
Jennifer  15
we would type,

SELECT a1.Name, a1.Sales, COUNT(a2.Sales) Sales_Rank
FROM Total_Sales a1, Total_Sales a2
WHERE a1.Sales < a2.Sales OR (a1.Sales = a2.Sales AND a1.Name = a2.Name)
GROUP BY a1.Name, a1.Sales
ORDER BY a1.Sales DESC, a1.Name DESC;

Result:

Name      Sales  Sales_Rank
Greg      50     1
Sophia    40     2
Stella    20     3
Jeff      20     3
Jennifer  15     5
John      10     6

Let's focus on the WHERE clause. The first part of the clause, (a1.Sales < a2.Sales), pairs each row with every row whose value in the Sales column is strictly greater, so those rows are counted as being ahead of it. The second part of the clause, (a1.Sales = a2.Sales AND a1.Name = a2.Name), pairs each row with itself, so the count includes the row of interest and the top row gets rank 1 instead of dropping out of the join. Together they ensure that when there are duplicate values in the Sales column, the tied rows get the same rank: Stella and Jeff both rank 3, and the next row ranks 5. Note that this relies on Name being unique in the table.
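To check the ranking logic, here is a minimal sketch that runs the self-join against an in-memory SQLite database through Python's sqlite3 module. The use of SQLite and the variable names are assumptions made for illustration; the table data comes from the example above. The query uses the strict comparison (a1.Sales < a2.Sales) together with the self-match term, which is the combination that reproduces the ranks 1, 2, 3, 3, 5, 6 shown in the result.

```python
import sqlite3

# In-memory database holding the Total_Sales example table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Total_Sales (Name TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO Total_Sales VALUES (?, ?)",
    [("John", 10), ("Stella", 20), ("Sophia", 40),
     ("Greg", 50), ("Jeff", 20), ("Jennifer", 15)],
)

# Each row is paired with every row that has strictly higher Sales,
# plus itself; counting the pairs gives the rank.
rank_rows = conn.execute("""
    SELECT a1.Name, a1.Sales, COUNT(a2.Sales) AS Sales_Rank
    FROM Total_Sales a1, Total_Sales a2
    WHERE a1.Sales < a2.Sales
       OR (a1.Sales = a2.Sales AND a1.Name = a2.Name)
    GROUP BY a1.Name, a1.Sales
    ORDER BY a1.Sales DESC, a1.Name DESC
""").fetchall()

for name, sales, sales_rank in rank_rows:
    print(name, sales, sales_rank)
```

Tied rows (Stella and Jeff) share rank 3 because neither counts the other as being ahead of it.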
To get the median, we need to be able to accomplish the following:
1. Sort the rows in order and find the rank for each row.
2. Determine what is the "middle" rank. For example, if there are 9 rows, the middle rank would be 5.
3. Obtain the value for the middle-ranked row.
Let's use an example to illustrate. Say we have the following table,

Table Total_Sales

Name      Sales
John      10
Stella    20
Sophia    40
Greg      50
Jeff      20
Jennifer  15
we would type,

SELECT Sales Median FROM
(SELECT a1.Name, a1.Sales, COUNT(a1.Sales) Rank
FROM Total_Sales a1, Total_Sales a2
WHERE a1.Sales < a2.Sales OR (a1.Sales=a2.Sales AND a1.Name <= a2.Name)
GROUP BY a1.Name, a1.Sales
ORDER BY a1.Sales DESC) a3
WHERE Rank = (SELECT (COUNT(*)+1) DIV 2 FROM Total_Sales);

Result:

Median
20

You will find that Lines 2-6 are the same as how we find the rank of each row, except that ties are broken by Name (a1.Name <= a2.Name) so that every row gets a distinct rank. Line 7 finds the "middle" rank. DIV is the way to find the quotient in MySQL; the exact way to obtain the quotient may be different with other databases. Finally, Line 1 obtains the value for the middle-ranked row. Note that with an even number of rows, as here, this picks the row at rank (6+1) DIV 2 = 3 rather than averaging the two middle values.
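The three steps can be sketched end to end with Python's sqlite3 module. This is an illustration under assumptions rather than the MySQL statement itself: SQLite has no DIV operator, so integer division with / stands in for it, and the alias Sales_Rank is used in place of Rank to stay clear of any keyword clash.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Total_Sales (Name TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO Total_Sales VALUES (?, ?)",
    [("John", 10), ("Stella", 20), ("Sophia", 40),
     ("Greg", 50), ("Jeff", 20), ("Jennifer", 15)],
)

# Inner query: distinct rank per row, ties broken by Name.
# Outer query: keep the row whose rank equals the middle rank.
# (COUNT(*) + 1) / 2 is integer division in SQLite: (6 + 1) / 2 = 3.
median = conn.execute("""
    SELECT Sales AS Median FROM
      (SELECT a1.Name, a1.Sales, COUNT(a1.Sales) AS Sales_Rank
       FROM Total_Sales a1, Total_Sales a2
       WHERE a1.Sales < a2.Sales
          OR (a1.Sales = a2.Sales AND a1.Name <= a2.Name)
       GROUP BY a1.Name, a1.Sales) a3
    WHERE Sales_Rank = (SELECT (COUNT(*) + 1) / 2 FROM Total_Sales)
""").fetchone()[0]

print(median)
```

Because the tie-break makes every rank distinct, exactly one row matches the middle rank.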
Table Total_Sales
Name      Sales
John      10
Stella    20
Sophia    40
Greg      50
Jeff      20
Jennifer  15
we would type,

SELECT a1.Name, a1.Sales, SUM(a2.Sales) Running_Total
FROM Total_Sales a1, Total_Sales a2
WHERE a1.Sales < a2.Sales OR (a1.Sales = a2.Sales AND a1.Name <= a2.Name)
GROUP BY a1.Name, a1.Sales
ORDER BY a1.Sales DESC, a1.Name DESC;

Result:

Name      Sales  Running_Total
Greg      50     50
Sophia    40     90
Stella    20     110
Jeff      20     130
Jennifer  15     145
John      10     155

The combination of the WHERE clause and the ORDER BY clause ensures that the proper running totals are tabulated when there are duplicate values. The tie-break on Name (a1.Name <= a2.Name) follows the same order as ORDER BY a1.Name DESC, so Stella's total (110) includes only her own 20, while Jeff's total (130) includes both tied rows.
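The running total can be verified with a short sketch using Python's sqlite3 module (SQLite and the variable names are assumptions for illustration). The WHERE clause here breaks ties on Name in the same direction as the ORDER BY clause, which is what yields the distinct totals 110 and 130 for the two rows with Sales of 20.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Total_Sales (Name TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO Total_Sales VALUES (?, ?)",
    [("John", 10), ("Stella", 20), ("Sophia", 40),
     ("Greg", 50), ("Jeff", 20), ("Jennifer", 15)],
)

# For each row, sum the Sales of every row that sorts at or before it
# in (Sales DESC, Name DESC) order.
running_rows = conn.execute("""
    SELECT a1.Name, a1.Sales, SUM(a2.Sales) AS Running_Total
    FROM Total_Sales a1, Total_Sales a2
    WHERE a1.Sales < a2.Sales
       OR (a1.Sales = a2.Sales AND a1.Name <= a2.Name)
    GROUP BY a1.Name, a1.Sales
    ORDER BY a1.Sales DESC, a1.Name DESC
""").fetchall()

for row in running_rows:
    print(row)
```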
Table Total_Sales

Name      Sales
John      10
Stella    20
Sophia    40
Greg      50
Jeff      20
Jennifer  15
we would type,

SELECT a1.Name, a1.Sales, a1.Sales/(SELECT SUM(Sales) FROM Total_Sales) Pct_To_Total
FROM Total_Sales a1, Total_Sales a2
WHERE a1.Sales <= a2.Sales OR (a1.Sales = a2.Sales AND a1.Name = a2.Name)
GROUP BY a1.Name, a1.Sales
ORDER BY a1.Sales DESC, a1.Name DESC;

Result:

Name      Sales  Pct_To_Total
Greg      50     0.3226
Sophia    40     0.2581
Stella    20     0.1290
Jeff      20     0.1290
Jennifer  15     0.0968
John      10     0.0645
The subquery "SELECT SUM(Sales) FROM Total_Sales" calculates the sum. We can then divide the individual values by this sum to obtain the percent to total for each row.
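As a quick check of the arithmetic, the sketch below computes percent to total with Python's sqlite3 module. It is a simplified variant under assumptions: it omits the self-join, which this particular calculation does not appear to need, multiplies by 1.0 because SQLite would otherwise perform integer division, and rounds to four places to match the figures above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Total_Sales (Name TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO Total_Sales VALUES (?, ?)",
    [("John", 10), ("Stella", 20), ("Sophia", 40),
     ("Greg", 50), ("Jeff", 20), ("Jennifer", 15)],
)

# The scalar subquery returns the grand total (155); each row's Sales
# is divided by it. "* 1.0" forces floating-point division in SQLite.
pct_rows = conn.execute("""
    SELECT Name, Sales,
           ROUND(Sales * 1.0 / (SELECT SUM(Sales) FROM Total_Sales), 4)
             AS Pct_To_Total
    FROM Total_Sales
    ORDER BY Sales DESC, Name DESC
""").fetchall()

for row in pct_rows:
    print(row)
```

In databases that divide integers as integers, leaving out the conversion to a decimal type would return 0 for every row.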
Table Total_Sales

Name      Sales
John      10
Stella    20
Sophia    40
Greg      50
Jeff      20
Jennifer  15
we would type,

SELECT a1.Name, a1.Sales, SUM(a2.Sales)/(SELECT SUM(Sales) FROM Total_Sales) Pct_To_Total
FROM Total_Sales a1, Total_Sales a2
WHERE a1.Sales < a2.Sales OR (a1.Sales = a2.Sales AND a1.Name <= a2.Name)
GROUP BY a1.Name, a1.Sales
ORDER BY a1.Sales DESC, a1.Name DESC;

Result:

Name      Sales  Pct_To_Total
Greg      50     0.3226
Sophia    40     0.5806
Stella    20     0.7097
Jeff      20     0.8387
Jennifer  15     0.9355
John      10     1.0000
The subquery "SELECT SUM(Sales) FROM Total_Sales" calculates the sum. We can then divide the running total, "SUM(a2.Sales)", by this sum to obtain the cumulative percent to total for each row.
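The cumulative figures can be reproduced with a sketch in Python's sqlite3 module (SQLite, the 1.0 multiplier that forces floating-point division, and the four-place rounding are assumptions for illustration). The join condition breaks ties on Name so that the two rows with Sales of 20 receive different cumulative values, 0.7097 and 0.8387, as in the result above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Total_Sales (Name TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO Total_Sales VALUES (?, ?)",
    [("John", 10), ("Stella", 20), ("Sophia", 40),
     ("Greg", 50), ("Jeff", 20), ("Jennifer", 15)],
)

# SUM(a2.Sales) is the running total for each row; dividing it by the
# grand total (155) gives the cumulative percent to total.
cum_rows = conn.execute("""
    SELECT a1.Name, a1.Sales,
           ROUND(SUM(a2.Sales) * 1.0
                 / (SELECT SUM(Sales) FROM Total_Sales), 4) AS Pct_To_Total
    FROM Total_Sales a1, Total_Sales a2
    WHERE a1.Sales < a2.Sales
       OR (a1.Sales = a2.Sales AND a1.Name <= a2.Name)
    GROUP BY a1.Name, a1.Sales
    ORDER BY a1.Sales DESC, a1.Name DESC
""").fetchall()

for row in cum_rows:
    print(row)
```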
Below are the results for each aggregate function:

SUM (Sales) = 600
AVG (Sales) = 200
MAX (Sales) = 300
MIN (Sales) = 100
COUNT (Sales) = 3

Note that the AVG function counts only 3 rows (the NULL row is excluded), so the average is 600 / 3 = 200, not 600 / 4 = 150. The COUNT function also ignores the NULL row, which is why COUNT (Sales) = 3.
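The behavior can be observed directly with a small sketch in Python's sqlite3 module. The store names and the exact row layout are hypothetical; the three non-NULL values (100, 200, 300) are the ones implied by the results above. COUNT(*) is included for contrast, since it counts rows rather than non-NULL values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales_Data (Store_Name TEXT, Sales INTEGER)")
# Hypothetical rows: three stores with sales and one with NULL.
conn.executemany(
    "INSERT INTO Sales_Data VALUES (?, ?)",
    [("Store A", 300), ("Store B", 200), ("Store C", 100), ("Store D", None)],
)

# Every aggregate over the Sales column skips the NULL row;
# COUNT(*) counts all four rows regardless of NULLs.
agg_row = conn.execute("""
    SELECT SUM(Sales), AVG(Sales), MAX(Sales), MIN(Sales),
           COUNT(Sales), COUNT(*)
    FROM Sales_Data
""").fetchone()

print(agg_row)
```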
The following SQL,

SELECT SUM(ISNULL(Sales,100)) FROM Sales_Data;

returns 400. This is because NULL has been replaced by 100 via the ISNULL function.

MySQL

In MySQL, the ISNULL() function is used to test whether an expression is NULL. If the expression is NULL, this function returns 1. Otherwise, this function returns 0. For example,

ISNULL(3*3) returns 0
ISNULL(3/0) returns 1
The following SQL,

SELECT SUM(IFNULL(Sales,100)) FROM Sales_Data;

returns 400. This is because NULL has been replaced by 100 via the IFNULL function.
The following SQL,

SELECT SUM(NVL(Sales,100)) FROM Sales_Data;

returns 550. This is because NULL has been replaced by 100 via the NVL function, hence the sum of the 3 rows is 300 + 100 + 150 = 550.
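A minimal sketch of NULL replacement, run through Python's sqlite3 module, is shown here. It uses the three values from the NVL example (300, NULL, 150); the store names are hypothetical. SQLite is an assumption: it provides IFNULL and the standard COALESCE, but not the ISNULL(expr, value) or NVL forms, so only those two are demonstrated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales_Data (Store_Name TEXT, Sales INTEGER)")
# Hypothetical store names; Sales values taken from the NVL example.
conn.executemany(
    "INSERT INTO Sales_Data VALUES (?, ?)",
    [("Store A", 300), ("Store B", None), ("Store C", 150)],
)

# Without replacement, SUM skips the NULL row: 300 + 150.
plain_sum = conn.execute("SELECT SUM(Sales) FROM Sales_Data").fetchone()[0]

# IFNULL substitutes 100 for the NULL before summing: 300 + 100 + 150.
ifnull_sum = conn.execute(
    "SELECT SUM(IFNULL(Sales, 100)) FROM Sales_Data"
).fetchone()[0]

# COALESCE with two arguments behaves the same way.
coalesce_sum = conn.execute(
    "SELECT SUM(COALESCE(Sales, 100)) FROM Sales_Data"
).fetchone()[0]

print(plain_sum, ifnull_sum, coalesce_sum)
```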
FROM "table_name"

For example, say we have the following table,

Table Contact_Info
Name   Business_Phone  Cell_Phone  Home_Phone
Jeff   531-2531        622-7813    565-9901
Laura  NULL            772-5588    312-4088
Peter  NULL            NULL        594-7477
and we want to find out the best way to contact each person according to the following rules:
1. If a person has a business phone, use the business phone number.
2. If a person does not have a business phone and has a cell phone, use the cell phone number.
3. If a person does not have a business phone, does not have a cell phone, and has a home phone, use the home phone number.

We can use the COALESCE function to achieve our goal:

SELECT Name, COALESCE(Business_Phone, Cell_Phone, Home_Phone) Contact_Phone
FROM Contact_Info;

Result:

Name   Contact_Phone
Jeff   531-2531
Laura  772-5588
Peter  594-7477
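The three contact rules map directly onto the argument order of COALESCE, which returns its first non-NULL argument. The sketch below checks this with Python's sqlite3 module (the use of SQLite, the ORDER BY clause, and the variable names are assumptions for illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Contact_Info (
        Name TEXT, Business_Phone TEXT, Cell_Phone TEXT, Home_Phone TEXT
    )
""")
conn.executemany(
    "INSERT INTO Contact_Info VALUES (?, ?, ?, ?)",
    [("Jeff", "531-2531", "622-7813", "565-9901"),
     ("Laura", None, "772-5588", "312-4088"),
     ("Peter", None, None, "594-7477")],
)

# COALESCE walks its arguments left to right and returns the first
# one that is not NULL: business phone, then cell, then home.
contact_rows = conn.execute("""
    SELECT Name,
           COALESCE(Business_Phone, Cell_Phone, Home_Phone) AS Contact_Phone
    FROM Contact_Info
    ORDER BY Name
""").fetchall()

for row in contact_rows:
    print(row)
```

If all three phone columns were NULL for a person, COALESCE would return NULL for that row.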
Table Sales_Data
Store_name  Actual  Goal
Store A     50      50
Store B     40      50
Store C     25      30
We want to show NULL if actual sales equal the sales goal, and show actual sales if the two are different. To do this, we issue the following SQL statement:

SELECT Store_name, NULLIF(Actual,Goal)
FROM Sales_Data;

The result is:

Store_name  NULLIF(Actual,Goal)
Store A     NULL
Store B     40
Store C     25
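The same statement can be tried out with Python's sqlite3 module, as in this minimal sketch (SQLite, the column alias, and the ORDER BY clause are assumptions for illustration). NULLIF(a, b) returns NULL when the two arguments are equal and returns the first argument otherwise; Python shows the SQL NULL as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales_Data (Store_name TEXT, Actual INTEGER, Goal INTEGER)"
)
conn.executemany(
    "INSERT INTO Sales_Data VALUES (?, ?, ?)",
    [("Store A", 50, 50), ("Store B", 40, 50), ("Store C", 25, 30)],
)

# Store A met its goal exactly, so NULLIF yields NULL for that row;
# the other stores keep their actual sales figure.
nullif_rows = conn.execute("""
    SELECT Store_name, NULLIF(Actual, Goal) AS Actual_If_Different
    FROM Sales_Data
    ORDER BY Store_name
""").fetchall()

for row in nullif_rows:
    print(row)
```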
we key in,

SELECT store_name, Sales, Date
FROM Store_Information
ORDER BY Sales DESC
LIMIT 2;

Result: store_name Sales Boston Date