SQL
Like I
LIKE can be a useful operator when you want to compare similar values.
The movies table contains two films with similar titles, 'Se7en' and 'Seven'.
How could we select all movies that start with 'Se' and end with 'en' and have exactly one
character in the middle?
SELECT * FROM movies WHERE name LIKE 'Se_en';
LIKE is a special operator used with the WHERE clause to search for a specific pattern in
a column.
name LIKE 'Se_en' is a condition evaluating the name column for a specific pattern.
Se_en represents a pattern with a wildcard character.
The _ means you can substitute any individual character here without breaking the pattern.
The names Seven and Se7en both match this pattern.
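The single-character wildcard can be tried out with a minimal sketch like the one below. It assumes Python's built-in sqlite3 module and a handful of made-up rows; the lesson's real movies table is not reproduced here, and the extra titles are hypothetical.

```python
import sqlite3

# Hypothetical sample rows; not the lesson's actual movies table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [("Se7en", 1995), ("Seven", 1979), ("Seen", 2001), ("Seventeen", 2016)],
)

# '_' stands for exactly one character, so only five-character names of the
# form S-e-?-e-n can match.
matches = [row[0] for row in conn.execute(
    "SELECT name FROM movies WHERE name LIKE 'Se_en' ORDER BY name"
)]
print(matches)
```

'Seen' is too short and 'Seventeen' is too long, so neither is returned.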
1.
Let's test it out.
Like II
The percentage sign % is another wildcard character that can be used with LIKE.
This statement below filters the result set to only include movies with names that begin
with the letter 'A':
SELECT * FROM movies WHERE name LIKE 'A%';
% is a wildcard character that matches zero or more missing characters in the pattern. For
example:
A% matches all movies with names that begin with letter 'A'
%a matches all movies that end with 'a'
LIKE is not case sensitive. 'Batman' and 'Man of Steel' both match the pattern '%man%'.
1.
In the text editor, type:
SELECT * FROM movies WHERE name LIKE '%man%';
How many movie titles contain the word 'man'?
LIKE 'c%' finds any values that start with the letter 'c'.
LIKE '%c' finds any values that end with the letter 'c'.
LIKE '%re%' finds values that have 're' in any position.
LIKE '_a%' finds any values that have the letter 'a' as the second character.
LIKE 'a_%_%' finds any values that start with 'a' and are at least 3 characters in length.
LIKE 'a%r' finds any values that start with 'a' and end with 'r'.
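A few of these % patterns can be checked with a small sketch. It assumes Python's built-in sqlite3 module; the table contents and the helper name names_like are made up for illustration. In SQLite, LIKE ignores case for ASCII letters by default, which is why '%man%' also finds 'Man of Steel'.

```python
import sqlite3

# Hypothetical sample rows; not the lesson's actual movies table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?)",
    [("Aladdin",), ("Avatar",), ("Batman",), ("Man of Steel",), ("The Matrix",)],
)

def names_like(pattern):
    """Return the movie names matching a LIKE pattern, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM movies WHERE name LIKE ? ORDER BY name", (pattern,)
    )
    return [row[0] for row in rows]

print(names_like("A%"))      # names that begin with 'A'
print(names_like("%man%"))   # names that contain 'man' anywhere
print(names_like("The %"))   # names that begin with the word 'The'
```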
2.
Let's try one more.
Edit the query so that it selects all the information about the movie titles that begin with
the word 'The'.
Is Null
By this point of the lesson, you might have noticed that there are a few missing values in
the movies table. More often than not, the data you encounter will have missing values.
It is not possible to test for NULL values with comparison operators, such as = and !=.
Instead, we have to use these operators:
IS NULL
IS NOT NULL
To query for movies that have a missing value in their imdb_rating field:
SELECT name FROM movies WHERE imdb_rating IS NULL;
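The difference between = NULL and IS NULL can be seen in a short sketch. It assumes Python's built-in sqlite3 module and three made-up rows (the ratings are illustrative, not taken from the lesson's data).

```python
import sqlite3

# Hypothetical sample rows; Python's None is stored as SQL NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT, imdb_rating REAL)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [("Up", 8.2), ("Cars", None), ("Brave", None)],
)

# A comparison with = NULL never evaluates to true, so no rows come back.
with_equals = conn.execute(
    "SELECT name FROM movies WHERE imdb_rating = NULL"
).fetchall()

# IS NULL is the operator that actually finds the missing values.
missing = [row[0] for row in conn.execute(
    "SELECT name FROM movies WHERE imdb_rating IS NULL ORDER BY name"
)]
print(with_equals, missing)
```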
Between
The BETWEEN operator can be used in a WHERE clause to filter the result set within a
certain range. The values can be numbers, text or dates.
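This statement filters the result set to only include movies with names that begin with the letters
'A' up to, but not including, 'J'. (BETWEEN includes both bounds, so a name that is exactly 'J'
would match, but any longer name beginning with 'J' sorts after 'J' and is excluded.)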
SELECT * FROM movies WHERE name BETWEEN 'A' AND 'J';
Here is another one:
SELECT * FROM movies WHERE year BETWEEN 1990 AND 1999;
In this statement, the BETWEEN operator is being used to filter the result set to only include
movies released from 1990 up to and including 1999.
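How BETWEEN treats its bounds for text and for numbers can be checked with a sketch like this one. It assumes Python's built-in sqlite3 module; the rows are made up, and the one-letter title 'J' is a hypothetical entry added only to show that the upper bound itself is included.

```python
import sqlite3

# Hypothetical sample rows; not the lesson's actual movies table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [("Alien", 1979), ("Goodfellas", 1990), ("Inception", 2010),
     ("J", 2000), ("Jaws", 1975), ("The Matrix", 1999)],
)

# Text range: 'J' itself matches, but 'Jaws' sorts after 'J' and is excluded.
by_name = [row[0] for row in conn.execute(
    "SELECT name FROM movies WHERE name BETWEEN 'A' AND 'J' ORDER BY name"
)]

# Numeric range: both 1990 and 1999 are included.
by_year = [row[0] for row in conn.execute(
    "SELECT name FROM movies WHERE year BETWEEN 1990 AND 1999 ORDER BY year"
)]
print(by_name, by_year)
```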
Instructions
1.
Using the BETWEEN operator, write a query that selects all information about movies
whose name begins with the letters 'D', 'E', and 'F'.
This should be very similar to the first query in the narrative.
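BETWEEN 'D' AND 'G' should be the condition.
The upper bound should be 'G' because, for text, the range ends at the exact string 'G'; any longer
name that begins with 'G' sorts after 'G' and is excluded.
BETWEEN 'D' AND 'F' shouldn't be the condition because it would return names that begin with 'D' and
'E', but not names that begin with 'F' (apart from a name that is exactly 'F').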
BETWEEN is case-sensitive. If the condition is BETWEEN 'a' AND 'z', it would only return lowercase (a-z)
results and not uppercase (A-Z).
2.
Remove the previous query.
Using the BETWEEN operator, write a new query that selects all information about movies that
were released in the 1970's.
In this statement, the BETWEEN operator is being used to filter the result set to only include movies with
years in 1970-1979:
SELECT * FROM movies WHERE year BETWEEN 1970 AND 1979;
Also, numeric values (INTEGER or REAL data types) don't need to be wrapped with single quotes,
whereas TEXT string values do.
And
Sometimes we want to combine multiple conditions in a WHERE clause to make the result set
more specific and useful.
One way of doing this is to use the AND operator. Here, we use the AND operator to only
return 90's romance movies.
SELECT * FROM movies WHERE year BETWEEN 1990 AND 1999 AND genre = 'romance';
year BETWEEN 1990 AND 1999 is the 1st condition.
genre = 'romance' is the 2nd condition.
AND combines the two conditions.
With AND, both conditions must be true for the row to be included in the result.
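The query above can be run against a few rows to confirm that a row must satisfy both conditions. This is a minimal sketch assuming Python's built-in sqlite3 module and made-up sample rows.

```python
import sqlite3

# Hypothetical sample rows; not the lesson's actual movies table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT, year INTEGER, genre TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [("Titanic", 1997, "romance"), ("Notting Hill", 1999, "romance"),
     ("The Matrix", 1999, "action"), ("Annie Hall", 1977, "romance")],
)

# Only rows that are both from the 1990s and in the romance genre survive.
nineties_romance = [row[0] for row in conn.execute(
    "SELECT name FROM movies "
    "WHERE year BETWEEN 1990 AND 1999 AND genre = 'romance' "
    "ORDER BY name"
)]
print(nineties_romance)
```

'The Matrix' fails the genre condition and 'Annie Hall' fails the year condition, so both are left out.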
1.
In the previous exercise, we retrieved every movie released in the 1970's.
Now, let's retrieve every movie released in the 70's, that's also well received.
Or
Similar to AND, the OR operator can also be used to combine multiple conditions in WHERE, but
there is a fundamental difference:
AND displays a row if all the conditions are true.
OR displays a row if any condition is true.
We are putting OR genre = 'action' on another line and indented just so it is easier to read.
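The contrast between OR and AND can be sketched as follows. The example assumes Python's built-in sqlite3 module; the rows and the first condition (year > 2014) are assumptions chosen for illustration, since only the genre = 'action' condition is given in the text.

```python
import sqlite3

# Hypothetical sample rows; not the lesson's actual movies table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT, year INTEGER, genre TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [("Die Hard", 1988, "action"), ("Inside Out", 2015, "comedy"),
     ("Titanic", 1997, "romance"), ("Mad Max: Fury Road", 2015, "action")],
)

# OR keeps a row when at least one condition is true.
either = [row[0] for row in conn.execute(
    "SELECT name FROM movies WHERE year > 2014 OR genre = 'action' ORDER BY name"
)]

# AND keeps a row only when both conditions are true.
both = [row[0] for row in conn.execute(
    "SELECT name FROM movies WHERE year > 2014 AND genre = 'action' ORDER BY name"
)]
print(either, both)
```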
Order By
That's it with WHERE and its operators. Moving on!
It is often useful to list the data in our result set in a particular order.
We can sort the results using ORDER BY, either alphabetically or numerically. Sorting the
results often makes the data more useful and easier to analyze.
For example, if we want to sort everything by the movie's title from A through Z:
SELECT * FROM movies ORDER BY name;
ORDER BY is a clause that indicates you want to sort the result set by a particular
column.
name is the specified column.
Sometimes we want to sort things in a decreasing order. For example, if we want to select
all of the well-received movies, sorted from highest to lowest by their year:
SELECT * FROM movies WHERE imdb_rating > 8 ORDER BY year DESC;
DESC is a keyword used in ORDER BY to sort the results in descending order (high to low or Z-A).
ASC is a keyword used in ORDER BY to sort the results in ascending order (low to high or
A-Z).
The column that we ORDER BY doesn't even have to be one of the columns that we're
displaying.
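Both sort directions, and sorting by a column that is not selected, can be tried with a short sketch. It assumes Python's built-in sqlite3 module and three made-up rows.

```python
import sqlite3

# Hypothetical sample rows; not the lesson's actual movies table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [("Up", 2009), ("Jaws", 1975), ("Heat", 1995)],
)

# Ascending order is the default, so this sorts names from A to Z.
a_to_z = [row[0] for row in conn.execute(
    "SELECT name FROM movies ORDER BY name"
)]

# Only name is selected, yet the rows are ordered by year, newest first.
newest_first = [row[0] for row in conn.execute(
    "SELECT name FROM movies ORDER BY year DESC"
)]
print(a_to_z, newest_first)
```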
Suppose we want to retrieve the name and year columns of all the movies, ordered by their
name alphabetically.
Write a new query that retrieves the name, year, and imdb_rating columns of all the movies,
ordered highest to lowest by their ratings.
Hint
What are the columns that are selected and the table we are interested in?
SELECT name, year, imdb_rating FROM movies;
Next, let's sort them.
SELECT name, year, imdb_rating FROM movies ORDER BY imdb_rating DESC;
We added DESC here because we want to sort it in a descending order.
If you run this query, the result will start with movies with an IMDb rating of 9.0 all the way
down to 4.2.
Limit
We've been working with a fairly small table (fewer than 250 rows), but many real-world SQL
tables contain hundreds of thousands of records or more. In those situations, it becomes
important to cap the number of rows in the result.
For instance, imagine that we just want to see a few examples of records.
SELECT * FROM movies LIMIT 10;
LIMIT is a clause that lets you specify the maximum number of rows the result set will have.
This saves space on our screen and makes our queries run faster.
Here, we specify that the result set can't have more than 10 rows.
LIMIT always goes at the very end of the query. Also, it is not supported in all SQL
databases.
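LIMIT is often paired with ORDER BY to keep only the first few rows of a sorted result. The sketch below assumes Python's built-in sqlite3 module (SQLite does support LIMIT); the rows and ratings are made up.

```python
import sqlite3

# Hypothetical sample rows and ratings; not the lesson's actual data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT, imdb_rating REAL)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [("Up", 8.3), ("Jaws", 8.1), ("Heat", 8.2), ("Cars", 7.2)],
)

# Sort from highest to lowest rating, then keep at most two rows.
top_two = [row[0] for row in conn.execute(
    "SELECT name FROM movies ORDER BY imdb_rating DESC LIMIT 2"
)]
print(top_two)
```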
Instructions
1.
Combining your knowledge of LIMIT and ORDER BY, write a query that returns the top 3
highest rated movies.
Case
A CASE statement allows us to create different outputs (usually in the SELECT statement). It is
SQL's way of handling if-then logic.
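Suppose we want to condense the ratings in movies to three levels:
If the rating is above 8, then it is Fantastic.
If the rating is above 6, then it is Poorly Received.
Else, Avoid at All Costs.
SELECT name,
  CASE
    WHEN imdb_rating > 8 THEN 'Fantastic'
    WHEN imdb_rating > 6 THEN 'Poorly Received'
    ELSE 'Avoid at All Costs'
  END
FROM movies;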
Each WHEN tests a condition and the following THEN gives us the string if the condition
is true.
The ELSE gives us the string if all the above conditions are false.
The CASE statement must end with END.
In the result, you have to scroll right because the column name is very long. To shorten it,
we can rename the column to 'Review' using AS:
SELECT name,
  CASE
    WHEN imdb_rating > 8 THEN 'Fantastic'
    WHEN imdb_rating > 6 THEN 'Poorly Received'
    ELSE 'Avoid at All Costs'
  END AS 'Review'
FROM movies;
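The way CASE picks the first matching WHEN branch can be checked with a small sketch. It assumes Python's built-in sqlite3 module; the three titles and ratings are placeholders, and the alias is written as review without quotes.

```python
import sqlite3

# Hypothetical placeholder rows, one per rating level.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT, imdb_rating REAL)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [("Alpha", 8.5), ("Bravo", 7.0), ("Charlie", 4.0)],
)

# Branches are tested top to bottom; ELSE catches whatever is left.
reviews = conn.execute(
    "SELECT name, "
    "  CASE "
    "    WHEN imdb_rating > 8 THEN 'Fantastic' "
    "    WHEN imdb_rating > 6 THEN 'Poorly Received' "
    "    ELSE 'Avoid at All Costs' "
    "  END AS review "
    "FROM movies ORDER BY name"
).fetchall()
print(reviews)
```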
1.
Select the name column and use a CASE statement to create the second column that is:
'Chill' if genre = 'romance'
'Chill' if genre = 'comedy'
'Intense' in all other cases
SELECT name,
  CASE
    WHEN genre = 'romance' THEN 'Chill'
    WHEN genre = 'comedy' THEN 'Chill'
    ELSE 'Intense'
  END AS 'Mood'
FROM movies;
Review
Congratulations!
We just learned how to query data from a database using SQL. We also learned how to
filter queries to make the information more specific and useful.
Let's summarize:
SELECT is the clause we use every time we want to query information from a database.
AS renames a column or table.
DISTINCT returns unique values.
WHERE is a popular command that lets you filter the results of the query based on
conditions that you specify.
LIKE and BETWEEN are special operators.
AND and OR combine multiple conditions.
ORDER BY sorts the result.
LIMIT specifies the maximum number of rows that the query will return.
CASE creates different outputs.
Instructions
Feel free to experiment a bit more with the movies table before moving on!
AGGREGATE FUNCTIONS
Introduction
We've learned how to write queries to retrieve information from the database. Now, we
are going to learn how to perform calculations using SQL.
In this lesson, we have given you a table named fake_apps which is made up of fake mobile
applications data.
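Here is a quick preview of some important aggregates that we will cover in the next five
exercises:
COUNT(): count the number of rows
SUM(): the sum of the values in a column
MAX()/MIN(): the largest/smallest value
AVG(): the average of the values in a column
ROUND(): round the values in the column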
Before getting started, take a look at the data in the fake_apps table.
Count
The fastest way to calculate how many rows are in a table is to use the COUNT() function.
COUNT() is a function that takes the name of a column as an argument and counts the
number of non-empty values in that column.
SELECT COUNT(*) FROM table_name;
Here, we want to count every row, so we pass * as an argument inside the parentheses.
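The difference between counting every row and counting the non-empty values of one column can be sketched like this. The example assumes Python's built-in sqlite3 module; the app names and prices are made up and are not the lesson's fake_apps data.

```python
import sqlite3

# Hypothetical sample rows; one app has a missing (NULL) price.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fake_apps (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO fake_apps VALUES (?, ?)",
    [("App A", 0.0), ("App B", 1.99), ("App C", None), ("App D", 0.0)],
)

# COUNT(*) counts every row, including the one with a NULL price.
total_rows = conn.execute("SELECT COUNT(*) FROM fake_apps").fetchone()[0]

# COUNT(price) skips rows where price is NULL.
priced_rows = conn.execute("SELECT COUNT(price) FROM fake_apps").fetchone()[0]

# A WHERE clause narrows the rows before they are counted.
free_apps = conn.execute(
    "SELECT COUNT(*) FROM fake_apps WHERE price = 0"
).fetchone()[0]
print(total_rows, priced_rows, free_apps)
```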
1.
Let's count how many apps are in the table.
Common errors:
Missing parenthesis.
Missing ;.
2.
Add a WHERE clause in the previous query to count how many free apps are in the table.
Hint
Remember the WHERE clause?
Add it to the previous query, before the semicolon:
SELECT COUNT(*) FROM fake_apps WHERE price = 0;
WHERE indicates we want to only include rows where the following condition is true.
price = 0 is the condition.
Sum
SQL makes it easy to add all values in a particular column using SUM().
SUM() is a function that takes the name of a column as an argument and returns the sum of
all the values in that column.
What is the total number of downloads for all of the apps combined?
SELECT SUM(downloads) FROM fake_apps;
This adds all values in the downloads column.
1.
Let's find out the answer!
Max / Min
The MAX() and MIN() functions return the highest and lowest values in a column,
respectively.
How many downloads does the most popular app have?
SELECT MAX(downloads) FROM fake_apps;
The most popular app has 31,090 downloads!
MAX() takes the name of a column as an argument and returns the largest value in that
column. Here, we returned the largest value in the downloads column.
MIN() works the same way but it does the exact opposite; it returns the smallest value.
1.
What is the least number of times an app has been downloaded?
Average
SQL uses the AVG() function to quickly calculate the average value of a particular column.
The statement below returns the average number of downloads for an app in our
database:
SELECT AVG(downloads) FROM fake_apps;
The AVG() function works by taking a column name as an argument and returns the
average value for that column.
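SUM(), MAX(), MIN(), and AVG() can be compared side by side on a tiny table. This is a minimal sketch assuming Python's built-in sqlite3 module; the download counts are made up so that the arithmetic is easy to verify by hand.

```python
import sqlite3

# Hypothetical sample rows; not the lesson's actual fake_apps data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fake_apps (name TEXT, downloads INTEGER)")
conn.executemany(
    "INSERT INTO fake_apps VALUES (?, ?)",
    [("App A", 100), ("App B", 200), ("App C", 300), ("App D", 400)],
)

# Each aggregate collapses the whole downloads column into a single value.
total, largest, smallest, average = conn.execute(
    "SELECT SUM(downloads), MAX(downloads), MIN(downloads), AVG(downloads) "
    "FROM fake_apps"
).fetchone()
print(total, largest, smallest, average)
```

The sum is 100 + 200 + 300 + 400 = 1000, and the average is 1000 / 4 = 250.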
1.
Calculate the average number of downloads for all the apps in the table.
Write a new query that calculates the average price for all the apps in the table.
Hint
Which column should go inside the parentheses?
SELECT AVG(_____) FROM fake_apps;
The average price is $2.02365.
Round
By default, SQL tries to be as precise as possible without rounding. We can make the result
table easier to read using the ROUND() function.
ROUND() takes two arguments inside the parentheses:
1. a column name
2. an integer
It rounds the values in the column to the number of decimal places specified by the
integer.
SELECT ROUND(price, 0) FROM fake_apps;
Here, we pass the column price and integer 0 as arguments. SQL rounds the values in the
column to 0 decimal places in the output.
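Rounding a column and rounding an aggregate can both be tried in a short sketch. It assumes Python's built-in sqlite3 module and three made-up prices; the average of those prices is (0.99 + 2.49 + 3.75) / 3 = 2.41.

```python
import sqlite3

# Hypothetical sample prices; not the lesson's actual fake_apps data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fake_apps (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO fake_apps VALUES (?, ?)",
    [("App A", 0.99), ("App B", 2.49), ("App C", 3.75)],
)

# Round every price to 0 decimal places.
rounded_prices = [row[0] for row in conn.execute(
    "SELECT ROUND(price, 0) FROM fake_apps ORDER BY price"
)]

# Nest AVG() inside ROUND() to round the average to 2 decimal places.
average_price = conn.execute(
    "SELECT ROUND(AVG(price), 2) FROM fake_apps"
).fetchone()[0]
print(rounded_prices, average_price)
```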
1.
In the last exercise, we were able to get the average price of an app ($2.02365) using this
query:
SELECT AVG(price) FROM fake_apps;
Now, let's edit this query so that it rounds this result to 2 decimal places.
ROUND(AVG(price), 2)
Here, AVG(price) is the 1st argument and 2 is the 2nd argument because we want to round it to two
decimal places:
SELECT ROUND(AVG(price), 2) FROM fake_apps;
Group By I
Oftentimes, we will want to calculate an aggregate for data with certain characteristics.
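For instance, we might want to know the mean IMDb rating for all movies each year. We
could calculate each number by a series of queries with different WHERE statements, like so:
SELECT AVG(imdb_rating) FROM movies WHERE year = 1999;
and so on, with one query per year. GROUP BY avoids this repetition: it arranges rows with
identical values into groups so that an aggregate can be computed once per group in a single query.
The GROUP BY statement comes after any WHERE statements, but before ORDER BY or LIMIT.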
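The per-year averages described above can be computed in a single GROUP BY query. The sketch below assumes Python's built-in sqlite3 module; the rows and ratings are placeholders, not the lesson's movies data.

```python
import sqlite3

# Hypothetical placeholder rows: two movies from 1999 and one from 2000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT, year INTEGER, imdb_rating REAL)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [("Alpha", 1999, 8.0), ("Bravo", 1999, 7.0), ("Charlie", 2000, 6.0)],
)

# One output row per distinct year, with the average rating of that group.
yearly_average = conn.execute(
    "SELECT year, AVG(imdb_rating) FROM movies GROUP BY year ORDER BY year"
).fetchall()
print(yearly_average)
```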
1.
In the code editor, type:
SELECT price, COUNT(*) FROM fake_apps GROUP BY price;
Here, our aggregate function is COUNT() and we arranged price into groups.
What do you expect the result to be?
The result contains the total number of apps for each price.
It is organized into two columns, making it very easy to see the number of apps at each price.
2.
In the previous query, add a WHERE clause to count the total number of apps that have been
downloaded more than 20,000 times, at each price.
Hint
Remember, WHERE statement goes before the GROUP BY statement:
SELECT price, COUNT(*) FROM fake_apps WHERE downloads > 20000 GROUP BY price;
3.
Remove the previous query.
Write a new query that calculates the total number of downloads for each category.
Group By II
Sometimes, we want to GROUP BY a calculation done on a column.
For instance, we might want to know how many movies have IMDb ratings that round to 1,
2, 3, 4, 5. We could do this using the following syntax:
SELECT ROUND(imdb_rating), COUNT(name) FROM movies GROUP BY ROUND(imdb_rating) ORDER BY
ROUND(imdb_rating);
However, this query may be time-consuming to write and more prone to error.
SQL lets us use column reference(s) in our GROUP BY that will make our lives easier.
1 is the first column selected
2 is the second column selected
3 is the third column selected
and so on.
The following query is equivalent to the one above:
SELECT ROUND(imdb_rating), COUNT(name) FROM movies GROUP BY 1 ORDER BY 1;
Here, the 1 refers to the first column in our SELECT statement, ROUND(imdb_rating).
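That the two forms are equivalent can be checked by running both against the same rows. This is a minimal sketch assuming Python's built-in sqlite3 module and four made-up ratings.

```python
import sqlite3

# Hypothetical placeholder rows; three ratings round to 8 and one to 6.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT, imdb_rating REAL)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [("Alpha", 7.6), ("Bravo", 8.4), ("Charlie", 8.1), ("Delta", 6.2)],
)

# Grouping by the full expression.
by_expression = conn.execute(
    "SELECT ROUND(imdb_rating), COUNT(name) FROM movies "
    "GROUP BY ROUND(imdb_rating) ORDER BY ROUND(imdb_rating)"
).fetchall()

# Grouping by column reference: 1 is the first selected column.
by_reference = conn.execute(
    "SELECT ROUND(imdb_rating), COUNT(name) FROM movies GROUP BY 1 ORDER BY 1"
).fetchall()
print(by_expression, by_reference)
```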
1.
Suppose we have the query below:
SELECT category, price, AVG(downloads) FROM fake_apps GROUP BY category, price;
Write the exact query, but use column reference numbers instead of column names
after GROUP BY.
1 refers to category.
2 refers to price.
3 refers to AVG(downloads)
Having
In addition to being able to group data using GROUP BY, SQL also allows you to filter which
groups to include and which to exclude.
For instance, imagine that we want to see how many movies of different genres were
produced each year, but we only care about years and genres with at least 10 movies.
We can't use WHERE here because we don't want to filter the rows; we want to filter groups.
HAVING is very similar to WHERE. In fact, all types of WHERE clauses you learned about thus far
can be used with HAVING.
The HAVING clause always comes after GROUP BY, but before ORDER BY and LIMIT.
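The year-and-genre scenario above might be written as in the sketch below. It assumes Python's built-in sqlite3 module and six placeholder rows; because the sample is tiny, the threshold is lowered from the 10 movies mentioned in the text to 2.

```python
import sqlite3

# Hypothetical placeholder rows; not the lesson's actual movies table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT, year INTEGER, genre TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [("A", 1999, "action"), ("B", 1999, "action"), ("C", 1999, "comedy"),
     ("D", 2000, "action"), ("E", 2000, "action"), ("F", 2000, "action")],
)

# HAVING filters whole groups after GROUP BY has formed them.
# The threshold 2 is an assumption that stands in for the text's 10.
big_groups = conn.execute(
    "SELECT year, genre, COUNT(name) FROM movies "
    "GROUP BY 1, 2 HAVING COUNT(name) >= 2 ORDER BY 1, 2"
).fetchall()
print(big_groups)
```

The (1999, 'comedy') group holds a single movie, so HAVING drops it.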
1.
Suppose we have the query below:
SELECT price, ROUND(AVG(downloads)), COUNT(*) FROM fake_apps GROUP BY price;
It returns the average downloads (rounded) and the number of apps at each price point.
However, certain price points don't have very many apps, so their average downloads are
less meaningful.
Add a HAVING clause to restrict the query to price points that have more than 10 apps.
The total number of apps at each price point would be given by COUNT(*).
SELECT price, ROUND(AVG(downloads)), COUNT(*) FROM fake_apps GROUP BY price HAVING COUNT(*) > 10;
COUNT(*) > 10 is the condition.
Because the condition has an aggregate function in it, we have to use HAVING instead of WHERE.
Review
Congratulations!
You just learned how to use aggregate functions to perform calculations on your data.
What can we generalize so far?
COUNT(): count the number of rows
SUM(): the sum of the values in a column
MAX()/MIN(): the largest/smallest value
AVG(): the average of the values in a column
ROUND(): round the values in the column
MULTIPLE TABLES
Introduction
In order to efficiently store data, we often spread related information across multiple
tables.
For instance, imagine that we're running a magazine company where users can have
different types of subscriptions to different products. Different subscriptions might have
many different properties. Each customer would also have lots of associated information.
We could have one table with all of the following information:
order_id
customer_id
customer_name
customer_address
subscription_id
subscription_description
subscription_monthly_price
subscription_length
purchase_date
However, a lot of this information would be repeated. If the same customer has multiple
subscriptions, that customer's name and address will be reported multiple times. If the
same subscription type is ordered by multiple customers, then the subscription price and
subscription description will be repeated. This will make our table big and unmanageable.
Instead, we can split our data into several tables.
orders would contain just the information necessary to describe what was ordered:
order_id
customer_id
subscription_id
purchase_date
subscriptions would contain the information to describe each type of subscription:
subscription_id
description
price_per_month
subscription_length
orders
order_id  customer_id  subscription_id  purchase_date
1         2            3                2017-01-01
2         2            2                2017-01-01
3         3            1                2017-01-01
subscriptions
(a table that describes each type of subscription)
subscription_id  description        price_per_month  subscription_length
1                Politics Magazine  5                12 months
2                Fashion Magazine   10               6 months
3                Sports Magazine    7                3 months
customers
(a table with customer names and contact information)
If we just look at the orders table, we can't really tell what's happened in each order.
However, if we refer to the other tables, we can get a complete picture.
Let's examine the order with an order_id of 2. It was purchased by the customer with
a customer_id of 2.
To find out the customer's name, we look at the customers table and look for the item with
a customer_id value of 2. We can see that Customer 2's name is 'Jane Doe' and that she lives
at '456 Park Ave'.
The subscription with a subscription_id of 3 is in the third row. Its description is Sports
Magazine.
SELECT *
FROM orders
JOIN customers
  ON orders.customer_id = customers.customer_id;
Let's break down this command:
1. The first line selects all columns from our combined table. If we only want to select
certain columns, we can specify which ones we want.
2. The second line specifies the first table that we want to look in, orders.
3. The third line uses JOIN to say that we want to combine information
from orders with customers.
4. The fourth line tells us how to combine the two tables. We want to
match the orders table's customer_id column with the customers table's customer_id column.
Because column names are often repeated across multiple tables, we use the
syntax table_name.column_name to be sure that our requests for columns are unambiguous. In
our example, we use this syntax in the ON statement, but we will also use it in the SELECT or
any other statement where we refer to column names.
For example: Instead of selecting all the columns using *, if we only wanted to
select the orders table's order_id column and the customers table's customer_name column, we could use
the following query:
SELECT orders.order_id, customers.customer_name FROM orders JOIN customers ON orders.customer_id =
customers.customer_id;
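The join above can be run end to end with a small sketch. It assumes Python's built-in sqlite3 module; the orders rows and 'Jane Doe' at '456 Park Ave' follow the text, while the other two customers are hypothetical fill-ins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, "
    "subscription_id INTEGER, purchase_date TEXT)"
)
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER, customer_name TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, 2, 3, "2017-01-01"), (2, 2, 2, "2017-01-01"), (3, 3, 1, "2017-01-01")],
)
# Customers 1 and 3 are hypothetical; only customer 2 is named in the text.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "John Smith", "123 Main St"), (2, "Jane Doe", "456 Park Ave"),
     (3, "Joe Schmo", "789 Broadway")],
)

# Each order is paired with the customer row that has the same customer_id.
order_names = conn.execute(
    "SELECT orders.order_id, customers.customer_name "
    "FROM orders JOIN customers ON orders.customer_id = customers.customer_id "
    "ORDER BY orders.order_id"
).fetchall()
print(order_names)
```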
Instructions
1.
Join the orders table and the subscriptions table and select all columns.
Suppose we do:
SELECT * FROM orders LIMIT 10; SELECT * FROM subscriptions LIMIT 10;
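You will notice that both the orders table and the subscriptions table have a subscription_id column,
so we join on:
orders.subscription_id
subscriptions.subscription_id
Answer:
SELECT * FROM orders JOIN subscriptions ON orders.subscription_id = subscriptions.subscription_id;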
Inner Joins
Let's revisit how we joined orders and customers. For every possible value
of customer_id in orders, there was a corresponding row of customers with the
same customer_id. What if that wasn't true?
For instance, imagine that our customers table was out of date, and was missing any
information on customer 11. If that customer had an order in orders, what would happen
when we joined the tables?
When we perform a simple JOIN (often called an inner join) our result only includes rows
that match our ON condition.
Consider the following animation, which illustrates an inner join of two tables on table1.c2
= table2.c2:
The first and last rows have matching values of c2. The middle rows do not match. The final
result has all values from the first and last rows but does not include the non-matching
middle row.
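The behaviour described above can be reproduced with two three-row tables. This is a minimal sketch assuming Python's built-in sqlite3 module; the column values are made up so that the first and last rows match on c2 and the middle rows do not.

```python
import sqlite3

# Hypothetical rows: c2 matches for the first and last rows only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (c1 TEXT, c2 INTEGER)")
conn.execute("CREATE TABLE table2 (c2 INTEGER, c3 TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [("a", 1), ("b", 2), ("c", 3)])
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [(1, "x"), (4, "y"), (3, "z")])

# An inner join keeps only the pairs of rows that satisfy the ON condition.
inner_rows = conn.execute(
    "SELECT table1.c1, table1.c2, table2.c3 "
    "FROM table1 JOIN table2 ON table1.c2 = table2.c2 "
    "ORDER BY table1.c1"
).fetchall()
print(inner_rows)
```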
Instructions
1.
Suppose we are working for The Codecademy Times, a newspaper with two types of
subscriptions:
print newspaper
online articles
Some users subscribe to just the newspaper, some subscribe to just the online edition, and
some subscribe to both.
There is a newspaper table that contains information about the newspaper subscribers.
Count the number of subscribers who get a print newspaper using COUNT().
There is also an online table that contains information about the online subscribers.
Count the number of subscribers who get an online newspaper using COUNT().
Suppose we do:
SELECT * FROM newspaper LIMIT 10; SELECT * FROM online LIMIT 10;
You will notice that both newspaper table and online table have an id column.
newspaper.id
online.id
ON newspaper.id = online.id
Remember to use SELECT COUNT(*) to count the rows. The three counts are:
COUNT(*) for newspaper: 60
COUNT(*) for online: 65
COUNT(*) for the join of newspaper and online: 50
Left Joins
What if we want to combine two tables and keep some of the un-matched rows?
SQL lets us do this through a command called LEFT JOIN. A left join will keep all rows from
the first table, regardless of whether there is a matching row in the second table.
The first and last rows have matching values of c2. The middle rows do not match. The final
result will keep all rows of the first table but will omit the un-matched row from the
second table.
This animation represents a table operation produced by the following command:
SELECT * FROM table1 LEFT JOIN table2 ON table1.c2 = table2.c2;
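A left join, and the IS NULL filter that isolates the unmatched rows, can be sketched as follows. The example assumes Python's built-in sqlite3 module and made-up rows in which only the middle row of table1 has no partner in table2.

```python
import sqlite3

# Hypothetical rows: c2 matches for the first and last rows only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (c1 TEXT, c2 INTEGER)")
conn.execute("CREATE TABLE table2 (c2 INTEGER, c3 TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [("a", 1), ("b", 2), ("c", 3)])
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [(1, "x"), (4, "y"), (3, "z")])

# Every row of table1 is kept; missing table2 columns come back as NULL (None).
left_rows = conn.execute(
    "SELECT table1.c1, table2.c3 "
    "FROM table1 LEFT JOIN table2 ON table1.c2 = table2.c2 "
    "ORDER BY table1.c1"
).fetchall()

# Filtering on IS NULL keeps only the table1 rows that had no match.
unmatched = [row[0] for row in conn.execute(
    "SELECT table1.c1 "
    "FROM table1 LEFT JOIN table2 ON table1.c2 = table2.c2 "
    "WHERE table2.c2 IS NULL"
)]
print(left_rows, unmatched)
```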
1.
Let's return to our newspaper and online subscribers.
Suppose we want to know how many users subscribe to the print newspaper, but not to
the online edition.
Start by performing a left join of the newspaper table and the online table on their id columns and
selecting all columns.
In order to find which users do not subscribe to the online edition, we need to add
a WHERE clause.
Add a second query after your first one that adds the following WHERE clause and condition:
WHERE online.id IS NULL
This will select rows where there was no corresponding row from the online table.
Hint
Don't remove your previous query.
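Primary Key vs Foreign Key
Let's return to our example of the magazine subscriptions. Each of these tables has a column
that uniquely identifies each row of that table:
order_id for orders
subscription_id for subscriptions
customer_id for customers
These special columns are called primary keys.
Let's reexamine the orders table:
order_id  customer_id  subscription_id  purchase_date
1         2            3                2017-01-01
2         2            2                2017-01-01
3         3            1                2017-01-01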
Note that customer_id (the primary key for customers) and subscription_id (the primary key
for subscriptions) both appear in this table.
When the primary key for one table appears in a different table, it is called a foreign key.
So customer_id is a primary key when it appears in customers, but a foreign key when it
appears in orders.
In this example, our primary keys all had somewhat descriptive names. Often, the
primary key is simply called id, while foreign keys have more descriptive names.
Why is this important? The most common types of joins will be joining a foreign key from
one table with the primary key from another table. For instance, when we
join orders and customers, we join on customer_id, which is a foreign key in orders and the
primary key in customers.
1.
The classes table contains information on the classes that the school offers. Its
primary key is id.
The students table contains information on all students in the school. Its primary key
is id. It contains the foreign key class_id, which corresponds to the primary key
of classes.
Perform an inner join of classes and students using the primary and foreign keys described
above, and select all the columns.
Your ON statement should include:
classes.id = students.class_id
Cross Join
So far, we've focused on matching rows that have some information in common.
Sometimes, we just want to combine all rows of one table with all rows of another table.
For instance, if we had a table of shirts and a table of pants, we might want to know all the
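possible combinations to create different outfits.
Our code might look like this:
SELECT shirts.shirt_color,
   pants.pants_color
FROM shirts
CROSS JOIN pants;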
The first two lines select the columns shirt_color and pants_color.
The third line pulls data from the table shirts.
The fourth line performs a CROSS JOIN with pants.
Notice that cross joins don't require an ON statement. You're not really joining on any
columns!
If we have 3 different shirts (white, grey, and olive) and 2 different pants (light denim and
black), the results might look like this:
shirt_color  pants_color
white        light denim
white        black
grey         light denim
grey         black
olive        light denim
olive        black
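The shirts-and-pants combination can be reproduced with a short sketch. It assumes Python's built-in sqlite3 module and single-column shirts and pants tables built from the colors named in the text; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Table layouts are an assumption: one color column per table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shirts (shirt_color TEXT)")
conn.execute("CREATE TABLE pants (pants_color TEXT)")
conn.executemany("INSERT INTO shirts VALUES (?)", [("white",), ("grey",), ("olive",)])
conn.executemany("INSERT INTO pants VALUES (?)", [("light denim",), ("black",)])

# CROSS JOIN pairs every shirt with every pair of pants; no ON clause is needed.
outfits = conn.execute(
    "SELECT shirts.shirt_color, pants.pants_color "
    "FROM shirts CROSS JOIN pants "
    "ORDER BY 1, 2"
).fetchall()
print(len(outfits), outfits)
```

With 3 shirts and 2 pants, the result has 3 x 2 = 6 rows.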
This clothing example is fun, but it's not very practically useful.
A more common usage of CROSS JOIN is when we need to compare each row of a table to a
list of values.
Let's return to our newspaper subscriptions. This table contains two columns that we haven't
discussed yet:
start_month: the first month where the customer subscribed to the print newspaper
(i.e., 2 for February)
end_month: the final month where the customer subscribed to the print newspaper
Suppose we wanted to know how many users were subscribed during each month of the
year. For each month (1, 2, 3) we would need to know if a user was subscribed. Follow the
steps below to see how we can use a CROSS JOIN to solve this problem.
1.
Eventually, we'll use a cross join to help us, but first, let's try a simpler problem.
Let's start by counting the number of customers who were subscribed to
the newspaper during March.
Use COUNT(*) to count the number of rows and a WHERE clause to restrict to two conditions:
start_month <= 3
end_month >= 3
"During March" means that the customer's starting month was in or before March and
final month was in or after March:
SELECT COUNT(*) FROM newspaper WHERE start_month <= 3 AND end_month >= 3;
Answer: 13
2.
Don't remove the previous query.
The previous query lets us investigate one month at a time. In order to check across all
months, we're going to need to use a cross join.
Our database contains another table called months which contains the numbers between 1
and 12.
Select all columns from the cross join of newspaper and months.
Hint
SELECT * FROM newspaper CROSS JOIN months;
When you get the result, make sure to scroll right to take a look at the rightmost
column, month.
Create a third query where you add a WHERE statement to your cross join to restrict to two
conditions:
start_month <= month
end_month >= month
Create a final query where you aggregate over each month to count the number of
subscribers.
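The four queries from this exercise, in order:
SELECT COUNT(*)
FROM newspaper
WHERE start_month <= 3 AND end_month >= 3;
SELECT *
FROM newspaper
CROSS JOIN months;
SELECT *
FROM newspaper
CROSS JOIN months
WHERE start_month <= month AND end_month >= month;
SELECT month, COUNT(*)
FROM newspaper
CROSS JOIN months
WHERE start_month <= month AND end_month >= month
GROUP BY month;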
Union
Sometimes we just want to stack one dataset on top of the other. Well, the UNION operator
allows us to do that.
Suppose we have two tables and they have the same columns.
table1:
pokemon type
Bulbasaur Grass
Charmander Fire
Squirtle Water
table2:
pokemon type
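Snorlax Normal
If we combine these two with UNION:
SELECT * FROM table1 UNION SELECT * FROM table2;
The result would be:
pokemon type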
Bulbasaur Grass
Charmander Fire
Squirtle Water
Snorlax Normal
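SQL has strict rules for appending data:
Tables must have the same number of columns.
The columns must have the same data types in the same order as the first table.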
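The pokemon example can be stacked with UNION in a small sketch. It assumes Python's built-in sqlite3 module; an ORDER BY is added because the row order of a UNION result is not otherwise guaranteed.

```python
import sqlite3

# The two tables from the example, with identical column layouts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (pokemon TEXT, type TEXT)")
conn.execute("CREATE TABLE table2 (pokemon TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("Bulbasaur", "Grass"), ("Charmander", "Fire"), ("Squirtle", "Water")],
)
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [("Snorlax", "Normal")])

# UNION appends the rows of the second query to those of the first.
stacked = conn.execute(
    "SELECT * FROM table1 UNION SELECT * FROM table2 ORDER BY pokemon"
).fetchall()
print(stacked)
```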
1.
Let's return to our newspaper and online subscriptions. We'd like to create one big table with
both sets of data.
Use UNION to stack the newspaper table on top of the online table.
SELECT *
FROM newspaper
UNION
SELECT *
FROM online;
With
Oftentimes, we want to combine two tables, but one of the tables is the result of another
calculation.
Let's return to our magazine order example. Our marketing department might want to
know a bit more about our customers. For instance, they might want to know how many
magazines each customer subscribes to. We can easily calculate this using our orders table:
SELECT customer_id, COUNT(subscription_id) AS 'subscriptions' FROM orders GROUP BY customer_id;
This query is good, but a customer_id isn't terribly useful for our marketing department; they
probably want to know the customer's name.
We want to be able to join the results of this query with our customers table, which will tell
us the name of each customer. We can do this by using a WITH clause.
WITH previous_results AS (
   SELECT ...
   ...
   ...
   ...
)
SELECT *
FROM previous_results
JOIN customers
  ON _____ = _____;
The WITH statement allows us to perform a separate query (such as aggregating
a customer's subscriptions).
previous_results is the alias that we will use to reference any columns from the query
inside of the WITH clause.
We can then go on to do whatever we want with this temporary table (such as joining
the temporary table with another table).
Essentially, we are putting a whole first query inside the parentheses () and giving it a
name. After that, we can use this name as if it's a table and write a new query using the
first query.
1.
Place the whole query below into a WITH statement, inside parentheses (), and give it the
name previous_query:
SELECT customer_id, COUNT(subscription_id) AS 'subscriptions' FROM orders GROUP BY customer_id
Join the temporary table previous_query with the customers table and select the following
columns:
customers.customer_name
previous_query.subscriptions
ON previous_query.customer_id = customers.customer_id
And for review, AS is how you give something an alias.
Here, we are essentially giving everything inside the parentheses (the sub-query) the name
of previous_query using AS.
Then, previous_query is used as a temporary table that we will query from and also join with
the customers table:
WITH previous_query AS
( SELECT customer_id, COUNT(subscription_id) AS 'subscriptions'
FROM orders
GROUP BY customer_id )
SELECT customers.customer_name, previous_query.subscriptions
FROM previous_query
JOIN customers
ON previous_query.customer_id = customers.customer_id;
Do not include ; inside of the () of your WITH statement.
Review
In this lesson, we learned about relationships between tables in relational databases and
how to query information from multiple tables using SQL.
JOIN will combine rows from different tables if the join condition is true.
LEFT JOIN will return every row in the left table, and if the join condition is not
met, NULL values are used to fill in the columns from the right table.
Primary key is a column that serves as a unique identifier for the rows in the table.
Foreign key is a column that contains the primary key of another table.
CROSS JOIN lets us combine all rows of one table with all rows of another table.
UNION stacks one dataset on top of another.
WITH allows us to define one or more temporary tables that can be used in the final
query.
Instructions
Feel free to experiment a bit more with the orders, subscriptions, and customers tables before
moving on!