Data Query Language
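The SELECT statement is used to access and retrieve data from a database. In its
simplest form, the syntax of the SELECT statement is:
SELECT [ALL | DISTINCT] select_column_list
FROM table_name
[WHERE search_condition]
where the asterisk (*), when used as the select_column_list, displays all the
columns of the table. ALL, the default, returns every row, including duplicates.
DISTINCT specifies that only the unique rows should appear in the result set.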
For example, the Employee table is stored in the HumanResources schema of the
AdventureWorks database. To display all the details of the employees, you can use
the following query:
USE AdventureWorks
GO
SELECT * FROM HumanResources.Employee
GO
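Unless an ORDER BY clause is specified, the order of the rows in the result set is
not guaranteed, although the rows are often returned in the same order as they are
stored in the source table.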
If you need to retrieve specific columns from a table, you can specify the column
names in the SELECT statement.
For example, to view specific details such as EmployeeID, ContactID, LoginID, and
Title of the employees of AdventureWorks, you can specify the column names in
the SELECT statement, as shown in the following query:
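A minimal sketch of such a query, assuming the AdventureWorks sample database is
installed with the Employee columns named above:

```sql
-- Assumes AdventureWorks with the HumanResources.Employee columns named in the text.
SELECT EmployeeID, ContactID, LoginID, Title
FROM HumanResources.Employee;
```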
In the preceding output, the result set shows the column names in the way they
are present in the table definition. You can customize these column names, if
required, by assigning user-defined headings to the columns in the SELECT list.
You can write a query that assigns user-defined headings in any of the following
ways:
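The three forms can be sketched as follows. The headings Department Number and
Department Name are illustrative choices, and the Department table of
AdventureWorks is assumed:

```sql
-- Assumes HumanResources.Department; the headings are illustrative.
-- Form 1: 'heading' = column
SELECT 'Department Number' = DepartmentID, 'Department Name' = Name
FROM HumanResources.Department;

-- Form 2: column 'heading'
SELECT DepartmentID 'Department Number', Name 'Department Name'
FROM HumanResources.Department;

-- Form 3: column AS 'heading'
SELECT DepartmentID AS 'Department Number', Name AS 'Department Name'
FROM HumanResources.Department;
```

All three statements return the same rows; only the way the heading is written
differs.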
In the preceding figure, the columns are displayed with user-defined headings, but
the original column names in the database table remain unchanged. Similarly, you
might need to make results more explanatory. In such a case, you can add more
text to the values displayed in the columns by using literals. Literals are string
values that are enclosed in single quotes and added to the SELECT statement. The
literal values are printed in a separate column as they are written in the SELECT
list. Literals are therefore used for display purposes only. The following SQL
query retrieves the employee IDs and titles from the Employee table along with the
literal ‘Designation:’:
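A minimal sketch of this query, assuming the AdventureWorks Employee table:

```sql
-- Assumes HumanResources.Employee; the literal appears as an unnamed column.
SELECT EmployeeID, 'Designation:', Title
FROM HumanResources.Employee;
```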
A literal is created by enclosing the string in single quotes and placing it in
the SELECT list. The following figure shows the output of the
preceding query.
In the preceding figure, the result set displays a virtual column with Designation as
a value in each row. This column does not physically exist in the database table.
As a database developer, you have to manage the requirements of various users,
who might want to view results in different ways. You may need to display the
values of multiple columns in a single column and also add a description to the
column value. In such a case, you can use the concatenation operator. The
concatenation operator is used to concatenate string expressions. It is
represented by the + sign.
The following SQL query concatenates the data of the Name and GroupName
columns of the Department table into a single column:
SELECT Name + ' department comes under ' + GroupName + ' group'
AS Department FROM HumanResources.Department
In the preceding query, the literals ‘department comes under’ and ‘group’ are
concatenated with the column values to make the output more readable. The
following figure shows the output of the preceding query.
Calculating Column Values
Sometimes, you might also need to show calculated values for the columns. For
example, the Orders table stores the order details such as OrderID, ProductID,
OrderDate, UnitPrice, and Units. To find the total amount of an order, you need to
multiply the UnitPrice of the product by the Units. In such cases, you can apply
arithmetic operators. Arithmetic operators are used to perform mathematical
operations, such as addition, subtraction, division, and multiplication, on numeric
columns or on numeric constants. SQL Server supports the following arithmetic
operators:
▪ + (for addition)
▪ - (for subtraction)
▪ / (for division)
▪ * (for multiplication)
▪ % (for modulo; the modulo operator returns the integer remainder left after
dividing one integer by another)
All arithmetic operators can be used in the SELECT statement with column names
and numeric constants in any combination. When multiple arithmetic operators are
used in a single query, the operations are processed according to the precedence of
the arithmetic operators: multiplication (*), division (/), and modulo (%) are
evaluated before addition (+) and subtraction (-).
When an arithmetic expression uses operators at the same level of precedence, the
order of execution is from left to right. Parentheses can be used to override this
order. For example, the EmployeePayHistory table in the
HumanResources schema contains the hourly rate of the employees. The following
SQL query retrieves the per day rate of the employees from the
EmployeePayHistory table:
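A minimal sketch of this calculation. The 8-hour working day is an assumption
made for illustration, as is the heading Per Day Rate:

```sql
-- Assumes HumanResources.EmployeePayHistory; 8 hours per day is an assumed figure.
SELECT EmployeeID, Rate, 'Per Day Rate' = 8 * Rate
FROM HumanResources.EmployeePayHistory;
```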
Consider another example, where a teacher wants to view the names and the scores
of the students who scored more than 80%.
Therefore, the query must select the names and the scores from the table with a
condition added to the score column.
To retrieve selected rows based on a specific condition, you need to use the
WHERE clause in the SELECT statement. Using the WHERE clause selects the
rows that satisfy the condition.
The following SQL query retrieves the department details from the Department
table, where the group name is Research and Development:
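A minimal sketch of this query, assuming the AdventureWorks Department table:

```sql
-- Assumes HumanResources.Department with a GroupName column.
SELECT *
FROM HumanResources.Department
WHERE GroupName = 'Research and Development';
```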
In the preceding figure, the rows containing the Research and Development group
name are displayed.
In the WHERE clause, you can use a comparison operator to specify a condition.
The syntax for using comparison operators in the SELECT statement is:
SELECT column_list FROM table_name
WHERE expression1 comparison_operator expression2
The following SQL query retrieves records from the Employee table where the
vacation hours are less than 5:
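A minimal sketch of this query. The columns in the SELECT list are chosen for
illustration:

```sql
-- Assumes HumanResources.Employee; the selected columns are illustrative.
SELECT EmployeeID, Title, VacationHours
FROM HumanResources.Employee
WHERE VacationHours < 5;
```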
The preceding query retrieves all the rows that satisfy the specified condition by
using the comparison operator, as shown in the following figure.
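SQL Server supports the following comparison operators: = (equal to), > (greater
than), < (less than), >= (greater than or equal to), <= (less than or equal to),
<> or != (not equal to), !< (not less than), and !> (not greater than).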
❑ OR: Returns a true value when at least one condition is satisfied. For example,
the following SQL query retrieves records from the Department table when the
GroupName is either Manufacturing or Quality Assurance:
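A minimal sketch of a query that uses the OR operator, assuming the AdventureWorks
Department table:

```sql
-- Assumes HumanResources.Department; a row qualifies if either condition is true.
SELECT *
FROM HumanResources.Department
WHERE GroupName = 'Manufacturing' OR GroupName = 'Quality Assurance';
```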
❑ NOT: Reverses the result of the search condition. For example, the following
SQL query retrieves records from the Department table when the
GroupName is not Quality Assurance:
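A minimal sketch of a query that uses the NOT operator, assuming the AdventureWorks
Department table:

```sql
-- Assumes HumanResources.Department; NOT negates the condition that follows it.
SELECT *
FROM HumanResources.Department
WHERE NOT GroupName = 'Quality Assurance';
```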
The preceding query retrieves all the rows except the rows that match the
condition specified after the NOT conditional expression.
Range operators retrieve data based on a range. The syntax for using range
operators in the SELECT statement is:
SELECT column_list FROM table_name
WHERE expression1 range_operator expression2 AND expression3
where range_operator is one of the following:
❑ BETWEEN: Specifies an inclusive range to search. The following SQL query
retrieves records from the Employee table where the number of vacation hours
available to the employee is between 20 and 50:
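A minimal sketch of a BETWEEN query. The columns in the SELECT list are chosen
for illustration:

```sql
-- Assumes HumanResources.Employee; BETWEEN includes both 20 and 50.
SELECT EmployeeID, VacationHours
FROM HumanResources.Employee
WHERE VacationHours BETWEEN 20 AND 50;
```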
❑ NOT BETWEEN: Excludes the specified range from the result set. The following
SQL query retrieves records from the Employee table where the number of vacation
hours available to the employee is not between 40 and 50:
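A minimal sketch of a NOT BETWEEN query. The columns in the SELECT list are chosen
for illustration:

```sql
-- Assumes HumanResources.Employee; rows with 40 through 50 hours are excluded.
SELECT EmployeeID, VacationHours
FROM HumanResources.Employee
WHERE VacationHours NOT BETWEEN 40 AND 50;
```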
The IN keyword selects the values that match any one of the values given in a list.
The following SQL query retrieves the records of employees who are Recruiter,
Stocker, or Buyer from the Employee table:
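A minimal sketch of an IN query. The columns in the SELECT list are chosen for
illustration:

```sql
-- Assumes HumanResources.Employee with a Title column.
SELECT EmployeeID, Title, LoginID
FROM HumanResources.Employee
WHERE Title IN ('Recruiter', 'Stocker', 'Buyer');
```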
Conversely, the NOT IN keyword excludes the values that match any one of the
values in a list. The following SQL query retrieves the records of
employees whose designation is not Recruiter, Stocker, or Buyer:
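A minimal sketch of a NOT IN query. The columns in the SELECT list are chosen for
illustration:

```sql
-- Assumes HumanResources.Employee with a Title column.
SELECT EmployeeID, Title, LoginID
FROM HumanResources.Employee
WHERE Title NOT IN ('Recruiter', 'Stocker', 'Buyer');
```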
For example, you are asked to create a report that displays the names of all the
products of AdventureWorks beginning with the letter P. You can do this by using
the LIKE keyword. The LIKE keyword is used to search a string by using wildcards.
Wildcards are special characters, such as ‘%’ and ‘_’, that are used to
match patterns.
The LIKE keyword matches the given character string with the specified pattern.
The pattern can include a combination of wildcard characters and regular
characters. While performing a pattern match, regular characters must exactly
match the characters specified in the character string. However, wildcard
characters are matched with arbitrary fragments of the character string.
For example, if you want to retrieve records from the Department table where the
values of Name column begin with ‘Pro’, you need to use the ‘%’ wildcard character,
as shown in the following query:
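A minimal sketch of this query, assuming the AdventureWorks Department table:

```sql
-- Assumes HumanResources.Department; % matches zero or more characters.
SELECT *
FROM HumanResources.Department
WHERE Name LIKE 'Pro%';
```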
Consider another example, where you want to retrieve the rows from the
Department table in which the department name is five characters long and begins
with ‘Sale’, whereas the fifth character can be anything. For this, you need to use
the ‘_’ wildcard character, as shown in the following query:
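A minimal sketch of this query, assuming the AdventureWorks Department table:

```sql
-- Assumes HumanResources.Department; _ matches exactly one character.
SELECT *
FROM HumanResources.Department
WHERE Name LIKE 'Sale_';
```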
SQL Server supports the following wildcard characters with the LIKE keyword:
▪ % (any string of zero or more characters)
▪ _ (any single character)
▪ [ ] (any single character within the specified range or set, such as [a-f] or
[abcdef])
▪ [^] (any single character not within the specified range or set)
The wildcard characters can be combined into a single expression with the LIKE
keyword. The wildcard characters themselves can be searched for by using the LIKE
keyword and putting them into square brackets ([]). For example, the pattern
‘5[%]’ matches the string 5%.
Retrieving Records That Contain NULL Values
A NULL value in a column implies that the data value for the column is not available.
You might be required to find records that contain NULL values or records that do
not contain NULL values in a particular column. In such a case, you can use the
unknown_value_operator, which is either IS NULL or IS NOT NULL, in your queries.
The syntax for using the unknown_value_operator in the SELECT statement is:
SELECT column_list FROM table_name
WHERE column_name unknown_value_operator
The following SQL query retrieves only those rows from the
EmployeeDepartmentHistory table for which the value in the EndDate column is
NULL:
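A minimal sketch of this query. The columns in the SELECT list are chosen for
illustration:

```sql
-- Assumes HumanResources.EmployeeDepartmentHistory with a nullable EndDate column.
SELECT EmployeeID, DepartmentID, EndDate
FROM HumanResources.EmployeeDepartmentHistory
WHERE EndDate IS NULL;
```

Note that a comparison such as EndDate = NULL does not find these rows under the
default ANSI_NULLS setting, which is why the IS NULL operator is needed.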
At times, you might need to handle the null values in a table quite differently. For
example, the contact details of the employees are stored in the following Contact
table.
The contact details contain the residential, office, and mobile number of an
employee. If the employee does not have any of the contact numbers, it is
substituted with a null value. Now, you want to display a result set by substituting
all the null values with zero. To perform this task, you can use the ISNULL()
function. The ISNULL() function replaces the null values with the specified
replacement value.
For example, the following SQL query replaces the null values with zero in the
query output:
As another example, the following SQL query replaces the null values with
zero in the SalesQuota column in the query output:
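A minimal sketch of the ISNULL() function applied to SalesQuota. The text does not
name the table, so Sales.SalesPerson and its SalesPersonID column are assumptions
based on the AdventureWorks sample database:

```sql
-- Sales.SalesPerson and SalesPersonID are assumed; only SalesQuota is named in the text.
SELECT SalesPersonID, ISNULL(SalesQuota, 0.00) AS 'Sales Quota'
FROM Sales.SalesPerson;
```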
To display only the first available contact number of each employee, you can use
the COALESCE() function. The COALESCE() function checks the values of each column
in a list and returns the first non-null value. The null value is returned only
if all the values in the list are null. The syntax for using the COALESCE()
function is:
COALESCE ( column_name [ ,...n ] )
For example, you can use the following SQL query to display the very first contact
number of the employees in the Contact table:
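The structure of the Contact table is not shown here, so the following sketch
uses a hypothetical table variable with assumed column names (Residence, Office,
Mobile) and made-up numbers to illustrate the behavior:

```sql
-- Hypothetical stand-in for the Contact table; column names and data are assumed.
DECLARE @Contact TABLE (
    EmployeeID int,
    Residence  varchar(20) NULL,
    Office     varchar(20) NULL,
    Mobile     varchar(20) NULL
);

INSERT INTO @Contact VALUES
    (1, NULL, '555-0101', '555-0199'),
    (2, NULL, NULL,       '555-0142'),
    (3, NULL, NULL,       NULL);

-- Returns the first non-null number per row, or NULL if all three are NULL.
SELECT EmployeeID,
       COALESCE(Residence, Office, Mobile) AS FirstContactNumber
FROM @Contact;
```

With this data, employee 1 shows 555-0101, employee 2 shows 555-0142, and
employee 3 shows NULL.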
You can sort the result set by using the ORDER BY clause, which sorts in ascending
order by default. The following SQL query retrieves the records from the
Department table in ascending order of the Name column:
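A minimal sketch of this query. The columns in the SELECT list are chosen for
illustration:

```sql
-- Assumes HumanResources.Department; ASC is the default and is shown for clarity.
SELECT DepartmentID, Name
FROM HumanResources.Department
ORDER BY Name ASC;
```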
Optionally, you can sort the result set based on more than one column. For this, you
need to specify the sequence of the sort columns in the ORDER BY clause, as
shown in the following query:
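A minimal sketch of a sort on two columns, assuming the AdventureWorks Department
table:

```sql
-- Assumes HumanResources.Department; rows are sorted by GroupName, then DepartmentID.
SELECT GroupName, DepartmentID, Name
FROM HumanResources.Department
ORDER BY GroupName, DepartmentID;
```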
The preceding query sorts the Department table in ascending order of GroupName,
and then ascending order of DepartmentID, as shown in the following figure.
Retrieving Records from the Top of a Table
You can use the TOP keyword to retrieve only the first set of rows from the top
of a table. This set can be specified either as a number of rows or as a
percentage of the rows returned by a query. For example, you want to view the
product details from the product table, where the product price is more than $50.
There might be many such records in the table, but you want to see only the top 10
records that satisfy the condition. In such a case, you can use the TOP keyword.
The syntax for using the TOP keyword in the SELECT statement is:
SELECT [TOP n [PERCENT] [WITH TIES]] column_name [,column_name...]
FROM table_name
WHERE search_conditions
[ORDER BY column_name]
where n is the number of rows that you want to retrieve. If the PERCENT
keyword is used, then n percent of the rows are returned.
WITH TIES specifies that the result set includes all the additional rows that
match the last row returned by the TOP clause in the ORDER BY columns. It can be
used only along with the ORDER BY clause.
The following queries retrieve the top 10 rows and the top 10 percent of the rows
of the Employee table:
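A minimal sketch of both forms, assuming the AdventureWorks Employee table:

```sql
-- Assumes HumanResources.Employee.
-- First 10 rows; without ORDER BY, which 10 rows are returned is not guaranteed.
SELECT TOP (10) *
FROM HumanResources.Employee;

-- First 10 percent of the rows.
SELECT TOP (10) PERCENT *
FROM HumanResources.Employee;
```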
If the SELECT statement including TOP has an ORDER BY clause, then the rows to
be returned are selected after the ORDER BY clause has been applied.
For example, you want to retrieve the top three records from the Employee table
where the HireDate is greater than or equal to 1/1/98 and less than or equal to
12/31/98. Further, the record should be displayed in the ascending order based on
the SickLeaveHours column. To accomplish this task, you can use the following
query:
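A minimal sketch of this query. The dates are written in the unseparated
yyyymmdd form, which SQL Server interprets the same way regardless of the
session's language or date format settings:

```sql
-- Assumes HumanResources.Employee with HireDate and SickLeaveHours columns.
SELECT TOP (3) *
FROM HumanResources.Employee
WHERE HireDate >= '19980101' AND HireDate <= '19981231'
ORDER BY SickLeaveHours ASC;
```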
Consider another example, where you want to retrieve the details of the 10
employees who have the highest sick leave hours. In addition, the result set should
include all those employees whose sick leave hours match the lowest sick leave
hours included in the result set:
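A minimal sketch of a WITH TIES query. The columns in the SELECT list are chosen
for illustration:

```sql
-- Assumes HumanResources.Employee; may return more than 10 rows when the
-- tenth SickLeaveHours value is shared by further employees.
SELECT TOP (10) WITH TIES EmployeeID, Title, SickLeaveHours
FROM HumanResources.Employee
ORDER BY SickLeaveHours DESC;
```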
With the ORDER BY clause, you can retrieve the records in a specific order but
cannot limit the number of records returned.
With the TOP clause, you can limit the number of records returned but cannot
retrieve the records from a specified position.
In such a case, you can use the OFFSET and FETCH clauses, which are specified
along with the ORDER BY clause, to retrieve a specific number of records,
starting from a particular position, in the result set.
For example, you want to retrieve the records of employees from the Employee
table. But, you do not want to include the first 15 records in the result set. In such
a case, you can use the OFFSET clause to exclude the first 15 records from the
result set, as shown in the following query:
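A minimal sketch of an OFFSET query. Sorting on EmployeeID is an assumption, since
the text does not say which order defines the first 15 records:

```sql
-- Assumes HumanResources.Employee; OFFSET requires ORDER BY (SQL Server 2012 or later).
SELECT EmployeeID, Title, HireDate
FROM HumanResources.Employee
ORDER BY EmployeeID
OFFSET 15 ROWS;
```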
You may want to retrieve the 10 records from the Employee table, excluding the
first 15 records. In such a case, you can use the FETCH clause along with the
OFFSET clause, as shown in the following query:
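A minimal sketch that combines OFFSET and FETCH. As before, sorting on EmployeeID
is an assumed choice:

```sql
-- Assumes HumanResources.Employee; skips 15 rows, then returns the next 10.
SELECT EmployeeID, Title, HireDate
FROM HumanResources.Employee
ORDER BY EmployeeID
OFFSET 15 ROWS
FETCH NEXT 10 ROWS ONLY;
```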
To eliminate duplicate rows from the result set, you can use the DISTINCT keyword,
which specifies that only the records containing non-duplicated values in the
specified column are displayed.
The following SQL query retrieves all the distinct Titles beginning with PR from
the Employee table:
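A minimal sketch of this query, assuming the AdventureWorks Employee table:

```sql
-- Assumes HumanResources.Employee; each matching title is listed once.
SELECT DISTINCT Title
FROM HumanResources.Employee
WHERE Title LIKE 'PR%';
```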
String functions are used with the character data types, such as char and varchar.
SQL Server provides string functions that can be used as a part of any character
expression. These functions are used for various operations on strings. The syntax
for using a function in the SELECT statement is:
SELECT function_name (parameters)
For example, you want to retrieve the Name, DepartmentID, and GroupName
columns from the Department table and the data of the Name column should be
displayed in uppercase with a user-defined heading, Department Name.
For this, you can use the upper() string function, as shown in the following query:
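A minimal sketch of this query, assuming the AdventureWorks Department table:

```sql
-- Assumes HumanResources.Department; upper() converts the name to uppercase.
SELECT 'Department Name' = upper(Name), DepartmentID, GroupName
FROM HumanResources.Department;
```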
The following SQL query uses the left() string function to extract the specified
characters from the left side of a string:
SELECT Name = Title + ' ' + left (FirstName,1) + '. ' + LastName, EmailAddress
FROM Person.Contact
The following figure shows the output of the preceding query
The following table lists the string functions provided by SQL Server.
Using Conversion Functions
You can use the conversion functions to convert data from one type to another. For
example, you want to convert a string value into a numeric format. You can use the
parse() function to convert string values to numeric or date time format, as shown
in the following query:
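A minimal sketch of the parse() function. The input strings and the target data
types are illustrative, and the function is available in SQL Server 2012 and
later:

```sql
-- Illustrative values; PARSE converts a string to a numeric or date/time type.
SELECT PARSE('219' AS decimal(10, 2)) AS NumericValue,
       PARSE('Monday, 13 December 2010' AS datetime2 USING 'en-US') AS DateValue;
```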
The following table lists the conversion functions provided by SQL Server.
The following table lists the style values for displaying datetime expressions in
different formats.
Date parsing includes extracting components, such as the day, the month, and the
year from a date value. You can also retrieve the system date and use the value in
the date manipulation operations. To retrieve the current system date, you can use
the getdate() function. The following query displays the current date:
SELECT getdate()
The datediff() function is used to calculate the difference between two dates. For
example, the following SQL query uses the datediff() function to calculate the age
of the employees:
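A minimal sketch of this calculation. Note that datediff() counts the number of
year boundaries crossed, so the result can be one more than the completed years
for an employee whose birthday has not yet occurred this year:

```sql
-- Assumes HumanResources.Employee with a BirthDate column.
SELECT EmployeeID, datediff(yy, BirthDate, getdate()) AS 'Age'
FROM HumanResources.Employee;
```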
The preceding query calculates the difference between the current date and the
date of birth of the employees, where the date of birth is stored in
the BirthDate column of the Employee table in the AdventureWorks database.
The following table lists the date functions provided by SQL Server.
The Date Functions Provided by SQL Server
To parse the date values, you can use the datepart() function in conjunction with
the date functions. For example, the datepart() function retrieves the year when
an employee was hired, along with the employee title, from the Employee table, as
shown in the following query:
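A minimal sketch of this query. The heading Year of Joining is an illustrative
choice:

```sql
-- Assumes HumanResources.Employee; datepart(yy, ...) returns the year as an integer.
SELECT Title, datepart(yy, HireDate) AS 'Year of Joining'
FROM HumanResources.Employee;
```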
The following SQL query uses datename() and datepart() functions to retrieve the
month name and year from a given date:
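A minimal sketch that combines the two functions. Using the HireDate column as
the given date, and the heading Joining, are assumptions for illustration:

```sql
-- Assumes HumanResources.Employee; datename() returns text, datepart() an integer,
-- so the year is converted to a string before concatenation.
SELECT EmployeeID,
       datename(mm, HireDate) + ', ' +
       convert(varchar(4), datepart(yyyy, HireDate)) AS 'Joining'
FROM HumanResources.Employee;
```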
For example, to calculate the round off value of any number, you can use the
round() mathematical function. The round() mathematical function calculates and
returns the numeric value based on the input values provided as an argument. The
syntax of the round() function is:
round(numeric_expression,length)
Length is the precision to which the expression is to be rounded off. The following
SQL query retrieves the EmployeeID and Rate for the specified employee ID from
the EmployeePayHistory table:
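A minimal sketch of this query. The employee ID value 1 is illustrative:

```sql
-- Assumes HumanResources.EmployeePayHistory; the EmployeeID value is illustrative.
SELECT EmployeeID, 'Hourly Pay Rate' = round(Rate, 2)
FROM HumanResources.EmployeePayHistory
WHERE EmployeeID = 1;
```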
In the preceding figure, the value of the Hourly Pay Rate column is rounded off to
two decimal places. While using the round() function, if the length is positive, then
the expression is rounded to the right of the decimal point. If the length is
negative, then the expression is rounded to the left of the decimal point.
The following table lists the usage of the round() function provided by SQL
Server.
Aggregate Functions
At times, you need to calculate the summarized values of a column based on a set
of rows. For example, the salary of employees is stored in the Rate column of the
EmployeePayHistory table and you need to calculate the average salary earned by
the employees. In such a case, you can use aggregate functions. The syntax for
using an aggregate function in the SELECT statement is:
SELECT aggregate_function ([ALL | DISTINCT] expression)
FROM table_name
where,
ALL specifies that the aggregate function is applied to all the values in the
specified column. DISTINCT specifies that the aggregate function is applied to
only unique values in the specified column. Expression specifies a column or an
expression with operators. You can calculate summary values by using the following
aggregate functions:
▪ Avg(): Returns the average of values in a numeric expression, either all or
distinct. For example, avg(Rate) returns the average value of the Rate column of
the EmployeePayHistory table.
▪ Count(): Returns the number of values in an expression, either all or distinct.
The count() function also accepts (*) as its parameter, in which case it counts
the number of rows returned by the query.
▪ Min(): Returns the lowest value in the expression. For example, min(Rate)
returns the minimum value of the Rate column of the EmployeePayHistory table.
▪ Max(): Returns the highest value in the expression. For example, max(Rate)
returns the maximum value of the Rate column of the EmployeePayHistory table.
▪ Sum(): Returns the sum total of values in a numeric expression, either all or
distinct. For example, sum(DISTINCT Rate) returns the sum of all the unique
rate values in the EmployeePayHistory table.
Each aggregate value can be displayed with a user-defined heading.
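The aggregate functions listed above can be sketched in a single query. The
headings are illustrative choices:

```sql
-- Assumes HumanResources.EmployeePayHistory; the headings are illustrative.
SELECT 'Average Rate' = avg(Rate),
       'Unique Rates' = count(DISTINCT Rate),
       'Total Rows'   = count(*),
       'Minimum Rate' = min(Rate),
       'Maximum Rate' = max(Rate),
       'Sum of Rates' = sum(DISTINCT Rate)
FROM HumanResources.EmployeePayHistory;
```

Because no GROUP BY clause is used, the query returns a single row that
summarizes the whole table.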
You can group the data by using the GROUP BY clause of the SELECT statement.
GROUP BY
The GROUP BY clause summarizes the result set into groups, as defined in the
SELECT statement, by using aggregate functions. The HAVING clause further
restricts the result set to produce the data based on a condition.
The following SQL query returns the minimum and maximum values of vacation
hours for the different types of titles where the number of hours that the
employees can avail to go on a vacation is greater than 80:
SELECT Title, Minimum = min(VacationHours), Maximum = max(VacationHours)
FROM HumanResources.Employee
WHERE VacationHours > 80
GROUP BY Title
The HAVING clause eliminates all those groups that do not match the specified
condition. The following query retrieves all the titles along with their average
vacation hours when the vacation hours are more than 30 and the group average
value is greater than 55:
SELECT Title, 'Average Vacation Hours' = avg(VacationHours)
FROM HumanResources.Employee
WHERE VacationHours > 30
GROUP BY Title
HAVING avg(VacationHours) > 55
The GROUP BY clause can be applied on multiple fields. You can use the following
query to retrieve the average value of the vacation hours that is grouped by Title
and ManagerID in the Employee table:
SELECT Title, 'Manager ID' = ManagerID, Average = avg(VacationHours)
FROM HumanResources.Employee
GROUP BY Title, ManagerID
If you want to display all those groups that are excluded by the WHERE clause,
then you can use the ALL keyword along with the GROUP BY clause.
For example, the following query retrieves the records for the employee titles
that are eliminated in the WHERE condition:
SELECT Title, VacationHours = sum(VacationHours)
FROM HumanResources.Employee
WHERE Title IN ('Recruiter', 'Stocker', 'Design Engineer')
GROUP BY ALL Title
ORDER BY sum(VacationHours) DESC
The GROUPING SETS clause is used to combine the result generated by multiple
GROUP BY clauses into a single result set. For example, the employee details of the
organization are stored in the following EmpTable table.
To view the average salary of the employees combined for each region and
department, you can use the following query:
SELECT Region, Department, avg(sal) AverageSalary
FROM EmpTable
GROUP BY Region, Department
To view the average salary of the employees for each region, you can use the
following query:
SELECT Region, avg(sal) AverageSalary FROM EmpTable GROUP BY Region
To view the average salary of the employees for each department, you can use the
following query:
SELECT Department, avg(sal) AverageSalary
FROM EmpTable
GROUP BY Department
Using the preceding queries, you can view the average salary of the employees
based on different grouping criteria. If you want to view the results of all
three queries in a single result set, you need to perform the union of the
results generated from the preceding queries. However, instead of performing the
union of the results, you can use the GROUPING SETS clause, as shown in the
following query:
SELECT Region, Department, AVG(sal) AverageSalary FROM EmpTable
GROUP BY
GROUPING SETS(
(Region, Department),
(Region),
(Department)
)
The following figure displays the output of the preceding query.
In the preceding figure, the rows that do not have NULL values represent the
average salary of the employees grouped for each region and department. The rows
that contain NULL values in the Department column represent the average salary
of the employees for each region. The rows that contain NULL values in the Region
column represent the average salary of the employees for each department.
You have been assigned the task to generate a report that shows the total sales
amount earned by each employee during the previous years and the total sales
amount earned by all the employees in the previous years. To generate the report
required to accomplish the preceding task, you need to apply the following levels of
aggregation:
❑ Sum of sales amount earned by each employee in the previous years
❑ Sum of sales amount earned by all the employees in the previous years
Therefore, you can use the ROLLUP operator to apply the preceding levels of
aggregation and generate the required result set, as shown in the following query:
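The following sketch illustrates the ROLLUP operator on a small, hypothetical
table variable. The name @SalesHistory, the SalesAmount column, and the data are
assumptions; only the EmployeeID and YearOfSale columns are named in the text:

```sql
-- Hypothetical stand-in for the sales table; SalesAmount and the data are assumed.
DECLARE @SalesHistory TABLE (EmployeeID int, YearOfSale int, SalesAmount money);

INSERT INTO @SalesHistory VALUES
    (101, 2010, 1000), (101, 2011, 1500),
    (102, 2010, 2000), (102, 2011, 2500);

-- Produces one row per employee and year, a subtotal row per employee
-- (YearOfSale is NULL), and a grand total row (both columns are NULL).
SELECT EmployeeID, YearOfSale, sum(SalesAmount) AS SalesAmount
FROM @SalesHistory
GROUP BY ROLLUP (EmployeeID, YearOfSale);
```

With this data, the query returns seven rows: four detail rows, the subtotals
2500 for employee 101 and 4500 for employee 102, and the grand total 7000.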
In the preceding query, the EmployeeID and YearOfSale columns are specified
with the ROLLUP operator because the result is to be generated for each
employee as well as for each year of sale. The following figure shows the output of
the preceding query.
The preceding figure displays the sum of the sales amount earned by each
employee during the previous years. The amount earned by employee 101 is
displayed in row number 4. The NULL in this row represents that it contains the
sum of the preceding rows. Similarly, the amount earned by employee 102 is
displayed in row number 8, by employee 103 is displayed in row number 12, and by
employee 104 is displayed in row number 16. In addition, the output displays the
grand total of the sales amount earned by all the employees during the previous
years in row number 17.
The CUBE operator is also used to apply multiple levels of aggregation on the
result retrieved from a table. However, this operator extends the functionality of
the ROLLUP operator and generates a result set with all the possible combinations
of the values in the grouping columns. For example, you have to generate a report
by retrieving data from the SalesHistory table. The generated report should show
the year-wise total amount of sale by all the employees, the total amount of sale
of all the previous years, and the total amount of sale earned by each employee
during all the previous years. To generate a result set required to accomplish the
preceding task, you need to apply the following levels of aggregation:
❑ Sum of sales amount earned by all the employees in each year
❑ Sum of sales amount earned by all the employees in all the previous years
❑ Sum of sales amount earned by each employee in all the previous years
Therefore, you can use the CUBE operator to apply the preceding levels of
aggregation and generate the required result set, as shown in the following query:
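The following sketch illustrates the CUBE operator on a small, hypothetical table
variable that stands in for the SalesHistory table. The SalesAmount column and
the data are assumptions made for illustration:

```sql
-- Hypothetical stand-in for SalesHistory; SalesAmount and the data are assumed.
DECLARE @SalesHistory TABLE (EmployeeID int, YearOfSale int, SalesAmount money);

INSERT INTO @SalesHistory VALUES
    (101, 2010, 1000), (101, 2011, 1500),
    (102, 2010, 2000), (102, 2011, 2500);

-- In addition to the rows that ROLLUP would produce, CUBE adds one total row
-- per year across all employees (EmployeeID is NULL).
SELECT EmployeeID, YearOfSale, sum(SalesAmount) AS SalesAmount
FROM @SalesHistory
GROUP BY CUBE (EmployeeID, YearOfSale);
```

With this data, the query returns nine rows: four detail rows, two totals per
employee (2500 and 4500), two totals per year (3000 for 2010 and 4000 for 2011),
and the grand total 7000.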
Both the ROLLUP and CUBE operators are used to apply multiple levels of
aggregation on the result set retrieved from a table. However, these operators are
different in terms of result sets they produce. The differences between the
ROLLUP and CUBE operators are:
For each value in the columns on the right side of the GROUP BY clause, the
CUBE operator reports all possible combinations of values from the columns
on the left side. However, the ROLLUP operator does not report all such
possible combinations.