Aggregating and Grouping Example

Uploaded by

sumrun sahab

Aggregating Data Using Group Functions


What Are Group Functions?
Group functions operate on sets of rows to give
one result per group.
EMP

DEPTNO        SAL
------  ---------
    10       2450
    10       5000
    10       1300
    20        800
    20       1100
    20       3000
    20       3000
    20       2975
    30       1600
    30       2850
    30       1250
    30        950
    30       1500
    30       1250

"maximum salary in the EMP table"

 MAX(SAL)
---------
     5000

Types of Group Functions

• AVG
• COUNT
• MAX
• MIN
• STDDEV
• SUM
• VARIANCE

Group Functions
Each of the functions accepts an argument. The following table identifies the options that you can use in the
syntax:

Function                         Description
AVG([DISTINCT|ALL]n)             Average value of n, ignoring null values
COUNT({*|[DISTINCT|ALL]expr})    Number of rows where expr evaluates to something other than null
                                 (COUNT(*) counts all selected rows, including duplicates and
                                 rows with nulls)
MAX([DISTINCT|ALL]expr)          Maximum value of expr, ignoring null values
MIN([DISTINCT|ALL]expr)          Minimum value of expr, ignoring null values
STDDEV([DISTINCT|ALL]n)          Standard deviation of n, ignoring null values
SUM([DISTINCT|ALL]n)             Sum of the values of n, ignoring null values
VARIANCE([DISTINCT|ALL]n)        Variance of n, ignoring null values

Using Group Functions

SELECT [column,] group_function(column)
FROM table
[WHERE condition]
[GROUP BY column]
[ORDER BY column];

Guidelines for Using Group Functions


• DISTINCT makes the function consider only nonduplicate values; ALL makes it consider every value
including duplicates. The default is ALL and therefore does not need to be specified.
• The datatypes for the arguments may be CHAR, VARCHAR2, NUMBER, or DATE where expr is
listed.
• All group functions except COUNT(*) ignore null values. To substitute a value for null values, use the
NVL function.
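• The Oracle Server does not guarantee any row order for the result of a GROUP BY clause. Older releases often
returned the groups in ascending order of the grouping columns as a side effect of sort-based grouping, but
you should not rely on it. Use an ORDER BY clause (with DESC for descending order) whenever the order matters.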
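
As a rough illustration of these guidelines, the following sketch assumes the EMP table used throughout these slides. It contrasts ALL (the default) with DISTINCT and states the sort order explicitly. The column aliases are illustrative only and do not come from the course material.

```sql
-- Assumes the EMP table from the slides (columns deptno and sal).
SELECT deptno,
       COUNT(sal)          AS sal_values,     -- ALL is the default: duplicate salaries are counted
       COUNT(DISTINCT sal) AS distinct_sals   -- each distinct salary is counted once
FROM   emp
GROUP  BY deptno
ORDER  BY deptno DESC;                        -- explicit ordering, not a side effect of GROUP BY
```

In the slide data only department 20 (3000 twice) and department 30 (1250 twice) contain duplicate salaries, so those are the groups where the two counts should differ.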

Using AVG and SUM Functions
You can use AVG and SUM for numeric data.
SQL> SELECT AVG(sal), MAX(sal),
2 MIN(sal), SUM(sal)
3 FROM emp
4 WHERE job LIKE 'SALES%';

AVG(SAL) MAX(SAL) MIN(SAL) SUM(SAL)

1400 1600 1250 5600

Group Functions
You can use AVG, SUM, MIN, and MAX functions against columns that can store numeric data. The
example on the slide displays the average, highest, lowest and sum of monthly salaries for all salespeople.

Using MIN and MAX Functions
You can use MIN and MAX for any datatype.
SQL> SELECT MIN(hiredate), MAX(hiredate)
2 FROM emp;

MIN(HIRED MAX(HIRED

17-DEC-80 12-JAN-83

Group Functions
You can use MAX and MIN functions for any datatype. The slide example displays the hire dates of the most
senior employee (earliest date) and the most junior employee (latest date).
The following example displays the employee name that is first and the employee name that is the last in an
alphabetized list of all employees.

SQL> SELECT MIN(ename), MAX(ename)
2 FROM emp;

MIN(ENAME) MAX(ENAME)

ADAMS WARD

Note: AVG, SUM, VARIANCE, and STDDEV functions can be used only with numeric datatypes.
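
The slides list STDDEV and VARIANCE but do not show them in use. A minimal sketch, assuming the same EMP table with its numeric SAL column (the aliases are illustrative):

```sql
-- Assumes the EMP table from the slides; sal is a NUMBER column.
SELECT AVG(sal)      AS avg_sal,
       STDDEV(sal)   AS sd_sal,    -- standard deviation of all salaries
       VARIANCE(sal) AS var_sal    -- variance; STDDEV is its square root
FROM   emp;

-- By contrast, a query such as SELECT AVG(ename) FROM emp is expected to fail,
-- because employee names cannot be converted to numbers.
```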

Using the COUNT Function
COUNT(*) returns the number of rows in a
table.
SQL> SELECT COUNT(*)
2 FROM emp
3 WHERE deptno = 30;

COUNT(*)

6

The COUNT Function


The COUNT function has two formats:
• COUNT(*)
• COUNT(expr)
COUNT(*) returns the number of rows in a table, including duplicate rows and rows containing null values
in any of the columns. If a WHERE clause is included in the SELECT statement, COUNT(*) returns the
number of rows that satisfies the condition in the WHERE clause.
In contrast, COUNT(expr) returns the number of nonnull rows in the column identified by expr.
The slide example displays the number of employees in department 30.
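
To see the two formats side by side, the following sketch runs them in one statement against the EMP table assumed throughout the slides. The aliases are illustrative only.

```sql
-- Assumes the EMP table from the slides (columns deptno, comm, job).
SELECT COUNT(*)            AS all_rows,       -- every row that passes the WHERE clause
       COUNT(comm)         AS with_comm,      -- only rows whose comm is not null
       COUNT(DISTINCT job) AS distinct_jobs   -- each nonnull job title counted once
FROM   emp
WHERE  deptno = 30;
```

With the EMP rows shown on the slides, this should return 6, 4, and 3: six employees in department 30, four of them with a commission value, and three job titles (CLERK, MANAGER, SALESMAN).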

Using the COUNT Function
COUNT(expr) returns the number of
nonnull rows.
SQL> SELECT COUNT(comm)
2 FROM emp
3 WHERE deptno = 30;

COUNT(COMM)

4

The COUNT Function


The slide example displays the number of employees in department 30 who can earn a commission. The result
is four rather than six because two employees in department 30 cannot earn a commission and have a null value
in the COMM column.
Example
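Display the number of rows in the EMP table that have a department number. Because COUNT(deptno) counts
every nonnull value, duplicates included, the result is not the number of departments.

SQL> SELECT COUNT(deptno)
2 FROM emp;

COUNT(DEPTNO)

14

Display the number of distinct departments in the EMP table.

SQL> SELECT COUNT(DISTINCT (deptno))
2 FROM emp;

COUNT(DISTINCT(DEPTNO))

3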
Group Functions and Null Values
Group functions ignore null values in the
column.
SQL> SELECT AVG(comm)
2 FROM emp;

AVG(COMM)

550

Group Functions and Null Values


All group functions except COUNT(*) ignore null values in the column. In the slide example,
the average is calculated based only on the rows in the table where a valid value is stored in the
COMM column. The average is calculated as total commission being paid to all employees
divided by the number of employees receiving commission (4).

Using the NVL Function
with Group Functions
The NVL function forces group functions
to include null values.

SQL> SELECT AVG(NVL(comm,0))
2 FROM emp;

AVG(NVL(COMM,0))

157.14286

Group Functions and Null Values (continued)


The NVL function substitutes a value for each null, so the group function no longer skips those rows.
In the slide example, NVL(comm,0) turns every null commission into 0, and the average is calculated
over all rows in the table: the total commission being paid to all employees divided by the total
number of employees in the company (14).
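
The two averages can be checked by hand. An average of 550 over 4 rows means the total commission is 2200, and 2200 / 14 is about 157.14. The sketch below, assuming the EMP table from the slides and using illustrative aliases, computes both averages next to the sums and counts they are built from:

```sql
-- Assumes the EMP table from the slides; comm is null for employees without commission.
SELECT SUM(comm)               AS total_comm,          -- nulls ignored
       COUNT(comm)             AS comm_rows,           -- rows with a nonnull comm
       COUNT(*)                AS all_rows,            -- every row in the table
       AVG(comm)               AS avg_ignoring_nulls,
       SUM(comm) / COUNT(comm) AS same_as_avg,         -- what AVG(comm) computes
       AVG(NVL(comm, 0))       AS avg_nulls_as_zero,
       SUM(comm) / COUNT(*)    AS same_as_nvl_avg      -- what AVG(NVL(comm,0)) computes
FROM   emp;
```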

Creating Groups of Data
EMP

DEPTNO        SAL
------  ---------
    10       2450
    10       5000
    10       1300
    20        800
    20       1100
    20       3000
    20       3000
    20       2975
    30       1600
    30       2850
    30       1250
    30        950
    30       1500
    30       1250

"average salary in EMP table for each department"

DEPTNO   AVG(SAL)
------  ---------
    10  2916.6667
    20       2175
    30  1566.6667

Groups of Data
Until now, all group functions have treated the table as one large group of information. At times,
you need to divide the table of information into smaller groups. This can be done by using the
GROUP BY clause.

Creating Groups of Data:
GROUP BY Clause

SELECT column, group_function(column)
FROM table
[WHERE condition]
[GROUP BY group_by_expression]
[ORDER BY column];

Divide rows in a table into smaller groups by using the GROUP BY clause.

The GROUP BY Clause


You can use the GROUP BY clause to divide the rows in a table into groups. You can then use the group
functions to return summary information for each group.
In the syntax:
group_by_expression specifies columns whose values determine the basis for
grouping rows
Guidelines
• If you include a group function in a SELECT clause, you cannot also select an individual column unless
that column appears in the GROUP BY clause. You will receive an error message if you leave it out.
• Using a WHERE clause, you can exclude rows before they are divided into groups.
• You must list the grouping columns or expressions themselves in the GROUP BY clause.
• You cannot use a column alias in the GROUP BY clause.
• The order of the groups in the result is not guaranteed. Older releases often returned them in ascending
order of the columns in the GROUP BY list, but you should use an ORDER BY clause whenever the order matters.
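
A short sketch of these guidelines in one statement, assuming the EMP table from the slides with its HIREDATE column. The aliases hire_year and hired are illustrative and not part of the course material.

```sql
-- Assumes the EMP table from the slides (columns hiredate and deptno).
SELECT TO_CHAR(hiredate, 'YYYY') AS hire_year,
       COUNT(*)                  AS hired
FROM   emp
WHERE  deptno <> 10                    -- rows are excluded before the groups are formed
GROUP  BY TO_CHAR(hiredate, 'YYYY')    -- the expression is repeated; the alias cannot be used here
ORDER  BY hire_year;                   -- the alias is accepted in ORDER BY
```

Grouping by an expression rather than a plain column follows the same rule: whatever appears outside a group function in the SELECT list has to appear in the GROUP BY clause.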

Using the GROUP BY Clause
All columns in the SELECT list that are not
in group functions must be in the GROUP
BY clause.
SQL> SELECT deptno, AVG(sal)
2 FROM emp
3 GROUP BY deptno;

DEPTNO AVG(SAL)

10 2916.6667
20 2175
30 1566.6667

The GROUP BY Clause


When using the GROUP BY clause, make sure that all columns in the SELECT list that are not
in the group functions are included in the GROUP BY clause. The example on the slide displays
the department number and the average salary for each department. Here is how this SELECT
statement, containing a GROUP BY clause, is evaluated:
• The SELECT clause specifies the columns to be retrieved:
– Department number column in the EMP table
– The average of all the salaries in the group you specified in the GROUP BY clause
• The FROM clause specifies the tables that the database must access: the EMP table.
• The WHERE clause specifies the rows to be retrieved. Since there is no WHERE clause,
by default all rows are retrieved.
• The GROUP BY clause specifies how the rows should be grouped. The rows are being
grouped by department number, so the AVG function that is being applied to the salary
column will calculate the average salary for each department.

Using the GROUP BY Clause
The GROUP BY column does not have to
be in the SELECT list.
SQL> SELECT AVG(sal)
2 FROM emp
3 GROUP BY deptno;

AVG(SAL)

2916.6667
2175
1566.6667

The GROUP BY Clause


The GROUP BY column does not have to be in the SELECT clause. For example, the SELECT statement on
the slide displays the average salaries for each department without displaying the respective department
numbers. Without the department numbers, however, the results do not look meaningful.
You can use the group function in the ORDER BY clause.

SQL> SELECT deptno, AVG(sal)
2 FROM emp
3 GROUP BY deptno
4 ORDER BY AVG(sal);

DEPTNO AVG(SAL)

30 1566.6667
20 2175
10 2916.6667

Grouping by More Than One Column

EMP

DEPTNO  JOB              SAL
------  ---------  ---------
    10  MANAGER         2450
    10  PRESIDENT       5000
    10  CLERK           1300
    20  CLERK            800
    20  CLERK           1100
    20  ANALYST         3000
    20  ANALYST         3000
    20  MANAGER         2975
    30  SALESMAN        1600
    30  MANAGER         2850
    30  SALESMAN        1250
    30  CLERK            950
    30  SALESMAN        1500
    30  SALESMAN        1250

"sum salaries in the EMP table for each job, grouped by department"

DEPTNO  JOB         SUM(SAL)
------  ---------  ---------
    10  CLERK           1300
    10  MANAGER         2450
    10  PRESIDENT       5000
    20  ANALYST         6000
    20  CLERK           1900
    20  MANAGER         2975
    30  CLERK            950
    30  MANAGER         2850
    30  SALESMAN        5600

Groups Within Groups


Sometimes there is a need to see results for groups within groups. The slide shows a report that displays the
total salary being paid to each job title, within each department.
The EMP table is grouped first by department number, and within that grouping, it is grouped by job title. For
example, the two clerks in department 20 are grouped together and a single result (total salary) is produced for
all clerks within the group.

Using the GROUP BY Clause
on Multiple Columns
SQL> SELECT deptno, job, sum(sal)
2 FROM emp
3 GROUP BY deptno, job;

DEPTNO JOB SUM(SAL)

10 CLERK 1300
10 MANAGER 2450
10 PRESIDENT 5000
20 ANALYST 6000
20 CLERK 1900
...
9 rows selected.

Groups Within Groups (continued)


You can return summary results for groups and subgroups by listing more than one GROUP BY column.
The order of the columns in the GROUP BY clause does not guarantee a sort order for the results; add an
ORDER BY clause if the order matters. Here is how the SELECT statement on the slide, containing a GROUP BY
clause, is evaluated:
• The SELECT clause specifies the columns to be retrieved:
– Department number in the EMP table
– Job title in the EMP table
– The sum of all the salaries in the group that you specified in the
GROUP BY clause
• The FROM clause specifies the tables that the database must access: the EMP table.
• The GROUP BY clause specifies how you must group the rows:
– First, the rows are grouped by department number.
– Second, within the department number groups, the rows are grouped by job title.
So the SUM function is being applied to the salary column for all job titles within each department
number group.
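
As a sketch of the evaluation just described, the following variant of the slide query adds COUNT(*) so that each sum can be traced back to the rows that feed it, and an explicit ORDER BY. It assumes the EMP table from the slides; the aliases staff and total_sal are illustrative.

```sql
-- Assumes the EMP table from the slides (columns deptno, job, sal).
SELECT deptno,
       job,
       COUNT(*) AS staff,       -- number of rows in each (deptno, job) group
       SUM(sal) AS total_sal    -- total salary for that group
FROM   emp
GROUP  BY deptno, job
ORDER  BY deptno, job;          -- explicit ordering of the groups
```

For example, the group (20, CLERK) contains two rows (800 and 1100), which gives the total of 1900 shown on the slide.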

Illegal Queries
Using Group Functions
Any column or expression in the SELECT
list that is not an aggregate function must
be in the GROUP BY clause.

SQL> SELECT deptno, COUNT(ename)
2 FROM emp;

SELECT deptno, COUNT(ename)
       *
ERROR at line 1:
ORA-00937: not a single-group group function

Illegal Queries Using Group Functions


Whenever you use a mixture of individual items (DEPTNO) and group functions (COUNT) in the same
SELECT statement, you must include a GROUP BY clause that specifies the individual items (in this case,
DEPTNO). If the GROUP BY clause is missing, then the error message “not a single-group group function”
appears and an asterisk (*) points to the offending column. You can correct the error on the slide by adding
the GROUP BY clause.

SQL> SELECT deptno, COUNT(ename)
2 FROM emp
3 GROUP BY deptno;

DEPTNO COUNT(ENAME)

10 3
20 5
30 6

Any column or expression in the SELECT list that is not an aggregate function must be in the GROUP BY
clause.

Illegal Queries
Using Group Functions
• You cannot use the WHERE clause to restrict
groups.
• You use the HAVING clause to restrict groups.
SQL> SELECT deptno, AVG(sal)
2 FROM emp
3 WHERE AVG(sal) > 2000
4 GROUP BY deptno;

WHERE AVG(sal) > 2000
      *
ERROR at line 3:
ORA-00934: group function is not allowed here

Illegal Queries Using Group Functions (continued)


The WHERE clause cannot be used to restrict groups. The SELECT statement on the slide results
in an error because it uses the WHERE clause to restrict the display of average salaries of those
departments that have an average salary greater than $2000.
You can correct the slide error by using the HAVING clause to restrict groups.
SQL> SELECT deptno, AVG(sal)
2 FROM emp
3 GROUP BY deptno
4 HAVING AVG(sal) > 2000;

DEPTNO AVG(SAL)

10 2916.6667
20 2175

Excluding Group Results
EMP

DEPTNO        SAL
------  ---------
    10       2450
    10       5000
    10       1300
    20        800
    20       1100
    20       3000
    20       3000
    20       2975
    30       1600
    30       2850
    30       1250
    30        950
    30       1500
    30       1250

"maximum salary per department greater than $2900"

DEPTNO   MAX(SAL)
------  ---------
    10       5000
    20       3000

Restricting Group Results


In the same way that you use the WHERE clause to restrict the rows that you select, you use the HAVING
clause to restrict groups. To find the maximum salary of each department, but show only the departments that
have a maximum salary of more than $2900, you need to do the following:
• Find the maximum salary for each department by grouping by department number.
• Restrict the groups to those departments with a maximum salary greater than $2900.

Excluding Group Results:
HAVING Clause
Use the HAVING clause to restrict groups
• Rows are grouped.
• The group function is applied.
• Groups matching the HAVING clause are
displayed.
SELECT column, group_function
FROM table
[WHERE condition]
[GROUP BY group_by_expression]
[HAVING group_condition]
[ORDER BY column];

The HAVING Clause


You use the HAVING clause to specify which groups are to be displayed. Therefore, you further restrict the
groups on the basis of aggregate information.
In the syntax:
group_condition restricts the groups of rows returned to those groups for which
the specified condition is TRUE
The Oracle Server performs the following steps when you use the HAVING clause:
• Rows are grouped.
• The group function is applied to the group.
• The groups that match the criteria in the HAVING clause are displayed.
The HAVING clause can precede the GROUP BY clause, but it is recommended that you place the GROUP
BY clause first because it is more logical. Groups are formed and group functions are calculated before the
HAVING clause is applied to the groups in the SELECT list.
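
The difference between filtering rows and filtering groups is easiest to see when both clauses appear in one statement. The sketch below assumes the EMP table from the slides; the thresholds (1000 and 3) and the aliases are chosen only for illustration.

```sql
-- Assumes the EMP table from the slides (columns deptno and sal).
SELECT deptno,
       COUNT(*) AS staff,
       AVG(sal) AS avg_sal
FROM   emp
WHERE  sal > 1000       -- row filter: applied before the groups are formed
GROUP  BY deptno
HAVING COUNT(*) > 3     -- group filter: applied after COUNT(*) is computed per group
ORDER  BY deptno;

-- With the EMP rows shown on the slides, the expected result is:
--   20   4   2518.75
--   30   5   1690
-- Department 10 has only three rows with sal > 1000, so its group is dropped.
```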

Using the HAVING Clause

SQL> SELECT deptno, max(sal)
2 FROM emp
3 GROUP BY deptno
4 HAVING max(sal)>2900;

DEPTNO MAX(SAL)

10 5000
20 3000

The HAVING Clause (continued)


The slide example displays department numbers and maximum salary for those departments whose maximum
salary is greater than $2900.
You can use the GROUP BY clause without using a group function in the SELECT list.
To restrict groups based on the result of a group function, use the HAVING clause, normally together with a
GROUP BY clause; a WHERE clause cannot reference group functions.
The following example displays the department numbers and average salary for those departments whose
maximum salary is greater than $2900:

SQL> SELECT deptno, AVG(sal)
2 FROM emp
3 GROUP BY deptno
4 HAVING MAX(sal) > 2900;

DEPTNO AVG(SAL)

10 2916.6667
20 2175

Using the HAVING Clause

SQL> SELECT job, SUM(sal) PAYROLL
2 FROM emp
3 WHERE job NOT LIKE 'SALES%'
4 GROUP BY job
5 HAVING SUM(sal)>5000
6 ORDER BY SUM(sal);

JOB PAYROLL

ANALYST 6000
MANAGER 8275

The HAVING Clause (continued)


The slide example displays the job title and total monthly salary for each job title with a total payroll
exceeding $5000. The example excludes salespeople and sorts the list by the total monthly salary.

Nesting Group Functions
Display the maximum average salary.

SQL> SELECT max(avg(sal))
2 FROM emp
3 GROUP BY deptno;

MAX(AVG(SAL))

2916.6667

Nesting Group Functions


Group functions can be nested to a depth of two. The slide example displays the maximum average salary.
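
One way to read the nested call is as two steps: the inner AVG produces one row per department, and the outer MAX then treats those rows as a single group. The sketch below writes the two steps out with an inline view. It is an equivalent formulation offered for illustration, not the form used in the course, and the aliases are hypothetical.

```sql
-- Assumes the EMP table from the slides (columns deptno and sal).
SELECT MAX(avg_sal) AS max_avg_sal        -- step 2: largest of the per-department averages
FROM  (SELECT AVG(sal) AS avg_sal         -- step 1: one average per department
       FROM   emp
       GROUP  BY deptno);
```

With the slide data this should return the same value as the nested form, 2916.6667.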

Summary
SELECT column, group_function(column)
FROM table
[WHERE condition]
[GROUP BY group_by_expression]
[HAVING group_condition]
[ORDER BY column];

Order of evaluation of the clauses:
• WHERE clause
• GROUP BY clause
• HAVING clause

Summary
Seven group functions are available in SQL:
• AVG
• COUNT
• MAX
• MIN
• SUM
• STDDEV
• VARIANCE
You can create subgroups by using the GROUP BY clause. Groups can be excluded using the HAVING
clause.
Place the HAVING and GROUP BY clauses after the WHERE clause in a statement. Place the ORDER BY
clause last.
The Oracle Server evaluates the clauses in the following order:
• If the statement contains a WHERE clause, the server establishes the candidate rows.
• The server identifies the groups specified in the GROUP BY clause.
• The HAVING clause excludes the groups that do not meet its group criteria.

Practice Overview

• Showing different queries that use group functions
• Grouping by rows to achieve more than one result
• Excluding groups by using the HAVING clause

Practice Overview
At the end of this practice, you should be familiar with using group functions and selecting groups of data.
Paper-Based Questions
For questions 1–3, circle either True or False.
Note: Column aliases are used for the queries.
