Republic of the Philippines
POLYTECHNIC UNIVERSITY OF THE PHILIPPINES
INSTITUTE OF TECHNOLOGY
Management Technology Department
INSTRUCTIONAL MATERIAL
FOR
INTE 20023
MANAGING DATA USING DBMS SOFTWARE
COMPILED BY:
JOSEPHINE M. DELA ISLA, MBE, CPA
Table of Contents
Lesson 1: Introduction to Database Management
Lesson 2: The Relational Model 1: Introduction, QBE, and Relational Algebra
Lesson 3: The Relational Model 2: SQL
Lesson 4: The Relational Model 3: Advanced Topics
Lesson 5: Database Design 1: Normalization
Lesson 6: Database Design 2: Design Method
Lesson 7: DBMS Functions
Lesson 8: Database Administration
Lesson 9: Database Management Approaches
1 | INTE 20023 Managing Data Using DBMS Software
INTRODUCTION
This module is intended for those who are interested in gaining some familiarity with database
management. It is appropriate for students in introductory database classes and for students in
database courses in related disciplines, such as business, at either the undergraduate or
graduate level. Such students require a general understanding of the database environment.
This module assumes that the student is familiar with computers; this background is required.
With the guidance and discussion in this module, students will come to understand the database
environment and the operations needed in designing a database.
As technology has grown rapidly over the past four decades, the DBMS has gained importance
because data has been brought online, into the hands of end users, through computer networks.
COURSE OUTCOMES:
At the end of the course, students should be able to:
Identify the basic terminology and concepts of databases and database management
systems
Describe the relational model and examine a method for retrieving data from a relational
database using Query-By-Example (QBE) and relational algebra
Identify
Define the normalization process and its underlying concepts and features
Discuss entity, referential, and legal-values integrity
Understand how normalization is used in the database design process
Examine the entity-relationship model for representing and designing databases
Apply the functions, or services, provided by a DBMS
Discuss the need for database administration
Explain the Database Management Approaches
Lesson 1: Introduction to Database Management
Learning Outcomes:
After successful completion of this lesson, you should be able to:
Introduce Premiere Products, the company that is used as the basis for many of the
examples throughout the text
Introduce basic database terminology
Describe database management systems (DBMSs)
Explain the advantages and disadvantages of database processing
Introduce Henry Books, the company that is used in a case that appears throughout the
text
Introduce Alexamara Marina Group, the company that is used in another case that
appears throughout the text
COURSE MATERIALS:
Premiere Products Background
• Premiere Products
– Distributor of appliances, houseware, and sporting goods
– Uses spreadsheet software to maintain important data
– Recent growth has made spreadsheet approach problematic
FIGURE 1-1: Sample orders spreadsheet
• Problems using spreadsheet
– Redundancy
• Duplication of data or the storing of the same data in more than one place
– Difficulty accessing related data
– Limited security
– Size limitations
• Information Premiere Products needs to maintain
– Sales Reps
• Sales rep number, last name, first name, address, total commission,
commission rate
– Customers
• Customer number, name, address, current balance, credit limit, number
of customer’s sales rep
– Parts Inventory
• Part number, description, number units on hand, item class, warehouse
number, unit price
FIGURE 1-2: Sample order
• Items for each customer’s order
– Order
• Order number, order date, customer number
– Order line
• Order number, part number, number of units ordered, quoted price
– Overall order total
• Not stored because it can be calculated
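Because each order line already records the number of units ordered and the quoted price, an order's total can be derived whenever it is needed instead of being stored. The following is a minimal sketch using Python's built-in sqlite3 module. The table layout, the column names NumOrdered and QuotedPrice, and the sample rows are illustrative assumptions, not data from the Premiere Products files.

```python
import sqlite3

# In-memory database with a hypothetical OrderLine table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderLine ("
    " OrderNum TEXT, PartNum TEXT, NumOrdered INTEGER, QuotedPrice REAL,"
    " PRIMARY KEY (OrderNum, PartNum))"
)
conn.executemany(
    "INSERT INTO OrderLine VALUES (?, ?, ?, ?)",
    [("1001", "P1", 2, 10.0), ("1001", "P2", 1, 5.5), ("1002", "P1", 4, 9.0)],
)


def order_total(order_num):
    """Compute the total of one order from its order lines."""
    row = conn.execute(
        "SELECT SUM(NumOrdered * QuotedPrice) FROM OrderLine WHERE OrderNum = ?",
        (order_num,),
    ).fetchone()
    return row[0]


print(order_total("1001"))  # 25.5
```

Storing the total as well would duplicate information that the order lines already contain, which is the kind of redundancy the database approach tries to avoid.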
Database Background
• Database
– Structure that can store information about:
• Different categories of information
• Relationships between those categories of information
• Entity
– Person, place, object, event, or idea
– Entities for Premiere Products: sales reps, customers, orders, and parts
• Attribute
– Characteristic or property of an entity
– Example: Customer has name, street, city, etc.
– May also be called a field or column
• Relationship
– Association between entities
– One-to-many relationship
• Each rep is associated with many customers
• Each customer is associated with a single rep
FIGURE 1-3: Entities and attributes
FIGURE 1-4: One-to-many relationship
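One common way to record a one-to-many relationship such as the one between reps and customers is to store the rep's number in each customer row. The sketch below assumes simplified Rep and Customer tables with made-up rows; it is an illustration, not the actual Premiere Products design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE Rep (RepNum TEXT PRIMARY KEY, LastName TEXT)")
conn.execute(
    "CREATE TABLE Customer ("
    " CustomerNum TEXT PRIMARY KEY,"
    " CustomerName TEXT,"
    " RepNum TEXT REFERENCES Rep(RepNum))"  # each customer row names a single rep
)
conn.executemany("INSERT INTO Rep VALUES (?, ?)", [("20", "Rep A"), ("35", "Rep B")])
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [
        ("101", "Alpha Store", "20"),
        ("102", "Beta Mart", "20"),
        ("103", "Gamma Goods", "35"),
    ],
)


def customers_of(rep_num):
    """List the customer numbers associated with one rep."""
    rows = conn.execute(
        "SELECT CustomerNum FROM Customer WHERE RepNum = ? ORDER BY CustomerNum",
        (rep_num,),
    )
    return [row[0] for row in rows]


print(customers_of("20"))  # ['101', '102']
```

A rep can appear in many customer rows, while each customer row holds only one rep number, which mirrors the "many" and "one" sides of the relationship.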
• Data file
– File used to store data
– Computer counterpart to ordinary paper file
• Database
– Structure that can store information about:
• Multiple types of entities
• Attributes of those entities
• Relationships between the entities
FIGURE 1-5: Sample data for Premiere Products
FIGURE 1-6: Alternative Orders table structure
• Entity-relationship (E-R) diagram
– Visual way to represent a database
– Rectangles represent entities
– Lines represent relationships between connected entities
FIGURE 1-7: E-R diagram for the Premiere Products database
Database Management Systems
• Database management system (DBMS)
– Program, or collection of programs, through which users interact with a database
• Popular DBMSs: Access, Oracle, DB2, MySQL, and SQL Server
• Premiere Products decides to use Access
• Database design
– Determining the structure of the required database
FIGURE 1-8: Using a DBMS directly
FIGURE 1-9: Using a DBMS through another program
• Forms
– Screen objects used to maintain, view, and print data from a database
– DBMS creates forms that Premiere Products needs
• Reports
– DBMS creates reports for Premiere Products based on user’s answers about the
desired content and appearance of each report
FIGURE 1-10: Part form
FIGURE 1-11: Orders form
FIGURE 1-12: Parts report
Advantages of Database Processing
1. Getting more information from the same amount of data
2. Sharing data
3. Balancing conflicting requirements
– Database administrator or database administration (DBA): person or group in charge of
the database
4. Controlling redundancy
5. Facilitating consistency
6. Improving integrity
– Integrity constraint: a rule that data must follow in the database
7. Expanding security
a. Security: prevention of unauthorized access
8. Increasing productivity
9. Providing data independence
a. Data independence: can change structure of a database without changing the
programs that access the database
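To make the idea of an integrity constraint from the list above concrete, the sketch below lets the DBMS itself reject rows that break a rule. The rule used here (a credit limit may not be negative) and the table layout are assumptions chosen for illustration; SQLite stands in for the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer ("
    " CustomerNum TEXT PRIMARY KEY,"              # rule 1: customer numbers are unique
    " CreditLimit REAL CHECK (CreditLimit >= 0))"  # rule 2: hypothetical legal-values rule
)


def try_insert(customer_num, credit_limit):
    """Return True if the DBMS accepts the row, False if a rule rejects it."""
    try:
        conn.execute(
            "INSERT INTO Customer VALUES (?, ?)", (customer_num, credit_limit)
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert("101", 1000.0))  # True
print(try_insert("102", -50.0))   # False: violates the CHECK rule
print(try_insert("101", 500.0))   # False: duplicate customer number
```

Because the rules live in the database rather than in each program, every program that uses the table is held to the same constraints.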
Disadvantages of Database Processing
1. Larger file size
2. Increased complexity
3. Greater impact of failure
4. More difficult recovery
Introduction to Henry Books Database Case
• Henry Books
– Book store chain operated by Ray Henry
– Sells used books and remainders
• Henry decided to use database to gather and store information on:
– Branches
– Publishers
– Authors
– Books
FIGURE 1-15: Sample branch and publisher data for Henry Books
FIGURE 1-15: Sample branch and publisher data for Henry Books (continued)
FIGURE 1-16: Sample author data for Henry Books
FIGURE 1-17: Sample book data for Henry Books
FIGURE 1-18: Sample data that relates books to authors and books to branches for Henry
Books
FIGURE 1-18: Sample data that relates books to authors and books to branches for Henry Books (continued)
FIGURE 1-19: E-R diagram for the Henry Books database
Introduction to the Alexamara Marina Group Database Case
• Alexamara Marina Group offers in-water boat storage to owners
– Provides boat slips that boat owners can rent on an annual basis
– Two marinas: Alexamara East and Alexamara Central
– Provides boat repair and maintenance services
• Database used to store data
FIGURE 1-20: Sample marina data for Alexamara Marina Group
FIGURE 1-21: Sample owner data for Alexamara Marina Group
FIGURE 1-22: Sample data about marina slips for Alexamara Marina Group
FIGURE 1-23: Sample data about service categories for Alexamara Marina Group
FIGURE 1-24: Sample data about service requests for Alexamara Marina Group
FIGURE 1-24: Sample data about service requests for Alexamara Marina Group (continued)
FIGURE 1-25: E-R diagram for the Alexamara Marina Group database
Summary
• Problems with non-database approaches to data management: redundancy, difficulties
accessing related data, limited security features, limited data sharing features, and
potential size limitations
• Entity: person, place, object, event, or idea for which you want to store and process data
• Attribute, field, or column: characteristic or property of an entity
• Relationship: an association between entities
• One-to-many relationship: each occurrence of first entity is related to many occurrences
of the second entity and each occurrence of the second entity is related to only one
occurrence of the first entity
• Database: structure that can store information about multiple types of entities, attributes
of entities, and relationships among entities
• Premiere Products requires information about reps, customers, parts, orders, and order
lines
• Entity-relationship (E-R) diagram: represents a database visually by using various
symbols
• Database management system (DBMS): program through which users interact with a
database; lets you create forms and reports quickly and easily and obtain answers to
questions about the data
• Advantages of database processing: getting more information from the same amount of
data, sharing data, balancing conflicting requirements, controlling redundancy,
facilitating consistency, improving integrity, expanding security, increasing productivity,
and providing data independence
• Disadvantages of database processing: larger file size, increased complexity, greater
impact of failure, and more difficult recovery
• Henry Books needs to store information about: branches, publishers, authors, books,
inventory, and author sequence
• Alexamara Marina Group needs to store information about: marinas, owners, marina
slips, service categories, and service requests
Read:
Lesson 1: Introduction to Database Management
Database Management Systems by Pratt/Adamski
Assessment/Activities:
1. What is redundancy? What problems are associated with redundancy?
2. What is an entity? An attribute?
3. What is a database?
4. What is an E-R diagram?
5. What is database design?
6. How is it possible to get more information from the same amount of data by using a
database approach as opposed to a non-database approach?
7. What is a DBA? What kinds of responsibilities does a DBA have in a database environment?
8. What is an integrity constraint? When does a database have integrity?
9. What is data independence? Why is it desirable?
10. How can the complexity of a DBMS be a disadvantage?
11. Why might recovery of data be more difficult in a database environment?
Lesson 2: The Relational Model 1: Introduction, QBE, and Relational Algebra
Learning Outcomes:
After successful completion of this lesson, you should be able to:
• Describe the relational model
• Understand Query-By-Example (QBE)
• Use criteria in QBE
• Create calculated columns in QBE
• Use functions in QBE
• Sort data in QBE
• Join tables in QBE
• Update data using QBE
• Understand relational algebra
COURSE MATERIALS:
Relational Databases
• A relational database is a collection of tables
• Each entity is stored in its own table
• Attributes of an entity become the fields or columns in the table
• Relationships are implemented through common columns in two or more tables
• A table should not permit multiple entries (repeating groups) at a single row-and-column position
• Relation: two-dimensional table in which:
• Entries are single-valued
• Each column has a distinct name (called the attribute name)
• All values in a column are values of the same attribute
• Order of columns is immaterial
• Each row is distinct
• Order of rows is immaterial
• Relational database: collection of relations
• Unnormalized relation
• A structure that satisfies all properties of a relation except for the first item
• Entries contain repeating groups; they are not single-valued
• Database structure representation
• Write name of the table followed by a list of all columns within parentheses
• Each table should appear on its own line
• Notation to be used with duplicate column names within a database:
[Link]
• You qualify the column names
• Primary key: column or collection of columns of a table (relation) that uniquely identifies
a given row in that table
• Query: question represented in a way the DBMS can recognize and process
• Query-By-Example (QBE)
• Visual approach to writing queries
• Users ask their questions using an on-screen grid
• Data appears on the screen in tabular form
• Query window in Access has two panes
• Upper pane contains a field list for each table you want to query
• Lower pane contains the design grid, where you specify:
• Format of output
• Fields to be included in the query results
• Sort order for query results
• Any criteria the records must satisfy
Simple Queries
• To include a field in an Access query, double-click the field in the field list to place it in
the design grid
• Clicking Run button in Results group on the Query Tools Design tab runs query and
displays query results
• Add all fields from a table to the design grid by double-clicking the asterisk in the table’s
field list
FIGURE 2-3: Fields added to the design grid
FIGURE 2-4: Query results
Simple Criteria
• Criteria: conditions that data must satisfy
• Criterion: single condition that data must satisfy
• To enter a criterion for a field:
– Include field in the design grid
– Enter criterion in Criteria row for that field
• Comparison operator
– Also called a relational operator
– Used to find something other than an exact match
= (equal to)
> (greater than)
< (less than)
>= (greater than or equal to)
<= (less than or equal to)
NOT (not equal to)
Compound Criteria
• Compound criteria, or compound conditions
– AND criterion: both criteria must be true for the compound criterion to be true
– OR criterion: at least one of the criteria must be true for the compound criterion to be true
• To create an AND criterion in QBE:
– Place the criteria for multiple fields on the same Criteria row in the design grid
• To create an OR criterion in QBE:
– Place the criteria for multiple fields on different Criteria rows in the design grid
FIGURE 2-9: Query that uses an AND criterion
FIGURE 2-11: Query that uses an OR criterion
Computed Fields
• Computed field or calculated field
– Result of a calculation on one or more existing fields
• To include a computed field in a query:
– Enter a name for the computed field, followed by a colon, followed by an
expression in one of the columns in the Field row
• Alternative method
– Right-click the column in the Field row, and then click Zoom to open the Zoom
dialog box
– Type the expression in the Zoom dialog box
FIGURE 2-15: Query that uses a computed field
Functions
• Built-in functions
– Called aggregate functions in Access
• Count
• Sum
• Avg (average)
• Max (largest value)
• Min (smallest value)
• StDev (standard deviation)
• Var (variance)
• First
• Last
FIGURE 2-17: Query to count records
FIGURE 2-18: Query results
Grouping
• Grouping: creating groups of records that share some common characteristic
• To group records in Access:
– Select Group By operator in the Total row for the field on which to group
FIGURE 2-21: Query to group records
Sorting
• Sorting: listing records in query results in an ordered way
• Sort key: field on which records are sorted
• Major sort key
– Also called the primary sort key
– First sort field, when sorting records by more than one field
• Minor sort key
– Also called the secondary sort key
– Second sort field, when sorting records by more than one field
FIGURE 2-23: Query to sort records
Sorting on Multiple Keys
• Specifying more than one sort key in a query
• Major (primary) sort key
– Sort key on the left in the design grid
• Minor (secondary) sort key
– Sort key on the right in the design grid
FIGURE 2-27: Correct query design to sort by RepNum and then by CustomerName
Joining Tables
• Queries to select data from more than one table
• Join the tables based on matching fields in corresponding columns
• Join line
– Line drawn by Access between matching fields in the two tables
– Indicates that the tables are related
FIGURE 2-29: Query design to join two tables
Joining Multiple Tables
• Joining three or more tables is similar to joining two tables
• To join three or more tables:
– Add the field lists for all tables in the join to upper pane
– Add the fields to appear in query results to design grid in the desired order
Using an Update Query
• Update query: a query that changes data
– Makes a specified change to all records satisfying the criteria in the query
• To change a query to an update query:
– Click Update button in the Query Type group on the Query Tools Design tab
• Update To row is added when an update query is created
– Used to indicate how to update data selected by the query
FIGURE 2-35: Query design to update data
Using a Delete Query
• Delete query: permanently deletes all records satisfying the criteria entered in the query
• To change query type to a delete query:
– Click Delete button in the Query Type group on the Query Tools Design tab
• Delete row is added
– Indicates this is a delete query
FIGURE 2-36: Query design to delete records
Using a Make-Table Query
• Make-table query: creates a new table using results of a query
• Records added to new table are separate from the original table
• To change the query type to a make-table query:
– Click Make Table button in the Query Type group on the Query Tools Design tab
– In Make Table dialog box, enter the new table’s name and choose where to
create it
FIGURE 2-38: Make Table dialog box
Relational Algebra
• Theoretical way of manipulating a relational database
• Includes operations that act on existing tables to produce new tables
• Each command ends with a GIVING clause, followed by a table name
– Clause requests the result of the command to be placed in a temporary table with
the specified name
Select
• Takes a horizontal subset of a table
• Retrieves certain rows from an existing table (based on criteria) and saves them as a
new table
• Includes the word WHERE followed by a condition
• Example:
SELECT Customer WHERE CustomerNum=282
GIVING Answer
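Relational algebra is a theoretical notation rather than a programming language, but its SELECT operation is easy to imitate. In the sketch below a table is modeled as a list of Python dictionaries; the helper name select and the sample rows are assumptions made for illustration.

```python
# A table modeled as a list of rows; each row maps column names to values.
customer = [
    {"CustomerNum": "148", "CustomerName": "Alpha Store", "RepNum": "20"},
    {"CustomerNum": "282", "CustomerName": "Beta Mart", "RepNum": "35"},
    {"CustomerNum": "356", "CustomerName": "Gamma Goods", "RepNum": "65"},
]


def select(table, condition):
    """Return a new table holding only the rows for which condition(row) is true."""
    return [row for row in table if condition(row)]


# Counterpart of: SELECT Customer WHERE CustomerNum=282 GIVING Answer
answer = select(customer, lambda row: row["CustomerNum"] == "282")
print(answer)
```

Every column survives and only the rows are filtered, which is why SELECT is described as a horizontal subset.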
Project
• Takes a vertical subset of a table
• Causes only certain columns to be included in the new table
• Includes the word OVER followed by a list of the columns to be included
• Example:
PROJECT Customer OVER (CustomerNum, CustomerName)
GIVING Answer
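PROJECT can be imitated in the same spirit. This sketch again treats a table as a list of dictionaries (an assumption of the illustration) and drops duplicate rows from the result, since each row of a relation must be distinct.

```python
customer = [
    {"CustomerNum": "148", "CustomerName": "Alpha Store", "RepNum": "20"},
    {"CustomerNum": "282", "CustomerName": "Beta Mart", "RepNum": "35"},
    {"CustomerNum": "356", "CustomerName": "Gamma Goods", "RepNum": "35"},
]


def project(table, columns):
    """Keep only the listed columns, removing any duplicate rows that result."""
    seen = set()
    result = []
    for row in table:
        key = tuple(row[column] for column in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


# Counterpart of: PROJECT Customer OVER (CustomerNum, CustomerName) GIVING Answer
answer = project(customer, ["CustomerNum", "CustomerName"])
print(answer)

# Projecting over RepNum alone leaves two rows, because two customers share rep 35.
print(project(customer, ["RepNum"]))  # [{'RepNum': '20'}, {'RepNum': '35'}]
```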
Join
• Allows extraction of data from more than one table
• Two tables being joined
– Join column: common column on which two tables are joined
– Rows in new table will be the concatenation (combination) of rows from each
original table
• Natural join: joins only those rows from each original table that have a matching value in the
join column of the other table
• Outer join: joins the matching rows and also includes the rows from each original table that
have no match in the other table
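The difference between the two kinds of join can be sketched with two small tables. The helper names, the list-of-dictionaries representation, and the sample rows are assumptions for illustration; unmatched rows in the outer join are padded with None.

```python
orders = [
    {"OrderNum": "O1", "CustomerNum": "148"},
    {"OrderNum": "O2", "CustomerNum": "148"},
]
customer = [
    {"CustomerNum": "148", "CustomerName": "Alpha Store"},
    {"CustomerNum": "282", "CustomerName": "Beta Mart"},  # has no orders
]


def natural_join(left, right, column):
    """Concatenate every pair of rows that agree on the join column."""
    return [{**l, **r} for l in left for r in right if l[column] == r[column]]


def outer_join(left, right, column):
    """Natural join plus the unmatched rows of both tables, padded with None."""
    left_cols = {c for row in left for c in row}
    right_cols = {c for row in right for c in row}
    result = natural_join(left, right, column)
    matched = {row[column] for row in result}
    for l in left:
        if l[column] not in matched:
            result.append({**{c: None for c in right_cols}, **l})
    for r in right:
        if r[column] not in matched:
            result.append({**{c: None for c in left_cols}, **r})
    return result


print(len(natural_join(orders, customer, "CustomerNum")))  # 2
print(len(outer_join(orders, customer, "CustomerNum")))    # 3
```

The customer without orders disappears from the natural join but is kept by the outer join, with an empty OrderNum.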
Normal Set Operations
• Union of tables A and B
– Table containing all rows that are in either table A or table B or in both table A
and table B
• Intersection of tables A and B
– Table containing all rows that are common in both table A and table B
• Difference of tables A and B
– Referred to as A minus B
– Set of all rows that are in table A but that are not in table B
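The three set operations defined above can be tried out on two small tables with identical columns. In this sketch each row is a tuple of (CustomerNum, CustomerName); the function names and sample rows are illustrative assumptions.

```python
# Two tables with the same columns: (CustomerNum, CustomerName).
table_a = [("148", "Alpha Store"), ("282", "Beta Mart")]
table_b = [("282", "Beta Mart"), ("356", "Gamma Goods")]


def union(a, b):
    """Rows that are in a, in b, or in both (each row listed once)."""
    result = list(a)
    for row in b:
        if row not in result:
            result.append(row)
    return result


def intersection(a, b):
    """Rows that are in both a and b."""
    return [row for row in a if row in b]


def difference(a, b):
    """Rows that are in a but not in b (a minus b)."""
    return [row for row in a if row not in b]


print(union(table_a, table_b))         # three distinct rows
print(intersection(table_a, table_b))  # [('282', 'Beta Mart')]
print(difference(table_a, table_b))    # [('148', 'Alpha Store')]
```

Note that difference is not symmetric: difference(table_b, table_a) would return the Gamma Goods row instead.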
Union
• Two tables are union compatible when:
– They have the same number of columns
– Corresponding columns represent the same type of data
JOIN Orders, Customer
WHERE [Link]=[Link]
GIVING Temp1
PROJECT Temp1 OVER CustomerNum, CustomerName
GIVING Temp2
SELECT Customer WHERE RepNum='65'
GIVING Temp3
PROJECT Temp3 OVER CustomerNum, CustomerName
GIVING Temp4
UNION Temp2 WITH Temp4 GIVING Answer
Intersection
• Performed by the INTERSECT command
JOIN Orders, Customer
WHERE [Link]=[Link]
GIVING Temp1
PROJECT Temp1 OVER CustomerNum, CustomerName
GIVING Temp2
SELECT Customer WHERE RepNum='65'
GIVING Temp3
PROJECT Temp3 OVER CustomerNum, CustomerName
GIVING Temp4
INTERSECT Temp2 WITH Temp4 GIVING Answer
Difference
• Performed by the SUBTRACT command
JOIN Orders, Customer
WHERE [Link]=[Link]
GIVING Temp1
PROJECT Temp1 OVER CustomerNum, CustomerName
GIVING Temp2
SELECT Customer WHERE RepNum='65'
GIVING Temp3
PROJECT Temp3 OVER CustomerNum, CustomerName
GIVING Temp4
SUBTRACT Temp4 FROM Temp2 GIVING Answer
Product
• Mathematically called the Cartesian product
• Table obtained by concatenating every row in first table with every row in second table
FIGURE 2-43: Product of two tables
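A product pairs every row of one table with every row of another, so the result has as many rows as the two row counts multiplied together. The sketch below uses Python's itertools.product; the tables and column names are hypothetical and are assumed not to share any column name.

```python
from itertools import product as cartesian_product

reps = [{"RepNum": "20"}, {"RepNum": "35"}]
parts = [{"PartNum": "P1"}, {"PartNum": "P2"}, {"PartNum": "P3"}]


def product(first, second):
    """Concatenate every row of the first table with every row of the second."""
    return [{**a, **b} for a, b in cartesian_product(first, second)]


result = product(reps, parts)
print(len(result))  # 6, that is 2 rows times 3 rows
print(result[0])    # {'RepNum': '20', 'PartNum': 'P1'}
```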
Division
• Best illustrated by considering division of a table with two columns by a table with a
single column
• Result contains quotient
FIGURE 2-44: Dividing one table by another
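Division is the least intuitive of the operations, so a small sketch may help. Here the two-column table lists which customer ordered which part, and the one-column table lists the parts of interest; the quotient is every customer paired with all of those parts. The data and the function name divide are assumptions for illustration.

```python
# Two-column table: (CustomerNum, PartNum) pairs.
ordered = [
    ("101", "P1"), ("101", "P2"),
    ("102", "P1"),
    ("103", "P2"), ("103", "P1"), ("103", "P3"),
]
# Single-column table: the part numbers to divide by.
wanted_parts = ["P1", "P2"]


def divide(dividend, divisor):
    """Return the first-column values that are paired with every divisor value."""
    required = set(divisor)
    paired = {}
    for first, second in dividend:
        paired.setdefault(first, set()).add(second)
    return [first for first, seconds in paired.items() if required <= seconds]


print(divide(ordered, wanted_parts))  # ['101', '103']
```

Customer 102 is left out because it is paired with P1 only, not with every part in the divisor.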
Summary
• Relation: two-dimensional table in which the entries are single-valued, each field has a
distinct name, all values in a field are values of the same attribute, order of fields is
immaterial, each row is distinct, and order of rows is immaterial
• Relational database: collection of relations
• A table’s primary key is the field or fields that uniquely identify a given row within the
table
• Query-By-Example (QBE) is a visual tool for manipulating relational databases
• To indicate AND criteria in an Access query, place both criteria in the same Criteria row
of the design grid; to indicate OR criteria, place criteria on separate Criteria rows of the
design grid
• To create a computed field in Access, enter expression in the desired column of design
grid
• To use functions to perform calculations in Access, include the appropriate function in
the Total row
• To sort query results in Access, select Ascending or Descending in Sort row for the field
or fields that are sort keys
• To join tables in Access, place field lists for both tables in upper pane of Query window
• To make the same change to all records that satisfy certain criteria, use an update query
• To delete all records that satisfy certain criteria, use a delete query
• To save the results of a query as a table, use a make-table query
• Relational algebra is a theoretical method of manipulating relational databases
• SELECT command selects only certain rows
• PROJECT command selects only certain columns
• JOIN command combines data from two or more tables based on common columns
• Normal set operations: union, intersection, and difference
• Product of two tables results from concatenating every row in the first with every row in
the second
• Division process divides one table by another table
Read:
Lesson 2: The Relational Model 1: Introduction, QBE, and Relational Algebra
Assessment/Activities:
1. What is a relational database?
2. Define primary key.
3. Give the different functions used in the database.
4. Differentiate Join from Union.
5. Give the command for SELECT, PROJECT and UNION.
Lesson 3: The Relational Model 2: SQL
Learning Outcomes:
After successful completion of this lesson, you should be able to:
Introduce Structured Query Language (SQL)
Demonstrate simple and compound conditions in SQL
Discuss computed fields in SQL
Become familiar with built-in SQL functions
Demonstrate subqueries in SQL
Group records in SQL
Join tables using SQL
Perform union operations in SQL
Update database data using SQL
Apply SQL query to create a table in a database
COURSE MATERIALS:
Introduction
• SQL (Structured Query Language)
– Allows users to query a relational database
– Must enter commands to obtain the desired results
– Standard language for relational database manipulation
Getting Started with SQL
• If you are completing the work in this chapter using Microsoft Office Access 2007,
Microsoft Office Access 2010, or MySQL version 4.1 or higher, the following sections
contain specific information about your DBMS
Getting Started with Microsoft Office Access 2007 and 2010
• If you are using the Access 2007 or 2010 version of the Premiere Products database
provided with the Data Files for this text:
– Tables in the database have already been created
– You will not need to execute the CREATE TABLE commands to create the tables or
the INSERT commands to add records to the tables
• To execute SQL commands shown in the figures in Access 2007 or Access 2010:
– Open the Premiere Products database
– Click the Create tab on the Ribbon
– Click the Query Design button in the Other group
– Click the Close button in the Show Table dialog box
– Click the View button arrow in the Results group on the Query Design Tools tab, then
click SQL View
– The Query1 tab displays the query in SQL view, ready for you to type your SQL
commands
Getting Started with MySQL
• MySQL-Premiere script provided with the Data Files for this text will:
– Activate the database
– Create the tables
– Insert the records
• To run a script in MySQL:
– Type the SOURCE command followed by the name of the file
– Press the Enter key
• Before typing commands in MySQL, you must activate the database by typing the USE
command followed by the name of the database
• The most recent command entered in MySQL is stored in a special area of memory
called the statement history
Table Creation
• SQL CREATE TABLE command
– Creates a table by describing its layout
• Typical restrictions placed on table and column names by DBMS
– Names cannot exceed 18 characters
– Names must start with a letter
– Names can contain only letters, numbers, and underscores (_)
– Names cannot contain spaces
• INTEGER
– Number without a decimal point
• SMALLINT
– Uses less space than INTEGER
• DECIMAL(p,q)
– Decimal number p digits long, with q of those digits being decimal places
• CHAR(n)
– Character string n places long
• DATE
– Dates in DD-MON-YYYY or MM/DD/YYYY form
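A CREATE TABLE command that uses these data types might look like the sketch below, which runs the SQL through Python's built-in sqlite3 module. The Rep layout follows the sales rep fields listed in Lesson 1, but the exact column names and sizes are assumptions. SQLite accepts these type names without enforcing the declared lengths, so other DBMSs may behave more strictly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Rep (
        RepNum     CHAR(2) PRIMARY KEY,
        LastName   CHAR(15),
        FirstName  CHAR(15),
        Commission DECIMAL(7,2),
        Rate       DECIMAL(3,2)
    )
    """
)

# Ask SQLite which columns the new table has.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Rep)")]
print(columns)

# Store one made-up row and read part of it back.
conn.execute("INSERT INTO Rep VALUES ('20', 'Lastname', 'Firstname', 1250.50, 0.05)")
row = conn.execute("SELECT RepNum, Commission, Rate FROM Rep").fetchone()
print(row)  # ('20', 1250.5, 0.05)
```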
Simple Retrieval
• SELECT-FROM-WHERE: SQL retrieval command
– SELECT clause: lists fields to display
– FROM clause: lists table or tables that contain data to display in query results
– WHERE clause (optional): lists any conditions to be applied to the data to
retrieve
• Simple condition: field name, a comparison operator, and either another field name or
a value
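The SELECT-FROM-WHERE pattern can be tried without Access or MySQL by using the SQLite engine bundled with Python. The Customer layout and rows below are made up for illustration and are reused, in reduced form, by the later sketches in this lesson.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, CustomerName TEXT,"
    " Balance REAL, CreditLimit REAL, RepNum TEXT)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?, ?, ?)",
    [
        ("101", "Alpha Store", 500.0, 1000.0, "20"),
        ("102", "Beta Mart", 1200.0, 1000.0, "35"),
        ("103", "Gamma Goods", 0.0, 2000.0, "35"),
        ("104", "Delta Depot", 750.0, 500.0, "65"),
    ],
)

# SELECT lists the fields, FROM names the table, WHERE gives a simple condition.
rows = conn.execute(
    "SELECT CustomerNum, CustomerName FROM Customer WHERE RepNum = '35'"
).fetchall()
print(sorted(rows))  # [('102', 'Beta Mart'), ('103', 'Gamma Goods')]
```

The result is sorted in Python only to make the printed order predictable; without an ORDER BY clause SQL does not promise any particular row order.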
FIGURE 3-6: SQL query with WHERE condition
FIGURE 3-7: Query results
FIGURE 3-8: Comparison operators used in SQL commands
Compound Conditions
• Compound condition
– Connecting two or more simple conditions using one or both of the following
operators: AND and OR
– Preceding a single condition with the NOT operator
• Connecting simple conditions using AND operator
– All of the simple conditions must be true for the compound condition to be true
• Connecting simple conditions using OR operator
– At least one of the simple conditions must be true for the compound condition to be true
FIGURE 3-15: Compound condition that uses the AND operator
FIGURE 3-16: Query results
FIGURE 3-17: Compound condition that uses the OR operator
FIGURE 3-18: Query results
• Preceding a condition by NOT operator
– Reverses the truth or falsity of the original condition
• BETWEEN operator
– Value must be between the listed numbers, including the two listed numbers themselves
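The four ways of building conditions described in this section (AND, OR, NOT, and BETWEEN) can be compared side by side. This is a sketch on a made-up Customer table using SQLite; the helper function matching is a convenience for the example, not part of SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY,"
    " Balance REAL, CreditLimit REAL, RepNum TEXT)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?, ?)",
    [
        ("101", 500.0, 1000.0, "20"),
        ("102", 1200.0, 1000.0, "35"),
        ("103", 0.0, 2000.0, "35"),
        ("104", 750.0, 500.0, "65"),
    ],
)


def matching(condition):
    """Return the sorted customer numbers that satisfy a WHERE condition."""
    sql = "SELECT CustomerNum FROM Customer WHERE " + condition
    return sorted(row[0] for row in conn.execute(sql))


print(matching("CreditLimit = 1000 AND RepNum = '35'"))  # ['102']
print(matching("CreditLimit = 1000 OR RepNum = '35'"))   # ['101', '102', '103']
print(matching("NOT (RepNum = '35')"))                   # ['101', '104']
print(matching("Balance BETWEEN 500 AND 1000"))          # ['101', '104']
```

Customer 101, whose balance is exactly 500, appears in the last result because BETWEEN includes both end values.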
Computed Fields
• Computed field or calculated field
– Field whose values you derive from existing fields
– Can involve:
Addition (+)
Subtraction (-)
Multiplication (*)
Division (/)
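As an illustration of a computed field, the sketch below derives each customer's available credit by subtracting the balance from the credit limit and also uses the same expression in the condition. The name AvailableCredit and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, Balance REAL, CreditLimit REAL)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [
        ("101", 500.0, 1000.0),
        ("102", 1200.0, 1000.0),
        ("103", 0.0, 2000.0),
        ("104", 750.0, 500.0),
    ],
)

# The computed field is an expression followed by AS and a name for the result column.
rows = conn.execute(
    "SELECT CustomerNum, CreditLimit - Balance AS AvailableCredit"
    " FROM Customer WHERE CreditLimit - Balance > 0"
).fetchall()
print(sorted(rows))  # [('101', 500.0), ('103', 2000.0)]
```

AvailableCredit is not stored anywhere; it is recalculated from the existing fields each time the query runs.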
FIGURE 3-25: SQL query with a computed field and condition
FIGURE 3-26: Query results
Using Special Operators (LIKE and IN)
• Wildcards in Access SQL
– Asterisk (*): any collection of characters
– Question mark (?): any individual character
• Wildcards in MySQL
– Percent sign (%): any collection of characters
– Underscore (_): any individual character
• To use a wildcard, include the LIKE operator in the WHERE clause
• IN operator provides a concise way of phrasing certain conditions
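The sketch below exercises LIKE and IN in SQLite, which uses the same percent sign and underscore wildcards listed above for MySQL; in Access the asterisk and question mark would be used instead. The table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY,"
    " CustomerName TEXT, CreditLimit REAL)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [
        ("101", "Alpha Store", 1000.0),
        ("102", "Beta Mart", 1000.0),
        ("103", "Gamma Goods", 2000.0),
        ("104", "Delta Depot", 500.0),
    ],
)


def matching(condition):
    """Return the sorted customer numbers that satisfy a WHERE condition."""
    sql = "SELECT CustomerNum FROM Customer WHERE " + condition
    return sorted(row[0] for row in conn.execute(sql))


print(matching("CustomerName LIKE '%Mart'"))      # ['102']  % stands for any run of characters
print(matching("CustomerName LIKE '_eta Mart'"))  # ['102']  _ stands for exactly one character
print(matching("CreditLimit IN (500, 2000)"))     # ['103', '104']
```

The IN condition is shorthand for CreditLimit = 500 OR CreditLimit = 2000.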
FIGURE 3-27: SQL query with a LIKE operator
FIGURE 3-28: Query results
FIGURE 3-28: SQL query with an IN operator
FIGURE 3-29: Query results
Sorting
Sort data using the ORDER BY clause
Sort key: field on which to sort data
When sorting data on two fields:
– Major sort key (or primary sort key): more important sort key
– Minor sort key (or secondary sort key): less important sort key
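In an ORDER BY clause the major sort key is simply listed first and the minor sort key second. The following sketch, with made-up rows inserted in a deliberately jumbled order, sorts customers by RepNum and then by CustomerName within each rep.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, CustomerName TEXT, RepNum TEXT)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [
        ("103", "Gamma Goods", "35"),
        ("101", "Alpha Store", "20"),
        ("104", "Delta Depot", "65"),
        ("102", "Beta Mart", "35"),
    ],
)

# RepNum is the major sort key; CustomerName is the minor sort key.
rows = conn.execute(
    "SELECT RepNum, CustomerName FROM Customer ORDER BY RepNum, CustomerName"
).fetchall()
for row in rows:
    print(row)
```

The two customers of rep 35 are listed together, with Beta Mart ahead of Gamma Goods because of the minor key.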
FIGURE 3-33: SQL query to sort data on multiple fields
FIGURE 3-34: Query results
Built-in Functions
• Built-in functions (aggregate functions) in SQL
– COUNT: calculates number of entries
– SUM or AVG: calculates sum or average of all entries in a given column
– MAX or MIN: calculates largest or smallest values respectively
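All five functions can appear in a single query. The sketch below applies them to the Balance column of a small, made-up Customer table in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, Balance REAL)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [("101", 500.0), ("102", 1200.0), ("103", 0.0), ("104", 750.0)],
)

count, total, average, largest, smallest = conn.execute(
    "SELECT COUNT(*), SUM(Balance), AVG(Balance), MAX(Balance), MIN(Balance)"
    " FROM Customer"
).fetchone()

print(count)     # 4
print(total)     # 2450.0
print(average)   # 612.5
print(largest)   # 1200.0
print(smallest)  # 0.0
```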
FIGURE 3-35: SQL query to count records
FIGURE 3-36: Query results
Subqueries
• Subquery: inner query
Subquery is evaluated first
Outer query is evaluated after the subquery
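A typical use of a subquery is to compare each row with a value that must itself be looked up. In this sketch (made-up data, SQLite engine) the inner query finds the average balance, and the outer query then lists the customers whose balance is above it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, Balance REAL)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [("101", 500.0), ("102", 1200.0), ("103", 0.0), ("104", 750.0)],
)

rows = conn.execute(
    "SELECT CustomerNum FROM Customer"
    " WHERE Balance > (SELECT AVG(Balance) FROM Customer)"  # subquery in parentheses
    " ORDER BY CustomerNum"
).fetchall()
above_average = [row[0] for row in rows]
print(above_average)  # ['102', '104']
```

The average balance here is 612.5, so only the customers with balances of 1200 and 750 qualify.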
FIGURE 3-41: SQL query with a subquery
FIGURE 3-42: Query results
Grouping
Create groups of records that share a common characteristic
GROUP BY clause indicates grouping in SQL
HAVING clause is to groups what the WHERE clause is to rows
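The contrast between WHERE and HAVING shows up when a condition refers to a whole group. The sketch below counts customers per rep and then keeps only the reps with more than one customer; the data is made up and SQLite stands in for the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, RepNum TEXT)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [("101", "20"), ("102", "35"), ("103", "35"), ("104", "65")],
)

# One result row per rep: the rep number and how many customers it has.
all_groups = conn.execute(
    "SELECT RepNum, COUNT(*) FROM Customer GROUP BY RepNum ORDER BY RepNum"
).fetchall()
print(all_groups)  # [('20', 1), ('35', 2), ('65', 1)]

# HAVING filters the groups themselves, here by the size of each group.
big_groups = conn.execute(
    "SELECT RepNum, COUNT(*) FROM Customer GROUP BY RepNum"
    " HAVING COUNT(*) > 1 ORDER BY RepNum"
).fetchall()
print(big_groups)  # [('35', 2)]
```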
FIGURE 3-45: SQL query to restrict the groups that are included
FIGURE 3-46: Query results
Joining Tables
Queries can locate data from more than one table
Enter appropriate conditions in the WHERE clause
To join tables, construct the SQL command as:
1. SELECT clause: list all fields you want to display
2. FROM clause: list all tables involved in the query
3. WHERE clause: give the condition that will restrict the data to be retrieved to only
those rows from the two tables that match
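Following the three steps above, a join of Customer and Rep could be written as in this sketch. The tables and rows are illustrative assumptions; column names that occur in both tables are qualified with the table name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Rep (RepNum TEXT PRIMARY KEY, LastName TEXT)")
conn.execute(
    "CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, CustomerName TEXT, RepNum TEXT)"
)
conn.executemany(
    "INSERT INTO Rep VALUES (?, ?)", [("20", "Rep A"), ("35", "Rep B")]
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [
        ("101", "Alpha Store", "20"),
        ("102", "Beta Mart", "35"),
        ("103", "Gamma Goods", "35"),
    ],
)

rows = conn.execute(
    "SELECT Customer.CustomerNum, Customer.CustomerName, Rep.LastName"  # 1. fields
    " FROM Customer, Rep"                                               # 2. tables
    " WHERE Customer.RepNum = Rep.RepNum"                               # 3. matching rows
    " ORDER BY Customer.CustomerNum"
).fetchall()
for row in rows:
    print(row)
```

Without the WHERE condition the query would return every combination of a customer row and a rep row, that is, the product of the two tables.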
FIGURE 3-49: SQL query to join tables
FIGURE 3-50: Query results
Union
Union of two tables is a table containing all rows in the first table, the second table, or
both tables
Two tables involved must be union compatible
– Same number of fields
– Corresponding fields must have same data types
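The sketch below combines two union-compatible queries: customers who have placed an order, and customers of rep 65. It parallels the relational algebra example in Lesson 2, but the Orders and Customer rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, CustomerName TEXT, RepNum TEXT)"
)
conn.execute("CREATE TABLE Orders (OrderNum TEXT PRIMARY KEY, CustomerNum TEXT)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [
        ("101", "Alpha Store", "20"),
        ("102", "Beta Mart", "65"),
        ("103", "Gamma Goods", "65"),
        ("104", "Delta Depot", "35"),
    ],
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [("O1", "101"), ("O2", "102"), ("O3", "101")],
)

rows = conn.execute(
    "SELECT Customer.CustomerNum, CustomerName FROM Customer, Orders"
    " WHERE Customer.CustomerNum = Orders.CustomerNum"
    " UNION"
    " SELECT CustomerNum, CustomerName FROM Customer WHERE RepNum = '65'"
).fetchall()
print(sorted(rows))
```

Both queries return two fields of the same types, so they are union compatible. Customer 102 satisfies both queries and customer 101 has two orders, yet each appears only once because UNION removes duplicate rows.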
FIGURE 3-55: SQL query to perform a union
FIGURE 3-56: Query results
Updating Tables
UPDATE command makes changes to existing data
INSERT command adds new data to a table
DELETE command deletes data from the database
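The three commands can be seen working together in the following sketch, which starts with two made-up customers, adds a third, raises one credit limit, and deletes every customer with a zero balance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, Balance REAL, CreditLimit REAL)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [("101", 500.0, 1000.0), ("102", 0.0, 1000.0)],
)

# INSERT adds a new row.
conn.execute("INSERT INTO Customer VALUES ('103', 250.0, 500.0)")

# UPDATE changes existing data in the rows that satisfy the condition.
conn.execute("UPDATE Customer SET CreditLimit = 1500 WHERE CustomerNum = '101'")

# DELETE removes the rows that satisfy the condition.
conn.execute("DELETE FROM Customer WHERE Balance = 0")

remaining = conn.execute(
    "SELECT CustomerNum, Balance, CreditLimit FROM Customer ORDER BY CustomerNum"
).fetchall()
print(remaining)  # [('101', 500.0, 1500.0), ('103', 250.0, 500.0)]
```

An UPDATE or DELETE without a WHERE clause applies to every row of the table, so the condition deserves a careful check before the command is run.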
FIGURE 3-57: SQL query to update data
FIGURE 3-58: SQL query to insert a row
FIGURE 3-59: SQL query to delete rows
Creating a Table from a Query
• INTO clause
– Saves the results of a query as a table
– Specified before FROM and WHERE clauses
• MySQL
– Create the new table using a CREATE TABLE command
– Use an INSERT command to insert the appropriate data into the new table
FIGURE 3-60a: Query to create a new table (Access)
FIGURE 3-60b: Query to create a new table (for Oracle and MySQL)
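The two-step approach described for MySQL (CREATE TABLE, then INSERT with a query) can be sketched as below. The sketch assumes Python's sqlite3 module, which also lacks SELECT ... INTO; the SmallCust table name and the sample rows are illustrative assumptions.

```python
import sqlite3

# Hypothetical rows for the source table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, CustomerName TEXT, CreditLimit INTEGER)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [("148", "Alpha Store", 7500), ("282", "Beta Store", 10000), ("356", "Gamma Store", 5000)],
)

# Step 1: describe the new table.
conn.execute(
    "CREATE TABLE SmallCust (CustomerNum TEXT PRIMARY KEY, CustomerName TEXT, CreditLimit INTEGER)"
)
# Step 2: fill it with the results of a query.
conn.execute("INSERT INTO SmallCust SELECT * FROM Customer WHERE CreditLimit <= 7500")

small = conn.execute("SELECT CustomerNum FROM SmallCust ORDER BY CustomerNum").fetchall()
print(small)
```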
Summary of SQL Commands
Generic versions of SQL commands for every example presented in this chapter
In most cases, commands in Access are identical to the generic versions
For those commands that differ, both the generic version and the Access version are
included
Summary
Structured Query Language (SQL) is a language that is used to manipulate relational
databases
Basic form of an SQL query: SELECT-FROM-WHERE
Use CREATE TABLE command to describe table layout to the DBMS, which creates the
table
In SQL retrieval commands, fields are listed after SELECT, tables are listed after FROM,
and conditions are listed after WHERE
In conditions, character values must be enclosed in single quotation marks
Compound conditions are formed by combining simple conditions using either or both of
the following operators: AND and OR
Sorting is accomplished using ORDER BY clause
When data is sorted on more than one field, the more important field is the major sort key and the less important field is the minor sort key
Grouping: use the GROUP BY clause
HAVING clause: restricts the groups to be displayed
Joining tables: use a condition that relates matching rows in the tables to be joined
Built-in (aggregate) functions: COUNT, SUM, AVG, MAX, and MIN
One SQL query can be placed inside another; the subquery is evaluated first
UNION operator: unifies the results of two queries
Calculated fields: include the calculation, the word AS, the name of the calculated field
INSERT command adds a new row to a table
UPDATE command changes existing data
DELETE command deletes records
INTO clause is used in a SELECT command to create a table containing the results of
the query
READ:
Lesson 3: The Relational Model 2: SQL
Database Management Systems by Pratt/Adamski
Activities/assessment:
1. Describe the process of creating a table in SQL and the different data types you can use
for fields.
2. How do you write a compound condition in an SQL query? When is a compound
condition true?
3. How do you use the LIKE and IN operators in an SQL query?
4. What are the SQL built-in functions? How do you use them in an SQL query?
5. How do you group data in SQL? When you group data in SQL, are there any restrictions
on the items that you can include in the SELECT clause? Explain.
6. How do you qualify the name of a field in an SQL query? When is it necessary to do so?
7. Describe the three update commands in SQL.
Lesson 4: The Relational Model 3: Advanced Topics
Learning Outcomes:
After successful completion of this lesson, you should be able to:
• Define, describe, and use views
• Use indexes to improve database performance
• Examine the security features of a DBMS
• Discuss entity, referential, and legal-values integrity
• Make changes to the structure of a relational database
• Define and use the system catalog
• Discuss stored procedures, triggers, and data macros
COURSE MATERIALS:
Views
• View: application program’s or individual user’s picture of the database
• Less involved than full database
• Simplification
• Security
• Defining query: SELECT command that creates a view
– Indicates what to include in the view
• Query acts as a window into the database
• Does not produce a new table
• Query that involves a view
– DBMS does not execute the query in this form
– Query actually executed is created by merging this query with the query that
defines the view
CREATE VIEW Housewares AS
SELECT PartNum, Description, OnHand, Price
FROM Part
WHERE Class='HW'
FIGURE 4-1: Housewares view
• To create a view in Access, create and save a query
• Changing field names in a view
– SQL: include the new field names in the CREATE VIEW command
– Access: precede the name of the field with the desired name, followed by a colon
• Row-and-column subset view
– Subset of rows and columns in an individual table
FIGURE 4-3: Access query design of the Housewares view
FIGURE 4-5: Access query design of the Housewares view with changed field
names
• A view can join two or more tables
• Advantages of views
– Data independence
– Each user has his or her own view
– View should contain only fields required by the user
• Greatly simplifies user’s perception of database
• Security
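The Housewares view from Figure 4-1 can be tried in a small sketch, assuming Python's sqlite3 module and invented Part rows. It also illustrates that a view is a window rather than a copy: a row added to Part after the view is defined shows up in the view.

```python
import sqlite3

# Hypothetical rows; the view definition follows Figure 4-1.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Part (PartNum TEXT PRIMARY KEY, Description TEXT, "
    "OnHand INTEGER, Class TEXT, Price REAL)"
)
conn.executemany(
    "INSERT INTO Part VALUES (?, ?, ?, ?, ?)",
    [("AT94", "Iron", 50, "HW", 24.95),
     ("BV06", "Home Gym", 45, "SG", 794.95),
     ("DL71", "Cordless Drill", 21, "HW", 129.95)],
)
conn.execute(
    "CREATE VIEW Housewares AS "
    "SELECT PartNum, Description, OnHand, Price FROM Part WHERE Class = 'HW'"
)

# A query against the view is merged with the view's defining query.
low_stock = conn.execute("SELECT PartNum FROM Housewares WHERE OnHand < 25").fetchall()

# The view stores no data of its own, so new base-table rows appear in it.
conn.execute("INSERT INTO Part VALUES ('FD21', 'Stand Mixer', 22, 'HW', 159.95)")
view_rows = conn.execute("SELECT COUNT(*) FROM Housewares").fetchone()[0]
print(low_stock, view_rows)
```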
Indexes
• Conceptually similar to book index
• Increase data retrieval efficiency
• Record numbers automatically assigned and used by DBMS
• Index key: field or combination of fields on which index is built
• Advantages
– Makes some data retrieval more efficient
FIGURE 4-10: Customer table with record numbers
FIGURE 4-11: Index for the Customer table on the CustomerNum field
• Disadvantages
– Occupies space on disk
– DBMS must update index whenever corresponding data are updated
• Create an index on a field (or fields) when:
• Field is the primary key of the table
• Field is the foreign key in a relationship
• Field will be frequently used as a sort field
• Need to frequently locate a record based on a value in this field
• SQL command to create an index:
CREATE INDEX CustomerName
ON Customer (CustomerName)
• Single-field index
– Key is a single field
– Also called a single-column index
• Multiple-field index
– More than one key field
– Also called a multiple-column index
FIGURE 4-13: Creating an index on a single field in Access
FIGURE 4-14: Creating a multiple-field index in Access
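A minimal sketch of creating a single-field and a multiple-field index, assuming Python's sqlite3 module. The RepBal index name is a hypothetical choice, and the final query reads SQLite's own catalog table, which other DBMSs expose differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, CustomerName TEXT, "
    "Balance REAL, RepNum TEXT)"
)

# Single-field index: the key is one field.
conn.execute("CREATE INDEX CustomerName ON Customer (CustomerName)")
# Multiple-field index: the key combines RepNum and Balance (name is hypothetical).
conn.execute("CREATE INDEX RepBal ON Customer (RepNum, Balance)")

# List the user-created indexes; SQLite's automatic primary-key index is skipped.
index_names = [
    name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'Customer' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    )
]
print(index_names)
```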
Security
• Prevention of unauthorized access to database
• Database administrator determines types of access various users can have
• SQL security mechanisms
– GRANT: provides privileges to users
– REVOKE: removes privileges from users
REVOKE SELECT ON Customer FROM Jones
Integrity Rules
• Two integrity rules must be enforced by a relational DBMS
– Integrity rules defined by Dr. E.F. Codd
– Entity integrity
– Referential integrity
Entity Integrity
• No field that is part of primary key may accept null values
• To specify primary key in SQL:
– Enter a PRIMARY KEY clause in either an ALTER TABLE or a CREATE TABLE command
• To designate primary key in Access:
– Select primary key field in Table Design view
– Click the Primary Key button in the Tools group on the Table Tools Design tab
• SQL clause to specify a primary key:
PRIMARY KEY (CustomerNum)
FIGURE 4-15: Specifying a primary key in Access
• SQL clause when the primary key consists of more than one field:
PRIMARY KEY (OrderNum, PartNum)
FIGURE 4-16: Specifying a primary key consisting of more than one field in Access
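Entity integrity can be observed in a short sketch, assuming Python's sqlite3 module and hypothetical rows. One SQLite-specific detail: for historical reasons SQLite accepts nulls in most primary key columns unless they are also declared NOT NULL, so the sketch states NOT NULL explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderLine ("
    "OrderNum TEXT NOT NULL, PartNum TEXT NOT NULL, NumOrdered INTEGER, "
    "PRIMARY KEY (OrderNum, PartNum))"
)
conn.execute("INSERT INTO OrderLine VALUES ('21608', 'AT94', 11)")

# First candidate repeats an existing key; second has a null in the key.
rejected = []
for row in [("21608", "AT94", 5), (None, "DR93", 1)]:
    try:
        conn.execute("INSERT INTO OrderLine VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(row)

row_count = conn.execute("SELECT COUNT(*) FROM OrderLine").fetchone()[0]
print(rejected, row_count)
```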
Referential Integrity
• Foreign key: field(s) whose value is required to match the value of the primary key for a
second table
• Referential integrity: if table A contains a foreign key that matches the primary key of
table B, the values of this foreign key must match the value of the primary key for some
row in table B or be null
• To specify referential integrity in SQL:
– FOREIGN KEY clause in either the CREATE TABLE or ALTER TABLE
commands
• To specify a foreign key, must specify both:
– Field that is a foreign key
– Table whose primary key the field is to match
• Example:
FOREIGN KEY (RepNum) REFERENCES Rep
• In Access, specify referential integrity while defining relationships
FIGURE 4-18: Specifying referential integrity in Access
FIGURE 4-19: Referential integrity violation when attempting to add a record
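The referential integrity rule can be sketched as follows, assuming Python's sqlite3 module and made-up rows. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued; other DBMSs typically enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.execute("CREATE TABLE Rep (RepNum TEXT PRIMARY KEY, LastName TEXT)")
conn.execute(
    "CREATE TABLE Customer ("
    "CustomerNum TEXT PRIMARY KEY, RepNum TEXT, "
    "FOREIGN KEY (RepNum) REFERENCES Rep)"
)
conn.execute("INSERT INTO Rep VALUES ('20', 'Kaiser')")

# A matching value and a null are both allowed; rep 99 does not exist.
outcomes = {}
for customer_num, rep_num in [("148", "20"), ("282", None), ("356", "99")]:
    try:
        conn.execute("INSERT INTO Customer VALUES (?, ?)", (customer_num, rep_num))
        outcomes[customer_num] = "accepted"
    except sqlite3.IntegrityError:
        outcomes[customer_num] = "rejected"
print(outcomes)
```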
Legal-Values Integrity
• Legal values: set of values allowable in a field
• Legal-values integrity: no record can exist with a value in the field other than one of
the legal values
• SQL
– CHECK clause enforces legal-values integrity
– Example:
CHECK (CreditLimit IN (5000, 7500, 10000, 15000))
• Access
– Validation rule: must be followed by data entered
– Validation text: informs user of the reason for rejection of data that violates the rule
FIGURE 4-21: Specifying a validation rule in Access
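The CHECK clause shown above can be tried in a minimal sketch, assuming Python's sqlite3 module. The attempted credit limits are hypothetical test values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer ("
    "CustomerNum TEXT PRIMARY KEY, "
    "CreditLimit INTEGER CHECK (CreditLimit IN (5000, 7500, 10000, 15000)))"
)

# 7500 is a legal value; 6000 is not in the list and should be refused.
results = {}
for customer_num, credit_limit in [("148", 7500), ("282", 6000)]:
    try:
        conn.execute("INSERT INTO Customer VALUES (?, ?)", (customer_num, credit_limit))
        results[credit_limit] = "accepted"
    except sqlite3.IntegrityError:
        results[credit_limit] = "rejected"
print(results)
```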
Structure Changes
• Examples of changes to database structure
– Adding and removing tables and fields
– Changing characteristics of existing fields
– Creating and dropping indexes
• SQL ALTER TABLE command changes table’s structure
• To add a new field to the Customer table:
ALTER TABLE Customer
ADD CustType CHAR(1)
FIGURE 4-22: Adding a field in Access
• Changing properties of existing fields (generic form; exact syntax varies by DBMS, and MySQL, for example, uses MODIFY COLUMN)
ALTER TABLE Customer
CHANGE COLUMN CustomerName TO CHAR(40)
• Deleting a field from a table (generic form; many DBMSs use DROP COLUMN instead of DELETE)
ALTER TABLE Part
DELETE Warehouse
• DROP TABLE command deletes a table
DROP TABLE SmallCust
FIGURE 4-23: Changing a field property in Access
FIGURE 4-24: Dialog box that opens when a field in Access is deleted
FIGURE 4-25: Deleting a table in Access
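Adding a field and dropping a table can be sketched as below, assuming Python's sqlite3 module. The sketch writes ADD COLUMN, and it leaves out changing or deleting a field because support for those operations differs widely between DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerNum CHAR(3) PRIMARY KEY, CustomerName CHAR(35))")
conn.execute("CREATE TABLE SmallCust (CustomerNum CHAR(3) PRIMARY KEY)")

# ALTER TABLE adds the new field to the existing table.
conn.execute("ALTER TABLE Customer ADD COLUMN CustType CHAR(1)")
# DROP TABLE removes the table and its data.
conn.execute("DROP TABLE SmallCust")

# PRAGMA table_info is SQLite's way of listing a table's fields.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Customer)")]
tables = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]
print(columns, tables)
```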
Making Complex Changes
• Some changes might not be allowed by your DBMS
• In these situations, you can:
– Use CREATE TABLE command to describe the new table
– Insert values into it using INSERT command combined with a SELECT clause
• SELECT INTO command can create the new table in a single operation
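When a change is not supported directly, the CREATE TABLE plus INSERT ... SELECT approach above can be sketched like this. The sketch assumes Python's sqlite3 module, rebuilds a hypothetical Part table without its Warehouse field, and uses PartNew as an invented temporary name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Part (PartNum TEXT PRIMARY KEY, Description TEXT, Warehouse TEXT)")
conn.executemany(
    "INSERT INTO Part VALUES (?, ?, ?)", [("AT94", "Iron", "3"), ("BV06", "Home Gym", "2")]
)

# 1. Describe the new layout, 2. copy the data across with a query,
# 3. drop the old table, 4. give the new table the old name.
conn.execute("CREATE TABLE PartNew (PartNum TEXT PRIMARY KEY, Description TEXT)")
conn.execute("INSERT INTO PartNew SELECT PartNum, Description FROM Part")
conn.execute("DROP TABLE Part")
conn.execute("ALTER TABLE PartNew RENAME TO Part")

columns = [row[1] for row in conn.execute("PRAGMA table_info(Part)")]
rows = conn.execute("SELECT PartNum, Description FROM Part ORDER BY PartNum").fetchall()
print(columns, rows)
```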
System Catalog
• System catalog (or catalog)
– Contains information about tables in the database
– Maintained automatically by DBMS
• Example catalog has two tables
– Systables: information about the tables known to SQL
– Syscolumns: information about the columns or fields within these tables
• Other possible tables
– Sysindexes: information about indexes
– Sysviews: information about views
• Catalog can be used to determine information about the structure of the database
• Access Documenter: allows the user to print detailed documentation about any table, query, report,
form, or other object in the database
• MySQL uses SHOW TABLES, SHOW INDEXES, and SHOW COLUMNS commands
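Catalog names and commands differ by DBMS. As one illustration, the sketch below queries SQLite's catalog through Python's sqlite3 module: the sqlite_master table plays the role of Systables, and PRAGMA table_info stands in for Syscolumns. The three tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerNum CHAR(3) PRIMARY KEY, CustomerName CHAR(35))")
conn.execute("CREATE TABLE Part (PartNum CHAR(4) PRIMARY KEY, Description CHAR(15))")
conn.execute(
    "CREATE TABLE OrderLine (OrderNum CHAR(5), PartNum CHAR(4), "
    "PRIMARY KEY (OrderNum, PartNum))"
)

# Every table known to the DBMS.
tables = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]

# Every field in Customer with its declared data type.
customer_columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Customer)")]

# Every table that contains a field named PartNum.
with_partnum = [
    table for table in tables
    if any(row[1] == "PartNum" for row in conn.execute(f"PRAGMA table_info({table})"))
]
print(tables, customer_columns, with_partnum)
```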
Stored Procedures
• Client/server system
– Database resides on a computer called the server
– Users access database through clients
• Client
– Computer connected to a network
– Has access through server to the database
• Stored procedure
– Special file used to store a query that is run often
– Placed on the server
– Improves overall performance
– Convenience
• MySQL
– Delimiter: semicolon at the end of a MySQL command
– Need to temporarily change the delimiter for a stored procedure
– To use a stored procedure: CALL followed by the procedure name
• Access does not support stored procedures
– Use a parameter query instead
Triggers
• Action that occurs automatically in response to an associated database operation such
as an INSERT, UPDATE, or DELETE command
• Stored and compiled on the server
• Need to temporarily change the delimiter
• Access does not support triggers
– Access 2010 has data macros that have similar functionality
Figure 4-29: Macro Designer window for the After Insert event associated with the OrderLine table
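A trigger can be sketched with Python's sqlite3 module, which supports triggers without any delimiter change. The AddOrderLine trigger name, the Allocated field, and the rows are assumptions made for illustration: inserting an order line automatically increases the allocated quantity of the part ordered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Part (PartNum TEXT PRIMARY KEY, OnHand INTEGER, Allocated INTEGER)")
conn.execute(
    "CREATE TABLE OrderLine (OrderNum TEXT, PartNum TEXT, NumOrdered INTEGER, "
    "PRIMARY KEY (OrderNum, PartNum))"
)
conn.execute("INSERT INTO Part VALUES ('AT94', 50, 0)")

# NEW refers to the row that was just inserted into OrderLine.
conn.execute("""
    CREATE TRIGGER AddOrderLine AFTER INSERT ON OrderLine
    BEGIN
        UPDATE Part SET Allocated = Allocated + NEW.NumOrdered
        WHERE PartNum = NEW.PartNum;
    END
""")

# The trigger fires on its own; no explicit call is made.
conn.execute("INSERT INTO OrderLine VALUES ('21608', 'AT94', 11)")
allocated = conn.execute("SELECT Allocated FROM Part WHERE PartNum = 'AT94'").fetchone()[0]
print(allocated)
```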
Summary
• Views give each user his or her own view of the data in a database
• Indexes facilitate data retrieval from the database
• Security is provided in SQL systems using the GRANT and REVOKE commands
• Entity integrity: no field that is part of the primary key can accept null values
• Referential integrity: value in any foreign key field must be null or must match an actual
value in the primary key field of another table
• Legal-values integrity: value entered in a field must be one of the legal values that
satisfies some particular condition
• ALTER TABLE command allows you to add fields to a table, delete fields, or change the
characteristics of fields
• In Access, change the structure of a table by making the changes in the table design
• DROP TABLE command lets you delete a table from a database
• In Access, delete a table by selecting the Delete command on the table’s shortcut menu
in the Navigation Pane
• System catalog stores information about the structure of a database
• Stored procedure: query saved in a file that users can execute later
• Trigger: action that occurs automatically in response to an associated database
operation such as an INSERT, UPDATE, or DELETE
• Data macros: Access 2010 equivalent of triggers
READ:
Lesson 4: The Relational Model 3: Advanced Topics
Database Management Systems by Pratt/Adamski
Assessment/Activities:
1. What is a view? How is it defined? Do the data described in a view definition ever exist in
that form? What happens when a user accesses a database through a view?
2. Define a view named PartOrder. It consists of the part number, description, price,
order number, order date, number ordered, and quoted price for all order lines currently
on file.
3. Describe the GRANT statement and explain how it relates to security. What types of
privileges may be granted? How are they revoked?
4. Write the SQL commands to obtain the following information from the system catalog:
a. List every table that you created.
b. List every field in the Customer table and its associated data type.
c. List every table that contains a field named PartNum.
Lesson 5: Database Design 1: Normalization
Learning Outcomes:
After successful completion of this lesson, you should be able to:
• Discuss functional dependence and primary keys
• Define first normal form, second normal form, and third normal form
• Describe the problems associated with tables (relations) that are not in first normal form,
second normal form, or third normal form, along with the mechanism for converting to all
three
• Discuss the problems associated with incorrect conversions to third normal form
• Describe the problems associated with tables (relations) that are not in fourth normal
form and describe the mechanism for converting to fourth normal form
• Understand how normalization is used in the database design process
COURSE MATERIALS:
Introduction
• Normalization process
– Identifying potential problems, called update anomalies, in the design of a
relational database
– Methods for correcting these problems
• Normal form: table has desirable properties
– First normal form (1NF)
– Second normal form (2NF)
– Third normal form (3NF)
– Fourth normal form (4NF)
• Normalization
– Table in first normal form better than table not in first normal form
– Table in second normal form better than table in first normal form, and so on
– Goal: new collection of tables that is free of update anomalies
Functional Dependence
• Column B is functionally dependent on column A
– Each value for A is associated with exactly one value of B
A→B
– A functionally determines B
FIGURE 5-2: Rep table with additional column, PayClass
FIGURE 5-3: Rep table
FIGURE 5-4: Rep table with second rep named Kaiser added
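The definition of functional dependence can be checked against sample data with a small Python sketch. The helper name and the Rep rows are hypothetical. Note that sample data can only disprove a dependency; whether A → B truly holds depends on the organization's rules, not on the rows that happen to be on file.

```python
def functionally_determines(rows, a, b):
    """Return True when each value of column a is paired with exactly one value of column b."""
    seen = {}
    for row in rows:
        # setdefault records the first b value seen for this a value.
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True


# Hypothetical Rep rows, including a second rep named Kaiser.
reps = [
    {"RepNum": "20", "LastName": "Kaiser", "PayClass": 1, "Rate": 0.05},
    {"RepNum": "35", "LastName": "Hull", "PayClass": 2, "Rate": 0.07},
    {"RepNum": "65", "LastName": "Perez", "PayClass": 1, "Rate": 0.05},
    {"RepNum": "85", "LastName": "Kaiser", "PayClass": 2, "Rate": 0.07},
]

num_to_name = functionally_determines(reps, "RepNum", "LastName")
name_to_num = functionally_determines(reps, "LastName", "RepNum")
class_to_rate = functionally_determines(reps, "PayClass", "Rate")
print(num_to_name, name_to_num, class_to_rate)
```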
Keys
• Column A (or a collection of columns) is the primary key for a relation R
– Property 1: all columns in R are functionally dependent on A
– Property 2: no subcollection of columns in A also has Property 1
• Candidate key: column or collection of columns that has both Property 1 and Property 2
• Alternate keys: candidate keys not chosen as primary key
First Normal Form
• Repeating group: multiple entries for a single record
• Unnormalized relation: contains a repeating group
• Table (relation) in first normal form (1NF) does not contain repeating groups
Orders (OrderNum, OrderDate, (PartNum, NumOrdered) )
FIGURE 5-5: Sample unnormalized table
Orders (OrderNum, OrderDate, PartNum, NumOrdered)
FIGURE 5-6: Result of normalization (conversion to first normal form)
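The conversion to first normal form can be sketched in Python with invented order data: each entry in the repeating group becomes its own row, and the primary key expands to the combination of OrderNum and PartNum.

```python
# Hypothetical unnormalized orders; "Lines" is the repeating group (PartNum, NumOrdered).
unnormalized = [
    {"OrderNum": "21608", "OrderDate": "2013-10-20", "Lines": [("AT94", 11)]},
    {"OrderNum": "21610", "OrderDate": "2013-10-20", "Lines": [("DR93", 1), ("DW11", 1)]},
]

# One output row per entry in the repeating group.
flat = [
    (order["OrderNum"], order["OrderDate"], part_num, num_ordered)
    for order in unnormalized
    for part_num, num_ordered in order["Lines"]
]

# OrderNum alone no longer identifies a row; (OrderNum, PartNum) does.
key_is_unique = len({(order_num, part_num) for order_num, _, part_num, _ in flat}) == len(flat)
print(flat, key_is_unique)
```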
Second Normal Form
FIGURE 5-7: Sample Orders table
Orders (OrderNum, OrderDate, PartNum, Description, NumOrdered, QuotedPrice)
• Functional dependencies:
OrderNum → OrderDate
PartNum → Description
OrderNum, PartNum → NumOrdered, QuotedPrice, OrderDate, Description
• Update anomalies
– Update
– Inconsistent data
– Additions
– Deletions
• Nonkey column (nonkey attribute): not part of primary key
• Table (relation) in second normal form (2NF)
– Table is in first normal form
– No nonkey column is dependent on only a portion of primary key
• Dependency diagram: arrows indicate all functional dependencies
– Arrows above boxes: normal dependencies
– Arrows below boxes: partial dependencies
• Partial dependencies: only on a portion of the primary key
FIGURE 5-8: Dependences in the Orders table
FIGURE 5-9: Conversion to second normal form
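A rough Python sketch of the conversion to second normal form, using made-up rows: each partial dependency gets its own table, keyed by the portion of the primary key it depends on. The project helper and the resulting table names are assumptions for illustration.

```python
def project(rows, columns):
    """Project rows onto the given columns, dropping duplicates in first-seen order."""
    result = []
    for row in rows:
        item = tuple(row[column] for column in columns)
        if item not in result:
            result.append(item)
    return result


FIELDS = ("OrderNum", "OrderDate", "PartNum", "Description", "NumOrdered", "QuotedPrice")
orders_1nf = [dict(zip(FIELDS, values)) for values in [
    ("21610", "2013-10-20", "DR93", "Gas Range", 1, 495.00),
    ("21610", "2013-10-20", "DW11", "Washer", 1, 399.99),
    ("21613", "2013-10-21", "DR93", "Gas Range", 2, 495.00),
]]

# OrderNum -> OrderDate and PartNum -> Description are partial dependencies.
orders = project(orders_1nf, ["OrderNum", "OrderDate"])
parts = project(orders_1nf, ["PartNum", "Description"])
# Fields that depend on the whole key stay with the whole key.
order_lines = project(orders_1nf, ["OrderNum", "PartNum", "NumOrdered", "QuotedPrice"])
print(orders, parts, order_lines)
```

In the result, the date of order 21610 and the description of part DR93 are each stored once, which removes the update anomalies listed above.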
Third Normal Form
• Customer (CustomerNum, CustomerName, Balance, CreditLimit, RepNum, LastName,
FirstName)
• Functional dependencies:
– CustomerNum → CustomerName, Balance, CreditLimit, RepNum,
LastName, FirstName
– RepNum → LastName, FirstName
FIGURE 5-10: Sample Customer table
• 2NF tables may still contain problems
– Redundancy and wasted space
– Update anomalies
– Update
– Inconsistent data
– Additions
– Deletions
• Determinant: column(s) that determines another column
• Table (relation) in third normal form (3NF)
– It is in second normal form
– Its only determinants are candidate keys
FIGURE 5-11: Dependencies in the Customer table
• Correction procedure
– For each determinant that is not a candidate key, remove from table the columns
that depend on this determinant
– Create new table containing all columns from the original table that depend on this
determinant
– Make determinant the primary key of new table
FIGURE 5-12: Conversion to third normal form
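The correction procedure can be sketched in Python with illustrative rows. RepNum determines LastName and FirstName but is not a candidate key of Customer, so those columns move to a new Rep table keyed by RepNum. Rejoining the two tables reproduces the original rows, which is a quick check that no information was lost.

```python
# Hypothetical 2NF rows: (CustomerNum, CustomerName, RepNum, LastName, FirstName)
customer_2nf = [
    ("148", "Alpha Store", "20", "Kaiser", "Valerie"),
    ("282", "Beta Store", "35", "Hull", "Richard"),
    ("408", "Gamma Store", "35", "Hull", "Richard"),
]

# Remove the columns that depend on the determinant RepNum from Customer ...
customer = [(num, name, rep_num) for num, name, rep_num, _, _ in customer_2nf]
# ... and place them in a new table whose primary key is that determinant.
rep = sorted({(rep_num, last, first) for _, _, rep_num, last, first in customer_2nf})

# Joining on RepNum rebuilds the original table.
rep_lookup = {rep_num: (last, first) for rep_num, last, first in rep}
rejoined = [row + rep_lookup[row[2]] for row in customer]
print(rep, rejoined == customer_2nf)
```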
Incorrect Decompositions
• Decomposition must be done using method described for 3NF
• Incorrect decompositions can lead to tables with the same problems as original table.
FIGURE 5-13: Incorrect decomposition of the Customer table
FIGURE 5-14: Second incorrect decomposition of the Customer table
Multivalued Dependencies and Fourth Normal Form
• 3NF tables may still contain problems
– Updates
– Additions
– Deletions
• Multivalued dependence of column B on column A
– “B is multidependent on A”
– “A multidetermines B”
– Each value for A is associated with a specific collection of values for B, and this
collection is independent of any values for C
A →→ B
• Table (relation) in fourth normal form (4NF)
– It is in third normal form
– No multivalued dependencies
• Converting table to fourth normal form
– Split third normal form table into separate tables, each containing the column that
multidetermines the others
FIGURE 5-16: Conversion to fourth normal form
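A minimal Python sketch of the split, assuming a hypothetical Faculty table in which FacultyNum multidetermines both StudentNum and CommitteeCode. Because students and committees are independent of each other, every combination has to be stored in the single table; the two smaller tables store each fact once.

```python
# Hypothetical rows: (FacultyNum, StudentNum, CommitteeCode)
faculty = [
    ("123", "12805", "ADV"),
    ("123", "12805", "PER"),
    ("123", "24139", "ADV"),
    ("123", "24139", "PER"),
    ("456", "36273", "HSG"),
]

# Each new table keeps FacultyNum, the column that multidetermines the others.
fac_student = sorted({(fac, student) for fac, student, _ in faculty})
fac_committee = sorted({(fac, committee) for fac, _, committee in faculty})

# Pairing every student with every committee of the same faculty member
# rebuilds the original table, so nothing was lost by splitting.
rejoined = sorted(
    (fac, student, committee)
    for fac, student in fac_student
    for other_fac, committee in fac_committee
    if fac == other_fac
)
print(fac_student, fac_committee, rejoined == sorted(faculty))
```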
FIGURE 5-17: Normal forms
Avoiding the Problem with Multivalued Dependencies
• Slightly more sophisticated method for converting unnormalized table to first normal form
– Place each repeating group in separate table
– Each table will contain all columns of a repeating group, and primary key of the
original table
– Primary key to each new table will be the concatenation of the primary keys of the
original table and the repeating group
Application to Database Design
– Carefully convert tables to third normal form
– Review assumptions and dependencies periodically to see if changes to design are
needed
– Splitting relations to achieve third normal form tables creates need for an
interrelation constraint
• Interrelation constraint: condition that involves two or more relations
Summary
• Column (attribute) B is functionally dependent on another column A (or collection of
columns) when each value for A in the database is associated with exactly one value of
B
• Column(s) A is the primary key if all other columns are functionally dependent on A and
no subcollection of columns in A also have this property
• When there is more than one choice for primary key, one possibility is chosen to be the
primary key; the candidate keys not chosen are called alternate keys
• Table (relation) in first normal form (1NF) does not contain repeating groups
• Nonkey column (or nonkey attribute) is not a part of the primary key
• Table (relation) is in the second normal form (2NF) when it is in 1NF and no nonkey
column is dependent on only a portion of the primary key
• Determinant is a column that functionally determines another column
• Table (relation) is in third normal form (3NF) when it is in 2NF and its only determinants
are candidate keys
• Collection of tables (relations) that is not in third normal form has inherent problems
called update anomalies
• Table (relation) is in fourth normal form (4NF) when it is in 3NF and there are no
multivalued dependencies
READ:
Lesson 5: Database Design 1: Normalization
Database Management Systems by Pratt/Adamski
Assessment/Activities:
1. Define functional dependence.
2. Define primary key.
3. Define first normal form.
4. Define third normal form. What types of problems would you find in tables that are not
in third normal form?
5. Define interrelation constraint and give one example of such a constraint. How are
interrelation constraints addressed?
6. Convert the following table to an equivalent collection of tables that are in third normal
form. This table contains information about patients of a dentist. Each patient belongs
to a household.
Patient (HouseholdNum, HouseholdName, Street, City, State, Zip,
Balance, PatientNum, PatientName, (ServiceCode, Description, Fee, Date))
Lesson 6: Database Design 2: Design Method
Learning Outcomes:
After successful completion of this lesson, you should be able to:
• Discuss the general process and goals of database design
• Define user views and explain their function
• Define Database Design Language (DBDL) and use it to document database designs
• Create an entity-relationship (E-R) diagram to visually represent a database design
• Present a method for database design at the information level and view examples
illustrating this method
• Explain the physical-level design process
• Discuss top-down and bottom-up approaches to database design and examine the
advantages and disadvantages of both methods
• Use a survey form to obtain information from users prior to beginning the database
design process
• Review existing documents to obtain information prior to beginning the database design
• Discuss special issues related to implementing one-to-one relationships and many-to-
many relationships involving more than two entities
• Discuss entity subtypes and their relationships to nulls
• Learn how to avoid potential problems when merging third normal form relations
• Examine the entity-relationship model for representing and designing databases
COURSE MATERIALS:
Introduction
• Two-step process for database design
• Information-level design: completed independently of any particular DBMS
• Physical-level design: information-level design adapted for the specific DBMS that will
be used
– Must consider characteristics of the particular DBMS
User Views
• User view: set of requirements necessary to support operations of a particular database
user
• Cumulative design: supports all user views encountered during design process
Information-Level Design Method
• For each user view:
1. Represent the user view as a collection of tables
2. Normalize these tables
3. Identify all keys in these tables
4. Merge the result of Steps 1 through 3 into the cumulative design
Represent the User View As a Collection of Tables
• Step 1: Determine the entities involved and create a separate table for each type of
entity
• Step 2: Determine the primary key for each table
• Step 3: Determine the properties for each entity
• Step 4: Determine relationships between the entities
• One-to-many
• Many-to-many
• One-to-one
• One-to-many relationship: include primary key of the “one” table as a foreign key in the
“many” table
• Many-to-many relationship: create a new table whose primary key is the combination
of the primary keys of the original tables
• One-to-one relationship: simplest implementation is to treat it as a one-to-many
relationship
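The relationship rules above can be sketched as table definitions, assuming Python's sqlite3 module and hypothetical tables. Rep to Customer is one-to-many, so Customer carries RepNum as a foreign key; Orders to Part is many-to-many, so a new OrderLine table takes the combination of both primary keys as its own primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.execute("CREATE TABLE Rep (RepNum TEXT PRIMARY KEY, LastName TEXT)")
# One-to-many: primary key of the "one" table is a foreign key in the "many" table.
conn.execute(
    "CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, CustomerName TEXT, RepNum TEXT, "
    "FOREIGN KEY (RepNum) REFERENCES Rep (RepNum))"
)
conn.execute("CREATE TABLE Orders (OrderNum TEXT PRIMARY KEY, OrderDate TEXT)")
conn.execute("CREATE TABLE Part (PartNum TEXT PRIMARY KEY, Description TEXT)")
# Many-to-many: new table keyed by the combination of the original primary keys.
conn.execute(
    "CREATE TABLE OrderLine (OrderNum TEXT NOT NULL, PartNum TEXT NOT NULL, NumOrdered INTEGER, "
    "PRIMARY KEY (OrderNum, PartNum), "
    "FOREIGN KEY (OrderNum) REFERENCES Orders (OrderNum), "
    "FOREIGN KEY (PartNum) REFERENCES Part (PartNum))"
)

# Read the keys back from SQLite's catalog pragmas.
order_line_key = [row[1] for row in conn.execute("PRAGMA table_info(OrderLine)") if row[5] > 0]
customer_fks = [(row[3], row[2]) for row in conn.execute("PRAGMA foreign_key_list(Customer)")]
print(order_line_key, customer_fks)
```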
Normalize the Tables
• Normalize each table
• Target is third normal form
– Careful planning in early phases of the process usually rules out need to
consider fourth normal form
Identify All Keys
• For each table, identify:
– Primary key
– Alternate keys
– Secondary keys
– Foreign keys
• Alternate key: column(s) that could have been chosen as a primary key but was not
• Secondary keys: columns of interest strictly for retrieval purposes
• Foreign key: column(s) in one table that is required to match value of the primary key for
some row in another table or is required to be null
– Used to create relationships between tables
– Used to enforce certain types of integrity constraints
Types of Primary Keys
• Natural key: consists of a column that uniquely identifies an entity
– Also called a logical key or an intelligent key
• Artificial key: column created for an entity to serve solely as the primary key and that is
visible to users
• Surrogate key: system-generated; usually hidden from users
– Also called a synthetic key
Database Design Language (DBDL)
• Table name followed by columns in parentheses
– Primary key column(s) underlined
• AK identifies alternate keys
• SK identifies secondary keys
• FK identifies foreign keys
– Foreign keys followed by an arrow pointing to the table identified by the foreign key
FIGURE 6-1: DBDL for the Employee table
Entity-Relationship (E-R) Diagrams
• Visually represents database structure
• Rectangle represents each entity
– Entity’s name appears above the rectangle
• Primary key for each entity appears above the line in the entity’s rectangle
• Other columns of entity appear below the line in rectangle
• Letters AK, SK, and FK appear in parentheses following the alternate key, secondary
key, and foreign key, respectively
• For each foreign key, a line leads from the rectangle for the table being identified to the
rectangle for the table containing the foreign key
• Text uses IDEF1X style of E-R diagram
FIGURE 6-2: E-R diagram
Merge the Result into the Design
• Combine tables that have the same primary key to form a new table
• New table:
– Primary key is same as the primary key in the tables combined
– Contains all the columns from the tables combined
– If duplicate columns, remove all but one copy of the column
• Make sure new design is in third normal form
FIGURE 6-3: Information-level design method
Database Design Examples
• Develop an information-level design
• Company stores information about sales reps, customers, parts, and orders
• User view requirements
• Constraints
FIGURE 6-4: Cumulative design after first user view
FIGURE 6-6: Cumulative design after third user view
FIGURE 6-8: Final information-level design
• Henry Books database: information about branches, publishers, authors, and books
• User view requirements
FIGURE 6-9: DBDL for Book database after first user view
FIGURE 6-10: DBDL for Book database after second user view
FIGURE 6-13: Cumulative design after fifth user view
Physical-Level Design
• Undertaken after information-level design completion
• Most DBMSs support primary, candidate, secondary, and foreign keys
• When the DBMS cannot enforce a restriction, DB programmers must include logic in their programs to enforce it
Top-Down Versus Bottom-Up
• Bottom-up design method
– Design starts at low level
– Specific user requirements drive design process
• Top-down design method
– Begins with general database that models overall enterprise
– Refines model until design supports all necessary applications
Survey Form
• Used to collect information from users
• Must contain particular elements
– Entity information
– Attribute (column) information
– Relationships
– Functional dependencies
– Processing information
Obtaining Information from Existing Documents
• Existing documents can furnish information about database design
• Identify and list all columns and give them appropriate names
• Identify functional dependencies
• Determine the tables and assign columns
FIGURE 6-14: Invoice for Holt Distributors
FIGURE 6-15: List of possible attributes for the Holt Distributors invoice
FIGURE 6-17: Revised list of functional dependencies for the Holt Distributors invoice
FIGURE 6-19: Expanded list of entities
One-to-One Relationship Considerations
• Simply include the primary key of each table as a foreign key in the other table
– No guarantee that the information will match
• One solution: create a single table
– Workable, but not the best solution
• Better solution
– Create separate tables for customers and sales reps
– Include the primary key of one of them as a foreign key in the other
FIGURE 6-23: One-to-one relationship implemented by including the primary key of one
table as the foreign key (and alternate key) in the other table
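The better solution can be sketched as follows, assuming Python's sqlite3 module and invented rows. RepNum in Customer is declared both as a foreign key and, through UNIQUE, as an alternate key, so no rep can be assigned to a second customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.execute("CREATE TABLE Rep (RepNum TEXT PRIMARY KEY, LastName TEXT)")
conn.execute(
    "CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, CustomerName TEXT, "
    "RepNum TEXT UNIQUE, "  # UNIQUE makes RepNum an alternate key
    "FOREIGN KEY (RepNum) REFERENCES Rep (RepNum))"
)
conn.execute("INSERT INTO Rep VALUES ('20', 'Kaiser')")
conn.execute("INSERT INTO Customer VALUES ('148', 'Alpha Store', '20')")

# A second customer for the same rep would turn the relationship into one-to-many.
try:
    conn.execute("INSERT INTO Customer VALUES ('282', 'Beta Store', '20')")
    second_customer_allowed = True
except sqlite3.IntegrityError:
    second_customer_allowed = False
print(second_customer_allowed)
```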
Many-to-Many Relationship Considerations
• Complex issues arise when more than two entities are related in a many-to-many
relationship
• Many-to-many-to-many relationship: involves multiple entities
• Deciding between a single many-to-many-to-many relationship and two (or three) many-
to-many relationships
– Crucial issue: independence
FIGURE 6-25: Result obtained by splitting the Sales table into three tables
FIGURE 6-26: Result obtained by joining three tables—the second and third rows are in
error!
Nulls and Entity Subtypes
• Null
– Special value
– Represents absence of a value in a field
– Used when a value is unknown or inapplicable
• Splitting tables to avoid use of null values
• Entity subtype: table that is a subtype of another table
FIGURE 6-27: Student table split to avoid use of null values
• Subtype called a category in IDEF1X terminology
• Incomplete category: some records do not fall into any subtype
• Complete categories: all records fall into the categories
FIGURE 6-29: Entity subtype in an E-R diagram
FIGURE 6-32: Two entity subtypes—incomplete categories
FIGURE 6-33: Two entity subtypes—complete categories
Avoiding Problems with Third Normal Form When Merging Tables
• When combining third normal form tables, the result might not be in third normal form
• Be cautious when representing user views
• Always attempt to determine whether determinants exist and include them in tables
The Entity-Relationship Model
• An approach to representing data in a database
• Entities are drawn as rectangles
• Relationships are drawn as diamonds with lines connecting the entities involved in
relationships
• Composite entity: exists to implement a many-to-many relationship
• Existence dependency: existence of one entity depends on the existence of another
related entity
• Weak entity: depends on another entity for its own existence
FIGURE 6-34: One-to-many relationship
FIGURE 6-35: Many-to-many relationship
FIGURE 6-36: Many-to-many-to-many relationship
FIGURE 6-37: One-to-many relationship with attributes added
FIGURE 6-38: Many-to-many relationship with attributes
FIGURE 6-39: Composite entity
FIGURE 6-40: Complete E-R diagram for the Premiere Products database
FIGURE 6-41: E-R diagram with an existence dependency and a weak entity
• Cardinality: number of items that must be included in a relationship
– An entity in a relationship with minimum cardinality of zero plays an optional role in
the relationship
– An entity with a minimum cardinality of one plays a mandatory role in the
relationship
FIGURE 6-43: E-R diagram that represents cardinality
Summary
• Database design is a two-part process: information-level design (not dependent on a
particular DBMS) and physical-level design (appropriate for the particular DBMS being
used)
• User view: set of necessary requirements to support a particular user’s operations
• Information-level design steps for each user view: represent the user view as a collection
of tables, normalize these tables, represent all keys (primary, alternate, secondary, and
foreign), and merge the results into the cumulative design
• Database design is represented in Database Design Language (DBDL)
• Designs can be represented visually using entity-relationship (E-R) diagrams
• Physical-level design process consists of creating a table for each entity in the DBDL
design
• Design method presented in this chapter is bottom-up
• Survey form is useful for documenting the information gathered for the database design
process
• To obtain information from existing documents, list all attributes present in the
documents, identify potential functional dependencies, make a tentative list of tables,
and use the functional dependencies to refine the list
• To implement a one-to-one relationship, include primary key of one table in the other
table as a foreign key and indicate the foreign key as an alternate key
• If a table’s primary key consists of three (or more) columns, determine whether there are
independent relationships between pairs of these columns
• If a table contains columns that can be null and the nulls mean that the column is
inapplicable for some rows, you can split the table, placing the null column(s) in separate
tables
• The result of merging third normal form tables may not be in third normal form
• Entity-relationship (E-R) model represents the structure of a database using an E-R
diagram
READ:
Lesson 6: Database Design 2: Design Method
Database Management Systems by Pratt/Adamski
Assessment/Activities:
1. Define the term user view as it applies to database design.
2. Under what circumstances would you not need to break down an overall design
into a consideration of individual user views?
3. Describe the function of each of the following types of keys: primary, alternate,
secondary, and foreign.
4. A database at a college is required to support the following requirements.
Complete the information-level design for this set of requirements. Use your own
experience to determine any constraints you need that are not stated in the
problem. Represent the answer in DBDL.
a. For a department, store its number and name.
b. For an advisor, store his or her number and name and the number of
the department to which he or she is assigned.
c. For a course, store its code and description (for example, MTH110 and Algebra).
d. For a student, store his or her number and name. For each course the
student has taken, store the course code, course description, and grade
received. In addition, store the number and name of the student’s
advisor. Assume that an advisor may advise any number of students but
that each student has just one advisor.
Lesson 7: DBMS Functions
Learning Outcomes:
After successful completion of this lesson, you should be able to:
• Introduce the functions, or services, provided by a DBMS
• Describe how a DBMS handles updating and retrieving data
• Examine the catalog feature of a DBMS
• Illustrate the concurrent update problem and describe how a DBMS handles this
problem
• Explain the data recovery process in a database environment
• Describe the security services provided by a DBMS
• Examine the data integrity features provided by a DBMS
• Discuss the extent to which a DBMS achieves data independence
• Define and describe data replication
• Present the utility services provided by a DBMS
COURSE MATERIALS:
Introduction
• Functions of a DBMS
– Update and retrieve data
– Provide catalog services
– Support concurrent update
– Recover data
– Provide security services
– Provide data integrity features
– Support data independence
– Support data replication
– Provide utility services
Update and Retrieve Data
• Fundamental capability of a DBMS
• Users don’t need to know how data is stored or manipulated
• Users add, change, and delete records during updates
FIGURE 7-1: Adding a new part to the Premiere Products database
FIGURE 7-2: Changing the price of a part in the Premiere Products database
FIGURE 7-3: Retrieving a balance amount from the Premiere Products database
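The three operations named in the figures (adding a part, changing a price, retrieving a balance) can be sketched with Python's built-in `sqlite3` module. The table layouts and sample values are assumptions made for illustration, not the actual Premiere Products data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Part (PartNum TEXT PRIMARY KEY, Description TEXT, Price REAL);
    CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, CustomerName TEXT, Balance REAL);
    INSERT INTO Part VALUES ('AT94', 'Iron', 24.95);
    INSERT INTO Customer VALUES ('148', 'Sample Customer', 6550.00);
""")

# Add a new record.
conn.execute("INSERT INTO Part VALUES (?, ?, ?)", ("XX99", "Sample Part", 10.00))

# Change an existing record.
conn.execute("UPDATE Part SET Price = ? WHERE PartNum = ?", (29.95, "AT94"))

# Retrieve a single value.
balance = conn.execute(
    "SELECT Balance FROM Customer WHERE CustomerNum = ?", ("148",)
).fetchone()[0]

new_price = conn.execute(
    "SELECT Price FROM Part WHERE PartNum = 'AT94'"
).fetchone()[0]
part_count = conn.execute("SELECT COUNT(*) FROM Part").fetchone()[0]
print(balance, new_price, part_count)
```

The program states what to add, change, or retrieve. How the rows are stored on disk is left to the DBMS.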
Provide Catalog Services
• Metadata: data about data
• Stores metadata and makes it accessible to users
• Enterprise DBMSs often have a data dictionary (a super catalog)
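As one concrete illustration of catalog services, SQLite keeps its metadata in a table named `sqlite_master` and reports column details through `PRAGMA table_info`. The `Rep` table below is an assumed example; other DBMSs expose their catalogs under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Rep (RepNum TEXT PRIMARY KEY, LastName TEXT, Rate REAL)")

# Metadata about tables is itself stored in a table that users can query.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Rep)")]

print(tables)
print(columns)
```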
Support Concurrent Update
• Ensures accuracy when several users update database at the same time
• Manages complex scenarios for updates
• Concurrent update: multiple users make updates to the same database at the same
time
The Concurrent Update Problem
FIGURE 7-4: Ryan updates the database
FIGURE 7-5: Elena updates the database
FIGURE 7-6: Ryan’s and Elena’s updates to the database result in a lost update
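The lost update can be reproduced in a few lines of Python. This is a simplified simulation with hypothetical amounts, not the exact steps shown in the figures. Both users read the same balance, change a private copy, and write it back.

```python
# Shared "database": one customer balance (hypothetical value).
database = {"balance": 100.00}

# Step 1: both users read the same starting value into their own work areas.
ryan_copy = database["balance"]
elena_copy = database["balance"]

# Step 2: each user changes the private copy.
ryan_copy += 50.00     # Ryan records a 50.00 increase
elena_copy -= 30.00    # Elena records a 30.00 decrease

# Step 3: each user writes the copy back; the later write overwrites the earlier.
database["balance"] = ryan_copy
database["balance"] = elena_copy

expected = 100.00 + 50.00 - 30.00
lost_update = database["balance"] != expected
print(database["balance"], expected, lost_update)
```

Ryan's increase is missing from the final balance because Elena's write was based on a value read before Ryan's update was stored.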
Avoiding the Lost Update Problem
• Batch processing
– All updates done through a special program
– Problem: data becomes out of date
– Does not work in situations that require data to be current
FIGURE 7-7: Delaying updates to the Premiere Products database to avoid the lost
update problem
Two-Phase Locking
– Locking: deny other users access to data while one user’s updates are being processed
– Transaction: set of steps completed by a DBMS to accomplish a single user task
– Two-phase locking solves lost update problem
– Growing phase: DBMS locks more rows and releases none of the locks
– Shrinking phase: DBMS releases all the locks and acquires no new locks
FIGURE 7-8: The DBMS uses a locking scheme to apply Ryan’s and Elena’s updates to
the database
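A minimal sketch of the two phases in Python. The class `TwoPhaseTransaction` and the shared `locks` dictionary are hypothetical names, and a real DBMS would make a blocked user wait rather than return `False`. The point is that a transaction may not acquire a lock after it has started releasing locks.

```python
class TwoPhaseTransaction:
    """Tracks locks for one transaction and enforces the two phases."""

    def __init__(self, name, lock_table):
        self.name = name
        self.lock_table = lock_table   # shared dict: row id -> owning transaction
        self.held = set()
        self.shrinking = False

    def lock(self, row_id):
        # Growing phase only: acquiring after a release is not allowed.
        if self.shrinking:
            raise RuntimeError("no new locks once the shrinking phase has begun")
        owner = self.lock_table.get(row_id)
        if owner not in (None, self.name):
            return False               # locked by another transaction; must wait
        self.lock_table[row_id] = self.name
        self.held.add(row_id)
        return True

    def release_all(self):
        # Shrinking phase: release every lock, acquire none.
        self.shrinking = True
        for row_id in self.held:
            del self.lock_table[row_id]
        self.held.clear()


locks = {}
balance = {"C1": 100.00}
ryan = TwoPhaseTransaction("Ryan", locks)
elena = TwoPhaseTransaction("Elena", locks)

ryan_got_lock = ryan.lock("C1")          # Ryan locks the row first
elena_blocked = not elena.lock("C1")     # Elena cannot lock it yet
balance["C1"] += 50.00                   # Ryan's update
ryan.release_all()

elena_got_lock = elena.lock("C1")        # now Elena can proceed
balance["C1"] -= 30.00                   # Elena's update sees Ryan's result
elena.release_all()
print(balance)
```

With the lock in place, Elena's update is applied to the balance that already includes Ryan's change, so neither update is lost.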
Deadlock
• Deadlock or deadly embrace
– Each of two users holds a lock on one resource and requests a lock on the resource that the other has already locked
– To minimize occurrence, make sure all programs lock records in the same order
whenever possible
• Managing deadlocks
– DBMS detects and breaks any deadlock
– DBMS chooses one user to be the victim
FIGURE 7-9: Two users experiencing deadlock
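One common way to detect deadlock is to look for a cycle among users who are waiting for each other. The sketch below assumes each waiting user waits for exactly one other user. The function `find_deadlock` and the policy of picking the last user in the cycle as the victim are illustrative choices, not the method of any particular DBMS.

```python
def find_deadlock(waits_for):
    """Return a list of users forming a cycle of waits, or None.

    waits_for maps each waiting user to the user holding the lock it needs.
    """
    for start in waits_for:
        seen = []
        user = start
        # Follow the chain of waits until it ends or repeats.
        while user in waits_for and user not in seen:
            seen.append(user)
            user = waits_for[user]
        if user in seen:
            return seen[seen.index(user):]
    return None


waits_for = {"User A": "User B", "User B": "User A"}
cycle = find_deadlock(waits_for)

victim = cycle[-1]        # arbitrary policy: pick the last user in the cycle
del waits_for[victim]     # the victim's transaction is rolled back and stops waiting
after = find_deadlock(waits_for)
print(cycle, victim, after)
```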
Locking on PC-Based DBMSs
• Usually more limited than locking facilities on enterprise DBMSs
• Programs can lock an entire table or an individual row within a table, but only one or the
other
• Programs can release any or all of the locks they currently hold
• Programs can inquire whether a given row or table is locked
Timestamping
• DBMS assigns each database update a unique time (timestamp) when the update
started
• Advantages
– Avoids need to lock rows
– Eliminates processing time needed to apply and release locks and to detect and
resolve deadlocks
• Disadvantages
– Additional disk and memory space
– Extra processing time
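A simplified Python sketch of timestamping. A counter stands in for the clock, and the names `new_timestamp`, `ryan`, and `elena` are hypothetical. Each update receives a unique timestamp when it starts, and updates are applied in timestamp order regardless of the order in which they arrive.

```python
import itertools

_counter = itertools.count(1)

def new_timestamp():
    """Hand out unique, increasing timestamps (a counter stands in for a clock)."""
    return next(_counter)

# Each update records the timestamp assigned when it started.
ryan = {"ts": new_timestamp(), "user": "Ryan", "change": +50.00}
elena = {"ts": new_timestamp(), "user": "Elena", "change": -30.00}

balance = 100.00
applied_order = []

# Updates may arrive in any order; they are processed in timestamp order.
for update in sorted([elena, ryan], key=lambda u: u["ts"]):
    balance += update["change"]
    applied_order.append(update["user"])

print(applied_order, balance)
```

No locks are taken in this sketch. The extra stored timestamps and the sorting step correspond to the additional space and processing time listed as disadvantages.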
Recover Data
• Recovery: returning database to a correct state from an incorrect state
• Simplest recovery involves using backups
– Backup or save: copy of database
Journaling
• Journaling: maintaining a journal or log of all updates
– Log is available even if database is destroyed
• Information kept in log for each transaction:
– Transaction ID
– Date and time of each update
– Before image
– After image
– Start of a transaction
– Successful completion (commit) of a transaction
FIGURE 7-10: Four sample transactions
Forward Recovery
• DBA executes a DBMS recovery program
• Recovery program applies after images of committed transactions from log to database
• Improving performance of the recovery program
– Apply the last after image of a record
FIGURE 7-12: Forward recovery
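Forward recovery can be sketched in Python as follows. The log layout (dictionaries with `txn`, `type`, `key`, `before`, and `after`) and the sample values are assumptions for illustration. The sketch starts from the backup, considers only committed transactions, and applies just the last after image of each record.

```python
backup = {"C1": 100.00, "C2": 200.00}   # state at the time of the last backup

log = [
    {"txn": 1, "type": "start"},
    {"txn": 1, "type": "update", "key": "C1", "before": 100.00, "after": 150.00},
    {"txn": 1, "type": "commit"},
    {"txn": 2, "type": "start"},
    {"txn": 2, "type": "update", "key": "C2", "before": 200.00, "after": 175.00},
    # Transaction 2 never committed, so its update must not be reapplied.
    {"txn": 3, "type": "start"},
    {"txn": 3, "type": "update", "key": "C1", "before": 150.00, "after": 160.00},
    {"txn": 3, "type": "commit"},
]

def forward_recover(backup, log):
    committed = {e["txn"] for e in log if e["type"] == "commit"}
    # Later entries overwrite earlier ones, leaving the last after image per record.
    last_after = {}
    for e in log:
        if e["type"] == "update" and e["txn"] in committed:
            last_after[e["key"]] = e["after"]
    recovered = dict(backup)
    recovered.update(last_after)
    return recovered

recovered = forward_recover(backup, log)
print(recovered)
```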
Backward Recovery
• Database not in a valid state
– Transactions stopped in midstream
– Incorrect transactions
• Backward recovery or rollback
– Undo problem transactions
– Apply before images from log to undo their updates
FIGURE 7-13: Backward recovery
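Backward recovery can be sketched in the same style. The log layout and values are hypothetical. The before images of the problem transaction are applied newest first, so the record ends up with the value it had before the transaction began.

```python
database = {"C1": 150.00, "C2": 175.00}   # current state, including a bad transaction

log = [
    {"txn": 1, "key": "C1", "before": 100.00, "after": 150.00},
    {"txn": 2, "key": "C2", "before": 200.00, "after": 190.00},
    {"txn": 2, "key": "C2", "before": 190.00, "after": 175.00},
]

def roll_back(database, log, problem_txn):
    """Undo one transaction by applying its before images, newest first."""
    restored = dict(database)
    for entry in reversed(log):
        if entry["txn"] == problem_txn:
            restored[entry["key"]] = entry["before"]
    return restored

restored = roll_back(database, log, problem_txn=2)
print(restored)
```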
Recovery on PC-Based DBMSs
• Sophisticated recovery features not available on PC-based DBMSs
• Regularly make backup copies using DBMS
– Use most recent backup for recovery
• Systems with large number of updates between backups
– Recovery features not supplied by DBMS need to be included in application
programs
Provide Security Services
• Security: prevention of unauthorized access, either intentional or accidental, to a
database
• Most common security features used by DBMSs:
– Encryption
– Authentication
– Authorizations
– Views
Encryption
• Encryption: converts data to a format indecipherable to another program and stores it in
an encrypted format
• Encryption process is transparent to a legitimate user
• Decrypting: reversing the encryption
• In Access, encrypt a database with a password
Authentication
• Authentication: techniques for identifying the person attempting to access the DBMS
• Password: string of characters assigned by DBA to a user that must be entered for
access
• Biometrics: identify users by physical characteristics such as fingerprints, voiceprints,
handwritten signatures, and facial characteristics
• Smart cards: small plastic cards with built-in circuits containing processing logic to
identify the cardholder
• Database password: string of characters assigned to database that users must enter
for accessing the database
FIGURE 7-14: Assigning a database password to the Premiere Products database
Authorizations
• DBA can use authorization rules to specify which users have what type of access to
which data
• Permissions: specify what kind of access the user has to objects in the database
• Workgroups: groups of users
Views
• View: snapshot of certain data in the database at a given moment in time
• Can be used for security purposes
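A small `sqlite3` sketch of a view used for security. The table, view name `CustomerPublic`, and data are assumptions. SQLite has no user accounts, so in a multiuser DBMS the view would be combined with authorization rules that give a user access to the view but not to the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (
        CustomerNum TEXT PRIMARY KEY, CustomerName TEXT, Balance REAL);
    INSERT INTO Customer VALUES ('148', 'Sample Customer', 6550.00);
    -- The view exposes customer names but not balances.
    CREATE VIEW CustomerPublic AS
        SELECT CustomerNum, CustomerName FROM Customer;
""")

cursor = conn.execute("SELECT * FROM CustomerPublic")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```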
Privacy
• Privacy: right of individuals to have certain information about them kept confidential
• Laws and regulations dictate some privacy rules
• Companies institute additional privacy rules
Provide Data Integrity Features
• Rules followed to ensure data is accurately and consistently updated
• Key integrity
– Foreign key and primary key constraints
• Data integrity
– Data type
– Legal values
– Format
• Four ways of handling integrity constraints:
– Constraint is ignored
– Responsibility for constraint enforcement placed on users
– Responsibility for constraint enforcement placed on programmers
– Responsibility for constraint enforcement placed on DBMS
FIGURE 7-16: Example of integrity constraints in Access
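The fourth approach, in which the DBMS enforces the constraints, can be sketched with `sqlite3`. The tables and the list of legal credit limits are assumptions for illustration. The sketch declares a primary key, a foreign key, and a legal-values rule, then shows which inserts the DBMS accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Rep (RepNum TEXT PRIMARY KEY);
    CREATE TABLE Customer (
        CustomerNum TEXT PRIMARY KEY,                      -- key integrity
        CreditLimit INTEGER
            CHECK (CreditLimit IN (5000, 7500, 10000, 15000)),  -- legal values
        RepNum TEXT REFERENCES Rep(RepNum)                 -- foreign key
    );
    INSERT INTO Rep VALUES ('20');
""")

def try_insert(values):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO Customer VALUES (?, ?, ?)", values)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert(("148", 7500, "20")),   # valid row
    try_insert(("148", 5000, "20")),   # duplicate primary key
    try_insert(("282", 6000, "20")),   # credit limit is not a legal value
    try_insert(("356", 5000, "99")),   # no rep with this number exists
]
print(results)
```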
Support Data Independence
• Data independence: can change database structure without needing to change
programs that access the database
• Types of changes:
– Adding a field
– Changing a field property (such as length)
– Creating an index
– Adding or changing a relationship
Adding a Field
• Don’t need to change any program except those programs using the new field
• SQL SELECT * FROM command will present an extra field
– Solution: list the required fields in an SQL SELECT command instead of using *
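The effect of adding a field can be checked with a short `sqlite3` sketch. The `Part` table and the new `Category` column are hypothetical. A query that uses `*` returns an extra field after the change, while a query that lists its fields is unaffected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Part (PartNum TEXT PRIMARY KEY, Description TEXT)")
conn.execute("INSERT INTO Part VALUES ('AT94', 'Iron')")

def star_width():
    """Number of fields returned by SELECT *."""
    return len(conn.execute("SELECT * FROM Part").fetchone())

def listed_width():
    """Number of fields returned when the fields are listed explicitly."""
    return len(conn.execute("SELECT PartNum, Description FROM Part").fetchone())

before = (star_width(), listed_width())
conn.execute("ALTER TABLE Part ADD COLUMN Category TEXT")
after = (star_width(), listed_width())
print(before, after)
```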
Changing the Length of a Field
• Generally, don’t need to change programs
• Need to change the program if:
– Certain portion of screen or report is set aside for the field and the space cannot
fit the new length
Creating an Index
• To create an index, enter a simple SQL command or select a few options
• Most DBMSs use the new index automatically
• For some DBMSs, need to make minor changes in already existing programs
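In SQLite, for example, a new index is picked up automatically. The sketch below uses an assumed `Customer` table and an index named `CustName`. It asks SQLite to describe its query plan before and after the index is created, and the query text itself does not change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, CustomerName TEXT)")

query = "SELECT CustomerNum FROM Customer WHERE CustomerName = ?"

def plan(sql):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("x",)).fetchall()
    return " ".join(row[-1] for row in rows)   # last column holds the detail text

plan_before = plan(query)
conn.execute("CREATE INDEX CustName ON Customer (CustomerName)")
plan_after = plan(query)

print(plan_before)
print(plan_after)
```

The exact wording of the plan text differs between SQLite versions, but the plan produced after the index exists names `CustName`.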
Adding or Changing a Relationship
• Trickiest of all
• May need to restructure database
Support Data Replication
• Replicated: duplicated
• Manage multiple copies of same data in multiple locations
• Maintained for performance or other reasons
• Ease of access and portability
• Replicas: copies
• Synchronization: DBMS exchanges all updated data between master database and a
replica
FIGURE 7-18: DBMS synchronizes two databases in a replica set
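A deliberately simplified Python sketch of synchronization. It assumes the master and the replica never change the same record between synchronizations, so no conflict handling is shown. All names and values are hypothetical.

```python
master = {"C1": 100.00, "C2": 200.00}
replica = dict(master)                 # the replica starts as a copy of the master

# Updates made independently since the last synchronization.
master_changes = {"C1": 150.00}
replica_changes = {"C3": 75.00}
master.update(master_changes)
replica.update(replica_changes)

def synchronize(master, replica, master_changes, replica_changes):
    """Exchange updated data in both directions (no conflict handling)."""
    replica.update(master_changes)
    master.update(replica_changes)

out_of_sync_before = master != replica
synchronize(master, replica, master_changes, replica_changes)
in_sync_after = master == replica
print(out_of_sync_before, in_sync_after, master)
```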
Provide Utility Services
• Utility services assist in general database maintenance
• Change database structure
• Add new indexes and delete indexes
• Use services available from operating system
• Export and import data
• Support for easy-to-use edit and query capabilities, screen generators, report
generators, etc.
• Support for procedural and nonprocedural languages
• Procedural language: must tell computer precisely how a given task is to
be accomplished
• Nonprocedural language: describe task you want computer to accomplish
• Easy-to-use menu-driven or switchboard-driven interface
Summary
• DBMS allows users to update and retrieve data in a database without needing to know
how data is structured on disk or manipulated
• DBMS must store metadata (data about the data) and make this data accessible to
users
• DBMS must support concurrent update
• Locking denies access by other users to data while DBMS processes one user’s
updates
• During deadlock (or deadly embrace), two or more users are each waiting for another user to
release a lock before they can proceed
• In timestamping, DBMS processes updates to a database in timestamp order
• DBMS must provide methods to recover a database in the event the database is
damaged
• DBMSs provide facilities for periodically making a backup copy of the database
• Enterprise DBMSs maintain a log or journal of all database updates since the last
backup; log is used in recovery process
• DBMSs provide security features (encryption, authentication, authorizations, and views)
to prevent unauthorized access to a database
• DBMS must follow rules or integrity constraints (key integrity constraints and data
integrity constraints) so that it updates data accurately and consistently
• DBMS must support data independence
• DBMS must have facility to handle data replication
• DBMS must provide utility services that assist in general maintenance of a database
READ:
Lesson 7: DBMS Functions
Database Management Systems by Pratt/Adamski
Assessment/Activities:
1. When users update and retrieve data, what tasks does a DBMS perform that are hidden
from the users?
2. How does a catalog differ from a data dictionary?
3. Describe a situation that could cause a lost update.
4. What is a transaction?
5. What is deadlock? How does it occur?
6. What is recovery?
7. When does a DBA use forward recovery? What are the forward recovery steps?
8. What is security?
9. What is authentication? Describe three types of authentication.
10. What are permissions? Explain the relationship between permissions and workgroups.
11. What is privacy? How is privacy related to security?
12. What is data independence?
Lesson 8: Database Administration
Learning Outcomes:
After successful completion of this lesson, you should be able to:
• Discuss the need for database administration
• Explain the DBA’s responsibilities in formulating and enforcing database policies for
access privileges, security, disaster planning, and archiving
• Discuss the DBA’s administrative responsibilities for DBMS evaluation and selection,
DBMS maintenance, data dictionary management, and training
• Discuss the DBA’s technical responsibilities for database design, testing, and
performance tuning
COURSE MATERIALS:
Introduction
FIGURE 8-1: DBA responsibilities
Database Policy Formulation and Enforcement
• DBA
– Formulates database policies
– Communicates policies to users
– Enforces policies
• Policies
– Access privileges
– Security
– Disaster planning
– Archiving
Access Privileges
• DBA
– Determines access privileges for all users
– Enters appropriate authorization rules in DBMS
• SQL GRANT statement
• Access privilege policy
– Documented by DBA
– Approved by top-level management
– Communicated by DBA to all users
FIGURE 8-2: Permitted and denied access privileges for Sam
FIGURE 8-4: Permitted and denied access privileges for Valerie
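Authorization rules can be pictured as a lookup of which operations a user may perform on which table. In SQL the DBA would enter such a rule with a statement of the form `GRANT SELECT ON Customer TO Sam`. The Python sketch below only models the resulting check, and the specific privileges listed are hypothetical rather than those shown in the figures.

```python
# Hypothetical authorization rules: (user, table) -> permitted operations.
rules = {
    ("Sam", "Customer"): {"SELECT"},
    ("Valerie", "Part"): {"SELECT", "UPDATE"},
}

def is_permitted(user, operation, table):
    """Return True only if a rule explicitly permits the operation."""
    return operation in rules.get((user, table), set())

checks = {
    "Sam reads Customer": is_permitted("Sam", "SELECT", "Customer"),
    "Sam updates Customer": is_permitted("Sam", "UPDATE", "Customer"),
    "Valerie updates Part": is_permitted("Valerie", "UPDATE", "Part"),
    "Unknown user reads Customer": is_permitted("Unknown", "SELECT", "Customer"),
}
print(checks)
```

Anything not explicitly permitted is denied, which is the safer default for an access privilege policy.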
Security
• Prevention of unauthorized access, intentional or accidental, to database
• DBA
– Creates security policies and procedures
– Obtains management approval of policies and procedures
– Distributes policies and procedures to authorized users
• DBMS’s security features
– Encryption
– Authentication
– Authorizations
– Views
• Additional security programs may be created or purchased
• Monitoring of database usage to detect security violations
FIGURE 8-5: Attempted security violation by Brady, who’s not an authorized user
FIGURE 8-6: Attempted security violation by Paige, who’s authorized to access some customer
data but is not authorized to access customer balances
Disaster Planning
• Damage from physical incidents
– Software/hardware/electrical
– Natural disasters
• Disaster recovery plan: ongoing and emergency actions and procedures to ensure
data availability if a disaster occurs
• Hard drive failures
– Redundant array of inexpensive/independent drives (RAID): database updates
replicated to multiple hard drives
• Electrical power loss
– Uninterruptible power supply (UPS): power source and power generator
• Duplicate backup systems
– Hot site: completely equipped with duplicate hardware, software, and data; can
switch to hot site in minutes or hours
– Warm site: duplicate hardware and software but not data; takes longer to start
processing
Archiving
• Governmental laws and regulations, for example:
– Sarbanes-Oxley Act
– Patriot Act
– HIPAA
• Auditing and financial requirements
• Data archive or archive: place where record of certain corporate data is kept
– Stored on mass storage devices
• Copies of archives and database backups must be stored off-site
FIGURE 8-7: Movement of order 21617 from the database to the archive
Other Database Administration Functions
DBMS evaluation and selection
DBMS maintenance
Data dictionary management
Training
DBMS Evaluation and Selection
Data definition
Data restructuring
Nonprocedural languages
Procedural languages
Data dictionary
Concurrent update
– Shared lock
Backup and recovery
Security
Integrity
Replication and distributed databases
Limitations
– Local area network (LAN)
Documentation and training
– Context-sensitive help
Vendor support
Performance
Portability
– Intranet
Cost
Future plans
Other considerations
DBMS Maintenance
Installation of DBMS
Configuration changes
Upgrades for new releases
Problem resolution
Special one-time processing needs
Data Dictionary Management
Data dictionary is like database catalog, but with wider range of information
Establishes naming conventions for tables, fields, indexes, etc.
Creates data definitions for tables
Creates data integrity rules and user views
Updates data dictionary
Creates and distributes reports from data dictionary
Training
Training in using DBMS and accessing database
Training of technical staff responsible for developing and maintaining database
applications
If training is provided by vendor of DBMS, DBA handles scheduling of training
Technical Functions
Database design
Testing
Performance tuning
Database Design
Establishes sound methodology for database design
Does physical-level design
Creates documentation standards
Reviews changes to requirements and manages modifications to database
Testing
Production system or live system: hardware, software, and database for users
DBA grants access to production system only to authorized users, except for:
– Troubleshooting a problem
– Addition of new or modified programs
Test system or sandbox: used by programmers to develop new programs and modify
existing programs
FIGURE 8-9: DBA controls the interaction between the test and production systems
Performance Tuning
– DBA attempts to get best performance within funding constraints
– Creating and deleting indexes
– Splitting tables
– Changing table design
– Denormalizing converts a table in third normal form to a table not in third normal
form
– Improved performance
FIGURE 8-10: Customer table for Premiere Products
FIGURE 8-11: Result of splitting the Customer table into two tables
FIGURE 8-12: Including part descriptions in the OrderLine table, which creates a first normal
form table
Summary
Database administrator (DBA) is responsible for supervising the database and use of the
DBMS
DBA formulates and enforces policies about which users can access database, portions
they may access, and the manner in which they can access it
DBA formulates and enforces policies about security by using DBMS’s security features,
special security programs, and monitoring database usage
DBA creates and implements backup and recovery procedures as part of a disaster
recovery plan
DBA formulates and enforces policies that govern management of an archive for data
DBA leads evaluation and selection of new DBMS
DBA installs and maintains DBMS
DBA maintains data dictionary, establishes naming conventions for its content, and
provides information from it to others
DBA provides database and DBMS training and coordinates and schedules training by
outside vendors
DBA verifies all information-level database designs, completes all physical-level
database designs, and creates documentation standards; also evaluates changes in
requirements
DBA controls production system, which is accessible only to authorized users; other than
under exceptional situations, programmers access a separate test system
DBA tunes database design to improve performance; includes creating and deleting
indexes, splitting tables, and denormalizing tables
READ:
Lesson 8: Database Administration
Database Management Systems by Pratt/Adamski
Assessment/Activities:
1. What is a DBA? Why is this position necessary?
2. What are the DBA’s responsibilities regarding security?
3. What are data archives? What purpose do they serve? What is the relationship between
a database and its data archives?
4. What is a shared lock? What is an exclusive lock?
5. What is context-sensitive help?
6. After DBMS has been selected, what is the DBA’s role in DBMS maintenance?
7. Who trains computer users in an organization? What is the DBA’s role in this training?
8. What is the difference between production and test systems?
9. How can splitting a table improve performance?
Lesson 9: Database Management Approaches
Learning Outcomes:
After successful completion of this lesson, you should be able to:
Describe distributed database management systems (DDBMSs)
Discuss client/server systems
Examine the ways databases are accessed on the Web
Discuss XML and related document specification standards
Define data warehouses and explain their structure and access
Discuss the general concepts of object-oriented DBMSs
Distributed Databases
Computers at various sites
Connected by a communications network (or simply a network)
Distributed database: single logical database physically divided among networked
computers
Distributed database management system (DDBMS): supports and manipulates
distributed databases
FIGURE 9-1: Communications network
Computers in a network communicate through messages
Access delay required for every message
– Fixed amount of time
Communication time = access delay + (data volume / transmission rate)
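The formula above can be turned into a small Python function. The units (seconds, bits, bits per second) and the sample figures are assumptions chosen for easy arithmetic. Because the access delay is paid on every message, sending the same data as many small messages takes longer than sending it as one large message.

```python
def communication_time(access_delay, data_volume, transmission_rate):
    """Seconds needed to send one message.

    access_delay:      fixed delay per message, in seconds
    data_volume:       size of the message, in bits
    transmission_rate: speed of the line, in bits per second
    """
    return access_delay + data_volume / transmission_rate

# The same 5,000,000 bits sent as one message or as 100 smaller messages.
one_large = communication_time(2.0, 5_000_000, 100_000)
many_small = 100 * communication_time(2.0, 50_000, 100_000)
print(one_large, many_small)
```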
Characteristics of Distributed DBMSs
Homogeneous DDBMS: same local DBMS at each site
Heterogeneous DDBMS: at least two sites at which local DBMSs are different
Shared characteristics of DDBMSs
– Location transparency
– Replication transparency
– Fragmentation transparency
Location Transparency
Remote site: site other than one where user is
Local site: site where user is
Location transparency: users do not need to be aware of location of data in a
distributed database
Replication Transparency
Data replication creates update problems that can lead to data inconsistencies
Replication transparency: users unaware of steps taken by DDBMS to update various
copies of data
Fragmentation Transparency
• Data fragmentation: DDBMS can divide and manage a logical object among various
locations under its control
– Data placed at the location where it is most often accessed
Fragmentation transparency: users unaware of fragmentation
FIGURE 9-2: Premiere Products Part table data
FIGURE 9-3: Fragmentation of Part table data by warehouse
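Fragmentation by warehouse can be sketched in Python. The rows below are hypothetical stand-ins for Part table data. Each fragment holds the rows for one warehouse, and the union of the fragments gives back the complete table, which is what fragmentation transparency presents to users.

```python
# Hypothetical Part rows: (PartNum, Description, Warehouse).
part = [
    ("AT94", "Iron", 3),
    ("BV06", "Home Gym", 2),
    ("CD52", "Microwave Oven", 1),
    ("DL71", "Cordless Drill", 3),
]

def fragment_by_warehouse(rows):
    """Group rows so each warehouse's site stores only its own parts."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[2], []).append(row)
    return fragments

def reconstruct(fragments):
    """Union of all fragments: the single logical table users see."""
    return sorted(row for frag in fragments.values() for row in frag)

fragments = fragment_by_warehouse(part)
whole = reconstruct(fragments)
print(fragments[3])
```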
Advantages of Distributed Databases
Local control of data
Increased database capability
System availability
Improved performance
Disadvantages of Distributed Databases
Update of replicated data
Primary copy
More complex query processing
More complex treatment of concurrent update
– Local deadlock: occurs at a single site in a distributed database
– Global deadlock: involves more than one site
More complex recovery measures
– Two-phase commit: one site acts as coordinator
More difficult management of data dictionary
More complex database design
More complicated security and backup requirements
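The two-phase commit mentioned in the list above can be outlined in Python. The text states only that one site acts as coordinator. The voting detail below follows the standard form of the protocol, and the function and site names are hypothetical.

```python
def two_phase_commit(site_votes):
    """Decide the outcome of a distributed transaction.

    site_votes maps each site name to True (ready to commit) or False.
    Phase 1: the coordinator collects a vote from every site.
    Phase 2: every site commits only if all voted yes; otherwise all roll back.
    """
    decision = "commit" if all(site_votes.values()) else "rollback"
    return {site: decision for site in site_votes}

all_ready = two_phase_commit({"Site 1": True, "Site 2": True})
one_fails = two_phase_commit({"Site 1": True, "Site 2": False})
print(all_ready, one_fails)
```

Either every site applies the update or none does, which keeps the sites consistent during recovery.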
Rules for Distributed Databases (C.J. Date)
Local autonomy
No reliance on a central site
Continuous operation
Location transparency
Fragmentation transparency
Replication transparency
Distributed query processing
Distributed transaction management
Hardware independence
Operating system independence
Network independence
DBMS independence
Client/Server Systems
File server architecture
– File server: stores user files on the network
Client/server architecture
– Server: computer providing data to clients; also called back-end processor or
back-end machine
– Clients: computers connected to a network and used by users to access data; also
called front-end processors or front-end machines
FIGURE 9-4: File server architecture
FIGURE 9-5: Two-tier client/server architecture
• Two-tier architecture
– Server performs database functions
– Clients perform presentation functions
– Fat client
– Thin client
• Three-tier architecture
– Clients perform presentation functions
– Database server performs database functions
– Application servers perform business functions and interface between clients
and database server
FIGURE 9-6: Three-tier client/server architecture
Advantages of Client/Server Systems
Lower network traffic
Improved processing distribution
Thinner clients
Greater processing transparency
Increased network, hardware, and software transparency
Improved security
Decreased costs
Increased scalability
Web Access to Databases
Internet and World Wide Web (or the Web)
Web page: digital document on the Web
Web server: stores Web pages
Web client: computer requesting a Web page
Each Web page has a Uniform Resource Locator (URL)
Hypertext Transfer Protocol (HTTP): data communication method used to exchange
data on the Internet
Web browser: computer program that retrieves a Web page from a Web server and displays it on a Web client
Transmission Control Protocol/Internet Protocol (TCP/IP): standard protocol for
communication on the Internet
Web pages usually created using Hypertext Markup Language (HTML)
FIGURE 9-7: Retrieving a Web page on the Internet
Static vs. dynamic Web pages
– Static Web pages: same content for all Web clients
– Dynamic Web pages: content changes in response to inputs and choices from Web
clients
Server-side extensions or server-side scripts
Client-side extensions or client-side scripts
Three-tier Web-based architecture
– Web clients
– Web server
– Database server
FIGURE 9-8: Three-tier Web-based architecture
XML
• HTML
– Describes content and appearance of Web pages
– Does not describe structure and meaning of data
Extensible Markup Language (XML)
– Tags can define meaning and structure of data
– An XML document should begin with an XML declaration
Extensible Hypertext Markup Language (XHTML)
– Markup language based on XML
– Stricter version of HTML
Defining structure, characteristics, and relationships of data
– Document Type Definition (DTD)
– XML schema
Presentation of data
– Stylesheet
FIGURE 9-10: XML schema for the Rate element from the Rep table
FIGURE 9-11: Interaction among XML and related languages
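A short Python sketch of how XML tags carry the meaning and structure of data. The element names and values below are assumptions modeled loosely on a Rep table. The document begins with an XML declaration and is read with the standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

document = """<?xml version="1.0" encoding="UTF-8"?>
<Reps>
    <Rep>
        <RepNum>20</RepNum>
        <LastName>Kaiser</LastName>
        <Rate>0.05</Rate>
    </Rep>
</Reps>
"""

# Parse from bytes so the parser honors the encoding in the declaration.
root = ET.fromstring(document.encode("utf-8"))

# The tags identify what each value means, so a program can pick out fields by name.
rates = {
    rep.findtext("RepNum"): float(rep.findtext("Rate"))
    for rep in root.findall("Rep")
}
print(rates)
```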
Data Warehouses
Online transaction processing (OLTP) systems
– Users use transactions when interacting with an RDBMS
Data warehouse
– Subject-oriented, integrated, time-variant, nonvolatile collection of data in support of
management’s decision-making process
– Used for analysis of existing data
– Resolves performance issues suffered by operational RDBMSs and OLTPs
FIGURE 9-12: Data warehouse architecture
Data Warehouse Structure and Access
Star schema
– Fact table
– Dimension table
Online analytical processing (OLAP) software: for access to a data warehouse
Data cube: a shape for visualizing a data warehouse as a multidimensional database
Data mining: uncovering new knowledge, patterns, trends, and rules from data in a data
warehouse
FIGURE 9-13: A star schema with four dimension tables and a central fact table
FIGURE 9-14: A data cube representation of the Part, Customer, and Time dimensions
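A reduced star schema can be sketched with `sqlite3`. This version has two dimension tables instead of four, and all table names, columns, and values are hypothetical. The fact table holds the measurements and one foreign key per dimension, and an analysis query joins the fact table to the dimensions and aggregates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the things being analyzed.
    CREATE TABLE PartDim (PartNum TEXT PRIMARY KEY, Description TEXT);
    CREATE TABLE TimeDim (TimeId INTEGER PRIMARY KEY, Year INTEGER, Quarter INTEGER);
    -- Fact table: one row per measurement, with a foreign key to each dimension.
    CREATE TABLE SalesFact (
        PartNum   TEXT REFERENCES PartDim(PartNum),
        TimeId    INTEGER REFERENCES TimeDim(TimeId),
        UnitsSold INTEGER
    );
    INSERT INTO PartDim VALUES ('AT94', 'Iron'), ('BV06', 'Home Gym');
    INSERT INTO TimeDim VALUES (1, 2023, 1), (2, 2023, 2);
    INSERT INTO SalesFact VALUES ('AT94', 1, 10), ('AT94', 2, 15), ('BV06', 1, 4);
""")

# Total units sold per part per year.
totals = conn.execute("""
    SELECT p.Description, t.Year, SUM(f.UnitsSold)
    FROM SalesFact f
    JOIN PartDim p ON p.PartNum = f.PartNum
    JOIN TimeDim t ON t.TimeId = f.TimeId
    GROUP BY p.Description, t.Year
    ORDER BY p.Description
""").fetchall()
print(totals)
```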
Rules for OLAP Systems (E.F. Codd)
Multidimensional conceptual view
Transparency
Accessibility
Consistent reporting performance
Client/server architecture
Generic dimensionality
Dynamic sparse matrix handling
Multiuser support
Unrestricted, cross-dimensional operations
Intuitive data manipulation
Flexible reporting
Unlimited dimensions and aggregation levels
Object-Oriented DBMSs
Complex objects: graphics, drawings, photographs, video, sound, voice mail,
spreadsheets, etc.
RDBMSs store complex objects using special data types
– Binary large objects (BLOBs)
Object-oriented DBMSs used with applications whose focus is on complex objects
What Is an Object-Oriented DBMS?
Object: set of related attributes along with associated actions
Object-oriented database management system (OODBMS): database management
system in which data and associated actions are encapsulated into objects
Objects and Classes
Represent each entity as an object rather than a relation
List attributes vertically below object names
– Follow each attribute by name of domain
Objects can contain other objects
An object can contain a portion of another object
Methods and Messages
Methods: actions defined for a class
Defined during data definition process
Executed when user sends a message to the object
FIGURE 9-22: Two methods for the Premiere Products object-oriented database
Inheritance
• Subclass
– Every occurrence of a subclass is also considered an occurrence of the class
– Subclass inherits structure and methods of the class
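Both points about subclasses can be illustrated with a short Python sketch. The Order and SpecialOrder classes, their attributes, and the handling fee are hypothetical names chosen for illustration only.

```python
class Order:
    """Class (superclass): structure and methods shared by all orders."""

    def __init__(self, order_num: str, amount: float):
        self.order_num = order_num
        self.amount = amount

    def total(self) -> float:
        return self.amount


class SpecialOrder(Order):
    """Subclass: inherits Order's attributes and methods, adds its own."""

    def __init__(self, order_num: str, amount: float, handling_fee: float):
        super().__init__(order_num, amount)
        self.handling_fee = handling_fee

    def total(self) -> float:
        # Reuse the inherited behaviour, then extend it
        return super().total() + self.handling_fee

if __name__ == "__main__":
    special = SpecialOrder("O1", 100.0, 15.0)
    print(isinstance(special, Order))  # a SpecialOrder is also an Order
    print(special.total())
```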
Unified Modeling Language (UML)
Used to model all aspects of software development for object-oriented systems
– Includes a way to represent database designs
Class diagram: most relevant diagram type for database design
– Rectangles represent classes
– Lines joining classes represent relationships; called associations
– Visibility symbol indicates whether other classes can view or update the value of an
attribute
FIGURE 9-24: Class diagram for the Premiere Products database
Multiplicity: number of objects that can be related to an individual object
Constraints
Superclass
Generalization: relationship between a superclass and a subclass
FIGURE 9-26: Class diagram with a generalization and a constraint
Rules for OODBMSs
Complex objects
Object identity
Encapsulation
Information hiding
Types of classes
Inheritance
Late binding
Computational completeness
Extensibility
Persistence
Performance
Concurrent update support
Recovery support
Query facility
Summary
Distributed database: single logical database physically divided among computers at
several sites on a network
Location transparency, replication transparency, and fragmentation transparency are
important characteristics of DDBMSs
Two-tier client/server architecture: DBMS runs on the server, and the server sends only
the requested data to the clients
Three-tier client/server architecture: clients perform presentation functions, database
servers perform database functions, and application servers perform business functions
Web servers interact with Web clients using HTTP and TCP/IP to display HTML Web
pages
Dynamic Web pages, not static Web pages, are used in e-commerce
XML was developed because of the need for data exchange between organizations and
the inability of HTML to specify the structure and meaning of data
XHTML: markup language based on XML; stricter version of HTML
Data warehouse: subject-oriented, integrated, time-variant, nonvolatile collection of data
in support of management’s decision-making process
Users perceive data in a data warehouse as a multidimensional database in the shape
of a data cube
Data mining: uncovering new knowledge, patterns, trends, and rules from data stored in
a data warehouse
Object-oriented DBMSs deal with data as objects
Object: set of related attributes and actions associated with the attributes
OODBMS: database management system in which data and actions that operate on
the data are encapsulated into objects
UML: an approach to model all aspects of software development for object-oriented
systems
READ:
Lesson 9: Database Management Approaches
Database Management Systems by Pratt/Adamski
Assessment/Activities:
1. What is a distributed database? What is a DDBMS?
2. How does a homogeneous DDBMS differ from a heterogeneous DDBMS? Which is
more complex?
3. What is location transparency?
4. What is replication transparency?
5. What is fragmentation transparency?
6. Why is the ability to increase system capacity an advantage in a distributed database?
7. Why is increased efficiency an advantage in a distributed database?
8. What causes query processing to be more complex in a distributed database?
9. Describe the two-phase commit process. How does it work? Why is it necessary?
10. List and briefly describe the 12 rules against which you can measure DDBMSs.
11. What is a data warehouse?
12. What is the fact table in a data warehouse?
13. What are the 12 rules against which you can measure OLAP systems?
14. What is UML?
15. What is a visibility symbol in UML?
COURSE GRADING SYSTEM
Class Standing 70%
– Quizzes
– Projects/Assignments/Seatwork/Special Report
– Case Study
Midterm / Final Examinations 30%
Total 100%
FINAL GRADE = (Midterm Grade + Final Term Grade) / 2
References:
Database Management Systems, by Pratt/Adamski, 2010
Database System Concepts, Sixth Edition, by Abraham Silberschatz, Henry F. Korth, and
S. Sudarshan, 2006
Database Management Systems / Managing Database, DCAP204/DCAP402, 2011