Chapter 2-Data Models
Data Models
Data Model:
A set of concepts for describing the structure of a database, together with the constraints that the database should obey. It is independent of hardware and software constraints. Rather than representing the data as a database would see it, the data model represents the data as users see it in the real world. It serves as a bridge between the concepts that make up real-world events and processes and the physical representation of those concepts in a database. The data model is one part of the conceptual design process. It focuses on what data should be stored in the database rather than on how the data is processed.
Object-Relational Models: The most recent trend. Started with Informix Universal Server and exemplified in recent versions of systems such as Oracle 10g, DB2, and SQL Server.
Logical data model:
o Uses business names for entities and attributes: non-technical names, so that executives and managers at all levels can understand the data basis of the architectural description.
Physical data model:
o Includes tables, columns, keys, data types, validation rules, database triggers, stored procedures, domains, and access constraints.
o Uses more specific, less generic names for tables and columns, such as abbreviated column names, limited by the database management system (DBMS) and any company-defined standards.
o May be denormalized to meet performance requirements, depending on the nature of the database. An Online Transaction Processing (OLTP) database or an Operational Data Store (ODS) is usually not denormalized; denormalization is common in data warehouses.
Network Model
The model has three basic components: records, data types and links.
ADVANTAGES:
o The network model is able to model complex relationships and represents the semantics of add/delete on the relationships.
o It can handle most modeling situations using record types and relationship types.
o The language is navigational; it uses constructs like FIND, FIND member, FIND owner, FIND NEXT within set, GET, etc.
o Programmers can do optimal navigation through the database.
DISADVANTAGES:
o Navigational and procedural nature of processing.
o The database contains a complex array of pointers that thread through a set of records.
o Little scope for automated "query optimization".
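To make "navigational" concrete, the following is a minimal Python sketch of pointer-following access, not the syntax of any real network DBMS. The Record class and the helper functions are hypothetical stand-ins that only loosely mirror FIND owner, FIND NEXT within set, and GET.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity comparison only; records point at each other
class Record:
    """Hypothetical record occurrence; owner/members stand in for set pointers."""
    record_type: str
    data: dict
    owner: Optional["Record"] = None
    members: List["Record"] = field(default_factory=list)


def connect(owner: Record, member: Record) -> None:
    """Insert `member` into the set occurrence owned by `owner`."""
    member.owner = owner
    owner.members.append(member)


def find_owner(member: Record) -> Optional[Record]:
    """Loose analogue of FIND owner: follow the member's owner pointer."""
    return member.owner


def find_next_within_set(current: Record) -> Optional[Record]:
    """Loose analogue of FIND NEXT within set: next member under the same owner."""
    if current.owner is None:
        return None
    siblings = current.owner.members
    position = next(i for i, r in enumerate(siblings) if r is current)
    return siblings[position + 1] if position + 1 < len(siblings) else None


research = Record("DEPARTMENT", {"name": "Research"})
smith = Record("EMPLOYEE", {"name": "Smith"})
wong = Record("EMPLOYEE", {"name": "Wong"})
connect(research, smith)
connect(research, wong)

# The program, not the DBMS, chooses the access path, one hop at a time.
names = []
current = research.members[0] if research.members else None
while current is not None:
    names.append(current.data["name"])        # loose analogue of GET
    current = find_next_within_set(current)
```

Because every step of the traversal is written by the programmer, there is little left for the system to optimize, which is the disadvantage noted above.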
Hierarchical Model
In this data model, the relationships between logical record types have a hierarchical representation.
ADVANTAGES:
o The hierarchical model is simple to construct and operate on.
o It corresponds to a number of natural hierarchically organized domains, e.g., assemblies in manufacturing, personnel organization in companies.
o The language is simple; it uses constructs like GET, GET UNIQUE, GET NEXT, GET NEXT WITHIN PARENT, etc.
DISADVANTAGES:
o Navigational and procedural nature of processing.
o The database is visualized as a linear arrangement of records.
o Little scope for "query optimization".
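The "linear arrangement of records" can be pictured as a preorder walk of the tree: a parent first, then each child subtree in turn. The sketch below is an illustration in Python under that assumption. Segment and both functions are hypothetical names, and the second function only approximates what GET NEXT WITHIN PARENT does in a real hierarchical DBMS.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Segment:
    """Hypothetical record in a hierarchy: a name plus its child records."""
    name: str
    children: List["Segment"] = field(default_factory=list)


def hierarchical_sequence(root: Segment) -> Iterator[Segment]:
    """Yield records in preorder: the parent, then each child subtree in order."""
    yield root
    for child in root.children:
        yield from hierarchical_sequence(child)


def next_within_parent(parent: Segment) -> Iterator[Segment]:
    """Approximate GET NEXT WITHIN PARENT: walk only the parent's descendants."""
    for child in parent.children:
        yield from hierarchical_sequence(child)


company = Segment("Company", [
    Segment("Research", [Segment("Smith"), Segment("Wong")]),
    Segment("Administration", [Segment("Wallace")]),
])

# The whole database seen as one linear sequence of records.
linear = [s.name for s in hierarchical_sequence(company)]

# Records reachable while staying inside the Research department.
research_staff = [s.name for s in next_within_parent(company.children[0])]
```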
Example COMPANY Database
Requirements of the Company (oversimplified for illustrative purposes):
o The company is organized into DEPARTMENTs. Each department has a name, a number, and an employee who manages the department. We keep track of the start date of the department manager.
o Each department controls a number of PROJECTs. Each project has a name and a number and is located at a single location.
o We store each EMPLOYEE's social security number, address, salary, sex, and birthdate. Each employee works for one department but may work on several projects. We keep track of the number of hours per week that an employee currently works on each project. We also keep track of the direct supervisor of each employee.
o Each employee may have a number of DEPENDENTs. For each dependent, we keep track of their name, sex, birthdate, and relationship to the employee.
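One possible way to turn the first three requirements into tables is sketched below, using Python's built-in sqlite3 module. All table and column names (dnumber, mgr_ssn, super_ssn, and so on) are assumptions chosen for illustration, and the sample values are invented. Dependents and the hours worked per project are left out of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (
    dnumber        INTEGER PRIMARY KEY,
    dname          TEXT NOT NULL UNIQUE,
    mgr_ssn        TEXT REFERENCES employee(ssn),   -- the managing employee
    mgr_start_date TEXT
);
CREATE TABLE employee (
    ssn       TEXT PRIMARY KEY,
    address   TEXT,
    salary    REAL,
    sex       TEXT,
    birthdate TEXT,
    dno       INTEGER NOT NULL REFERENCES department(dnumber),  -- works for one department
    super_ssn TEXT REFERENCES employee(ssn)                     -- direct supervisor
);
CREATE TABLE project (
    pnumber   INTEGER PRIMARY KEY,
    pname     TEXT NOT NULL,
    plocation TEXT,                                             -- single location
    dnum      INTEGER NOT NULL REFERENCES department(dnumber)   -- controlling department
);
""")

# Invented sample data: a department, its first employee, then the manager link.
conn.execute("INSERT INTO department (dnumber, dname) VALUES (5, 'Research')")
conn.execute("INSERT INTO employee (ssn, salary, dno) VALUES ('123456789', 30000, 5)")
conn.execute(
    "UPDATE department SET mgr_ssn = '123456789', mgr_start_date = '1988-05-22' "
    "WHERE dnumber = 5"
)

# An employee assigned to a department that does not exist violates the constraint.
try:
    conn.execute("INSERT INTO employee (ssn, dno) VALUES ('999999999', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The department is inserted before its manager because the two tables reference each other; the manager link is filled in afterwards.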
Types of Attributes
1. Simple
Each entity has a single atomic value for the attribute. For example, SSN or Sex.
2. Composite
The attribute may be composed of several components. For example, Address (Apt#, House#, Street, City, State, ZipCode, Country) or Name (FirstName, MiddleName, LastName). Composition may form a hierarchy where some components are themselves composite.
3. Multi-valued
An entity may have multiple values for that attribute. For example, Color of a CAR or PreviousDegrees of a STUDENT. Denoted as {Color} or {PreviousDegrees}.
In general, composite and multi-valued attributes may be nested arbitrarily to any number of levels, although this is rare. For example, PreviousDegrees of a STUDENT is a composite multi-valued attribute denoted by {PreviousDegrees (College, Year, Degree, Field)}.
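The three kinds of attribute can be mirrored with ordinary Python types, as in the sketch below. The class and field names are assumptions that follow the examples above, and the sample values are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Name:
    """Composite attribute: Name (FirstName, MiddleName, LastName)."""
    first_name: str
    middle_name: str
    last_name: str


@dataclass(frozen=True)
class Degree:
    """One value of the composite attribute (College, Year, Degree, Field)."""
    college: str
    year: int
    degree: str
    field_of_study: str


@dataclass
class Student:
    ssn: str                     # simple: one atomic value
    name: Name                   # composite: made of components
    # multi-valued and composite: {PreviousDegrees (College, Year, Degree, Field)}
    previous_degrees: List[Degree] = field(default_factory=list)


student = Student(
    ssn="123456789",
    name=Name("John", "B", "Smith"),
    previous_degrees=[
        Degree("State College", 2001, "BSc", "Mathematics"),
        Degree("City University", 2003, "MSc", "Computer Science"),
    ],
)
degree_fields = [d.field_of_study for d in student.previous_degrees]
```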
Weak Entity Types
An entity type that does not have a key attribute of its own. A weak entity must participate in an identifying relationship type with an owner (or identifying) entity type.
Entities are identified by the combination of:
o A partial key of the weak entity type
o The particular entity they are related to in the identifying entity type
Example: Suppose that a DEPENDENT entity is identified by the dependent's first name and birthdate, and the specific EMPLOYEE that the dependent is related to. DEPENDENT is a weak entity type with EMPLOYEE as its identifying entity type via the identifying relationship type DEPENDENT_OF.
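In a relational design, one common way to represent a weak entity is a composite primary key that combines the owner's key with the partial key. The sqlite3 sketch below assumes hypothetical column names (essn, first_name, birthdate) and invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (ssn TEXT PRIMARY KEY);
CREATE TABLE dependent (
    essn         TEXT NOT NULL REFERENCES employee(ssn) ON DELETE CASCADE,
    first_name   TEXT NOT NULL,          -- partial key, part 1
    birthdate    TEXT NOT NULL,          -- partial key, part 2
    sex          TEXT,
    relationship TEXT,
    PRIMARY KEY (essn, first_name, birthdate)   -- owner key + partial key
);
""")
conn.executemany("INSERT INTO employee VALUES (?)", [("111",), ("222",)])

# The same partial key is allowed under two different owning employees.
conn.executemany(
    "INSERT INTO dependent (essn, first_name, birthdate) VALUES (?, ?, ?)",
    [("111", "Alice", "1990-01-01"), ("222", "Alice", "1990-01-01")],
)

# Repeating the partial key under the same owner is rejected.
try:
    conn.execute(
        "INSERT INTO dependent (essn, first_name, birthdate) "
        "VALUES ('111', 'Alice', '1990-01-01')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

dependent_count = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]
```

The ON DELETE CASCADE clause is a design choice in this sketch: it reflects the idea that a dependent has no identity without its employee.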
A unary relationship exists when an association is maintained within a single entity (for example, an employee who supervises other employees). A binary relationship exists when two entities are associated. A ternary relationship exists when three entities are associated.
Cardinality Mapping
The cardinality mapping of a relationship (also known as its cardinality) is the number of occurrences of one entity that can be associated (or linked) with occurrences of another entity.
There are three degrees of relationship (cardinality mappings), known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence of another entity. One-to-one relationships are rare in practice; when one does arise, consider whether the two entities should be combined into one. For example, an employee is allocated a company car, which can only be driven by that employee. Therefore, there is a one-to-one relationship between employee and company car.
One-to-Many (1:M)
This is where one occurrence of an entity relates to many occurrences of another entity. For example, taking the employee and department entities from the COMPANY example, an employee works in one department but a department has many employees. Therefore, there is a one-to-many relationship between department and employee.
Many-to-Many (M:N)
This is where many occurrences of an entity relate to many occurrences of another entity. The definition is included here for completeness, because the normalization process discussed earlier would not leave any such relationship in the final design. As with one-to-one relationships, many-to-many relationships rarely survive in a finished design; normally they occur because an entity has been missed. For example, an employee may work on several projects at the same time and a project has a team of many employees. Therefore, there is a many-to-many relationship between employee and project.
However, in the normalization process this many-to-many relationship is resolved by introducing the entity Project Team.
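Resolving the many-to-many relationship with a Project Team entity can be sketched as a linking table that holds one row per (employee, project) pair. The sqlite3 example below uses assumed table and column names and invented sample data; the hours column follows the COMPANY requirement of tracking hours per week on each project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (ssn TEXT PRIMARY KEY);
CREATE TABLE project  (pnumber INTEGER PRIMARY KEY, pname TEXT NOT NULL);
-- The linking entity: two one-to-many relationships replace one many-to-many.
CREATE TABLE project_team (
    essn  TEXT    NOT NULL REFERENCES employee(ssn),
    pno   INTEGER NOT NULL REFERENCES project(pnumber),
    hours REAL,
    PRIMARY KEY (essn, pno)
);
""")
conn.executemany("INSERT INTO employee VALUES (?)", [("111",), ("222",)])
conn.executemany("INSERT INTO project VALUES (?, ?)", [(1, "ProductX"), (2, "ProductY")])
conn.executemany(
    "INSERT INTO project_team VALUES (?, ?, ?)",
    [("111", 1, 32.5), ("111", 2, 7.5), ("222", 1, 40.0)],
)

# One employee, many projects.
projects_of_111 = [row[0] for row in conn.execute(
    "SELECT p.pname FROM project_team t JOIN project p ON p.pnumber = t.pno "
    "WHERE t.essn = ? ORDER BY p.pnumber", ("111",))]

# One project, many employees.
team_of_project_1 = [row[0] for row in conn.execute(
    "SELECT essn FROM project_team WHERE pno = ? ORDER BY essn", (1,))]
```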
Database Schema: The description of a database. Includes descriptions of the database structure and the constraints that should hold on the database.
Schema Diagram: A diagrammatic display of (some aspects of) a database schema.
Schema Construct: A component of the schema or an object within the schema, e.g., STUDENT, COURSE.
Database Instance: The actual data stored in a database at a particular moment in time. Also called database state (or occurrence).
Distinction
o The database schema changes very infrequently. The database state changes every time the database is updated.
o Schema is also called intension, whereas state is called extension.
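The difference between schema and state can be observed directly in a small sqlite3 session. In the sketch below the student table and the two helper functions are assumptions for illustration; sqlite_master is SQLite's own catalog table, which stores the schema text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)


def schema(connection):
    """The intension: table definitions as recorded in SQLite's catalog."""
    rows = connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


def state(connection):
    """The extension: the rows stored at this moment."""
    return connection.execute(
        "SELECT student_number, name FROM student ORDER BY student_number"
    ).fetchall()


schema_before, state_before = schema(conn), state(conn)
conn.execute("INSERT INTO student VALUES (17, 'Smith')")   # an update to the database
schema_after, state_after = schema(conn), state(conn)

schema_unchanged = schema_before == schema_after
state_changed = state_before != state_after
```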
Data Independence
Logical Data Independence: The capacity to change the conceptual schema without having to change the external schemas and their application programs.
Physical Data Independence: The capacity to change the internal schema without having to change the conceptual schema.
When a schema at a lower level is changed, only the mappings between this schema and higher-level schemas need to be changed in a DBMS that fully supports data independence. The higher-level schemas themselves are unchanged. Hence, the application programs need not be changed since they refer to the external schemas.
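A rough illustration of both kinds of independence, using sqlite3: a view plays the role of an external schema, an index stands in for an internal-schema change, and splitting a column stands in for a conceptual-schema change. All names (employee_directory, employee_v2, APP_QUERY) are hypothetical, and a single-file SQLite database only approximates the three-level architecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT, salary REAL);
INSERT INTO employee VALUES ('123456789', 'John Smith', 30000);
-- External schema: the only object the application refers to.
CREATE VIEW employee_directory AS SELECT ssn, name FROM employee;
""")

# The application program: this query text never changes below.
APP_QUERY = "SELECT name FROM employee_directory WHERE ssn = ?"


def lookup_name(ssn):
    row = conn.execute(APP_QUERY, (ssn,)).fetchone()
    return row[0] if row else None


name_before = lookup_name("123456789")

# Physical change: add an index. Tables and views are untouched.
conn.execute("CREATE INDEX idx_employee_name ON employee(name)")
name_after_index = lookup_name("123456789")

# Logical change: split name into fname/lname, then redefine only the mapping (the view).
conn.executescript("""
DROP VIEW employee_directory;
CREATE TABLE employee_v2 (ssn TEXT PRIMARY KEY, fname TEXT, lname TEXT, salary REAL);
INSERT INTO employee_v2 VALUES ('123456789', 'John', 'Smith', 30000);
DROP TABLE employee;
CREATE VIEW employee_directory AS
    SELECT ssn, fname || ' ' || lname AS name FROM employee_v2;
""")
name_after_split = lookup_name("123456789")
```

In both cases only the stored structures or the view definition changed; the query held in APP_QUERY stayed the same.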
DBMS Languages
Data Definition Language (DDL):
o Used by the DBA and database designers to specify the conceptual schema of a database.
o In many DBMSs, the DDL is also used to define internal and external schemas (views).
o In some DBMSs, a separate storage definition language (SDL) and view definition language (VDL) are used to define internal and external schemas.
Data Manipulation Language (DML):
o Used to specify database retrievals and updates.
o DML commands (data sublanguage) can be embedded in a general-purpose programming language (host language), such as COBOL, C or an Assembly Language.
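As a small illustration, the sketch below uses Python as the host language and SQL, via the standard sqlite3 module, as the data sublanguage. The course table, the view, and the sample values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema (a table) and an external view over it.
conn.execute(
    "CREATE TABLE course ("
    " course_number TEXT PRIMARY KEY,"
    " course_name   TEXT NOT NULL,"
    " credit_hours  INTEGER)"
)
conn.execute(
    "CREATE VIEW course_titles AS SELECT course_number, course_name FROM course"
)

# DML: updates and retrievals issued from the host language.
conn.execute(
    "INSERT INTO course VALUES (?, ?, ?)",
    ("CS1310", "Intro to Computer Science", 4),
)
conn.execute(
    "UPDATE course SET credit_hours = ? WHERE course_number = ?",
    (3, "CS1310"),
)
titles = conn.execute(
    "SELECT course_number, course_name FROM course_titles"
).fetchall()
credit_hours = conn.execute(
    "SELECT credit_hours FROM course WHERE course_number = ?", ("CS1310",)
).fetchone()[0]
```

Here the SQL statements are passed to the DBMS through library calls, which corresponds most closely to a procedure-call style of embedding rather than a pre-compiler.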
DBMS Interfaces
o Stand-alone query language interfaces.
o Programmer interfaces for embedding DML in programming languages:
   - Pre-compiler Approach
   - Procedure (Subroutine) Call Approach
o User-friendly interfaces:
   - Menu-based, popular for browsing on the web
   - Forms-based, designed for naive users
   - Graphics-based (Point and Click, Drag and Drop, etc.)
   - Natural language: requests in written English
   - Combinations of the above