Object Based Database

This document discusses object-based databases and object-relational data models. It covers topics such as complex data types, structured types and inheritance in SQL, array and multiset types, object identity and reference types, and persistent programming languages. The key aspects covered are extending the relational model to allow complex attribute types like nested relations, arrays, and object references while preserving the declarative nature of SQL queries.

Uploaded by

prasanth6
Copyright
© Attribution Non-Commercial (BY-NC)
UNIT - V

Object-Based Database
Object-Based Databases

• Complex Data Types and Object Orientation
• Structured Data Types and Inheritance in SQL
• Table Inheritance
• Array and Multiset Types in SQL
• Object Identity and Reference Types in SQL
• Implementing O-R Features
• Persistent Programming Languages
• Comparison of Object-Oriented and Object-Relational Databases
Object-Relational Data Models
• Extend the relational data model by including object
orientation and constructs to deal with added data
types.
• Allow attributes of tuples to have complex types,
including non-atomic values such as nested relations.
• Preserve relational foundations, in particular the
declarative access to data, while extending modeling
power.
• Upward compatibility with existing relational
languages.
Complex Data Types

• Motivation:
– Permit non-atomic domains (atomic ≡ indivisible)
– Example of non-atomic domain: set of integers, or set of tuples
– Allows more intuitive modeling for applications with complex
data
• Intuitive definition:
– allow relations whenever we allow atomic (scalar) values —
relations within relations
– Retains mathematical foundation of relational model
– Violates first normal form.
Example of a Nested Relation
• Example: library information system
• Each book has
– title,
– a set of authors,
– Publisher, and
– a set of keywords
• Non-1NF relation books
4NF Decomposition of Nested Relation

• Remove awkwardness of flat-books by assuming that the
following multivalued dependencies hold:
– title ↠ author
– title ↠ keyword
– title ↠ pub-name, pub-branch
• Decompose flat-books into 4NF using the
schemas:
– (title, author )
– (title, keyword )
– (title, pub-name, pub-branch )
4NF Decomposition of flat–books
Problems with 4NF Schema
• 4NF design requires users to include joins in their
queries.
• 1NF relational view flat-books defined by join of
4NF relations:
– eliminates the need for users to perform joins,
– but loses the one-to-one correspondence between
tuples and documents.
– And has a large amount of redundancy
• Nested relations representation is much more
natural here.
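The trade-off above can be sketched in Python. This is a minimal illustration, assuming a single made-up book record (the values and the names nested_books, title_author, title_keyword, title_pub and flat_books are assumptions, not part of the slides). The nested form keeps one object per book, while joining the 4NF relations back into flat-books yields one tuple per author/keyword combination.

```python
from itertools import product

# One nested (non-1NF) tuple: authors and keywords are collections inside it.
# The sample values are illustrative assumptions.
nested_books = [
    {
        "title": "Compilers",
        "authors": ["Smith", "Jones"],
        "publisher": ("McGraw-Hill", "New York"),
        "keywords": ["parsing", "analysis"],
    }
]

# 4NF decomposition: one relation per dependency on title.
title_author = [(b["title"], a) for b in nested_books for a in b["authors"]]
title_keyword = [(b["title"], k) for b in nested_books for k in b["keywords"]]
title_pub = [(b["title"], *b["publisher"]) for b in nested_books]

# flat-books is the join of the three 4NF relations on title.
flat_books = [
    (t1, author, keyword, pub_name, pub_branch)
    for (t1, author), (t2, keyword), (t3, pub_name, pub_branch)
    in product(title_author, title_keyword, title_pub)
    if t1 == t2 == t3
]

for row in flat_books:
    print(row)
```

One nested record becomes four flat tuples here (2 authors × 2 keywords), which is the redundancy and the loss of one-to-one correspondence that the slide refers to.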
Complex Types and SQL:1999
• Extensions to SQL to support complex types
include:
– Collection and large object types
• Nested relations are an example of collection types
– Structured types
• Nested record structures like composite attributes
– Inheritance
– Object orientation
• Including object identifiers and references
• Our description is mainly based on the SQL:1999
standard
– Not fully implemented in any database system currently
– But some features are present in each of the major
commercial database systems
• Read the manual of your database system to see what it
supports
Structured Types and Inheritance in SQL
• Structured types can be declared and used in SQL
create type Name as
(firstname varchar(20),
lastname varchar(20))
final
create type Address as
(street varchar(20),
city varchar(20),
zipcode varchar(20))
not final
– Note: final and not final indicate whether subtypes can be created
• Structured types can be used to create tables with composite
attributes
create table customer (
name Name,
address Address,
dateOfBirth date)
• Dot notation used to reference components: name.firstname
Structured Types (cont.)
• User-defined row types
create type CustomerType as (
name Name,
address Address,
dateOfBirth date)
not final
• Can then create a table whose rows are a user-
defined type
create table customer of CustomerType
Methods

• Can add a method declaration with a structured type.
method ageOnDate (onDate date)
returns interval year
• Method body is given separately.
create instance method ageOnDate (onDate date)
returns interval year
for CustomerType
begin
return onDate - self.dateOfBirth;
end
• We can now find the age of each customer:
select name.lastname, ageOnDate (current_date)
from customer
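For readers more used to object-oriented languages, the ageOnDate method can be mirrored in Python. This is only an analogy under stated assumptions: the class name Customer, its fields, and the choice to return whole years as an integer (standing in for SQL's interval year) are illustrative and not part of the SQL standard.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Customer:
    # Hypothetical stand-in for a row of type CustomerType.
    lastname: str
    date_of_birth: date

    def age_on_date(self, on_date: date) -> int:
        """Whole years between date_of_birth and on_date (self is the row)."""
        had_birthday = (on_date.month, on_date.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return on_date.year - self.date_of_birth.year - (0 if had_birthday else 1)


customers = [Customer("Hayes", date(1990, 6, 15))]

# Analogue of: select name.lastname, ageOnDate(current_date) from customer
for c in customers:
    print(c.lastname, c.age_on_date(date.today()))
```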
Inheritance
• Suppose that we have the following type definition for people:
create type Person
(name varchar(20),
address varchar(20))
• Using inheritance to define the student and teacher types
create type Student
under Person
(degree varchar(20),
department varchar(20))
create type Teacher
under Person
(salary integer,
department varchar(20))
• Subtypes can redefine methods by using overriding method
in place of method in the method declaration
Multiple Inheritance
• SQL:1999 and SQL:2003 do not support multiple
inheritance
• If our type system supports multiple inheritance, we
can define a type for teaching assistant as follows:
create type TeachingAssistant
under Student, Teacher
• To avoid a conflict between the two occurrences of
department we can rename them
create type TeachingAssistant
under
Student with (department as student_dept),
Teacher with (department as teacher_dept)
Consistency Requirements for Subtables
• Consistency requirements on subtables and
supertables.
– Each tuple of the supertable (e.g. people) can
correspond to at most one tuple in each of the
subtables (e.g. students and teachers)
– Additional constraint in SQL:1999:
All tuples corresponding to each other (that is, with
the same values for inherited attributes) must be
derived from one tuple (inserted into one table).
• That is, each entity must have a most specific type
• We cannot have a tuple in people corresponding to a
tuple each in students and teachers
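The consistency requirements can be expressed as a small check. The sketch below assumes, purely for illustration, that each tuple is identified by its name value and that the tables are given as plain Python collections; the function name subtable_violations is hypothetical.

```python
def subtable_violations(people, students, teachers):
    """Return (orphans, overlaps) for tables keyed by name.

    orphans:  subtable tuples with no corresponding supertable tuple
    overlaps: entities present in both subtables, so they have no single
              most specific type
    """
    people, students, teachers = set(people), set(students), set(teachers)
    orphans = sorted((students | teachers) - people)
    overlaps = sorted(students & teachers)
    return orphans, overlaps


# Illustrative data: "Mary" is both a student and a teacher.
people = ["John", "Mary", "Ann"]
students = ["John", "Mary"]
teachers = ["Mary", "Ann"]

print(subtable_violations(people, students, teachers))
```

Here Mary appears in both subtables, which the SQL:1999 constraint forbids unless there is a common subtable such as teaching assistants.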
Array and Multiset Types in SQL
• Example of array and multiset declaration:
create type Publisher as
(name varchar(20),
branch varchar(20))
create type Book as
(title varchar(20),
author-array varchar(20) array [10],
pub-date date,
publisher Publisher,
keyword-set varchar(20) multiset )
create table books of Book
• Similar to the nested relation books, but with array of
authors instead of set
Creation of Collection Values
• Array construction
array ['Silberschatz', 'Korth', 'Sudarshan']
• Multisets
– multiset ['computer', 'database', 'SQL']
• To create a tuple of the type defined by the books
relation:
('Compilers',
array['Smith', 'Jones'],
Publisher ('McGraw-Hill', 'New York'),
multiset ['parsing', 'analysis'])
• To insert the preceding tuple into the relation books
insert into books
values
('Compilers', array['Smith', 'Jones'],
Publisher ('McGraw-Hill', 'New York'),
multiset ['parsing', 'analysis'])
Querying Collection-Valued Attributes
• To find all books that have the word “database” as a keyword,
select title
from books
where 'database' in (unnest(keyword-set ))
• We can access individual elements of an array by using indices
– E.g.: If we know that a particular book has three authors, we could write:
select author-array[1], author-array[2], author-array[3]
from books
where title = 'Database System Concepts'
• To get a relation containing pairs of the form “title, author-name” for
each book and each author of the book
select B.title, A.author
from books as B, unnest (B.author-array) as A (author )
• To retain ordering information we add a with ordinality clause
select B.title, A.author, A.position
from books as B,
unnest (B.author-array) with ordinality as A (author, position )
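What unnest and with ordinality compute can be sketched in Python. This is an analogy, not SQL semantics in full: the books list, its field names and the helper names are assumptions chosen to mirror the Book type, and positions start at 1 as in SQL arrays.

```python
# Illustrative stand-in for the books table.
books = [
    {
        "title": "Database System Concepts",
        "author_array": ["Silberschatz", "Korth", "Sudarshan"],
        "keyword_set": ["database", "SQL"],
    },
    {
        "title": "Compilers",
        "author_array": ["Smith", "Jones"],
        "keyword_set": ["parsing", "analysis"],
    },
]


def titles_with_keyword(books, keyword):
    # select title from books where keyword in (unnest(keyword-set))
    return [b["title"] for b in books if keyword in b["keyword_set"]]


def unnest_authors(books):
    # unnest(B.author-array) with ordinality as A(author, position)
    return [
        (b["title"], author, position)
        for b in books
        for position, author in enumerate(b["author_array"], start=1)
    ]


print(titles_with_keyword(books, "database"))
for row in unnest_authors(books):
    print(row)
```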
Object-Identity and Reference Types
• Define a type Department with a field name and a field head
which is a reference to the type Person, with table people as
scope:
create type Department (
name varchar (20),
head ref (Person) scope people)
• We can then create a table departments as follows
create table departments of Department
• We can omit the declaration scope people from the type
declaration and instead make an addition to the create table
statement:
create table departments of Department
(head with options scope people)
Initializing Reference-Typed Values
• To create a tuple with a reference value, we
can first create the tuple with a null reference
and then set the reference separately:
insert into departments
values ('CS', null)
update departments
set head = (select p.person_id
from people as p
where name = 'John')
where name = 'CS'
Persistent Programming Languages

• Languages extended with constructs to handle persistent
data
• Programmer can manipulate persistent data directly
– no need to fetch it into memory and store it back to disk (unlike
embedded SQL)
• Persistent objects:
– by class - explicit declaration of persistence
– by creation - special syntax to create persistent objects
– by marking - make objects persistent after creation
– by reachability - object is persistent if it is declared explicitly to
be so or is reachable from a persistent object
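Persistence by reachability amounts to a graph traversal from the explicitly persistent roots. A minimal sketch, assuming objects are identified by strings and references are given as an adjacency mapping (both are illustrative assumptions, as is the function name):

```python
from collections import deque


def persistent_objects(roots, references):
    """Return every object reachable from the explicitly persistent roots."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        obj = queue.popleft()
        for target in references.get(obj, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Illustrative object graph: "temp" is not reachable from the root.
references = {
    "bank": ["account1", "account2"],
    "account1": ["owner1"],
    "temp": ["owner1"],
}

print(sorted(persistent_objects(["bank"], references)))
```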
Object Identity and Pointers

• Degrees of permanence of object identity
– Intraprocedure: only during execution of a single procedure
– Intraprogram: only during execution of a single program or query
– Interprogram: across program executions, but not if data-storage
format on disk changes
– Persistent: interprogram, plus persistent across data
reorganizations
• Persistent versions of C++ and Java have been
implemented
– C++
• ODMG C++
• ObjectStore
– Java
• Java Data Objects (JDO)
Comparison of O-O and O-R Databases

• Relational systems
– simple data types, powerful query languages, high protection.
• Persistent-programming-language-based OODBs
– complex data types, integration with programming language,
high performance.
• Object-relational systems
– complex data types, powerful query languages, high
protection.
• Note: Many real systems blur these boundaries
– E.g. persistent programming language built as a wrapper on a
relational database offers first two benefits, but may have
poor performance.
XML
• Structure of XML Data
• XML Document Schema
• Querying and Transformation
• Application Program Interfaces to XML
• Storage of XML Data
• XML Applications
Introduction
• XML: Extensible Markup Language
• Defined by the WWW Consortium (W3C)
• Derived from SGML (Standard Generalized
Markup Language), but simpler to use than SGML
• Documents have tags giving extra information
about sections of the document
– E.g. <title> XML </title> <slide> Introduction …</slide>
• Extensible, unlike HTML
– Users can add new tags, and separately specify how the
tag should be handled for display
XML Introduction (Cont.)
• The ability to specify new tags, and to create nested tag
structures make XML a great way to exchange data, not
just documents.
– Much of the use of XML has been in data exchange applications, not as
a replacement for HTML
• Tags make data (relatively) self-documenting
– E.g.
<bank>
<account>
<account_number> A-101 </account_number>
<branch_name> Downtown </branch_name>
<balance> 500 </balance>
</account>
<depositor>
<account_number> A-101 </account_number>
<customer_name> Johnson </customer_name>
</depositor>
</bank>
XML: Motivation
• Data interchange is critical in today’s networked world
– Examples:
• Banking: funds transfer
• Order processing (especially inter-company orders)
• Scientific data
– Chemistry: ChemML, …
– Genetics: BSML (Bio-Sequence Markup Language), …
– Paper flow of information between organizations is being
replaced by electronic flow of information
• Each application area has its own set of standards for
representing information
• XML has become the basis for all new generation data
interchange formats
XML Motivation (Cont.)
• Earlier generation formats were based on plain text with line headers
indicating the meaning of fields
– Similar in concept to email headers
– Does not allow for nested structures, no standard “type” language
– Tied too closely to low level document structure (lines, spaces, etc)
• Each XML based standard defines what are valid elements, using
– XML type specification languages to specify the syntax
• DTD (Document Type Definition)
• XML Schema
– Plus textual descriptions of the semantics
• XML allows new tags to be defined as required
– However, this may be constrained by DTDs
• A wide variety of tools is available for parsing, browsing and
querying XML documents/data
Comparison with Relational Data

• Inefficient: tags, which in effect represent
schema information, are repeated
• Better than relational tuples as a data-exchange
format
– Unlike relational tuples, XML data is self-documenting
due to presence of tags
– Non-rigid format: tags can be added
– Allows nested structures
– Wide acceptance, not only in database systems, but
also in browsers, tools, and applications
Structure of XML Data

• Tag: label for a section of data
• Element: section of data beginning with <tagname> and
ending with matching </tagname>
• Elements must be properly nested
– Proper nesting
• <account> … <balance> …. </balance> </account>
– Improper nesting
• <account> … <balance> …. </account> </balance>
– Formally: every start tag must have a unique matching end tag, that is
in the context of the same parent element.
• Every document must have a single top-level element
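These well-formedness rules are enforced by any XML parser. A short sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are illustrative:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as a single well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


proper = "<account><balance>400</balance></account>"
improper = "<account><balance>400</account></balance>"  # tags cross over
two_roots = "<account/><account/>"                       # no single top-level element

for sample in (proper, improper, two_roots):
    print(is_well_formed(sample), sample)
```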
Example of Nested Elements
<bank-1>
<customer>
<customer_name> Hayes </customer_name>
<customer_street> Main </customer_street>
<customer_city> Harrison </customer_city>
<account>
<account_number> A-102 </account_number>
<branch_name> Perryridge </branch_name>
<balance> 400 </balance>
</account>
<account>

</account>
</customer>
.
.
</bank-1>
Motivation for Nesting
• Nesting of data is useful in data transfer
– Example: elements representing customer_id, customer_name,
and address nested within an order element
• Nesting is not supported, or discouraged, in relational
databases
– With multiple orders, customer name and address are stored
redundantly
– normalization replaces nested structures in each order by
foreign key into table storing customer name and address
information
– Nesting is supported in object-relational databases
• But nesting is appropriate when transferring data
– External application does not have direct access to data
referenced by a foreign key
Structure of XML Data (Cont.)
• Mixture of text with sub-elements is legal in XML.
– Example:
<account>
This account is seldom used any more.
<account_number> A-102</account_number>
<branch_name> Perryridge</branch_name>
<balance>400 </balance>
</account>
– Useful for document markup, but discouraged for data
representation
Attributes
• Elements can have attributes
<account acct-type = "checking" >
<account_number> A-102 </account_number>
<branch_name> Perryridge </branch_name>
<balance> 400 </balance>
</account>
• Attributes are specified by name=value pairs
inside the starting tag of an element
• An element may have several attributes, but
each attribute name can only occur once
<account acct-type = "checking" monthly-fee="5">
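A brief sketch of how attributes and subelements look from a program, using Python's standard xml.etree.ElementTree. The document string reuses the account example; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

account = ET.fromstring(
    '<account acct-type="checking" monthly-fee="5">'
    "<account_number>A-102</account_number>"
    "<branch_name>Perryridge</branch_name>"
    "<balance>400</balance>"
    "</account>"
)

# Attributes live in a name -> value mapping on the element.
print(account.attrib)
print(account.get("acct-type"))

# Subelement contents are reached by tag name.
print(account.findtext("balance"))

# Repeating an attribute name in one start tag is not well-formed XML.
try:
    ET.fromstring('<account acct-type="checking" acct-type="savings"/>')
    duplicate_accepted = True
except ET.ParseError:
    duplicate_accepted = False
print(duplicate_accepted)
```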
Attributes vs. Subelements
• Distinction between subelement and attribute
– In the context of documents, attributes are part of markup,
while subelement contents are part of the basic document
contents
– In the context of data representation, the difference is
unclear and may be confusing
• Same information can be represented in two ways
– <account account_number = "A-101"> …. </account>
– <account>
<account_number>A-101</account_number> …
</account>
– Suggestion: use attributes for identifiers of elements, and
use subelements for contents
Namespaces
• XML data has to be exchanged between organizations
• Same tag name may have different meaning in different
organizations, causing confusion on exchanged documents
• Specifying a unique string as an element name avoids confusion
• Better solution: use unique-name:element-name
• Avoid using long unique names all over document by using XML
Namespaces
<bank xmlns:FB="http://www.FirstBank.com">

<FB:branch>
<FB:branchname>Downtown</FB:branchname>
<FB:branchcity> Brooklyn </FB:branchcity>
</FB:branch>

</bank>
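A parser expands each prefixed name into the namespace URI plus the local name, so the prefix itself is only shorthand. A sketch with Python's standard xml.etree.ElementTree, assuming the namespace URI http://www.FirstBank.com for the FB prefix:

```python
import xml.etree.ElementTree as ET

doc = """<bank xmlns:FB="http://www.FirstBank.com">
  <FB:branch>
    <FB:branchname>Downtown</FB:branchname>
    <FB:branchcity>Brooklyn</FB:branchcity>
  </FB:branch>
</bank>"""

root = ET.fromstring(doc)

# The prefix-to-URI mapping used for lookups; the prefix chosen here
# need not match the one used in the document.
ns = {"FB": "http://www.FirstBank.com"}

branch = root.find("FB:branch", ns)
print(branch.tag)  # expanded name: {namespace-URI}local-name
print(branch.findtext("FB:branchname", namespaces=ns))
```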
More on XML Syntax
• Elements without subelements or text content can be
abbreviated by ending the start tag with a /> and deleting
the end tag
– <account number="A-101" branch="Perryridge" balance="200" />
• To store string data that may contain tags, without the tags
being interpreted as subelements, use CDATA as below
– <![CDATA[<account> … </account>]]>
Here, <account> and </account> are treated as just strings
CDATA stands for “character data”
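The effect of a CDATA section can be checked with Python's standard xml.etree.ElementTree. In this sketch the enclosing note element and the account text are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

note = ET.fromstring("<note><![CDATA[<account> A-101 </account>]]></note>")

# The markup inside CDATA arrives as plain text, not as a subelement.
print(repr(note.text))
print(len(note))  # number of child elements
```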
XML Document Schema

• Database schemas constrain what information can be
stored, and the data types of stored values
• XML documents are not required to have an associated
schema
• However, schemas are very important for XML data
exchange
– Otherwise, a site cannot automatically interpret data received
from another site
• Two mechanisms for specifying XML schema
– Document Type Definition (DTD)
• Widely used
– XML Schema
• Newer, increasing use
Document Type Definition (DTD)

• The type of an XML document can be specified using a
DTD
• DTD constrains structure of XML data
– What elements can occur
– What attributes can/must an element have
– What subelements can/must occur inside each element, and how
many times.
• DTD does not constrain data types
– All values represented as strings in XML
• DTD syntax
– <!ELEMENT element (subelements-specification) >
– <!ATTLIST element (attributes) >
Element Specification in DTD
• Subelements can be specified as
– names of elements, or
– #PCDATA (parsed character data), i.e., character strings
– EMPTY (no subelements) or ANY (anything can be a subelement)
• Example
<!ELEMENT depositor (customer_name, account_number)>
<!ELEMENT customer_name (#PCDATA)>
<!ELEMENT account_number (#PCDATA)>
• Subelement specification may have regular expressions
<!ELEMENT bank ( ( account | customer | depositor)+)>
• Notation:
– “|” - alternatives
– “+” - 1 or more occurrences
– “*” - 0 or more occurrences
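Because a subelement specification is a regular expression over child element names, it can be checked with an ordinary regex. The sketch below is not a DTD validator: the two content models are translated to Python regexes by hand, and the names CONTENT_MODELS and children_match are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated content models; each child tag is followed by one space.
CONTENT_MODELS = {
    # <!ELEMENT bank ((account | customer | depositor)+)>
    "bank": r"(?:(?:account|customer|depositor) )+",
    # <!ELEMENT depositor (customer_name, account_number)>
    "depositor": r"customer_name account_number ",
}


def children_match(element):
    """Check an element's child tag sequence against its content model."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return True  # no model registered in this sketch
    child_tags = "".join(child.tag + " " for child in element)
    return re.fullmatch(model, child_tags) is not None


good = ET.fromstring(
    "<depositor><customer_name>Johnson</customer_name>"
    "<account_number>A-101</account_number></depositor>"
)
swapped = ET.fromstring(
    "<depositor><account_number>A-101</account_number>"
    "<customer_name>Johnson</customer_name></depositor>"
)
empty_bank = ET.fromstring("<bank/>")

print(children_match(good), children_match(swapped), children_match(empty_bank))
```

The swapped depositor fails because "," in a DTD fixes the order, and the empty bank fails because "+" requires at least one child.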
Bank DTD

<!DOCTYPE bank [
<!ELEMENT bank ( ( account | customer | depositor)+)>
<!ELEMENT account (account_number, branch_name, balance)>
<!ELEMENT customer (customer_name, customer_street,
customer_city)>
<!ELEMENT depositor (customer_name, account_number)>
<!ELEMENT account_number (#PCDATA)>
<!ELEMENT branch_name (#PCDATA)>
<!ELEMENT balance (#PCDATA)>
<!ELEMENT customer_name (#PCDATA)>
<!ELEMENT customer_street (#PCDATA)>
<!ELEMENT customer_city (#PCDATA)>
]>
Attribute Specification in DTD
• Attribute specification : for each attribute
– Name
– Type of attribute
• CDATA
• ID (identifier) or IDREF (ID reference) or IDREFS (multiple IDREFs)
– more on this later
– Whether
• mandatory (#REQUIRED)
• has a default value (value),
• or neither (#IMPLIED)
• Examples
– <!ATTLIST account acct-type CDATA "checking">
– <!ATTLIST customer
customer_id ID #REQUIRED
accounts IDREFS #REQUIRED >
IDs and IDREFs

• An element can have at most one attribute of type ID
• The ID attribute value of each element in an XML
document must be distinct
– Thus the ID attribute value is an object identifier
• An attribute of type IDREF must contain the ID value of
an element in the same document
• An attribute of type IDREFS contains a set of (0 or more)
ID values. Each ID value must contain the ID value of
an element in the same document
Bank DTD with Attributes

• Bank DTD with ID and IDREF attribute types.
<!DOCTYPE bank-2 [
<!ELEMENT account (branch_name, balance)>
<!ATTLIST account
account_number ID #REQUIRED
owners IDREFS #REQUIRED>
<!ELEMENT customer (customer_name, customer_street,
customer_city)>
<!ATTLIST customer
customer_id ID #REQUIRED
accounts IDREFS #REQUIRED>
… declarations for branch_name, balance, customer_name,
customer_street and customer_city
]>
XML data with ID and IDREF attributes

<bank-2>
<account account_number="A-401" owners="C100 C102">
<branch_name> Downtown </branch_name>
<balance> 500 </balance>
</account>
<customer customer_id="C100" accounts="A-401">
<customer_name> Joe </customer_name>
<customer_street> Monroe </customer_street>
<customer_city> Madison </customer_city>
</customer>
<customer customer_id="C102" accounts="A-401 A-402">
<customer_name> Mary </customer_name>
<customer_street> Erin </customer_street>
<customer_city> Newark </customer_city>
</customer>
</bank-2>
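The ID and IDREFS rules can be checked by hand on this fragment. The sketch below uses Python's standard xml.etree.ElementTree, which does not read DTDs, so the attribute roles are supplied in two dictionaries (ID_ATTRS and IDREFS_ATTRS, both assumptions mirroring the ATTLIST declarations). The function name check_ids is hypothetical.

```python
import xml.etree.ElementTree as ET

BANK_2 = """<bank-2>
  <account account_number="A-401" owners="C100 C102">
    <branch_name>Downtown</branch_name><balance>500</balance>
  </account>
  <customer customer_id="C100" accounts="A-401">
    <customer_name>Joe</customer_name>
  </customer>
  <customer customer_id="C102" accounts="A-401 A-402">
    <customer_name>Mary</customer_name>
  </customer>
</bank-2>"""

# Attribute roles copied by hand from the ATTLIST declarations.
ID_ATTRS = {"account": "account_number", "customer": "customer_id"}
IDREFS_ATTRS = {"account": "owners", "customer": "accounts"}


def check_ids(root):
    """Return (ids, duplicates, dangling) for the document rooted at root."""
    ids, duplicates, dangling = {}, [], []
    for elem in root.iter():
        attr = ID_ATTRS.get(elem.tag)
        if attr and attr in elem.attrib:
            value = elem.attrib[attr]
            if value in ids:
                duplicates.append(value)
            ids[value] = elem
    for elem in root.iter():
        attr = IDREFS_ATTRS.get(elem.tag)
        if attr:
            for ref in elem.get(attr, "").split():
                if ref not in ids:
                    dangling.append(ref)
    return ids, duplicates, dangling


root = ET.fromstring(BANK_2)
ids, duplicates, dangling = check_ids(root)
print(duplicates, dangling)

# Dereference the owners of account A-401 through the ID index.
owners = [ids[ref].findtext("customer_name") for ref in ids["A-401"].get("owners").split()]
print(owners)
```

On the fragment as shown, A-402 is reported as dangling because no account element with that ID appears; a complete document would need to include that account.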
XML Schema

• XML Schema is a more sophisticated schema language
which addresses the drawbacks of DTDs. Supports
– Typing of values
• E.g. integer, string, etc
• Also, constraints on min/max values
– User-defined, complex types
– Many more features, including
• uniqueness and foreign key constraints, inheritance
• XML Schema is itself specified in XML syntax, unlike DTDs
– More-standard representation, but verbose
• XML Schema is integrated with namespaces
• BUT: XML Schema is significantly more complicated than
DTDs.
XML Schema Version of Bank DTD
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="bank" type="BankType"/>
<xs:element name="account">
<xs:complexType>
<xs:sequence>
<xs:element name="account_number" type="xs:string"/>
<xs:element name="branch_name" type="xs:string"/>
<xs:element name="balance" type="xs:decimal"/>
</xs:sequence>
</xs:complexType>
</xs:element>
….. definitions of customer and depositor ….
<xs:complexType name="BankType">
<xs:sequence>
<xs:element ref="account" minOccurs="0" maxOccurs="unbounded"/>
<xs:element ref="customer" minOccurs="0" maxOccurs="unbounded"/>
<xs:element ref="depositor" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
XML Schema Version of Bank DTD (Cont.)
• Choice of “xs:” was ours -- any other
namespace prefix could be chosen
• Element “bank” has type “BankType”,
which is defined separately
– xs:complexType is used later to create the
named complex type “BankType”
• Element “account” has its type defined inline
More features of XML Schema
• Attributes specified by xs:attribute tag:
– <xs:attribute name = "account_number"/>
– adding the attribute use = "required" means value must be specified
• Key constraint: “account numbers form a key for account elements
under the root bank element”:
<xs:key name = "accountKey">
<xs:selector xpath = "/bank/account"/>
<xs:field xpath = "account_number"/>
</xs:key>
• Foreign key constraint from depositor to account:
<xs:keyref name = "depositorAccountKey" refer="accountKey">
<xs:selector xpath = "/bank/depositor"/>
<xs:field xpath = "account_number"/>
</xs:keyref>
Querying and Transforming XML Data
• Translation of information from one XML schema to
another
• Querying on XML data
• Above two are closely related, and handled by the same
tools
• Standard XML querying/translation languages
– XPath
• Simple language consisting of path expressions
– XSLT
• Simple language designed for translation from XML to XML and XML
to HTML
– XQuery
• An XML query language with a rich set of features
XPath

• XPath is used to address (select) parts of documents using
path expressions
• A path expression is a sequence of steps separated by “/”
– Think of file names in a directory hierarchy
• Result of path expression: set of values that along with their
containing elements/attributes match the specified path
• E.g. /bank-2/customer/customer_name evaluated on the
bank-2 data we saw earlier returns
<customer_name>Joe</customer_name>
<customer_name>Mary</customer_name>
• E.g. /bank-2/customer/customer_name/text( )
returns the same names, but without the enclosing tags
XPath (Cont.)
• The initial “/” denotes root of the document (above the top-
level tag)
• Path expressions are evaluated left to right
– Each step operates on the set of instances produced by the
previous step
• Selection predicates may follow any step in a path, in [ ]
– E.g. /bank-2/account[balance > 400]
• returns account elements with a balance value greater than 400
• /bank-2/account[balance] returns account elements containing a
balance subelement
• Attributes are accessed using “@”
– E.g. /bank-2/account[balance > 400]/@account_number
• returns the account numbers of accounts with balance > 400
– IDREF attributes are not dereferenced automatically (more on this
later)
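The path expressions above can be approximated with Python's standard xml.etree.ElementTree, which supports only a subset of XPath: child steps and the [tag] existence predicate work, but comparisons such as balance > 400 do not, so that filter is written in Python here. The second account (A-402) is an assumption added so that the filter has something to exclude.

```python
import xml.etree.ElementTree as ET

BANK_2 = """<bank-2>
  <account account_number="A-401" owners="C100 C102">
    <branch_name>Downtown</branch_name><balance>500</balance>
  </account>
  <account account_number="A-402" owners="C102">
    <branch_name>Perryridge</branch_name><balance>300</balance>
  </account>
  <customer customer_id="C100" accounts="A-401">
    <customer_name>Joe</customer_name>
  </customer>
  <customer customer_id="C102" accounts="A-401 A-402">
    <customer_name>Mary</customer_name>
  </customer>
</bank-2>"""

root = ET.fromstring(BANK_2)  # root is the bank-2 element

# /bank-2/customer/customer_name/text()
names = [e.text for e in root.findall("customer/customer_name")]

# /bank-2/account[balance]
with_balance = root.findall("account[balance]")

# /bank-2/account[balance > 400]/@account_number
# (numeric predicate evaluated in Python, attribute read with get)
high_balance = [
    a.get("account_number")
    for a in root.findall("account")
    if float(a.findtext("balance")) > 400
]

print(names, len(with_balance), high_balance)
```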
Functions in XPath
• XPath provides several functions
– The function count() at the end of a path counts the number of
elements in the set generated by the path
• E.g. /bank-2/account[count(./customer) > 2]
– Returns accounts with > 2 customers
– Also function for testing position (1, 2, ..) of node w.r.t. siblings
• Boolean connectives and and or and function not() can be
used in predicates
• IDREFs can be referenced using function id()
– id() can also be applied to sets of references such as IDREFS and
even to strings containing multiple references separated by blanks
– E.g. /bank-2/account/id(@owners)
• returns all customers referred to from the owners attribute of account
elements.
XQuery
• XQuery is a general purpose query language for XML data
• Currently being standardized by the World Wide Web Consortium
(W3C)
– The textbook description is based on a January 2005 draft of the standard.
The final version may differ, but major features likely to stay unchanged.
• XQuery is derived from the Quilt query language, which itself borrows
from SQL, XQL and XML-QL
• XQuery uses a
for … let … where … order by … return …
syntax
for ⇔ SQL from
where ⇔ SQL where
order by ⇔ SQL order by
return ⇔ SQL select
let allows temporary variables, and has no equivalent in SQL
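The role of each clause can be mimicked in Python. This is an analogy only, not XQuery: the accounts list, the threshold and the function name flwor are illustrative assumptions, and each line is labelled with the clause it stands in for.

```python
# Illustrative data standing in for account elements.
accounts = [
    {"account_number": "A-101", "balance": "500"},
    {"account_number": "A-102", "balance": "400"},
    {"account_number": "A-201", "balance": "900"},
]


def flwor(accounts, threshold):
    rows = []
    for acct in accounts:                  # for: iterate over a sequence
        balance = float(acct["balance"])   # let: bind a temporary variable
        if balance > threshold:            # where: filter the bindings
            rows.append((balance, acct))
    rows.sort(key=lambda row: row[0], reverse=True)  # order by (descending)
    return [acct["account_number"] for _, acct in rows]  # return: build the result


print(flwor(accounts, 400))
```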
