Database Basics
[PM Jat, DAIICT, Gandhinagar]
Contents
• What is Database?
• How do we represent databases?
• DBMS based computing architecture
• Database and Database Schema
• Three Schema Architecture
What is Database?
In simple terms, a database is a collection of “relevant” “data”.
We elaborate on the terms “relevant” and “data” next.
A database is not a random collection of data; it typically
• Records all facts about some “application” context.
• Records the necessary information about all events that occur as part of a “business process”; for
example, an order is received, an item is added to the store, and so forth.
Data
A data item is an atomic value that represents a fact about some entity.
For instance, “Amit Kumar” is the value of the attribute name of a student, “28-June-1986” is the
value of the attribute date-of-birth of that student, and so forth. A set of values for the various
attributes describes an entity.
Data being atomic means that an attribute holds a “single value”.
Relevant Data
Databases are built for some purpose. Data is relevant if it is required to meet the objective for
which the database is built.
For example, do we need to record the names of an employee's dependents in a company database?
The answer depends on whether the company provides benefits to dependents of employees: if it does,
the names may be required; otherwise they are not.
We therefore say that data is “relevant” if it is required in the database, and not otherwise.
The intuition is that we can draw a boundary that separates the data of the database from the data
of the universe.
The book by Elmasri and Navathe uses the term “miniworld”, while the book by Korth uses the term
“enterprise”, to capture this notion of a “boundary” between what is relevant and what is not.
A database can be expressed as a set of sets: a collection of entity sets and interaction sets.
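As a rough illustration of the “set of sets” view, the sketch below models a tiny database with plain Python sets. The entity sets students and courses, the interaction set registrations, and all values in them are hypothetical and chosen only for illustration.

```python
# A toy database modeled as a collection of entity sets and interaction sets.
# All names and values here are hypothetical.

# Entity sets: each entity is a tuple of attribute values.
students = {("S1", "Amit Kumar"), ("S2", "Neha Shah")}
courses = {("C1", "Databases"), ("C2", "Algorithms")}

# Interaction set: each element relates one student to one course.
registrations = {("S1", "C1"), ("S2", "C1"), ("S2", "C2")}

# The database is the collection of these sets.
database = {
    "students": students,
    "courses": courses,
    "registrations": registrations,
}

# An interaction should only relate entities that exist in the entity sets.
student_ids = {sid for sid, _ in students}
course_ids = {cid for cid, _ in courses}
dangling = {(s, c) for s, c in registrations
            if s not in student_ids or c not in course_ids}
print(dangling)  # prints set()
```

Tuples stand in for entities here only because they can be members of a set; a real DBMS stores such sets as relations on disk.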
Operations on Databases
The following are the main operations on a database (also referred to as database manipulation
operations).
• Update operations: operations that change the data in a database; typically they add, modify, or
delete entities.
o Add more facts [entities and their interactions]
o Modify existing data
o Delete existing facts
Business events are what trigger database updates. In the “DA-ACAD” database, for example, each
such event updates the corresponding sets.
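How business events map onto add, modify, and delete operations can be sketched with Python's built-in sqlite3 module. The student table, its columns, and the three events are assumptions made up for this illustration; they are not taken from the actual DA-ACAD database.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id TEXT PRIMARY KEY, name TEXT, program TEXT)"
)

# Event: new students are admitted -> add facts.
conn.execute("INSERT INTO student VALUES (?, ?, ?)", ("S1", "Amit Kumar", "BTech"))
conn.execute("INSERT INTO student VALUES (?, ?, ?)", ("S2", "Neha Shah", "BTech"))

# Event: a student changes program -> modify existing data.
conn.execute("UPDATE student SET program = ? WHERE student_id = ?", ("MTech", "S1"))

# Event: a student withdraws -> delete an existing fact.
conn.execute("DELETE FROM student WHERE student_id = ?", ("S2",))

rows = conn.execute("SELECT student_id, name, program FROM student").fetchall()
print(rows)  # prints [('S1', 'Amit Kumar', 'MTech')]
```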
Authorized Access: A database is owned by some user; other users may be granted permission to view
or update it. It is also possible to grant partial access to a database, so that a user is allowed
to view only a subset of it.
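One common way to expose only a subset of a database is a view. The sketch below uses SQLite, which has no GRANT statement, so it shows only what the restricted subset looks like; in a server DBMS such as PostgreSQL, permission would then be granted on the view rather than on the base table. The employee table and the view name sales_directory are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 50000.0), (2, "Ravi", "HR", 45000.0)],
)

# The view exposes only non-sensitive columns, and only one department.
conn.execute(
    "CREATE VIEW sales_directory AS "
    "SELECT emp_id, name FROM employee WHERE dept = 'Sales'"
)

visible = conn.execute("SELECT * FROM sales_directory").fetchall()
print(visible)  # prints [(1, 'Asha')]
```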
Simple interface to update (add/modify/delete) and query the database: Databases are actually
stored in disk files, often with complex file organization. Users should not be required to deal
with such files; rather, a user should be able to view and manipulate the database as logical units
such as “entities” or “relations”. For working with databases, we have query languages such as SQL.
Database Constraints
Database constraints are basically “data existential rules”: “rules” that must hold true for the
data (values) in a database.
For example, in the DA-ACAD database
• StudentID is the key attribute of the student entity, i.e. no two student entities can have the
same value for ID.
• A course offering must have an instructor associated with it, and exactly one instructor.
• An elective can be taken only from the respective domain, and so forth.
Data that violates such rules cannot exist in the database; if it does, the database is said to be
in an invalid state, called an “inconsistent” state of the database.
“Inconsistent” here means that the database is not consistent with its constraints (rules).
Constraints are part of the “database description”, referred to as the database schema, and any
valid state of the database should satisfy these rules.
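The first two rules above can be sketched as declared constraints that the DBMS enforces. This is a minimal illustration in SQLite through Python's sqlite3 module; the table and column names are assumptions and not the real DA-ACAD schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE instructor (instructor_id TEXT PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE student (student_id TEXT PRIMARY KEY, name TEXT NOT NULL);
-- Exactly one instructor per offering: a single, mandatory reference.
CREATE TABLE offering (
    offering_id   TEXT PRIMARY KEY,
    instructor_id TEXT NOT NULL REFERENCES instructor(instructor_id)
);
""")
conn.execute("INSERT INTO instructor VALUES ('I1', 'Dr. Rao')")
conn.execute("INSERT INTO student VALUES ('S1', 'Amit Kumar')")


def try_insert(sql):
    """Run an insert; return True if accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("INSERT INTO student VALUES ('S1', 'Someone Else')"),  # duplicate key
    try_insert("INSERT INTO offering VALUES ('O1', NULL)"),           # no instructor
    try_insert("INSERT INTO offering VALUES ('O2', 'I9')"),           # unknown instructor
    try_insert("INSERT INTO offering VALUES ('O3', 'I1')"),           # valid
]
print(results)  # prints [False, False, False, True]
```

Because the rejected rows never enter the tables, the database cannot reach an inconsistent state through these statements.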
A snapshot of a database instance (its values) is referred to as a database state. In any real
system, databases keep getting updated and so continuously change state.
Database Schema
While a database instance actually holds the data, the database schema describes the “structure of
the database”. The database structure implicitly also includes the description of database
constraints. In other words, the database schema contains a description of the database structure
and its constraints.
• Structure describes
– What entities there are, or what relations if represented in the relational model.
– What “attribute values” each entity (or relation) has.
• Constraints define
– What the “domain” of each attribute of an entity (or relation) is.
– What the key attribute is.
– What interactions an entity has with other entities, and whether each interaction is mandatory.
– What the cardinality of an interaction is, that is, how many other entities “an entity of a
type” can be associated with, and so forth.
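The items listed above can be written down as an SQL DDL script. The sketch below is a hypothetical schema fragment (not the XIT schema) run against SQLite from Python. It declares a key, a domain, and a mandatory many-to-one interaction, and shows that executing the script yields structures with no data in them yet.

```python
import sqlite3

# Hypothetical schema fragment; table and column names are assumptions.
ddl = """
CREATE TABLE program (
    program_id TEXT PRIMARY KEY,                        -- key attribute
    name       TEXT NOT NULL
);
CREATE TABLE student (
    student_id TEXT PRIMARY KEY,                        -- key attribute
    name       TEXT NOT NULL,
    semester   INTEGER CHECK (semester BETWEEN 1 AND 8), -- domain of the attribute
    -- Interaction with program: mandatory (NOT NULL); each student refers to
    -- exactly one program, while a program may have many students.
    program_id TEXT NOT NULL REFERENCES program(program_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(ddl)  # creates the structures; the instance is still empty

tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
row_counts = {t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0] for t in tables}
print(row_counts)  # prints {'program': 0, 'student': 0}

# The declared domain is enforced once data arrives.
conn.execute("INSERT INTO program VALUES ('P1', 'BTech')")
try:
    conn.execute("INSERT INTO student VALUES ('S1', 'Amit Kumar', 9, 'P1')")
    domain_enforced = False
except sqlite3.IntegrityError:
    domain_enforced = True
print(domain_enforced)  # prints True
```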
Below are depictions of the XIT database schema in “entity-relationship representation” and in
relational representation.
Here we have three different descriptions of the database schema: ER, relational, and SQL DDL. Each
description serves a specific purpose.
• The schema as an SQL DDL script is a program that becomes input to a database management
system, so that an empty database instance is created on a computer system.
• Data is stored in disk files. Maintaining an efficiently searchable data file is a complex task.
A B+ tree, for example, is an efficient search structure, but implementing it on secondary storage
is complex. We would want this complex organization to be hidden from applications.
• “Data manipulation” functionality, for example searching on a key or some other attribute, or
inserting, modifying, or deleting an entity, is repeated in almost every database application. We
would not want it to be part of each application; instead we want it to be external and simply
reused by every application.
• We would want “data manipulation” functionality to be independent of any application that
accesses the database, so that it can be developed, improved, and maintained independently.
• There are also issues related to concurrent access by multiple users, authorized access, dealing
with system failures, and so on.
A DBMS is a complex system; the figure below (Fig 11: DBMS-COMPONENTS), from the book by
Elmasri and Navathe, depicts its various components and their roles.
Schema information stored as data is what we refer to as metadata, that is, information about the
database's data, itself held in the form of data. Given below is a snapshot of schema information
queried from the PostgreSQL catalog. The query used to fetch this information is given in footnote 1.
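The same idea, that schema information can itself be queried as data, can be tried without a PostgreSQL server. The sketch below uses SQLite, whose catalog is the sqlite_master table together with PRAGMA table_info; it is an analogue of the PostgreSQL query, not the same query, and the student table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id TEXT PRIMARY KEY, name TEXT, dob DATE)")

# The catalog table sqlite_master stores one row per schema object.
table_names = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
columns = [(t, col[1], col[2])
           for t in table_names
           for col in conn.execute(f"PRAGMA table_info({t})")]
print(columns)
# prints [('student', 'student_id', 'TEXT'), ('student', 'name', 'TEXT'), ('student', 'dob', 'DATE')]
```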
Data Abstraction
What is data abstraction in general? Consider the example below, which should illustrate the
notion. Observe how floating-point numbers are internally represented and how the relevant
operations are actually performed. The underlying binary representation and the procedure for
performing floating-point operations are transparent to the user.
Programming languages provide such typed abstractions over the binary representation and
manipulation of data.
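A small sketch of this abstraction in Python: the user works with the value 0.15625 as a number, while the struct module reveals the IEEE 754 double-precision bit pattern that actually represents it. The chosen value is arbitrary; it was picked because it is exactly representable in binary.

```python
import struct

x = 0.15625  # exactly representable in binary: 1.01 (base 2) x 2^-3

# Pack as an IEEE 754 double (big-endian) and view the 64 raw bits.
bits = format(int.from_bytes(struct.pack(">d", x), "big"), "064b")
sign, exponent, fraction = bits[0], bits[1:12], bits[12:]

print(sign)              # prints 0
print(int(exponent, 2))  # prints 1020 (that is, -3 plus the bias of 1023)
print(fraction[:4])      # prints 0100

# The user just writes arithmetic; the representation stays hidden.
print(x + x)  # prints 0.3125
```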
Footnote 1:
SELECT table_name, column_name, data_type FROM information_schema.columns
WHERE table_name IN (SELECT tablename FROM pg_tables WHERE schemaname = 'xit')
AND table_schema = 'xit';
For example, the relational model is a popular model for database representation and manipulation.
An RDBMS is a DBMS based on the relational model, and it provides representational and operational
transparency:
• Databases are defined and manipulated as relations (tables), while the data is actually stored
on disk.
• Manipulation operations on relations are expressed in a logical manner (rather than in terms of
how the RDBMS actually performs them). The DBMS itself uses sophisticated algorithms to perform
database manipulation operations (add/modify/delete, and query/search).
An RDBMS provides SQL support for manipulating databases. The database is manipulated as tables
rather than as data structures in disk files.
In this way the RDBMS hides the complexity of performing operations. Below are some SQL statements
for certain database operations.
The actual algorithms that perform these operations are quite complex. They work on the disk files
corresponding to the relations concerned.
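One way to see this hiding of complexity is to change the physical organization underneath a query and observe that the SQL statement and its answer stay the same. The sketch below assumes SQLite and a hypothetical student table; adding an index changes how the DBMS can search the data, not how the query is written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO student (name, city) VALUES (?, ?)",
    [(f"student{i}", "Gandhinagar" if i % 10 == 0 else "Other") for i in range(1000)],
)

query = "SELECT COUNT(*) FROM student WHERE city = 'Gandhinagar'"

# Without an index on city, the DBMS has to examine every row.
before = conn.execute(query).fetchone()[0]

# Change the physical organization: add an index on the searched column.
conn.execute("CREATE INDEX idx_student_city ON student (city)")

# The very same SQL text now may be answered using the index.
after = conn.execute(query).fetchone()[0]
print(before, after)  # prints 100 100
```

The application code never mentions files, pages, or tree structures; the choice of access method is left to the DBMS.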