DF_Unit3_DataBaseManagement
The word 'data' originates from the word 'datum', which means 'a single piece of
information'. 'Data' is the plural of 'datum'.
In computing, data is information that has been translated into a form that is efficient
for movement and processing. Data is interchangeable.
What is Database?
A database is an organized collection of data, so that it can be easily accessed and
managed.
You can organize data into tables, rows, columns, and index it to make it easier to
find relevant information.
Database handlers create a database in such a way that a single set of software
programs provides access to the data for all users.
Many dynamic websites on the World Wide Web are backed by databases. For example,
a website that checks the availability of rooms in a hotel is a dynamic website
that uses a database.
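As a minimal sketch of the ideas above (a table with rows and columns, an index, and a room-availability lookup), the following uses Python's built-in sqlite3 module. The table name `room`, its columns, and the sample rows are assumptions made up for illustration; they do not come from any real hotel system.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# One table: each row is a room, each column is one property of a room.
conn.execute(
    "CREATE TABLE room (room_no INTEGER PRIMARY KEY, floor INTEGER, is_free INTEGER)"
)
conn.executemany(
    "INSERT INTO room (room_no, floor, is_free) VALUES (?, ?, ?)",
    [(101, 1, 1), (102, 1, 0), (201, 2, 1)],
)

# An index on the column we search by helps the engine find matching rows.
conn.execute("CREATE INDEX idx_room_free ON room (is_free)")

# The availability check a hotel website might run.
free_rooms = [
    row[0]
    for row in conn.execute(
        "SELECT room_no FROM room WHERE is_free = 1 ORDER BY room_no"
    )
]
```

Here `free_rooms` holds the room numbers whose `is_free` flag is 1. A real site would run a similar query through its DBMS of choice.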
There are many databases available like MySQL, Sybase, Oracle, MongoDB,
Informix, PostgreSQL, SQL Server, etc.
The Evolution
File-Based
File-based databases were introduced in 1968. In file-based databases, data was
maintained in a flat file. Though files have many advantages, they also have
several limitations.
One of the major advantages is that the file system has various access methods, e.g.,
sequential, indexed, and random.
Hierarchical Database
1968-1980 was the era of the hierarchical database. A prominent hierarchical database
model was IBM's first DBMS, called IMS (Information Management System).
The diagram below represents the hierarchical data model. Small circles represent objects.
Like the file system, this model also had limitations, such as complex implementation,
lack of structural independence, and difficulty handling many-to-many relationships.
Network Data Model
In this model, files are related as owners and members, as in the common network
model.
This model also had limitations, such as system complexity and being difficult to
design and maintain.
Relational Database
The relational database model has two main terms: instance and schema.
A schema specifies the structure, such as the name of the relation and the name and
type of each column.
This model uses mathematical concepts such as set theory and predicate logic.
During the era of the relational database, many more models were introduced, such as
the object-oriented model and the object-relational model.
Cloud database
A cloud database lets you store, manage, and retrieve structured and
unstructured data via a cloud platform. This data is accessible over the Internet.
Cloud databases are also called database as a service (DBaaS) because they are
offered as a managed service.
Lower costs
Generally, a company does not have to invest in its own database infrastructure,
because the provider maintains and supports one or more data centers.
Automated
Increased accessibility
You can access your cloud-based database from any location, anytime. All you need
is just an internet connection.
NoSQL Database
A NoSQL database is an approach to database design that can accommodate a
wide variety of data models. NoSQL stands for "not only SQL." It is an alternative to
traditional relational databases, in which data is placed in tables and the data schema is
carefully designed before the database is built.
NoSQL can handle an extensive amount of data because of its scalability. If the data
grows, a NoSQL database scales to handle that data in an efficient manner.
High Availability
NoSQL supports auto replication. Auto replication makes it highly available because,
in case of any failure, data replicates itself to the previous consistent state.
Disadvantage of NoSQL
Open source
Management challenge
GUI tools for NoSQL database are not easily available in the market.
Backup
Backup is a great weak point for NoSQL databases. Some databases, like MongoDB,
have no powerful approaches for data backup.
o Atomicity
o Consistency
o Integrity
o Durability
o Concurrency
o Query processing
Graph Databases
A graph database is a NoSQL database. It is a graphical representation of data. It
contains nodes and edges. A node represents an entity, and each edge represents a
relationship between two nodes. Every node in a graph database has a unique
identifier.
Graph databases are beneficial for searching the relationships between data because
they highlight the relationships between relevant data.
Graph databases are very useful when the database contains complex relationships
and a dynamic schema.
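The node-and-edge structure described above can be sketched in plain Python. This is only an illustration of the data model, not how any particular graph database is implemented; the people, the `FOLLOWS` label, and the `neighbours` helper are all hypothetical.

```python
from collections import defaultdict

# Each node is an entity stored under a unique identifier.
nodes = {
    1: {"name": "Asha"},
    2: {"name": "Ravi"},
    3: {"name": "Meera"},
}

# Each edge is a labelled relationship between two nodes: (source, label, target).
edges = [
    (1, "FOLLOWS", 2),
    (2, "FOLLOWS", 3),
    (1, "FOLLOWS", 3),
]

# Group the edges by source node so relationships can be followed directly.
outgoing = defaultdict(list)
for src, label, dst in edges:
    outgoing[src].append((label, dst))


def neighbours(node_id, label):
    """Return the names of nodes reached from node_id over edges with this label."""
    return [
        nodes[dst]["name"]
        for edge_label, dst in outgoing[node_id]
        if edge_label == label
    ]


followed_by_asha = neighbours(1, "FOLLOWS")
```

Finding who a person follows is a direct lookup along edges, which is why relationship-heavy queries suit this model.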
o DBMS provides the interface to perform the various operations like creation,
deletion, modification, etc.
o DBMS allows the user to create their databases as per their requirement.
o DBMS accepts requests from the application and provides the requested data
through the operating system.
o DBMS contains a group of programs that act according to the user's
instructions.
o It provides security to the database.
Advantage of DBMS
Controls redundancy
It stores all the data in a single database file, so it can control data redundancy.
Data sharing
Backup
Disadvantage of DBMS
Size
Cost
DBMS requires a high-speed data processor and larger memory to run DBMS
software, so it is costly.
Complexity
DBMS creates additional complexity and requirements.
o Table
o Record/ Tuple
o Field/Column name /Attribute
o Instance
o Schema
o Keys
An RDBMS is a tabular DBMS that maintains the security, integrity, accuracy, and
consistency of the data.
Types of Databases
There are various types of databases used for storing different varieties of data:
1) Centralized Database
It is the type of database that stores data in a centralized database system. It
allows users to access the stored data from different locations through several
applications. These applications include an authentication process so that users
can access data securely. An example of a centralized database is a central library
that carries a central database of each library in a college/university.
o It has decreased the risk of data management, i.e., manipulation of data will not affect
the core data.
o Data consistency is maintained as it manages data in a central repository.
o It provides better data quality, which enables organizations to establish data
standards.
o It is less costly because fewer vendors are required to handle the data sets.
o The size of the centralized database is large, which increases the response time for
fetching the data.
o It is not easy to update such an extensive database system.
o If a server failure occurs, the entire data may be lost, which could be a huge loss.
2) Distributed Database
Unlike a centralized database system, in distributed systems, data is distributed
among different database systems of an organization. These database systems are
connected via communication links. Such links help the end-users to access the data
easily. Examples of the Distributed database are Apache Cassandra, HBase, Ignite,
etc.
o Homogeneous DDB: Database systems that run on the same operating system, use the
same application process, and use the same hardware devices.
o Heterogeneous DDB: Database systems that run on different operating systems, under
different application procedures, and use different hardware devices.
3) Relational Database
This database is based on the relational data model, which stores data in the form of
rows (tuples) and columns (attributes), which together form a table (relation). A relational
database uses SQL for storing, manipulating, and maintaining the data. E.F.
Codd proposed the relational model in 1970. Each table in the database carries a key that
makes each row unique from the others. Examples of relational databases are MySQL,
Microsoft SQL Server, Oracle, etc.
A relational database follows the four ACID properties:
A means Atomicity: This ensures that a data operation either completes entirely or
has no effect at all. It follows the 'all or nothing' strategy. For example, a
transaction will either be committed or will abort.
C means Consistency: Any operation performed on the data should leave the data in a
valid state. For example, the account balance before and after a transaction
should be correct, i.e., the total should remain conserved.
I means Isolation: Several users can access data from the database at the same
time, so concurrent transactions should remain isolated from each other. For
example, when multiple transactions occur at the same time, the effects of one
transaction should not be visible to the other transactions in the database.
D means Durability: It ensures that once an operation completes and commits
the data, the changes remain permanent.
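Atomicity and consistency can be illustrated with a small sketch using Python's built-in sqlite3 module. The `account` table, the balances, and the `transfer` helper are assumptions invented for this example. The `CHECK` constraint stands in for the rule that a balance must never go negative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO account (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)]
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates apply or neither does."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception is raised inside the block.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the credit above was rolled back too.
        return False


first_ok = transfer(conn, 1, 2, 30)    # enough money: committed
second_ok = transfer(conn, 1, 2, 500)  # would overdraw account 1: aborted
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

The second transfer credits account 2 before the debit fails, yet the rollback undoes that credit, so the total across both accounts stays at 150.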
4) NoSQL Database
Non-SQL/Not Only SQL is a type of database that is used for storing a wide range of
data sets. It is not a relational database as it stores data not only in tabular form but
in several different ways. It came into existence when the demand for building
modern applications increased. Thus, NoSQL presented a wide variety of database
technologies in response to the demands. We can further divide a NoSQL database
into the following four types:
a. Key-value storage: It is the simplest type of database storage, where every
single item is stored as a key (or attribute name) together with its value.
b. Document-oriented Database: A type of database used to store data as JSON-like
documents. It helps developers store data using the same document-model
format as used in the application code.
c. Graph Databases: These are used for storing vast amounts of data in a graph-like
structure. Most commonly, social networking websites use graph databases.
d. Wide-column stores: These are similar to the data represented in relational databases.
Here, data is stored in large columns together, instead of being stored in rows.
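The first two storage types in the list above can be mimicked with ordinary Python structures. This is a rough sketch of the data shapes only, not of any NoSQL product; the keys, documents, and the `find` helper are hypothetical.

```python
import json

# Key-value storage: every item is a key holding its value.
kv_store = {}
kv_store["session:42"] = "logged-in"

# Document-oriented storage: each record is a JSON-like document, and
# documents in the same collection need not share the same fields.
documents = [
    {"_id": 1, "name": "Ajeet", "course": "B.Tech"},
    {"_id": 2, "name": "Ratan", "course": "MCA", "phones": ["555-0100"]},
]


def find(collection, **criteria):
    """Return the documents whose fields match all the given criteria."""
    return [
        doc
        for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


mca_students = find(documents, course="MCA")

# A document maps directly onto the JSON text an application would exchange.
serialized = json.dumps(mca_students[0], sort_keys=True)
```

The second document carries a `phones` field that the first lacks, which a fixed relational schema would not allow without altering the table.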
5) Cloud Database
A type of database where data is stored in a virtual environment and executes over
the cloud computing platform. It provides users with various cloud computing
services (SaaS, PaaS, IaaS, etc.) for accessing the database. There are numerous cloud
platforms, but the best options are:
6) Object-oriented Databases
The type of database that uses the object-based data model approach for storing
data in the database system. The data is represented and stored as objects which are
similar to the objects used in the object-oriented programming language.
7) Hierarchical Databases
It is the type of database that stores data in the form of parent-child relationship
nodes. Here, it organizes data in a tree-like structure.
Data is stored in the form of records that are connected via links. Each child record
in the tree has only one parent. On the other hand, each parent record can
have multiple child records.
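The one-parent, many-children rule can be sketched as a small tree in Python. The `Record` class and the college example are assumptions for illustration and do not reflect how IMS or any other hierarchical DBMS stores data internally.

```python
class Record:
    """A record in the tree: one parent (None for the root), any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Follow the single parent links upward to build the path from the root."""
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


college = Record("College")
department = Record("Computer Science", parent=college)
student = Record("Ajeet", parent=department)
Record("Mahesh", parent=department)

student_path = student.path()
department_children = [child.name for child in department.children]
```

Because every record has exactly one parent, there is exactly one path from the root to any record. This is also why many-to-many relationships are awkward in this model.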
8) Network Databases
It is the database that typically follows the network data model. Here, the
representation of data is in the form of nodes connected via links between them.
Unlike the hierarchical database, it allows each record to have multiple children and
parent nodes to form a generalized graph structure.
9) Personal Database
Collecting and storing data on the user's system defines a Personal Database. This
database is basically designed for a single user.
What is RDBMS
RDBMS stands for Relational Database Management System.
Many modern database management systems, such as MS SQL Server, IBM DB2,
Oracle, MySQL, and Microsoft Access, are based on RDBMS.
How it works
Data is represented in terms of tuples (rows) in RDBMS.
Because the data is held in an organized collection of tables, it can be accessed easily in RDBMS.
What is table
The RDBMS database uses tables to store data. A table is a collection of related data
entries and contains rows and columns to store data.
ID  Name    Age  Course
1   Ajeet   24   B.Tech
2   Aryan   20   C.A
3   Mahesh  21   BCA
4   Ratan   22   MCA
5   Vimal   26   BSC
What is field
A field is a smaller entity of the table which contains specific information about every
record in the table. In the above example, the fields in the student table are id,
name, age, and course.
1 Ajeet 24 B.Tech
What is column
A column is a vertical entity in the table which contains all information associated
with a specific field in a table. For example: "name" is a column in the above table
which contains all information about student's name.
Ajeet
Aryan
Mahesh
Ratan
Vimal
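The table, field, record, and column described above can be reproduced with Python's built-in sqlite3 module. This is a sketch under the assumption that the table is called `student` and that the column types are as shown; the rows are the five from the example table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The table: a collection of related entries arranged in rows and columns.
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, course TEXT)"
)
conn.executemany(
    "INSERT INTO student (id, name, age, course) VALUES (?, ?, ?, ?)",
    [
        (1, "Ajeet", 24, "B.Tech"),
        (2, "Aryan", 20, "C.A"),
        (3, "Mahesh", 21, "BCA"),
        (4, "Ratan", 22, "MCA"),
        (5, "Vimal", 26, "BSC"),
    ],
)

cursor = conn.execute("SELECT * FROM student WHERE id = 1")

# Fields: the names of the pieces of information kept for every record.
fields = [column[0] for column in cursor.description]

# A record (row): all the field values for one student.
record = cursor.fetchone()

# A column: the values of one field across all records.
name_column = [row[0] for row in conn.execute("SELECT name FROM student ORDER BY id")]
```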
NULL Values
A NULL value in a table specifies that the field was left blank during record
creation. It is entirely different from a field filled with zero or a field that
contains a space.
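The difference between NULL, zero, and a space can be checked with a short sqlite3 sketch. The `student` table and its two rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student (id, name, age) VALUES (?, ?, ?)",
    [
        (1, "Ajeet", None),  # age left blank: stored as NULL
        (2, " ", 0),         # name is a space and age is zero: both are real values
    ],
)


def ids(sql):
    """Run a query and return the ids it selects."""
    return [row[0] for row in conn.execute(sql)]


null_age_ids = ids("SELECT id FROM student WHERE age IS NULL")
zero_age_ids = ids("SELECT id FROM student WHERE age = 0")
null_name_ids = ids("SELECT id FROM student WHERE name IS NULL")

# NULL is not equal to anything, not even NULL, so "= NULL" matches no rows.
equals_null_ids = ids("SELECT id FROM student WHERE age = NULL")
```

Only the `IS NULL` test finds the blank age. The row with age 0 and a one-space name is not treated as blank at all.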
Data Integrity
The following categories of data integrity exist in each RDBMS:
Domain integrity: It enforces valid entries for a given column by restricting the type,
the format, or the range of values.
Referential integrity: It specifies that rows that are used by other records
cannot be deleted.
User-defined integrity: It enforces some specific business rules that are defined by
users. These rules are different from entity, domain or referential integrity.
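Domain and referential integrity can be sketched with SQLite constraints through Python's sqlite3 module. The tables, the age range of 16 to 100, and the `rejected` helper are assumptions chosen for illustration. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default

conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE student (
        id INTEGER PRIMARY KEY,
        age INTEGER CHECK (age BETWEEN 16 AND 100),  -- domain integrity
        course TEXT REFERENCES course (code)         -- referential integrity
    )
    """
)
conn.execute("INSERT INTO course (code) VALUES ('MCA')")
conn.execute("INSERT INTO student (id, age, course) VALUES (1, 22, 'MCA')")
conn.commit()


def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        with conn:  # rolls back automatically if the statement fails
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Age 5 is outside the allowed range for the column.
bad_age_rejected = rejected("INSERT INTO student (id, age, course) VALUES (2, 5, 'MCA')")

# The MCA row is used by student 1, so it cannot be deleted.
delete_rejected = rejected("DELETE FROM course WHERE code = 'MCA'")

# A valid row is still accepted.
good_row_rejected = rejected("INSERT INTO student (id, age, course) VALUES (3, 20, 'MCA')")
```

A user-defined rule would normally be expressed the same way, as a `CHECK` constraint or a trigger carrying the business rule.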
The main differences between DBMS and RDBMS are given below:
1) DBMS: DBMS applications store data as files.
   RDBMS: RDBMS applications store data in a tabular form.
2) DBMS: In DBMS, data is generally stored in either a hierarchical form or a
   navigational form.
   RDBMS: In RDBMS, the tables have an identifier called primary key and the data
   values are stored in the form of tables.
4) DBMS: DBMS does not apply any security with regards to data manipulation.
   RDBMS: RDBMS defines the integrity constraint for the purpose of ACID (Atomicity,
   Consistency, Isolation and Durability) property.
5) DBMS: DBMS uses file system to store data, so there will be no relation between
   the tables.
   RDBMS: In RDBMS, data values are stored in the form of tables, so a relationship
   between these data values will be stored in the form of a table as well.
6) DBMS: DBMS has to provide some uniform methods to access the stored information.
   RDBMS: RDBMS system supports a tabular structure of the data and a relationship
   between them to access the stored information.
9) DBMS: Examples of DBMS are file systems, XML, etc.
   RDBMS: Examples of RDBMS are MySQL, PostgreSQL, SQL Server, Oracle, etc.
After observing the differences between DBMS and RDBMS, you can say that RDBMS
is an extension of DBMS. Many software products on the market today are
compatible with both DBMS and RDBMS. This means that today an RDBMS application is
a DBMS application and vice versa.
DBMS:
A database approach is a well-organized collection of data that are related in a
meaningful way which can be accessed by different users but stored only once in a
system. The various operations performed by the DBMS system are: Insertion,
deletion, selection, sorting etc.
There are the following differences between DBMS and File systems:
Where to use
   DBMS: Database approach used in large systems which interrelate many files.
   File system: File system approach used in large systems which interrelate many files.
DBMS Architecture
o The DBMS design depends upon its architecture. The basic client/server architecture
is used to deal with a large number of PCs, web servers, database servers and other
components that are connected with networks.
o The client/server architecture consists of many PCs and a workstation which are
connected via the network.
o DBMS architecture depends upon how users are connected to the database to get
their request done.
1-Tier Architecture
o In this architecture, the database is directly available to the user. It means the user
can work directly on the DBMS and use it.
o Any changes done here will directly be done on the database itself. It doesn't provide
a handy tool for end users.
o The 1-Tier architecture is used for development of the local application, where
programmers can directly communicate with the database for the quick response.
2-Tier Architecture
3-Tier Architecture
o The 3-Tier architecture contains another layer between the client and server. In this
architecture, client can't directly communicate with the server.
o The application on the client-end interacts with an application server which further
communicates with the database system.
o End user has no idea about the existence of the database beyond the application
server. The database also has no idea about any other user beyond the application.
o The 3-Tier architecture is used for large web applications.
Fig: 3-tier Architecture
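The three tiers can be sketched as three small Python classes in one process. In a real deployment each tier would typically run on a separate machine and talk over a network. The class names, the `list-free-rooms` request, and the `room` table are all hypothetical.

```python
import sqlite3


class DatabaseTier:
    """Data tier: the only layer that talks to the database itself."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE room (room_no INTEGER PRIMARY KEY, is_free INTEGER)"
        )
        self._conn.executemany(
            "INSERT INTO room (room_no, is_free) VALUES (?, ?)", [(101, 1), (102, 0)]
        )

    def free_rooms(self):
        query = "SELECT room_no FROM room WHERE is_free = 1 ORDER BY room_no"
        return [row[0] for row in self._conn.execute(query)]


class ApplicationServer:
    """Middle tier: receives client requests and forwards them to the data tier."""

    def __init__(self, database):
        self._database = database

    def handle(self, request):
        if request == "list-free-rooms":
            return {"status": "ok", "rooms": self._database.free_rooms()}
        return {"status": "error", "rooms": []}


class Client:
    """Client tier: knows only the application server, never the database."""

    def __init__(self, server):
        self._server = server

    def show_free_rooms(self):
        reply = self._server.handle("list-free-rooms")
        return ", ".join(str(room_no) for room_no in reply["rooms"])


client = Client(ApplicationServer(DatabaseTier()))
display = client.show_free_rooms()
```

The `Client` class holds no reference to `DatabaseTier`, which mirrors the point that the end user has no idea about the database behind the application server.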
Data Models
A data model is the modeling of the data description, data semantics, and consistency
constraints of the data. It provides the conceptual tools for describing the design of a
database at each level of data abstraction. The following four data
models are used for understanding the structure of the database:
1) Relational Data Model: This type of model designs the data in the form of rows
and columns within a table. Thus, a relational model uses tables for representing data
and the relationships between them. Tables are also called relations. This model was initially
described by Edgar F. Codd in 1969. The relational data model is a widely used
model, primarily used by commercial data processing applications.
4) Semistructured Data Model: This type of data model is different from the other
three data models (explained above). The semistructured data model allows the data
specifications at places where the individual data items of the same type may have
different attributes sets. The Extensible Markup Language, also known as XML, is
widely used for representing semistructured data. Although XML was initially
designed for adding markup information to text documents, it gained
importance because of its application in the exchange of data.
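The idea that items of the same type may carry different attribute sets can be shown with a tiny XML fragment parsed by Python's standard xml.etree.ElementTree module. The element names and values are invented for the example.

```python
import xml.etree.ElementTree as ET

# Two <student> items of the same type with different sets of attributes.
xml_text = """
<students>
  <student><name>Ajeet</name><age>24</age></student>
  <student><name>Ratan</name><phone>555-0100</phone><course>MCA</course></student>
</students>
"""

root = ET.fromstring(xml_text)

# Collect the attribute names each student actually carries.
attribute_sets = [
    sorted(child.tag for child in student) for student in root.findall("student")
]

# Only "name" is present in every item.
common = set(attribute_sets[0]).intersection(*attribute_sets[1:])
```

A relational table would need either a column for every possible attribute or separate tables. The semistructured form simply omits what an item does not have.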
Relational schema: A relational schema contains the name of the relation and the names
of all columns or attributes.
Relational key: A relational key is a set of one or more attributes that can
identify each row in the relation uniquely.
o In the given table, NAME, ROLL_NO, PHONE_NO, ADDRESS, and AGE are the
attributes.
o The instance of schema STUDENT has 5 tuples.
o t3 = <Laxman, 33289, 8583287182, Gurugram, 20>
Properties of Relations
o The name of the relation is distinct from all other relations.
o Each relation cell contains exactly one atomic (single) value.
o Each attribute has a distinct name.
o Attribute domain has no significance.
o A relation has no duplicate tuples.
o The order of tuples has no significance; tuples can appear in any sequence.
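Some of these properties can be checked with a small Python sketch that models a relation as a set of named tuples. The `Student` class mirrors the STUDENT attributes listed above and `t3` is the tuple quoted in the text. The other two tuples and the `is_key` helper are made up for illustration.

```python
from typing import NamedTuple


class Student(NamedTuple):
    """Schema STUDENT: the relation name plus its distinctly named attributes."""

    NAME: str
    ROLL_NO: int
    PHONE_NO: int
    ADDRESS: str
    AGE: int


t3 = Student("Laxman", 33289, 8583287182, "Gurugram", 20)  # tuple quoted in the text
t_a = Student("Student A", 10001, 5550100, "City A", 19)   # hypothetical tuple
t_b = Student("Student B", 10002, 5550101, "City B", 20)   # hypothetical tuple

# A set holds no duplicates and has no order, like a relation.
relation = {t3, t_a, t_b, t3}


def is_key(relation, attributes):
    """True if the given attributes identify every tuple in the relation uniquely."""
    projected = {tuple(getattr(t, name) for name in attributes) for t in relation}
    return len(projected) == len(relation)


roll_no_is_key = is_key(relation, ["ROLL_NO"])
age_is_key = is_key(relation, ["AGE"])
```

Adding `t3` twice still leaves three tuples. `ROLL_NO` identifies each tuple uniquely in this sample, while `AGE` does not because two students share the age 20.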