DBMS Unit 1
Data: Data is raw material: a collection of facts and figures. On its own, data does not carry
significant meaning because it has not yet been processed. Data may include text, figures, facts, images,
numbers, graphs, and symbols, and it can be generated from different sources such as sensors,
surveys, transactions, and social media.
Database:
A database is a collection of interrelated data, i.e., data items that depend on each other, organized
so that data can be retrieved, inserted, and deleted efficiently. A database also organizes the data in the
form of tables, schemas, views, reports, etc.
Database = Data + Base: the actual storage of all the data and information.
In short, a database is the place where all the data and information are stored.
For example: A college database organizes the data about the admin, staff, students, faculty,
etc. Using the database, you can easily retrieve, insert, and delete this information.
The database approach differs in several ways from the much older approach of programming with
files in a traditional file processing system. Some of the most important characteristics that
distinguish the database approach from the file processing approach are the following.
One of the most fundamental characteristics of the database approach is that the
database system contains not just the actual data (like names and numbers) but also
detailed information about how that data is organized and what rules apply to it. This
extra information is called metadata. It describes the structure of the database, such as
the types of data it holds and any restrictions on that data, helping the system manage
and use the information effectively.
This information is stored in the DBMS catalog. The catalog includes details about the
structure of each file, how data is organized and stored, and the rules or constraints that
apply to the data.
The DBMS software and database users, such as administrators, use the catalog to
understand the database layout and to manage and access the data effectively.
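As a small illustration of a catalog, the sketch below uses Python's built-in sqlite3 module. SQLite is only one example of a DBMS, and the table and column names (student, stu_id, name, branch) are assumptions made for this example. It reads the metadata back from SQLite's own catalog table, sqlite_master, and from PRAGMA table_info.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " stu_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " branch TEXT)"
)

# SQLite keeps its catalog in the built-in sqlite_master table.
catalog_rows = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, default_value, pk)
columns = conn.execute("PRAGMA table_info(student)").fetchall()
column_info = [(c[1], c[2], bool(c[3])) for c in columns]

print(catalog_rows)  # which objects exist in the database
print(column_info)   # column name, declared type, NOT NULL flag
```

The program never stores these descriptions itself; it asks the DBMS for them, which is what "self-describing" means in practice.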
This independence is possible due to data abstraction, which allows users to interact
with data without worrying about how it’s stored or how operations are performed
internally. The DBMS provides a simplified view of the data, known as a data
model, that shows logical concepts like objects and relationships. This hides
complex storage and implementation details, making it easier for users to work with
the data without getting into technical details.
In a database, there are often many users, each needing their own specific view of
the data. A view can be a subset of the database, or it can contain virtual data that is
derived from the stored data but not itself stored. Users don’t need to worry about
whether the data they are seeing is stored directly or generated from other data.
In a multi-user DBMS, where different users have different needs, the system must
allow for the creation of multiple views. This feature is particularly useful for large
databases like the Aadhaar database, as it lets different users access only the
information relevant to them, making data management more efficient.
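A minimal sketch of a view, assuming SQLite through Python's sqlite3 module. The resident table, its columns, and the sample rows are hypothetical. The view exposes only some columns and is computed from the base table each time it is queried, so it reflects later updates without storing its own copy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE resident (id INTEGER PRIMARY KEY, name TEXT, city TEXT, phone TEXT);
    INSERT INTO resident VALUES (1, 'Asha', 'Pune', '555-0101');
    INSERT INTO resident VALUES (2, 'Ravi', 'Delhi', '555-0102');
    -- A view is a stored query, not a stored copy of the data.
    CREATE VIEW resident_public AS SELECT id, name, city FROM resident;
""")

# A user of the view sees only id, name and city; phone stays hidden.
public_rows = conn.execute("SELECT * FROM resident_public ORDER BY id").fetchall()

# Updating the base table is immediately visible through the view.
conn.execute("UPDATE resident SET city = 'Mumbai' WHERE id = 1")
updated_rows = conn.execute("SELECT * FROM resident_public ORDER BY id").fetchall()

print(public_rows)
print(updated_rows)
```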
A multi-user DBMS allows multiple users to access and update the database at the
same time. This is important for systems that integrate data from multiple
applications, like WhatsApp’s integration with Facebook.
To ensure updates happen correctly, the DBMS uses concurrency control, which
prevents conflicts when users try to update the same data simultaneously. For
example, if several agents are booking seats on a flight, the system ensures that no
two agents assign the same seat to different passengers.
These types of systems are called online transaction processing (OLTP)
applications. The DBMS ensures that all transactions are processed accurately,
without errors or inconsistencies. Each transaction must follow the ACID
properties: atomicity, consistency, isolation, and durability.
This guarantees reliable and efficient database operations, even with many users.
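The seat-booking example can be sketched as follows, assuming SQLite through Python's sqlite3 module. The booking table and the book_seat helper are hypothetical. The sketch shows only one part of the picture: each booking runs as an atomic transaction, and a primary-key constraint makes the DBMS reject a second booking for the same seat. It uses a single connection, so it does not demonstrate the locking a real multi-user DBMS performs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The primary key on seat means a seat can be assigned at most once.
conn.execute("CREATE TABLE booking (seat TEXT PRIMARY KEY, passenger TEXT NOT NULL)")


def book_seat(conn, seat, passenger):
    """Try to book a seat; return True on success, False if it is already taken."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO booking (seat, passenger) VALUES (?, ?)",
                (seat, passenger),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = book_seat(conn, "12A", "passenger_1")
second = book_seat(conn, "12A", "passenger_2")  # same seat, rejected
rows = conn.execute("SELECT seat, passenger FROM booking").fetchall()
print(first, second, rows)
```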
Data models :-
A data model defines how data is structured, stored, and manipulated.
It provides a blueprint for organizing data and determines the relationships between
different pieces of information. There are several types of data models used in
databases, each with its own approach to structuring data.
In a DBMS, data models are used to describe how the data is stored, accessed, and updated.
1) Hierarchical Model :-
The hierarchical data model is one of the oldest data models, developed in
the 1960s by IBM. In this data model, the data is organized in a hierarchical
tree-like structure. This data model can be easily visualized because each
record has exactly one parent (except the root) and can have many children.
The above image represents the data model of a Vehicle database: vehicles are
classified into two types, two-wheelers and four-wheelers, and then they are further
classified.
The main drawback is that only one-to-many relationships can be represented in
this model; hence the hierarchical data model is very rarely used nowadays.
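The one-parent, many-children rule can be sketched in a few lines of Python. The Node class and path_to_root helper are hypothetical, and the leaf categories (Bike, Car) are assumed for illustration because the notes do not list the further classification.

```python
class Node:
    """A record in a hierarchical model: one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # exactly one parent (None for the root)
        self.children = []

    def add_child(self, name):
        child = Node(name, parent=self)
        self.children.append(child)
        return child


def path_to_root(node):
    """Follow the single parent link upward; the path is always unique."""
    parts = []
    while node is not None:
        parts.append(node.name)
        node = node.parent
    return " > ".join(reversed(parts))


vehicle = Node("Vehicle")
two_wheeler = vehicle.add_child("Two-wheeler")
four_wheeler = vehicle.add_child("Four-wheeler")
bike = two_wheeler.add_child("Bike")   # assumed sub-category
car = four_wheeler.add_child("Car")    # assumed sub-category

print(path_to_root(bike))
```

Because each node stores a single parent reference, a record such as Bike cannot also belong under Four-wheeler, which is the one-to-many limitation described above.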
2) Network Model :-
The network model is a generalization of the hierarchical data model: it
allows many-to-many relationships, so in this model a record can have
more than one parent.
The network model can be represented as a graph; it replaces the
hierarchical tree with a graph in which object types are the nodes and relationships are the
edges.
Here, all three departments are linked with the director, which was not possible
in the hierarchical data model.
In the network model, there can be many possible paths to reach a node from the root node
(College is the root node in the above case); therefore, data can be accessed more efficiently
than in the hierarchical data model. On the other hand, the process of
insertion and deletion of data is quite complex.
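A rough Python sketch of the same idea, storing the model as a graph. The department names are placeholders, since the notes only say that three departments are linked to the director, and the link and count_paths helpers are hypothetical.

```python
from collections import defaultdict

children = defaultdict(list)  # owner  -> list of members
parents = defaultdict(list)   # member -> list of owners (may be more than one)


def link(owner, member):
    """Record a relationship (an edge) between two nodes."""
    children[owner].append(member)
    parents[member].append(owner)


# Placeholder department names; College is the root node.
for dept in ("Department A", "Department B", "Department C"):
    link("College", dept)
    link(dept, "Director")


def count_paths(start, target):
    """Count the distinct downward paths from start to target."""
    if start == target:
        return 1
    return sum(count_paths(child, target) for child in children[start])


print(parents["Director"])                # three parents, impossible in a tree
print(count_paths("College", "Director"))
```

The extra bookkeeping in link (two structures to keep in sync) hints at why insertion and deletion are more complex here than in a tree.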
3) Entity-Relationship Model :-
An Entity-Relationship (ER) model is a high-level data model that describes the structure of the
database in pictorial form, known as an ER diagram. In simple words, an ER
diagram is used to represent the logical structure of the database in an easy-to-read way.
The ER model develops a conceptual view of the data; hence it can be used as a blueprint to
implement the database later. Developers can understand the system just by
looking at the ER diagram.
In the above ER diagram, there are two entities, Employee and Company, and
a relationship between them. Both the employee and the company have some attributes, and the
relationship is of the “works in” type, which
means the employee works in a company.
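One possible way to read the Employee / Company / "works in" diagram in code is sketched below with Python dataclasses. The attribute names (emp_id, company_id, name) and sample values are assumptions, since the diagram's attributes are not listed in the text, and employers_of is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:          # entity
    emp_id: int          # assumed attribute
    name: str            # assumed attribute


@dataclass(frozen=True)
class Company:           # entity
    company_id: int      # assumed attribute
    name: str            # assumed attribute


@dataclass(frozen=True)
class WorksIn:           # relationship between the two entities
    employee: Employee
    company: Company


emp = Employee(1, "Asha")
comp = Company(10, "Acme")
works_in = [WorksIn(emp, comp)]


def employers_of(employee, relationships):
    """Return the companies an employee is related to through WorksIn."""
    return [r.company for r in relationships if r.employee == employee]


print(employers_of(emp, works_in))
```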
4) Relational Model :-
This is the most widely accepted data model. In this model, the database is represented as
a collection of relations, each in the form of a two-dimensional table of rows and columns.
Each row is known as a tuple (a tuple contains all the data for an
individual record), while each column represents an attribute. For example –
The above table shows a relation “STUDENT” with attributes such as Stu. Id, Name, and
Branch, which consists of 4 records or tuples.
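A STUDENT relation with these three attributes can be sketched with Python's sqlite3 module. The four rows below are invented sample data, not the rows of the original table, and the column names are assumed spellings of Stu. Id, Name, and Branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (stu_id INTEGER PRIMARY KEY, name TEXT, branch TEXT)")

# Four hypothetical tuples; each one holds all the data for one student.
sample_rows = [
    (1, "Asha", "CSE"),
    (2, "Ravi", "ECE"),
    (3, "Meena", "ME"),
    (4, "Kiran", "CSE"),
]
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", sample_rows)

# Columns are the attributes of the relation.
attributes = [d[0] for d in conn.execute("SELECT * FROM student").description]
# Rows are the tuples.
tuple_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
cse_students = conn.execute(
    "SELECT name FROM student WHERE branch = ? ORDER BY stu_id", ("CSE",)
).fetchall()

print(attributes, tuple_count, cse_students)
```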
5) Object-Oriented Data Model :-
Since data is stored as objects, we can easily store audio, video, images, etc. in the database,
which is very difficult and inconvenient to do in the relational model. As shown in the
image below, two objects are connected with each other through links.
In the above image, there are two objects, Employee and Department, each of which keeps all
its data in a single unit (the object). They are linked with each other because they share
a common attribute, i.e., Department_Id.
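The Employee and Department objects could look roughly like this in Python. Only Department_Id comes from the notes; the other attributes and the describe method are assumptions for illustration.

```python
class Department:
    """All department data lives in one unit (the object)."""

    def __init__(self, department_id, name):
        self.department_id = department_id
        self.name = name


class Employee:
    """An employee object that links to its department object."""

    def __init__(self, emp_id, name, department):
        self.emp_id = emp_id
        self.name = name
        self.department = department  # link to another object

    @property
    def department_id(self):
        # The shared attribute is reached through the link.
        return self.department.department_id

    def describe(self):
        return f"{self.name} works in {self.department.name}"


sales = Department(7, "Sales")
asha = Employee(1, "Asha", sales)
print(asha.department_id, asha.describe())
```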
The biggest disadvantage of a data model is that one must know the characteristics
of the physical data to build it.
In big databases, the data model can be quite difficult to understand, and
the cost incurred is very high.
DBMS Architecture :-
DBMS architecture refers to the design and layout of how a database management system (DBMS)
organizes, manages, and communicates with data. It outlines how data flows between users,
applications, and the database itself. There are mainly three types of DBMS architecture: 1-Tier, 2-Tier, and 3-Tier.
Highlights:
All the components of the DBMS, i.e., the server, database, and client, reside on
a single system.
The 2-Tier architecture is the same as the basic client-server model. In the two-tier
architecture, applications on the client end communicate directly with
the database on the server side. For this interaction, APIs like ODBC and
JDBC are used.
The user interfaces and application programs run on the client side.
The server side is responsible for providing functionality such as query
processing and transaction management. To communicate with the DBMS,
the client-side application establishes a connection with the server side.
Multiple users can use it at the same time. Hence, it can be used in an
organization.
It has high processing ability as the database functionality is handled
by the server alone.
Faster access to the database due to the direct connection and improved
performance.
Because of the two independent layers, it’s easier to maintain.
Highlights:-
The 3-Tier architecture contains another layer between the client and
server. In this architecture, the client can’t directly communicate with the
server.
The application on the client end interacts with an application server, which
in turn communicates with the database system.
The end user has no idea about the existence of the database beyond the
application server. The database also has no idea about any other user
beyond the application. The 3-Tier architecture is used for
large web applications.
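The three tiers can be sketched in one Python file. This is a single-process toy, not a networked deployment, and all class, function, and table names are hypothetical. What it shows is that the client only ever calls the application server, which validates the request before touching the database.

```python
import sqlite3


class DatabaseTier:
    """Data tier: the only component that talks to the database."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
        )
        self._conn.execute("INSERT INTO account VALUES (1, 100)")
        self._conn.commit()

    def get_balance(self, account_id):
        row = self._conn.execute(
            "SELECT balance FROM account WHERE id = ?", (account_id,)
        ).fetchone()
        return None if row is None else row[0]


class ApplicationServer:
    """Middle tier: checks each request, then forwards it to the data tier."""

    def __init__(self, db):
        self._db = db

    def handle_balance_request(self, account_id):
        if not isinstance(account_id, int) or account_id <= 0:
            return {"ok": False, "error": "invalid account id"}
        balance = self._db.get_balance(account_id)
        if balance is None:
            return {"ok": False, "error": "account not found"}
        return {"ok": True, "balance": balance}


def client_request(server, account_id):
    """Client tier: knows about the application server only."""
    return server.handle_balance_request(account_id)


server = ApplicationServer(DatabaseTier())
print(client_request(server, 1))
print(client_request(server, -5))
```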
The main advantages of Three Tier DBMS Architecture are:
Scalability – Since the database server isn’t aware of any users beyond the application
layer and the application layer implements load balancing, there can be as many clients as
you want.
Data Integrity – Data corruption and bad requests can be avoided because of the checks
performed in the application layer on each client request.
Security – The removal of the direct connection between the client and server systems via
abstraction reduces unauthorized access to the database.
Note – In Three Tier DBMS Architecture, an additional layer (Application Layer) is added
between the Client and the Server. This increases the number of layers present between the
DBMS and the end-users, making the implementation of the DBMS structure complex and
difficult to maintain.
Data independence :-
Data independence is the ability to change the schema at one level of the database
without having to change the schema at the next higher level. In simple words,
data independence is a property of a database that allows the user or database administrator to
change the schema at one level without affecting the data or schema at another level.
Purpose: The purpose of data independence is to enhance the security of the system and to reduce
the time and cost required when the information is changed or altered.
Logical Data Independence is the ability to change the
logical (conceptual) level without affecting the external level of the database. Logical
data independence is required for changing the conceptual schema without
having to change the external schema or application programs. It allows us to make
changes in the conceptual structure, such as adding, modifying, or deleting an attribute in the
database.
Example: In the database of a banking system, if we add a new attribute to the
customer records, or modify or delete an existing one, the conceptual schema changes, but
the external views and the application programs that do not use that attribute are not
affected.
These changes can be done at the logical level without affecting the application program
or external layer.
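A small sketch of logical data independence, assuming SQLite through Python's sqlite3 module. The customer table, the customer_names view, and the read_names function are hypothetical stand-ins for the conceptual schema, the external schema, and the application program. A new attribute is added to the table, and the application keeps working unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO customer VALUES (1, 'Asha');
    -- External schema: what the application is allowed to see.
    CREATE VIEW customer_names AS SELECT id, name FROM customer;
""")


def read_names(conn):
    """The 'application program': it only knows the view."""
    return conn.execute("SELECT id, name FROM customer_names ORDER BY id").fetchall()


before = read_names(conn)

# Change the conceptual schema by adding a new attribute.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")

after = read_names(conn)  # same code, same result
print(before, after)
```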
Physical Data Independence can be defined as the ability to change the physical level without
affecting the logical or conceptual level. Physical data independence gives us the freedom to
modify the storage device, file structure, location of the database, etc. without changing the
definition of the conceptual or view level.
Example: If we take the database of the banking system and want to scale it up
by changing the storage size and the file structure, we can do so without
affecting any functionality of the logical schema.
The following changes can be made at the physical layer without affecting the conceptual layer –
Changing the storage devices, such as SSDs, hard disks, and magnetic tapes.
Changing the access technique and modifying indexes.
Changing the compression techniques or hashing algorithms.
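Of the changes listed above, modifying indexes is the easiest to try out. The sketch below assumes SQLite through Python's sqlite3 module, with a hypothetical account table. An index is added, which changes how the data is physically accessed, while the query text and its result stay exactly the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account (owner, balance) VALUES (?, ?)",
    [("Asha", 100), ("Ravi", 250), ("Asha", 75)],
)

# The application's query; it is never edited in this example.
QUERY = "SELECT owner, balance FROM account WHERE owner = ? ORDER BY balance"

before = conn.execute(QUERY, ("Asha",)).fetchall()

# Physical-level change: add an index on the owner column.
conn.execute("CREATE INDEX idx_account_owner ON account (owner)")

after = conn.execute(QUERY, ("Asha",)).fetchall()
print(before, after)
```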
Physical data independence vs. logical data independence:
1. Physical: It is used to change the internal schema without requiring a change in the logical schema.
   Logical: It makes sure that if we add a new field or delete an existing field, we do not need to change the application program.
2. Physical: It is easy to attain in comparison to logical data independence.
   Logical: It is difficult to attain compared to physical data independence.
3. Physical: It provides feasibility if we want to shift the database or change the file organization structure.
   Logical: It helps us change the data definition and the structure of the data without changing the physical schema.
4. Physical: It deals with the internal structure of the schema.
   Logical: It deals with the conceptual schema.
5. Physical: Examples of changes are changing the compression techniques, hashing algorithms, SSD, location of the database, etc.
   Logical: Examples of changes are adding, deleting, or modifying an entity or relationship.
Data independence is achieved through the separation of the database into multiple levels in the
DBMS architecture:
Physical Level: This is the lowest level of abstraction in the DBMS. It describes how the data is physically
stored on storage devices such as hard drives or SSDs. It deals with low-level details like
data structures, indexing, file organization, and data compression.
Logical Level: This is the middle level of abstraction. It describes what data is stored in the database and
the relationships between data entities, without specifying how the data is stored. It focuses
on the overall structure of the database (schema) and defines entities, attributes, data types,
and relationships.
View Level: This is the highest level of abstraction. It describes how the data is viewed by users or
specific applications: different users or applications are provided with
specific views of the data that they need. These views are typically a subset or customized
presentation of the data from the logical level, and they can simplify access by hiding
irrelevant details.
Flexibility: Data independence allows for changes to be made in the database schema
(structure) without affecting the way data is accessed or presented to users. This flexibility
makes it easier to adapt the database to evolving requirements and business needs.
Application Compatibility: Changes to the logical schema do not impact the application
programs or queries that rely on the database. This means that existing applications can
continue to function correctly even when the database structure changes, reducing the risk
of disruptions.
Easier Maintenance: Database administrators can perform routine maintenance tasks,
such as reorganizing data for performance optimization or implementing security updates,
without disrupting user access or application functionality.
Enhanced Security: Data independence allows for security measures and access controls
to be implemented at the logical level, protecting sensitive data from unauthorized access
or modification. Security policies can be enforced without exposing the underlying physical
storage details.
Data Integrity: Changes to the logical schema can be managed carefully to ensure data
integrity and consistency. Referential integrity constraints and validation rules can be
applied at the logical level to maintain data quality.
While data independence offers many advantages in database management systems, it’s essential
to consider its potential disadvantages and limitations: