
dbms unit -1

Uploaded by

ayanashpatel68

Unit -I

Before we look at databases, we first have to understand what data is.

Data: Data is raw material: a collection of facts and figures. On its own, data carries little
meaning because it has not yet been processed or placed in context. Data may include text,
figures, facts, images, numbers, graphs, and symbols, and it can be generated from different
sources such as sensors, surveys, transactions, and social media.

Database:
A database is a collection of inter-related data, i.e. data items that are connected to one
another, organized so that the data can be retrieved, inserted, and deleted efficiently. The data
is organized in the form of tables, schemas, views, reports, etc.

Database = Data + Base: the actual storage of all the data and information.

Put simply, a database is the place where all the data and information is stored.

For example: A college database organizes the data about the admin, staff, students, faculty,
etc. Using the database, you can easily retrieve, insert, and delete this information.

Database Management System (DBMS):-

A database management system is software that manages a database by storing, retrieving, and
manipulating its data. Oracle, MySQL, etc. are well-known DBMS tools.

Some functions of a Database Management System are:

 It provides an interface for performing various activities such as creation, deletion, and
modification of the data.
 It allows users to design databases that meet their specific needs.
 It is a collection of programs that respond to user commands.
 It protects the database through security measures such as password protection and
verification, so that only authorized users gain access.
 Users interact with it through queries.

Note - A database is a collection of organized data, while a DBMS (Database Management System)
is the software that manages and interacts with the database, allowing users to store, retrieve,
and manipulate data efficiently.

Characteristics of the Database Approach :

The database approach differs in several ways from the much older approach of programming with
files, as used in a traditional file processing system. Some of the most important
characteristics that distinguish the database approach from the file processing approach are
as follows.

Approach-1 : Self-Describing Nature of a Database System:-

 One of the most fundamental characteristics of the database approach is that the
database system contains not just the actual data (like names and numbers) but also
detailed information about how that data is organized and what rules apply to it. This
extra information is called metadata. It describes the structure of the database, such as
the types of data it holds and any restrictions on that data, helping the system manage
and use the information effectively.

 This information is stored in the DBMS catalog. The catalog includes details about the
structure of each file, how data is organized and stored, and the rules or constraints that
apply to the data.

 Because the catalog describes the structure of the database, it is used both by the DBMS
software and by database users, such as administrators, to understand the database layout.

 A general-purpose DBMS is not designed for a specific application; it relies on the catalog
to know how to handle various databases. The DBMS can work with different types of databases
(like university, banking, or corporate) because the database definitions are in the catalog.
In traditional file processing, data definitions are part of the application programs
themselves, so each program can work only with the specific files it was written for. In
contrast, a DBMS can access multiple databases by retrieving their definitions from the catalog.

Note:- A catalog in a database management system (DBMS) is a collection of metadata that
describes the structure and organization of the database. It includes information such as
tables and their structures, file formats and storage details, and constraints and rules.

The catalog helps the DBMS understand how to manage and access data effectively, and it is
essential for database administrators and users to work with the database efficiently.
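
To make the idea of a catalog concrete, here is a minimal sketch using SQLite through Python's
built-in sqlite3 module. SQLite is not mentioned in these notes; it is chosen only because it
needs no setup. The `student` table and its columns are hypothetical. SQLite keeps its catalog in
a table named `sqlite_master`, and `PRAGMA table_info` reports the columns of a table.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " stu_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " branch TEXT)"
)

# The catalog is itself queryable: one row per table, index, view, or trigger.
catalog_rows = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# Column metadata: (column name, declared type, declared NOT NULL?)
columns = [
    (row[1], row[2], bool(row[3]))
    for row in conn.execute("PRAGMA table_info(student)")
]

print(catalog_rows)
print(columns)
```

The program never hard-codes the layout of `student`; it asks the DBMS, which is the sense in
which a database system is self-describing.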

Approach-2 : Insulation between Programs and Data, and Data Abstraction:-

 In traditional file processing systems, the structure of data is built directly into the
application programs. This means that if the data structure changes, all programs
using that data need to be updated as well.

 With a DBMS (Database Management System), this problem is mostly avoided because the data
structure is kept separate from the programs. This separation is called program-data
independence. Changes to the data structure often don’t require changes to the programs that
access the data.

 This independence is possible due to data abstraction, which allows users to interact
with data without worrying about how it’s stored or how operations are performed
internally. The DBMS provides a simplified view of the data, known as a data
model, that shows logical concepts like objects and relationships. This hides
complex storage and implementation details, making it easier for users to work with
the data without getting into technical details.

Approach-3 : Support for Multiple Views of the Data:-

 In a database, there are often many users, each needing their own specific view of
the data. A view can be a part of the database or data that is created from existing
data but not actually stored. Users don’t need to worry about whether the data they
are seeing is stored directly or generated from other data.

 In a multi-user DBMS, where different users have different needs, the system must
allow for the creation of multiple views. This feature is particularly useful for large
databases like the Aadhaar database, as it lets different users access only the
information relevant to them, making data management more efficient.
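
As a small sketch of multiple views over the same stored data, the following uses SQLite via
Python's sqlite3 module (an assumption; the notes do not name a DBMS). The `student` table, the
two view names, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (
        stu_id INTEGER PRIMARY KEY, name TEXT, branch TEXT, fees_due REAL
    );
    INSERT INTO student VALUES
        (1, 'Asha', 'CSE', 0.0),
        (2, 'Ravi', 'ECE', 1500.0),
        (3, 'Meena', 'CSE', 200.0);

    -- The library only needs to know who the students are.
    CREATE VIEW library_view AS
        SELECT stu_id, name, branch FROM student;

    -- The accounts office only needs students with pending fees.
    CREATE VIEW accounts_view AS
        SELECT stu_id, fees_due FROM student WHERE fees_due > 0;
    """
)

library_rows = conn.execute("SELECT * FROM library_view ORDER BY stu_id").fetchall()
accounts_rows = conn.execute("SELECT * FROM accounts_view ORDER BY stu_id").fetchall()
print(library_rows)
print(accounts_rows)
```

Neither view stores data of its own; each is computed from `student` when queried, so the two
groups of users see different slices of one shared table.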

Approach-4 : Sharing of Data and Multi-user Transaction Processing :-

 A multi-user DBMS allows multiple users to access and update the database at the
same time. This is important for systems that integrate data from multiple
applications, like WhatsApp’s integration with Facebook.

 To ensure updates happen correctly, the DBMS uses concurrency control, which
prevents conflicts when users try to update the same data simultaneously. For
example, if several agents are booking seats on a flight, the system ensures that no
two agents assign the same seat to different passengers.

 These types of systems are called online transaction processing (OLTP)
applications. The DBMS ensures that all transactions are processed accurately,
without errors or inconsistencies. Each transaction must follow the ACID
properties:

1. Atomicity: Either all operations in a transaction are completed, or none are.
2. Consistency: A transaction takes the database from one valid state to another,
respecting all defined constraints.
3. Isolation: Transactions are processed independently, even when happening
concurrently.
4. Durability: Once a transaction is committed, its changes persist even if the
system fails afterwards.

This guarantees reliable and efficient database operations, even with many users.
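
The seat-booking example can be sketched as follows, again assuming SQLite through Python's
sqlite3 module. The `seat` table, the `book_seats` helper, and the passenger names are
hypothetical. The sketch shows atomicity only: it runs on a single connection, so it does not
exercise real concurrency control between simultaneous users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seat (seat_no TEXT PRIMARY KEY, passenger TEXT)")
conn.executemany("INSERT INTO seat VALUES (?, NULL)", [("1A",), ("1B",)])
conn.commit()


def book_seats(conn, seat_numbers, passenger):
    """Book every requested seat, or none of them (atomicity)."""
    try:
        # "with conn" commits on success and rolls back if an exception escapes.
        with conn:
            for seat_no in seat_numbers:
                cur = conn.execute(
                    "UPDATE seat SET passenger = ? "
                    "WHERE seat_no = ? AND passenger IS NULL",
                    (passenger, seat_no),
                )
                if cur.rowcount != 1:
                    raise ValueError(f"seat {seat_no} is not available")
        return True
    except ValueError:
        return False


first = book_seats(conn, ["1A"], "passenger of agent 1")
# Agent 2 wants 1B and 1A together; 1A is already taken, so 1B is rolled back too.
second = book_seats(conn, ["1B", "1A"], "passenger of agent 2")
seats = dict(conn.execute("SELECT seat_no, passenger FROM seat"))
print(first, second, seats)
```

The `passenger IS NULL` condition in the `UPDATE` is what stops the same seat from being assigned
twice; the rollback is what stops a half-finished booking from being left behind.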

Data models :-

A data model defines how the data in a database is structured, stored, and manipulated.
It provides a blueprint for organizing data and determines the relationships between
different pieces of information. In a DBMS, data models are used to describe how the
data is stored, accessed, and updated. There are several types of data models, each
with its own approach to structuring data.

Types of Data Models in DBMS :-

1. Hierarchical Model :-

The hierarchical data model is one of the oldest data models, developed by IBM in
the 1960s. In this data model, the data is organized in a hierarchical, tree-like
structure. It is easy to visualize because each record has exactly one parent
(except the root, which has none) and may have many children.

The example figure for this model shows a Vehicle database: vehicles are classified into
two types, two-wheelers and four-wheelers, which are then classified further.

The main drawback is that this model supports only one-to-many relationships, so the
hierarchical data model is rarely used nowadays.
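
The one-parent rule can be sketched in Python with a nested dictionary. The Vehicle,
Two-wheeler, and Four-wheeler nodes come from the example above; the leaf names (Bike, Scooter,
Car, Truck) and the `parent_of` helper are assumptions added for illustration.

```python
# Each key is a record; its value holds that record's children.
vehicle_tree = {
    "Vehicle": {
        "Two-wheeler": {"Bike": {}, "Scooter": {}},
        "Four-wheeler": {"Car": {}, "Truck": {}},
    }
}


def parent_of(tree, target, parent=None):
    """Return the single parent of `target`, or None for the root or a missing node."""
    for name, children in tree.items():
        if name == target:
            return parent
        found = parent_of(children, target, name)
        if found is not None:
            return found
    return None


print(parent_of(vehicle_tree, "Car"))      # Four-wheeler
print(parent_of(vehicle_tree, "Vehicle"))  # None (the root has no parent)
```

Because a nested dictionary can place a record under only one key, it has no natural way to give
"Car" a second parent, which is the one-to-many limitation described above.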

2) Network Model:-

The network model is a generalization of the hierarchical data model. It allows
many-to-many relationships, so in this model a record can have more than one parent.

The network model can be represented as a graph: it replaces the hierarchical tree with
a graph in which object types are the nodes and relationships are the edges.

In the example figure for this model, all three departments are linked with the director,
which was not possible in the hierarchical data model.

In the network model, there can be many possible paths to reach a node from the root node
(College is the root node in this example), so the data can be accessed more efficiently
than in the hierarchical data model. On the other hand, inserting and deleting data is
quite complex.
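
A minimal Python sketch of the College example as a graph follows. College and Director come
from the text; the department names and the two helper functions are hypothetical, and the graph
is assumed to contain no cycles.

```python
# Adjacency list: each record maps to the records it points to.
college_graph = {
    "College": ["Dept A", "Dept B", "Dept C"],
    "Dept A": ["Director"],
    "Dept B": ["Director"],
    "Dept C": ["Director"],
    "Director": [],
}


def parents_of(graph, node):
    """All records that link to `node`; there may be more than one."""
    return sorted(parent for parent, kids in graph.items() if node in kids)


def paths(graph, start, goal):
    """Every path from `start` to `goal` (assumes an acyclic graph)."""
    if start == goal:
        return [[start]]
    return [
        [start] + rest
        for child in graph[start]
        for rest in paths(graph, child, goal)
    ]


print(parents_of(college_graph, "Director"))
print(len(paths(college_graph, "College", "Director")))
```

Here "Director" has three parents and can be reached from the root along three different paths,
which a tree cannot express.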

3) Entity-Relationship Model (ER Model):-

An Entity-Relationship model is a high-level data model that describes the structure of the
database in pictorial form, known as an ER diagram. In simple words, an ER diagram is an
easy way to represent the logical structure of the database.

The ER model develops a conceptual view of the data, so it can be used as a blueprint for
implementing the database later. Developers can easily understand the system just by
looking at the ER diagram.

In the example ER diagram for this model, there are two entities, Employee and Company, and a
relationship between them. Both the employee and the company have some attributes, and the
relationship is of the “works in” type, which means the employee works in a company.

4) Relational Model :-

This is the most widely used data model. In this model, the database is represented as
a collection of relations, each in the form of a two-dimensional table of rows and columns.
Each row is known as a tuple (a tuple contains all the data for an individual record), while
each column represents an attribute.

The example table for this model shows a relation “STUDENT” with attributes such as Stu. Id,
Name, and Branch, which consists of 4 records or tuples.
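
A STUDENT relation of this shape can be sketched with SQLite through Python's sqlite3 module
(an assumption; any relational DBMS would do). The column name `Stu_Id` stands in for "Stu. Id",
and the four rows are invented sample data, not the rows of the original table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (Stu_Id INTEGER PRIMARY KEY, Name TEXT, Branch TEXT)"
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?)",
    [
        (1, "Asha", "CSE"),
        (2, "Ravi", "ECE"),
        (3, "Meena", "CSE"),
        (4, "Imran", "ME"),
    ],
)

cursor = conn.execute("SELECT * FROM STUDENT ORDER BY Stu_Id")
attributes = [column[0] for column in cursor.description]  # the columns
tuples = cursor.fetchall()                                 # the rows

print(attributes)
print(len(tuples), "tuples; first:", tuples[0])
```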

5) Object-Oriented Data Model :-

As its name suggests, the object-oriented data model combines object-oriented programming
with the relational data model. In this data model, the data and their relationships are
represented in a single structure known as an object.

Since data is stored as objects, we can easily store audio, video, images, etc. in the
database, which is difficult and inconvenient to do in the relational model. In the example
image for this model, two objects are connected with each other through links.

The two objects are Employee and Department, and each keeps all of its data in a single
unit (the object). They are linked with each other because they share a common attribute,
i.e. Department_Id.
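
The Employee and Department objects can be sketched with Python dataclasses. Only the two class
names and the shared Department_Id attribute come from the text; every other field, the sample
values, and the `department` method are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Department:
    department_id: int
    name: str


@dataclass
class Employee:
    employee_id: int
    name: str
    department_id: int  # the shared attribute that links the two objects

    def department(self, departments):
        """Follow the link to this employee's Department object."""
        return departments[self.department_id]


departments = {10: Department(10, "Accounts")}
employee = Employee(1, "Asha", 10)
print(employee.department(departments).name)  # Accounts
```

Each object bundles its data with the behaviour that works on it, here the method that follows
the link from an employee to a department.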

Advantages of Data Models in DBMS:-

 Data models ensure that the data is represented accurately.
 The relationships between the data are well-defined.
 Data redundancy can be minimized and missing data can be identified easily.
 Last but not least, the security of the data is not compromised.

Disadvantages of Data Models in DBMS:-

 The biggest disadvantage is that one must know the characteristics of the physical
data to build a data model.
 In big databases, the data model can be quite difficult to understand, and the cost
incurred is very high.

DBMS Architecture :-

DBMS architecture refers to the design and layout of how a database management system (DBMS)
organizes, manages, and communicates with data. It outlines how data flows between users,
applications, and the database itself. There are mainly three types of DBMS architecture:

1) Single Tier Architecture (One-Tier Architecture)
2) Two-Tier Architecture
3) Three-Tier Architecture

1. Single Tier Architecture (One-Tier Architecture):-

 In this architecture, the database is directly available to the user: the user works
directly on the DBMS itself.
 Any changes made here are applied directly to the database. It doesn’t provide a handy
tool for end users.
 The 1-Tier architecture is used for developing local applications, where programmers
communicate directly with the database for a quick response.

Single Tier DBMS Architecture is used whenever:

 The data isn’t changed frequently.
 Only a single user accesses the database system.
 We need a direct and simple way to modify or access the database for application
development.

Highlights:

a- Simplest DBMS architecture.

b- All the components of DBMS, i.e., the server, database, and client, reside on
a single system.

c- The user can directly access the database.

d- Used when data isn’t changing frequently.

e- Suitable for programmers, database designers, and single-user access.


2. Two Tier Architecture:-

 The 2-Tier architecture is the same as the basic client-server model. In the two-tier
architecture, applications on the client end communicate directly with the database on
the server side. For this interaction, APIs like ODBC and JDBC are used.
 The user interfaces and application programs run on the client side.
 The server side is responsible for providing functionality such as query processing and
transaction management. To communicate with the DBMS, the client-side application
establishes a connection with the server side.

The main advantages of a two-tier architecture over a single tier are:

 Multiple users can use it at the same time. Hence, it can be used in an organization.
 It has high processing ability, as the database functionality is handled by the server
alone.
 The direct connection gives faster access to the database and improved performance.
 Because of the two independent layers, it’s easier to maintain.

The main disadvantages of Two-Tier DBMS Architecture are:

 Scalability – As the number of clients increases, the load on the server increases,
which degrades the performance of the DBMS and, in turn, of the client-side application.
 Security – The direct connection between the client and server systems makes this
architecture vulnerable to attacks.

Highlights:-

A. Similar to a client-server architecture.
B. Faster access, easier to maintain, and can handle multiple users simultaneously.
C. Used when we wish to access the DBMS via applications and APIs.
D. Has scalability and security issues because of the direct client-server connection.

3. Three Tier Architecture:-

 The 3-Tier architecture contains another layer between the client and the server. In
this architecture, the client can’t directly communicate with the server.
 The application on the client end interacts with an application server, which in turn
communicates with the database system.
 The end user has no idea about the existence of the database beyond the application
server, and the database has no idea about any user beyond the application. The 3-Tier
architecture is used for large web applications.

The main advantages of Three Tier DBMS Architecture are:

 Scalability – Since the database server isn’t aware of any users beyond the application
layer, and the application layer can implement load balancing, the system can serve a much
larger number of clients.
 Data Integrity – Data corruption and bad requests can be avoided because of the checks
performed in the application layer on each client request.
 Security – Removing the direct connection between the client and server systems
reduces unauthorized access to the database.

Note – In Three Tier DBMS Architecture, an additional layer (Application Layer) is added
between the Client and the Server. This increases the number of layers present between the
DBMS and the end-users, making the implementation of the DBMS structure complex and
difficult to maintain.

Highlights :

1- Most widely used DBMS architecture.
2- Follows Client-Application-Server architecture.
3- Enhanced security, data integrity, and scalability.
4- Has complexity and maintenance issues because of the extra layer.
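
The three tiers can be sketched in a single Python file. This is only an illustration of the
layering: in a real deployment the tiers usually run as separate processes or machines, which
this sketch does not attempt. The `account` table and both function names are hypothetical, and
SQLite stands in for the database server.

```python
import sqlite3

# --- Data tier: the database itself ---
_db = sqlite3.connect(":memory:")
_db.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL NOT NULL)")
_db.execute("INSERT INTO account VALUES (1, 100.0)")
_db.commit()


# --- Application tier: validates every request before touching the data tier ---
def handle_withdraw(account_id, amount):
    if not isinstance(amount, (int, float)) or amount <= 0:
        return {"ok": False, "error": "amount must be a positive number"}
    row = _db.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()
    if row is None:
        return {"ok": False, "error": "unknown account"}
    if row[0] < amount:
        return {"ok": False, "error": "insufficient funds"}
    with _db:  # commit on success, roll back on error
        _db.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?",
            (amount, account_id),
        )
    return {"ok": True, "balance": row[0] - amount}


# --- Client tier: only ever talks to the application tier ---
def client_request(account_id, amount):
    return handle_withdraw(account_id, amount)


ok_response = client_request(1, 30)
bad_response = client_request(1, -5)
too_much = client_request(1, 500)
print(ok_response, bad_response, too_much)
```

The client code never sees a SQL statement or a connection object, and the bad requests are
rejected in the application tier before they reach the database.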

Data independence :-

Data independence is the ability to change the schema at one level of the database without
having to change the schema at the next higher level. In simple words, data independence is
a property of a database that allows the user or database administrator to change the schema
at one level without affecting the data or schema at another level.

Purpose: The purpose of data independence is to enhance the security of the system and to save
the time and cost otherwise needed whenever the information is changed or altered.

There are two types of data independence:

A. Logical Data Independence:-

Logical data independence is the ability to change the logical (conceptual) level
without affecting the other layers of the database. It is what allows the conceptual
schema to change without having to change the external schemas or application
programs. It lets us make changes to the conceptual structure, such as adding,
modifying, or deleting an attribute in the database.

Example: In a banking database, if we add a new attribute to the customer record, or
modify or delete an existing one, the change is made at the logical level. It does not
affect the physical level or structure of the database, and external views and
application programs that do not use that attribute keep working.

These changes can be made at the logical level without affecting the application program
or external layer:

 Adding, deleting, or modifying an entity or relationship.
 Merging or splitting the records present in the database.
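
A minimal sketch of logical data independence, assuming SQLite through Python's sqlite3 module.
The `customer` table, the `customer_names` view, and the `list_customer_names` function (playing
the role of an application program) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO customer VALUES (1, 'Asha'), (2, 'Ravi');

    -- External schema: what the application is allowed to see.
    CREATE VIEW customer_names AS SELECT cust_id, name FROM customer;
    """
)


def list_customer_names(conn):
    """The 'application program': it only knows about the view."""
    return conn.execute(
        "SELECT cust_id, name FROM customer_names ORDER BY cust_id"
    ).fetchall()


before = list_customer_names(conn)

# Change at the logical level: add a new attribute to the customer record.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")

after = list_customer_names(conn)
print(before == after)
```

The application function is not edited at all, yet it returns the same rows before and after the
new attribute is added.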

B. Physical Level Data Independence:-

Physical data independence is the ability to change the physical level without affecting the
logical or conceptual level. It gives us the freedom to modify the storage device, file
structure, location of the database, etc. without changing the definition of the conceptual or
view level.

Example: In the banking database, if we want to scale up the database by changing the storage
size and also want to change the file structure, we can do so without affecting any
functionality of the logical schema.

The following changes can be made at the physical layer without affecting the conceptual layer:

 Changing the storage device, e.g. between SSDs, hard disks, and magnetic tapes.
 Changing the access technique and modifying indexes.
 Changing the compression techniques or hashing algorithms.
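
Modifying indexes is the easiest of these changes to try out. The sketch below assumes SQLite
through Python's sqlite3 module; the `customer` table, the index name, and the sample rows are
hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Delhi"), (3, "Meena", "Pune")],
)
conn.commit()

QUERY = "SELECT name FROM customer WHERE city = ? ORDER BY cust_id"


def customers_in(conn, city):
    """Application query; it never mentions how the data is stored."""
    return [name for (name,) in conn.execute(QUERY, (city,))]


before = customers_in(conn, "Pune")

# Physical-level change: add an index to speed up lookups by city.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")

after = customers_in(conn, "Pune")

# The access path the DBMS picks may now differ; the exact wording of the plan
# depends on the SQLite version, so it is printed rather than compared.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("Pune",))]
print(before, after, plan)
```

The query text and its result stay the same; only the way the DBMS finds the rows is free to
change.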

Difference between Logical Data Independence and Physical Data Independence:

| Physical Data Independence | Logical Data Independence |
|---|---|
| Allows the internal schema to change without requiring a change in the logical schema. | Ensures that adding a new field or deleting an existing field does not require a change in the application program. |
| Easy to attain in comparison to logical data independence. | Difficult to attain compared to physical data independence. |
| Provides flexibility when we want to move the database or change the file organization structure. | Lets us change the data definition and the structure of the data without changing the external schemas or application programs. |
| Deals with the internal structure of the schema. | Deals with the conceptual schema. |
| Examples of changes: changing the compression techniques, hashing algorithms, storage device (e.g. SSD), location of the database, etc. | Examples of changes: adding, deleting, or modifying an entity or relationship. |

How Data Independence is Achieved:-

Data independence is achieved through the separation of the database into multiple levels in the
DBMS architecture:

 Physical Level: Describes how the data is physically stored.

This is the lowest level of abstraction in the DBMS. It describes how the data is physically
stored on storage devices such as hard drives or SSDs. It deals with low-level details like
data structures, indexing, file organization, and data compression.

 Logical Level: Describes the structure and organization of the data.

This is the middle level of abstraction. It describes what data is stored in the database and
the relationships between data entities, without specifying how the data is stored. It focuses
on the overall structure of the database (schema) and defines entities, attributes, data types,
and relationships.

 View Level: Describes how the data is viewed by users or specific applications.

This is the highest level of abstraction, where different users or applications are provided
with the specific views of the data that they need. These views are typically a subset or
customized presentation of the data from the logical level, and they can simplify access by
hiding irrelevant details.

Advantages of Data Independence :-

 Flexibility: Data independence allows for changes to be made in the database schema
(structure) without affecting the way data is accessed or presented to users. This flexibility
makes it easier to adapt the database to evolving requirements and business needs.
 Application Compatibility: Changes to the logical schema do not impact the application
programs or queries that rely on the database. This means that existing applications can
continue to function correctly even when the database structure changes, reducing the risk
of disruptions.
 Easier Maintenance: Database administrators can perform routine maintenance tasks,
such as reorganizing data for performance optimization or implementing security updates,
without disrupting user access or application functionality.
 Enhanced Security: Data independence allows for security measures and access controls
to be implemented at the logical level, protecting sensitive data from unauthorized access
or modification. Security policies can be enforced without exposing the underlying physical
storage details.
 Data Integrity: Changes to the logical schema can be managed carefully to ensure data
integrity and consistency. Referential integrity constraints and validation rules can be
applied at the logical level to maintain data quality.

Disadvantages of Data Independence:-

While data independence offers many advantages in database management systems, it’s essential
to consider its potential disadvantages and limitations:

 Complexity: Maintaining multiple levels of schema (external, conceptual, and internal) to
achieve data independence can introduce complexity into the database system. This
complexity can make database design and management more challenging.
 Performance Overhead: Implementing data independence can sometimes result in
performance overhead. The added layers of abstraction between the logical and physical
data can impact query performance and data retrieval efficiency.
 Resource Consumption: Managing data independence may require additional system
resources, such as storage space and processing power, to handle the various schema layers
and translations between them.
 Potential for Redundancy: In some cases, data independence may lead to data
redundancy. Different external schemas might require the same data to be stored in multiple
formats or physical locations, which can increase storage requirements and synchronization
challenges.
 Data Integrity Risk: Changes in the logical schema, if not managed carefully, can lead to
data integrity issues. Ensuring that data remains consistent and that referential integrity
constraints are maintained can be challenging when altering the logical schema.
