
Lecture#02 FileSystemAndDB

This document provides an overview of the objectives and topics that will be covered in a database systems course. The course aims to study how to design, implement, query, and manipulate databases from initial design through long-term use. It will cover relational and non-relational databases, database optimization techniques, and advanced parallel and distributed database systems. The outline presented includes types of data, advantages of database management systems over file systems, the ACID properties, and SQL versus NoSQL databases.

Uploaded by

farhan121921

CS-2005 DATABASE SYSTEMS

Week-02
Course Objectives
In this course we aim at studying:

■ How to design and implement databases from 'cradle-to-grave'
■ How to query and manipulate databases
■ How to refine and speed up data retrieval and manipulation
■ Connection with various DBMS
■ Parallel and distributed DBMSs, NoSQL

Focus areas: Application-Centric; Systems-Centric & Theory-Centric; Advanced Topics (A Brief Overview)
2
Outline

1. Types of Data
2. Drawbacks of File System
3. Advantages of DBMS
4. ACID property
5. SQL vs NoSQL

3
Files

4
File System

DBMS

5
Drawbacks of File System

■ Data Redundancy & Data Inconsistency


■ Difficulty in Accessing Data
■ Data Isolation
■ Integrity Problems
■ Atomicity Problems
■ Concurrent Access Anomalies
■ Security Problems

6
1. Data Redundancy & Data Inconsistency
[Figure: Office, Library, Accounts]
2. Difficulty in Accessing Data
[Figure: Office, Library, Account]
3. Data Isolation
4. Integrity Problems
5. Atomicity Problems

Account A: Initial Balance 1000/-
Account B: Initial Balance 2000/-

Fund Transfer of 800/- from Account A to Account B
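
A minimal sketch of how this transfer can go wrong without atomicity. It assumes the two balances live in a plain Python dictionary that stands in for records in ordinary files. The function name `transfer_without_transaction` and the simulated crash are hypothetical, used only for illustration.

```python
# Balances from the slide, held in a plain dict that stands in for file records.
accounts = {"A": 1000, "B": 2000}


def transfer_without_transaction(balances, src, dst, amount, crash_midway=False):
    """Move money in two separate steps with nothing tying them together."""
    balances[src] -= amount          # step 1: withdraw from the source
    if crash_midway:
        # Simulated failure (power loss, program crash) between the two steps.
        raise RuntimeError("system failed after the withdrawal")
    balances[dst] += amount          # step 2: deposit into the destination


try:
    transfer_without_transaction(accounts, "A", "B", 800, crash_midway=True)
except RuntimeError:
    pass

# The withdrawal happened but the deposit did not: 800/- has vanished.
print(accounts)                      # {'A': 200, 'B': 2000}
print(sum(accounts.values()))        # 2200 instead of the original 3000
```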


6. Concurrent Access Anomalies
7. Security
PURPOSE OF DBMS
DATABASE SYSTEMS OFFER SOLUTIONS TO ALL THE ABOVE PROBLEMS
Advantages of DBMS over file system
1. Fast Data Access: A DBMS reduces data response time.

2. Minimized Data Redundancy: A DBMS provides constraints that help prevent the same data from being stored in more than one place.

3. Data Consistency: Because a DBMS reduces data redundancy, each fact is stored once, so copies of it cannot drift out of sync.

4. No attributes for accessing the data: We don't need to know the location of the file. The user makes a request from any web application or app, and the server responds accordingly.

17
5. Concurrent Access: Multiple users can access the database at the same time when using a Database Management System.

6. Security: A DBMS provides role-based access control. Each user has a different set of access rights, so the data is protected from problems such as data leaks and misuse.
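
As an illustration of how constraints limit redundancy, here is a minimal sketch using Python's built-in `sqlite3` module. The `student` table, its columns, and the sample row are assumptions made up for this example. They are not part of the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "  roll_no TEXT PRIMARY KEY,"   # each roll number may appear only once
    "  name    TEXT NOT NULL"
    ")"
)
conn.execute("INSERT INTO student VALUES ('S-001', 'Ayesha')")
conn.commit()

duplicate_rejected = False
try:
    # Storing the same student a second time violates the PRIMARY KEY constraint.
    conn.execute("INSERT INTO student VALUES ('S-001', 'Ayesha')")
except sqlite3.IntegrityError:
    duplicate_rejected = True
    conn.rollback()

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(duplicate_rejected, row_count)   # True 1
```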
Examples RDBMS

19
ACID Property

■ Atomicity: A transaction is all-or-nothing. For example, a transaction to transfer funds from one account to another involves a withdrawal operation on the first account and a deposit operation on the second. If the deposit operation fails, you don't want the withdrawal operation to happen either.
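
A minimal sketch of an atomic transfer, assuming Python's built-in `sqlite3` module and a hypothetical `account` table. The `fail_before_deposit` flag only simulates a failure. It is not a real database feature.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 2000)])
conn.commit()


def transfer(conn, src, dst, amount, fail_before_deposit=False):
    """Withdraw and deposit inside one transaction: both happen or neither does."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        if fail_before_deposit:
            raise RuntimeError("simulated failure before the deposit")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
        conn.commit()
    except Exception:
        conn.rollback()   # undo the withdrawal as well
        raise


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


try:
    transfer(conn, "A", "B", 800, fail_before_deposit=True)
except RuntimeError:
    pass
after_failure = balances(conn)   # unchanged: {'A': 1000, 'B': 2000}

transfer(conn, "A", "B", 800)
after_success = balances(conn)   # {'A': 200, 'B': 2800}

print(after_failure, after_success)
```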

20
Cont..

■ Consistency: This property ensures that the transaction maintains data integrity constraints, leaving the data consistent. The transaction creates a new valid state of the data, and if a failure happens, all data is returned to the state it was in before the transaction was executed.
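
One way to picture this is a sketch with Python's built-in `sqlite3` module. The `account` table and the rule that a balance may not go negative are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "  name    TEXT PRIMARY KEY,"
    "  balance INTEGER NOT NULL CHECK (balance >= 0)"   # integrity constraint
    ")"
)
conn.execute("INSERT INTO account VALUES ('A', 1000)")
conn.commit()

violation_rejected = False
try:
    # Withdrawing 1500 would leave a negative balance, which the CHECK forbids.
    conn.execute("UPDATE account SET balance = balance - 1500 WHERE name = 'A'")
    conn.commit()
except sqlite3.IntegrityError:
    conn.rollback()              # return to the state before the transaction
    violation_rejected = True

balance = conn.execute("SELECT balance FROM account WHERE name = 'A'").fetchone()[0]
print(violation_rejected, balance)   # True 1000
```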

21
Cont..

■ Isolation: This property ensures that each transaction is isolated from the others: a transaction in progress is not affected by any other concurrent transaction until it is completed.
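
A small sketch of one visible effect of isolation: a second connection does not see changes that have not been committed yet. It assumes Python's built-in `sqlite3` module with its default transaction handling and a temporary database file. The table and connection names are hypothetical.

```python
import os
import sqlite3
import tempfile

# A file-backed database so that two separate connections can share it.
db_path = os.path.join(tempfile.mkdtemp(), "isolation_demo.db")

writer = sqlite3.connect(db_path)
writer.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
writer.execute("INSERT INTO account VALUES ('A', 1000)")
writer.commit()

reader = sqlite3.connect(db_path)


def read_balance(conn):
    """Read account A's balance, fully consuming the query result."""
    return conn.execute("SELECT balance FROM account WHERE name = 'A'").fetchall()[0][0]


# The writer changes the balance but has not committed yet.
writer.execute("UPDATE account SET balance = 200 WHERE name = 'A'")
seen_before_commit = read_balance(reader)   # still the old, committed value

writer.commit()
seen_after_commit = read_balance(reader)    # the change is now visible

print(seen_before_commit, seen_after_commit)   # 1000 200
writer.close()
reader.close()
```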

22
Cont..

■ Durability: Once a transaction is completed and committed, its changes are persisted permanently in the database, even if the system fails afterwards. The saved information stays unchanged until another update or deletion transaction affects it.
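
A rough sketch of the idea, assuming Python's built-in `sqlite3` module and a temporary database file. Closing and reopening the connection is only a stand-in for an application restart, not a real crash test.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "durability_demo.db")

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 1000)")
conn.commit()                       # committed: must survive from here on

conn.execute("UPDATE account SET balance = 0 WHERE name = 'A'")
conn.close()                        # closed without commit: this change is lost

# Reopening stands in for a restart of the application.
reopened = sqlite3.connect(db_path)
balance_after_restart = reopened.execute(
    "SELECT balance FROM account WHERE name = 'A'"
).fetchone()[0]
reopened.close()

print(balance_after_restart)        # 1000
```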

23
Quiz
Find at least 2 design errors in the given database

24
Types of Data
■ Structured Data
■ Unstructured Data

25
Structured Data

■ Stored in tabular format


■ Clearly defined
■ Data is stored in a pre-defined data model
■ Think of data that fits neatly within fixed fields and
columns in relational databases and spreadsheets.
■ Examples of structured data include
– names, dates, addresses,
– credit card numbers,
– stock information,
– geolocation, and more.

26
Structured Data

RDBMS

Structured data is stored in relational databases


27
Unstructured Data
■ No predefined structure
■ No data model
■ Irregular and ambiguous
■ Examples of unstructured data include
– text,
– video files,
– audio files,
– mobile activity,
– social media posts,
– satellite imagery
■ Non-relational or NoSQL databases are often a good fit for managing unstructured data.

28
Types of Databases

29
Horizontal Vs. Vertical Scaling

| | Horizontal scaling | Vertical scaling |
|---|---|---|
| Description | Increase or decrease the number of nodes in a cluster or system to handle an increase or decrease in workload | Increase or decrease the power of a system to handle increased or reduced workload |
| Example | Add or reduce the number of virtual machines (VM) in a cluster of VMs | Add or reduce the CPU or memory capacity of the existing VM |
| Execution | Scale in/out | Scale up/down |
| Workload distribution | Workload is distributed across multiple nodes. Parts of the workload reside on these different nodes | A single node handles the entire workload. |
| Concurrency | Distributes multiple jobs across multiple machines over the network, at a go. This reduces the workload on each machine | Relies on multi-threading on the existing machine to handle multiple requests at the same time |
| Required architecture | Distributed | Any |


30
Horizontal Vs. Vertical Scaling

| | Horizontal scaling | Vertical scaling |
|---|---|---|
| Implementation | Takes more time, expertise, and effort | Takes less time, expertise, and effort |
| Complexity and maintenance | Higher | Lower |
| Configuration | This requires modifying a sequential piece of logic in order to run workloads concurrently on multiple machines | No need to change the logic. The same code can run on a higher-spec device |
| Load balancing | Necessary to actively distribute workload across the multiple nodes | Not required in the single node |
| Failure | Low because other machines in the cluster offer backup | High since it's a single source of failure |
| Costs | High costs initially; optimal over time | Low-cost initially; less cost-effective over time |
| Networking | Quick inter-machine communication | Slower machine-to-machine communication |
| Performance | Higher | Lower |
| Limitation | Add as many machines as you can | Limited to the resource capacity the single machine can handle |
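
To make "workload is distributed across multiple nodes" concrete, here is a minimal sketch of hash-based placement in Python. The node names, key format, and the simple modulo scheme are assumptions for illustration. Real systems often use consistent hashing instead, so that adding a node moves fewer keys.

```python
import hashlib


def node_for(key, nodes):
    """Pick a node for a key using a stable hash (simple modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def distribute(keys, nodes):
    """Group keys by the node that would store them."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for(key, nodes)].append(key)
    return placement


keys = [f"user-{i}" for i in range(1000)]

two_nodes = distribute(keys, ["node-1", "node-2"])
four_nodes = distribute(keys, ["node-1", "node-2", "node-3", "node-4"])  # scale out

# Each node holds only part of the data; more nodes means fewer keys per node.
print({node: len(ks) for node, ks in two_nodes.items()})
print({node: len(ks) for node, ks in four_nodes.items()})
```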

31
Architectures

32
SQL
■ It has a predefined schema.
■ Missing values are stored as NULL (memory wastage)
■ Modifications require changing the schema or the data
■ Tabular format
■ Not easily scalable (designed for 90's technology or worse)
■ Requires joins
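
A short sketch of the first three points, assuming Python's built-in `sqlite3` module. The `student` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The schema is declared up front; every row has the same columns.
conn.execute("CREATE TABLE student (roll_no TEXT PRIMARY KEY, name TEXT NOT NULL, phone TEXT)")

# No phone number is known for this student, so the column holds NULL.
conn.execute("INSERT INTO student (roll_no, name) VALUES ('S-001', 'Ayesha')")

# Storing a new kind of information means changing the schema first.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
conn.execute("INSERT INTO student VALUES ('S-002', 'Bilal', '0300-0000000', 'bilal@example.com')")
conn.commit()

rows = conn.execute("SELECT roll_no, name, phone, email FROM student ORDER BY roll_no").fetchall()
for row in rows:
    print(row)
# ('S-001', 'Ayesha', None, None)
# ('S-002', 'Bilal', '0300-0000000', 'bilal@example.com')
```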

33
NoSQL

■ Schema-less Database
■ Change can be easily incorporated

– Key/value (Dynamo)
– Columnar/tabular (HBase)
– Document (MongoDB)
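
A minimal sketch of the schema-less idea in plain Python. A list of dictionaries stands in for a document collection and a dictionary of JSON strings for a key/value store. No real NoSQL client library is used, and the helper `find` is hypothetical.

```python
import json

# A list of dicts stands in for a document collection; no schema is declared.
students = []

students.append({"roll_no": "S-001", "name": "Ayesha"})
# A later document carries extra, nested fields without any migration step.
students.append({
    "roll_no": "S-002",
    "name": "Bilal",
    "contacts": {"email": "bilal@example.com"},
    "clubs": ["chess", "robotics"],
})


def find(collection, **criteria):
    """Return documents whose top-level fields match all given criteria."""
    return [doc for doc in collection
            if all(doc.get(field) == value for field, value in criteria.items())]


# A key/value view of the same data: look up a whole document by its key.
by_roll_no = {doc["roll_no"]: json.dumps(doc) for doc in students}

print(find(students, name="Bilal")[0]["clubs"])   # ['chess', 'robotics']
print(json.loads(by_roll_no["S-001"]))            # {'roll_no': 'S-001', 'name': 'Ayesha'}
```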

34
SQL vs NoSQL

35
Is NoSQL better than SQL?

■ NoSQL tends to be a better option for modern applications that have more complex, constantly changing data sets, requiring a flexible data model that doesn't need to be immediately defined.

■ NoSQL databases can store and process data in real-time.

■ NoSQL databases can't typically enforce or guarantee uniqueness for keys within documents like traditional relational systems do.

36
When to Choose SQL
■ Structured Data: If your data has a well-defined schema with fixed tables and relationships between them, SQL databases are a good choice.

■ ACID Compliance: SQL databases are ACID (Atomicity, Consistency, Isolation, Durability) compliant, which ensures data consistency and reliability. If your application requires strict transaction management, SQL databases are a strong choice.

■ Complex Queries: SQL databases excel at handling complex queries, especially those involving multiple joins and aggregations.

■ Scalability: SQL databases can be scaled vertically (by adding more resources to a single server) or, in some cases, horizontally. While horizontal scalability can be more challenging with SQL databases, it's still possible with certain configurations.
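
As an example of the kind of query meant under "Complex Queries", here is a small sketch that joins two tables and aggregates the result. It assumes Python's built-in `sqlite3` module. The `department` and `student` tables and their rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL);
    CREATE TABLE student (
        roll_no TEXT PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id),
        cgpa    REAL NOT NULL
    );
    INSERT INTO department VALUES (1, 'CS'), (2, 'EE');
    INSERT INTO student VALUES
        ('S-001', 'Ayesha', 1, 3.5),
        ('S-002', 'Bilal',  1, 3.0),
        ('S-003', 'Sana',   2, 4.0);
""")

# Join the two tables, then aggregate per department.
report = conn.execute("""
    SELECT d.dept_name, COUNT(*) AS students, AVG(s.cgpa) AS avg_cgpa
    FROM student AS s
    JOIN department AS d ON d.dept_id = s.dept_id
    GROUP BY d.dept_name
    ORDER BY d.dept_name
""").fetchall()

for dept_name, students, avg_cgpa in report:
    print(dept_name, students, avg_cgpa)
# CS 2 3.25
# EE 1 4.0
```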

37
When to Choose NoSQL
■ Flexible Schema: NoSQL databases are schema-less or have a flexible schema, making them suitable for projects with evolving or unstructured data.

■ High Throughput: NoSQL databases are often chosen for high-velocity applications with large volumes of data and high read/write rates, such as social media platforms, IoT, and real-time analytics.

■ Horizontal Scalability: NoSQL databases are designed for easy horizontal scaling. They can distribute data across multiple servers or nodes, providing excellent scalability and fault tolerance.

■ Variety of Data Models: NoSQL databases come in different flavors, including document-oriented (e.g., MongoDB), key-value stores (e.g., Redis), column-family stores (e.g., Cassandra), and graph databases (e.g., Neo4j). You can choose the NoSQL type that best matches your data and query requirements.

Ultimately, the choice between SQL and NoSQL databases depends on the data structure, scale, performance requirements, and the development team's familiarity with the technology. In some cases, a hybrid approach using both types of databases within the same project may be the most appropriate solution to take advantage of their respective strengths.

38
QUESTIONS
