No SQL - Types, CAP Theorem

Uploaded by

Guhan Bala

NoSQL Databases: Introduction – CAP Theorem

NoSQL is a type of database management system (DBMS) that is designed to handle and store
large volumes of unstructured and semi-structured data. Unlike traditional relational
databases that use tables with pre-defined schemas to store data, NoSQL databases use flexible
data models that can adapt to changes in data structures and are capable of scaling horizontally
to handle growing amounts of data.
The term NoSQL originally referred to “non-SQL” or “non-relational” databases, but
the term has since evolved to mean “not only SQL,” as NoSQL databases have expanded to
include a wide range of different database architectures and data models.
NoSQL databases are generally classified into four main categories:
Document databases: These databases store data as semi-structured documents, such as JSON
or XML, and can be queried using document-oriented query languages.
Key-value stores: These databases store data as key-value pairs, and are optimized for simple
and fast read/write operations.
Column-family stores: These databases store data as column families, which are sets of
columns that are treated as a single entity. They are optimized for fast and efficient querying
of large amounts of data.
Graph databases: These databases store data as nodes and edges, and are designed to handle
complex relationships between data.
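As a rough illustration only, the four models can be mimicked with plain Python data
structures. The sample user records, the key names such as user:1, and the followers_of
helper are all invented for this sketch and do not correspond to any product's API.

```python
# Hypothetical sample data: the same kind of user record modelled four ways.

# Document model: one self-contained, nested record per entity.
document = {
    "_id": "user:1",
    "name": "Asha",
    "emails": ["asha@example.com"],
    "address": {"city": "Chennai"},
}

# Key-value model: an opaque value looked up by a single key.
key_value = {"user:1:name": "Asha", "user:1:city": "Chennai"}

# Column-family model: row key -> column family -> column -> value.
column_family = {
    "user:1": {
        "profile": {"name": "Asha", "city": "Chennai"},
        "activity": {"last_login": "2024-01-01"},
    }
}

# Graph model: nodes plus labelled edges between them.
nodes = {"user:1": {"name": "Asha"}, "user:2": {"name": "Ravi"}}
edges = [("user:1", "FOLLOWS", "user:2")]


def followers_of(node_id):
    """Return ids of nodes that have a FOLLOWS edge pointing at node_id."""
    return [src for src, label, dst in edges if label == "FOLLOWS" and dst == node_id]
```

The graph variant is the only one where a relationship is itself a stored item, which is
why relationship-heavy queries tend to suit that model.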
NoSQL databases are often used in applications where there is a high volume of data
that needs to be processed and analyzed in real time, such as social media analytics,
e-commerce, and gaming. They can also be used for other applications, such as content
management systems, document management, and customer relationship management.
However, NoSQL databases may not be suitable for all applications, as they may not
provide the same level of data consistency and transactional guarantees as traditional relational
databases. It is important to carefully evaluate the specific needs of an application when
choosing a database management system.
NoSQL, originally meaning “non-SQL” or “non-relational”, describes a database that provides a
mechanism for storing and retrieving data modeled in ways other than the tabular relations
used in relational databases. Such databases have existed since the late 1960s, but did not
acquire the NoSQL name until a surge of popularity in the early twenty-first century. NoSQL
databases are used in real-time web applications and big data, and their use is increasing
over time.
NoSQL systems are also sometimes called “Not only SQL” to emphasize the fact that
they may support SQL-like query languages. The appeal of a NoSQL database includes simplicity
of design, simpler horizontal scaling to clusters of machines, and finer control over
availability. The data structures used by NoSQL databases differ from those used by default
in relational databases, which makes some operations faster in NoSQL. The suitability of a
given NoSQL database depends on the problem it should solve.
NoSQL databases, also known as “not only SQL” databases, are a type of database
management system that has gained popularity in recent years. Unlike traditional relational
databases, NoSQL databases are designed to handle large amounts of unstructured or
semi-structured data, and they can accommodate dynamic changes to the data model. This makes
NoSQL databases a good fit for modern web applications, real-time analytics, and big data
processing.
Data structures used by NoSQL databases are sometimes also viewed as more flexible
than relational database tables. Many NoSQL stores compromise consistency in favor of
availability, speed and partition tolerance. Barriers to the greater adoption of NoSQL stores
include the use of low-level query languages, lack of standardized interfaces, and huge previous
investments in existing relational databases.
Most NoSQL stores lack true ACID (Atomicity, Consistency, Isolation, Durability)
transactions, but a few databases, such as MarkLogic, Aerospike, FairCom c-treeACE, Google
Spanner (though technically a NewSQL database), Symas LMDB, and OrientDB have made
them central to their designs.
Most NoSQL databases offer eventual consistency, in which database changes are
propagated to all nodes over time, so a query might not return updated data
immediately and might instead read an older value, a problem known as
stale reads. Some NoSQL systems may also exhibit lost writes and other forms of data loss.
Some NoSQL systems provide mechanisms such as write-ahead logging to avoid data loss.
One simple example of a NoSQL database is a document database. In a document database,
data is stored in documents rather than tables. Each document can contain a different set of
fields, making it easy to accommodate changing data requirements.
Take, for instance, a database that holds data about employees. In
a relational database, this information might be stored in tables, with one table for employee
information and another table for department information. In a document database, each
employee would be stored as a separate document, with all of their information contained
within the document.
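The employee example can be sketched in Python as follows. This is only an illustration
with in-memory lists standing in for tables and for a document collection; the field names
(emp_id, dept_id, skills) and both helper functions are assumptions made for the sketch.

```python
# Relational style: two "tables" linked by a department id.
departments = [{"dept_id": 10, "dept_name": "Engineering"}]
employees = [{"emp_id": 1, "name": "Asha", "dept_id": 10}]


def employee_with_department(emp_id):
    """Join the two tables in application code to rebuild one employee view."""
    emp = next(e for e in employees if e["emp_id"] == emp_id)
    dept = next(d for d in departments if d["dept_id"] == emp["dept_id"])
    return {"name": emp["name"], "department": dept["dept_name"]}


# Document style: each employee is one self-contained document, and
# documents in the same collection may carry different sets of fields.
employee_docs = [
    {"_id": 1, "name": "Asha", "department": {"name": "Engineering"}},
    {"_id": 2, "name": "Ravi", "department": {"name": "Sales"}, "skills": ["negotiation"]},
]


def find_by_id(doc_id):
    """Fetch a whole employee document with a single lookup, no join needed."""
    return next(d for d in employee_docs if d["_id"] == doc_id)
```

Note that the second document carries a skills field the first one lacks; no schema change
was needed to add it.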
NoSQL databases are a relatively new type of database management system that has
gained popularity in recent years due to its scalability and flexibility. They are designed to
handle large amounts of unstructured or semi-structured data and can handle dynamic changes
to the data model. This makes NoSQL databases a good fit for modern web applications,
real-time analytics, and big data processing.
Key Features of NoSQL:
Dynamic schema: NoSQL databases do not have a fixed schema and can accommodate
changing data structures without the need for migrations or schema alterations.
Horizontal scalability: NoSQL databases are designed to scale out by adding more nodes to
a database cluster, making them well-suited for handling large amounts of data and high levels
of traffic.
Document-based: Some NoSQL databases, such as MongoDB, use a document-based data
model, where data is stored in semi-structured format, such as JSON or BSON.
Key-value-based: Other NoSQL databases, such as Redis, use a key-value data model, where
data is stored as a collection of key-value pairs.
Column-based: Some NoSQL databases, such as Cassandra, use a column-based data model,
where data is organized into columns instead of rows.
Distributed and high availability: NoSQL databases are often designed to be highly available
and to automatically handle node failures and data replication across multiple nodes in a
database cluster.
Flexibility: NoSQL databases allow developers to store and retrieve data in a flexible and
dynamic manner, with support for multiple data types and changing data structures.
Performance: NoSQL databases are optimized for high performance and can handle a high
volume of reads and writes, making them suitable for big data and real-time applications.
Advantages of NoSQL: There are many advantages of working with NoSQL databases such
as MongoDB and Cassandra. The main advantages are high scalability and high availability.

High scalability: NoSQL databases use sharding for horizontal scaling. Sharding means
partitioning the data and placing the partitions on multiple machines in such a way that the
order of the data is preserved. Vertical scaling means adding more resources to the existing
machine, whereas horizontal scaling means adding more machines to handle the data. Vertical
scaling is limited by how far a single machine can be upgraded, whereas NoSQL databases are
designed so that horizontal scaling is comparatively easy. Examples of horizontally scalable
databases are MongoDB, Cassandra, etc. NoSQL can handle a huge amount of data because of this
scalability: as the data grows, a NoSQL database scales out to handle it in an efficient manner.
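A minimal sketch of order-preserving (range-based) sharding is shown below, assuming
string keys and two made-up split points. The names SPLIT_POINTS, shard_for, put, get, and
scan_in_order are hypothetical; real systems choose and rebalance split points
automatically.

```python
import bisect

# Hypothetical split points: keys below "h" go to shard 0,
# keys from "h" up to (not including) "p" go to shard 1, the rest to shard 2.
SPLIT_POINTS = ["h", "p"]
shards = [dict() for _ in range(len(SPLIT_POINTS) + 1)]


def shard_for(key):
    """Pick the index of the shard whose key range contains `key`."""
    return bisect.bisect_right(SPLIT_POINTS, key)


def put(key, value):
    """Store the value on the shard responsible for the key."""
    shards[shard_for(key)][key] = value


def get(key):
    """Look the key up on the one shard that could hold it."""
    return shards[shard_for(key)].get(key)


def scan_in_order():
    """Visiting shards left to right yields all keys in sorted order."""
    return [k for shard in shards for k in sorted(shard)]
```

Because each shard owns a contiguous key range, scaling out amounts to adding a split point
and moving one range to a new machine, while range scans still see keys in order.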
Flexibility: NoSQL databases are designed to handle unstructured or semi-structured data,
which means that they can accommodate dynamic changes to the data model. This makes
NoSQL databases a good fit for applications that need to handle changing data requirements.
High availability: The automatic replication feature of NoSQL databases makes them highly
available, because in case of a failure the data can be served from a replica that holds the
previous consistent state.
Scalability: NoSQL databases are highly scalable, which means that they can handle large
amounts of data and traffic with ease. This makes them a good fit for applications that need to
handle large amounts of data or traffic.
Performance: NoSQL databases are designed to handle large amounts of data and traffic,
which means that they can offer improved performance compared to traditional relational
databases.
Cost-effectiveness: NoSQL databases are often more cost-effective than traditional relational
databases, as they are typically less complex and do not require expensive hardware or
software.
Agility: Ideal for agile development.
Disadvantages of NoSQL: NoSQL has the following disadvantages.
Lack of standardization: There are many different types of NoSQL databases, each with its
own unique strengths and weaknesses. This lack of standardization can make it difficult to
choose the right database for a specific application.
Lack of ACID compliance: Many NoSQL databases are not fully ACID-compliant, which means
that they may not guarantee the consistency, integrity, and durability of data. This can be a
drawback for applications that require strong data consistency guarantees.
Narrow focus: NoSQL databases have a narrow focus, as they are mainly designed for storage
and provide relatively little other functionality. Relational databases are a better choice
than NoSQL in the field of transaction management.
Open-source: Many NoSQL databases are open-source, and there is no reliable standard for
NoSQL yet. In other words, two database systems are unlikely to be equivalent.
Lack of support for complex queries: NoSQL databases are not designed to handle complex
queries, which means that they are not a good fit for applications that require complex data
analysis or reporting.
Lack of maturity: NoSQL databases are relatively new and lack the maturity of traditional
relational databases. This can make them less reliable and less secure than traditional databases.
Management challenge: The purpose of big data tools is to make the management of a large
amount of data as simple as possible. But it is not so easy. Data management in NoSQL is
much more complex than in a relational database. NoSQL, in particular, has a reputation for
being challenging to install and even more hectic to manage on a daily basis.
GUI is not available: GUI tools for accessing the database are not widely available in the
market.
Backup: Backup is a weak point for some NoSQL databases. Taking a backup of the data in a
consistent manner across a distributed deployment, for example in MongoDB, can be difficult.
Large document size: Some database systems like MongoDB and CouchDB store data as
JSON-style documents. Because every document repeats its key names, documents can be quite
large, which matters for big data volumes, network bandwidth, and speed. Descriptive key
names actually hurt, since they increase the document size.
Types of NoSQL database: The types of NoSQL databases and the names of the database systems
that fall into each category are:

1. Graph databases: Examples – Amazon Neptune, Neo4j
2. Key-value stores: Examples – Memcached, Redis, Coherence
3. Tabular: Examples – HBase, Bigtable, Accumulo
4. Document-based: Examples – MongoDB, CouchDB, Cloudant

When should NoSQL be used:

1. A huge amount of data needs to be stored and retrieved.
2. The relationships between the data you store are not that important.
3. The data changes over time and is not structured.
4. Support for constraints and joins is not required at the database level.
5. The data is growing continuously and you need to scale the database regularly to handle
the data.

CAP theorem
It is very important to understand the limitations of NoSQL databases. A distributed
NoSQL database cannot provide both consistency and high availability while a network
partition is in progress. This was first expressed by Eric Brewer in the CAP theorem.
The CAP theorem, or Brewer's theorem, states that we can achieve at most two
out of three guarantees for a database: Consistency, Availability, and Partition Tolerance.
Consistency means that all nodes in the network see the same data at the same time.
Availability is a guarantee that every request receives a response indicating whether it
succeeded or failed. However, it does not guarantee that a read request returns the most recent
write. Informally, the more requests a system can keep answering, the better its availability.
Partition Tolerance is a guarantee that the system continues to operate despite arbitrary
message loss or failure of part of the system. In other words, even if there is a network outage
in the data center and some of the computers are unreachable, still the system continues to
perform.
Of these three guarantees, no system can provide more than two. Since network
partitions cannot be ruled out in distributed systems, partition tolerance is a must, and the
tradeoff is always between consistency and availability.
As depicted in the Venn diagram (not reproduced here), an RDBMS can provide consistency
and availability but not partition tolerance, while HBase and Redis can provide consistency
and partition tolerance. In this classification, MongoDB, CouchDB, Cassandra, and Dynamo
favor availability and partition tolerance over strict consistency. Such databases generally
settle for eventual consistency, meaning that after a while all nodes converge on the same
data.
Let us take a look at various scenarios or architectures of systems to better understand
the CAP theorem.
The first one is an RDBMS in which reading and writing of data happen on the same
machine. Such systems are consistent but not partition tolerant, because if this machine goes
down, there is no backup. Also, if one user is modifying a record, others have to wait,
thus compromising high availability.
The second diagram is of a system that has two machines. Only one machine (node A) can
accept modifications, while reads can be served from all machines. In such systems, the
modifications flow from that one machine to the rest. Such systems are highly available, as
there are multiple machines to serve requests. They are also partition tolerant, because if one
machine goes down, there are other machines available to take up that responsibility. Since it
takes time for the data to reach the other machines from node A, the other machines may be
serving older data. This causes inconsistency, though the data is eventually going to reach all
machines, and after a while things are going to be okay. Therefore, we call such systems
eventually consistent instead of strongly consistent. This kind of architecture is found in
ZooKeeper and MongoDB.
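The single-writer architecture described above can be simulated in a few lines. This is a
toy model under simplifying assumptions (one process, replication triggered by hand); the
Node and Cluster classes and their method names are invented for the sketch and are not how
any particular database implements replication.

```python
class Node:
    """A machine holding a full copy of the data."""

    def __init__(self):
        self.data = {}


class Cluster:
    """One writable primary (node A); replicas receive changes later."""

    def __init__(self, replica_count):
        self.primary = Node()
        self.replicas = [Node() for _ in range(replica_count)]
        self.pending = []  # changes accepted by the primary but not yet shipped

    def write(self, key, value):
        # The write succeeds as soon as the primary has it.
        self.primary.data[key] = value
        self.pending.append((key, value))

    def read_from_replica(self, index, key):
        # May return stale data if replication has not caught up yet.
        return self.replicas[index].data.get(key)

    def replicate(self):
        # Ship queued changes to every replica, in the order they were written.
        for key, value in self.pending:
            for replica in self.replicas:
                replica.data[key] = value
        self.pending.clear()
```

A read from a replica between write and replicate returns the older value (a stale read);
once replicate has run, every machine agrees again, which is what "eventually consistent"
means here.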
In the third design, we have one machine similar to our first
diagram along with its backup. Every new change or modification at A in the diagram is
propagated to the backup machine B. There is only one machine which is interacting with the
readers and writers, so it is consistent but not highly available.
Let’s first understand C, A, and P in simple words:
Consistency: means that all clients see the same data at the same time, no matter which
node they connect to in a distributed system. To achieve consistency, whenever data is written
to one node, it must be instantly forwarded or replicated to all the other nodes in the system
before the write is deemed successful.
Availability: means that every non-failing node returns a response for all read and write
requests in a reasonable amount of time, even if one or more nodes are down. Another way to
state this — all working nodes in the distributed system return a valid response for any request,
without failing or exception.
Partition Tolerance: means that the system continues to operate despite arbitrary
message loss or failure of part of the system. In other words, even if there is a network outage
in the data center and some of the computers are unreachable, still the system continues to
perform. Distributed systems guaranteeing partition tolerance can gracefully recover from
partitions once the partition heals.
The CAP theorem categorizes systems into three categories:
CP (Consistent and Partition Tolerant) database: A CP database delivers
consistency and partition tolerance at the expense of availability. When a partition occurs
between any two nodes, the system has to shut down the non-consistent node (i.e., make it
unavailable) until the partition is resolved.
Partition refers to a communication break between nodes within a distributed system.
That is, if a node cannot receive any messages from another node in the system, there is a
partition between the two nodes. A partition can be caused by a network failure, a server
crash, or any other reason.
AP (Available and Partition Tolerant) database: An AP database delivers
availability and partition tolerance at the expense of consistency. When a partition occurs, all
nodes remain available but those at the wrong end of a partition might return an older version
of data than others. When the partition is resolved, the AP databases typically resync the nodes
to repair all inconsistencies in the system.
CA (Consistent and Available) database: A CA database delivers consistency and availability
in the absence of any network partition. Single-node DB servers are often categorized as
CA systems, since they do not need to deal with partition tolerance.
In any networked shared-data system or distributed system, partition tolerance is a
must. Network partitions and dropped messages are a fact of life and must be handled
appropriately. Consequently, system designers must choose between consistency and
availability.
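The choice between consistency and availability during a partition can be illustrated with
a toy two-node model. The "CP" and "AP" mode strings, the Replica class, the cut_off flag,
and the demo function are all assumptions made for this sketch; real databases detect
partitions and decide which side stays up in far more involved ways.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request to protect consistency."""


class Replica:
    def __init__(self, mode):
        self.mode = mode      # "CP" or "AP" (labels used only in this sketch)
        self.value = None
        self.cut_off = False  # True while this node cannot reach its peer

    def read(self):
        if self.cut_off and self.mode == "CP":
            # CP: refuse rather than risk returning out-of-date data.
            raise Unavailable("cannot confirm the latest value during a partition")
        # AP (or no partition): always answer, even if the value may be stale.
        return self.value


def replicate(source, target):
    """Copy the latest value across unless the target is cut off."""
    if not target.cut_off:
        target.value = source.value


def demo(mode):
    """Write v1, partition node b away, write v2 on node a, then read from b."""
    a, b = Replica(mode), Replica(mode)
    a.value = "v1"
    replicate(a, b)
    b.cut_off = True   # the partition begins
    a.value = "v2"
    replicate(a, b)    # the update cannot reach b
    try:
        return b.read()
    except Unavailable:
        return "unavailable"
```

In this model demo("AP") answers with the older value, while demo("CP") refuses to answer
until the partition heals.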
