Chap 4

The document provides an overview of NoSQL databases, highlighting their non-relational nature, scalability, and performance benefits for big data applications. It categorizes NoSQL databases into key-value, document, column family, and graph databases, detailing their unique characteristics and use cases. Specific examples such as Amazon DynamoDB, MongoDB, HBase, and Neo4j are discussed to illustrate the functionalities and advantages of different NoSQL database types.

Uploaded by

chandanaambadagatti


Shri Dharmasthala Manjunatheshwara College of Engineering and Technology,

Dharwad-580002
Department of Information Science and Engineering
Big Data Analytics
(18UISC700)

Course Instructor:

Dr. Rajashekarappa
Chapter – 4
NoSQL

Source: Arshdeep Bahga and Vijay Madisetti

NoSQL Databases
• Non-relational databases ("NoSQL databases") are becoming popular with the
increasing use of cloud computing services.
• Non-relational databases have better horizontal scaling capability and improved
performance for big data at the cost of having less rigorous consistency models.
• NoSQL databases are popular for applications in which the scale of data involved
is massive, the data may not be structured, and real-time performance is more
important than consistency. These systems are optimized for fast retrieval and
appending operations on records.
• Unlike relational databases, NoSQL databases do not have a strict schema.
• The records can be in the form of key-value pairs or documents. Most NoSQL
databases are classified in terms of the data storage model or type of records that
can be stored.

NoSQL Database Types

• Key-Value Databases
• Document databases
• Column family databases
• Graph databases

Key-Value Databases
• Key-value databases are the simplest form of NoSQL databases.
• These databases store data in the form of key-value pairs. The keys are used to
identify uniquely the values stored in the database.
• Applications that want to store data generate unique keys and submit the
key-value pairs to the database. The database uses the key to determine where the
value should be stored.
• Most key-value databases have distributed architectures comprising multiple
storage nodes. The data is partitioned across the storage nodes by the keys.
• Hash functions are used to determine the partitions for the keys. The partition
number for a key is obtained by applying a hash function to the key. The hash
functions are chosen such that the keys are evenly distributed across the
partitions.
• Unlike relational databases, in which tables have fixed schemas and there are
constraints on the columns, key-value databases have no such constraints.
Key-value databases do not have tables like those in relational databases.
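The hash-based partitioning described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation of any particular database: the partition count, the choice of SHA-256, and the names `partition_for` and `KeyValueStore` are all assumptions made for the example.

```python
import hashlib

NUM_PARTITIONS = 4  # assumed number of storage nodes


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition number using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class KeyValueStore:
    """Toy store: one dict per partition stands in for one storage node."""

    def __init__(self, num_partitions: int = NUM_PARTITIONS):
        self.partitions = [{} for _ in range(num_partitions)]

    def put(self, key, value):
        # The key alone decides which partition holds the value.
        self.partitions[partition_for(key, len(self.partitions))][key] = value

    def get(self, key, default=None):
        return self.partitions[partition_for(key, len(self.partitions))].get(key, default)


store = KeyValueStore()
store.put("user:1001", {"name": "Asha"})          # values have no fixed schema
store.put("user:1002", b"opaque bytes are fine")  # any value type is accepted
print(store.get("user:1001"))
```

Because the partition is computed from the key, a lookup goes straight to one partition without consulting the others.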
Amazon DynamoDB
• Amazon DynamoDB is a fully managed,
scalable, high-performance NoSQL
database service from Amazon.
• DynamoDB provides fast and predictable
performance and seamless scalability
without any operational overhead.
• DynamoDB’s data model includes Tables,
Items, and Attributes. A table is a
collection of items and each item is a
collection of attributes.
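The table, item and attribute model can be pictured with plain Python data structures. The sketch below does not call DynamoDB or any AWS SDK; the table name, attribute names and the `get_item` helper are hypothetical and only mirror the structure described above.

```python
# A table is a collection of items; each item is a collection of attributes.
# All names and values here are made up for illustration.
customers_table = [
    {"customer_id": "c-1", "name": "Asha", "email": "asha@example.com"},
    {"customer_id": "c-2", "name": "Ravi", "phone": "555-0100", "tags": ["premium"]},
]


def get_item(table, key_name, key_value):
    """Return the first item whose key attribute matches, or None."""
    for item in table:
        if item.get(key_name) == key_value:
            return item
    return None


item = get_item(customers_table, "customer_id", "c-2")
print(sorted(item))  # the attribute names carried by this item
```

Note that the two items share the key attribute but otherwise carry different attributes, which is what the absence of a fixed schema allows.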

Document Databases
• Document store databases store semi-structured data in the form of documents
which are encoded in different standards such as JSON, XML, BSON or YAML.
By semi-structured data we mean that the documents stored are similar to each
other (similar fields, keys or attributes) but there are no strict requirements for a
schema.
• Documents are organized in different ways in different document databases, such as
in the form of collections, buckets or tags.
• Each document stored in a document database has a collection of named fields and
their values. Each document is identified by a unique key or ID.
• There is no need to define any schema for the documents before storing them in the
database.
• While it is possible to store JSON or XML-like documents as values in a key-value
database, the benefit of using document databases over key-value databases is that
they allow the documents to be queried efficiently based on the attribute
values in the documents.
• Document databases are useful for applications that want to store semi-structured
data.
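The difference from a plain key-value store, namely querying by attribute values inside the documents, can be sketched as follows. This is a toy in-memory collection, not the API of any real document database: the class `DocumentCollection`, its `insert` and `find` methods, and the use of an `_id` field as the unique ID are assumptions for the example. It scans every document, whereas a real document database would typically use indexes to make such queries efficient.

```python
import uuid


class DocumentCollection:
    """Toy collection: documents are dicts, each identified by a unique ID."""

    def __init__(self):
        self._docs = {}  # document ID -> document

    def insert(self, doc):
        # No schema is declared up front; any set of fields is accepted.
        doc_id = doc.get("_id") or str(uuid.uuid4())
        self._docs[doc_id] = {**doc, "_id": doc_id}
        return doc_id

    def find(self, **criteria):
        """Return the documents whose fields equal all the given values."""
        return [
            d for d in self._docs.values()
            if all(d.get(k) == v for k, v in criteria.items())
        ]


products = DocumentCollection()
products.insert({"_id": "p1", "title": "Laptop", "category": "electronics", "ram_gb": 16})
products.insert({"_id": "p2", "title": "Novel", "category": "books", "author": "R. K. Narayan"})
products.insert({"_id": "p3", "title": "Phone", "category": "electronics"})

electronics = products.find(category="electronics")
print([d["title"] for d in electronics])
```

The documents are similar to each other but not identical in their fields, which matches the notion of semi-structured data used above.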
MongoDB
• MongoDB is a document-oriented
non-relational database system. MongoDB
is a powerful, flexible and highly scalable
database designed for web applications and
is a good choice as a serving database for
data analytics applications.
• The basic unit of data stored by MongoDB
is a document.
• A document includes a JSON-like set of
key-value pairs.
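A JSON-like document of the kind described above might look like the following. The sketch uses only Python's standard `json` module and does not connect to MongoDB; the field names and values are invented for illustration.

```python
import json

# A hypothetical order document: a set of key-value pairs in which values
# can themselves be nested documents or lists.
document = {
    "_id": "order-1001",
    "customer": {"name": "Asha", "city": "Dharwad"},
    "items": [
        {"sku": "A12", "qty": 2},
        {"sku": "B07", "qty": 1},
    ],
    "paid": True,
}

encoded = json.dumps(document, indent=2)  # text form of the document
decoded = json.loads(encoded)             # back to Python dicts and lists

total_qty = sum(line["qty"] for line in decoded["items"])
print(total_qty)
```

Nesting the customer and the order lines inside one document lets an application read the whole order in a single lookup.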

Column Family Databases
• In column family databases the basic unit of data storage is a column, which has
a name and a value.
• A collection of columns makes up a row, which is identified by a row key.
Columns are grouped together into column families.
• Unlike relational databases, column family databases do not need to have
fixed schemas or a fixed number of columns in each row.
• The number of columns in a column family database can vary across different
rows.
• A column family can be considered as a map having key-value pairs and this map
can vary across different rows.
• Column family databases store data in a denormalized form so that all relevant
information related to an entity required by the applications can be retrieved by
reading a single row.
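The idea of a column family as a map that varies from row to row can be sketched with nested dictionaries. This is only a mental model, not the storage format of any column family database; the row keys, family names and the helpers `read_row` and `column_count` are hypothetical.

```python
# row key -> column family -> {column name: value}
rows = {
    "user#1": {
        "profile": {"name": "Asha", "city": "Dharwad"},
        "activity": {"last_login": "2024-01-03"},
    },
    "user#2": {
        "profile": {"name": "Ravi"},
        "activity": {
            "last_login": "2024-01-05",
            "last_purchase": "2024-01-04",
            "device": "mobile",
        },
    },
}


def read_row(row_key):
    """Fetch everything stored for one entity with a single row read."""
    return rows.get(row_key, {})


def column_count(row_key):
    """Count the columns in a row across all of its column families."""
    return sum(len(columns) for columns in read_row(row_key).values())


print(column_count("user#1"), column_count("user#2"))
```

Both rows use the same two column families, but the columns inside each family differ, and one read of a row returns all the denormalized data for that entity.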
HBase
• HBase is a scalable, non-relational,
distributed, column-family database that
provides structured data storage for large
tables.
• HBase can store both structured and
unstructured data.
• The data storage in HBase can scale linearly
and automatically by the addition of new
nodes.
• HBase has been designed to work with
commodity hardware and is a highly reliable
and fault tolerant system.
• HBase allows fast random reads and writes.

HBase Data Model
• An HBase table consists of rows, which are indexed by the row key.
• Each row includes multiple column families.
• Each column family includes multiple columns.
• Each column includes multiple cells or entries which are timestamped.
• HBase tables are indexed by the row key, column key and timestamp.
• Unlike relational database tables, HBase tables do not have a fixed
schema.
• HBase column families are declared at the time of creation of the table
and cannot be changed later.
• Columns can be added dynamically, and HBase can have millions of
columns.
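The indexing by row key, column key and timestamp can be sketched as a small in-memory table. This is a toy model under stated assumptions, not the HBase client API: the class `ToyTable` and its `put` and `get` methods are hypothetical, and column keys are written as `family:qualifier` strings.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of cells indexed by (row key, column key, timestamp)."""

    def __init__(self, column_families):
        # Column families are fixed when the table is created.
        self.column_families = frozenset(column_families)
        # (row key, "family:qualifier") -> {timestamp: value}
        self._cells = defaultdict(dict)

    def put(self, row_key, column, value, timestamp):
        family = column.split(":", 1)[0]
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        # New columns inside a declared family need no prior declaration.
        self._cells[(row_key, column)][timestamp] = value

    def get(self, row_key, column, timestamp=None):
        versions = self._cells.get((row_key, column), {})
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)  # default to the newest version
        return versions.get(timestamp)


table = ToyTable(["info", "stats"])
table.put("row1", "info:email", "a@old.example", timestamp=1)
table.put("row1", "info:email", "a@new.example", timestamp=2)
table.put("row1", "stats:visits", 10, timestamp=2)
print(table.get("row1", "info:email"))
print(table.get("row1", "info:email", timestamp=1))
```

Writing the same column twice keeps both timestamped versions, so a read can ask for the latest value or for the value at a specific timestamp.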
HBase Architecture
• HBase has a distributed architecture.
• An HBase deployment comprises multiple region
servers which usually run on the same machines as
the Hadoop data nodes.
• HBase tables are partitioned by the row key into
multiple regions (HRegions). Each region server has
multiple regions.
• HBase has a master-slave architecture in which one of
the nodes acts as the master node (HMaster) and
the other nodes are slave nodes.
• The HMaster is responsible for maintaining the
HBase meta-data and assignment of regions to
region servers.
• HBase uses Zookeeper for distributed state
coordination.
• HBase has two special tables, ROOT and META,
for identifying which region server is responsible
for a given row key.
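Partitioning a table by row key into regions, and finding the region server for a key, can be sketched with a sorted list of region start keys. This is an illustration of the lookup idea only and not how HBase implements its catalog tables; the start keys, the server names `rs1` to `rs3`, and the `locate` function are assumptions.

```python
import bisect

# Each region covers the row keys from its start key up to the next start key.
# The empty string is the lowest possible key, so the first region starts there.
region_start_keys = ["", "g", "n", "t"]

# Assignment of regions (named by start key) to region servers; one server
# can hold several regions.
region_to_server = {"": "rs1", "g": "rs2", "n": "rs1", "t": "rs3"}


def locate(row_key):
    """Return (region start key, region server) responsible for a row key."""
    index = bisect.bisect_right(region_start_keys, row_key) - 1
    start_key = region_start_keys[index]
    return start_key, region_to_server[start_key]


for key in ["apple", "grape", "zebra"]:
    print(key, locate(key))
```

Because the regions are contiguous ranges of sorted row keys, a binary search over the start keys is enough to find the owning region.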
Graph Databases
• Graph stores are NoSQL databases designed for storing data that has
a graph structure with nodes and edges.
• While relational databases model data in the form of rows and columns,
the graph databases model data in the form of nodes and relationships.
• Nodes represent the entities in the data model. Nodes have a set of
attributes. A node can represent different types of entities, for example, a
person, place (such as a city, restaurant or a building) or an object (such as
a car).
• The relationships between the entities are represented in the form of links
between the nodes. Links also have a set of attributes. Links can be
directed or undirected. Directed links denote that the relationship is
unidirectional.
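The node and link model described above can be sketched with two small data classes. This is a minimal in-memory illustration, not the API of any graph database; the class names, the `neighbors` method and the sample entities are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Link:
    source: str
    target: str
    kind: str
    attributes: dict = field(default_factory=dict)
    directed: bool = True


class Graph:
    def __init__(self):
        self.nodes = {}   # node ID -> Node
        self.links = []   # list of Link

    def add_node(self, node_id, **attributes):
        self.nodes[node_id] = Node(node_id, attributes)

    def add_link(self, source, target, kind, directed=True, **attributes):
        self.links.append(Link(source, target, kind, attributes, directed))

    def neighbors(self, node_id, kind=None):
        """Nodes reachable from node_id in one step, respecting direction."""
        found = []
        for link in self.links:
            if kind is not None and link.kind != kind:
                continue
            if link.source == node_id:
                found.append(link.target)
            elif not link.directed and link.target == node_id:
                found.append(link.source)
        return found


g = Graph()
g.add_node("alice", type="person")
g.add_node("bob", type="person")
g.add_node("dharwad", type="city")
g.add_link("alice", "dharwad", "LIVES_IN", since=2019)     # directed link
g.add_link("alice", "bob", "FRIEND_OF", directed=False)    # undirected link
print(g.neighbors("alice"), g.neighbors("bob"), g.neighbors("dharwad"))
```

The directed link can be followed only from its source, while the undirected link is visible from both ends.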
Neo4j
• Neo4j is one of the popular graph
databases and provides support for
Atomicity, Consistency, Isolation,
Durability (ACID).
• Neo4j adopts a graph model that
consists of nodes and relationships.
• Both nodes and relationships have
properties which are captured in the
form of multiple attributes (key-value
pairs).
• Nodes are tagged with labels which are
used to represent different roles in the
domain being modeled.
Neo4j - Cypher

• For create, read, update and delete
(CRUD) operations, Neo4j provides a
query language called Cypher.
• Cypher has some similarities with the
SQL query language used for relational
databases.
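To give a feel for the four CRUD operations, the sketch below holds one example Cypher statement per operation as a Python string. The statements are not executed here, since that would need a running Neo4j server and a driver; the `Person` label, the property names and the values are made up for illustration.

```python
# One illustrative Cypher statement per CRUD operation.
crud_queries = {
    # Create a node with the label Person and two properties.
    "create": "CREATE (p:Person {name: 'Asha', city: 'Dharwad'})",
    # Read: match nodes by label and property, then return selected properties.
    "read": "MATCH (p:Person {name: 'Asha'}) RETURN p.name, p.city",
    # Update: match the node, then set a property on it.
    "update": "MATCH (p:Person {name: 'Asha'}) SET p.city = 'Hubli'",
    # Delete: match the node, then remove it along with its relationships.
    "delete": "MATCH (p:Person {name: 'Asha'}) DETACH DELETE p",
}

for operation, query in crud_queries.items():
    print(f"{operation:>6}: {query}")
```

The similarity with SQL shows in the keyword-driven, declarative style: `MATCH ... RETURN` plays a role comparable to `SELECT`, and `SET` and `DELETE` are comparable to their SQL counterparts.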

Comparison of NoSQL databases

Thank You
