Performance Benchmarking and Comparison of Cloud-Based Databases MongoDB (NoSQL) vs MySQL (Relational) Using YCSB
Rachit Pandey
National College of Ireland
All content following this page was uploaded by Rachit Pandey on 02 September 2020.
Abstract
Databases are the backbone of any business application, and it is of the utmost
importance that the database serving the application stands out with respect to
performance, availability, scalability, data integrity and security. Recently we have
seen a wave of new cloud data-serving databases that cater to cloud OLTP (online
transaction processing) applications, although they support ACID (Atomicity,
Consistency, Isolation, Durability) transactions only to a limited extent. Examples
of such systems are MongoDB, HBase and Cassandra. They are also called
NoSQL (schema-less) systems. On the other hand, we have traditional RDBMS
systems, which support ACID transactions and are widely used for a host of
application types. It is becoming extremely important to measure the performance
of databases with respect to certain parameters and decide which DBMS
(NoSQL or RDBMS) is best suited for the business needs.
In this report we replicate low- and high-volume application operations
on MongoDB and MySQL databases using the Yahoo! Cloud Serving Benchmark
(YCSB) tool and analyze the performance differences between the two systems
using the quantitative output generated by YCSB. The report describes the
experimental setup used to perform the tests and the evaluation of the results.
Keywords: NoSQL, YCSB, MySQL, RDBMS, MongoDB
Contents
1 Introduction
1.1 YCSB
3 Database Architectures
3.1 Relational-MySQL
3.2 NoSQL MongoDB
4 Comparison of capabilities of the Database Management Systems
4.1 Comparison Scalability in MySQL and MongoDB
5 Related Work
8 Conclusions
References
1 Introduction
This is the age of the big-data revolution, and with it the demand to efficiently
manage huge amounts of data is growing. Traditionally, database management systems
were governed by the principles of relational database management systems
(RDBMS) based on the structured query language (SQL). Recently, however, a new
group of data management solutions known as NoSQL (Not-only SQL)
has emerged as a strong contender in the DBMS arena. NoSQL is rapidly gaining popularity
because of its improved scalability and flexibility over RDBMS. Top global organizations
such as Google and Facebook, which deal with huge amounts of online data daily, are
adopting NoSQL-style databases.
SQL systems have their own unique characteristics: their ACID compliance, together
with greater structure, a powerful interface and support for complex operations,
still makes them the most popular choice for mid-sized and large organizations that
predominantly deal with structured data. Both SQL and NoSQL databases have their own
advantages, depending on where they are used. RDBMS systems are widely adopted by
applications and handle limited amounts of data with good performance, but the
traditional database falls short when handling large volumes of data, including
internet data and multimedia. The term NoSQL was first used by Carlo Strozzi
[1] in 1998 as the name of a relational database that did not use SQL as its query
language; it was reintroduced in 2009 by Eric Evans to refer to non-relational
databases. The primary benefit of a NoSQL database is that, unlike an RDBMS, it can
handle unstructured data easily, as it uses identification keys to locate data.
Strategies to store data in a NoSQL Database
• Document.
• Column.
• Graph-Oriented
NoSQL DB provides flexibility to add or remove attributes from the DB. In this paper we
focus on benchmarking the performance of an RDBMS (MySQL) and a NoSQL DB
(MongoDB). The benchmarking of the databases will be performed using the YCSB framework
(Section 1.1), which is one of the most popular methods of benchmarking and testing
the performance of database systems.
1.1 YCSB
In their paper, Cooper et al. [2] describe the importance and challenges of understanding
the performance of modern database systems. They set out to create a standard
benchmarking framework that helps users evaluate the performance of databases.
The proposed framework, YCSB (Yahoo! Cloud Serving Benchmark), allows users to
select and run different kinds of workloads against database systems, measure key
parameters such as throughput and latency, and understand the behaviour of the system.
The workloads can be standard (A-F) or custom designed to suit one's needs, and the
output can be obtained by running multiple iterations and a variety of test loads
ranging from small to very large sets of operations. Figure 1 below shows a typical
architecture of the YCSB framework.
The main feature of the YCSB framework is its extensibility and the ability to create
custom workloads for testing. YCSB is a Java-based program that generates
test loads for the database. The workload executor drives multiple client threads, and
each thread executes a sequential series of calls to the DB interface layer, both to load
the database (the load phase) and to execute the workload (the transaction phase).
Workloads can be executed by providing, in the YCSB command, details of the workload
file such as the type of workload and the operation count, and DB properties such as
threads, scan ratio, etc.
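As a rough illustration of how the workload file, the record and operation counts and the thread count come together in a YCSB invocation, the Python sketch below assembles the command line for the load and run phases. The `bin/ycsb` launcher, the `load` and `run` sub-commands, the `-s`, `-P`, `-p` and `-threads` options and the `recordcount`/`operationcount` properties belong to the YCSB distribution; the helper name `build_ycsb_command`, the `ycsb` install directory and the example values are assumptions made for illustration, not the scripts used in this study.

```python
from typing import List


def build_ycsb_command(phase: str, binding: str, workload_file: str,
                       record_count: int, operation_count: int,
                       threads: int, ycsb_home: str = "ycsb") -> List[str]:
    """Assemble (but do not run) a YCSB command line as an argument list.

    `ycsb_home` is a hypothetical install directory; adjust it to the local setup.
    """
    if phase not in ("load", "run"):
        raise ValueError("phase must be 'load' or 'run'")
    return [
        f"{ycsb_home}/bin/ycsb", phase, binding,
        "-s",                                   # print status while running
        "-P", workload_file,                    # workload definition file
        "-p", f"recordcount={record_count}",
        "-p", f"operationcount={operation_count}",
        "-threads", str(threads),               # number of client threads
    ]


# Example: load phase of workload A against MongoDB with 4 client threads.
load_cmd = build_ycsb_command("load", "mongodb", "workloads/workloada",
                              12500, 12500, 4)
print(" ".join(load_cmd))
```

The resulting list can be passed to `subprocess.run` once YCSB is installed. For MySQL the binding name would be `jdbc`, and the connection properties file would be supplied with an additional `-P` option.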
This paper is divided into two main parts: the first part (Section 6) covers the setup
of the performance evaluation exercise for comparing MongoDB and MySQL databases using
the YCSB framework, and the second part (Section 7) analyses the results obtained from
the performance benchmarking. The analysis of the results provides some good insights
into the behaviour of the two databases when performing CRUD (Create, Read, Update,
Delete) operations. Other sections, such as related work and conclusions, are also
provided as part of this paper.
2.1 Relational-MySQL
MySQL¹ is one of the world's leading open-source relational database management systems.
It is suited for applications that demand high performance, scalability and reliability.
MySQL's design works best on data whose fields are structured and finite; MySQL is able
to search and organize such data in multiple dimensions. This strategy, however, cannot
be used on unstructured data. Below are some key features of the MySQL RDBMS².
multiple.
7. Platform Independent : MySQL is platform independent and can be used on
a variety of popular operating systems without compatibility issues.
8. Geographic Information System (GIS) : MySQL supports GIS functions so
that it can process GIS object data. These objects can have spatial attributes such as
geo co-ordinates.
2.2 NoSQL-MongoDB
MongoDB³ is a document-style NoSQL DB developed in C++. It was designed for
developing high-performing, scalable systems, including those that store binaries and
audio/video files. It has a flexible schema and allows objects without a fixed schema
or type. Queries can be performed over collections or using map-reduce.
1. High Performance: MongoDB uses Binary JSON (BSON) for storing data in
documents and improves overall performance with sharding, load balancing and replica
sets.
2. Load Balancing: MongoDB's scalable architecture uses load balancing to
dynamically balance query operations and keeps documents evenly spread over
multiple nodes for reads and writes.
3. Aggregation Framework (MapReduce): MongoDB provides map-reduce
operation support for aggregation and data summarization.
4. Horizontal Scalability (Sharding): MongoDB implements horizontal scaling
through sharding: it distributes data over multiple machines, which helps in sustaining
high throughput over high volumes of data. Using sharding we can add additional instances
to expand the capacity of the database.
5. Schema-less: MongoDB is a database without a fixed schema, in which we can store
any kind of data without a predefined structure. This makes migration of data simple and
efficient.
6. Multiple Storage Engines: MongoDB provides the option of using multiple
storage engines as per the application requirements. This helps in developing highly
robust databases.
7. Capped Collections: MongoDB supports capped collections, in which the size of
the collection can be limited by the user.
8. Document Indexing: MongoDB provides various types of indexes: it supports
single-field, compound, multikey, geospatial and hashed indexes on data. Documents can be
found using indexes without scanning the entire collection.
9. Master-Slave Replication: MongoDB keeps copies of data in multiple
locations, which protects the data from loss. The replica set concept in MongoDB gives
consistency for read operations from primary nodes, while at least one redundant copy
is available on secondary nodes.
³ MongoDB: http://www.mongodb.org
3 Database Architectures
3.1 Relational-MySQL
MySQL uses a networked client-server architecture and is a highly adaptable and flexible
RDBMS. A typical MySQL architecture has the components below.⁴
MySQL Server
• This is also called mysqld and manages access to the actual DB hosted on the disk.
• Multithreaded
Client programs
• Connection utility to the MySQL Server.
MySQL's architecture is a web of task-related functions that work together to fulfil
the responsibilities of a database server. Figure 3 below gives an overview of the
subsystems, which interact with each other through well-defined function interfaces.
Each subsystem has its own responsibilities and is independent of the others.
⁴ MySQL: http://www.mysql.com
The main components of the MySQL server are:
Query Cache: An in-memory store where the MySQL engine looks for query results
first; if found, the results are returned from here.
Storage Engine: This component manages the physical data files and the locations
where they are stored. The engine is responsible for fetching data from the data files.
The most widely used storage engine, and the default recommended by Oracle for most
MySQL requirements, is InnoDB.
The Base Function Library: A common set of functions shared across all MySQL
subsystems.
Process, Thread and Resource Management: Facilitates the thread-based client-server
architecture.
Cache and Buffer Management: Helps in caching and retrieving the different
forms of data used by threads executing the server process. Caching and buffering reduce
I/O time.
Memory Allocation:
• Per session: Dynamic allocation and de-allocation of memory for a specific session.
• Per instance: Allocated once per server instance and shared by all server processes
and all threads.
3.2 NoSQL MongoDB
MongoDB is a general-purpose, open-source document database. It is fully featured,
with capabilities such as map-reduce, geospatial queries, aggregation, text search and
rich queries. It is a horizontally scalable database in which capacity can be added as
needed, and it automatically balances the load. Figure 4 below depicts the MongoDB
physical architecture.
The main components of MongoDB physical architecture are described briefly below.
Storage Engines: MongoDB offers several storage engines, and the one best suited to
the workload can be used. The four predominant storage engines can work together in a
single MongoDB replica set. The default storage engine is WiredTiger, an all-round
performer with built-in compression support. Encryption protects sensitive data without
impacting performance, and the in-memory storage engine (ISE) provides wide support
for real-time analytics applications.
Sharding: MongoDB implements horizontal scaling to overcome the hardware limitations
of a single database server by distributing data across commodity hardware.
This technique is called sharding, and the hardware nodes are called shards. MongoDB
manages sharding automatically.
Replica Sets: MongoDB natively supports replicating data across multiple hardware
units. The group of copies is called a replica set. All read and write operations are
controlled via the primary member of the replica set, and the rest of the copies are
called secondaries. If the primary is down, one of the secondaries is elected as the new
primary. Replica sets are therefore self-healing, and DB downtime is reduced if
failures occur.
Query Model: MongoDB utilizes caching, indexing and query plans to optimize
query performance. The query model (MQM) has a router which routes the query to the
engine regardless of the number of shards. Query optimization is done at run time and
can utilize more than one index.
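To make the sharding component described above more concrete, the Python sketch below routes documents to shards by hashing their key. It is a simplified illustration only: MongoDB's hashed sharding assigns ranges of hashed shard-key values (chunks) to shards and rebalances them automatically, whereas this sketch simply takes the hash modulo the number of shards. The function name `shard_for_key`, the `user<N>` key format and the four-shard cluster are assumptions chosen for the example.

```python
import hashlib
from collections import Counter


def shard_for_key(key: str, num_shards: int) -> int:
    """Map a document key to a shard index using a stable hash.

    Simplified stand-in for hashed sharding: hash the key, then take the
    result modulo the number of shards.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Distribute 10,000 hypothetical document keys over 4 shards.
keys = [f"user{i}" for i in range(10_000)]
distribution = Counter(shard_for_key(k, 4) for k in keys)
print(sorted(distribution.items()))
```

Because the hash spreads keys roughly evenly, each shard receives a similar share of the documents, which is what allows reads and writes to be served by several machines in parallel.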
• In MongoDB a transaction commits either all of its changes or none, which tightens
data integrity; the data changes are not visible outside the transaction until the
commit is completed.
• When a transaction writes to multiple shards, not all outside read operations
need to wait for the committed transaction to become visible.
• In case of an abort, all changes made in the transaction are discarded without ever
becoming visible.
MySQL is an open-source relational database management system which strictly follows
the ACID (Atomic, Consistent, Isolated and Durable) transaction model. This makes it
highly suitable for applications that rely heavily on transaction completeness.⁶
The MySQL transaction management system allows users to execute data manipulation (DM)
operations and ensures that the database does not contain the results of a partial
operation. Thus, if one operation in a set fails, the rest of the operations are rolled
back to restore the earlier state of the database. This is how MySQL supports ACID
transactions.
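The all-or-nothing behaviour described above can be sketched in a few lines of Python. To keep the example self-contained it uses Python's built-in `sqlite3` module as a stand-in for MySQL, so it illustrates the rollback principle rather than MySQL itself; the `accounts` table, the account names and the `transfer` helper are hypothetical.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move `amount` from `src` to `dst` atomically; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)).fetchone()
            if balance < 0:
                # Failing here undoes BOTH updates above.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn: sqlite3.Connection) -> dict:
    """Return the current balance of every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

A transfer that would overdraw the source account leaves both rows untouched, while a valid transfer updates both rows together; no intermediate state with only one of the two updates is ever committed.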
After reviewing the scalability features of MongoDB and MySQL, it is found that MongoDB
sharding works better; hence MongoDB is more scalable than MySQL, and scaling is also
easier to implement because most of the work is taken care of by MongoDB itself.
MySQL, on the other hand, maintains the integrity of the data even when scaled and
sharded, which makes it one of the popular choices for a high-performing RDBMS.
5 Related Work
In this section we discuss related work and research presented in various other papers
on NoSQL and relational databases.
Figure 5: The graphic display of the architecture for integration and uniform use of
components of hybrid SQL/NoSQL database [4]
While defining the architecture of this hybrid model, the authors try to answer the
following questions:
• How to organize and manage data on object types, on the database types they belong
to and on the specific languages they use?
• How to translate an SQL statement into a hybrid language that the target DB can
understand?
• How to join data from databases of different types within the hybrid SQL/NoSQL
database?
The authors propose to develop the components below to answer the above questions:
• Key Words Search Component (KWS) – for metadata extraction from the user
statement.
• Statement Mapper (SM) – to map the user statement to the specific database.
After executing the use cases with an example business application, the authors
concluded that the hybrid database model shows potential and improved performance.
There is definitely scope to explore this model and pursue it in future research and
development, although support for executing DDL statements in both database
environments, regardless of the language in which they are written, should be the first
priority of future work.
MongoDB and MySQL comparison for eCommerce data
In this paper Aboutorabi et al. [8] describe how MySQL is the most widely used
RDBMS while MongoDB is one of the most popular choices among NoSQL solutions. The paper
evaluates the performance of MongoDB and MySQL for a large e-commerce application
with a huge amount of data. The results show that MongoDB performs better than MySQL
in all aspects. However, a study by Panda et al. [9] opens a discussion that the
performance of databases depends on their design, and hence different types of custom
benchmarks should be executed before arriving at any conclusion.
Reference               | Description
Győrödi et al. [3]      | Comparing MongoDB and MySQL using manual DML operations
Bjeladinovic et al. [4] | Proposed a new hybrid database model using RDBMS and NoSQL
Yusuf et al. [5]        | Evaluation of multiple NoSQL databases using YCSB
Khazaei et al. [6]      | Comparison of various benchmarking tools like YCSB, PigMix, GRIDMix, CALDA
Matallah et al. [7]     | Comparison of MongoDB and HBase using YCSB custom workloads
Aboutorabi et al. [8]   | Evaluate the performance of MongoDB and MySQL for a large eCommerce application
• RAM:8 GB
Cloud Machine
• RAM:8 GB
• Disk: 80 GB
MongoDB configuration
MySQL configuration
• MySQL Version: mysql Ver 14.14 Distrib 5.7.31, for Linux (x86-64)
Benchmarking
• Version:0.17.0
• Test Setup: Unix shell based bash Scripts provided by NCI for automating the
workload execution
Visualization Tools
6.0.2 Workloads
YCSB provides the option to run various workloads against a database to evaluate its
performance. The different types of standard workloads available are described below,
and the workloads used for this experiment are marked.
Workload         | Operations             | Record selection | Application example                                                          | Used | Threads
A (update heavy) | Read: 50%, Update: 50% | Zipfian          | Session store recording recent actions in a user session                     | YES  | 4, 8, 16, 32
B (read heavy)   | Read: 95%, Update: 5%  | Zipfian          | Photo tagging; add a tag is an update, but most operations are to read tags | NO   |
6.0.4 DB Setup
After installation, the database BenchTest was created for MySQL, and a table called
usertable with 10 fields was created in the database. The DB.Properties file was updated
with the correct DB name and with the user id and password used to log in as the root
user in MySQL. Other configuration steps, such as setting the Java path and copying the
JAR file, were also carried out.
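The table and properties file described above can be sketched as follows. The column layout (a `YCSB_KEY` primary key plus `FIELD0` to `FIELD9` text columns) and the `db.driver`, `db.url`, `db.user` and `db.passwd` property names follow the schema and settings documented for the YCSB JDBC binding; the host, port, user name and password placeholder are assumptions, and the exact statements used in this experiment may have differed.

```python
def usertable_ddl(field_count: int = 10) -> str:
    """Build a CREATE TABLE statement for the YCSB usertable."""
    columns = ["YCSB_KEY VARCHAR(255) PRIMARY KEY"]
    columns += [f"FIELD{i} TEXT" for i in range(field_count)]
    return "CREATE TABLE usertable (\n  " + ",\n  ".join(columns) + "\n);"


def db_properties(database: str, user: str, password: str,
                  host: str = "127.0.0.1", port: int = 3306) -> str:
    """Build the contents of a JDBC properties file for a MySQL target.

    Host, port and credentials are placeholders to be replaced locally.
    """
    lines = [
        "db.driver=com.mysql.jdbc.Driver",
        f"db.url=jdbc:mysql://{host}:{port}/{database}",
        f"db.user={user}",
        f"db.passwd={password}",
    ]
    return "\n".join(lines) + "\n"


print(usertable_ddl())
print(db_properties("BenchTest", "root", "<password>"))
```

Generating both pieces from one script keeps the table definition and the connection settings consistent when the benchmark is repeated on a fresh machine.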
⁷ Install MongoDB: https://docs.mongodb.com/manual/tutorial/install-mongodb-on-ubuntu/
⁸ GitHub YCSB: https://github.com/brianfrankcooper/YCSB
For MongoDB, a database was created at run time by YCSB, and hence there was no need
to create a database or tables. MongoDB was enabled to start automatically after a
system reboot using the systemctl enable command, so that there is no need to start the
database for each run.
For testing purposes, the automated testing scripts provided by NCI were used, as they
can execute multiple workloads at one time. Some minor tweaking was done in the script
file to run the workloads in multi-threaded mode, and the -s parameter was added
to display the detailed status of the run on the screen. The script was also modified to
generate the output files in CSV format instead of text format.
• Workload A executed for MySQL with the operation and record count of
12500,50000,100000,150000,200000
• Workload A executed for MongoDB with the operation and record count of
12500,50000,100000,150000,200000
• The above workloads were executed with the thread options of 4,8,16,32.
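The combinations listed above form a small test matrix, which can be enumerated as in the following Python sketch. The counts and thread options are taken from the list above; the dictionary layout and the function name `workload_a_matrix` are illustrative assumptions and do not reproduce the NCI automation scripts.

```python
from itertools import product

DATABASES = ("mongodb", "mysql")
OPERATION_COUNTS = (12500, 50000, 100000, 150000, 200000)
THREAD_COUNTS = (4, 8, 16, 32)


def workload_a_matrix() -> list:
    """Enumerate every (database, count, threads) combination for Workload A."""
    return [
        {"database": db, "count": count, "threads": threads}
        for db, count, threads in product(DATABASES, OPERATION_COUNTS, THREAD_COUNTS)
    ]


runs = workload_a_matrix()
print(len(runs))  # 2 databases x 5 counts x 4 thread options = 40
```

Each entry corresponds to one load phase followed by one run phase, so the matrix also gives a quick estimate of how many result files to expect.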
Overall Run Time Load/Insert
The graph below shows the overall run time of Workload A during the insert (load)
operation for both MongoDB and MySQL for the mentioned operation counts and threads.
Workload A is a 50/50 mix of reads and updates. During the load phase the YCSB tool
inserts the number of records given in the record count parameter; the overall run time
is recorded and plotted below.
• Overall Run time increases for both MongoDB and MySQL as the load increases.
• For varying workloads the run time of MongoDB is consistently less than that of
MySQL.
• In multi thread mode the runtime decreases when the thread count is increased.
Figure 8: Overall Run time comparison during Read/Update
• Overall Run time increases for both MongoDB and MySQL as the load increases.
• For varying workloads the run time of MongoDB is consistently less than that of
MySQL.
• In multi thread mode the run time for MySQL decreases when the thread count is
increased.
• Multi threading is not very helpful in a single node clustered environment especially
for MongoDB.
Figure 9: Overall Throughput during Load/Insert
• As the operations count increases the throughput for both MongoDB and MySQL
increases.
• As the thread count increases the throughput for both MongoDB and MySQL
increases.
Figure 10: Overall Throughput during Read/Update
• As the operations count increases the throughput for both MongoDB and MySQL
increases.
• As the thread count increases the throughput for both MongoDB and MySQL
increases.
Latency Analysis
In this section we will analyze the Average Latency parameter during Insert, Read and
Update operations for various loads/runs (ops count 12500,50000,100000,150000,200000)
and compare them. We will also look at the Latency vs Throughput performance which
is a key performance indicator for databases.
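The latency and throughput figures analysed in this section come from the summary lines that YCSB prints at the end of each run, which have the form `[SECTION], Metric, value`. A minimal Python sketch for extracting them is shown below. The sample text only mimics that layout, its numbers are made-up placeholders rather than results from this study, and the helper name `parse_ycsb_summary` is an assumption.

```python
def parse_ycsb_summary(text: str) -> dict:
    """Collect '[SECTION], Metric, value' lines into {(section, metric): value}."""
    metrics = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3 or not parts[0].startswith("["):
            continue  # skip status lines and anything not in summary format
        section, name, value = parts
        try:
            metrics[(section.strip("[]"), name)] = float(value)
        except ValueError:
            continue  # non-numeric value: ignore the line
    return metrics


# Placeholder output in YCSB's summary layout (values are NOT measured results).
SAMPLE_OUTPUT = """\
[OVERALL], RunTime(ms), 2000
[OVERALL], Throughput(ops/sec), 6250.0
[READ], Operations, 6250
[READ], AverageLatency(us), 500.0
[UPDATE], Operations, 6250
[UPDATE], AverageLatency(us), 700.0
"""

summary = parse_ycsb_summary(SAMPLE_OUTPUT)
print(summary[("OVERALL", "Throughput(ops/sec)")])
```

Keying the result by section and metric makes it straightforward to tabulate average latency per operation type against operation count and thread count.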
Average Latency during Insert for MongoDB and MySQL
Chart 11 above shows the average insert latency for various operation counts for
both databases. As we can see from the graph, the average latency for insert
operations increases as we increase the operation count. The average latency of MySQL
is always higher than that of MongoDB, which shows that for the same set of loads and
the same number of threads MySQL was slower at inserting records into the DB. In fact,
MySQL is almost 3 times slower than MongoDB for different operation counts with 4 and 8
threads, but for 16 and 32 threads the difference is reduced to about 30% or lower. This
shows that MySQL can perform better in multi-threaded mode with multiple shards, but
overall MongoDB is the better performer.
The two graphs below (12 and 13) show the average latency for read and update operations
for multiple values of operation count for the MongoDB and MySQL databases. As we can
see from graph 12, the read latency difference between MongoDB and MySQL is not very
significant, and both perform almost identically for multiple read requests with varying
numbers of threads. Multi-threaded operation does not significantly affect the read
latency, and the variation between MongoDB and MySQL is roughly constant. For 4
threads the difference between the read latency of MySQL and MongoDB is 20%, and
for 8, 16 and 32 threads it is approximately 12%. Here, too, MongoDB is the better
performer.
As we can see from graph 13, the update latency of MySQL is approximately 2.5 times
that of MongoDB for 4 threads, but for 8, 16 and 32 threads the difference is
around 70%, which again shows that for update operations, too, MongoDB has performed
better than MySQL.
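The comparisons above mix two ways of expressing a gap: a factor ("2.5 times") and a percentage difference ("70%"). The short Python sketch below shows how the two relate. The function names are assumptions and the latency values are hypothetical placeholders chosen only to make the arithmetic easy to follow; they are not measurements from this study.

```python
def slowdown_factor(slower_latency: float, faster_latency: float) -> float:
    """How many times larger the slower latency is than the faster one."""
    return slower_latency / faster_latency


def percent_higher(slower_latency: float, faster_latency: float) -> float:
    """By what percentage the slower latency exceeds the faster one."""
    return (slower_latency - faster_latency) * 100.0 / faster_latency


# Hypothetical average latencies in microseconds (placeholders only).
print(slowdown_factor(250.0, 100.0))  # a 2.5x gap ...
print(percent_higher(250.0, 100.0))   # ... is the same as being 150% higher
print(percent_higher(170.0, 100.0))   # a 70% gap corresponds to a 1.7x factor
```

Reading the two forms on a common scale shows that a gap of around 70% is considerably smaller than a gap of 2.5 times.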
Figure 12: Average Latency for Read operations
Average Latency v/s Throughput during Insert, Read, Update for MongoDB
and MySQL
The graph above shows the latency plot for the insert operation with respect to
throughput for both MongoDB and MySQL. From the graph we can see that the latency
decreases as the throughput increases; both MongoDB and MySQL behave in a similar way
for the insert operation, and their latency follows a similar pattern.
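One way to see why lower latency and higher throughput go together in these plots is Little's law applied to a closed-loop client: if a fixed number of threads each issue the next request as soon as the previous one returns, throughput is approximately the thread count divided by the average latency. The Python sketch below illustrates this relation under the assumption of no think time and no throttling; the function name and the latency values are hypothetical, and real runs deviate from this idealisation because of client-side overhead.

```python
def closed_loop_throughput(threads: int, avg_latency_us: float) -> float:
    """Approximate ops/sec for `threads` clients issuing back-to-back requests.

    Little's law for a closed loop without think time:
    throughput = concurrency / average latency.
    """
    return threads * 1_000_000.0 / avg_latency_us


# With 4 threads, halving the average latency doubles the throughput.
print(closed_loop_throughput(4, 2000.0))
print(closed_loop_throughput(4, 1000.0))
```

Under this model, points with higher throughput at the same thread count are necessarily points with lower average latency, which matches the downward trend seen in the latency versus throughput charts.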
The graphs below (15a and 15b) show the latency vs. throughput charts for read and
update operations. As Workload A is 50% read and 50% update, the charts
show how MongoDB latency falls sharply when the throughput increases, while MySQL
latency stays consistent with respect to throughput. This again shows that MongoDB
is the better-performing DB when loads are heavy and reads and updates have to be
performed on huge amounts of data.
(a) Latency Vs Throughput for Read (b) Latency Vs Throughput for Update
Throughput Vs Threads
The two graphs below show the throughput during the load and run phases with respect to
the thread count; we can see how the throughput increases when the thread
count is increased.
(a) Throughput vs Thread During load (b) Throughput vs Thread During Run
As we can see from the graphs above, when the thread count increases the throughput
increases. The throughput for MongoDB is higher in both insert and update operations
for multiple record counts.
• Workload F executed for MySQL with the operation and record count of
12500,50000,100000,150000,200000.
• Workload F executed for MongoDB with the operation and record count of
12500,50000,100000,150000,200000.
• The above workloads were executed with the thread options of 4 and 8 threads.
Overall Run Time Load/Insert
• Overall Run time increases for both MongoDB and MySQL as the load increases.
• For varying workloads the run time of MongoDB is consistently less than that of
MySQL.
• In multi-threaded mode the run time decreases when the thread count is increased
for MySQL but not for MongoDB.
• The run time for MongoDB is the same for 4 threads and 8 threads; hence
multi-threading does not improve its run time much.
• The run time for MySQL differs between 4 threads and 8 threads: when using
8 threads the run time comes down by 20% for lower operation counts and by almost half
for larger operation counts.
• The run time for MySQL is almost 4 times that of MongoDB for the same amount of
load, and it is consistently higher for all loads. Hence, for inserting
data, MongoDB performs much better.
• Overall Run time increases for both MongoDB and MySQL as the load increases.
Figure 18: Overall Run time comparison during Read/Modify
• For varying workloads the run time of MongoDB is consistently less than that of
MySQL.
• In multi-threaded mode the run time decreases when the thread count is increased
for MySQL but not for MongoDB.
• The run time for MongoDB with the same amount of load is almost 20% less
than that of MySQL. When the load is increased (100000 and above), the run time for
MySQL is almost double for 4 threads, whereas for 8 threads the difference in run time
is less than double.
Figure 19: Overall Throughput During Insert/Load
• As the operations count increases the throughput for both MongoDB and MySQL
increases.
Figure 20: Overall Throughput During Run phase
• As the operations count increases the throughput for both MongoDB and MySQL
increases.
Latency Analysis
In this section we will analyze the Average Latency parameter during Insert, Read and
Update operations for various loads/runs (ops count 12500,50000,100000,150000,200000)
for Workload F and compare them. We will also look at the Latency vs Throughput
performance which is a key performance indicator for databases. The loads have been
executed for 4 and 8 threaded option.
Average Latency during Insert for MongoDB and MySQL for Workload F
Chart 21 above shows the average read latency plot for MongoDB and MySQL for 4 and 8
threads. We can see that the latency for MySQL is almost 4 times higher than that of
MongoDB with 4 threads. When the thread count is changed to 8, the latency for MongoDB
increases almost 2 times for the same operation count. For MySQL the latency follows a
broadly similar trend with thread counts 4 and 8.
Figure 23: Average Latency for Update operations
The two graphs above (22 and 23) show the average latency for update and
read-write-modify operations for multiple values of operation count for the MongoDB and
MySQL databases in multi-threaded mode (4 and 8 threads). As we can see from graph 22,
the read-write-modify latency for MySQL is almost 2 times that of MongoDB for various
operation counts with 4 threads, while with 8 threads the difference is close to 1.5
times. As we can see from graph 23, the update latency of MySQL is approximately 3 times
that of MongoDB for both 4 and 8 threads, which again shows that for update
operations, too, MongoDB has performed better than MySQL.
The graph above shows the latency plot for the read operation with respect to throughput
for both MongoDB and MySQL. From the graph we can see that the latency decreases
as the throughput increases; both MongoDB and MySQL behave in a similar way for the read
operation, and their latency follows a similar pattern.
The graphs below (25a and 25b) show the latency vs. throughput charts for update and
read-write-modify operations. As Workload F is 50% read and 50% read-write-modify,
the charts show how MongoDB and MySQL latency falls when the throughput
increases. Comparing the latency parameters, we can conclude that MongoDB performs
better than MySQL.
(a) Latency Vs Throughput for Update (b) Latency Vs Throughput for Write-Modify
Throughput Vs Threads
The two graphs below show the throughput during the load and run phases with respect to
the thread count. We can see how the throughput for MySQL increases when
the thread count is increased, whereas for MongoDB the thread count has little impact
on throughput.
(a) Throughput vs Thread During load for (b) Throughput vs Thread During Run for
Workload F Workload F
7.0.3 Workload C Execution and Interpretation
Workload C is a 100% read workload and can provide a good idea of read latency
and read operations. In this section parameters such as run time, throughput and average
read latency are compared for MongoDB and MySQL. Workload C is run in single-threaded
mode because it is a 100% read workload, and multi-threading on a single node would not
make a substantial difference. Workload C is used to test scenarios where database
operations are read heavy.
• Workload C executed for MySQL with the operation and record count of
12500,50000,100000,150000,200000.
• Workload C executed for MongoDB with the operation and record count of
12500,50000,100000,150000,200000.
Figure 27: Run time comparison for Workload C for various ops count
From the above chart we can draw the conclusions below:
• As the workload increases, the run time for MySQL increases.
• For higher loads (100000 and above) the MySQL run time is around 25% more than
that of MongoDB.
Latency Analysis
In this section we will analyze the Average Latency parameter during Read phase opera-
tions for various loads/runs (ops count 12500,50000,100000,150000,200000) for Workload
C and compare them.
Average Latency during Read for MongoDB and MySQL for Workload C
Chart 29 above shows the average read latency plot for MongoDB and MySQL. We can see
that for small workloads the latency of MongoDB is higher than that of MySQL, but for
loads of more than 100000 operations the average latency of MongoDB is lower than
that of MySQL. This shows that MongoDB is more scalable for large loads.
8 Conclusions
An organization can face a lot of challenges if it wants to switch from a traditional
RDBMS to a NoSQL database. It is not easy to let go of the strict ACID properties of an
RDBMS. As discussed in the report, MongoDB does not offer JOIN operations, but there are
workarounds for this issue. The advantages of MongoDB can be seen from the tests
conducted using the YCSB framework, where for each type of workload (A, C, F) MongoDB
performed better than MySQL on all the measured parameters. In terms of latency and
throughput MongoDB stands out, especially for higher numbers of operations.
We have also compared the properties of the MongoDB and MySQL databases and
found that MongoDB outperforms MySQL in terms of features such as sharding, security,
performance and availability. Several related works were reviewed, and almost all of
them have shown that NoSQL databases are a better choice for large applications that run
on the cloud. YCSB also proved to be a convenient tool for running custom and standard
workloads on databases and benchmarking their performance, which comes in handy when
making decisions about the choice of a database for business applications.
References
[1] Carlo Strozzi. NoSQL - a relational database management system, 2010.
[2] Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell
Sears. Benchmarking cloud serving systems with YCSB. In Proceedings of the 1st ACM
Symposium on Cloud Computing, pages 143–154, 2010.
[3] Cornelia Győrödi, Robert Győrödi, George Pecherle, and Andrada Olah. A comparative
study: MongoDB vs. MySQL. 06 2015.
[4] Srdja Bjeladinovic, Zoran Marjanovic, and Sladjan Babarogic. A proposal of
architecture for integration and uniform use of hybrid SQL/NoSQL database components.
Journal of Systems and Software, page 110633, 2020.
[5] Yusuf Abubakar, Thankgod Sani Adeyi, and Ibrahim Gambo Auta. Performance
evaluation of NoSQL systems using YCSB in a resource austere environment. Performance
Evaluation, 7(8):23–27, 2014.
[6] Hamzeh Khazaei, Marios Fokaefs, Saeed Zareian, Nasim Beigi-Mohammadi, Brian
Ramprasad, Mark Shtern, Purwa Gaikwad, and Marin Litoiu. How do I choose the
right NoSQL solution? A comprehensive theoretical and experimental survey. Big Data
& Information Analytics, 1(2&3):185, 2016.
[7] Houcine Matallah, Ghalem Belalem, and Karim Bouamrane. Experimental comparative
study of NoSQL databases: HBase versus MongoDB by YCSB. Comput. Syst. Sci.
Eng., 32(4):307–317, 2017.
[9] Reena Panda, Christopher Erb, Michael Lebeane, Jee Ho Ryoo, and Lizy Kurian
John. Performance characterization of modern databases on out-of-order CPUs. In
2015 27th International Symposium on Computer Architecture and High Performance
Computing (SBAC-PAD), pages 114–121. IEEE, 2015.