Analysis of Couchbase and MySQL Performance With Four-Way Sharding
Abstract
This project analyzed the performance of a MySQL cluster and a
Couchbase NoSQL cluster with respect to the amount of time taken
to complete a finite set of queries. Five separate loads were run,
consisting of two read-heavy loads, one write-heavy load, and two
loads of mixed queries. After running the loads, we observed
several differences between the two configurations. The MySQL
cluster performed better on read-heavy loads, but Couchbase excelled
at completing the write-heavy loads quickly. Compared to MySQL,
Couchbase performed quite poorly when presented with a mixed load.
This performance difference has several likely causes,
including the difference between synchronous and asynchronous
queries.
1 Introduction
This project was centered on the analysis of the performance of a NoSQL Couchbase
database cluster and a traditional SQL database (implemented in MySQL). The goal was to
determine which database configuration has better performance under which circumstances. In
order to determine this, we subjected each database to a number of queries, with a variety of
loads.
We stress tested each system with a total of five different query processes; two of these
processes were read-heavy queries (consisting of greater than 80% read operations), one was
a write-heavy query (consisting of greater than 80% write operations), and two were mixed
loads of read-write queries (where the load-split between reading and writing is more equal than
an 80-20 split; for these, we tried to achieve as close to a 50-50 split as possible).
2 Data
This section describes the data used throughout the project and the reasoning behind relevant
decisions.
We had a number of reasons for selecting this data for the project. Most importantly, we
wanted to use a valid, pre-existing dataset rather than an arbitrarily manufactured dataset.
We had this as a criterion because we wanted to ensure that the analysis we performed was
conducted in a manner representative of real use; while it is possible to estimate usage patterns
and conduct experiments with manufactured data, it is not as effective or realistic as using
actual data. Additionally, the production of semi-realistic data (i.e., data with a variety
of different data types, relations and references, different data section sizes, etc.) is very
time-consuming. We chose to concentrate on the analysis itself rather than spend time creating a
dataset.
Another reason we chose this particular dataset is that it is different from the typical data that
would likely be tested. This is because the dataset is relatively small, by the standard of many
modern databases. Additionally, the dataset is relational (as wine information does connect
to both appellation information and grape information), but it is not heavily relational; this is
important because we wanted to give neither system an unreasonable advantage by either
favoring heavily-relational data or completely non-relational data. Due to these differences
from the more common testing databases, which are frequently on the scale of giga-, tera-, or
petabytes, we may discover new data management issues.
3 Experiment Description
This section describes the operation of each experimental process we applied to the two
different systems, as well as the components of the system tested by each.
Tested Components: This was the best choice for a write-heavy operation for a number
of reasons. Primarily, it concentrates as much work as possible on the actual writes the
system performs; other write operations (such as updating) require reading the record
first, while insertion and deletion are as close to pure writes as possible. Thus, we test
the system's writing performance. Secondarily, this query spans the entire database, so
we can also test the efficiency of the system under these conditions.
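The insert-and-delete pattern described above can be sketched in SQL as follows; the table and column names here are hypothetical stand-ins, since the dataset's actual schema is not reproduced in this report:

```sql
-- One insertion and one deletion per bottle record: neither statement
-- requires reading an existing row first, so the load stays write-pure.
INSERT INTO wine (id, grape, appellation, score)
VALUES (1001, 'Zinfandel', 'Napa Valley', 92);

DELETE FROM wine WHERE id = 1001;
```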
3.4 Balanced-1
Process: This experiment outputs the results from the Read Heavy-1 experiment into a
new table. We insert records into this table directly following the read, which results in
intermingled read-writes, since each grape type-appellation combination has an
individual query.
Tested Components: Here, we test the ability of the system to handle intermingled
read-writes efficiently. While we could instead have made these processes sequential,
with all the reads followed by all the writes, we wanted to test how quickly the system
can shift gears, changing from a read to a write. Additionally, the two systems have very
different advantages here. SQL has an atomic query for this operation, the SELECT
INTO (or, in MySQL, INSERT INTO SELECT) command. While this allows for atomicity,
it also restricts sharding to only those servers which have write privileges. While
Couchbase lacks atomicity, all of its servers can be utilized.
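For illustration, the atomic MySQL form named above looks like the following; the table and column names are hypothetical, as the report does not reproduce the actual schema:

```sql
-- INSERT INTO ... SELECT performs the read and the write in one atomic
-- statement, but it must run on a server with write privileges.
INSERT INTO grape_appellation_results (grape, appellation, bottle_count)
SELECT grape, appellation, COUNT(*)
FROM wine
GROUP BY grape, appellation;
```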
3.5 Balanced-2
Process: This experiment is another case where we wanted to test one of the most
common database operations; as such, we decided to perform an update operation on
each bottle of wine, incrementing its score by five points.
Tested Components: Similar to the write-heavy experiment, this process involves very
little sophisticated computation and instead focuses on what will be a very common
operation. While an update is nominally a write operation, we also require the previous
score, as this value is required to compute the new score. Here, we test how well the
system can handle this type of operation, considering that it spans the entire dataset.
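In SQL, the whole-dataset update described above reduces to a single statement (again with hypothetical table and column names):

```sql
-- The read-before-write is implicit: each row's current score is read,
-- incremented by five, and written back, across the entire table.
UPDATE wine SET score = score + 5;
```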
3.7 Metrics
Ultimately, we decided that the metric which would be most valuable to us is that of time. A
number of other metrics were considered, but were ultimately discarded.
3.7.1 Time
Time is the metric most frequently and ubiquitously targeted, and the reason is simple: people
are impatient. Many of the processes performed on these systems are time-sensitive; i.e., in
many cases, sooner is better, and now is best. If a system is incredibly efficient on system
resources, but takes a long time to finish, that system will not be used as frequently.
Time is gathered by getting the system time both before and after the process run, and
computing the difference. The measurement is precise, measured in nanoseconds (though
our final results are measured in larger units, with three-decimal precision, for readability). This
measure of time is wall-clock time, not CPU time, because wall-clock time is what users
perceive; optimizing it is therefore the most important goal.
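The timing procedure above is straightforward; the sketch below shows one way it can be implemented in Java (the class and method names are our own, not part of either test harness):

```java
import java.util.Locale;

// Minimal sketch of the wall-clock timing wrapper: capture the system time
// before and after the process run and report the difference.
class QueryTimer {
    /** Formats a nanosecond duration as milliseconds with three-decimal precision. */
    static String toMillis(long nanos) {
        return String.format(Locale.ROOT, "%.3f", nanos / 1_000_000.0);
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // ... the query batch under test would run here ...
        long elapsed = System.nanoTime() - start;
        System.out.println("Total run time: " + toMillis(elapsed) + " ms");
    }
}
```

Because `System.nanoTime()` measures elapsed wall-clock time rather than CPU time, this captures exactly the delay a user would perceive.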
3.7.3 Throughput
Throughput was similarly disregarded. While throughput may differ from time, the two are
directly correlated, and the user is ultimately interested in the time taken. When a user desires
high throughput, it is because the user wishes something to run quickly; the time measure
thus serves the same purpose.
4 Implementation
The experiment was implemented on six Amazon EC2 instances. Four of these form the
MySQL and NoSQL clusters and the final two run client processes. The MySQL cluster is
configured with one master node and three slave nodes. The NoSQL cluster is configured with
four nodes running Couchbase Server 2.0, which automatically shards the data across multiple
servers. MySQL is limited by the fact that all writes must be sent to the master node, while
reads are distributed amongst the three slave nodes.
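The master/slave routing constraint can be sketched as follows; this is an illustrative helper of our own, not code from either harness, with round-robin chosen as one plausible way to distribute reads:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: in a MySQL replication cluster, every write must target the master,
// while reads can be spread (here, round-robin) across the slave nodes.
class ReplicaRouter {
    private final String master;
    private final List<String> slaves;
    private final AtomicInteger next = new AtomicInteger();

    ReplicaRouter(String master, List<String> slaves) {
        this.master = master;
        this.slaves = slaves;
    }

    /** Returns the host a query should be sent to. Writes always hit the master. */
    String hostFor(boolean isWrite) {
        if (isWrite) return master;
        // floorMod keeps the index valid even after counter overflow
        return slaves.get(Math.floorMod(next.getAndIncrement(), slaves.size()));
    }
}
```

Couchbase has no such asymmetry: its automatic sharding lets any node accept both reads and writes.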
We developed separate test harnesses for the MySQL and NoSQL configurations. The MySQL
harness was developed in Java and uses JDBC to connect to the cluster. The NoSQL harness
was developed in Javascript and uses a Node.js-based client to connect to the cluster. We
originally attempted to write both harnesses in Java, but could not get the Couchbase Java
Client to successfully return the views necessary for the more complex queries. The Java client
was still in Developer Preview at the time of writing, so we attributed our issues to bugs in
the still-maturing code base.
In addition to issues with the Couchbase Java Client, we experienced some setup hiccups when
the EC2 account owner shut down most of our nodes due to inactivity. This caused much of our
configuration to reset, which cost us several additional hours of setup and fixes. One large
issue we ran into here is that MySQL's replication implementation is rather unintelligent:
adding a new slave to the pool will not automatically propagate existing data to the new
slave.
The MySQL replication issue presented itself when we discovered that Server 4 had been
misconfigured for a short while during our testing phase. Server 4 did not have a record for any
of the new users that we added, nor did it have the test database that we had set up on the
master. Unfortunately, we discovered that adding a new node to the MySQL cluster does not
actually propagate any old data over; it simply makes the node available for new transactions
to be processed. To work around this issue, we were forced to recreate, by hand, the same
users and databases that were available on the master node.
5 Experiment Results
Read Heavy-1
Metric             Couchbase Time (ms)   MySQL Time (ms)
Total Run Time     1404                  1113.000
Per-Query Mean     n/a                   0.978
Per-Query Median   n/a                   0.898

Read Heavy-2 Totals
Metric                          Couchbase Time (s)   MySQL Time (s)
Total Time (100 iterations)     2.032                0.636
Total Time (1000 iterations)    20.580               5.357
Total Time (10000 iterations)   196.093              51.871

Read Heavy-2 Per-Query*
Metric                                Couchbase Time (ms)   MySQL Time (ms)
Per-Query Mean (100 iterations)       1042.47               6.364
Per-Query Mean (1000 iterations)      10479.79              5.357
Per-Query Mean (10000 iterations)     106556                5.187
Per-Query Median (100 iterations)     1257                  5.112
Per-Query Median (1000 iterations)    10368.5               4.854
Per-Query Median (10000 iterations)   107441                4.842

Balanced-1 Totals
Metric                         Couchbase Time (s)   MySQL Time (s)
Total Time (10 iterations)     35.778               13.458
Total Time (100 iterations)    352.823              129.418
Total Time (1000 iterations)   3635.072             1344.129

Balanced-1 Per-Query*
Metric                                      Couchbase Time (ms)   MySQL Time (ms)
Per-Read/Insert Mean (10 iterations)        3577.8                1345.763
Per-Read/Insert Mean (100 iterations)       3528.23               1294.178
Per-Read/Insert Mean (1000 iterations)      3635.072              1344.129
Per-Read/Insert Median (10 iterations)      3575.5                1323.808
Per-Read/Insert Median (100 iterations)     3514                  1246.415
Per-Read/Insert Median (1000 iterations)    3530                  1270.117
* Due to the Couchbase system's asynchronous method-calling, the means and medians
reported for several of the Couchbase tasks are not good measures of performance, since
all calls returned at approximately the same time (i.e., the total time may be approximately equal
to the median run-time of each query).
6 Analysis
The NoSQL server metrics were heavily affected by whether queries were issued synchronously
or asynchronously. Additionally, in contrast to the MySQL calls, each Couchbase call
involved an HTTP request/response, which significantly affected the latency of each read or
write. Unlike MySQL, where a single read or write was relatively fast, Couchbase's
strength lies in the number of concurrent read/write requests it can handle.
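The effect of overlapping asynchronous calls can be sketched as follows; this is an illustrative Java analogue of the Node.js client's behavior, not code from our harness, with `supplyAsync` standing in for an HTTP round trip:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Sketch: when N requests are issued concurrently, their round trips overlap,
// so the total elapsed time approaches the slowest single call rather than
// the sum of all calls. This is why per-query means are misleading for
// the asynchronous Couchbase runs.
class AsyncBatch {
    static List<Integer> runAll(int n) {
        // Launch all n "requests" before waiting on any of them.
        List<CompletableFuture<Integer>> calls = IntStream.range(0, n)
            .mapToObj(i -> CompletableFuture.supplyAsync(() -> i)) // stand-in for one HTTP round trip
            .collect(Collectors.toList());
        // join() gathers results; with real I/O the waits overlap, not serialize.
        return calls.stream().map(CompletableFuture::join).collect(Collectors.toList());
    }
}
```

A synchronous client, by contrast, would pay the full HTTP latency once per query, which matches the large Couchbase per-query figures in the tables above.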
7 Conclusions
Throughout our testing, we noticed a significant performance difference with regard to data
writes between the NoSQL database and the MySQL database. Due to the NoSQL database's
superior sharding architecture, it is able to handle a much higher write throughput than the
MySQL database. This is likely because the MySQL cluster forces all writes to be sent to the
single master node, while the NoSQL cluster can write to all four nodes simultaneously.
Bibliography
[1] http://users.csc.calpoly.edu/~dekhtyar/365-Fall2012/data/WINE/README.WINE.txt