MySQL Cluster Deployment Best Practices

This document provides best practices for deploying MySQL Cluster. It discusses suitable applications, hardware and network requirements, configuration options, and administration practices. Key points include selecting dedicated hardware, using multiple network interfaces and switches for redundancy, separating real-time and reporting workloads, and choosing disk subsystems and storage layouts suited to the workload's read/write patterns and performance needs.

Uploaded by sushil pun


MySQL Cluster Deployment Best Practices


Agenda
•  Suitable Applications
•  MySQL Cluster compared to InnoDB – main differences
•  Network & Hardware Selection
•  Disk Data Deployment
•  Configuration
•  Administration & Implementation Best Practices
•  Online/Offline Operations
•  Backup and restore
•  Monitoring
•  Services available to get started
MySQL Cluster – Users & Applications
HA, Transactional Services: Web & Telecoms

•  User & Subscriber Databases


•  Service Delivery Platforms
•  Application Servers
•  Web Session Stores
•  eCommerce
•  VoIP, IPTV & VoD
•  Mobile Content Delivery
•  On-Line app stores and portals
•  On-Line Gaming
•  DNS/DHCP for Broadband
•  Payment Gateways
•  Data Store for LDAP Directories

http://www.mysql.com/customers/cluster/
Suitable Applications
•  Good fit
•  OLTP apps with short-running queries
•  Applications with realtime characteristics and requirements
•  Many concurrent requests
•  Write-intensive applications
•  Typically the following are a poor fit:
•  Heavy reporting (OLAP)
•  Data Warehouse

•  However, you can replicate from MySQL Cluster to a
regular MySQL server (InnoDB) which runs the
reporting.
Realtime and Reporting Architecture
Don't mix real-time operations with Reporting - separate!

•  Realtime Apps talk to the App Servers, which use the SQL Layer and Storage Layer
•  The Reporting System runs the complex reporting queries on a separate server
•  Feed the Reporting System via:
•  Replication
•  mysqldump
•  ndb_restore → csv → LOAD DATA INFILE
Data Collection/Aggregation Architecture
Aggregate data from peripheral systems (sources)
HA Shard Catalog
•  Shard Catalog stores user_id → shard_id and other indexes/mappings (user_id → friend_id:shard_id).
•  Shard Catalog can grow online

•  The Shard Catalog and each shard (Shard_0 … Shard_n) run on MySQL Cluster
•  App Servers go through a Memcached / caching layer in front of the
SQL Layer and Storage Layer of each shard
MySQL Cluster compared to InnoDB /
Other databases
•  Every database has its characteristics
•  MySQL Cluster is designed for
•  Short, but many, parallel transactions
•  High volume
•  High degree of concurrency
•  High availability (99.999%)
•  Let's look at how MySQL Cluster compares to InnoDB (and most other
traditional databases)

Refer to Docs comparisons:


http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-compared.html
MySQL Cluster compared to InnoDB -
Table Locks
•  Table locks are usually taken before an offline operation (e.g.
ALTER to change a data type). During normal traffic a smaller
granularity, such as ROW-level locking, is preferred.
•  InnoDB
•  LOCK TABLES tablename READ will lock the table for
writes on the mysql server.
•  MySQL Cluster
•  LOCK TABLES tablename READ will lock the table for
writes only on the MySQL server where the command is
issued!
•  To lock 'tablename' on the entire cluster you must do LOCK
TABLES tablename READ on every mysql server.
•  Or if you use the Configurator scripts:
•  cd tools
•  ./execute-all-mysql.sh -e “LOCK TABLES
tablename READ”
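If you are not using the Configurator scripts, a plain shell loop over the SQL nodes can issue the same statement. A sketch (the host names are hypothetical, and this dry-run version only prints the commands); note that LOCK TABLES is session-scoped, so when you run this for real each session must stay open for the duration of the maintenance work:

```shell
#!/bin/sh
# Hypothetical list of SQL nodes in the cluster.
SQL_NODES="mysql-a mysql-b mysql-c"

for host in $SQL_NODES; do
    # Print the command that would take the read lock on each server.
    # Remove the dry-run printf to actually run it (and keep the
    # session open, e.g. via SELECT SLEEP(...), while you work).
    printf 'mysql -h %s -e "LOCK TABLES tablename READ"\n' "$host"
done
```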
Cluster compared to InnoDB - ALTER
•  InnoDB
•  Blocking alter tables. Altered table is locked.
•  Cluster
•  Online (non-blocking) – add column online (ALTER ONLINE
TABLE … ADD COLUMN x BIGINT), add index online, drop
index online.
•  Other ALTERs (changing column size, data type, column name etc)
are not online
•  Non-online ALTER TABLE is not blocking!
•  You can do the ALTER TABLE on one MySQL server and still write to
the table on another MySQL server → inconsistent data.
•  Non-blocking – there is no table lock distributed across all
MySQL servers.
•  Use LOCK TABLES manually on all MySQL servers first, then
ALTER, then UNLOCK TABLES on all MySQL servers
MySQL Cluster compared to InnoDB -
FOREIGN KEYS
•  Considerations for Foreign Keys
•  FKs simplify business logic, but FKs incur a performance overhead
•  What is the role of your data? What is the role of the application?
•  InnoDB
•  Is the only storage engine currently supporting Foreign Keys
•  MySQL Cluster
•  Workaround is to use TRIGGERs to emulate Foreign Keys

For more info


http://forge.mysql.com/wiki/
ForeignKeySupport#Appendix_A:_Triggers_implementing_foreign_key_constraints
MySQL Cluster compared to InnoDB -
Transactions
•  Failed transactions must be retried by the application
•  Also true for InnoDB (and most other databases on the market)
•  If the REDOLOG or REDOBUFFER become full, the transaction
will be aborted
•  This differs from InnoDB behaviour: InnoDB will run slower (and
potentially grind to a virtual halt)
•  There are also other resources / timeouts
•  "Lock wait timeout" – transaction will abort after
TransactionDeadlockDetectionTimeout
•  MaxNoOfConcurrent[Operations/Transactions]
•  Nodefail / noderestart will cause transaction to abort
Example Setup
•  Clients → Load Balancer(s) → Redundant switches
•  Two hosts each run SQL + Mgm + AppServer + WebServer...
•  NICs are bonded towards the data node network
•  Two hosts run the Data nodes

Recommendation
•  Start with four computers ..
•  2 x Data Nodes (NDBMTD)
•  2 x MySQL servers (MYSQLD)
•  2 x Management servers (NDB_MGMD)
•  … and scale it from there.
Hardware Selection : Network I
•  Dedicated >= 1Gb/s networking
•  On Oracle Sun CMT it may be necessary to bond 4 or more NICs
together because typically many data nodes are on the same
physical host.
•  Prevent network failures (NIC x 2, Bonding, dual switches)
•  Use a dedicated network for cluster communication
•  Put Data nodes and MySQL Servers on e.g. a 10.0.1.0 network and
let MySQL listen on a “public” interface.
•  There is no security layer to the management node
•  Enable port 1186 access only from cluster nodes and
administrators
Hardware Selection : Network II
•  The speed of the network greatly affects the performance
•  ping <hostname>
•  If the ping time is > 0.200ms (on 1Gig-E), check:
•  routes – do you have >1 switch hop from one data
node to another?
•  Do you have full duplex?
•  NAPI enabled (should be)?
•  On my machines I have 0.150ms (on 1Gig-E), but with good
switches 0.080-0.100ms is also possible
•  JUMBO frames – you can try enabling these, but I have not
seen any noticeable improvement from them.
Hardware Selection - RAM & CPU
•  Storage Layer (Data nodes)
•  One data node can (7.0+) use 8 cores
•  CPU: 2 x 4 core (Nehalem works really well). Faster CPU → faster
processing of messages.
•  RAM: As much as you need
•  a 10GB data set will require 20GB of RAM (because of
redundancy)
•  Each node will then need 2 x 10GB / # of data nodes (2 data nodes
→ 10GB of RAM each → a 16GB RAM machine is good)
•  SQL Layer (MySQL Servers)
•  CPU: 2 – 16 cores
•  RAM: Not as important – 4GB enough (depends on connections and
buffers)
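The slide's rule of thumb can be written down as a tiny helper. This is only a sketch of the arithmetic above (data is stored twice for redundancy, then spread across the data nodes), not an official sizing tool:

```python
def data_node_ram_gb(dataset_gb, num_data_nodes, replicas=2):
    """Rule-of-thumb DataMemory per data node: the data set is stored
    'replicas' times (redundancy) and split across the data nodes."""
    return dataset_gb * replicas / num_data_nodes

# A 10GB data set on 2 data nodes needs ~10GB of DataMemory per node,
# so a 16GB machine leaves headroom for buffers and the OS.
print(data_node_ram_gb(10, 2))  # → 10.0
```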
Hardware Selection - Disk Subsystem

•  low-end: 1 x SATA 7200RPM – LCP and REDOLOG share the disk
•  For a read-mostly workload that does not write so much
•  No redundancy (but the other data node is the mirror)
•  mid-end: 1 x SAS 10KRPM – LCP and REDOLOG share the disk
•  Heavy duty (many MB/s)
•  No redundancy (but the other data node is the mirror)
•  high-end: 4 x SAS 10KRPM – LCP and REDOLOG on separate spindles
•  Heavy duty (many MB/s)
•  Disk redundancy (RAID1+0), hot swap

•  REDO, LCP, BACKUP – written sequentially in small chunks (256KB)
•  If possible, use ODirect=1
Hardware Selection - Disk Data Storage

•  Minimal recommended: 2 x SAS 10KRPM (preferably)
•  Disk 1: LCP, REDOLOG, UNDOLOG
•  Disk 2: TABLESPACE
•  high-end: 4 x SAS 10-15KRPM (preferably)
•  Disk 1: REDO LOG / UNDO LOG
•  Disk 2: LCP
•  Disks 3 and 4: TABLESPACE 1 and TABLESPACE 2
•  Use high-end for heavy read / write workloads (1000's of 10KB records per sec)
(e.g. Content Delivery platforms)
•  SSD for TABLESPACE is also interesting – not much experience of this yet
•  Having TABLESPACE on a separate disk is good for read performance
•  Enable WRITE_CACHE on devices
Disk Space Usage

•  The data nodes use the disk for:


•  LCP: 3 x sizeof(used DataMemory)
•  REDO: [4-6] x DataMemory
•  More (6x) REDO log for write-intensive workloads
•  Don't make the REDO log too small (e.g. 2x or 3x)
•  Backups: sizeof(used DataMemory)
•  TableSpace (if disk data tables): Must fit the dataset.
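The multipliers above combine into a quick per-node disk estimate. A sketch restating those rules of thumb (treat the result as a floor, and add TableSpace separately for disk data tables):

```python
def data_node_disk_gb(used_datamemory_gb, write_intensive=False,
                      with_backup=True):
    """Disk space per data node from used DataMemory:
    LCP 3x, REDO 4x (6x if write-intensive), backup 1x."""
    lcp = 3 * used_datamemory_gb
    redo = (6 if write_intensive else 4) * used_datamemory_gb
    backup = used_datamemory_gb if with_backup else 0
    return lcp + redo + backup

print(data_node_disk_gb(10))                        # → 80
print(data_node_disk_gb(10, write_intensive=True))  # → 100
```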
Choosing the Filesystem

•  Most customers use EXT3 (Linux) and UFS (Solaris)
•  EXT2 is an option (but recovery takes longer)
•  Mount with noatime
•  ZFS
•  You must separate the journal (ZIL) and the filesystem
•  Raw devices are not supported
•  EXT4, XFS – we don't have much experience with these yet…
Configuration : Disk Data Storage

•  Use Disk Data tables for


•  Simple accesses (read/write on PK)
•  As with InnoDB – you can easily get IO-bound (check iostat)
•  Set
•  DiskPageBufferMemory=3072M
•  A good start if you rely a lot on disk data – it works like the
InnoDB buffer pool, so set it as high as you can!
•  Increases the chance that a page will be cached
•  SharedGlobalMemory=384M-1024M
•  UNDO_BUFFER=64M to 128M (if you write a lot)
•  You cannot change this buffer later!
•  It is specified at LOGFILE GROUP creation time
•  DiskIOThreadPool=[8 .. 16] (introduced in 7.0)
Configuration : General
•  Set
•  MaxNoOfExecutionThreads<=#cores
•  Otherwise contention will occur → unexpected behaviour.
•  RedoBuffer=32-64M
•  If you need to set it higher → your disks are probably too
slow
•  FragmentLogFileSize=256M
•  NoOfFragmentLogFiles = 6 x DataMemory (in MB) /
(4 x 256MB)
•  Most common issue – customers never configure a large
enough redo log
•  The above parameters (and others, also for MySQL)
are set for production usage at:
•  www.severalnines.com/config
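The redo-log formula above is easy to get wrong by hand, so here it is as code. A sketch (the factor 4 is there because the redo log consists of 4 file sets of NoOfFragmentLogFiles files each):

```python
def no_of_fragment_log_files(datamemory_mb, fragment_log_file_size_mb=256,
                             redo_factor=6):
    """NoOfFragmentLogFiles = redo_factor x DataMemory /
    (4 x FragmentLogFileSize); the redo log is made of 4 file sets,
    hence the 4 in the denominator. redo_factor=6 suits write-heavy loads."""
    return redo_factor * datamemory_mb // (4 * fragment_log_file_size_mb)

# 10GB (10240MB) of DataMemory with the default 256MB files:
print(no_of_fragment_log_files(10240))  # → 60
```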
Administration
•  Data nodes – designed for zero maintenance.
•  Logs
•  Writes error logs and trace files in its data directory.
•  Configurable how many error messages/trace files that should be saved
•  Memory Fragmentation
•  Free pages are reclaimed and can be reused
•  If you do a lot of insert/delete on VAR* attributes (of different sizes) you
can get fragmentation
•  OPTIMIZE TABLE / Rolling restart of data nodes can help reduce
fragmentation
•  See http://johanandersson.blogspot.com/2009/03/memory-
deallocationdefragmentation-and.html
•  Management servers
•  Writes cluster log (rotating, size configurable) in its data directory
•  Cluster logs can be sent to Syslog if desired
Administration
•  MySQL Servers
•  Binary logs - (if enabled) must be removed manually (can be
done with --expire_logs_days, but are you sure all have
been applied on the slave?)
•  General log / error log / slow log - does not rotate
automatically. A script called mysql_log_rotate can help.
•  Or move/cp log manually (or scripted) and do FLUSH LOGS
•  For MySQL Cluster it is also good to have a dedicated
MySQL Server for administration purposes.
•  Perform offline ALTER TABLE (like change data type etc)
Administration Layer
•  Introduce a MySQL Server for administration purposes!
•  Should never get application requests
•  Simplifies heavy (non online) schema changes

Application layer

SQL layer

Storage layer
Admin layer – connected via synchronous replication (NDB) to the
Storage layer, like any other SQL node

# give the admin mysqld an explicit nodeid in config.ini:
[mysqld]
id=8
hostname=X

# in the admin mysqld's my.cnf:
ndb_connectstring="nodeid=8;x,y"
ndb_cluster_connection_pool=1
Administration Layer
•  Modifying Schema is NOT online when you perform
the following:
•  Rename a table
•  Change data type
•  Change storage size
•  Drop column
•  Rename column
•  Add/Drop a PRIMARY KEY
•  Altering a 1GB table requires 1GB of free
DataMemory (copying)
•  Online (and ok to do with transactions ongoing):
•  Add column (ALTER ONLINE …)
•  CREATE INDEX
•  Online add node
Administration Layer
•  ALTER TABLE etc (non-online DDL) is performed on the Admin
Layer!
•  1. Block traffic from the SQL layer to the data nodes
•  ndb_mgm> ENTER SINGLE USER MODE 8
•  Only the Admin mysqld (nodeid 8, as set in config.ini) is now
connected to the data nodes
•  Or do LOCK TABLES on the whole SQL Layer!
•  2. Perform the heavy ALTER on the admin layer
•  3. Allow traffic from the SQL layer to the data nodes again
•  ndb_mgm> EXIT SINGLE USER MODE
•  Or do UNLOCK TABLES on the whole SQL Layer!
Administration Layer
•  You can also set up MySQL Replication from the Admin layer to the
SQL layer
•  Replicate the mysql database
•  GRANTs, SPROCs etc will be replicated.
•  Keeps the SQL Layer aligned
•  On the Admin layer mysqld: binlog_do_db=mysql
Online Upgrades
•  Change Online
•  OS, SW version (7.0.x → 7.2.x)
•  Configuration (e.g. increase DM, IM, Buffers, redo log, [mysqld]
slots etc)
•  Hardware (e.g. add more RAM)
•  These procedures require a Rolling Restart
•  Change config.ini, copy it over to all ndb_mgmd
•  Stop ndb_mgmd, start ndb_mgmd with --reload
•  Restart one data node at a time
•  Restart one mysqld at a time
•  Adding data nodes (7.0 and above)
•  Adding MySQL Servers
•  Make sure you have free [mysqld] slots
•  Start the new mysqld
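The rolling-restart steps can be scripted. The node IDs and hostnames below are hypothetical, and this version only prints the commands so the ordering is easy to review before running anything for real:

```shell
#!/bin/sh
# Hypothetical IDs: data nodes 3 and 4, management server on host mgm-a.
DATA_NODES="3 4"

# 1. config.ini has already been copied to all ndb_mgmd hosts;
#    restart the management server so it picks up the new config.
echo "ndb_mgmd --reload -f /etc/mysql-cluster/config.ini"

# 2. Restart one data node at a time, waiting for 'Started' in between.
for id in $DATA_NODES; do
    echo "ndb_mgm -c mgm-a -e '$id RESTART'"
    echo "# wait until node $id reports Started before continuing"
done

# 3. Finally restart each mysqld, one at a time.
echo "# restart mysqld on each SQL node in turn"
```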
Scaling
•  One data node can (7.0+) use up to 8 cores
•  CPU: Reaches bottleneck at about 370% CPU
•  add another node group (to spread load)
•  DISK: iostat -kx 1 : check %util, await, svctm etc.
•  Add disks
•  NETWORK: iftop (linux)
•  add another node group (to spread load)
•  MySQL Server
•  CPU: About the same – 300-500%
•  Add another MySQL Server to offload query processing
•  DISK: Should not be a factor if you are using only NDB tables
•  NETWORK:
•  Add another MySQL Server to offload query processing
Monitoring
•  Mandatory to monitor
•  CPU/Network/Memory usage
•  Disk capacity (I/O) usage
•  Network latency between nodes
•  Node status ...
•  Used Index/Data Memory
•  www.severalnines.com/cmon - monitors data nodes and mysql
servers
•  New in MySQL Cluster 7.1:
•  The ndbinfo database (NDB$INFO)
•  Check node status
•  Check buffer status etc
•  Statistics
Best Practice : Primary Keys
•  To avoid problems with
•  Cluster-to-Cluster replication
•  Recovery
•  Application behavior (KEY NOT FOUND.. etc)
•  ALWAYS DEFINE A PRIMARY KEY ON THE TABLE!
•  A hidden PRIMARY KEY is added if no PK is specified. BUT..
•  .. this is NOT recommended
•  The hidden primary key is, for example, not replicated (between
Clusters)!!
•  There are problems in this area, so avoid them!
•  So always have, at least,
id BIGINT AUTO_INCREMENT PRIMARY KEY
•  Even if you don't “need” it for your application
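A minimal table definition following this practice might look like the sketch below; the table and column names are just illustrative:

```sql
-- Explicit PK even though the application looks rows up by session_key.
CREATE TABLE user_session (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    session_key VARCHAR(64) NOT NULL,
    payload VARBINARY(2048),
    UNIQUE KEY (session_key)
) ENGINE=ndbcluster;
```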
Best Practice : Query Cache
•  Don't enable the Query Cache!
•  It is very expensive to invalidate over X mysql servers
•  A write on one server will force the others to purge their
cache.
•  If you have tables that are read only (or change very
seldom):
•  my.cnf:
•  query_cache_type=2 (ON DEMAND)
•  SELECT SQL_CACHE <cols> .. FROM table;
•  Cache only queries with SQL_CACHE
•  This can be good for STATIC data
Best Practice : Large Transactions
•  Remember MySQL Cluster is designed for many and
short transactions
•  You are recommended to UPDATE / DELETE in small chunks
•  Use LIMIT 10000 until all records are UPDATED/DELETED
•  MaxNoOfConcurrentOperations sets the upper
limit on how many records can be modified
simultaneously on one data node.
•  MaxNoOfConcurrentOperations=1000000 will use 1GB
of RAM
•  Despite being possible, we recommend DELETE/UPDATE in
smaller chunks.
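The chunked-delete pattern can be sketched as a driver loop. This assumes a DB-API style cursor (the connection handling is hypothetical); each iteration is its own small transaction, so no single transaction approaches MaxNoOfConcurrentOperations:

```python
def delete_in_chunks(cursor, table, where_clause, chunk_size=10000):
    """Delete matching rows in LIMIT-sized transactions until none remain.
    Returns the total number of rows deleted. 'table' and 'where_clause'
    must come from trusted code, not user input."""
    total = 0
    while True:
        cursor.execute(
            f"DELETE FROM {table} WHERE {where_clause} LIMIT {chunk_size}"
        )
        total += cursor.rowcount
        cursor.connection.commit()          # commit each small chunk
        if cursor.rowcount < chunk_size:    # last (partial or empty) chunk
            return total
```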
Best Practice : Table logging

•  Some types of tables account for a lot of WRITEs, but do not
need to be recovered (e.g. session tables)
•  A session table is often unnecessary to REDO-log and to
checkpoint
•  Create these tables as 'NO LOGGING' tables:
mysql> set @ndb_curr_val=@@ndb_table_no_logging;

mysql> set ndb_table_no_logging=1;

mysql> create table session_table(..) engine=ndb;

mysql> set ndb_table_no_logging=@ndb_curr_val;


•  'session_table' will not be
•  REDO logged or Checkpointed → No disk activity for this table!
•  After System Restart it will be there, but empty!
Best Practice : Backup
•  Backup of NDB tables
•  Online – can have ongoing transactions
•  Consistent – only committed data and changes are backed up
•  ndb_mgm -e “START BACKUP”
•  Copy backup files from data nodes to safe location
•  Non-NDB tables must be backed up separately
•  MySQL system tables are stored only in MyISAM.
•  You want to backup (for each mysql server)
•  mysql database
•  Triggers, routines, events ...
•  Use 'mysqldump'
•  mysqldump mysql > mysql.sql
•  mysqldump --no-data --no-create-info -R > routines.sql
•  Copy my.cnf & config.ini files
Best Practice: Restore
•  ndb_restore is in many cases the MOST write intensive operation on
Cluster
•  The problem is that ndb_restore produces REDO LOG
•  This is unnecessary but a fact for now
•  Restores many records in parallel, no throttling..
•  So 128 or more small records may be fine, but 128 BLOBs….
Temporary error: 410: REDO log buffers overloaded, consult online manual
(increase RedoBuffer, and|or
decrease TimeBetweenLocalCheckpoints, and|or increase NoOfFragmentLogFiles)

•  If you run into this during restore


•  Try increasing RedoBuffer (a value higher than 64MB is seldom
practical or needed)
•  Run only one instance of ndb_restore
•  Reduce parallelism: ndb_restore -p10 ....
•  Or even a lower value, e.g. -p1
•  If this does not help → faster disk(s) are needed
(the RedoBuffer is synced every TimeBetweenGlobalCheckpoints)
Resources

•  Getting Started with MySQL Cluster – 5 Steps, <15 minutes


•  http://www.mysql.com/products/database/cluster/get-started.html#quickstart

•  MySQL Cluster Evaluation Guide


•  http://www.mysql.com/why-mysql/white-papers/mysql_cluster_eval_guide.php

•  MySQL Cluster Performance Tuning Best Practices


http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster_perfomance.php
