Virtualizing SQL Server with VMware
Michael Corey
Jeff Szastak
Michael Webster
Foreword
Preface
About the Authors
About the Technical Reviewer
Acknowledgments
Reader Services
1 Virtualization: The New World Order?
Virtualization: The New World Order
Virtualization Turns Servers into Pools of Resources
Living in the New World Order as a SQL Server DBA
A Typical Power Company
Summary
2 The Business Case for Virtualizing a Database
Challenge to Reduce Expenses
The Database Administrator (DBA) and Saving Money
Service Level Agreements (SLA) and the DBA
Avoiding the Good Intention BIOS Setting
DBA's Top Reasons to Virtualize a Production Database
High Availability and Database Virtualization
Performance and Database Virtualization
Provisioning/DBaaS and Database Virtualization
Hardware Refresh and Database Virtualization
Is Your Database Too Big to Virtualize?
Summary
3 Architecting for Performance: The Right Hypervisor
What Is a Hypervisor?
Hypervisor Is Like an Operating System
What Is a Virtual Machine?
Paravirtualization
The Different Hypervisor Types
Type-1 Hypervisor
Type-2 Hypervisor
Paravirtual SCSI Driver (PVSCSI) and VMXNET3
Installation Guidelines for a Virtualized Database
It's About Me, No One Else But Me
Virtualized Database: It's About Us, All of Us
DBA Behavior in the Virtual World
Shared Environment Means Access to More If You Need It
Check It Before You Wreck It
Why Full Virtualization Matters
Living a DBA's Worst Nightmare
Physical World Is a One-to-One Relationship
One-to-One Relationship and Unused Capacity
One to Many: The Virtualized World
The Right Hypervisor
Summary
4 Virtualizing SQL Server: Doing IT Right
Doing IT Right
The Implementation Plan
Service-Level Agreements (SLAs), RPOs, and RTOs
Baselining the Existing vSphere Infrastructure
Baselining the Current Database Workload
Bird's-Eye View: Virtualization Implementation
How a Database Virtualization Implementation Is Different
Summary
5 Architecting for Performance: Design
Communication
Mutual Understanding
The Responsibility Domain
Center of Excellence
Deployment Design
SQL Workload Characterization
Putting It Together (or Not)
Reorganization
Tiered Database Offering
Physical Hardware
CPU
Memory
Virtualization Overhead
Swapping, Paging? What's the Difference?
Large Pages
NUMA
Hyper-Threading Technology
Memory Overcommitment
Reservations
SQL Server: Min/Max
SQL Server: Lock Pages in Memory
Storage
Obtain Storage-Specific Metrics
LSI Logic SAS or PVSCSI
Determine Adapter Count and Disk Layout
VMDK versus RDM
VMDK Provisioning Type
Thin Provisioning: vSphere, Array, or Both?
Data Stores and VMDKs
VMDK File Size
Networking
Virtual Network Adapter
Managing Traffic Types
Back Up the Network
Summary
6 Architecting for Performance: Storage
The Five Key Principles of Database Storage Design
Principle 1: Your database is just an extension of your storage
Principle 2: Performance is more than underlying storage devices
Principle 3: Size for performance before capacity
Principle 4: Virtualize, but without compromise
Principle 5: Keep it standardized and simple (KISS)
SQL Server Database and Guest OS Storage Design
SQL Server Database File Layout
Number of Database Files
Size of Database Files
Instant File Initialization
SQL Server File System Layout
SQL Server Buffer Pool Impact on Storage Performance
Updating Database Statistics
Data Compression and Column Storage
Database Availability Design Impacts on Storage Performance
Volume Managers and Storage Spaces
SQL Server Virtual Machine Storage Design
Virtual Machine Hardware Version
Choosing the Right Virtual Storage Controller
Choosing the Right Virtual Disk Device
SQL Virtual Machine Storage Layout
Expanding SQL Virtual Machine Storage
Jumbo VMDK Implications for SQL Server
vSphere Storage Design for Maximum SQL Performance
Number of Data Stores and Data Store Queues
Number of Virtual Disks per Data Store
Storage IO Control: Eliminating the Noisy Neighbor
vSphere Storage Policies and Storage DRS
vSphere Storage Multipathing
vSphere 5.5 Failover Clustering Enhancements
RAID Penalties and Economics
SQL Performance with Server-Side Flash Acceleration
VMware vSphere Flash Read Cache (vFRC)
Fusion-io ioTurbine
PernixData FVP
SQL Server on Hyperconverged Infrastructure
Summary
7 Architecting for Performance: Memory
Memory
Memory Trends and the Stack
Database Buffer Pool and Database Pages
Database Indexes
Host Memory and VM Memory
Mixed Workload Environment with Memory Reservations
Transparent Page Sharing
Internet Myth: Disable Memory TPS
Memory Ballooning
Why the Balloon Driver Must Run on Each Individual VM
Memory Reservation
Memory Reservation: VMware HA Strict Admission Control
Memory Reservations and the vswap File
SQL Server Max Server Memory
SQL Server Max Server Memory: Common Misperception
Formula for Configuring Max Server Memory
Large Pages
What Is a Large Page?
Large Pages Being Broken Down
Lock Pages in Memory
How to Lock Pages in Memory
Non-Uniform Memory Access (NUMA)
vNUMA
Sizing the Individual VMs
More VMs, More Database Instances
Thinking Differently in the Shared-Resource World
SQL Server 2014 In-Memory Built In
Summary
8 Architecting for Performance: Network
SQL Server and Guest OS Network Design
Choosing the Best Virtual Network Adapter
Virtual Network Adapter Tuning
Windows Failover Cluster Network Settings
Jumbo Frames
Configuring Jumbo Frames
Testing Jumbo Frames
VMware vSphere Network Design
Virtual Switches
Number of Physical Network Adapters
Network Teaming and Failover
Network I/O Control
Multi-NIC vMotion
Storage Network and Storage Protocol
Network Virtualization and Network Security
Summary
9 Architecting for Availability: Choosing the Right Solution
Determining Availability Requirements
Providing a Menu
SLAs, RPOs, and RTOs
Business Continuity vs. Disaster Recovery
Business Continuity
Disaster Recovery
Disaster Recovery as a Service
vSphere High Availability
Hypervisor Availability Features
vMotion
Distributed Resource Scheduler (DRS)
Storage vMotion
Storage DRS
Enhanced vMotion (X-vMotion)
vSphere HA
vSphere App HA
vSphere Data Protection
vSphere Replication
vCenter Site Recovery Manager
VMware vCloud Hybrid Service
Microsoft Windows and SQL Server High Availability
ACID
SQL Server AlwaysOn Failover Cluster Instance
SQL Server AlwaysOn Availability Groups
Putting Together Your High Availability Solution
Summary
10 How to Baseline Your Physical SQL Server System
What Is a Performance Baseline?
Difference Between Performance Baseline and Benchmarks
Using Your Baseline and Your Benchmark to Validate Performance
Why Should You Take a Performance Baseline?
When Should You Baseline Performance?
What System Components to Baseline
Existing Physical Database Infrastructure
Database Application Performance
Existing or Proposed vSphere Infrastructure
Comparing Baselines of Different Processor Types and Generations
Comparing Different System Processor Types
Comparing Similar System Processor Types Across Generations
Non-Production Workload Influences on Performance
Producing a Baseline Performance Report
Performance Traps to Watch Out For
Shared Core Infrastructure Between Production and Non-Production
Invalid Assumptions Leading to Invalid Conclusions
Lack of Background Noise
Failure to Consider Single Compute Unit Performance
Blended Peaks of Multiple Systems
vMotion Slot Sizes of Monster Database Virtual Machines
Summary
11 Configuring a Performance Test: From Beginning to End
Introduction
What We Used: Software
What You Will Need: Computer Names and IP Addresses
Additional Items for Consideration
Getting the Lab Up and Running
VMDK File Configuration
VMDK File Configuration Inside Guest Operating System
Memory Reservations
Enabling Hot Add Memory and Hot Add CPU
Affinity and Anti-Affinity Rules
Validate the Network Connections
Configuring Windows Failover Clustering
Setting Up the Clusters
Validate Cluster Network Configuration
Changing Windows Failover Cluster Quorum Mode
Installing SQL Server 2012
Configuration of SQL Server 2012 AlwaysOn Availability Groups
Configuring the Min/Max Setting for SQL Server
Enabling Jumbo Frames
Creating Multiple tempdb Files
Creating a Test Database
Creating the AlwaysOn Availability Group
Installing and Configuring Dell DVD Store
Running the Dell DVD Store Load Test
Summary
Appendix A Additional Resources
Additional Documentation Sources
User Groups
VMUG: The VMware Users Group
PASS: Professional Association for SQL Server
VMware Community
Facebook Groups
Blogs
Twitter: 140 Characters of Real-Time Action
Index
Foreword
About 10 years ago, I started a new job. The company I started working for had a
couple hundred physical servers at the time. When several new internal software
development projects started, we needed to expand quickly and added dozens of new
physical servers. Pretty soon we started hitting all the traditional datacenter problems,
such as lack of floor space, high power consumption, and cooling constraints. We had to
solve our problems, and during our search for a solution we were introduced to a new
product called VMware ESX and Virtual Center. It didn't take long for us to see the
potential and to start virtualizing a large portion of our estate.
During this exercise, we started receiving a lot of positive feedback on the performance
of the virtualized servers. On top of that, our application owners loved the fact that we
could deploy a new virtual machine in hours instead of waiting weeks for new
hardware to arrive. I am not even talking about all the side benefits, such as VMotion
(or vMotion, as we call it today) and VMware High Availability, which provided a
whole new level of availability and enabled us to do maintenance without any
downtime for our users.
After the typical honeymoon period, the question arose: What about our database
servers? Could this provide the same benefits in terms of agility and availability while
maintaining the same performance? After we virtualized the first database server, we
quickly realized that just using VMware Converter and moving from physical to virtual
was not sufficient, at least not for the databases we planned to virtualize.
To be honest, we did not know much about the database we were virtualizing. We
didn't fully understand the CPU and memory requirements, nor did we understand the
storage requirements. We knew something about the resource consumption, but how do
you make a design that caters to those requirements? Perhaps even more importantly,
where do you get the rest of the information needed to ensure success?
Looking back, I wish we'd had guidance in any shape or form that could have helped
along our journey: guidance that would provide tips about how to gather requirements,
how to design an environment based on these requirements, how to create a
performance baseline, and what to look for when hitting performance bottlenecks.
That is why I am pleased Jeff Szastak, Michael Corey, and Michael Webster took the
time to document the valuable lessons they have learned in the past few years about
virtualizing tier 1 databases and released it through VMware Press in the form of this
book you are about to read. Having gone through the exercise myself, and having made
all the mistakes mentioned in the book, I think I am well qualified to urge you to soak in
all this valuable knowledge to ensure success!
Duncan Epping
Principal Architect, VMware
Yellow-Bricks.com
Preface
As we traveled the globe presenting on how to virtualize the most demanding business-
critical applications, such as SQL Server, Oracle, Microsoft Exchange, and SAP, it
became very clear that attendees had a very real and unmet need to learn
how to virtualize these most demanding applications correctly.
This further hit home when we presented at the VMworld conferences in San Francisco
and Barcelona. At each event, we were assigned a very large room that held over 1,800
people; within 48 hours of attendees being able to reserve a seat in the room, it was
filled to capacity. We were then assigned a second large room that again filled up
within 24 hours.
Recognizing that the information we had among the three of us could help save countless
others grief, we decided to collaborate on this very practical book.
Target Audience
Our goal was to create, in one book, a comprehensive resource that a solution architect,
system administrator, storage administrator, or database administrator could use to
guide them through the necessary steps to successfully virtualize a database. Many of the
lessons learned in this book apply to virtualizing any business-critical application,
from SAP and E-Business Suite to Microsoft Exchange and Oracle, but the specific focus
of this book is Microsoft SQL Server. Although you don't have to be a database
administrator to understand the contents of this book, it does help if you are technical
and have a basic understanding of vSphere.
Approach Taken
Everything you need to succeed in virtualizing SQL Server can be found within the
pages of this book. By design, we created the book to be used in one of two ways. If you
are looking for a comprehensive roadmap to virtualize your mission-critical databases,
then follow along in the book, chapter by chapter. If you are trying to deal with a
particular resource that is constraining the performance of your database, then jump to
Chapters 5 through 8.
At a high level, the book is organized as follows:
Chapters 1 and 2 explain what virtualization is and the business case for it. If you
are a database administrator or new to virtualization, you will find these chapters
very helpful; they set the stage for why virtualizing your databases is doing IT
right.
Chapters 3 through 9 are the roadmap you can follow to successfully virtualize the
most demanding of mission-critical databases. Each chapter focuses on a
particular resource the database utilizes and how to optimize that resource to get
the best possible performance for your database when it is virtualized. We
purposely organized this section into distinct subject areas so that you can jump
directly to a particular chapter of interest when you need to brush up. We expect
that you will periodically return to Chapters 5 through 8 as you are fine-tuning the
virtualized infrastructure for your mission-critical databases.
The last two chapters walk you through how to baseline the existing SQL Server
database so that you can adequately determine the resource load it will put onto the
virtualized infrastructure. In these chapters, we also provide detailed instructions
on how to configure a stress test.
Here are the three major sections of the book with the associated chapters:
Jeff Szastak (@Szastak) is currently a Staff Systems Engineer for VMware. Jeff has
been with VMware for over six years, holding various roles with VMware during his
tenure. These roles have included being a TAM, Systems Engineer Specialist for
Business-Critical Applications, Enterprise Healthcare Systems Engineer, and a CTO
Ambassador. Jeff is a recognized expert for virtualizing databases and other high I/O
applications on the vSphere platform. Jeff is a regular speaker at VMworld, VMware
Partner Exchange, and VMware User Groups, and has spoken at several SQL PASS events.
Jeff holds a Master of Information Assurance degree as well as the distinguished CISSP
certification. Jeff has over 13 lucky years in IT and is passionate about helping others
find a better way to do IT.
We would like to thank the entire team at VMware Press for their support throughout
this project and for helping us get it across the line, especially Joan Murray
for her constant support and encouragement. We would like to thank our editorial team.
Thank you Ellie Bru and Mandie Frank for your attention to detail to make sure we put
out a great book, and last but not least, we would especially like to thank our technical
reviewer, Mark Achtemichuk (VCDX #50).
Michael Corey
Anyone who has ever written a book knows firsthand what a tremendous undertaking it
is and how stressful it can be on your family. It is for that reason I thank my wife of 28
years, Juliann. Over those many years, she has been incredible. I want to thank my
children, Annmarie, Michael, and especially John, on whom this particular book was
hardest. John will know why if he reads this.
Jeff and Michael, my co-authors, are two of the smartest technologists I have ever had
the opportunity to collaborate with. Thank you for making this book happen despite the
many long hours it took you away from your families. Mark Achtemichuk, our technical
reviewer, rocks! He helped take this book to a whole new level. To my friends at
VMware (Don Sullivan, Kannan Mani, and Sudhir Balasubramanian), thank you for
taking all my late-night emails and phone calls to discuss the inner workings of vSphere.
To the publishing team at Pearson, what can I say? Thank you Joan Murray for believing
and making this book possible.
Special thanks go to my Ntirety family: Jim Haas, Terrie White, and Andy Galbraith,
three incredible SQL Server technologists. And special thanks to people like
David Klee and Thomas LaRock and to the entire SQL Server community. Every time I
attend a SQLSaturday event, I always think how lucky I am to be part of such a special
community of technologists who care a lot and are always willing to help.
Jeff Szastak
I would like to thank my loving wife, Heather, for her love, support, and patience during
the writing of this book. I want to thank my children, Wyatt, Oliver, and Stella, for it is
from you I draw inspiration. A huge thank-you to Hans Drolshagen for the use of his lab
during the writing of this book! And thanks to my mentor, Scott Hill, who pushed me,
challenged me, and believed in me. Thanks for giving a guy who couldn't even set a
DHCP address a job in IT, Scott.
Finally, I would like to thank the VMware community. Look how far we have come. I
remember the first time I saw a VMware presentation as a customer and thought, "If this
software works half as well as that presentation says it does, this stuff will change the
world." And it has, because of you, the VMware community.
Michael Webster
I'd like to thank my wife, Susanne, and my four boys, Sebastian, Bradley, Benjamin, and
Alexander, for providing constant love and support throughout this project and for
putting up with all the long hours on weeknights and weekends that it required to
complete this project. I would also like to acknowledge my co-authors, Michael and
Jeff, for inviting me to write this book with them. I am extremely thankful for this
opportunity, and it has been a fantastic collaborative process. Finally, I'd like to thank
and acknowledge VMware for providing the constant inspiration for many blog articles
and books and for creating a strong and vibrant community. Also, thanks go out to my
sounding boards throughout this project: Kasim Hansia, VMware Strategic Architect
and SAP expert; Cameron Gardiner, Microsoft Senior Program Manager for Azure and
SQL; and Josh Odgers (VCDX #90), Nutanix Senior Solutions and Performance
Architect. Your ideas and support have added immeasurable value to this book and the
IT community as a whole.
We Want to Hear from You!
As the reader of this book, you are our most important critic and commentator. We
value your opinion and want to know what we're doing right, what we could do better,
what areas you'd like to see us publish in, and any other words of wisdom you're
willing to pass our way.
We welcome your comments. You can email or write us directly to let us know what
you did or didn't like about this book, as well as what we can do to make our books
better.
Please note that we cannot help you with technical problems related to the topic of
this book.
When you write, please be sure to include this book's title and author as well as your
name, email address, and phone number. We will carefully review your comments and
share them with the author and editors who worked on the book.
Email: VMwarePress@vmware.com
Mail: VMware Press
ATTN: Reader Feedback
800 East 96th Street
Indianapolis, IN 46240 USA
Reader Services
Chapter 1. Virtualization: The New World Order?

"It is not the strongest of the species that survives, nor the most intelligent,
but the one most responsive to change."
Charles Darwin
This chapter is about a new computing paradigm where your SQL Server databases are
virtualized. In this chapter, we discuss what it means to break the tether of the database
from the physical server. We use real-world examples to demonstrate how a virtualized
database will better enable you to respond to the needs of your business, day in and day
out.
Summary
In this chapter, we introduced you to a new world where all your SQL Server databases
are virtualized: a world where your database is not tethered to a specific physical
server, just as your cell phone is not tethered to a particular cell tower. In this new
world order, your database can move from physical server to physical server as
resource demand fluctuates. If a cell phone is dropped or broken, you are out of luck,
but with virtualization you can protect your database from all kinds of failures. Using
the example of Cyber Monday, we showed how you could dynamically allocate
additional vCPU or memory, as it was most needed, to a SQL Server database running a
retail website. This new world order is a world where your SQL Server database has
access to any computing resource, on any physical server, at any time.
Chapter 2. The Business Case for Virtualizing a Database
In this chapter, we review the business case for why you should virtualize your
databases, with specifics around the benefits of virtualizing a business-critical
application such as a Microsoft SQL Server 2012 database. Topics covered include the
following:
Server/database consolidation
Database as a Service (DBaaS)
IT efficiency (the golden template)
Service-level agreements (SLAs on steroids)
Is your database too big to virtualize?
These topics will be discussed in the context of virtualizing a business-critical
application, which is different from a non-business-critical application. Specifically,
we look at which drivers are common for virtualizing a database and which are not.
Tip
Check the BIOS power management settings on all servers that may host a
database to ensure they are enabled for performance versus power savings. You
may use more energy, but your database performance will improve.
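To make this tip actionable inside the guest, the following is a minimal sketch (not taken from the book's own procedures) showing how you might confirm the active Windows power plan with PowerShell. The host BIOS setting itself, and the ESXi host power policy, still need to be checked through the server's management console and the vSphere Client.

# Minimal, illustrative sketch: confirm the active Windows power plan in the guest.
powercfg /getactivescheme

# Switch to the built-in High Performance plan if a power-saving plan is active.
# (SCHEME_MIN is the Windows alias for the High Performance scheme.)
powercfg /setactive SCHEME_MIN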
As you can see, you have a lot to consider when configuring a virtualized environment
to optimally support a database.
Figure 2.3 The Petabyte Challenge: 2011 IOUG Database Growth Survey. The survey was
produced by Unisphere Research and sponsored by Oracle. Figure provided courtesy of
Unisphere Research, a Division of Information Today, Inc., and the Independent Oracle
Users Group (IOUG).
As these databases get bigger and more complex, the ability to recover also becomes
more complex. With virtualization, you have redundancy up and down the entire
infrastructure stack. By maintaining a high level of redundancy, you can avoid a
situation where you would have to perform a database recovery in the first place.
Figure 2.4 illustrates the many levels of redundancy you have when your database is
virtualized. For example, if a network interface card (NIC) or even a port were to fail,
the VMware hypervisor would detect the failure and reroute traffic to another available
port. If a host bus adapter (HBA) path were to fail, the VMware hypervisor would
detect the failure and reroute the request to the storage system another way. The best
part is that all of this is built into the hypervisor and is transparent to the database and
applications.
Figure 2.4 Virtualization protections at every level.
At the server level, you have options such as VMware High Availability (HA). With
VMware HA, if your server were to fail, all the affected virtual machines would be
restarted onto another available server with capacity. If an operating system failed
within a virtual machine, VMware HA would detect the failure and restart the VM.
VMware Fault Tolerance takes the level of protection up a notch. In the event of a
server failure, VMware Fault Tolerance provides transparent failover at a virtual
machine level with no disruption of service.
Moving all the way to the right side of Figure 2.4, you have VMware Site Recovery
Manager (SRM). This maps virtual machines to the appropriate resources on a failover
site. In the event of a site failure, all VMs would be restarted at the failover site.
With these tools, as a DBA you now have more options than ever before to improve the
availability of your databases and to take high availability to a new level.
Summary
In this chapter, we discussed the business case for virtualization, including ROI.
Operating in a one-server-to-one-application model has led to a
number of costly inefficiencies in how businesses operate. Companies that adopt
virtualization typically see significant cost savings and increased utilization.
This has created a very powerful financial reason to adopt virtualization. Combined
with the many capabilities of a virtualized infrastructure, this provides a DBA with
many options. The inherent capabilities surrounding redundancy up and down the entire
infrastructure stack that comes with a virtualized platform will improve the availability
of the databases, enabling you to exceed the documented and undocumented service
levels you have with your customers.
We also discussed how virtualization has created an abstracted layer from the physical
environment; once you break your databases away from the shackles of the physical
server hardware, hardware becomes a commodity you can easily leverage. We stressed
the importance of understanding the resource requirements of your database in a shared
environment, and how important it is to size the VM that houses the database
appropriately. A proper baseline of your database is key to understanding resource
requirements and will help you avoid a lot of problems down the road.
Chapter 3. Architecting for Performance: The Right
Hypervisor
In this chapter, we discuss what a hypervisor is and the different types of virtualization
hypervisors on the market. We also discuss why some hypervisors run applications true
to the native or physical stack whereas other hypervisors do not.
This is especially important to understand given that a SQL Server database is one of
the most complex applications you may ever virtualize. When a hypervisor does not run
true to the physical stack, it is possible to encounter bugs that would not exist in the
physical world, thus introducing an additional level of complexity and risk that you need
to be aware of.
We look at the different generations of VMware vSphere hypervisor in this chapter. Just
as there are many versions of the SQL Server database, there are many versions of the
vSphere hypervisor. You would not run your most demanding SQL Server workloads on
SQL Server 2000, just as you would not run your most demanding virtualized
workloads on VMware Infrastructure 3.0. It's important that you are running on a
version of vSphere that was built to support the complex resource needs and demands of
a SQL Server database. Finally, we discuss some additional things to consider when
virtualizing your SQL Server database.
What Is a Hypervisor?
To help you better understand what a hypervisor is and the role it plays, let's look at a
portion of a typical infrastructure before and after it has been virtualized. Figure 3.1
illustrates a small slice of a much larger infrastructure. Imagine you have three physical
servers. Each server is running a different operating system. One server or physical host
is running a flavor of Linux, another server is running a version of the Windows Server
2008 operating system, and the final server is running a version of the Windows Server
2012 operating system. For the purposes of this example, it is not important what
version of the particular operating system the different physical machines are running.
Figure 3.1 Three physical hosts before virtualization.
Each of these individual operating systems is responsible for providing physical
resources such as CPU, memory, disk, and network to the different applications sitting
on it.
For example, sitting on the Windows Server 2008 operating system could be a series of
applications that include a SQL Server database and a number of other applications.
The Windows operating system would provide each of those applications access to the
physical resources: CPU, memory, and disk.
This would also hold true for the other servers in Figure 3.1. There would be a series of
applications running on those servers, including databases and various other
applications, and each OS would provide access to its resources. Figure 3.1 is a high-
level illustration of this before the environment is virtualized.
Definition
Guest operating system: The operating system that runs on a virtual machine.
Figure 3.2 illustrates the same environment when it is virtualized. Sitting on the physical
host is the hypervisor. The hypervisor's job is to provide physical resources such as
CPU, memory, and disk to its customers. Its customers are the many guest OSes running
on the virtual machines.
Note
A virtual machine is a software-based partition of a computer. It contains
an operating system. The many applications running on the VM, such as a
SQL Server database, execute the same way they would on a physical
server.
Applications such as your SQL Server databases run in a software-based partition the
same way they would run in a nonvirtualized infrastructure. The core tenets of
virtualization include the following:
Partitioning: The ability to run multiple operating systems on one physical
machine, and to divide resources between different virtual machines.
Isolation: The ability to apply advanced resource controls to preserve
performance. Fault and security isolation is at the hardware level.
Encapsulation: The ability to move and copy virtual machines as easily as
moving and copying files. The entire state of a virtual machine can be saved to a
file.
Hardware independence: The ability to provision or migrate any virtual
machine to any similar or different physical server.
As a DBA, these are important points to keep in mind as you manage the database.
Paravirtualization
Some other vendors decided to implement paravirtualization. To quote Wikipedia,
"paravirtualization is a virtualization technique that presents a software interface to
virtual machines that is similar, but not identical, to that of the underlying hardware."
The key words here are "but not identical."
The definition goes on to say, "The intent of the modified interface is to reduce
the portion of the guest's execution time spent performing operations which are
substantially more difficult to run in a virtual environment compared to a non-
virtualized environment. The paravirtualization provides specially defined hooks to
allow the guest(s) and host to request and acknowledge these tasks, which would
otherwise be executed in the virtual domain (where execution performance is worse)."
The goal of paravirtualization is lower virtualization overhead. To accomplish this,
vendors enable the guest operating system to skip the virtualization layer for certain
types of operations, which requires altering the guest operating system itself.
For example, the Red Hat Linux that runs on a physical host is not the same Red Hat
Linux that runs on a hypervisor using paravirtualization. Every time the hypervisor is
updated, the operating system must be modified to match. This opens up the
possibility for a database to behave differently when it is virtualized.
In the context of this conversation, we have been talking about CPU instructions. The
authors of this book agree with the VMware approach to virtualization: Altering the
guest operating system is not acceptable. There are too many inherent risks associated
with running a database on an altered OS.
When it comes to device drivers, making them aware they are virtualized can be a real
advantage. The classic example of this is the VMXNET3 driver for the network. In the
section titled "Paravirtual SCSI Driver (PVSCSI) and VMXNET3," we discuss these
types of drivers in more detail.
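As a quick, hedged illustration of what taking advantage of these drivers can look like, the following PowerCLI sketch switches a virtual machine's network adapters to VMXNET3 and its SCSI controller to the paravirtual type. The vCenter address and VM name are hypothetical, the change assumes VMware Tools (which supplies both drivers) is already installed in the guest, and the reconfiguration is best done with the VM powered off.

# Illustrative PowerCLI sketch; requires the VMware PowerCLI module.
# 'vcenter.lab.local' and 'SQLVM01' are placeholder names.
Connect-VIServer -Server 'vcenter.lab.local'

$vm = Get-VM -Name 'SQLVM01'

# Switch the VM's network adapter(s) to VMXNET3.
Get-NetworkAdapter -VM $vm | Set-NetworkAdapter -Type Vmxnet3 -Confirm:$false

# Switch the SCSI controller to the paravirtual type. Many designs keep the OS disk
# on the default controller and use PVSCSI only for data/log disks; this one-liner
# simply retypes the existing controller for brevity.
Get-ScsiController -VM $vm | Set-ScsiController -Type ParaVirtual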
Type-1 Hypervisor
A Type-1 hypervisor sits on the bare metal or physical hardware. Think of bare metal
as a computer without any operating system.
Starting at the bottom in Figure 3.3, on the left side you have the physical hardware, or
bare metal. Sitting on top of that is the Type-1 hypervisor. VMware vSphere ESXi is an
example of a Type-1 hypervisor. Moving further up the left side, on top of the Type-1
hypervisor are the many different self-contained virtual machines with their guest
operating systems. In the example, we show two virtual machines, but there would
typically be many. An important point to make is that until the hypervisor is started,
none of the virtual machines are able to run.
Type-2 Hypervisor
A Type-2 hypervisor runs on top of another operating system. This means that until the
underlying operating system has booted, you would not be able to use the Type-2
hypervisor. That is an easy way to distinguish the type of hypervisor you are running.
Once again, refer to Figure 3.3, only on the right side this time.
Starting at the bottom-right side, the physical hardware is illustrated. Moving up from
there sitting on top of the physical hardware is the operating system (for example,
Linux). Note that this is the native operating system, not a guest operating system. On top
of the native operating system, you could have both applications running on the native
operating system itself and a Type-2 hypervisor also running. Then running on the Type-
2 hypervisor could be one or more virtual machines and their various guest operating
systems.
Drawbacks to the Type-2 Hypervisor
If the operating system sitting on the hardware crashes, it will bring everything down on
the box, including the Type-2 hypervisor. If a hacker breaches the operating system on
which the Type-2 hypervisor is running, then everything is at risk. This makes the Type-
2 hypervisor only as secure as the underlying operating system on which it is running. If
critical security patches are released for that operating system that have nothing to do
with virtualization, you are now required to patch those boxes and work these patches in
with your patching of the guest operating systems. In our opinion, serious virtualization
requires a Type-1 hypervisor at a minimum.
Tip
When installing your database on a virtualized infrastructure, follow the same
installation guidelines you would on a physical infrastructure.
It's About Me, No One Else But Me
From your perspective as a DBA, when you set up a database in the physical world, it's
typically about "me, just me, and no one else but me." It's your database server,
and everyone else on it is either an invited guest or an unwanted visitor. As we have
been discussing, this is the world of a one-to-one relationship. Your production
database sits on a server whose only purpose is to support the production SQL Server
database that supports the business. When it sits idle, those resources go to waste. If you
need more resources, you are limited to what the physical server the database sits on
was purchased with.
As a DBA, you tune up the database to take full advantage of all resources available to
it. For example, Max Server Memory would be configured to take advantage of all the
RAM in the box except what is needed for the operating system; you set aside only that
small amount of RAM because it is good for "me." Max Server Memory is discussed in
great detail in Chapter 7, "Architecting for Performance: Memory." In fact, when
databases get moved onto storage arrays, as DBAs we don't take too well to that at
first, because it means it's no longer just about "me." You have to deal with a storage
array administrator who may not have the best interests of the database as their top priority.
The storage administrator needs to ensure performance for all the systems connected to
the array, not just the database servers.
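As a preview of the kind of setting Chapter 7 covers in detail, here is a hypothetical sketch of configuring Max Server Memory with sp_configure through PowerShell. The instance name and memory value are placeholders, not recommendations; the right value depends on the sizing work described later in this book.

# Hypothetical sketch: cap Max Server Memory so SQL Server leaves room for the OS
# (and, in a VM, any other agents in the guest). Requires the SqlServer module
# (Install-Module SqlServer). Instance name and value below are placeholders.
$instance = 'SQLVM01'
$maxServerMemoryMB = 24576   # e.g., a 32 GB VM, leaving roughly 8 GB for the OS

$tsql = @"
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', $maxServerMemoryMB;
RECONFIGURE;
"@

Invoke-Sqlcmd -ServerInstance $instance -Query $tsql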
Let's face it: As DBAs, we don't play well in the sandbox with others. We have grown
up in a world where we don't have to share, and we are not historically good at it.
We have grown up in a world where we have to solve problems by ourselves all the
time. When you virtualize your database, it's important to note that the world changes.
There are others in the sandbox with you. You have to learn how to share and rely on
others if you are to succeed. Good communication and understanding of your
requirements among the different teams is critical in this new world.
Important
A virtualized database is housed in a shared environment. It's important as
DBAs that we don't hoard resources such as CPU, memory, and disk. It's
important that we communicate clearly what the VM that houses the database needs.
It's also important for the vSphere administrator, storage administrator, and
network administrators to work with you to meet your requirements.
This means working closely with the storage administrators, network administrators, and
vSphere administrators. As our coauthor Michael Corey likes to say, "I flunked Mind
Reading. If you don't tell me what you need, I won't know."
The customer had a mission-critical third-party application they ran their business on
that required updating. The application and the database ran on a virtual machine. An
upgrade plan was put in place. As part of this upgrade, we had to take the most current
backup and restore the database onto a new virtual machine, and then proceed with an
upgrade of the application.
When we attempted to restore the database on the new virtual machine, it would not
work. We double-checked the backup logs to see if there was a problem with the
original SQL Server backup. Every indication we had was there were no issues. We
then decided to do another full backup of the production SQL Server database and apply
that to the new virtual machine. No matter what we did, we could not restore the
database. We could not find a problem anywhere with the backups that were being
performed.
This was a production SQL Server database where the backups were being taken with
no errors, yet the backups would not restore. As a database administrator, this is a
serious problem, and a DBA's worst nightmare. We immediately opened up critical
tickets with Microsoft, the vendor that provided the hypervisor, and the third-party
application vendor. When we got the answer, we nearly lost it. This was a known
problem when a SQL Server database was being virtualized on this vendor's
hypervisor. The virtualization vendor did not have any workarounds and acted like this
was not a big deal.
From a DBA's perspective, this was a huge problem. By altering the operating stack,
they had created a situation that could have put the company out of business. Because the
database was running on a Type-2 hypervisor, the database was running differently than
it would have on physical equipment. The combination of the alterations to the operating
system and the lack of full virtualization created this very dangerous situation.
When you virtualize a database, make sure its on VMware vSphere, which has been
proven by hundreds of thousands of customers successfully running mission-critical
systems. vSphere is a full-virtualization implementation and does not alter the
operating stack in any way. This means the virtualized database will perform exactly
like its counterpart on physical hardware, and you won't ever have to worry about your
database backup not being valid, as our customer experienced in this example.
Do not confuse a paravirtual hypervisor with paravirtual drivers. Paravirtual drivers
are built to optimize performance in a virtual environment. In our experience, you
should take full advantage of these drivers where it makes sense. A great example of a
driver you should consider for your SQL Server database is the Paravirtual SCSI
driver.
This chapter focuses on the things you need to know and do as you start down the path of
database virtualization. The advice given in this chapter takes a very conservative
approach, with the end goal of helping you avoid the common traps and pitfalls
encountered when you first virtualize a production SQL Server database. Topics
covered include the following:
Documentation
The implementation plan
The importance of obtaining a baseline
Additional considerations
A bird's-eye view of the implementation process
Doing IT Right
Our experience has taught us that the best place to start down the path of database
virtualization is to read the documentation. The first thing many DBAs do when a new
version of the database comes out is to install it and start using it. (In a nonproduction
environment, of course; no DBA is going to deploy a new version of the database,
including a database patch, without first testing it.)
The problem is that those same DBAs don't always circle back and do a complete read
of the documentation from front to back. This is further compounded by the fact that
vSphere is easy to install and use right out of the box, which lulls you into thinking you
do not need to read the documentation. A strong word of caution is needed here: What has
worked up until now in your virtualization infrastructure will not necessarily work
when you put the demands of a production database onto that environment!
Tip
Read all the documentation from all the vendors. That includes VMware,
Microsoft, the network vendor, and especially the storage array vendor; in
particular, their SQL Server Best Practice Guides.
Tip
When you baseline a SQL Server database, make sure your sample interval is
frequent. CPU, memory, and disk should be sampled at 15-second intervals or
less. A lot can happen in a database in a short amount of time.
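One illustrative way (not prescribed by the book) to capture such a baseline at a 15-second interval is PowerShell's Get-Counter. The counter list and output path below are only examples; Performance Monitor data collector sets or third-party tools work equally well, and you would normally add SQL Server counters (buffer manager, batch requests, and so on) to the list.

# Illustrative sketch: sample a few core counters every 15 seconds for one hour
# (240 samples) and append them to a CSV for later baseline analysis.
$counters = @(
    '\Processor(_Total)\% Processor Time',
    '\Memory\Available MBytes',
    '\PhysicalDisk(_Total)\Avg. Disk sec/Read',
    '\PhysicalDisk(_Total)\Avg. Disk sec/Write'
)

Get-Counter -Counter $counters -SampleInterval 15 -MaxSamples 240 |
    ForEach-Object { $_.CounterSamples } |
    Select-Object Timestamp, Path, CookedValue |
    Export-Csv -Path 'C:\Baseline\sql-baseline.csv' -NoTypeInformation -Append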
Phase 2: Discovery
Phase 2 is the discovery stage of the process. This differs from the discovery phase in
the previous plan, which focused on obtaining a proper inventory of the current system
and an assessment of the current physical infrastructure, ensuring you understand
the full scope of what you will be virtualizing, and then establishing a baseline of the
current workloads so you can determine the requirements for CPU, memory, disk, and
network.
The discovery stage for virtualizing a database is focused on establishing a baseline of
the existing vSphere environment and comparing it to the baseline of the existing
database workload to understand where the environment is deficient. At this point in
time, you already have an existing vSphere infrastructure onto which you will be
introducing the database workload. You need to understand what will happen when you
introduce the demands of a production database onto that infrastructure. Identifying
those deficiencies in the existing environment and making the necessary adjustments to
support the production database are important at this stage.
Phase 2.1: Database Consolidations
This is an excellent point in the process to give some serious consideration to going
through a database consolidation exercise. Much of the information you need for a
database consolidation is already being gathered in the discovery phase. Based on our
experience with SQL Server database consolidation efforts, we typically see greater
than 50% consolidation ratios. Not only does this lower the database management
footprint of the environment, it can also have an impact on licensing costs.
Phase 3: Infrastructure Adjustments
At this point, you have analyzed the vSphere baseline and compared it to the database
baseline and fully understand where the existing infrastructure is deficient. You need to
make the necessary adjustments to that infrastructure so it is able to meet the resource
needs of the database.
This could be as simple as adding more memory to a host, adding additional hosts to a
cluster, moving virtual machines off a host to free up resources for a database, or adding
a high-performance storage array to the existing environment. What is important is that
once you understand where the existing infrastructure is deficient, you make the needed
adjustments so when you virtualize the database your efforts will be successful.
Phase 4: Validation and Testing
It is always important that you take the time to test the infrastructure and validate that it will
be able to meet the demands of the database once it is placed onto that infrastructure.
One of the scenarios you need to take the time to test is what happens when the physical
host that houses your production database fails. You want to make sure that the
infrastructure, as configured, will still be able to meet the business requirements for
availability with adequate performance during this scenario.
Phase 5: Migration and Deployment
As you prepare for the migration of the database over to the virtualized environment,
you have a number of ways to accomplish this. During the requirements-gathering phase,
you will have determined the acceptable amount of downtime for each of the database
instances you are about to virtualize. For those production databases where you have an
adequate downtime window, it is common to see a backup/restore used.
For those database instances where downtime needs to be minimized, experience has
taught us that the go-to method with low impact is log shipping. You pre-create the
database instance on the virtualized infrastructure, move over the instance-level objects
(such as database mail settings and profiles, instance-level logins, agent jobs,
maintenance plans, SSIS packages, and server-level triggers), and then use log shipping
to move over the data to the new database instance.
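To make the cutover step concrete, the following is a hypothetical sketch of the final tail-log backup and restore once the new instance has been kept current through log shipping. The server, database, and share names are placeholders, and this is only the last step of a migration; the full backup, ongoing log restores WITH NORECOVERY, and the instance-level object moves described above must already be in place.

# Hypothetical cutover sketch; requires the SqlServer module. 'PHYSSQL01',
# 'SQLVM01', 'ProdDB', and the UNC path are placeholders.
$source = 'PHYSSQL01'
$target = 'SQLVM01'

# 1. Take a tail-log backup on the source, leaving the old database in RESTORING state.
Invoke-Sqlcmd -ServerInstance $source -Query @"
BACKUP LOG [ProdDB]
TO DISK = N'\\backupshare\ProdDB\ProdDB_tail.trn'
WITH NORECOVERY;
"@

# 2. Apply the tail-log backup on the new VM and bring the database online.
Invoke-Sqlcmd -ServerInstance $target -Query @"
RESTORE LOG [ProdDB]
FROM DISK = N'\\backupshare\ProdDB\ProdDB_tail.trn'
WITH RECOVERY;
"@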
The plans for how the virtualized database will be backed up should also be reviewed.
The most important step is to perform an actual restoration from this backup to ensure it
works and that the DBA team is confident in the database-restoration process.
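A hedged example of that validation step is sketched below: verify that the backup file is readable, then prove it by actually restoring it under a different name on the new VM. The backup path, database name, and logical file names are placeholders; use RESTORE FILELISTONLY against your own backup to confirm the logical names before running the MOVE clauses.

# Illustrative sketch only; names and paths are placeholders.
$target = 'SQLVM01'

Invoke-Sqlcmd -ServerInstance $target -Query @"
-- Quick readability check of the backup media.
RESTORE VERIFYONLY
FROM DISK = N'\\backupshare\ProdDB\ProdDB_full.bak';

-- The real proof: restore it as a throwaway copy on the new VM.
RESTORE DATABASE [ProdDB_RestoreTest]
FROM DISK = N'\\backupshare\ProdDB\ProdDB_full.bak'
WITH MOVE N'ProdDB_Data' TO N'E:\SQLData\ProdDB_RestoreTest.mdf',
     MOVE N'ProdDB_Log'  TO N'F:\SQLLog\ProdDB_RestoreTest.ldf';
"@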
Phase 6: Monitoring and Management
You will read over and over in this book how a virtualized infrastructure is a shared
infrastructure. This means changes in how you monitor and manage the environment
once the production databases go live on it. It's important that the DBA communicates
what they need from the team responsible for the shared environment and recognizes,
moving forward, that they need to work with that team for the common good of all.
Summary
The focus of this chapter is summarized in its title: "Virtualizing SQL Server: Doing IT
Right." The fact that you are virtualizing already means you are well on the path to
doing information technology right. It's important to start off right by reading the
documentation. I know this sounds old school, but sometimes old school is the right way
to do things. Make it a point to read all the vendor documentation, including VMware,
Microsoft, the network vendor, and the storage array vendor. We strongly encourage
you to pay special attention to the storage array vendor documentation. Our experience
and VMware Support data both support the fact that the storage layer is the source of
many issues that can be avoided if one takes the time to understand the storage
platform's capabilities and deploy it correctly.
We stressed the importance of a proper baseline of both the existing vSphere
infrastructure and the current database infrastructure. You use these two baselines to
determine where the existing infrastructure needs to be adjusted to be able to meet the
demands of a production SQL Server database when it is added to the environment. We
ended the chapter by walking through how a database virtualization implementation plan
is different from when you first start virtualizing your infrastructure.
Chapter 5. Architecting for Performance: Design
Communication
Before we get into the technical aspects of architecting SQL Server, let's talk about
what is likely the single most important aspect of virtualizing SQL Server. As you will see,
this chapter keeps coming back to single words, and the one word we have found most critical to
successfully running databases in a virtual environment is communication. For example,
when a virtualization administrator mentions vMotion, DRS, or VAAI, does the DBA
know what that means? What about the term "cluster"? Whose cluster are we talking
about? For DBAs, the term means one thing; for vSphere administrators, it means
something different.
Tip
Effective communication starts with everyone using the same language, so cross-
train each other. Once everyone is speaking the same language, then when terms
are thrown around, everyone understands what is being discussed. Reduce the
ambiguity.
What does communication have to do with architecture? You can have the biggest
servers, the fastest storage, and the best-tuned network, but if effective communication
does not exist among the teams responsible for running SQL Server, then expectations are
improperly set. At some point, when the system either breaks (hey, this is IT after all) or
performance does not meet expectations, the blame game will begin. In addition,
despite how good someone is at his or her job, nobody can know everything. You need
to be able to rely on your coworkers who have deep knowledge and years of experience
in their related fields.
If the necessary individuals are brought on board and made part of the process from the
beginning, they are more likely to buy into the success of the project. They buy into it
because they are part of the process. Because they are part of the process, they will
want to succeed. Or, if that psycho mumbo jumbo does not work, they will buy into the
project because their name is associated with it and self-preservation will kick in. They
will assist because they want to keep their job.
Note
We have been on many database virtualization initiatives over the years.
We have seen plenty of DBAs start the process kicking and screaming,
hugging their physical servers not wanting to let them go. However, when
management states the direction is to virtualize SQL and they are part of a
successful project, it is fun to watch their attitudes change. (We might need
to get out more often.)
Mutual Understanding
For the VMware administrators out there, please take off your shoes and hand them to
the DBAs. DBAs, please take off your shoes and hand them to the VMware admins.
Now, let's walk a bit in each other's shoes. Based on our combined years of experience
working with VMware administrators and DBAs (and, yes, this is a generalization), we
have found it is not always the fear of virtualization that prevents database
virtualization; instead, it is the unknown, a lack of knowledge by both sides of each
other's world that stalls this initiative. In addition, most vSphere administrators do not
understand what it takes to manage and maintain a production database that holds more
than application configuration information.
When we look at virtualization through the eyes of a DBA, their view can be summed up
in one word: shared. DBAs have spent hours designing, sizing, and optimizing a
database for a dedicated environment, and now the virtualization team is asking them to
move it from their dedicated environment to a shared environment. The fact that their
SQL virtual machine will be sharing the resources of the physical host with other guest
operating systems causes grave concerns and raises anxiety to unprecedented levels.
So how do we lower these anxiety levels and address these concerns? In a word,
education. Education of both the DBA and the vSphere administrator is a necessity. The
DBAs need to be educated as to the benefits of virtualization, how virtualization works,
and the best practices, management, and troubleshooting of a virtualized database
environment. Once DBAs better understand the virtual environment, they are able to
communicate with their coworkers in an effective manner.
We have spoken with many DBAs, and many of them understand that virtualization is a
train that is coming and that they are staring down the tracks at its headlamp. They want
to get on board and begin the process of virtualizing their databases, but they just don't
know how or where to get started. They do not know what they do not know. Lack of
knowledge creates fear, and it is this fear that causes angst and opposition in meetings.
It is through education that we are able to reduce this angst and make progress on
providing the best infrastructure on which to run a database.
For the VMware administrators, it is all about taking a breath and understanding that
although the virtualization journey has been successful so far (let's face it, you are
talking about putting databases onto vSphere, so you are doing something right), the way
you approach the virtualization of databases is going to need to change. It needs to
change because the way you virtualize nondatabase workloads is different from how
you virtualize database workloads. When looking to virtualize large, complex, mission-
critical workloads, vSphere administrators need to slow down and work with the
individuals on other teams who have deep expertise in their respective knowledge
domain to create a trusted platform for these applications. Remember, this is a journey,
not something that is going to happen overnight. Also, you only get one shot to get this
right. You must take the time to understand the database and DBA requirements in order
to ensure success.
Center of Excellence
If you don't have time to do it right, when will you have time to do it over?
John Wooden
We started this section with a quote from a coach. Virtualization of business-critical
workloads requires a team effort, and it requires individuals on that team to work
together for the good of their organization. What's needed to be successful when
virtualizing SQL Server, and what we have seen lead to successful virtualization
projects with our customers, is the building of a Center of Excellence (CoE) team.
Now, we all need another reason to have a meeting because we do not have enough of
them throughout the day. Sarcasm aside, we deem this a critical piece of successful SQL
(and other mission-critical workloads) deployments.
The structure of a CoE is straightforward. The CoE team consists of one or two
individuals from a respective technology to represent their team during a CoE meeting.
For example, the following teams would have representation during this meeting:
virtualization, DBA, networking, storage, security, and procurement, as depicted in
Figure 5.2.
Figure 5.2 Example of a Center of Excellence team structure.
The meeting structure serves multiple purposes. One purpose is to ensure proper
communication of information between teams. The CoE should be looked at as a means
for teams to discuss upcoming technologies, patches, updates, upgrades, and so on that
may affect the infrastructure. This is key because virtualization has a symbiotic
relationship with multiple aspects of the IT landscape.
For example, say the storage team is looking to upgrade the firmware on their
controllers. By communicating this change during a CoE meeting, the affected teams
(vSphere administrators and infrastructure administrators) are responsible for checking
their systems for compatibility prior to the upgrade. The last thing anyone wants is for
an upgrade to occur and generate problems with the infrastructure, especially when
mission-critical workloads are involved.
Tip
Always double-check the hardware compatibility list between versions of
vSphere. Ensure that the hardware you are using is on the compatibility list. Also,
double-check firmware versions for HBA drivers with your hardware vendors.
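One simple, assumed starting point for that cross-check is to inventory host versions and builds with PowerCLI before consulting the VMware Compatibility Guide and your hardware vendor's firmware and driver matrices; the sketch below presumes an existing vCenter connection.

# Record each host's vSphere version, build, and hardware model so they can be
# checked against the compatibility list and the vendor's HBA/NIC driver matrix.
Get-VMHost |
    Select-Object Name, Version, Build, Manufacturer, Model |
    Sort-Object Name |
    Format-Table -AutoSize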
Deployment Design
Now that we have a team in place and everyone is speaking the same language, let's
start looking at the technical aspects of virtualizing SQL Server. We will begin this
section by gaining an understanding of the current environment and then discuss
deployment options. Most of the customers we meet with have existing databases on
physical servers that they want to migrate into their VMware environment.
We are going to start with a simple exercise: understanding the SQL Server workload
types. We will then move on to deployment considerations. We are talking about
deployment considerations in this chapter because if we standardize our deployment
process so that we have a consistent, repeatable process, then we can better predict the
impact of additional SQL workloads on the existing infrastructure. This makes capacity
planning easier. Consistent and repeatable deployments make management and
monitoring of systems easier. When something goes wrong, having a consistent,
repeatable deployment makes troubleshooting easier.
Soapbox
When you are virtualizing SQL Server and Tier 1 workloads in general, the
emphasis on virtualization ratios must be deemphasized. Although virtualization
of SQL can yield cost savings, consolidation, and other great benefits, this is not
the primary reason for virtualizing SQL. If management is measuring virtualization
success by consolidation ratios, it is time to adjust expectations, because
virtualizing brings more to the table than just consolidation and cost savings. It
can help improve service levels overall.
Reorganization
Did your heart skip a beat at that title? I know mine does every time I hear that word.
Let's review how we designed and implemented our physical database servers with a
trip down Memory Lane, if you will.
We would try to figure out how much of a given resource we would need to run the
current workload, that workload's anticipated growth over the next three to five years,
and then add an X% buffer to be safe. Then we went to our favorite server vendor and
asked for the biggest box they had that could run our workload for our five-year
projected requirement. We ordered the box, and when it came in, we were happy... for
about a week, because then we found out a bigger, faster, better model had just been released!
We unboxed our new hardware, racked it, stacked it, installed the operating system,
installed SQL, and then we were off and running, and, boy, weren't we happy. It was
the best of times... for about a week. Because, sure enough, a project popped up and,
wouldn't you know it, the project required a database and there was nowhere to put it.
Therefore, the database ended up on the database server you just racked and stacked.
Over time, these database servers become populated with projects, and DBAs shift to a
mentality of "if it fits, put it on." This is known as consolidation. So many times when
we speak to DBAs about the benefits of virtualization, they reply in kind with, "We
already consolidate our databases to get the maximum performance out of our servers."
True, they are running those boxes to the limit, but are they running them as efficiently
and as effectively as possible? Perhaps the single biggest downside we have seen with
this configuration is security.
Let's dive into what we are talking about concerning security. What we find in the
majority of our conversations with customers is that as these database servers fill
up, the speed at which the business requests database creation does not allow for
proper segmentation. So what ends up happening is that internal, production databases
now run alongside third-party databases, test databases, development databases, and so
on.
There are a couple of challenges here. With SQL Server patching, the lowest common
denominator is the instance level. So what happens when a third party does not support
the latest SQL Server patch that your information security team is asking you to install to
prevent a serious security risk? Do you patch the server to prevent the risk and satisfy
the information security team, or do you run in an unsupported configuration for this
third-party application? What if this is a customer-facing, revenue-generating
application?
Another item to consider and discuss is compliance. We have had many DBAs tell us
they are being asked by Audit and Compliance whether their database server will meet
their particular industry's regulatory guidance. For a lot of customers, the answer is no.
What do we do about all of this? Reorganization. Because we are going down the path
of virtualizing our physical SQL Server environment, why not take the opportunity to
reorganize our database servers into compliant stacks? Identify and group databases
according to their operational level (production, QA, test, development), internal
development versus third party, adherence to industry regulation (HIPAA, SOX, and so on),
and whether they play well with others. This last grouping of databases includes
recipients of bad up-level code, those that have unique performance requirements, those
that require special maintenance windows outside those of the other databases, those that host
applications from a third party that does not always support the latest patches, and so
on. The design goal here is to group databases in a way that optimizes security, stability,
and performance, thus decreasing operational overhead and producing a less
complex environment to manage with a smaller set of standard configurations.
Tip
Grouping databases according to function, operational level, compliance
requirement, and other factors can lead to improved operations, increased
security, and a more stable SQL Server environment.
As we have discussed in this section, the typical deployment we see in the physical
world within our customer base is a large physical server loaded with as much RAM as
the server can possibly hold. The SQL Server is configured with multiple instances and
many databases sitting behind each instance. What we see after customers begin their
journey down the SQL Server virtualization path is smaller virtual machines (smaller being
relative), with SQL Server configured with only one instance. However, we do
have customers who carry over the same model of SQL Server running multiple
instances and supporting many databases.
It is important to remember the entire system when designing your architecture. Do not
forget about features within the vSphere platform, such as DRS, vMotion, Storage DRS,
Storage I/O Control, and Network I/O Control, just to cite a few, and how some of these
features work more efficiently with smaller virtual machines than larger virtual
machines. We are not saying these features do not work or work well with larger
systems. Instead, what we are trying to convey is that you need to make your design
agile, flexible, and highly available while delivering the necessary performance
requirements. There are more places for smaller virtual machines to fit and schedule
than there are for very large virtual machines.
The reason customers end up with a design featuring relatively smaller virtual machines
running only one instance and many databases behind it is to take advantage of the scale-
out performance and operational benefits of cloud computing. Customers find it easier
to scale out than to scale up. Yes, this means there may be more operating systems to
manage, but with the amount of automation that exists today, this is becoming less and
less of a concern, especially when one of the objectives of virtualizing mission-critical
databases is reducing operational risk and improving SLAs.
Note
SQL Server SysPrep is limited to standalone instances of SQL Server and
is supported with SQL Server 2008 R2 and later. To learn more, including
limitations, go to http://msdn.microsoft.com/en-us/library/ee210754.aspx.
For information on deploying SQL Server 2012 via SysPrep, go to
http://msdn.microsoft.com/en-us/library/ee210664.aspx.
Tip
To drive down operational complexities and increase the ability to deliver
requests to the business, consider a tiered database offering and personalized
delivery of these services. Use this service to provide transparency concerning
the cost of running these services.
Physical Hardware
Now we are going to discuss the importance of selecting the right hardware for running
your database workloads. Buy the biggest, fastest, baddest servers and storage and you
will be alright. There, that about does it, right? If it were only that simple. However,
this is how a lot of physical servers that are running database workloads today are
purchased. One of the main reasons for this is the limitations of the physical world. We
have to plan for how large this database (or databases) will be three to five years out,
and how often are we right? How often do we end up placing databases on these servers
because there is room, and not because this is where we intended them to reside when
we originally designed the system? Say it with me: consolidation.
One of the first places to start when considering physical server hardware to run your
SQL virtual machines is to take an inventory of the makes and models of hardware
present in the data center today and combine this with the future direction of the
organization. For example, is the organization moving away from rack mount servers
toward blade servers? Then try to determine whether the direction in which the
organization is moving will impose any constraints on your design. An example may be
that a Tier 1 database requires more RAM than is currently available in the blade make
and model that your organization procures.
CPU
Next, it is important to understand the proper amount of physical CPU power that will
be necessary to run the databases you are looking to virtualize. Remember, a virtual
environment pools and abstracts resources; therefore, we must adjust our vernacular to
reflect this change. For DBAs, it is important to change the question from "How many CPUs
does the database require?" to "How many aggregate clock cycles does the database
require?" Let's look at this in more detail.
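To make the shift from counting CPUs to counting clock cycles concrete, the short Python sketch below converts an observed peak utilization on a physical database server into an aggregate MHz figure. The server specifications and utilization numbers are illustrative assumptions, not sizing guidance.

# Illustrative sketch: convert observed CPU utilization on a physical
# database server into an aggregate clock-cycle (MHz) demand figure.
# All input values below are assumptions for the example, not guidance.

def aggregate_mhz_demand(cores, core_speed_mhz, peak_utilization):
    """Aggregate CPU demand in MHz at the observed peak utilization."""
    return cores * core_speed_mhz * peak_utilization

# Hypothetical physical server: 16 cores at 2,600 MHz, peaking at 45% CPU.
demand_mhz = aggregate_mhz_demand(cores=16, core_speed_mhz=2600, peak_utilization=0.45)
print(f"Aggregate demand: {demand_mhz:,.0f} MHz")   # 18,720 MHz

# On a target host with 3,000 MHz cores, that demand maps to roughly:
vcpus_needed = demand_mhz / 3000
print(f"Roughly {vcpus_needed:.1f} busy cores' worth of work on the new host")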
Historically, from a CPU perspective, in the physical world DBAs look to procure as
many CPUs as possible in their systems because they are planning for three to five years
out and they want to get these resources upfront and not have to fight for them later.
When SQL Server is virtualized and the right versions of the Windows operating system
and SQL Server are running, virtual machines can have the memory size and virtual
CPU count increased while the systems are running. If the required versions of
Windows or SQL Server are not in place, the option still remains; however, it may
become a downtime operation. If AlwaysOn is used and
automated failover is available, you can update the configuration of the standby node,
let it catch up and then fail over, and then update the primary node, let it sync back up,
and then fail back, all with almost no disruption. The same could be said of a Failover
Cluster Instance environment: You only have as much downtime as it takes for two
failovers to occur.
Another point to understand in a virtual environment is how virtual machines are
allocated time on the physical components of the server. For vSphere,
this is handled by the CPU Scheduler. The CPU Scheduler's main goal is "to assign execution
contexts to processors in a way that meets system objectives such as responsiveness,
throughput, and utilization" (https://www.vmware.com/files/pdf/techpaper/VMware-vSphere-CPU-Sched-Perf.pdf).
In other words, scheduling is when the hypervisor
identifies and executes the CPU instructions of a given virtual machine. The more virtual
machines that exist on a physical server, the more complex this operation becomes.
Complexity increases further as virtual machines vary in virtual CPU
configurations, because the hypervisor needs to schedule all the virtual CPUs to execute
in a timely order.
The design goal is to run your SQL Server virtual machines with enough virtual CPUs to
satisfy peak requirements, and not a single virtual CPU more. By adding unnecessary
virtual CPUs to a virtual machine, you make the hypervisor's scheduling job more
complex, which may delay the scheduling of your virtual
machines and introduce avoidable performance issues.
Tip
It is important to understand how the current version of vSphere handles
scheduling of virtual CPUs. To learn more, read this whitepaper:
https://www.vmware.com/files/pdf/techpaper/VMware-vSphere-CPU-Sched-
Perf.pdf.
For those who are looking to migrate physical databases into vSphere, a good starting
point is VMware Capacity Planner. Run VMware Capacity Planner against the target
servers to get an understanding of the current CPU workload. It is important to
understand the sample intervals for which VMware Capacity Planner has been
configured and whether this meets the cyclical nature of the databases under analysis.
The last thing you want is for your analytical tool to miss a workload spike inside the
database because it did not fall within the sample interval. Several analytical tools are
available to perform this analysis, so be sure to choose the right one. We recommend
reviewing the baselining discussion in Chapter 10, How to Baseline Your Physical
SQL Server System.
Note
Use the right tool that will capture the peaks and valleys of the target
database servers' resource utilization. In addition, understand what level
the tool operates at: the entire server, SQL Server instance, or individual
database. If you miss peaks, you risk undersizing your virtual machines and
the physical hardware required to run them successfully.
From a design perspective, a typical starting point is 2 vCPUs:1 physical core. As you
increase the number of vCPUs in your virtual machines, this starting point will require
adjustment because of the additional work needed to schedule those vCPUs.
Treat this ratio as dynamic, not as definitive
guidance or as a benchmark for density. This is a starting point that should be
adjusted up or down based on workload, processor type, processor speed, and other
factors. Remember, our guidance is to always start conservative with Tier 1 workloads
such as SQL Server. It is always easier to add additional work to a physical server than
it is to ask management for a new physical server because you underestimated the
requirements.
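As a rough illustration of the 2 vCPUs:1 physical core starting point (and of tightening it to 1:1 for Tier 1 workloads), the following Python sketch computes a vCPU budget for a hypothetical host. The socket and core counts are assumptions for the example.

# Rough sketch of the 2 vCPU : 1 physical core starting point discussed above.
# Values are assumptions for illustration; adjust the ratio for your workload.

def vcpu_budget(sockets, cores_per_socket, vcpu_to_pcore_ratio=2.0):
    """Total vCPUs a host could support at the chosen oversubscription ratio."""
    physical_cores = sockets * cores_per_socket
    return int(physical_cores * vcpu_to_pcore_ratio)

# Hypothetical host: 4 sockets x 6 cores.
budget = vcpu_budget(sockets=4, cores_per_socket=6)          # 48 vCPUs at 2:1
conservative = vcpu_budget(4, 6, vcpu_to_pcore_ratio=1.0)    # 24 vCPUs at 1:1 for Tier 1
print(budget, conservative)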
Memory
When it comes to sizing databases running on virtual machines and physical servers, we
will again start with understanding what the current environment supports and can
tolerate as well as the future stated direction. It should come as no surprise that in the
majority of our engagements, we find customers run out of physical RAM before they
run out of physical CPU resources on their vSphere hosts running database workloads.
The exception to this, we have found, is when customers insert flash into the physical
servers. From a design and architectural perspective, you need to understand whether there is
a desire and a need to implement flash storage inside the physical server. If this is the
case, work with VMware and the flash storage vendor to understand how the
implementation affects the overall memory sizing of the physical hosts and potentially
the consolidation ratio to ensure maximum benefit for your investment.
Balancing the cost of memory versus the number of virtual machines a physical host can
run is an art. As we have stated several times throughout this book, when we begin to
virtualize database workloads, consolidation ratios are the least important consideration,
and they should not be the primary goal. Again, a good starting point for this is to leverage
VMware Capacity Planner against your physical database servers to get a high-level
understanding of the RAM requirements for a system.
Earlier in this chapter, we discussed reorganization. If this is a strategy you are looking
to implement for your database virtualization initiative, it is critical to use a tool that
can provide per-database statistics. Not all tools are created equal; they report at
different levels of the server. What we mean by this is that some tools work at the
system level, some work at the SQL instance level, and others work at the individual
database level. Make sure the tool you select for this project can provide the granularity
needed for your initiative.
Virtualization Overhead
In addition to understanding what the current environment will require, it is important to
understand the virtualization overhead of running virtual machines and to account for
this in your sizing and management of the physical ESXi hosts. Table 5.2 is from the
vSphere 5.5 Resource Management Guide and provides a sample overview of the
memory overhead associated with running a virtual machine. It is important to note that
changing either the number of virtual CPUs or the amount of RAM assigned to a virtual
machine changes the amount of overhead memory ESXi requires to run that virtual machine.
Table 5.2 Sample Overhead Memory on Virtual Machines
It is important to understand the virtual machine overhead and to manage this
appropriately as you scale your systems. If not managed appropriately, the physical host
can run out of physical memory, thus affecting virtual machine performance. Ensuring
SQL Server has the appropriate amount of memory available is crucial to SQL
performing well in any environment.
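Because Table 5.2 is not reproduced here, the following Python sketch illustrates the bookkeeping it supports: budgeting host memory as the sum of configured virtual machine RAM, per-VM overhead, and a hypervisor allowance. The VM names, overhead figures, and ESXi system footprint are placeholder assumptions; take real overhead values from the Resource Management Guide for your vCPU and RAM combinations.

# Sketch of host memory budgeting that accounts for per-VM overhead.
# The overhead figures below are placeholders; take real values from the
# vSphere Resource Management Guide for your vCPU/RAM combinations.

vms = [
    # (name, configured RAM in GB, assumed overhead in GB)
    ("sql-prod-01", 32, 0.6),
    ("sql-prod-02", 64, 1.1),
    ("sql-qa-01",   16, 0.3),
]

host_physical_ram_gb = 256
esxi_system_ram_gb = 4          # assumed hypervisor/system footprint

consumed = esxi_system_ram_gb + sum(ram + overhead for _, ram, overhead in vms)
headroom = host_physical_ram_gb - consumed
print(f"Budgeted: {consumed:.1f} GB, headroom: {headroom:.1f} GB")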
We will get into more detail concerning the memory-reclamation techniques that
vSphere leverages in Chapter 7, Architecting for Performance: Memory. However,
we do want to mention them here, along with our recommendations. Here is a list of the
techniques employed by the vSphere hypervisor:
Transparent page sharing
Memory ballooning
Memory compression
Swapping
Our recommendation is to leave these settings enabled, because a properly designed
production environment should be architected to
avoid memory contention in the first place. In addition, should an event arise that causes memory
exhaustion of the physical host, disabling some of these techniques
forces vSphere to fall back on the action of last resort, swapping, which has the heaviest
impact on performance. Based on the details covered in Chapter 7, it is our opinion that
when we compare the overhead associated with functions such as transparent page
sharing, memory ballooning, and memory compression, there is greater benefit in
leaving these features enabled than there is in the performance gained by
disabling them.
Note
To change the default location of the virtual machine swap file, see this
VMware KB article: http://kb.vmware.com/kb/1004082. Keep in mind
your failure zones and operational impact when making this change.
Large Pages
By default, VMware vSphere enables large pages. Mem.AllocGuest.LargePage is set to
1 out of the box, which means it is enabled. By default, ESXi will back all memory
requests with large pages. These large pages are broken down to 4KB in size to support
transparent page sharing (TPS). This is done to maximize the use of the precious
translation lookaside buffer (TLB) space and to increase performance.
Microsoft SQL does support the use of large pages, and beginning with SQL Server
2012 this is enabled by default when the account running sqlservr.exe has been given the
Lock Pages in Memory permission. Versions prior to SQL Server 2012 require the Lock
Pages in Memory right as well as Trace Flag 834 to be enabled. More information on
how to configure these settings can be found in Chapter 7. Note that SQL Server
allocates large pages at instance startup, so a restart of the SQL Server service (or of the
virtual machine) is required after these settings are configured.
Note
Make sure to set the Lock Pages in Memory privilege (SQL Server 2012 and later) and also
turn on large pages for SQL via Trace Flag 834 (versions prior to SQL Server 2012). See this
Microsoft KB article for more information:
http://support.microsoft.com/kb/920093.
Tip
When large page support has been properly configured, SQL Server will attempt
to allocate contiguous pages in memory at startup. This can cause longer startup
times for the SQL Server virtual machine. Refer to this blog post for more
information: http://blogs.msdn.com/b/psssql/archive/2009/06/05/sql-server-and-
large-pages-explained.aspx.
NUMA
NUMA stands for non-uniform memory access. When looking at your server, you
will notice sockets, cores, and memory. Memory is associated
with each socket and its cores. Cores will preferably access memory local to them rather than
memory located on another node. This is all about data locality: The better the
locality, the better the performance. The goal here is to have the cores access memory
local to them rather than having to travel to another socket and core stack to access memory.
This is known as remote memory access or cross-socket communication, and it has
performance implications because the request has to travel across the processor
interconnect to access the remote memory region and then back to the originating
core to return the information. Figure 5.6 details a NUMA configuration. NUMA
cross-talk occurs when the CPU on a NUMA node must traverse interconnects to access
memory on another NUMA node.
Figure 5.6 Sample NUMA architecture.
Note
To learn more about NUMA, go to http://en.wikipedia.org/wiki/Non-
uniform_memory_access.
NUMA is configured in the BIOS of your physical servers. Always read your server
vendor's documentation, but typically NUMA is enabled by disabling node interleaving
in the BIOS. Having NUMA enabled is usually the default setting.
VMware has supported NUMA since ESX 2.5. This means that if your physical
hardware both supports and is configured for NUMA, the hypervisor will work to
maximize performance of the system. There are features in the vSphere platform that we
will discuss in the paragraphs that follow because they will come into play when
architecting SQL Server databases to run on a vSphere host.
The first feature is Wide NUMA. Wide NUMA was introduced in vSphere 4.1. Using
the image in Figure 5.6, we have a four-socket, six-core system with hyper-threading
enabled. If a virtual machine was created that had six vCPUs, this virtual machine
would be assigned to run on one of the four available NUMA nodes. This designation is
called the virtual machine's home NUMA node. Now, suppose a 12-vCPU virtual machine is
created. As part of determining placement, hyper-threading is ignored, which means this
virtual machine will span two and only two NUMA nodes in our physical server. The
way this works is that the number of physical cores in a socket becomes the maximum
number of vCPUs the scheduler will place on a single NUMA node for that virtual machine. For the 12-vCPU virtual
machine running on a four-socket, six-core box with hyper-threading enabled, the
NUMA Scheduler will assign a home NUMA node (node 1, 2, 3, or 4) and then manage
six of the 12 vCPUs on this NUMA node. The remaining six will be assigned another
home node on another NUMA node. Figure 5.7 provides a visual example of this.
Notice how NUMA node 3 has been designated as home node 1, and six vCPUs are
assigned to this NUMA node. Then, NUMA node 1 was selected as home node 2 to run
the remaining six vCPUs.
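The following Python sketch mirrors the example above by splitting a virtual machine's vCPU count into NUMA clients no larger than the physical node size. It is a simplification for illustration, not an implementation of the NUMA Scheduler.

# Sketch of how a wide virtual machine's vCPUs are split across NUMA nodes.
# This mirrors the 12-vCPU example above; it is a simplification of the
# NUMA Scheduler's behavior, not an implementation of it.

def numa_clients(vcpus, cores_per_numa_node):
    """Split a VM's vCPUs into NUMA clients no larger than a physical node."""
    clients = []
    remaining = vcpus
    while remaining > 0:
        size = min(cores_per_numa_node, remaining)
        clients.append(size)
        remaining -= size
    return clients

print(numa_clients(vcpus=6, cores_per_numa_node=6))    # [6]    -> one home node
print(numa_clients(vcpus=12, cores_per_numa_node=6))   # [6, 6] -> two home nodes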
Note
vNUMA is disabled when CPU Hot Plug is enabled.
To change the default vNUMA setting of requiring nine or more vCPUs, open the
vSphere Web Client and navigate to the virtual machine you want to modify. Click Edit
Properties, then click the VM Options tab, expand Advanced, and click Edit
Configuration to the right of Configuration Parameters. If the
numa.vcpu.maxPerVirtualNode parameter is not present, click Add Row and manually
add the parameter. See Figure 5.8 for more information. In Figure 5.8, we inserted the
row and configured it for eight cores.
Figure 5.8 Advanced setting numa.vcpu.maxPerVirtualNode.
vNUMA is set upon the first boot of the virtual machine, and by default does not change
unless the vCPU count is modified. One of the reasons this does not change is that not
all operating systems (or even all applications) tolerate a change to an underlying
physical NUMA infrastructure topology. In addition, sometimes the operating system
can adjust, but the application cannot. Therefore, make sure to understand the
applications you are working with before making changes that could negatively affect
their performance. We will discuss advanced settings later that change the default
behaviors. Before we get there, we will discuss the defaults first, because it is
important to understand what vSphere is doing under the covers and to use this
information to determine whether a change is necessary.
The method by which the vNUMA topology is set is as follows: Upon the first boot of a
vNUMA-enabled virtual machine, a check is made to see if the Cores per Socket
(virtual socket) setting has been changed from the default value of 1. If the default value
of 1 is present, then the underlying physical server's NUMA architecture is used. If the
default value of Cores per Socket has been modified, this determines the virtual
machine's NUMA architecture. See Figure 5.9 for a screenshot of where this setting is
located.
Figure 5.9 The Cores per Socket setting.
If you are going to change the default Cores per Socket setting, change it to an integer
multiple or integer divisor of the physical server's NUMA node size. For example, if
you have a four-socket, six-core server, the Cores per Socket setting should be 2, 3, or
6. Do not factor hyper-threading into your calculation. When you are running a vSphere
cluster that has mixed physical NUMA configurations and you elect to modify the
default Cores per Socket setting, select a setting that aligns with the smallest NUMA
node size across all physical hosts in that vSphere cluster.
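As a quick sanity check of that alignment rule, the Python sketch below tests whether a proposed Cores per Socket value is an integer divisor or multiple of an assumed six-core NUMA node. The node size is an assumption taken from the example above.

# Sketch: sanity-check a proposed Cores per Socket value against the
# physical NUMA node size, per the guidance above. Hyper-threading is
# deliberately ignored.

def cores_per_socket_ok(proposed, physical_cores_per_numa_node):
    """True if the proposed value is an integer divisor or multiple of the node size."""
    if proposed <= 0:
        return False
    return (physical_cores_per_numa_node % proposed == 0 or
            proposed % physical_cores_per_numa_node == 0)

node_size = 6   # four-socket, six-core host from the example above
for value in (1, 2, 3, 4, 6, 8, 12):
    print(value, cores_per_socket_ok(value, node_size))
# 1, 2, 3, 6, and 12 pass; 4 and 8 do not align with a six-core node.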
It is our recommendation that the default setting of 1 for Cores per Socket be used. The
reasons for this recommendation are simplicity, ease of maintenance, and long-term
management. This parameter is set upon first boot of the virtual machine and does not
change if the virtual machine is vMotioned or cold-migrated to a physical server with a
different underlying NUMA architecture, which can result in a negative performance
impact on the virtual machine. The only time this setting is updated, by default, is when
the vCPU count is modified.
Note
Our recommendation is to leave Cores per Socket at the default setting
unless you have a reason to change it, such as licensing.
Real World
Let's be honest for a minute. Let's say that 21 months (I like odd numbers) after
modifying this setting on a SQL Server virtual machine, you introduce new
hardware into your vSphere cluster with a different underlying NUMA
architecture, and you vMotion the SQL virtual machine to the new hardware. Are
you going to remember to change the setting? When the DBA team calls and says
that despite you moving the SQL virtual machine to newer, bigger, faster
hardware, performance is worse, are you going to remember that the Cores per
Socket setting may be causing this performance dip? If you need to adjust the
parameter, adjust it. Just make sure you have well-defined operational controls in
place to manage this as your environment grows.
If possible, when selecting physical servers for use in your clusters, attempt to adhere to
the same underlying NUMA architecture. We know, this is easier said than done.
Initially, when a cluster is built, this is more realistic; however, as the cluster ages
and servers need to be added for capacity or replaced for life cycle reasons,
adhering to the same NUMA architecture becomes more difficult.
One final note on NUMA. We are often asked, "How do I figure out my server's NUMA
node size?" The best way is to work with your server provider and have them detail the
sockets, cores, and memory that make up a NUMA node. This is important to ask,
because the size of a NUMA node is not always the number of cores on a chip; take, for
example, the AMD Piledriver processor, which has two six-core processors on a single
socket. AMD Bulldozer has two eight-core processors on a single physical socket, also
making it two NUMA nodes.
Hyper-Threading Technology
Hyper-Threading Technology (HTT) was invented by Intel and introduced in the Xeon
processors in 2002. At a high level, HTT places two logical processors on the
same physical core, and these two logical processors share the execution resources of that
core. The advantage is that if the operating system is able to leverage HTT, it
can more efficiently schedule operations against the logical
processors. When one logical processor is not being used, the other is able to
leverage the underlying core's resources, and vice versa.
Note
IBM had simultaneous multithreading as early as the 1960s:
http://en.wikipedia.org/wiki/Simultaneous_multithreading.
In a vSphere environment, this provides the Scheduler more logical CPUs to schedule
against, thus increasing throughput. This allows for concurrent execution of instructions
from two different threads to run on the same core. This is not a doubling of throughput,
and the amount of increase varies depending on the processor, the workload, and sometimes
which way the wind is blowing.
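To illustrate why HTT should be treated as extra scheduling opportunity rather than double the capacity, the following Python sketch uses an assumed (and entirely workload-dependent) 1.25x uplift factor; the core count is also an assumption.

# Sketch: hyper-threading roughly increases scheduling slots, not raw capacity.
# The 1.25x uplift below is an assumed, workload-dependent figure used only
# to illustrate the point that HTT is not a doubling of throughput.

physical_cores = 24
ht_uplift = 1.25            # assumption; varies by processor and workload

logical_cpus = physical_cores * 2
effective_core_capacity = physical_cores * ht_uplift

print(f"Logical CPUs visible to the scheduler: {logical_cpus}")
print(f"Approximate effective capacity: {effective_core_capacity:.0f} core-equivalents")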
HTT is enabled in the BIOS of the physical server, so check with your server
manufacturer to validate that this feature is turned on for your hardware. HTT is
enabled by default on ESXi. The VMkernel, which is responsible for the scheduling,
will do its best to schedule a multi-vCPU virtual machine on two or more different
cores. The VMkernel will attempt to spread this as wide as possible, unless you have
changed the preferHT advanced setting, while adhering to things like NUMA, to get the
best possible performance. If this is not possible, the VMkernel will schedule the multi-
vCPU virtual machine against logical processors on the same core. vSphere will
attempt to schedule against cores first, and once those are in use it will move on to using
the logical portion of the core. In addition, the CPU Scheduler will track how much time
is spent by a world (an execution context that is scheduled against a processor)
scheduled against a full core versus a partial core. Because time spent in a partial core
does not equate to equivalent performance of time spent in a full core, the CPU
Scheduler will track this time and, if necessary, move a world from executing against a
partial core to executing against a full core.
Note
A virtual machine is a collection of worlds. A world exists for each
vCPU, MKS (mouse, keyboard, screen), and the virtual machine monitor
(VMM). These worlds are executed against a CPU (physical or logical) via
the CPU Scheduler.
There have been two changes in vSphere 5.0 that allow for fairer balancing of worlds
against full and partial cores. The first is a contention check. The CPU Scheduler tracks
how much time is lost due to contention on behalf of HTT. Because time lost can even
out over time, meaning laggards can catch up, as long as the fairness threshold that
would cause the CPU Scheduler to migrate the world from a partial core to a full core is
not exceeded, the world will continue to be scheduled against the logical processor.
The second change in vSphere 5.0 is that the CPU Scheduler will take into account
when both logical processors are busy on a core; in previous versions of vSphere this
was not the case. The amount of time a world spends executing is tracked as CPU Time,
and the tracking of this is called charging. The amount of time charged for a world in
CPU Time is affected by execution against a partial core. As of vSphere 5.0, the amount
of time charged when both logical CPUs are busy is greater, which leads to the ability
of a vCPU that has fallen behind to catch up to its other vCPU partners in a more timely
fashion.
Note
To learn more about the CPU Scheduler and optimizations, we recommend
reading The CPU Scheduler in VMware vSphere 5.1
(https://www.vmware.com/files/pdf/techpaper/VMware-vSphere-CPU-
Sched-Perf.pdf) and the vSphere 5.5 Resource Management Guide.
So, what does all this mean for virtualizing SQL Server? It is important that the DBAs
and the vSphere administrators both understand whether HTT is enabled and in use. In
addition, it is important that performance be monitored on the system. Remember, the
Windows OS has no idea it is being virtualized, so when it sees it has been assigned 32
cores, it thinks these are full cores, although under the covers these vCPU worlds may
be executing against a logical core. HTT is good; it allows you to get more useful work
done and more performance out of SQL. Our recommendation is to use HTT. Just
remember to account for HTT when it comes to performance and sizing of the SQL
Server virtual machines on your ESXi hosts.
Memory Overcommitment
Next up on the memory train is a discussion of memory overcommitment. Earlier in this
chapter, we discussed vSpheres memory-reclamation techniques and our
recommendation to leave them enabled. When running SQL Servers on vSphere, it is
our recommendation to allow the SQL Servers time to bake, or run in production, on the
vSphere hosts. The baking time we recommend is at least one business cycle. This will
allow you to capture, via your monitoring tool, the performance peaks of the SQL
Server. There are some databases that sit dormant for 2.9 months a quarter, but when
they ramp up for that last week of the quarter, there better not be anything in their way or
else! This is also where a tool like vCenter Operations Manager comes in handy, because
it can track performance of the database over extended periods of time. Basically, make
sure vCenter is reporting steady-state memory usage below what is physically present on the host.
Reservations
To reserve or not to reserve, that is the question. From a DBA perspective, the concept
of a shared environment can be, well, not appealing. DBAs are coming from an
environment in which they knew exactly how much of the physical server's RAM they
had (for those wondering, the answer is all of it), and they are moving into a world
where the physical server's RAM is now shared, and don't even think about those
newfangled memory-reclamation features. From the vSphere administrator's
perspective, it is about optimization of the underlying resources: getting as much out of
the physical asset as possible.
Before we get too far down this road, our advice is to remember there is no such
thing as a free lunch: there are tradeoffs whenever you enable or disable a
feature, turn this dial up, or turn that dial down. So how does one resolve this
conundrum? Enter reservations. Reservations provide vSphere the ability to reserve, or
dedicate, a set amount of a resource to a virtual machine and only that virtual machine.
Even when the physical host enters an overcommitted state for vRAM, a reservation
guarantees physical RAM will be available for the virtual machine with the reservation.
Reservations are set on a per-virtual machine basis. Therefore, if you build a SQL
Server virtual machine with 32GB of virtual RAM on top of a host with 256GB of
physical RAM, you can reserve and dedicate 32GB of RAM on the physical host to this
SQL Server virtual machine. The benefit of this is that no other virtual machine can use
this physical RAM. The downside is that no other virtual machine can use this RAM.
From the DBA's perspective, this is great, just like the physical world! No sharing of
the underlying RAM, and no need to worry about other virtual machines using
physical RAM that their SQL Server may require. From the vSphere administrator's
perspective, a lot of things change under the covers once a reservation
has been configured.
One of the items that changes, and that must be taken into account, is how reservations are
accounted for by vSphere's Admission Control Policy for vSphere HA. Admission
Control is a feature within vSphere that is designed to ensure resources (CPU and
memory) are available at a vSphere Cluster level during a failure event, such as losing a
physical host. We will go into more detail on how Admission Control works in Chapter
9, Architecting for Availability: Choosing the Right Solution. Just know this is
affected by reservations and the Admission Control Policy selected. Read Chapter 9 to
get further information on Admission Control, because this feature needs consideration
when you are planning a SQL Server installation.
Another item affected is the vswp file size. The vswp file is one of the files that makes
up a virtual machine. By default, it is located in the VM folder containing the other
virtual machine files. The purpose of the vswp file is to back virtual machine memory
that cannot be kept in physical RAM. In times of contention, the hypervisor can swap
memory out of physical RAM into the vswp file. By default, this file is set to the size of
the virtual RAM assigned to your virtual machine. Therefore, if you create a 32GB SQL
Server virtual machine, you have a 32GB vswp file on your storage. If you set a
reservation, the size of this file is reduced by the reservation size: the
amount reserved is subtracted from the amount allocated, and the difference is the size
of the vswp file. Therefore, if you set a reservation for the full 32GB, you still have a
vswp file, but it is 0.0KB in size. Table 5.3 shows a virtual machine with no
reservation set, with a partial reservation, and with the Reserve All Guest Memory (All
Locked) setting checked, respectively.
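The arithmetic behind Table 5.3 is simple, as this short Python sketch shows; the 32GB values match the example above and are otherwise arbitrary.

# Sketch of how a memory reservation changes the vswp file size:
# vswp size = configured vRAM - reservation (floored at zero).

def vswp_size_gb(configured_ram_gb, reservation_gb=0):
    return max(configured_ram_gb - reservation_gb, 0)

print(vswp_size_gb(32, 0))    # 32 GB vswp, no reservation
print(vswp_size_gb(32, 16))   # 16 GB vswp, partial reservation
print(vswp_size_gb(32, 32))   # 0 GB vswp, fully reserved / All Locked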
Note
The size of the vswp file is determined at boot time of the virtual machine.
Changing this setting while the virtual machine is powered on will not
change the size of the vswp file. The virtual machine must be powered off
and powered back on for the changes to update.
Tip
If the value configured for Max Server Memory prevents SQL Server from
starting, start SQL Server with the -f option to start an instance of SQL Server with a
minimum configuration. See this TechNet article for more information:
http://technet.microsoft.com/en-us/library/ms190737.aspx.
Note
When using Max Server Memory and vSphere Hot Plug Memory, be sure to
increase the Max Server Memory configured value any time memory is
adjusted to ensure SQL Server takes advantage of the additional memory
provided.
So what if you are running more than one instance on the SQL Server virtual machine?
There are two options available to you, and doing nothing isn't one of them. The first option is
to use Max Server Memory and set a maximum memory value for each SQL Server instance
on the virtual machine. The configured value should provide enough memory that it is
proportional to the workload expected of that instance. The sum of the individually
configured Max Server Memory settings should not exceed the total assigned to the
virtual machine.
The second option is to use the Min Server Memory setting. Use this setting to create a
minimum amount of memory to provide each instance. Ensure the configured value is
proportionate to the workload expected by the individual instances. The sum of the
individually configured Min Server Memory settings should be 1GB to 2GB less than the
RAM allocated to the virtual machine.
Our recommendation is to leverage the first option; that is, configure the Max Server
Memory setting. Table 5.4 provides the pros and cons of the individual settings.
Table 5.4 Configuration Pros and Cons for Multiple Instances of SQL Server
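A minimal Python sketch of the first option follows, splitting Max Server Memory across instances in proportion to expected workload. The instance names, weights, and operating system headroom are assumptions for illustration only.

# Sketch of the first option: splitting Max Server Memory across multiple
# instances in proportion to expected workload. The weights, VM size, and
# OS headroom below are assumptions for illustration only.

def max_server_memory_per_instance(vm_ram_gb, os_headroom_gb, workload_weights):
    """Return a Max Server Memory value (GB) per instance, proportional to weight."""
    available = vm_ram_gb - os_headroom_gb
    total_weight = sum(workload_weights.values())
    return {name: round(available * weight / total_weight, 1)
            for name, weight in workload_weights.items()}

settings = max_server_memory_per_instance(
    vm_ram_gb=32,
    os_headroom_gb=4,                      # assumed OS / thread stack headroom
    workload_weights={"INST01": 3, "INST02": 1})
print(settings)                            # {'INST01': 21.0, 'INST02': 7.0}
# The sum (28 GB) stays below the 32 GB assigned to the virtual machine.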
Storage
How you configure and present storage to the SQL Server virtual machines will have a
profound impact on their performance. Chapter 6, Architecting for Performance:
Storage, goes into immense detail around storage configuration, so we will only
discuss the highlights in this chapter.
First, remember that the rules used to size SQL Server in a physical world carry over to
the virtual world; they just need to be tweaked. Too often we see customers just pick
up SQL and throw it over the proverbial wall. Why is this? Well, let's take a trip down
Memory Lane; let's look back at the last 10 years and compare virtualization and
database implementations.
When we look at virtualization and how virtualization became mainstream, we notice
that the first systems virtualized were test and development systems. These systems did
not impact the business, and if they went down or had poor performance, only IT
noticed. As the software matured and the hardware evolved, we saw departmental
applications go onto the platform, starting with those owned by IT, and eventually
moving outward to non-IT departmental applications. And here we are today, where
few workloads exist that cannot be virtualized, thanks to the work of
VMware, independent software vendors, and hardware vendors. However, what we
didn't see keep up were the systems running vSphere, particularly the storage
subsystems.
Now, hold that thought for a second while we examine the database trajectory. Some
argue that data is the lifeblood of any company and that the health, performance, and
availability of databases are a reflection of how much a company relies on this data.
For a large number of companies, this was important, so when the DBA team said they
needed more compute power, faster disks, and so on, they tended to get what they
wanted. They were the recipients of some nice powerful Tier 1 equipment.
So, if we put those two items together (vSphere coming up on Tier 2
equipment while database servers ran on Tier 1 equipment), and someone
migrates databases over to this environment without doing basic engineering and
architecture work, such as validating the number and speed of disks supporting the database, that
person could be in trouble. Trust us, we see this all the time. One of the first things we
do when customers say, "It ran better in the physical world than in the virtual world," is
ask them for a side-by-side comparison of the supporting subsystem of each
environment. We ask them to detail disk type, disk speed, RAID, paths, directories,
and so on. Although some of this may seem obvious, we cannot tell you how many
calls we get concerning SQL performance being slow (love those ambiguous
troubleshooting calls) where we find that storage is sized incorrectly.
Note
When PVSCSI was first introduced in vSphere 4.0, it was recommended
for workloads requiring 2,000 or more IOPS. This has been resolved as of
vSphere 4.1, and the PVSCSI adapter can be used for all workloads. For
more information, see http://kb.vmware.com/kb/1017652.
Caution
Double-check that you are running at the appropriate patch level for vSphere
because there have been updates to address an issue with Windows Server 2008
and Server 2008/R2 reporting operating system errors when running SQL Server.
vSphere versions prior to vSphere 5.0 update 2 should be checked. For more
information, review http://kb.vmware.com/kb/2004578.
Determine Adapter Count and Disk Layout
Once the performance requirements have been gathered, the next step is to determine the
virtual machine layout. How many PVSCSI adapters is this virtual machine going to
use? Remember, the more paths back to the storage array, the more options you provide
the operating system to send I/O out to the array. However, just because you can add
four PVSCSI adapters does not mean that you should. If you have a database that is
housing configuration information for an application, does it need four PVSCSI adapters
and the VMDK files fanned out across all these controllers? Probably not. Balance
performance requirements with management overhead. Again, this is where database
tiering can assist.
Note
To read more about VMDK versus RDM, read this blog article:
http://blogs.vmware.com/vsphere/2013/01/vsphere-5-1-vmdk-versus-
rdm.html.
Note
For more information, read the FAQ on VAAI
(http://kb.vmware.com/kb/1021976) as well as the VMware vSphere
Storage APIsArray Integration white paper
(http://www.vmware.com/files/pdf/techpaper/VMware-vSphere-Storage-
API-Array-Integration.pdf).
Now, let's move on to the provisioning types of a VMDK file. The first type we will
discuss is the Thin Provisioned VMDK. A Thin Provisioned VMDK is one that does not
immediately consume space on the storage on which it resides; however, the operating
system believes it has the full amount assigned at creation. The storage consumed is
equal to the amount the virtual machine is actually using. Therefore, the size of the
VMDK will start small and grow over time, up to the size configured.
Thick Provisioned Lazy Zeroed disks are VMDK files that immediately consume the
VMFS space assigned to the virtual machine. The item to pay attention to with this disk
type is that when the hypervisor needs to write I/O to the underlying storage, it will send
a zero first and then the data. This occurs only on the first write to a block; subsequent writes
to the same block do not incur this activity. This is commonly referred to as the First
Write Penalty. For general-purpose virtual machines, this is not a big deal because it is
washed out in the array's cache. However, if you have an application such as a
database that is doing a large number of writes (to a database or log file), this could
have a performance impact. If your storage array supports the Write Same / Zero
primitive, then the zeroing operation, depending on your array's implementation of this
primitive, may have little if any impact on the performance of the VMDK.
Thick Provisioned Eager Zeroed is the third and final type of VMDK provisioning type.
In this type, all VMFS space assigned to the VMDK is consumed, and zeroes are
written into each block of the VMDK file. This VMDK file type takes additional
time to create because every block has to be zeroed out. Just keep in mind what you
are doing when you create this type of VMDK file: you are sending a whole bunch of
zeroes to the disk subsystem. This is something you want to plan for if you are creating a lot
of virtual machines with Thick Provisioned Eager Zeroed disks. As we have stated, you
need to understand what you are doing when you create this VMDK file type, because
the last thing you need is an angry storage admin hunting you down because you
just created 4TB worth of activity on the production array during the middle of the day.
So which type do you use? At a high level, it really does not matter which you decide to
use for your standalone or AlwaysOn Availability Group SQL Servers; remember, for an
AlwaysOn Failover Cluster Instance (FCI), you must use RDMs. When we look into this
a bit more, if Thin Provisioned VMDKs are being considered, then available disk space
on the LUN must be actively managed. Trust us, you do not want to run out
of room on a LUN. From a Thick Provisioned Lazy versus Eager Zeroed perspective, with the
Write Same / Zero VAAI primitive, the question becomes when to pay the First
Write Penalty tax. With Thick Provisioned Lazy Zeroed, the tax is spread out across the life of
the virtual machine, and you only pay tax on the blocks accessed. With Thick
Provisioned Eager Zeroed VMDKs, the zeroing tax is paid up front. Also, you are
paying tax on every block in the VMDK, some of which you may never use. If your array
does not support the Write Same / Zero primitive, then our recommendation is to, at
minimum, use Thick Provisioned Eager Zeroed VMDK files for the database, log, and
tempdb VMDKs.
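The decision logic above can be summarized in a short Python sketch. It simply encodes this chapter's guidance as stated; it is not a VMware API or a definitive rule set, and the parameter names are our own.

# Sketch of the provisioning decision discussed above, expressed as a simple
# helper. It encodes the chapter's guidance, not a VMware API.

def recommend_vmdk_type(clustered_fci, array_supports_write_same, heavy_write_disk):
    if clustered_fci:
        return "RDM (required for AlwaysOn FCI)"
    if array_supports_write_same:
        return "Thick Lazy or Eager Zeroed (zeroing largely offloaded to the array)"
    if heavy_write_disk:
        return "Thick Provisioned Eager Zeroed (pay the first-write tax up front)"
    return "Thin or Thick Lazy Zeroed (manage LUN free space carefully if thin)"

print(recommend_vmdk_type(False, False, True))   # database/log/tempdb VMDKs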
Tip
The UNMAP command must be manually initiated.
Note
For more information on using the UNMAP command, read
http://kb.vmware.com/kb/2057513.
Data Stores and VMDKs
If you are going with VMDK files, the next topic that arises is whether to dedicate one
VMDK file per data store or to run multiple VMDKs per data store. The guidance we
provide here is to validate with the SAN team and/or SAN vendor which VAAI
primitives the SAN you are working with supports and what the performance impact
is of having, or not having, certain VAAI primitives in your environment. One key VAAI
primitive is Atomic Test and Set (ATS). This changes the way vSphere performs locking for
metadata updates. ATS replaces SCSI reservations, so check to see if your array
supports this primitive. In the end, as long as the data store is capable of delivering the
IOPS required by the virtual machine(s) using a particular LUN, then either option is
valid. One item for consideration, provided the IOPS requirement is met, is the management
overhead associated with presenting a lot of LUNs in the case of one VMDK per data store.
reason this can be of concern is that vSphere 5.5 still has a configuration maximum of
256 LUNs per host.
Networking
When we consider networking, we start with the virtual switch type. In terms of using a
standard virtual switch or a distributed virtual switch, the choice is yours. We, the
authors, recommend using the distributed virtual switch. The reasons for this come
down to ease of management and the additional features available with the distributed
virtual switch, such as network I/O control, which is discussed in the following
sections.
Virtual Network Adapter
When it comes to choosing a network adapter (or adapters), we recommend using the
VMXNET 3 network adapter, which is a paravirtualized network adapter designed for
performance. VMXNET 3 requires hardware version 7 and higher and is supported on a
specific set of operating systems. In addition, running the VMXNET 3 network adapter
requires the drivers installed as part of the VMware Tools installation. Therefore, if
this network adapter is chosen, network connectivity is not made available until the
required adapters are installed.
Note
To read more about the virtual network adapter options available for your
virtual machines, check out http://kb.vmware.com/kb/1001805.
Note
Network I/O control requires a virtual distributed switch.
Network I/O control is disabled by default, and requires a distributed virtual switch.
Once enabled, it allows for two types of resource pools: user-defined resource pools
and system-defined resource pools. These pools are managed by shares and limits
applied to the physical adapters; shares are only enforced when contention exists on the
physical adapter. If there is no contention for the physical adapter's resources, then
shares are not enforced (limits could still apply, though, depending on how they are
configured). By enabling network I/O control, an administrator can ensure that a
particular traffic type, such as vMotion, does not saturate the available bandwidth on a
physical adapter and cause a service interruption.
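The following Python sketch illustrates how shares behave on a saturated adapter: bandwidth is divided proportionally by share value, and shares have no effect when there is no contention. The share values and link speed are assumptions, not vSphere defaults.

# Sketch of how shares behave under contention on a saturated adapter.
# Share values below are assumptions, not vSphere defaults.

def bandwidth_under_contention(link_gbps, shares):
    """Proportionally divide a saturated link's bandwidth by share value."""
    total = sum(shares.values())
    return {traffic: round(link_gbps * s / total, 2) for traffic, s in shares.items()}

shares = {"vMotion": 50, "SQL AlwaysOn replication": 100, "VM traffic": 50}
print(bandwidth_under_contention(10, shares))
# {'vMotion': 2.5, 'SQL AlwaysOn replication': 5.0, 'VM traffic': 2.5}
# With no contention, shares are not enforced and each traffic type can burst.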
To expand, when the vMotion of a virtual machine is initiated, the vMotion traffic will
use as much bandwidth as it can. If SQL Server AlwaysOn Availability Group
replication traffic is sharing the same physical adapter, that replication traffic may
suffer. With network I/O control enabled, vSphere will
automatically identify vMotion traffic, and by creating a user-defined network resource
pool for SQL Server replication traffic, you can better protect network flows.
In addition to system-defined and user-defined resource pools, an administrator also has
the ability to assign a Quality of Service (QoS) tag to all outgoing packets from a
particular network resource poolwhether that pool is a system-defined or user-
defined resource pool. Figure 5.11 shows how to configure QoS for a system-defined
resource pool (vMotion). This is an 802.1p tag, and has the configurable range of (none)
or 0 to 7 (see Table 5.5 for more information on QoS tagging).
Note
To learn more about network I/O control, read the vSphere Networking
for 5.5 white paper: http://pubs.vmware.com/vsphere-
55/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-55-
networking-guide.pdf.
Summary
We covered quite a bit in this chapter. We discussed how architecting SQL Server is a
team effort and how putting a Center of Excellence team in place can assist with
ensuring SQL Server virtual machines are properly configured to run on ESXi hosts. We
then walked through the four core resources and examined these from the physical,
hypervisor, and virtual machine levels.
This chapter has provided considerations and introduced concepts that will be
discussed in great detail in the following chapters. Specifically, in the next chapter, we
will dive into architecting storage for a SQL Server implementation.
Chapter 6. Architecting for Performance: Storage
All aspects of architecting your SQL Server Database for performance are important.
Storage is more important than most when compared to the other members of the IT
Food Group family we introduced in Chapter 5, Architecting for Performance:
Design, which consists of Disk, CPU, Memory, and Network. Our experience has
shown us, and data from VMware Support validates this belief, that more than 80% of
performance problems in database environments, and especially virtualized
environments, are directly related to storage. Understanding the storage architecture in a
virtualized environment and getting your storage architecture right will have a major
impact on your database performance and the success of your SQL Server virtualization
project. Bear in mind as you work through your storage architecture and this chapter that
virtualization is bound by the laws of physics; it won't fix bad code or bad database
queries. However, if you have bad code and bad queries, we will make them run as fast
as possible.
Tip
Greater than 80% of all problems in a virtualized environment are caused by the
storage in some way, shape, or form.
This chapter first covers the key aspects of storage architecture relevant to both physical
and virtual environments as well as the differences you need to understand when
architecting storage, specifically for virtualized SQL Server Databases. Many of the
concepts we discuss will be valid for past versions of SQL Server and even the newest
release, SQL Server 2014.
We provide guidance on what our experience has taught us are important database
storage design principles. We present a top-down approach covering SQL Server
Database and Guest OS Design, Virtual Machine Template Design, followed by
VMware vSphere Hypervisor Storage Design and then down to the physical storage
layers, including using server-side flash acceleration technology to increase
performance and provide greater return on investment. We conclude the chapter by
covering one of the biggest IT trends and its impact on SQL Server. Throughout this
chapter, we give you architecture examples based on real-world projects that you can
adapt for your purposes.
When designing your storage architecture for SQL Server, you need to clearly
understand the requirements and have quantitative rather than subjective metrics. Our
experience has taught us to make decisions based on fact and not gut feeling. You will
need to benchmark and baseline your storage performance to clearly understand what is
achievable from your design. Benchmarking and baselining performance are critical to
your success, so we've dedicated an entire chapter (Chapter 10, How to Baseline
Your Physical SQL Server System) to those topics. In this chapter, we discuss some of
the important storage system component performance aspects that will feed into your
benchmarking and baselining activities.
Caution
A lesson from the field: We were working with a customer who wanted to
design and run a database on vSphere that could support a sustained 20,000 IOPS.
After we worked with the customer's vSphere, SAN, Network, and DBA teams,
the customer decided to move forward with the project. The customer then called
in a panic saying, "In our load test, we achieved 1,000 IOPS. We are 19,000
short of where we need to be." Trust me, this is a phone call you don't want to
get. Playing the odds, we started with the disk subsystem. We quickly identified
some issues. The main issue was that the customer had purchased for capacity, not
performance. They had to reorder the right disks. Once the new (right) disks
arrived and were configured, the customer exceeded the 20,000 IOPS requirement.
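The sketch below, in Python, illustrates the "size for performance before capacity" arithmetic behind that story: estimating spindle count from a target IOPS figure, read/write mix, and RAID write penalty. The per-disk IOPS and penalty values are typical assumptions; substitute your array vendor's figures.

# Sketch of sizing for performance before capacity: estimate the number of
# spindles needed for a target IOPS figure. Per-disk IOPS and RAID penalty
# values are typical assumptions; substitute your array vendor's numbers.

import math

def disks_required(target_iops, read_ratio, per_disk_iops, raid_write_penalty):
    write_ratio = 1 - read_ratio
    backend_iops = target_iops * (read_ratio + write_ratio * raid_write_penalty)
    return math.ceil(backend_iops / per_disk_iops)

# 20,000 IOPS at 70/30 read/write on 15K disks (~180 IOPS each) in RAID 5 (penalty 4):
print(disks_required(20000, read_ratio=0.7, per_disk_iops=180, raid_write_penalty=4))
# -> 212 disks; sizing on capacity alone would have suggested far fewer.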
Tip
When it comes to storage devices, HDDs are cents per GB but dollars per IOP,
whereas SSDs are cents per IOP and dollars per GB. SSDs should be considered
cheap memory, rather than expensive disks, especially when it comes to
enterprise SSDs and PCIe flash devices.
Tip
If you are virtualizing existing databases, you might consider using a tool such as
VMware Capacity Planner, VMware Application Dependency Planner, Microsoft
System Center, or Microsoft Assessment and Planning Toolkit to produce the
inventory. VMware Capacity Planner and Application Dependency Planner are
available from VMware Professional Services or your preferred VMware
partner. When you're baselining a SQL Server database, a lot can happen in a
minute. We recommend your sample period for CPU, Memory, and Disk be 15
seconds or less. We recommend you sample T-SQL every minute.
Table 6.1 Number of Data Files and Temp DB Files Per CPU
Note
It is extremely unlikely you will ever reach the maximum storage capacity
limits of a SQL Server 2012 database system. We will not be covering the
maximums here. We recommend you refer to Microsoft
(http://technet.microsoft.com/en-us/library/ms143432.aspx).
Microsoft recommends as a best practice that you should configure one Temp DB data
file per CPU core and 0.25 to 1 data file (per file group) per CPU core. Based on our
experience, our recommendation is slightly different.
If your database is allocated eight or fewer vCPUs as a starting point, we recommend
you should configure at least one Temp DB file per vCPU. If your database is allocated
more than eight vCPUs, we recommend you start with eight Temp DB files and increase
in lots of four if performance bottlenecks occur or capacity dictates.
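As a minimal sketch only (the file names, paths, and sizes shown are placeholders rather than recommendations for a specific system), adding Temp DB data files to reach one file per vCPU on a four-vCPU virtual machine might look like this:

-- Assumes Temp DB starts with its single default data file; adjust names,
-- paths, and sizes to suit your own standards
ALTER DATABASE tempdb ADD FILE
    (NAME = tempdev2, FILENAME = 'T:\TempDB\tempdev2.ndf', SIZE = 250MB, FILEGROWTH = 256MB);
ALTER DATABASE tempdb ADD FILE
    (NAME = tempdev3, FILENAME = 'T:\TempDB\tempdev3.ndf', SIZE = 250MB, FILEGROWTH = 256MB);
ALTER DATABASE tempdb ADD FILE
    (NAME = tempdev4, FILENAME = 'T:\TempDB\tempdev4.ndf', SIZE = 250MB, FILEGROWTH = 256MB);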
Tip
Temp DB is very important because it's extensively utilized by OLTP databases
during index reorg operations, sorts, and joins, as well as for OLAP, DSS, and
batch operations, which often include large sorts and join activity.
We recommend in all cases you configure at least one data file (per file group) per
vCPU. We recommend a maximum of 32 files for Temp DB or per file group for
database files because you'll start to see diminishing performance returns with large
numbers of database files over and above 16 files. An insufficient number of data files can
lead to many writer processes queuing to update GAM pages. This is known as GAM
page contention. The Global Allocation Map (GAM) tracks which extents have been
allocated in each file. GAM contention would manifest in high PageLatch wait times.
For extremely large databases into the many tens of TB, 32 files of each type should be
sufficient.
Updates to GAM pages must be serialized to preserve consistency; therefore, the
optimal way to scale and avoid GAM page contention is to design sufficient data files
and ensure all data files are the same size and have the same amount of data. This
ensures that GAM page updates are equally balanced across data files. Generally, 16
data files for tempdb and user databases is sufficient. For Very Large Database (VLDB)
scenarios, up to 32 can be considered. See
http://blogs.msdn.com/b/sqlserverstorageengine/archive/2009/01/04/what-is-
allocation-bottleneck.aspx.
If you expect your database to grow significantly long term, we would recommend that
you consider configuring more data files up front. The reason we specify at least one
file per CPU is to increase the parallelism of access from CPU to data files, which will
reduce any unnecessary data access bottlenecks and lower latency. This also allows for
even data growth, which will reduce IO hotspots.
Caution
Having too few or too many Temp DB files can impact the overall performance
of your database. Our guidance is conservative and aimed at meeting the
requirements for the majority of SQL systems. If you start to see performance
problems such as higher than normal query response times or excessive database
waits in PAGELATCH_XX, then you have contention in memory and may need to
increase the number of Temp DB files further and/or implement trace flag 1118
(which we recommend), which prevents single page allocations. If you see waits
in PAGEIOLATCH_XX, then the contention is at the IO subsystem level. Refer to
http://www.sqlskills.com/blogs/paul/a-sql-server-dba-myth-a-day-1230-
TempDB-should-always-have-one-data-file-per-processor-core/ and Microsoft
KB 328551 (http://support.microsoft.com/kb/328551).
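The following T-SQL is a sketch of how you might check for this type of contention and enable trace flag 1118; the queries use standard dynamic management views, but treat the interpretation as guidance rather than a definitive test:

-- Live allocation-page contention in Temp DB (database ID 2) shows up as
-- PAGELATCH waits on resources beginning with '2:'
SELECT session_id, wait_type, wait_duration_ms, resource_description
FROM sys.dm_os_waiting_tasks
WHERE wait_type LIKE 'PAGELATCH%'
  AND resource_description LIKE '2:%';

-- Enable trace flag 1118 globally until the next restart; add -T1118 as a
-- startup parameter to make it persistent
DBCC TRACEON (1118, -1);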
Tip
The number of data files and Temp DB files is important enough that Microsoft
has two spots in the Top 10 SQL Server Storage best practices highlighting the
number of data files per CPU. Refer to http://technet.microsoft.com/en-
us/library/cc966534.aspx.
Note
When youre determining the number of database files, a vCPU is logically
analogous to a CPU core in a native physical deployment. However, in a
native physical environment without virtualization, each CPU core may
also have a hyper-thread. In a virtual environment, each vCPU is a single
thread. There is no virtual equivalent of a hyper-thread.
Figure 6.3 shows an example of data files, Temp DB files, and transaction log files
allocated to a SQL Server 2012 Database on a sample system with four vCPUs and
32GB RAM.
Figure 6.3 SQL Database data file allocation.
Note
As Figure 6.3 illustrates, there is only one transaction log file per database
and per Temp DB. Log files are written to sequentially, so there is no
benefit in having multiples of them, unless you exceed the maximum log file
size (2TB) between backups. There is a benefit of having them on very fast
and reliable storage, which will be covered later.
Tip
Always configure SQL data files to be equal size to maximize parallelism and
overall system performance. This will prevent hot spots that could occur if
different files have different amounts of free space. Having equally sized data files
also ensures even growth and more predictable performance.
The next important point is that you should preallocate all your data files and transaction
log files. This will eliminate the need for the database to constantly grow the files and
resize them, which will degrade performance and put more stress on your storage
platform. The files can't be accessed for the period of time they are being extended, and
this will introduce avoidable latency.
It is a Microsoft best practice and our recommendation to manually and proactively
manage file sizes. Because you are presizing and proactively managing your database
files, you shouldn't need to rely on Auto Grow as much. Even though it may not be
needed, we recommend that Auto Grow be left active as a safety net.
Tip
Auto Grow should be set to grow at the same or a multiple of the underlying
storage system block size. In VMware environments, the block size on data stores
will be between 1MB and 8MB. Your Database Auto Grow size should be set
similarly, or at a multiple of this. Auto Grow should not be configured for
unrestricted growth; it should be limited to less than the size of the underlying file
system, taking into consideration the size of any other files on the file system. See
VMware KB 1003565.
If you are unsure what your underlying block size is, set Auto Grow to a multiple of
1MB. To prevent Auto Grow from being active too often, consider configuring it to
grow at around 10% of your initial database size rounded up to the nearest 1MB (or
block size), up to a maximum of 4GB. In most cases, an Auto Grow amount of 256MB
to 512MB should be sufficient. This will ensure the grow operation doesn't take too
long and is aligned to the underlying storage subsystem.
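As a sketch of these settings (the database and logical file names are examples only), Auto Grow and a growth cap can be applied with ALTER DATABASE:

-- Grow in 512MB increments and cap the file below the size of its file system
ALTER DATABASE SalesDB
MODIFY FILE (NAME = SalesDB_Data1, FILEGROWTH = 512MB, MAXSIZE = 60GB);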
Caution
Because Auto Grow will by default zero out all the blocks and prevent access to
the files during that period, you don't want the operation to take too long. You
also don't want these operations to happen too frequently. Therefore, the Auto
Grow size needs to be small enough that it completes in a reasonable time but not
too small as to require constant growth. The database file sizing guidelines need
to be adjusted based on the performance in terms of throughput of your storage
and the workload behavior of your database. If you are proactively managing the
size of your database files, then Auto Grow should not be kicking in at all and
this shouldn't be a concern.
Tip
By default, Auto Grow operations will expand one file at a time. This will
impact the proportional fill algorithm and could result in degraded performance
and storage hot spots. To avoid this behavior, you can use trace flag 1117 by
specifying startup option -T1117 or by using the DBCC TRACEON command.
By using this trace flag, you will ensure that each file is grown by the same
amount at the same time. This trace flag is set by default when installing SAP on
SQL Server 2012. Refer to SAP Note 1238993 and
http://www.ciosummits.com/media/pdf/solution_spotlight/SQL%20Server%202012%20Tec
Note
To reduce the performance impact of file growth operations, Instant File
Initialization can be used; this is covered in the next section.
Now that we've covered the fundamentals, we can calculate the initial size of the
database files. The initial file sizes are fairly easy to determine if you're migrating an
existing system, in which case we recommend you preset your files to be the same size
as the system that is being migrated, which would be the case if you are doing a
standard physical-to-virtual migration. If this is a new database being virtualized, you
will need to estimate the database files' initial size.
Data File Sizing
For data files, the preset size you should use is based on the estimated or actual size of
your database. You should allow for reasonable estimated growth (three to six months).
Once you have the total estimated size of your database, including growth, divide that by
the number of files to get the size of each file. For example, if you had a database
200GB in size with four vCPUs configured, you would have four data files, assuming
one file per vCPU, with a preset size of 50GB each. Each data file should always be of
equal size and be extended at the same rate.
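To illustrate the example above (the database name, drive letters, and growth settings are assumptions you should adapt), the four equally sized data files might be created like this:

-- 200GB database split across four 50GB data files in the primary file group
CREATE DATABASE SalesDB
ON PRIMARY
    (NAME = SalesDB_Data1, FILENAME = 'D:\SQLData\SalesDB_Data1.mdf', SIZE = 50GB, FILEGROWTH = 512MB),
    (NAME = SalesDB_Data2, FILENAME = 'D:\SQLData\SalesDB_Data2.ndf', SIZE = 50GB, FILEGROWTH = 512MB),
    (NAME = SalesDB_Data3, FILENAME = 'E:\SQLData\SalesDB_Data3.ndf', SIZE = 50GB, FILEGROWTH = 512MB),
    (NAME = SalesDB_Data4, FILENAME = 'E:\SQLData\SalesDB_Data4.ndf', SIZE = 50GB, FILEGROWTH = 512MB)
LOG ON
    (NAME = SalesDB_Log, FILENAME = 'L:\SQLLog\SalesDB_Log.ldf', SIZE = 8000MB, FILEGROWTH = 8000MB);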
Note
As with other resource types, it is not necessary to factor in multiple years
of growth to your database file sizing up front in a virtualized environment.
It is quick and easy to expand the existing storage of a virtual machine
online when required. By right-sizing your virtual machines and your
VMware vSphere infrastructure, you will maximize your ROI.
Temp DB File Sizing
The size of your Temp DB files should be based on the high watermark usage you
estimate for your queries and the overall size of your database. This can be hard to
estimate without knowledge of your workload because different queries will impact
your Temp DB usage in different ways. The best way to determine the appropriate size
will be to monitor Temp DB usage during a proof of concept test, or benchmarking and
baselining activities.
As a starting point, we recommend you consider sizing Temp DB at 1% of the size of
your database. Each file would then be equal to the total size of Temp DB divided by the
number of files. For example, if you had a 100GB database with four vCPUs
configured, you would have an initial total Temp DB size of 1GB, and each Temp DB
data file would be 250MB in size. If you see significantly more Temp DB use during
ongoing operations, you should adjust the preset size of your files.
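A sketch of presizing the Temp DB files from this example follows; the logical file names are the SQL Server default plus placeholders, so check yours in sys.master_files first:

-- Preset each Temp DB data file to the same 250MB size
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev,  SIZE = 250MB);
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev2, SIZE = 250MB);
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev3, SIZE = 250MB);
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev4, SIZE = 250MB);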
Note
Temp DB files are cleared, resized, and reinitialized each time the
database is restarted. Configuring them to be preset to the high water mark
usage will ensure they are always at the optimal size.
Tip
If you care about data protection and preventing data loss, use full recovery
mode.
If you are doing daily backups, you will need to ensure that your log file is sufficiently
sized to hold at least a day's worth of transactions. This will allow you to
recover to the point in time at which your database went down by using the last backup and
replaying the transaction logs. In some large database systems, you will need to back up
the transaction logs much more frequently than every day.
Caution
If you are using Simple or Bulk Logged Recovery Model, data loss is a
possibility. When using Simple Recovery Model for your database, it is not
possible to perform media recovery without data loss, and features such as
AlwaysOn Availability Groups, Database Mirroring, Log Shipping, and Point in
Time Restores are not available. For more information on recovery models, refer
to http://msdn.microsoft.com/en-us/library/ms189275.aspx.
When it comes to storage performance and sizing of your transaction log, the total size
and how fast you can write transactions to it are important but are not the only
considerations. You must also consider the performance of file growth, DB restart, and
backup and recovery operations. With this in mind, it is critical that not only is the total
size of your transaction log appropriate, but also how you grow your transaction log to
that size. The reason this is so critical is that in SQL Server, even though your
transaction log may be one physical file, it's not one physical transaction log.
Your one physical transaction log is actually made up of a number of smaller units
called Virtual Log Files (VLFs). VLFs are written to sequentially, and when one VLF is
filled, SQL Server will begin writing to the next. They play a critical part in the
performance of database backup and recovery operations.
The number of VLFs is determined, at the time a file is created or extended, by the initial
size allocated to the transaction log and by the amount the file is grown each time it is
increased in size. If you leave the default settings with a large database, you can quickly
find yourself with tens if not hundreds of thousands of VLFs, and this will cause a
negative performance impact. This is why the process of preallocating the transaction
log file and growing it by the right amount is so important.
Tip
To learn more about the physical architecture of SQL Server transaction log files,
refer to http://technet.microsoft.com/en-us/library/ms179355(v=sql.105).aspx.
If the VLFs are too small, your maintenance, reboots, and database recovery operations
will be excruciatingly slow. If your VLFs are too big, your log backups and clearing
inactive logs will be excruciatingly slow and may impact production performance. The
reason for the former is that SQL Server must load the list of VLFs into memory and
determine the state of each, either active or inactive, when doing a DB restart or
recovery. The latter is because a VLF can't be cleared until SQL Server moves on to
the next one.
As you can see from Table 6.2, if you create or grow a transaction log file by 64MB or
less at a time, you will get four VLFs each time. If you need 200GB of transaction log,
and it is created or grown by this amount, you end up with 12,800 VLFs, with each VLF
being 16MB. At or before this point, you'd start to notice performance problems.
Caution
The number of VLFs in your transaction log file should not exceed 10,000.
Above this level, there will be a noticeable performance impact. In an
environment with log shipping, mirroring, or AlwaysOn, the number of VLFs will
have an impact on the entire related group of SQL Servers. See Microsoft KB
2653893 (http://support.microsoft.com/kb/2653893) and SAP Note 1671126
(http://service.sap.com/sap/support/notes/1671126).
To avoid the performance problems covered previously, you should ensure your VLF
size is between 256MB and 512MB. This will guarantee that even if your transaction
log were to reach the maximum size of 2TB, it will not contain more than 10,000 VLFs.
To achieve this, you can preset your log file to either 4GB or 8GB and grow it (either
manually or with Auto Grow) by the same amount each time. If we take the example of
a 128GB transaction log, you would initially create an 8GB log file and then grow it by
8GB fifteen times. This will leave you with the 128GB log file and 256 VLFs within
that log file, at 512MB each. You should set your transaction log file Auto Grow size to
be the same as whatever growth increment you have decided upon.
Tip
One of the quickest and easiest ways to find out how many VLFs there are in your
database and to find out more about your log files is to execute the query DBCC
LOGINFO. The number of rows returned is the number of VLFs.
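If you want the VLF count as a single number, the following sketch captures the DBCC LOGINFO output into a temporary table; the column list shown matches SQL Server 2012 (earlier versions do not have the RecoveryUnitId column), so adjust it for your version:

-- Count the VLFs in the current database
CREATE TABLE #vlf (RecoveryUnitId INT, FileId INT, FileSize BIGINT, StartOffset BIGINT,
                   FSeqNo BIGINT, Status INT, Parity INT, CreateLSN NUMERIC(38,0));
INSERT INTO #vlf EXEC ('DBCC LOGINFO');
SELECT COUNT(*) AS VLFCount FROM #vlf;
DROP TABLE #vlf;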
Note
If you are creating a database to support SAP, we recommend you review
the following link with regard to transaction log sizing in addition to SAP
Note 1671126:
http://blogs.msdn.com/b/saponsqlserver/archive/2012/02/22/too-many-
virtual-log-files-vlfs-can-cause-slow-database-recovery.aspx.
Caution
There is a bug when growing log files by multiples of exactly 4GB that affects
SQL Server 2012. If you attempt to grow the log by a multiple of 4GB, the first
attempt will fail to extend the file by the amount specified (you might see 1MB
added), but will create more VLFs. The second or subsequent attempt will
succeed in growing the file by the specified amount. This bug is fixed in SQL
Server 2012 SP1. As a workaround, if you are still using SQL Server 2012, you
should grow in increments of 4,000MB or 8,000MB rather than 4GB or 8GB. See
http://www.sqlskills.com/blogs/paul/bug-log-file-growth-broken-for-multiples-
of-4gb/.
Even if your database were relatively small, we would recommend that you start with a
4GB or 8GB (4,000MB or 8,000MB) transaction log file size. You should proactively
and manually manage the size of your transaction log. Proactive management will avoid
Auto Grow kicking in during production periods, which will impact performance. This
is especially important when considering the transaction log will be growing at 4GB or
8GB at a time and having all those blocks zeroed out. However, just as with data files
and Temp DB files, you should have Auto Grow enabled as a safety net and set it to
either 4GB or 8GB, depending on the growth size you have selected.
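As a sketch of this approach (the database and logical file names are examples), the log can be grown in 8,000MB steps; remember that SIZE in ALTER DATABASE is the new total size, not an increment:

-- Assuming the log was created at 8,000MB, grow it in 8,000MB steps toward
-- a 128,000MB target
ALTER DATABASE SalesDB MODIFY FILE (NAME = SalesDB_Log, SIZE = 16000MB);
ALTER DATABASE SalesDB MODIFY FILE (NAME = SalesDB_Log, SIZE = 24000MB);
-- ...continue in 8,000MB steps until the target size is reached...
ALTER DATABASE SalesDB MODIFY FILE (NAME = SalesDB_Log, SIZE = 128000MB);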
Figure 6.4 Enabling the Perform Volumes Maintenance Tasks security policy.
After you have made this change, you will need to restart your SQL Server services for
it to take effect. We recommend you make this setting a standard for all your SQL
Server databases and include it in your base template.
Note
Instant File Initialization (IFI) is only used for data files and Temp DB
files. Even when IFI is configured and active, it will not be used for
transaction log files. Transaction log files will continue to zero out every
block when they are created or extended. This is due to the internal
structure of the transaction log file and how it is used for data protection
and recovery operations. This makes it even more important for you to
proactively manage your transaction log files to prevent any avoidable
performance impacts.
Caution
Instant File Initialization is not available when youre using Transparent Data
Encryption (TDE) or when trace flag 1806 is set (which disables Instant File
Initialization). Although IFI has a positive impact on performance, there are
security considerations. For this reason, using IFI may not be suitable for all
environments. See http://technet.microsoft.com/en-
us/library/ms175935(v=sql.105).aspx. Based on our experience, in most
environments, the highlighted security considerations can be addressed by proper
controls and good system administration practice.
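One widely used, though informal, way to confirm whether Instant File Initialization is actually in effect is to temporarily enable trace flags 3004 and 3605, create a throwaway database, and check the error log: zeroing messages for data files indicate IFI is not active. Treat the following as a sketch; sp_readerrorlog is an undocumented procedure and the database name is an example:

DBCC TRACEON (3004, 3605, -1);   -- write file zeroing messages to the error log
CREATE DATABASE IFICheck;        -- throwaway database used only for the test
EXEC sp_readerrorlog 0, 1, 'Zeroing';
DROP DATABASE IFICheck;
DBCC TRACEOFF (3004, 3605, -1);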
Tip
Separate the operating system, application binaries, and page file from core
database files so they don't impact the performance of the SQL Server Database.
From a database storage performance perspective, any paging is bad and should be
avoided. Details of the page file and SQL Server memory configuration will be covered
in Chapter 7, Architecting for Performance: Memory. Chapter 7 will show you how to
avoid paging and optimize performance from the memory configuration of your SQL
Server.
File System Layout for Data Files, Log Files, and Temp DB
When considering the design of the file system layout for data files, log files, and Temp
DB, our objectives are as follows:
1. Optimize parallelism of IO (Principle 1).
2. Isolate different types of IO from each other that may otherwise cause a
bottleneck or additional latency, such as OS and page file IO from database IO, or
sequential log file IO from random data file IO.
3. Minimize management overheads by using the minimum number of drive letters or
mount points required to achieve acceptable performance (Principle 5).
In order to achieve objectives 1 and 2, we recommend splitting out data files and Temp
DB files from log files onto separate drive letters or mount points. This has the effect of
killing two birds with one stone. By separating log files into their own drive or mount
point, you maintain the sequential nature of their IO access pattern and can optimize this
further at the hypervisor and physical storage layer later if necessary. If the log files
share a drive or mount point, the access pattern of that device will instantly become
random. Random IO is generally harder for storage devices to service. At the same
time, you are able to increase the parallelism needed for the IO patterns of the data files
and Temp DB files.
To achieve greater IO parallelism at the database and operating system layer, you need
to allocate more drive letters or mount points. The reason for this is that each storage
device (mount point or drive) in Windows has a certain queue depth, depending on the
underlying IO controller type being used. Optimizing the total number of queues
available to the database by using multiple drives or mount points allows more
commands to be issued to the underlying storage devices in parallel. We will discuss
the different IO controllers and queue depths in detail later.
As a starting point for standalone database instances, we recommend that you configure
a drive letter or mount point per two data files and one Temp DB file. This
recommendation assumes each file will not require the maximum performance
capability of the storage device at the same time. The actual number of drive letters or
mount points you need will be driven by your actual database workload. But having
fewer drives and mount points will simplify your design and make it easier to manage.
The more users, connections, and queries, the higher the IO requirements will be, and
the higher the queue depth and parallelism requirements will be, and the more drive
letters and mount points you will need.
Caution
You should monitor your database for signs of contention in the underlying
storage subsystem. You can do this by querying the top wait states and checking
PAGEIOLATCH. If you see excessive waits, that is a sign of database IO
contention, and you may need to adjust your file system layout or underlying
storage that supports your virtualized databases.
Refer to http://blogs.msdn.com/b/askjay/archive/2011/07/08/troubleshooting-
slow-disk-i-o-in-sql-server.aspx.
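A sketch of such a query follows; the list of excluded idle wait types is illustrative and not exhaustive:

-- Top waits by total wait time; PAGEIOLATCH_XX near the top suggests IO contention
SELECT TOP 10 wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN ('SLEEP_TASK', 'LAZYWRITER_SLEEP', 'SQLTRACE_BUFFER_FLUSH',
                        'XE_TIMER_EVENT', 'REQUEST_FOR_DEADLOCK_SEARCH', 'BROKER_TASK_STOP')
ORDER BY wait_time_ms DESC;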
The example in Figure 6.5 illustrates how you might arrange your database files for a
standalone instance. If you start to see IO contention and your database is growing (or is
expected to grow) very large or makes a lot of use of Temp DB, then you may wish to
separate out Temp DB files onto their own drive letters or mount points. This would
remove the chance of Temp DB IO activity impacting the IO activity of your other data
files and allow you to put Temp DB onto a separate IO controller (point 2 of our file
system layout objectives).
Figure 6.5 Sample SQL Server file system layout: Temp DB with data files.
Having a single Temp DB file on the same drive with two data files in general will
balance the IO activity patterns and achieve acceptable performance without an
excessive number of drives to manage. The reason for this layout is more likely on a
standalone instance instead of with a clustered instance, which will become clear on the
next page.
Tip
You should size each drive letter or mount point so that the preallocated database
files on it consume no more than 80% of the available capacity. When you need
to grow the capacity of your database, you have the option of either extending the
existing drives or mount points or adding in more. These operations can be done
online without any disruption to your running database. Auto Grow should be
configured so that in the worst-case scenario, the maximum growth of all the files
on the drive or mount point combined will never exceed the total capacity.
In the example in Figure 6.6, we have split out the Temp DB files onto separate drive
letters from the data files of the production database. If you have a very large database
or your database will have heavy IO demands on Temp DB, it makes sense to split it out
onto its own drives and a separate IO controller.
Figure 6.6 Sample SQL Server file system layout: data files separate from Temp DB.
In databases that make extremely heavy use of Temp DB, such as peaking at more than
50% of total database size, it might make sense for each Temp DB file to be on its own
drive or mount point to allow each file access to more parallel IO resources. This
assumes that the underlying storage infrastructure can deliver more IO in parallel, which
we will cover later in this chapter.
In an AlwaysOn Failover Cluster Instance, an additional reason to separate Temp DB
onto different drives or mount points from other data files is that it can be hosted locally
to the cluster node, rather than on the shared storage. This makes a lot of sense given
that the Temp DB data doesn't survive instance restarts. This allows you to
optimize the performance of Temp DB without impacting the data files and log files that
are shared between cluster nodes. If you have extreme Temp DB IO requirements, you
could consider locating it on local flash storage, but be aware that local storage would
prevent the guest from restarting on another host in a VMware HA event. In this case,
the cluster node would be unavailable if the local flash storage failed, which would
trigger a failover to another node. Hosting Temp DB locally is a new capability of SQL
Server 2012 AlwaysOn Failover Cluster Instances that wasn't
previously available (see http://technet.microsoft.com/en-us/sqlserver/gg508768.aspx).
More details about AlwaysOn Availability Groups and Failover Cluster Instances are
provided in Chapter 9, Architecting for Availability: Choosing the Right Solutions.
Tip
In the case where you are splitting Temp DB files out onto separate drives from
the other data files, it makes sense to also assign them to a separate IO controller.
This will optimize the path of the IOs from the database through Windows and
down to the underlying storage. We have used this as the foundation of our
AlwaysOn Availability Group example configuration in Chapter 11, Configuring
a Performance Test: From Beginning to End, which is depicted in Figure 11.10.
Tip
For a SQL Server database that consists of very few, very large files, having a
much larger Allocation Unit is much more efficient from a file system, operating
system management, and performance perspective.
For the OS and Application Binary drive, keeping the default of 4KB Allocation Unit is
recommended. There is no benefit in changing from the default. If your page file is on a
separate drive from the OS, you should use a 64KB Allocation Unit size. For all SQL
Server database drives and mount points (data files, log files, and Temp DB files), we
recommend you use 64KB as your Allocation Unit Size setting (see Figure 6.7).
Tip
The Default NTFS Allocation Unit size is 4KB for all volumes up to 16TB in
size. Volumes greater than 16TB in size will have larger default Allocation Units.
Regardless of your volume size and the default NTFS Allocation Unit size, we
recommend you use 64KB. For most environments, it's unlikely you will be using
more than 16TB for each volume.
See http://support.microsoft.com/kb/140365 for further details of the NTFS
Allocation Unit sizes for different-sized volumes.
Partition Alignment
Each storage device reads and writes data at different underlying block sizes. A block
on a storage device is the least amount of data that is read from or written to with each
storage operation. If your file system partition is not aligned to the underlying blocks on the
storage device, you get a situation called Split IO in which multiple storage operations
are required to service a single operation from your application and operating system.
Split IOs reduce the available storage performance for productive IO operations, and
this gets even worse when RAID is involved, due to the penalty of certain operations,
which we'll cover later in this chapter.
Figure 6.8 shows what would be considered a worst-case scenario, where the file
system partition and the VMware vSphere VMFS partition are misaligned. In this case,
for every three backend IOs, you get one productive IO. This could have the effect of
causing each IO operation to incur 3X the latency, which is like getting 30% of the
performance from your 100% storage investment. Fortunately, with Windows 2008 and
above and with VMFS
volumes that are created through VMware vCenter, this problem is much less likely.
Figure 6.8 File system and storage that is not correctly aligned.
Starting with Windows 2008, all partitions are aligned to the 1MB boundary. This
means in almost all cases, they will be aligned correctly. The same is true with VMFS5
partitions created through VMware vCenter. They will align to the 1MB boundary.
However, if you have an environment that has been upgraded over time, you may still
have volumes that are not correctly aligned. The easiest way to check is to monitor for
Split IOs in either ESXTOP or Windows Performance Monitor.
Figure 6.9 shows that reading one frontend block will require only one backend IO
operation, thus providing lower latency and higher IO performance.
Figure 6.9 File system and storage that is aligned.
Tip
There is a direct tradeoff between allocating more memory to the SQL Server
Database and its Buffer Pool to reduce read IO and allocating less memory and
having more read IO. For your design, you need to consider which resource will
be more of a constraint and a cost. In some situations, more memory for your
database and for your vSphere hosts could be cheaper than purchasing more
performance via your storage systems. However, server-side flash, which can
be thought of as cheap memory rather than expensive storage, combined with
smart software, is changing the economics of this equation. We will discuss in
more detail later in this chapter how using flash storage local to the server can
allow you to consolidate more databases per host with less memory per database
without degrading performance to an unacceptable level.
Caution
When you upgrade an existing database to SQL Server 2012, the statistics may
become out of date and result in degraded performance. To avoid this, we
recommend you update statistics immediately after upgrading your database. To
do this manually, you can execute sp_updatestats. Refer to
http://www.confio.com/logicalread/sql-server-post-upgrade-poor-query-
performance-w02/, which contains an excerpt from Professional Microsoft SQL
Server 2012 Administration, published by John Wiley & Sons.
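A minimal sketch of that post-upgrade step follows; the database name is an example:

-- Update statistics across all tables in the upgraded database
USE SalesDB;
GO
EXEC sp_updatestats;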
There are two primary methods to deal with the problem of outdated statistics impacting
your database and storage IO performance.
Trace Flag 2371: Dynamic Threshold for Automatic Statistics Update
The first method involves using trace flag 2371 by setting startup option -T2371 or
DBCC TRACEON (2371, -1). This is documented in Microsoft KB 2754171
(http://support.microsoft.com/kb/2754171). This trace flag tells SQL Server to
dynamically change the percentage a table needs to change before the statistics are
automatically updated. In very large tables, an automatic update of statistics can now be
triggered by a change of less than 1%. Using this option could result in significantly
improved performance for situations where you have very large tables.
Tip
Information with regard to trace flag 2371 in SAP environments can be found in
the following articles: http://scn.sap.com/docs/DOC-29222 and
http://blogs.msdn.com/b/saponsqlserver/archive/2011/09/07/changes-to-
automatic-update-statistics-in-sql-server-traceflag-2371.aspx.
Caution
Database statistics are compiled against each table in your database. When SQL
Server updates statistics, this information is recompiled. Automatic statistics
update and trace flag 2371 may cause statistics to be updated more frequently
than necessary. So there is a tradeoff between the performance benefit of doing
statistics updates regularly and the cost of recompiling the statistics. The cost of
doing this operation is not free, and in rare cases it can have a detrimental impact
on performance. If you find any performance problems correlating to the periods
of time where statistics are being updated, then you may wish to control when
statistics updates occur. For the majority of customers we deal with, around 80%
experience positive performance improvements and no detrimental impact by
using the dynamic automatic updates for database statistics. Refer to
http://technet.microsoft.com/en-us/library/ms187348.aspx.
Tip
It's important that databases have updated statistics so that the Query Optimizer
works properly. This can be done via automatic settings or scheduled
maintenance jobs. Use scheduled maintenance jobs where the timing of statistics
gathering needs to be controlled to minimize the performance impact on the
database during peak demand periods.
Caution
Data Compression introduces a CPU overhead and may increase CPU utilization
on your database. In most cases, this overhead is outweighed by the benefit in
performance you receive. In most virtualized environments, CPU performance
will not be your constraint; memory and storage IO are usually the bottleneck.
However, it won't benefit every workload and is not likely to be suitable for small
tables that change very often. The best workloads for data compression consist of
large tables that are predominantly read oriented. Also, Data Compression can't
be enabled for system tables. Refer to http://technet.microsoft.com/en-
us/library/cc280449.aspx and http://msdn.microsoft.com/en-
us/library/dd894051(SQL.100).aspx.
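As a sketch (the schema and table names are examples), you can estimate the benefit before enabling Page Compression on a large, read-mostly table:

-- Estimate first, then rebuild the table with PAGE compression
EXEC sp_estimate_data_compression_savings
     @schema_name = 'dbo', @object_name = 'SalesHistory',
     @index_id = NULL, @partition_number = NULL, @data_compression = 'PAGE';

ALTER TABLE dbo.SalesHistory REBUILD WITH (DATA_COMPRESSION = PAGE);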
Tip
If you are using SAP with SQL Server 2012, then Page Compression is turned on
by default. For detailed information about using SQL Server Data Compression
with SAP, refer to http://scn.sap.com/docs/DOC-1009 and the SAP on SQL
Server Page (http://scn.sap.com/community/sqlserver).
Column Storage
Column Storage, also known as xVelocity memory optimized column store index, is a
new feature of SQL Server 2012 aimed at data warehouse workloads and batch
processing. Column Storage is much more space and memory efficient at storing and
aggregating massive amounts of data. Leveraging this feature can greatly improve the
performance of data warehouse queries. However, to use it you must make some
tradeoffs.
Tip
In SQL Server 2012, you can select from a column store index and you can also
rebuild it. SQL Server 2014 adds the ability not only to select from and rebuild a
column store index but also to directly insert, update, or delete individual rows.
When using Column Storage, you will not be able to use Large Pages and Lock Pages in
Memory (trace flag 834) because this will increase the work the translation look-aside
buffer (TLB, see Chapter 7) has to do. Also, the tables using the column store index will
be read-only. Any time you need to write data to the table, you need to drop and re-
create the column store index, but this can easily be done with scheduled batch jobs. For
the types of workloads that Column Storage is well suited to, these tradeoffs are
normally worth the benefits.
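The following is a sketch of that batch pattern for a SQL Server 2012 nonclustered column store index; the table and column names are examples, and ALTER INDEX ... DISABLE and REBUILD can be used instead of dropping and re-creating:

-- Drop the column store index, load the batch, then re-create the index
DROP INDEX NCCI_FactSales ON dbo.FactSales;

-- ...perform the scheduled batch insert/update of dbo.FactSales here...

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales
ON dbo.FactSales (DateKey, ProductKey, StoreKey, SalesAmount, Quantity);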
Note
For detailed information on the xVelocity memory optimized column store
feature, see the following Microsoft article:
http://technet.microsoft.com/en-us/library/gg492088.aspx.
The benefits of Column Storage as documented in the link in the following tip include:
Index compression: Column Storage indexes are far smaller than their B-Tree
counterparts.
Parallelism: The query algorithms are built from the ground up for parallel
execution.
Optimized and smaller memory structures
From a storage perspective, the benefits of Column Storage are far less storage capacity
and performance being required to achieve the desired query performance. The
improvement in query performance ranges from 3X to 6X on average, up to 50X. See
http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-components-
postattachments/00-10-36-36-
43/SQL_5F00_Server_5F00_2012_5F00_Column_2D00_Store.pdf.
Tip
If you are using SAP BW with SQL Server 2012 (SP1 recommended, cumulative
update 2 minimum), then Column Storage is turned on by default (for SAP BW
7.0 and above) when certain support packs are applied. For detailed information
about using SQL Server 2012 Column Storage with SAP BW, refer to
http://scn.sap.com/docs/DOC-33129 and
http://blogs.msdn.com/b/saponsqlserver/.
Tip
No matter which availability choice you make, you need to plan for the storage
performance and capacity requirements of that choice. We will cover the details
of SQL Server availability design, including AlwaysOn Availability Groups and
Failover Cluster Instances, in Chapter 9.
Tip
We recommend the use of basic disks in Windows and the GPT (GUID Partition
Table) partition format for all SQL Server partitions. If you want to boot from a
GPT partition that's larger than 2TB, you can use the UEFI boot features of
vSphere 5.x.
We have now covered how to optimize storage performance for SQL Server and
Windows at the operating system level. Now we will look at how to optimize storage
performance with your virtual machine template and discuss the different configuration
options available to you. In this section, we cover different virtual machine hardware
versions, virtual IO controllers, types of virtual disk, and how to size and deploy your
virtual disks onto your storage devices. We also start to look further at IO
device queues and how they impact virtual machine storage performance.
Caution
Changing the storage controller type after Windows is installed will make the
disk and any other devices connected to the adapter inaccessible. Before you
change the controller type or add a new controller, make sure that Windows
contains the necessary drivers. On Windows, the driver must be installed and
configured as the boot driver. Changing the storage controller type can leave your
virtual machine in an unbootable state, and it may not be possible to recover
without restoring from backup.
Choosing a virtual storage controller with a higher queue depth will allow SQL Server
to issue more IOs concurrently through Windows and to the underlying storage devices
(virtual disks). By having more virtual disks (more drives or mount points), you
increase the amount of queues that SQL Server has access to. Balancing the number of
data files to drive letters, to virtual disks, and to adapters allows you to maximize the
IO efficiency of your database. This will reduce IO bottlenecks and lower latency.
Not all virtual disks will issue enough IOs to fill all of the available queue slots all of
the time. This is why the adapter queue depths are lower than the aggregate total number
of queues per device multiplied by the total number of devices per adapter. PVSCSI,
for example, supports up to 15 virtual disks per adapter, and each disk has a queue
depth of 64 by default. The
number of devices multiplied by their queue depth would be 960, even though the
adapter default queue depth is only 256.
Tip
To determine the number of IO operations queued to a particular drive or virtual
disk at any particular time, you can use Windows Performance Monitor to track
the Average Disk Queue Length for each device. You should be recording this
parameter as part of your SQL Server baseline and capacity planning to help you
properly design the storage for your virtualized SQL Server systems.
In most cases, the default queue depths are sufficient for even very high performance
SQL Server systems, especially when you are able to add up to four vSCSI adapters
and increase the number of virtual disks per adapter. With LSI Logic SAS, you have a
maximum of 32 queue slots per disk and a maximum of 128 queue slots per adapter.
Neither can be changed. In this case, your only option to scale IO concurrency is by
adding virtual controllers and adding virtual disks. This is a key consideration when
considering AlwaysOn Failover Cluster Instances, where LSI Logic SAS is the only
vSCSI adapter option.
With PVSCSI, you can modify the disk queue depth and the adapter queue depth from
their default settings. This is only required in very rare cases where your database
needs to issue very large amounts of IO in parallel (>1,000). To keep things
standardized and simple, we recommend leaving the default settings in your templates
and only modify them if absolutely necessary. This assumes your underlying disk
subsystems can support the parallelism required at low-enough latency.
Figure 6.11 shows an example of the registry entries configured to increase the
maximum adapter and virtual disk queue depths for a VMware PVSCSI adapter, as
documented in VMware KB 2053145.
Figure 6.11 PVSCSI advanced registry parameters.
Note
If you wish to use PVSCSI as your boot controller, you need to select it
when creating the virtual machine, and during the Windows installation you
need to mount the pvscsi-Windows2008.flp floppy disk image from the
ESXi vmimages folder. This means you will need to ensure that your
virtual machine is configured with a virtual floppy disk device. Information
on which versions of ESXi and Windows support PVSCSI as a boot device
can be found in VMware KB 1010398.
Caution
There have been issues with using the PVSCSI driver with Windows 2008 or
Windows 2008 R2 on versions of ESXi before 5.0 Update 1, as described in
VMware KB 2004578. If you are using VMware vSphere 5.0, we recommend
that for your SQL Server databases you upgrade to ESXi 5.0 Update 2 or later.
These problems are not relevant for ESXi 5.1 or 5.5.
If you choose not to adjust the queue depth, or are unable to adjust the queue depth of a
particular storage device or adapter, Windows will queue any additional IOs. Windows
will hold up to 255 IOs per device before issuing them to the adapter driver, regardless
of the device's underlying queue depth. By holding the IOs in the Windows OS before
issuing them to the adapter driver and the underlying storage, you will see increased IO
latency. To learn more about the Windows storage driver architecture (storport), we
recommend you read the article Using Storage Drivers for Storage Subsystem
Performance at Windows Dev Center [http://msdn.microsoft.com/en-
us/library/windows/hardware/dn567641 and http://msdn.microsoft.com/en-
us/library/windows/hardware/ff567565(v=vs.85).aspx].
Figure 6.12 shows the difference in IOPS and latency between PVSCSI, LSI Logic SAS,
and SATA AHCI. These tests were conducted using a single drive at a time on a single
VM. The VM was configured with two vCPUs and 8GB RAM. Each virtual disk was
placed on the same VMFS5 data store on top of a Fusion-io ioDrive2 1.2TB PCIe flash
card. IOMeter was used to drive the IO load and measure the results.
Caution
The IOMeter performance results included in this section were created only to
show the relative difference in performance capability between the different
virtual storage adapter types. Your results will be different. These tests did not
use real-world workload patterns and should not be relied upon for sizing or
capacity planning of your SQL Server databases. You should conduct your own
tests to validate your environment. See Chapters 10 and 11 for details of how to
validate and baseline your environment.
Note
Supported Clustering Configurations are covered in VMware KB 1037959
and the VMware Product Guide: Setup for Failover Clustering and
Microsoft Cluster Services (http://pubs.vmware.com/vsphere-
55/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-55-setup-
mscs.pdf).
Caution
Thin Provisioned VMDK growth operations on VMFS data stores generate
metadata updates. Each metadata update requires a lock for a brief period of time
on the VMFS data store. On some older storage arrays that do not support
VMware's API for Array Integration (VAAI) and where there is an excessive
number of Thin VMDKs or VMs per data store, this can cause SCSI reservation
conflicts, which may result in degraded performance (additional latency).
VMFS5 volumes newly created on arrays that support VAAI will use Atomic
Test and Set Locking (ATS) Only. ATS addresses the problems that used to be
caused by SCSI reservation conflicts. When selecting a storage array for use with
VMware vSphere 5.x and SQL Server, you should ensure it supports VAAI.
VMFS5 volumes that were upgraded from VMFS3 may fall back to using SCSI
reservations in certain cases. See VMware KB 1021976 and
http://blogs.vmware.com/vsphere/2012/05/vmfs-locking-uncovered.html.
The capacity savings from thin provisioning may well be enough to justify the
management overheads because you are able to purchase more on demand instead of up
front, and this could save a considerable amount of money. But you need to make sure
you can get the performance you need from the capacity that has been provisioned and is
used. Sizing for performance may necessitate that much more capacity be provisioned on the
backend storage devices and therefore diminishes any savings that may have been had
when saving capacity through thin provisioning.
Caution
Restoring files from backup or copying files between VMs that have Thin
Provisioned VMDKs will cause those disks to expand. Once the disks are
expanded, they do not shrink automatically when the files are deleted. Also, since
Windows 2008, if you do a Full Format on a Thin Provisioned VMDK, it will
cause the disk to inflate, as a full format will zero out each block. If you use Thin
Provisioned disks, you should select the quick format option when partitioning a
disk in Windows. We strongly recommend that you don't over-provision storage
resources to the point an out-of-space (OOS) condition could result from
unexpected VMDK growth. See VMware KB 1005418 and Microsoft KB
941961.
If you don't use Instant File Initialization, then SQL Server will zero out its data files
whenever they are created or extended. This will ensure you get optimal performance
from the data files regardless of the underlying virtual disk format. But this comes at the
cost of the time taken to zero out the file and the resulting impact in terms of storage IO
to the underlying storage. As previously discussed, using Instant File Initialization
allows SQL Server to act as part of Windows and not write a zero to a block before
data is written to it. In certain cases, there could be substantial storage efficiencies (IO
Performance and Capacity) by combining the use of Instant File Initialization, thin
provisioning, and SQL Server compression. This may be especially advantageous to
development and test environments. There can be a significant performance penalty if
you use a non-VAAI array without using SQL Instant File Initialization on Thin and
Thick Lazy Zero disks. VAAI allows the zeroing operation to be offloaded to the array
and performed more efficiently, thus saving vSphere resources for executing VMs. If
you use Thin Provisioned or Lazy Thick VMDKs without a VAAI-compatible array, the
entire zeroing operation has to be handled solely by vSphere.
If your SQL Server and environment meets the following requirements, you may want to
consider using Thin Provisioned VMDKs with Instant File Initialization and SQL Data
Compression:
The SQL Server workload will be largely read biased.
Performance from your storage during times that blocks are initially written to and
zeroed out is sufficient to meet your database SLAs.
Performance is sufficient from the capacity required when thin provisioning is
used.
You are not planning to use Transparent Data Encryption.
You wish to minimize the amount of storage provisioned up front and only
purchase and provision storage on demand.
When you are using Thick Provisioning Lazy Zero (the default), the VMDK's space is
allocated up front by vSphere, although as with thin provisioning, it is not zeroed out
until it's written to for the first time (or you select full format in Windows when
partitioning the disks). When you look at the data store, you may get a more accurate
view of free space and there may be less variance between provisioned space and
usage. The reason we say you may get a more accurate view of free space is that many
modern arrays will tell vSphere the storage is allocated or consumed but won't actually
do so until data is written to it, although it most likely will be reserved.
If you were considering using Thin or Thick Lazy Zero VMDKs for SQL Server, we
would recommend you choose the default of Thick Lazy Zero to minimize management
overheads. We would recommend using Thin where there are requirements that would
benefit from it and justify the management overheads. However, before you decide on
Thick Lazy Zero, you should consider Thick Eager Zero, which we cover next.
Using Thick Eager Zero Disks for SQL Server
The major difference between Thick Eager Zero and Thick Lazy Zero or thin
provisioning is when the blocks on the VMDK are zeroed out. As we've covered with
Lazy Zero and Thin VMDKs, blocks are zeroed on first write. With Eager Zero, the
blocks are zeroed when the VMDK is created as part of the provisioning process. This
means all blocks are pre-zeroed before Windows or SQL goes to access them. By doing
this, you are eliminating a first write penalty in the situations where that would occur.
This ensures there is no double write IO required to the VMDK after it is provisioned.
As you can imagine, it can take quite a bit longer to provision Thick Eager Zeroed
disks. Additionally, provisioning and zeroing out the blocks may impact the
performance of other VMs when using shared storage devices. The impact to your
environment will be dependent upon the type and utilization of your backend storage
devices. Some storage arrays will just throw away or ignore the zeros, and in these
cases, the provisioning operations will complete relatively quickly and have minimal
impact on performance.
In aggregate, over the life of a VMDK there is normally little difference in the amount of
IO generated when using Thin, Thick Lazy Zero, or Thick Eager Zero VMDKs. The
difference is all about the timing of when the IO is generated, either up front (in the case
of Eager Zero) or on demand (first write) with Thick Lazy Zero and Thin. Once a block
has been written to with Thick Lazy Zero or Thin, it has exactly the same performance
characteristics as if it were Eager Zeroed. However, with Eager Zero, even if a block is
never used, you have zeroed it out at the cost of a write IO operation.
Tip
When provisioning VMDKs for data files and transaction log files, we
recommend you size them to allow 20% free space, which allows for any
unforeseen required growth. There should be sufficient capacity for the predicted
workload over at least a three-to-six-month period. By right-sizing VMDKs and
holding data files and transaction log files for a reasonable period, you reduce
the management and administrative burden while at the same time optimize
overall performance and capacity consumption.
If you are proactively managing SQL Server data and transaction log files, and not using
Instant File Initialization, then the performance of your virtual machine will be the same
regardless of the virtual disk type you select. This is because SQL Server is zeroing out
the blocks first before they are used. If you enable IFI, then Eager Zero will give better
performance in terms of lower latency compared to Thick Lazy Zero or Thin, but only
when the block is first written to. All subsequent writes or access to the same block
will have exactly the same performance characteristics.
Although the aggregate amount of IO may be similar between the different virtual disk
options, Eager Zero generally provides the more predictable response times because
IOs will not be impacted by the additional write operation when data is written to a
new block. This predictability of IO response and generally lower latency is why Eager
Zero is required for the non-shared disks of a SQL Server Failover Cluster Instance.
Increased latency or poor IO performance can cause unnecessary cluster failovers
between nodes.
Tip
In the case of your backend storage devices supporting VAAI and the Write Same
primitive, the operation to zero out the blocks will have a minimal impact on
performance regardless of the timing of the operation, whether Eager Zero, Lazy
Zero, or Thin.
With the advent of VMware's VAAI and modern arrays that support it, the impact to the
environment of zeroing operations is reduced and therefore the performance impact of
using Eager Zero Thick disks is also reduced during initial provisioning. If you were
previously thinking of using Thick Lazy Zero VMDKs and you have a VAAI-capable
array that supports the Write Same primitive, we would recommend you use Thick
Eager Zero instead. This provides lower management overheads and optimal
performance. Regardless of whether you are using IFI or not, and in spite of the possible
overhead of having written zeros to a block that may not be used, we feel this is
justified for the decreased latency and increased predictability of IO responses that are
provided to SQL Server. This is especially important for business-critical production
databases. It is fine to use Thin or Thick Lazy Zero for your Windows OS disk, while
using Eager Zero Thick for your database drives (data files, Temp DB, and transaction
logs). When using SQL AlwaysOn Failover Cluster Instance, it is recommended that you
configure Windows OS disks as Eager Zero Thick; shared LUNs will in this case be
configured as physical RDMs.
Figure 6.13 shows a sample configuration of a virtual disk with the selection of Thick
Provision Eager Zeroed.
Figure 6.14 VMFS and RDM performance comparisons: IOPS vs. IO size.
Figure 6.14 illustrates the performance comparison between VMFS and RDM using a
random 50/50 mixed read-write workload pattern and the different IO sizes based on
data published at http://www.vmware.com/files/pdf/performance_char_vmfs_rdm.pdf.
Note
Although Virtual Mode RDMs (vRDMs) are included in Figure 6.14 for
performance comparison, they are not supported for use with Windows
2008 and above failover clustering or SQL Failover Cluster Instances.
Tip
There is a common myth that you can't perform vMotion operations when using
RDMs. This is not the case. VMs configured with RDMs support vMotion, but
only when the virtual machine is not using SCSI bus sharing. It is the SCSI bus
sharing required in a SQL Server AlwaysOn Failover Cluster Instance that
prevents the vMotion operations from being supported currently, not the fact the
VM is configured to use RDMs. See VMware KB 1005241.
Tip
We recommend the VMDK for the Windows OS C: drive, application binaries,
and Windows page file be hosted on a separate data store. Because the IO from
the Windows OS C: drive, application binaries, and page file should be minimal,
you may be able to host a number of them on a single data store, while keeping
data files and transaction log disks and their data stores separate. You should
take into account your availability requirements and risks, as the loss of access to
a single data store in this case could impact multiple SQL systems. Backup disks
can share the same IO controller as the OS, and we recommend they are
on their own VMDK and data store if their size and performance requirements
justify it.
The example in Figure 6.18 shows each VMDK mapped to a dedicated data store. This
layout is suitable for SQL systems that need extreme IO performance and scalability. It
allows IO to be spread across more storage devices, and each VMDK has access to the
maximum possible amount of parallel IO. The increased number of data stores and
therefore LUNs will limit the total number of VMs that can be supported per host. You
will have many more data stores to manage per VM, which will increase your
management overheads.
Tip
We recommend all non-shared disks of your SQL FCI be set to Independent
Persistent to ensure they are not impacted by accidental VM snapshot operations.
Any VM snapshot operations on these disks can cause unexpected cluster
failovers.
Tip
Because the transaction log is written to sequentially and not striped, it is
recommended that the VMDK or RDM be extended if necessary, rather than hot-
adding a disk and creating a new log file. In vSphere 5.x, a VMDK can be
expanded online without disruption up to a maximum of 2TB minus 512 bytes.
Tip
The maximum theoretical SQL 2012 database size is 524PB. The maximum data
file size is 16TB, and the maximum log file size is 2TB. For further maximums,
see http://technet.microsoft.com/en-us/library/ms143432.aspx.
Although having lots of 62TB virtual disks is unrealistic, having a few virtual disks >
2TB is possible and potentially desirable for large SQL Servers. You can use a single
virtual disk for your transaction logs (max 2TB per transaction log file), and you would
be able to use a single virtual disk for your backup drive. Both transaction logs and
backups are sequential in nature and could benefit from the capacity of a larger > 2TB
VMDK without the performance drawbacks that would be likely for data files. Your
underlying storage platform would need to support a VMFS data store of a LUN size big
enough to support all of these large VMDKs. You should also consider your restore
times when using large VMDKs. If you can't restore a large VMDK within your SLAs,
it is not a good choice. Just because you can use Jumbo VMDKs doesn't mean you
always should.
Caution
You can't extend virtual disks > 2TB online. You must shut down your virtual
machine first, and extend the virtual disk offline through the vSphere Web Client.
This is due to the disk needing to be in the GPT format. Once a virtual disk has
been extended to > 2TB, each time you need to extend it further, you must shut
down the VM. Alternatively, you can hot-add a new virtual disk to the VM online
at any time and the new virtual disk can be > 2TB. Jumbo VMDKs can only be
managed through the vSphere Web Client because the traditional VI Client
(VMware C# Client) only supports VMware vSphere 5.0 features. All newer
features are only available through the Web Client. We recommend you create all
SQL data file, Temp DB file, transaction log, and backup drives using the GPT
format.
VMFS Heap Size Considerations with Monster VMs and Jumbo VMDKs
ESXi 4.x and 5.x prior to 5.5 used a VMFS Heap value to control how much memory
was consumed to manage the VMFS file system and for open or active VMDK capacity
on a single ESXi host. This limit was not documented in the vSphere Maximums
product document, and by default with a 1MB block size on ESXi 5.0 GA, it would
limit a host to being able to open 8TB of total VMDKs before errors could occur. The
maximum on ESXi 5.0 GA was 25TB with a 1MB block size, which required adjusting
the advanced parameter VMFS3.MaxHeapSizeMB. This was later increased to 60TB
by default on ESXi 5.0 by applying the latest patches and in ESXi 5.1 Update 1. The
only downside of this was 640MB of RAM was consumed for the VMFS Heap.
Caution
For the vast majority of environments, you don't need to change the default VMFS
settings, and the information in this section should be considered carefully
alongside your knowledge and understanding of your particular environment,
circumstances, and requirements. This really is for when you're considering
virtualizing business-critical apps and Monster VMs with very large storage
footprints.
In vSphere 5.5, the whole VMFS Heap size problem has been addressed. The VMFS
Heap is now irrelevant as a measure of how much open and active VMDK capacity a
single ESXi 5.5 host can handle. This is due to major improvements in the way the
VMFS Heap and pointer blocks are managed.
VMFS pointer blocks are a pointer to a VMFS block on disk. When a VMDK is opened
on an ESXi 5.5 host, all of the VMFS pointer blocks are cached in the Pointer Block
Cache, which is not part of the main VMFS Heap (where the pointer blocks were
previously stored in prior versions of ESXi). This allows the open VMFS pointer
blocks to be addressed or accessed and managed as fast as possible without having to
access metadata from the VMFS file system directly. The pointer blocks will remain in
use so long as a VMDK or other file is open. However, many blocks in any individual
VMDK are not often active. It's usually only a percentage of the blocks that are actively
used (say, 20%). The images shown in Figures 6.21 and 6.22 display how the pointer
blocks are used to refer to data blocks on the VMFS file system. Each pointer block that
is active is stored in the pointer block cache to ensure the fastest possible access to the
most frequently used blocks.
Figure 6.21 VMFS pointer block indirection: memory address mapping to physical
VMFS blocks. *1
1 Used with permission from Cormac Hogan (http://cormachogan.com/2013/11/15/vsphere-5-5storage-
enhancements-part-2-vmfs-heap/).
Figure 6.22 VMFS pointer block double indirection. Used for mapping very large
VMFS data sets.*
Pointer Block Eviction Process
This is where the new Pointer Block Eviction Process introduced in ESXi 5.5 comes in.
If the number of open and active VMFS blocks reaches 80% of the capacity of the
Pointer Block Cache, a Pointer Block Eviction Process will commence. This basically
means the pointer blocks that are not active, or least active, will be evicted from
memory and only the active blocks will remain in the cache. This new process greatly
reduces the amount of ESXi host memory consumed to manage VMFS file systems and
the open VMDKs capacity per host. The VMFS Heap itself in ESXi 5.5 consumes
256MB of host RAM (down from 640MB), and the Pointer Block Cache by default
consumes 128MB of host RAM. You no longer have to worry about adjusting the size of
the VMFS Heap at all. A new advanced parameter has been introduced to control the
size of the Pointer Block Cache, MaxAddressableSpaceTB.
As with all advanced parameters, you should not change MaxAddressableSpaceTB
without a good justification, and in most cases, it will not be necessary.
MaxAddressableSpaceTB by default is set to 32, with a maximum of 128. This controls
the amount of host RAM the Pointer Block Cache consumes. With the default setting at
32, it will consume 128MB of host RAM (as mentioned previously), and with the
maximum setting of 128, it will consume 512MB of host RAM. However, it's important
to note that this does not limit the capacity of open VMDKs on the ESXi 5.5 Host, just
how many of the pointer blocks can stay cached in RAM. If only 20% of all VMDK
blocks are active, you could conceivably have 640TB or more of open
VMDK capacity on the host, while still having the active pointer blocks cached without
much, if any, performance penalty.
The way this new Pointer Block Eviction Process works gives you a sense of having an
almost unlimited amount of open VMDK capacity per ESXi 5.5 host. But it's not quite
unlimited; there is a tradeoff as the amount of active VMDK capacity on an ESXi 5.5
host increases. The tradeoff is possible Pointer Block Cache Thrashing, which may
impact performance.
With the default setting of MaxAddressableSpaceTB=32, the Pointer Block Eviction
Process won't kick in until the amount of open VMDKs exceeds 25.6TB. So if you
aren't expecting the VMs on your hosts to routinely exceed 25TB of open and active
VMDK blocks, there is probably no need to even look at adjusting
MaxAddressableSpaceTB; this saves you some host RAM that can be used for other
things. In most cases, you would only have to adjust MaxAddressableSpaceTB if the
active part of all open VMDKs on a host exceeds 25TB. If active VMDK blocks exceed
the capacity of the Pointer Block Cache, then thrashing could result from constantly
evicting and reloading pointer blocks, which may have a performance penalty.
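As a rough back-of-the-envelope sketch of the relationship just described, the following Python snippet estimates where a given MaxAddressableSpaceTB setting leaves you. The 4MB-of-RAM-per-TB ratio is inferred from the two data points above (a setting of 32 consumes 128MB, a setting of 128 consumes 512MB), the 80% eviction trigger comes from the earlier description, and the 20% active-block figure is purely an illustrative assumption.
def pointer_block_cache_estimate(open_vmdk_tb, active_fraction=0.20,
                                 max_addressable_space_tb=32):
    # RAM consumed by the Pointer Block Cache (inferred: 32 -> 128MB, 128 -> 512MB)
    cache_ram_mb = max_addressable_space_tb * 4
    # Eviction starts at roughly 80% of the addressable capacity
    eviction_threshold_tb = 0.8 * max_addressable_space_tb
    # Estimated actively referenced VMDK capacity
    active_tb = open_vmdk_tb * active_fraction
    review_setting = active_tb > eviction_threshold_tb
    return cache_ram_mb, eviction_threshold_tb, active_tb, review_setting

# 100TB of open VMDKs with ~20% of blocks active, against the default setting of 32:
print(pointer_block_cache_estimate(100))                      # (128, 25.6, 20.0, False)
# The same host with a 50% active working set exceeds the default threshold:
print(pointer_block_cache_estimate(100, active_fraction=0.5)) # (128, 25.6, 50.0, True)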
You will see signs of Pointer Block Eviction in the VMKernel logs on your hosts if it is
occurring. Syslog, vCenter Log Insight, or Splunk will help you spot this type of
activity. If you start to notice any sort of performance impact, such as additional storage
latency visible in KAVG in ESXTOP, and a correlation to Pointer Block Eviction, then
that would be a sign you should consider adjusting MaxAddressableSpaceTB. If you're
planning to have 100TB of open VMDKs per host routinely, as in the case of large SQL
Servers, we recommend setting MaxAddressableSpaceTB = 64 and adjusting upwards
if necessary. If you're not concerned about the amount of RAM the Pointer Block Cache
will consume, you could consider setting it to the maximum of 128.
Increasing MaxAddressableSpaceTB may consume host RAM unnecessarily and so
should be considered along with the total RAM per host and the RAM that is likely to
be consumed by all VMs. 512MB of RAM consumed for Pointer Block Cache on a host
with 512GB of RAM or more is not significant enough to worry about, but could be
worth considering carefully if your hosts only have 32GB of RAM.
Caution
Any time you change an advanced parameter in vSphere, it's something that has to
be managed and considered when you are changing your environment. To keep it
standardized and simple (Principle 5), you should avoid changing advanced
parameters if possible.
vSphere Storage Design for Maximum SQL Performance
We have so far covered SQL Server VM storage architecture from the database down to
the data store. We are now ready to dive into VMware vSphere storage design and
physical storage design to achieve maximum performance. This section will build on
what we've covered already and help you to design an underlying storage architecture
that supports your high-performance SQL Server systems on top of it. We will cover the
impacts of number of data stores, data store queues, storage performance quality of
service (QoS), storage device multipathing, RAID, and storage array features such as
auto-tiering.
Caution
Be aware that if your storage is under-configured or already overloaded,
increasing the queue depths won't help you. You need to be aware of any queue
depth limits on your storage array and processor ports and make sure that you
don't exceed them. If you overload a traditional storage processor and get a
QFULL SCSI sense code, the storage controller (HBA) will drop the queue depth
to 1 and slowly increase it over time. Your performance during this period will
suffer significantly (like falling off a cliff). We recommend that you consult with
your storage team, storage vendor, and storage documentation to find out the
relevant limits for your storage system before changing any queue depths. This
will help avoid any possible negative performance consequences that would
result from overloading your storage. Some storage arrays have a global queue
per storage port, and some have a queue per LUN. Whether your storage is Fibre
Channel, FCoE, or iSCSI, you need to understand the limits before you make any
changes.
Tip
The default queue depth on a QLogic HBA changed from 32 to 64 between vSphere 4.x
and 5.x. The Emulex queue depth is still 32 by default (two reserved, leaving 30 for
IO), and Brocade's is 32. If you didn't know this and simply upgraded, you could
suffer some overloading on your backend storage processors. If your array is
supporting vSphere hosts and non-vSphere hosts on the same storage processors,
it is possible in some cases for the vSphere hosts to impact the performance of
other systems connected to the same array. For more information and instructions
on how to modify your HBA queue depth, see VMware KB 1267 and
http://longwhiteclouds.com/2013/04/25/important-default-hba-device-queue-
depth-changes-between-vsphere-versions/.
In Table 6.9, where the data store maximum number of VMs per host is 1, the maximum
VMs on a given data store is effectively the maximum number of hosts that can be
supported in a cluster. To increase the aggregate amount of active IOs per VM, you need
to increase the number of LUNs and ensure VMs sharing those LUNs are split across
hosts.
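To make the point concrete, here is a simplified (and deliberately idealized) Python model of the aggregate outstanding IO available to a VM. The device queue depth of 64 borrows the PVSCSI starting point suggested later in this section; the function itself is just an illustration, not a vSphere formula.
def aggregate_outstanding_io_per_vm(num_luns, dqlen=64, vms_per_host_per_lun=1):
    # Each LUN presents one device queue of depth dqlen per host; VMs on the
    # same host that share the LUN also share that queue.
    return num_luns * (dqlen / vms_per_host_per_lun)

print(aggregate_outstanding_io_per_vm(1))                          # 64.0
print(aggregate_outstanding_io_per_vm(4))                          # 256.0
# Two VMs on the same host sharing each of the 4 LUNs halves each VM's share:
print(aggregate_outstanding_io_per_vm(4, vms_per_host_per_lun=2))  # 128.0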
Tip
You can set the per-device number of requests outstanding in vSphere 5.5 by
using the command
esxcli storage core device set -d naa.xxx --sched-num-req-outstanding=<value>
where naa.xxx is the device name and <value> is a value from 1 to 256.
To list the storage devices on the system, use the following command:
esxcli storage core device list
By specifying the -d naa.xxx option, you can confirm the setting has been changed
as you expected. Also see VMware KB 1268 for further information.
Figure 6.23 shows the different queues at each level of the vSphere storage architecture.
The two values that are usually worth monitoring as a vSphere admin are the AQLEN
and the DQLEN. DQLEN can be adjusted up or down, depending on your requirements.
For high-IO SQL Server systems where PVSCSI is used on VMDKs, we suggest you set
the DQLEN to 64 as a starting point, while taking into account our previous
recommendations when modifying queue depths.
Figure 6.23 VMware vSphere storage queues.
Caution
If you have an under-configured storage array and insufficient individual spindles
or disks to service the aggregate IO requirements, then increasing the queue depth
will not improve performance. On an under-configured array, increasing queue
depth will just result in the queues becoming full and increased IO latency or
service times. Virtualization doesn't get around the laws of physics; you may
need more disks. Our goal is to ensure the path from the guest to the underlying
storage is not the bottleneck in software, so that you can get the most out of your
physical storage investments and get the highest performance possible.
Note
If you are presenting RDMs to a VM, such as with SQL FCI, and using the
LSI Logic SAS adapter, there is little benefit in increasing the queue depth
to 64. Windows will only be able to issue 32 outstanding IOs before it
starts queuing, and you'll never be able to make use of the additional queue
depth. If you will be using a large number of RDMs on your hosts, see
VMware KB 1016106.
Figure 6.24 shows the different areas where storage IO latency is measured and the
relevant values inside vSphere. DAVG, which is the device latency, will indicate if you
have a bottleneck in your storage array, which may mean you need to add more disks or
reduce the load on that device. If you start to see KAVG constantly above 0.1ms, this
means the vSphere kernel is queuing IOs and you may need to increase device queue
depth, especially if the DAVG is still reasonable (< 10ms).
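The following small Python helper captures those two rules of thumb (DAVG above roughly 10ms, KAVG consistently above roughly 0.1ms). It is only a triage aid for reading ESXTOP output, not an official VMware threshold set.
def triage_storage_latency(davg_ms, kavg_ms):
    findings = []
    if davg_ms > 10:
        findings.append("High DAVG: likely array/device bottleneck; add disks or reduce load")
    if kavg_ms > 0.1:
        if davg_ms <= 10:
            findings.append("KAVG > 0.1ms with reasonable DAVG: consider a larger device queue depth")
        else:
            findings.append("KAVG > 0.1ms: vSphere kernel is queuing IOs")
    return findings or ["Latency within the rules of thumb"]

print(triage_storage_latency(davg_ms=4.0, kavg_ms=0.5))    # queue depth candidate
print(triage_storage_latency(davg_ms=25.0, kavg_ms=0.05))  # array bottleneck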
Tip
When consolidating multiple SQL servers onto fewer hosts, there is usually an
implicit assumption that SQL was not previously making full or optimal use of all
of the system resources. This includes CPU and RAM, but also storage IO and
HBA queues. It's your job as the architect or admin of the environment to ensure
your destination vSphere platform and each host has in aggregate sufficient
resources to service the blended peak IO workloads of all of the databases on a
single host. Once you know what the likely blended peak is, you can design your
host platforms accordingly.
Every millisecond of storage IO latency is potentially a millisecond that SQL
can't respond to an application request. Michael Webster
Note
Prior to the introduction of VMware APIs for Array Integration (VAAI) and
VMFS5, VMware used to recommend that no more than 25 VMDKs be
hosted on a single data store. This no longer applies if you have a VAAI-
capable array and a freshly created (rather than upgraded from VMFS3)
VMFS5 data store. It's unlikely you would want to go as high as this for
your SQL servers for production, but it might be applicable for Dev and
Test.
Tip
To ensure that two VMs that are sharing the same data store do not reside on the
same vSphere host, you can use vSphere DRS Rules to keep the VMs separated.
This will reduce the chance of queue contention between the two SQL servers
that might occur if they were on the same host. Having too many DRS Rules can
impact the effectiveness of vSphere DRS and increase management complexity,
so their use should be kept to a minimum. If you get your performance calculations
slightly wrong and you discover one of the VMDKs is busier than you expected,
you could easily migrate it to another data store using Storage vMotion. This can
be done online and is nondisruptive to SQL. Some additional IO latency may be
seen during the migration process.
Note
VMware DRS does not take active queue depth or SIOC into account when
considering compute-based load-balancing operations at this stage.
Caution
Because SIOC works only on data stores hosting multiple VMs, any data store
where a single VM resides will have full access to all of the queue depth.
Usually this would be less than the aggregate queue depth used across multiple
hosts to a given LUN. In some cases, this could cause a performance impact, such
as when all data stores share the same RAID groups or disk groups on the array.
We recommend you enable SIOC as a standard on all of your data stores when using
traditional block-based storage arrays, regardless of whether or not they are hosting
more than one VM. This will ensure if things change in the future you know that your
VMs will always receive their fair share of the storage IO performance resources
available. If you have an auto-tiering array, we would recommend using the traditional
default values of 30ms for the static latency threshold and not using the injector with
vSphere 5.5.
Tip
We recommend you enable SIOC as a standard on all of your data stores,
regardless of whether or not they are hosting more than one VM.
Note
The recommendations to use SIOC assume traditional block-based shared
storage architecture is being used. Some modern storage systems don't
suffer from the problems that caused a need to have SIOC in the first place,
and therefore there is no need to use SIOC on these systems. An example is
the Nutanix Virtual Computing Platform, where data access is localized per
host, although it provides a distributed shared storage environment. In this
case, disk shares on each host ensure fairness of IO performance. The
Nutanix platform doesn't suffer from the problems that SIOC addresses,
and therefore SIOC is not required.
Figure 6.31 shows the vSphere 5.5 Storage IO Control Settings dialog box. By setting
SIOC to Manual, you effectively disable the injector, which is the preferred setting
when using auto-tiering arrays, or storage platforms where the injector is likely to get
inaccurate data.
Figure 6.31 vSphere 5.5 Storage IO Control settings.
Tip
In vSphere 5.5, you assign tags to data stores and then use those tags to create
storage policies. This is much like using hash tags on social media. They can
easily be searched on afterward and queried or manipulated using orchestration
and scripting (such as PowerCLI).
By pooling multiple (up to 32) similar data stores into data store clusters and using
Storage DRS, you can ensure that initial placement of virtual disks to the best data store
is automated, and this reduces the number of individual elements you need to actively
manage. Storage DRS can be configured to load balance based on capacity, IO
performance, or both, and can be set to simply make recommendations (manual) or be
fully automated. If your array does not include automated storage block tiering, you can
use Storage DRS to load balance data stores for IO performance, in addition to simply
load balancing for capacity. When IO Load Balancing is enabled, Storage DRS works
cooperatively with Storage IO Control and will collect IO metrics from the data stores
and use the IO injector to determine performance capabilities. The data is then
analyzed periodically (by default, every 8 hours) to make IO load-balancing decisions.
Importantly, the cost of any storage migrations is taken into consideration when making
IO load-balancing decisions. Load balancing based on capacity or IO is achieved by
performing Storage vMotion migrations between the source and destination data stores
within a data store cluster.
Tip
If you wish to perform data store maintenance for any reason or migrate between
arrays, you can put one or more data stores of a data store cluster into
maintenance mode. This will enforce the evacuation of all virtual disks and files
on the data stores going into maintenance mode into the remaining data stores that
make up the data store cluster. Storage DRS will distribute the load and make
sure that your load-balancing policies are adhered to.
The example shown in Figure 6.33 is of the standard storage DRS options, including the
Storage DRS Automation Level, configured for Fully Automated, and the I/O metrics
settings, which are disabled. You may wish to set Storage DRS to No Automation
(Manual Mode) for a period of time during operational verification testing or if you are
unfamiliar with Storage DRS and data store clusters, until you are familiar and
comfortable with the recommendations it makes.
Caution
Care should be taken when implementing Storage DRS on backend storage that is
thin provisioned if it doesn't include data de-duplication capabilities. Traditional
thin provisioned backend storage capacity could become full if a storage
migration takes place between one thin provisioned data store and another if the
space is not reclaimed. Because the IO injector is used to determine performance
capabilities when IO metric collection is enabled, it should not be used with
auto-tiering arrays because the data it gathers will be inaccurate and your array is
already managing the performance of each LUN. In the case of auto-tiering arrays,
you should only use Storage DRS for initial placement and capacity-based load
balancing.
The example in Figure 6.34 shows the Storage DRS Advanced Options expanded. Here,
you can set whether to keep VMDKs together by default and other settings. These
parameters will influence how much of an imbalance there needs to be before Storage
DRS will consider taking action. The most relevant settings for SQL Server are Keep
VMDKs together by default and the advanced option shown in this figure,
IgnoreAffinityRulesForMaintenance.
Figure 6.34 vSphere Storage DRS advanced options.
The default option for Storage DRS will keep all VMDKs from a VM on the same data
store. For a high-performance database, this is not what you would want. You will want
to leverage the available data stores and queue depth to get the best performance while
Storage IO Control sorts out any bumps in the road and ensures quality of service. Our
recommendation for SQL Server environments is to have Keep VMDKs Together
unchecked. This will cause Storage DRS to spread out the VMDKs among the available
data stores. If you have large numbers of SQL Servers, it may be preferable to run them
in a dedicated data store cluster, because this could limit the impact they have on other
workloads, and vice versa.
If at a later stage you want to add data store performance as well as capacity, you can
simply add more data stores to the data store cluster and they will be used for load-
balancing operations per VMDK as well as during initial placement. Separating the
VMDKs among the data stores will ensure quality of service and access to performance
of all the databases added to the data store cluster while making administration and
management significantly easier. We would recommend you leave the
IgnoreAffinityRulesForMaintenance advanced setting at 0, unless you are willing to
compromise your affinity rules and performance during data store maintenance
operations.
In Figure 6.35, we have combined storage policies with multiple data store clusters.
With the different virtual disks of each VM configured with a storage policy based on
the required capabilities, the storage policy then maps to a particular data store cluster.
Whenever a new VM is provisioned, its virtual disks will be provisioned in the correct
data store cluster. The advantage of this method is that you can have the different
VMDKs of a VM on a different class of storage, for example, where you want backup
on a lower tier, or the OS on a lower tier, while the database files and transaction log
files are on a higher tier.
The sample diagram in Figure 6.36 shows multiple SQL Server VMs entirely within a
single data store cluster, which would be backed by a single class of storage or single
physical storage policy. Each VM's individual VMDKs would be split among the data
stores of the data store cluster. Storage Policies on each VM would dictate which data
store cluster the SQL Server is assigned to, but an individual VM is not split between
multiple data store clusters, as was the case in Figure 6.35. This is the recommended
approach in environments that support automated block tiering at the storage array.
Tip
When you are defining your SQL Server Database Service Catalog, group your
VMs not only by CPU and Memory requirements but also Storage IO
requirements. The Storage IO requirements can then drive your storage policy
design, at both the vSphere and physical storage layer, and the relevant data store
clusters that need to be supported.
Note
For vSphere 5.1 and below, VMW_PSP_FIXED and VMW_PSP_MRU are
the only valid options when using SQL AlwaysOn Failover Clustering.
When using SQL AlwaysOn Availability Groups, you are free to choose
any path selection policy you like because it does not require shared disk
failover clustering or a shared SCSI bus configuration. vSphere 5.5
introduced support for VMW_PSP_RR for SQL AlwaysOn Failover
Clustering.
Tip
For a full list of all the supported storage devices and their associated default
SATP and PSPs, refer to the VMware Storage Compatibility Guide
(https://www.vmware.com/resources/compatibility/san_reports.php?
deviceCategory=san). For further information on Path Selection Policies, see
VMware KB 1011340.
VMware has designed vSphere's storage multipathing to be flexible and to allow
storage vendors to write their own multipathing plugins. The advantage of many of the
third-party vSphere multipathing plugins, such as EMC's PowerPath/VE, is that they use
target-side load balancing. This is where the load on the storage array's paths, storage
processors, and individual queue depths may be taken into consideration when choosing
the best path for a particular IO operation. This can produce greatly improved
performance and lower latency. Many vendors offer their own plugins, so you should
check with your storage vendor to see if they have a plugin and what advantages it might
have for your environment. Most of these plugins come at an additional cost, but in our
experience it can usually be justified based on the additional performance.
Tip
When using iSCSI-based storage and the Software iSCSI initiator, ensure that you
configure the iSCSI Port Binding in vSphere correctly so that you can get the best
performance and reliability from your storage. Refer to the VMware Multipath
Configuration for Software iSCSI Using Port Binding white paper
(http://www.vmware.com/files/pdf/techpaper/vmware-multipathing-
configuration-software-iSCSI-port-binding.pdf).
The VMware vSphere Native Multipathing modules eliminate a lot of the problems and
complications traditionally associated with in-guest multipathing drivers. To simplify
your environment further, you could choose to put your VMDKs onto NFS data stores
mounted to vSphere. When using NFS, your load balancing will most likely be done on
the array, or by using the correct network teaming. NFS as a data store instead of VMFS
is a great solution, provided it is designed and deployed correctly to meet the
performance needs of your SQL Servers. The protocol itself will not be your limiting
factor for performance, especially on 10Gb Ethernet. Whichever storage option or
protocol you choose, you just need to design it to meet your performance requirements
and verify through testing that it does. There are many situations where NFS could be a
valid option, and some of the benefits are covered in the section "SQL Server on
Hyperconverged Infrastructure."
Note
For full details of VMware's Windows Failover Clustering support, refer
to VMware KB 1037959. When using large numbers of RDMs for failover
clusters, you may need to perennially reserve them to ensure fast host
storage rescans and boot times; refer to VMware KB 1016106. For further
details on vSphere 5.5 clustering enhancements, see VMware KB 2052238.
Read/Write Bias
Just because your applications drive SQL to generate a read-biased workload doesn't
mean the underlying storage system will see a read-biased IO pattern. The reason for
this is the SQL buffer pool is likely to mask a lot of read IO if you have sized your VM
correctly. This will mean your IO patterns may be very write biased. Writes will be
going to your data files, Temp DB files, and your transaction log all at the same time.
You will need to make sure you have sufficient array write cache so you don't get into a
position where a forced flush occurs and the cache subsequently goes write through,
which will significantly degrade performance. You must have sufficient numbers of
disks in the array to handle the cache flushes easily.
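A small Python example makes this effect visible. The 70/30 application read/write mix and the 90% buffer cache hit ratio are illustrative assumptions only, not measurements from any particular system.
def storage_io_mix(app_read_fraction, buffer_cache_hit_ratio):
    # Reads served from the buffer pool never reach the storage layer,
    # whereas every write must eventually be persisted.
    reads_to_storage = app_read_fraction * (1 - buffer_cache_hit_ratio)
    writes_to_storage = 1 - app_read_fraction
    total = reads_to_storage + writes_to_storage
    return reads_to_storage / total, writes_to_storage / total

read_pct, write_pct = storage_io_mix(app_read_fraction=0.70, buffer_cache_hit_ratio=0.90)
print(f"Storage sees roughly {read_pct:.0%} reads / {write_pct:.0%} writes")
# Roughly 19% reads / 81% writes, even though the application issued 70% reads.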
Caution
Be very careful when using 7.2K RPM SATA or NL-SAS disks on a traditional
RAID array, even with automated storage tiering. Overloading a SATA or NL-
SAS LUN can cause forced flush and significant periods of array cache write
through (instead of the friendly cache write back), to the point where the storage
processors may appear to freeze. Also, you may find LUNs being trespassed on
active/passive arrays, or just lots of path flip flops on active/active arrays. With
modern storage systems, including SSDs to host the active working set data and
acting as a second cache area, the chances of forced flushes may be reduced. But
you will need to ensure that your active working set doesn't increase to the point
where it overflows the caches and SSD and causes writes directly to slow tiers.
Note
Some modern storage systems have done away with using RAID because of
the performance impact and risks introduced during disk rebuild
operations. If you are using a storage platform that has a different data
protection mechanism, it's important that you understand how it works. The
advantages can be significantly higher performance during failure,
significantly faster recovery from failure, and greater predictability.
RAID Penalties
Random IO patterns, read/write bias, and failure events have a big impact on
performance due to the overheads and penalties for read and write operations
associated with using RAID. This is especially so with spinning disks. Storage array
vendors have come up with many ways to try and work around some of the limitations
with RAID, including the smart use of read and write caches. In your storage design,
though, we recommend you plan your performance based on the physical characteristics
of the underlying disks and plan for the rest to be a bonus. Table 6.10 displays the IO
penalties during normal operations for each of the common RAID schemes.
Note
The basis for the IOPS calculations in Table 6.10 is the rotational latency
and average seek time of each disk. These will be different depending on
the latency characteristics of different manufacturers' disks. This would
also not apply for solid state disks and PCIe NAND flash devices. For
further information about IOPS, see http://en.wikipedia.org/wiki/IOPS and
http://www.symantec.com/connect/articles/getting-hang-iops-v13.
As you can see from Table 6.10, if you have a very write-biased workload, you could
get very low effective IOPS from your RAID disks. This is the primary reason why
arrays have write cache, and in some cases lots of it. This allows the array to offset
much of the penalty associated with writes to RAID groups of disks. But the arrays
assume there will be some quiet time in order to flush the cache; otherwise, there will
be an impact to performance. The calculation for write IOPS is as follows:
Write IOPS = (IOPS per Disk * Number of Disks) / RAID Write Penalty
However, this only works when things are going well. If you fill your cache by having
too much write IO on slow spindles, or just from general overloading, your array will
stop caching writes and bypass the cache altogether (go write through). In this case,
youll get at best the raw performance of the RAID groups. This problem can be made
worse when there is a disk failure and a group of RAID disks needs to be rebuilt.
Depending on the type of disks, this can take many hours and severely impact
performance during the rebuild operation.
Let's take the RAID penalties a bit further and look at an example where we are sizing
for performance. In this example, we will look at the requirements of a SQL data store
that needs to be able to deliver 5,000 IOPS. We will assume that the workload is 70%
read and 30% write, which is typical for some OLTP systems.
First, we need to calculate the effective number of IOPS required. This takes the 5,000
IOPS of a 70/30 read/write workload and adjusts for the RAID penalty as follows:
Required Array IOPS =
(Required IOPS * Read %) + RAID Write Penalty * (Required IOPS * Write %)
You can see from the example in Table 6.11 that to achieve 5,000 IOPS for a 70% read-
biased SQL workload, we need 9,500 IOPS at RAID 5 from the array. Now that we
know the required array IOPS, we can calculate the number of disks required to achieve
this performance at each of the RAID levels. To do this, we divide the number of IOPS
by the number of IOPS per disk. RAID penalties have already been taken into
consideration due to the previous calculations.
Table 6.11 Array IOPS Required at Different RAID Levels to Achieve 5,000 SQL
IOPS
To calculate the number of disks required to meet the required IOPS of a workload, we
use the following formula:
Required Disks for Required RAID IOPS = Required Array IOPS / IOPS per Disk
Example RAID 5 Disks = 9500 Array IOPS / 210 IOPS per 15K Disk = 45 Disks
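As a sketch of this arithmetic, the following Python snippet reproduces the example's numbers. The write penalties (RAID 5 = 4, RAID 1/10 = 2, RAID 6 = 6) and the per-disk IOPS figures (roughly 210 for a 15K RPM disk and 100 for a 7.2K RPM disk) are the assumptions used in this example; substitute your own vendor's figures before doing any real sizing.
def required_array_iops(required_iops, read_fraction, write_penalty):
    write_fraction = 1 - read_fraction
    return round((required_iops * read_fraction)
                 + write_penalty * (required_iops * write_fraction), 1)

def disks_for_iops(required_iops, read_fraction, write_penalty, iops_per_disk):
    return required_array_iops(required_iops, read_fraction, write_penalty) / iops_per_disk

print(required_array_iops(5000, 0.70, 4))          # 9500.0 array IOPS at RAID 5
print(round(disks_for_iops(5000, 0.70, 4, 210)))   # ~45 disks on 15K RPM at RAID 5
print(round(disks_for_iops(5000, 0.70, 2, 210)))   # ~31 disks at RAID 1/10
print(round(disks_for_iops(5000, 0.30, 4, 210)))   # ~74 disks at RAID 5, 30% read
print(round(disks_for_iops(5000, 0.30, 6, 100)))   # ~225 disks at RAID 6 on 7.2K RPM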
As Table 6.12 demonstrates, to achieve 5,000 SQL IOPS 70% read at RAID 5 on 15K
RPM disks requires 45 disks, whereas it only requires 31 disks at RAID 1, RAID 10, or
RAID DP, a saving of 14 disks. If the workload is only 30% read, then we would
require 74 15K RPM disks at RAID 5 and only 40 15K RPM disks at RAID 1, RAID
10, or RAID DP. This would be a saving of 34 disks to achieve the same performance.
This assumes each disk can achieve the high end of the IOPS for that device. The fewer
IOPS each disk can deliver, the more disks will be needed in total. In this example, we've
used the high-end IOPS of each disk for the calculations. Be sure to check with your
storage vendor on their recommendations for IOPS per disk when doing any
calculations.
Table 6.12 Min Disks Required at Different RAID Levels to Achieve 5,000 SQL IOPS
To achieve 5,000 IOPS at RAID 6 70% read on 7.2K RPM disks, we'd need 125 disks
in total. At RAID 10 on 7.2K RPM disks, the required disk count falls to 65, a saving of 60
disks. The difference is even more pronounced when the workload is only 30% read. At
RAID 6, we would require 225 disks, whereas at RAID 10, we would only require 85
disks, a saving of a whopping 140 disks.
Tip
RAID 6 is commonly used with SATA and NL-SAS disks because the chance of
a second drive failure during a rebuild operation is quite high. This is due to the
time it takes to rebuild a RAID group when using slow 7.2K RPM high-capacity
disks > 1TB.
Those of you who know RAID will be thinking at this point that some of the numbers in
Table 6.12 are wrong, and you'd be right. How do you get 31 disks in RAID 1 or 10, or
225 disks in RAID 6? The answer is, you don't. The numbers in Table 6.12 have not
been adjusted for the minimum required for a complete RAID group, or the likely size of
each RAID group that would be required to make up an entire aggregate or volume to be
created from. You would need to increase the numbers of disks to be able to build
complete RAID groups. For example, in RAID 5, it's common to build RAID groups
consisting of 7 data disks +1 parity disk (8 total), and in RAID 6, it is common to build
8+2 or 10+2 RAID groups. RAID5 7+1 or RAID6 10+2 may be terms you've heard
before when talking to storage administrators.
Now that we've adjusted the figures in Table 6.13 for the RAID groups, you can see that
RAID 1 and 10 are even more efficient than RAID 5 and 6 in terms of the number of
disks to achieve the same performance. This is important to understand because it also
has a direct impact on the amount of capacity that will be provisioned to reach the
desired performance level.
Table 6.13 Min Disks per RAID Group Adjusted to Achieve 5,000 SQL IOPS
For this part of the example, we'll imagine that our SQL database that needs 5,000 IOPS
will be 2TB in size. There will be an additional 200GB for transaction logs, 200GB for
Temp DB, and another 100GB for the OS, page file, and so on. In total, the capacity
required is approximately 2.5TB.
From Table 6.14, you can see the usable capacity after taking into consideration the
redundant or parity disks of the various RAID types needed to achieve 5,000 IOPS
based on the previous examples. The 2.5TB usable capacity requirement for our sample
SQL Server can easily be met by any of the selected RAID levels based on the number
of disks required to achieve 5,000 IOPS. In fact, all of the RAID levels provide a lot
more capacity than is actually required, some in the extreme.
Table 6.15 IOPS per TB Based on Example 30% Read Workload at 5000 IOPS
Tip
It is possible to achieve higher IOPS per disk by using only a small portion (say,
25%) of the disk's total capacity. This is known as short stroking or partial
stroking a disk. Because only the outer part of each platter is used, seek distances
are much shorter, and the outer tracks pass more sectors under the heads per
revolution, so you cover more data in less time. See
http://searchsolidstatestorage.techtarget.com/definition/Short-Stroking.
Table 6.17 EFDs at Different RAID Levels Required for Example SQL DB
Table 6.17 illustrates the number of EFDs required to meet both the performance and
capacity requirements of our sample SQL DB. In this example, the RAID 5 option is the
most cost effective from a performance and capacity perspective.
Comparing the number of 400GB EFDs required to meet the SQL requirements against
the most cost-effective option for spinning disks (Gold Policy RAID 10), we can see
that we need one fifth as many EFDs. For this workload, the eight EFDs may be the best
option if their combined cost is less than that of the 40 spinning disks. In many cases,
the EFDs will cost less, especially when the reduced space, power consumption, and cooling
of EFDs is considered.
Let's add a Platinum storage policy in addition to the previously defined policies and
calculate the effective IOPS per TB based on our 400GB EFD example.
With the new Platinum storage policy in Table 6.18, we can easily meet the
performance requirement of 5000 IOPS, but we need additional disks to meet the
capacity requirement. Table 6.15 shows us that we need eight EFDs at 400GB in order
to achieve the required 2.5TB. Based on provisioning 2.8TB of usable capacity, we can
calculate that our achievable IOPS from that capacity at a conservative 4000 IOPS per
TB at RAID5 with a write penalty of 4 is 11,200 IOPS. At this point, it's likely that we'd
run out of capacity well before running out of performance.
Table 6.18 IOPS per TB Based on Example 30% Read 5,000 IOPS and 2.5TB Capacity
Note
There are many new storage platforms that include only flash as part of
their architecture, meaning the entire array may become your primary tier.
Some of these platforms claim to offer economics similar to spinning disks,
by using advanced compression and data de-duplication techniques. These
platforms are normally aimed at the highest performance workloads, such
as critical SQL databases. These types of storage platforms are
unsurprisingly known as All Flash Arrays, and come from the likes of
EMC, NetApp, HP, PureStorage, Violin Memory, and others.
At this point, you might consider doubling the size of each EFD to 800GB. This would
halve the number of disks required to meet the capacity requirements. Assuming that
each individual 800GB EFD has the same IOPS performance as the 400GB versions,
you could achieve a better balance of performance and capacity. The larger EFDs
would have half the IOPS per TB, in this case around 2,000. Five EFDs would be
required to reach the required capacity. This would mean 3.2TB of usable capacity is
deployed. The achievable IOPS from the deployed usable capacity would drop to
6,400. This is still more performance than required. Also, although we are only using
five 800GB EFDs instead of eight 400GB EFDs, because they are double the capacity,
they are also likely to be double or more the cost.
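A short Python sketch of this trade-off follows. It assumes RAID 5 groups lose one disk's worth of capacity to parity and uses the example's IOPS-per-TB figures (4,000 for the 400GB EFDs, roughly half that for the 800GB EFDs), so treat it as an illustration rather than a sizing tool.
def efd_option(num_efds, efd_size_gb, iops_per_tb):
    usable_tb = (num_efds - 1) * efd_size_gb / 1000    # RAID 5: one disk of parity
    achievable_iops = usable_tb * iops_per_tb
    return usable_tb, achievable_iops

print(efd_option(8, 400, 4000))   # (2.8, 11200.0) - the 400GB option above
print(efd_option(5, 800, 2000))   # (3.2, 6400.0)  - the 800GB option above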
An EFD might be marketed at 400GB or 800GB in size, but to protect against wear of
the NAND flash cells, the disk will usually have more physical capacity. This is to
provide more endurance and a longer service life. This may vary between different
vendors and individual SSDs, and we recommend you check with your storage vendor.
Tip
EFDs and SSDs are dollars per GB but cents per IOP, whereas spinning disks
are cents per GB and dollars per IOP. In order to achieve the best balance, you
need some of each. This is why many types of storage array include automatic
storage tiering. Automatic storage tiering is most effective when done at the block
level because individual blocks can be moved between the EFD and spinning
disk storage as performance and capacity needs change. Where available, we
recommend you use automatic storage tiering and seek advice from your storage
vendor to ensure effective implementation and operations.
To make calculating performance and capacity based on different types of disk,
numbers of disks, and RAID types easy, see the calculator at
http://www.wmarow.com/strcalc/.
Note
There are many new types of enterprise storage systems and converged
architectures on the market today that have moved away from using RAID
as the main means of data protection and instead have their own methods.
Often these alternative methods can achieve the same reliability and data
protection levels as RAID, but without all of the complication and
performance penalties. If you are using a system that doesn't rely on RAID
for data protection, you can safely ignore this section. You should seek
advice from your vendor with regard to sizing for capacity and
performance based on their data protection methods and overheads.
Figure 6.38 Flash acceleration and lyrics from the classic Queen song "Flash Gordon."
But flash in an array has some limitations, and there is another location where we can
use flash SSDs, EFDs, and PCIe flash devices to greatly improve SQL performance: directly
in the VMware ESXi servers hosting SQL. This is where server-side flash and associated
acceleration solutions come in. Server-side flash when used as part of an IO
acceleration solution can be thought of as cheap memory, rather than expensive disk. It
is definitely cents per IOP and dollars per GB, but the returns on investment and
performance can be substantial. Especially when it is not possible to add more RAM to
the buffer cache, which would be the fastest possible storage from a performance
perspective.
By using server-side flash acceleration, you can normally consolidate more SQL VMs
per ESXi host, with less memory directly assigned to each SQL VM, and without
sacrificing performance and user response times. Read or write IOs are offloaded to the
local server flash device, and this acts as a very large cache. It can also greatly reduce
the load on the back-end storage, which allows the array to improve its efficiency.
Because the flash devices are local to the server, the latencies can be microseconds (us)
instead of milliseconds (ms) and eliminate some traffic that would normally have gone
over the storage network. By reducing the storage IO latencies, not only are user
response times improved, but overall server utilization is improved. You may see
increased CPU utilization, as you are able to get more useful work done by reducing
system bottlenecks.
In this section, we cover three different server-side flash acceleration solutions that are
supported with VMware vSphere and can greatly improve the performance of your SQL
databases. The solutions we cover are VMware vSphere Flash Read Cache (vFRC),
which is included with vSphere 5.5, Fusion-io ioTurbine (IOT), and PernixData Flash
Virtualization Platform (FVP). The first two solutions act as a read cache only, as all
writes go directly to the backend storage while being cached and are therefore write
through. PernixData FVP, on the other hand, offers a full write back cache, where both
read IO and write IO can be accelerated.
Tip
If in doubt about what your cache block size should be, start at 8KB. Having the
cache block size smaller than the actual IO size is better than having it oversized.
Your cache block size should evenly divide the predominant IO size to ensure
best performance and lowest latency. If your predominant IO size were 64KB,
then having a cache block size of 8KB or 16KB would be fine because it can
evenly divide the IO size.
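A trivial Python check expresses the rule from the tip above; the function name is ours, purely for illustration.
def cache_block_ok(block_kb, predominant_io_kb):
    # A suitable vFRC cache block size is no larger than the predominant IO
    # size and divides it evenly.
    return block_kb <= predominant_io_kb and predominant_io_kb % block_kb == 0

print(cache_block_ok(8, 64))     # True  - 8KB evenly divides 64KB IOs
print(cache_block_ok(16, 64))    # True
print(cache_block_ok(48, 64))    # False - does not divide the IO size evenly
print(cache_block_ok(128, 64))   # False - oversized relative to the IO size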
The cache size and block size are manually set when you enable vFRC on a VM, and
they can be changed at runtime without disruption. Having the cache too small will
cause increased cache misses, and having it too big is not just wasteful, it will impact
your vMotion times. By default, when vFRC is configured, the cache of a VM will be
migrated when the VM is vMotioned. If it's set too big, this will increase the vMotion
times and network bandwidth requirements. You can, if desired, select the cache to be
dropped during a vMotion, but this will have an impact on SQL performance when the
VM reaches its destination while the cache is being populated again.
Caution
Make sure a large enough flash resource exists on each server in your vSphere
cluster. If you have an insufficient vFRC resource on a server, you may not be
able to migrate or power on a VM.
Note
Performance tests conducted by VMware using the Dell DVD Store to
simulate an ecommerce site with vFRC showed up to a 39% performance
improvement with certain configurations. A number of statistics can be
useful for monitoring and tuning vFRC. For detailed information on vFRC,
performance test results from VMware, and vFRC stats, refer to
http://www.vmware.com/files/pdf/techpaper/vfrc-perf-vsphere55.pdf.
Fusion-io ioTurbine
ioTurbine is caching software from Fusion-io that leverages the Fusion-io ioMemory
range of high-performance flash devices, such as the SLC- and MLC-based ioDrive and
ioScale PCIe cards. ioTurbine creates a dynamic shared flash pool on each ESXi server
that can be divided up between cache-enabled VMs based on a proportional share
algorithm. By default, each VM is assigned the same shares and thus gets an equal
proportion of the available flash cache resource pool.
Like VMware's vFRC, ioTurbine is a read cache, and all writes are sent through to
persistent storage while simultaneously being cached. Unlike vFRC, there are no manual
parameters to set on a per-VM basis to size the cache or the blocks that are cached. This
automatic and dynamic sizing of the flash cache of each VM is useful where you have
lots of VMs that can benefit from caching or where you have flash devices of different
sizes on different hosts. It reduces the management overhead.
Figure 6.41 displays a high-level overview of the ioTurbine architecture, including
Fusion-io's Virtual Storage Layer (VSL) driver. As of ioTurbine 2.1.3, which supports
vSphere 5.5, the VSL SCSI driver is used by default instead of the VSL block driver.
This can provide improved performance and better resiliency.
Figure 6.41 ioTurbine architecture overview.
In addition to being able to cache a VM, ioTurbine is capable of caching disks, files,
and entire volumes. With the optional in-guest agent, the caching becomes data and
application aware. This means particular files within the OS can be cached while others
are filtered out. This is very useful for SQL where we only want the data files and Temp
DB files cached while the transaction logs are not cached.
ioTurbine is fully compatible with VMware features such as DRS, HA, and vMotion.
ioTurbine also works in environments where not all ESXi hosts contain a flash device,
in which case the flash cache of a server would be set to 0.
In the example in Figure 6.42, if one of the VMs in the left ESXi host is migrated to the
right ESXi host, all VMs will be allocated one third of the flash cache capacity of each
host because there will be three cached VMs on each host.
Figure 6.42 ioTurbine dynamic and automatic allocation of flash cache.
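The proportional-share arithmetic itself is simple; the Python sketch below illustrates it with made-up share values and a hypothetical 600GB flash pool. By default every cache-enabled VM has equal shares, so three VMs each receive one third, as in Figure 6.42.
def allocate_flash_cache(pool_gb, vm_shares):
    total_shares = sum(vm_shares.values())
    return {vm: pool_gb * shares / total_shares for vm, shares in vm_shares.items()}

# Three cache-enabled VMs with default (equal) shares each get one third of the pool:
print(allocate_flash_cache(600, {"sql01": 1000, "sql02": 1000, "sql03": 1000}))
# {'sql01': 200.0, 'sql02': 200.0, 'sql03': 200.0}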
Tip
Fusion-io has a tool called the ioTurbine Profiler that allows you to observe the
effects of caching on production or staged systems prior to investing in the
ioTurbine software and necessary hardware. The ioTurbine Profiler simulates the
effects of storage acceleration on a Linux or Windows system. For more
information, see http://www.fusionio.com/products/ioturbine-virtual/.
Table 6.19 was obtained from Fusion-io performance test results published at
http://www.fusionio.com/blog/performance-of-a-virtualized-ms-sql-server-poor-
ioturbine-to-the-rescue. The results demonstrated that by offloading reads to the
ioTurbine flash cache, write performance also increased by just over 20%. This test
was based on a TPC-E workload. This demonstrates that read caching can also improve
write performance to a certain extent.
PernixData FVP
PernixData FVP is different from the other two solutions already discussed in that it
aggregates server-side flash devices across an entire enterprise to create a scale-out
data tier for the acceleration of primary storage. PernixData FVP optimizes both reads
and writes at the host level, reducing application latency from milliseconds to
microseconds. The write cache policy in this case can be write back, not just write
through. When the write back cache policy is used, the writes are replicated
simultaneously to an alternate host to ensure persistence and redundancy in the case of a
flash device or host failure.
Application performance improvements are achieved completely independent of storage
capacity. This gives virtual administrators greater control over how they manage
application performance. Performance acceleration is possible in a seamless manner
without requiring any changes to applications, workflows, or storage infrastructure.
Figure 6.43 shows a high-level overview of the PernixData Flash Virtualization
Platform architecture.
Note
PernixData has a demonstration of how it accelerates SQL performance
available at http://blog.pernixdata.com/accelerating-virtualized-databases-
with-pernixdata-fvp/. The PernixData FVP Datasheet is available at
http://www.pernixdata.com/files/pdf/PernixData_DataSheet_FVP.pdf.
The examples in Figures 6.44 and 6.45 show a SQL 2012 database driving around
7,000 IOPS consistently and the resulting latency both at the data store and at the VM
level. The total effective latency is what the virtual machine sees, even though the data
store itself is experiencing drastically higher latency. In this case, in spite of the latency of
the data store being upwards of 25ms, the SQL VM response times are less than 1ms.
Figure 6.44 PernixData FVP acceleration for SQL Server 2012 IOPS.
Figure 6.45 PernixData FVP acceleration for SQL Server 2012 latency.
When FVP cannot flush the uncommitted data to primary persistent storage fast enough
(that is, when more hot data is coming in than there is flash space available), FVP
will actively control the flow of the new data. This means that FVP will artificially
increase the latency, ultimately controlling the rate at which the application can send,
until the flash cluster has sufficient capacity and returns to normal. FVP does not
transition to write through, even when it is under heavy load. Applications normally
spike and are not continuously hammering the data path 100% of the time, so FVP flow
control helps smooth out the spiky periods while providing the most optimized
performance possible.
Caution
When migrating a VM in an FVP flash cluster, in certain network failure scenarios, or
when the local or replica flash device fails, FVP will automatically change the
write back policy to write through. This ensures data protection, while degrading
write performance. However, reads may still be accelerated by requests being
serviced from the remainder of the flash cluster. When the issue is resolved, the
policy will be automatically returned to write back. For more information, see
the Fault Tolerant Write Acceleration white paper on http://pernixdata.com and
http://frankdenneman.nl/2013/11/05/fault-tolerant-write-acceleration/. This is a
standard part of the FVP Fault Tolerant Write Acceleration Framework.
Tip
Nutanix has a SQL Server Best Practices white paper and reference
architecture available at http://go.nutanix.com/rs/nutanix/images/sql-on-nutanix-
bp.pdf. For detailed information on the entire Nutanix architecture, see the
Nutanix Bible by Steven Poitras at http://stevenpoitras.com/the-nutanix-bible/.
The VMware vSphere on Nutanix Best Practices white paper (available at
www.nutanix.com) covers in detail each vSphere feature and how it should be
designed and configured in a Nutanix environment.
Due to the simplified nature of the Nutanix storage architecture and NDFS, we can
simplify the storage layout for SQL Server. Figure 6.48 includes a sample layout, which
is standard in a Nutanix environment, consisting of a single NFS data store and single
storage pool. We do not need to configure multiple LUNs or calculate LUN queue
depths.
Table 6.20 Nutanix Benefits for OLTP and OLAP SQL Databases
To demonstrate the capability of the Nutanix platform for SQL Server, a number of
SQLIO benchmarks were performed as part of the SQL on Nutanix Best Practices
white paper (http://go.nutanix.com/TechGuide-Nutanix-SQLBestPractices_Asset.html),
reproduced here with permission. Figures 6.49 through 6.52 resulted from the
benchmarks.
Figure 6.49 SQL Server SQLIO single VM random IOPS by block size.
Figure 6.50 SQL Server SQLIO single VM throughput by block size.
Summary
Throughout this chapter, we have provided architecture examples based on real-world
projects that you can adapt for your purposes. We've tried to explain all the relevant
considerations and best practices you need to worry about when architecting your
environment for high-performance and critical SQL Server databases. We covered the
key aspects of SQL Server storage architecture for all environments as well as the
differences you need to understand when architecting storage specifically for virtual
SQL Server databases, such as the IO Blender Effect and the way IO queues work
across hosts on the same data store.
We provided guidance on important database storage design principles and a top-down
approach covering SQL Server Database and Guest OS design, Virtual Machine Storage
design, VMware vSphere Storage Design, and then down to the physical storage layers,
including RAID and using server-side flash acceleration technology to increase
performance and provide greater return on investment. We concluded the chapter by
covering one of the biggest IT trends and its impact on SQL Server:
hyperconvergence and scale-out, shared-nothing architectures.
Let's briefly recap the key SQL design principles:
Your database is just an extension of your storage. Make sure you optimize all
the IO paths from your database to storage as much as possible and allow for
parallel IO execution.
Performance is more than just the underlying storage devices. SQL Buffer
Cache has a direct impact on read IO, whereas virtual IO controller device queues
and LUN, HBA, and Storage Processor queues can all impact performance and
concurrency of IO before anything touches a physical storage device.
Size for performance before capacity. If you size for performance, capacity
will generally take care of itself. Much of this is due to the overheads associated
with RAID storage needed to provide enterprise-grade data protection and
resiliency. Use flash storage and automatic tiering to balance the performance and
capacity requirements to get a more cost-effective solution overall.
Virtualize, but without compromise. This involves reducing risk by assessing
current performance, designing for performance even during failure scenarios,
validating your design and its achievable performance, and ensuring storage
quality of service, such as Storage IO Control. These all contribute to a successful
SQL virtualization project. Make sure project stakeholders understand what
performance to expect by having SLAs aligned to achievable IOPS per TB.
Keep it standard and simple. Whatever design decisions you make for your
environment, keep them as consistent as possible and have defined standards.
Design for as few options as possible in your service catalog that cover the
majority of system requirements. Only deviate from defaults where required.
We have covered storage performance in depth, as it is one of the most critical
resources for a SQL Database. The next chapter will drill into how SQL memory
allocation impacts the performance of your database, and how SQL and memory might
change in the future.
Tip
Throughout this chapter, we have referred to SQL Server trace flags. A full list of
the trace flags can be viewed at
http://social.technet.microsoft.com/wiki/contents/articles/13105.trace-flags-in-
sql-server.aspx. To enable trace flags when using Windows 2012, you need to
run SQL Server Configuration Manager, which doesn't appear in the list of
applications. To do this, enter sqlservermanager11.msc in the application
search box on the Apps screen.
Chapter 7. Architecting for Performance: Memory
Memory
One of the most critical resources a database has is memory. If you want to speed up a
SQL Server database, the quickest way to do so, in my experience, is to allocate
more memory to it. By allocating more memory, you minimize the amount of
physical I/O your database has to perform. In other words, when the SQL Server
database does not have enough memory, it will move more of its workload to
physical I/O. A physical I/O request is still one of the slowest actions a database
can perform.
As mentioned before, a database is just an extension of your disk drives. A slow disk
array typically means a slow database. To speed up a database quickly, you need to
minimize the physical I/O the database has to perform. In a perfect world, you would
read all your data into memory, and the only time the database would have to go out to
the storage array is to record a transaction.
Caution
When a SQL Server database does not have enough memory, the database will
move more of its workload to physical I/O. Physical I/O is many orders of
magnitude slower than memory access. Remember that RAM operations are
measured in nanoseconds, whereas disk operations are measured in milliseconds.
Important
SQL Server can only access data if it is first residing in the database buffer pool.
Only data that resides in the database buffer pool can be manipulated, inspected,
or altered by SQL Server.
Until data resides within the database buffer pool, it is not usable by the database.
Without memory, the SQL Server engine cannot do its work. As a
DBA, you control the size of the database buffer pool. Too small a buffer pool and the
database will constantly be calling outside to the storage array. Too large a pool and
you could take away valuable memory needed elsewhere. Remember that a virtualized
environment is a shared pool of resources. How efficiently you use memory as
a resource is critical to overall database performance and the overall health of the
virtualized environment.
The fundamental unit of storage in a SQL Server database is the page. All data within
the database buffer pool is stored within the many pages that make up the database
buffer pool. In SQL Server, a database page is 8KB in size, and 8KB pages are
optimized for the Windows operating system and are not adjustable. Each time a SQL
Server page is touched, a counter within the page is incremented. The buffer manager
then takes the hottest pages, those with the highest count, and tries to keep them
resident in the database buffer pool.
Paging and Swapping: A DBA's Nightmare
Quick question: Paging and swapping are common terms used by database
administrators and system administrators. So what is the difference between paging and
swapping?
Both paging and swapping are methods of moving the contents of data in memory to
another storage device. That storage device is typically a disk drive, and the contents
are placed within what is commonly called a swap file or page file. For example, in
VMware vSphere, the file is called a vSwap file.
Swapping is when you move all the memory segments belonging to a particular process
that's running onto another storage device. The important word here is all. When this
happens, all execution on that process stops until enough space exists for all the
memory segments owned by that process to be brought back into memory. Remember,
it's an all-or-nothing proposition.
Paging is when a subset of the memory segments (that is, individual pages) can be
moved in and out as needed. In this case, the SQL Server database would look within
the page table. If the page needed is already in memory, SQL Server accesses the
contents of that page. If the page needed by the process is not in memory, you get a page
fault. Processing is temporarily suspended until the operating system is able to bring the
needed page into memory. The key here is that this is not an all-or-nothing proposition.
The movement of data in and out of the secondary storage device happens at a more
granular level. In this example, the paging in and out is at the individual page level.
Caution
When paging or swapping occurs, the performance of your virtualized database is
severely impacted. This should be avoided at all cost.
Continuing further down the stack, when the data the SQL Server database engine needs
is not available within the database buffer pool, it must make a request to the storage
array for the needed information.
The storage array looks within its cache to see if the data needed is available to it. It
also uses proprietary algorithms to keep the storage array data cache populated with the
information you are most likely to need. Notice how memory is being used once again to
boost performance by helping to minimize I/O. When the storage array cannot resolve
the request, it then makes a request to retrieve the information from the physical drives.
Newer storage arrays, such as the EMC VMAX, IBM V7000, and NetApp FAS6200,
would look within the SSD drives. (I am using flash and SSD interchangeably for
purposes of this example.) According to Wikipedia (http://en.wikipedia.org/wiki/Solid-
state_drive), a solid-state drive (SSD) is a data storage device using integrated circuit
assemblies as memory to store data persistently. As mentioned previously, solid-state
storage should be thought of as cheap memory rather than expensive disks.
As you can see from the definition at Wikipedia, SSD drives are just another form of
memory cache. The storage array uses additional proprietary algorithms to keep the
SSD drives populated with the information you are most likely to need. When the SSD
drives cannot resolve the request, they then look to the SATA/SCSI drives for the data.
Depending on the storage array, it might contain SATA drives or SCSI drives. Blending
SSD drives with SATA or SCSI drives together gives you better performance at a much
more reasonable cost.
As this example illustrates, the trend within storage arrays is to minimize the amount of
physical I/O that might be needed by leveraging memory. Any time you have memory-to-
memory access happening, your database will perform faster.
Database Indexes
Another powerful tool we use to speed up a database's performance is the strategic
placement of indexes. Indexes can greatly reduce the amount of physical I/O needed to
retrieve the necessary data to resolve a query. This is an overly simplified way of
explaining how a database retrieves data, but it illustrates the point I am trying to make.
When a database retrieves data, it can do so in one of two ways. The database performs
a full table scan or an index seek. A full table scan is the equivalent of starting at the
beginning of a book and reading every page of the book until the very end. An index
read is the equivalent of using the index in the book and jumping right to the page you
need. The index has the effect of greatly minimizing the amount of I/O the database has
to perform. Unlike a book index, which points you to the page you need to go look up, a
database index can sometimes provide all the data that is needed to resolve a query
without going out to the actual source table itself for the data. We will provide an
example of how an index works in the next section of this chapter.
Note
Indexes are an important tool in a DBA's or developer's toolbox for
improving overall database performance. It's important that periodic
maintenance routines be put in place to keep those indexes operating
optimally.
Figure 7.2 A table filling an 8KB page, and an index based on the first two columns
filling the 8KB page.
Think of an index as a subset of the table itself. If you create an index on the first two
columns of the table used in the example (and assuming no compression), then each row
of the index would use 200 bytes. As you can see in Figure 7.2, each 8KB page within
the database buffer pool would contain up to 40 index rows. One hundred pages would
contain up to a maximum of 4,000 rows of data.
As you can see, the index is able to pack substantially more rows of data into each page
of memory within the database buffer pool. This means substantially less I/O is
physically required to bring those rows of data into the database buffer pool. Less I/O
means faster performance of your virtualized database.
After loading the table with data, we then create an index on the table. The index we
create will be a compound/composite index on the first two columns of the table:
Create Index IX_Myindex_mytable on dbo.mytable (A,B)
We then issue a basic select statement against the table we created. The select statement
we issue will only retrieve data from columns A and B:
Select A,B from dbo.MYTABLE where A='Mary'
In this example, the SQL Server database will be able to resolve this query without ever
looking within the source table itself. Think of the index as a mini copy of the table, only
containing data from the columns referenced in the index.
In the select statement, we only reference columns A and B. The index was created
using columns A and B. Therefore, everything the query has requested is contained
within the index itself, so the query never has to go back to the source table for any data.
Once this select statement is modified to include column C or D, the query can no longer
resolve the request just using the index. Remember how we said the index is a mini
copy of the table. In the mini copy of the table, those columns do not exist. Therefore,
we must go back to the source table for the contents of C or D. This means that
retrieving what is stored in the other columns of the table requires looking within the
contents of MYTABLE itself. A select statement like the following uses the index to
help speed the query along, but ultimately also has to look at the source table to
retrieve all the data requested:
Select A,B,C from dbo.mytable where A='Mary'
What is clear is that the more you minimize physical I/O, the faster your
database will perform. Storage array vendors do this by putting intelligence into the
physical hardware (storage array) and how it utilizes memory to minimize physical I/O.
Database vendors such as Microsoft do this by putting intelligence into the software
(database engine) itself, the operating system, and how it leverages memory. Server
vendors do it by associating memory directly with the CPU sockets. At every level, from
the physical hardware (such as storage arrays) to the software (such as the SQL Server
database), vendors are finding ways to use memory to speed up performance.
As DBAs, we are constantly in a balancing act of how much of each IT food group (disk,
CPU, memory, and network) we feed our database. It is clear that memory is one of the
most powerful levers we have in our toolbox to optimize database performance. The
choices we make will have a huge impact on overall database performance.
Caution
When you first start virtualizing your production databases, it's important you
don't overcommit memory. When there is not enough physical memory to meet the
demands of the VMs, excessive paging and swapping will occur, which will
impact database performance.
If the physical host is memory starved, then the individual VMs are at risk of being
memory starved, which will induce paging and swapping. The one exception to this rule
is memory reservations. To help prevent memory shortages, the hypervisor has tools
available to it, such as transparent page sharing and ballooning, that can help lessen the
strain on the physical host.
Tip
Transparent page sharing is more effective the more similar the VMs are. When
possible, put like operating systems on the same physical host.
Tip
A great resource for better understanding transparent page sharing in more detail,
as well as other memory management techniques and why you do not want to disable
transparent page sharing, is the VMware performance study titled Understanding
Memory Resource Management in VMware vSphere 5.0, found at
http://www.vmware.com/resources/techresources/10206.
Best Practice
We recommend that you keep the default setting for Mem.ShareScanTime and do
not disable transparent page sharing. This is a far more efficient way to deal with
memory constraints than the alternative of paging and swapping.
Memory Ballooning
Transparent page sharing is a process that is constantly running on the hypervisor when
there are spare CPU cycles, looking for opportunities to reclaim memory. Ballooning is
a memory-reclamation technique that only kicks in when the physical host is running low
on physical memory. Because TPS is scanning all the time, it will be activated before
ballooning in most cases. Memory ballooning happens at the guest virtual machine level
versus the hypervisor (host) level.
Caution
Never shut off the balloon driver. This is your first line of defense for a physical
host that is running low on physical memory. It is a far more efficient way of
dealing with a physical memory shortage than the alternative of the hypervisor
swapping.
When memory ballooning is taking place, there can be a performance impact. In the case
of a virtual machine that has a lot of free memory, ballooning might have no impact on
performance at all. The operating system will just give up the free memory back to the
hypervisor.
In the case of your virtualized SQL Server database, there will be a performance
impact. Ballooning is detrimental to SQL Server because of the database buffer pool.
As the balloon inflates and the operating system doesn't have enough pages on its free
list, the operating system may choose to page out its own memory (that is, use
pagefile.sys) to disk. You have a physical host running short of memory, and vSphere is
taking steps to alleviate the issue. Those steps have additional overhead associated with
them and will have an impact on database performance. Yet, the alternative to those
steps would be paging and hypervisor swapping: a DBA's and system administrator's
worst nightmare.
In a perfect world, it's best you never overcommit memory for your mission-critical
workloads and avoid the possibility of memory ballooning completely. However, none
of us lives in a perfect world.
For example, if you have a physical host with 30GB of physical memory and 45GB of
memory is in demand by the different virtual machines running on the physical host, the
balloon driver (known as vmmemctl.sys) might be invoked. There is not enough
physical memory available from the physical host after TPS has already done what it
could to alleviate the shortage, so the balloon driver now attempts to help. In Figure 7.6,
step 1 shows the balloon driver sitting idle. The host is now experiencing memory
shortages. In step 2, the balloon driver inflates itself inside the guest virtual machines if
it has identified spare memory. That forces the memory to be paged out, which in turn
frees the memory back to the hypervisor so it can be used by other more demanding
virtual machines. Later on in step 3, when the host no longer has a memory shortage, the
balloon driver within the guest OS deflates, allowing the guest OS to reclaim the
memory.
Figure 7.6 Balloon driver in action.
A great analogy to describe the balloon driver is Robin Hood: It steals available free
memory from the rich virtual machines by inflating the balloon driver, freeing up that
memory back to the hypervisor so that memory-constrained (poor) VMs can use it
temporarily when there is not enough physical memory to go around.
Memory Reservation
As you learned earlier in this chapter, a memory reservation provides the ability to
guarantee a set amount of physical memory to a particular virtual machine and only this
virtual machine. No other virtual machine will have access to the memory that is
reserved. For mission-critical workloads such as a production database, we recommend
you use memory reservations. This is especially important when you have mixed
workloads of production and nonproduction databases and you want to maintain quality
of service.
Once you have set a memory reservation, when you first start the virtual machine, a
check is made by the vSphere hypervisor to see if enough physical RAM is available to
meet the memory reservation requirement. If there is not enough physical RAM
available to meet the memory reservation requirement, the virtual machine will not start.
We discuss in the next section ways to override this default behavior.
No matter what the workload is on the physical host, this amount of memory is
guaranteed to the virtual machine that has the memory reservation set, which is why it
will not start if the memory is not available.
Tip
You should use memory reservations for the VMs that contain your tier-1 SQL
Server databases. The memory reservation should be for 100% of the VM's
configured memory size. At a minimum, it needs to cover the SQL Server
database buffer pool and the overhead of the operating system.
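If you manage your environment with VMware PowerCLI, a reservation can also be set from
the command line. The following is a minimal sketch only; the VM name SQL01 and the 64GB
value are placeholders you would replace with your own sizing, and it assumes you are
already connected to vCenter with Connect-VIServer:
# Reserve all of the VM's configured memory (64GB in this example) so the
# hypervisor guarantees it physical RAM
Get-VM "SQL01" | Get-VMResourceConfiguration |
    Set-VMResourceConfiguration -MemReservationMB 65536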
Caution
If you plan to overcommit memory, make sure you have enough disk space to
create the vswap file; otherwise, the VM will not start.
Throughout this chapter, we have talked about how important it is not to overcommit
memory. In the real world, the customers that get the most out of their VMware
environment routinely overcommit resources. They only overcommit once they
understand how resources such as memory are needed and used. Coming out of the gate,
follow our guidelines and don't overcommit. Let at least a full business cycle go by.
Once you understand the resources that are needed and when they are needed, it is okay
to introduce overcommitment into your environment. That understanding is the key to
making sure your mission-critical virtual machines have the resources they need when
they need them.
Caution
When configuring the database, you have two choices: Max Server Memory and
Min Server Memory. Doing nothing is not an option. It is important that you set
Max Server Memory to prevent the database from negatively impacting the
operating system.
As DBAs, we know firsthand that databases by their nature will consume as much
memory as we give them. When a database consumes all the available memory,
database performance will be severely impacted. By the database consuming all
available memory, it starves the operating system of the resources it needs, causing
the OS to page and swap. To prevent this from happening, it's important that you
configure Max Server Memory properly. So even though we say you have two options,
you really only have one.
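As a simple illustration, Max Server Memory can be set with sp_configure rather than
through Management Studio; the 49152MB (48GB) value here is only an example and must be
sized for your own VM and workload:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- Cap the buffer pool so the guest OS keeps enough memory for itself
EXEC sp_configure 'max server memory (MB)', 49152;
RECONFIGURE;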
Large Pages
Another place where you can squeeze additional performance from your virtualized
tier-1 SQL Server database is through the use of large pages. For SQL Server to use
large pages, you must first enable it through the trace flag T834.
Figure 7.9 illustrates how you enable large pages for SQL Server using trace flag
T834.
Figure 7.9 Enabling large pages.
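Note that trace flag 834 cannot be turned on with DBCC TRACEON while the instance is
running; it must be added as a startup parameter (-T834) using SQL Server Configuration
Manager, and it requires the Lock Pages in Memory privilege discussed next. Once the
instance has restarted, you can confirm the flag is active with a query such as the
following:
-- Returns Status = 1 when trace flag 834 is enabled globally
DBCC TRACESTATUS (834, -1);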
Tip
For your mission-critical SQL Server databases, we recommend you lock pages
in memory to prevent the SQL Server buffer pool from being paged out by the
Windows operating system. Make sure you have a reservation for the amount of
memory at the hypervisor layer.
The opposite is also true for your noncritical workloads; we recommend that you do not
lock pages in memory. This will then enable the balloon driver to do its job and reclaim
memory for use by the hypervisor for other virtual machines on the host. This is
important especially when you are trying to consolidate a number of workloads onto a
single physical host. The assumption here is that they wont always need all of the
assigned resources at the same time. Never forget a virtualized infrastructure is a shared
infrastructure.
You want to avoid the yo-yo effect, where the reclamation process (balloon driver) is
recovering memory, then the resource (VM) that provided the excess memory needs it
again, so the reclamation process gives it back to the VM, then the balloon driver
recovers the memory again, and so on. Every time the system
thrashes as resources ebb and flow, other resources are impacted, such as CPU and
disk. For example, as paging and swapping occur, the storage array is impacted.
Tip
To avoid NUMA remote memory access, size your virtual machine memory to
less than the memory per NUMA node. Don't forget to leave a little room for
memory management overhead.
VMware is NUMA aware. When a virtual machine first powers on, it is assigned a
home NUMA node. Think of a NUMA node as a set of processors and memory. It will
keep a particular VM running on the same NUMA node. If the hypervisor detects that the
NUMA node the VM is running on is busy, it will migrate the VM to another NUMA
node to get better performance. It is important to note that when you hot-plug a CPU, you
affect vSphere's ability to utilize NUMA. The two capabilities do not work well
together. When vSphere first starts up, it establishes the NUMA home nodes based on
the number of CPUs. When a CPU is hot-plugged, it affects these settings. In effect, it
disables vSphere's ability to use NUMA. Our experience has taught us that NUMA is
much more beneficial to the performance of your SQL Server database than the ability
to hot-plug a CPU.
vNUMA
Even though vSphere has been NUMA aware for a very long time, in vSphere 5
VMware introduced vNUMA. vNUMA helps optimize the performance of a virtual
machine that is too large to fit within a single NUMA node and must span NUMA
boundaries, by exposing the actual physical topology to the guest operating system so
that it can make its own NUMA decisions. This is good news for large-scale, virtualized
SQL Server workloads that cannot fit within a single NUMA node.
A great blog article titled vNUMA: What It Is and Why It Matters can be found at
http://cto.vmware.com/vnuma-what-it-is-and-why-it-matters/.
As we discussed earlier, NUMA and the ability to hot-plug a CPU should not be used in
combination with each other.
Tip
vNUMA is disabled if vCPU hot plug is enabled. The link to the VMware
knowledge base article is http://kb.vmware.com/kb/2040375.
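If you want to confirm whether CPU hot plug is enabled on an existing VM, one way
(sketched here with PowerCLI; SQL01 is a placeholder VM name) is to query the
vcpu.hotadd advanced setting, which should be FALSE on VMs where you want vNUMA to
remain available:
# Returns the vcpu.hotadd setting if it has been defined for the VM;
# a value of TRUE means CPU hot add is enabled and vNUMA is disabled
Get-AdvancedSetting -Entity (Get-VM "SQL01") -Name "vcpu.hotadd"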
To learn more about NUMA nodes, see the VMware technical white paper, The CPU
Scheduler in VMware vSphere 5.1, which can be found at
https://www.vmware.com/files/pdf/techpaper/VMware-vSphere-CPU-Sched-Perf.pdf.
Caution
When you create your virtual machines, keep the footprint as lean as possible.
Don't install database and Windows operating system features you will not need.
Make sure you disable all unnecessary foreground and background processes.
When you build the virtual machines, keep this in mind as you make the many choices
you have available to you. Don't install features of the Windows operating system you
will never need, and don't install features of the database you will never need. Also,
disable all unnecessary foreground and background processes; by doing
so, you will keep the VM as lean as possible.
When building a virtual machine that will house a production database, it is
recommended that you build it from scratch, not using the P2V converter. A database is
a very complex environment, and experience has taught us that over time the
environment can become very cluttered with components that are no longer needed.
Use this opportunity to create the new database environment as lean and clean as
possible.
Summary
In this chapter, we discussed the IT food groups with a focus on memory. Memory is
one of the most critical resources you have available. Everyone from hardware vendors
to software vendors is finding new ways to leverage memory to speed up performance.
The newest version of SQL Server will have an in-memory database that is able to
perform orders of magnitude faster than its predecessors by leveraging memory as a resource.
We stressed how important it is that you have the right balance of resources if you want
to optimize the performance of your virtualized SQL Server database. This is especially
important in a shared environment. By using techniques such as setting memory
reservations, you can ensure that mission-critical workloads have the resources they
need when they need them, even in a shared-resource environment.
We discussed the many tools available to the hypervisor, such as TPS and ballooning,
to help ensure the hypervisor is getting the most leverage out of the physical memory
available to it. We also discussed NUMA and a number of other things you need to take
into consideration when you virtualize your production SQL Server database.
Chapter 8. Architecting for Performance: Network
We have now reached the final IT food group: the network. Although SQL Server is
generally not very network intensive, the network is very important as the means of
access for all clients and applications, as well as the means of access to storage in a
SAN environment. When you are using advanced configurations such as SQL AlwaysOn
Failover Cluster Instances and AlwaysOn Availability Groups, the network becomes
even more important because it is the means of data replication and cluster failure
detection. A fast, reliable, low-latency network will improve the speed of response to
your applications and clients. In a virtualized environment, the network is also heavily
used to provide greater flexibility and manageability through the use of VMware
vMotion and VMware DRS. Providing the appropriate quality of service for different
network traffic types, such as client traffic, cluster traffic, replication traffic,
management, and vMotion traffic, is important to ensure you can meet application
service levels.
Tip
For SQL Server DBAs, operating virtualized databases is simpler than physical
or native database servers from a network perspective. There is no need to
configure network teaming or VLAN drivers inside Windows. There is also no
need to configure storage multipathing drivers; the only exception is where
you are using in-guest iSCSI connectivity to storage. Network teaming and
storage multipathing are handled transparently, reliably, and simply through
VMware vSphere. No longer will a misconfigured or misbehaving teaming driver
or Windows multipathing problems cause an issue for your database.
This chapter covers how to get the required network performance from your SQL
Server VMs: from using the right network adapter type, cluster network settings, and
the benefits of jumbo frames, to designing and configuring your hypervisor and your
physical network for performance, quality of service, and network security.
Note
When VMXNET3 was released with vSphere 4.0, VMware published a
performance evaluation that compared it to other network adapter choices.
You can review this paper at
http://www.vmware.com/pdf/vsp_4_vmxnet3_perf.pdf.
Figure 8.1 shows the relative performance between the different virtual network adapter
options. The tests used the default maximum transmission unit (MTU) size of 1,500 bytes as
well as Windows 2012 running on vSphere 5.5 on a 10Gbps network between hosts.
The hosts had two eight-core Intel E5-2650 v2 (Ivy Bridge) CPUs, 256GB RAM, and
an Intel 82599EB 10G SFP+ dual-port NIC. Multiple test iterations were measured
using the netperf tool (http://www.netperf.org/netperf/) and a single TCP stream. The
graph shown in Figure 8.1 uses the average results of the multiple tests. The same hosts
were used for all network performance tests for data in this chapter.
Figure 8.1 Virtual network adapter performance with default settings.
Caution
There is a known issue with regard to E1000 and E1000E network adapters that
can cause high packet drop rates. See VMware KB 2056468, High rate of
dropped packets for guests using E1000 or E1000E virtual network adapter on
the same ESXi 5.x host. There is also a known issue with vSphere 5.1 and UDP
performance with Windows. For more information, see VMware KB 2040065.
Tip
If you have an application server that is particularly network intensive when
communicating with the database, you may be able to locate it on the same host
and on the same port group to greatly improve network communications
responsiveness and throughput. The reason for this is that VMs on the same host
and on the same port group communicate at memory speed and are not limited
by the physical network. The network traffic does not have to go outside of the
host.
By default, virtual network adapters are optimized for high throughput, and not for the
lowest latency. In addition to adjusting queues, interrupt moderation may need to be
disabled to reduce latency. Interrupt moderation reduces the number of CPU interrupts
that the virtual network adapter issues in order to reduce CPU utilization and increase
throughput, but by doing this it also increases latency.
Tip
Power management policy of your hosts and the guest operating system of your
virtual machine can have an impact on network latency and throughput. Generally,
the BIOS setting of OS Control Mode and the vSphere Power Management policy
of Balanced (default) are recommended. However, if you have particularly
latency-sensitive VMs, we recommend configuring your host power policy for
high performance or static high performance. We recommend in all cases that
your Windows Power Management policy be set to High Performance. For
further information on tuning for latency-sensitive workloads, see
http://www.vmware.com/files/pdf/techpaper/VMW-Tuning-Latency-Sensitive-
Workloads.pdf.
In some situations, adjustments of the virtual network adapter inside the guest operating
system may not be sufficient to achieve the required performance. In these cases,
interrupt coalescing may need to be disabled on the virtual network adapter, either in
the virtual machine configuration or on the host. To disable interrupt coalescing on a
virtual Ethernet adapter, you need to modify the advanced parameter
ethernetX.coalescingScheme (where X is the number of the Ethernet adapter,
such as 0 or 1) and set it to disabled. The following example shows how you would
add the required advanced setting to disable interrupt coalescing for virtual Ethernet
adapter 0 in a virtual machine configuration using VMware PowerCLI:
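A representative PowerCLI sketch (the VM name SQL01 is a placeholder, and the setting
takes effect after the VM is power-cycled) would look something like this:
# Add or overwrite the advanced setting that disables interrupt coalescing
# for virtual Ethernet adapter 0
New-AdvancedSetting -Entity (Get-VM "SQL01") -Name "ethernet0.coalescingScheme" `
    -Value "disabled" -Force -Confirm:$false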
Alternatively, you can add a line to the VM's advanced settings using the vSphere Client
or vSphere Web Client, as shown in Figure 8.2.
Tip
To ensure that a single NIC port or VLAN does not become a single point of
failure, it is recommended that you configure two virtual network adapters for the
cluster heartbeat, separate from the vNIC for application traffic. Each of the
heartbeat vNICs should be configured to use a different virtual switch port group,
VLAN, and physical NIC port.
Jumbo Frames
Standard Ethernet frames allow for a maximum transmission unit (MTU) of 1,500 bytes.
This means the maximum protocol data unit of a single packet that can be sent at one
time before fragmentation is needed is 1,500 bytes. Anything larger than 1,500 bytes
will need to be fragmented on the source, sent in multiple packets, and then reassembled
on the destination. Although this has been perfectly acceptable in the past with up to
1Gbps networks, it introduces overheads when transmitting large amounts of data and
on high-speed networks, such as 10Gbps and above.
Modern 10Gbps and above network adapters include features such as Large Segment
Offload and Large Receive Offload to try and alleviate the overhead of dealing with
many standard-size packets when using a 1,500 byte MTU. However, this doesn't
address the entire overhead, and as a result both source and destination systems
experience lower throughput, higher latency, and higher CPU usage than is necessary.
Using jumbo frames can address these problems and provide the best performance for
high-bandwidth networks and where you are using features such as SQL AlwaysOn
Availability Groups.
Using jumbo frames allows for protocol data units above 1,500 bytes to be transmitted
and received without fragmentation, thereby reducing the total number of packets and the
amount of overall packet processing required to send large amounts of data. The MTU
and jumbo frame size can vary across different network vendors, but they commonly
allow for up to six times the size of a standard-sized packet.
Figure 8.4 displays the throughput and CPU utilization between Windows 2012 VMs on
different hosts when configured with jumbo frames.
Caution
The version of VMXNET3 shipped in VMware Tools as part of VMware
vSphere 5.0 GA had a bug that prevented jumbo frames from working. It is
recommended that you have the latest version of VMware Tools installed, and
that you are on at least Update 2 if you are using vSphere 5.0. See VMware KB
2006277 for further information.
Figure 8.5 Same host virtual network adapter performance with jumbo frames.
Although throughput is higher in Figure 8.5, so is CPU utilization. This is due to the
hypervisor not being able to make use of the physical network adapter offload
capabilities when transmitting between the two VMs on the same host. However, the
CPU cost is lower per Gbps of throughput compared to the test between hosts in Figure
8.4.
In order for you to use jumbo frames, it must be configured consistently from the source
to the destination system. This means you need to configure support for jumbo frames in
Windows for the virtual network adapter, on the virtual switch within VMware vCenter
or ESXi, and on the physical network. Jumbo frames configuration needs to be enabled
on any network devices between the source and destination systems that will carry the
packets. As a result, it can be much easier to configure jumbo frames when
implementing new network equipment, although with proper planning and verification, it
is easily achievable in an existing network environment.
Caution
Some network device vendors, such as Arista, ship their equipment from the
factory with jumbo frames enabled. However, other vendors and some older
network devices, if not properly configured, may not allow jumbo frames to pass
and will simply drop the packets instead of fragmenting them. Some devices will
break the packets down to a smaller size (fragment the packets) and allow them to
pass, but the cost of breaking the packets down will severely slow down your
network. Some network devices may only allow jumbo frames to be set globally,
and the settings may only take effect after a reboot. Because of this, it is important
that you check the support of jumbo frames on your network devices with your
vendor and have a thorough implementation and test plan to ensure desired
results.
The types of network traffic that will benefit most from using jumbo frames on 10Gbps
and above networks include SQL Database Mirroring, Log Shipping, AlwaysOn
Availability Groups, Backup Traffic, VMware vMotion (including Multi-NIC
vMotion), and IP-based storage, such as iSCSI and NFS. In order for SQL Server to use
jumbo frames effectively, the Network Packet Size Advanced option should be
increased from its default setting. This is in addition to configuring jumbo frames in
Windows and on the virtual and physical networks. The Network Packet Size setting
should be increased from its default value of 4096 to 8192, as shown in Figure 8.6.
Figure 8.6 SQL Server Network Packet Size advanced setting.
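For reference, the same change can be scripted with sp_configure rather than through the
advanced properties dialog; this is just a sketch and, as with any advanced option, it
should be tested before being applied in production:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- Raise the network packet size from the 4096-byte default to 8192 bytes
EXEC sp_configure 'network packet size (B)', 8192;
RECONFIGURE;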
Figures 8.7 and 8.8 show the Edit Settings button and jumbo frames configuration,
respectively, for a virtual standard switch.
Figure 8.7 vSphere Web Client virtual standard switch Edit Settings button.
Figure 8.8 vSphere Web Client virtual standard switch jumbo frames setting.
In order to enable jumbo frames on a virtual standard switch, you need to configure each
host individually. Each virtual standard switch that needs to support jumbo frames
needs to be modified on each host. In the vSphere Web Client, you need to navigate to a
vSphere host and click the Edit Settings button, which looks like a pencil, as shown in
Figure 8.7.
Figure 8.8 shows the virtual standard switch that is enabled for jumbo frames with an
MTU of 9000.
Configuring jumbo frames on a vSphere distributed switch is slightly different because
it is centrally managed. This means there is only one place to configure this and it
applies automatically to all hosts connected to the switch. To get to the Edit Settings
dialog for a vSphere distributed switch, navigate to Network in the vSphere Web
Client, right-click on vSphere Distributed Switch, and then click Edit Settings. Figure
8.9 shows the configuration of jumbo frames on a vSphere distributed switch.
Figure 8.9 vSphere Web Client vSphere distributed switch jumbo frames setting.
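Both switch types can also be updated with PowerCLI. The host, switch, and distributed
switch names below are placeholders, and this is only a sketch of the approach:
# Enable jumbo frames on a standard switch (must be repeated on every host)
Get-VirtualSwitch -VMHost "esxi01.example.com" -Name "vSwitch0" |
    Set-VirtualSwitch -Mtu 9000 -Confirm:$false
# Enable jumbo frames once on a distributed switch; it applies to all attached hosts
Get-VDSwitch -Name "dvSwitch01" | Set-VDSwitch -Mtu 9000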
Now that jumbo frames are configured on the physical network and the virtual switches,
we can configure Windows. Figure 8.10 shows the Windows network properties page
of a VMXNET3 vNIC. You need to click Configure to display the advanced vNIC
properties.
Figure 8.10 Windows vNIC properties page.
Figure 8.11 shows the advanced vNIC properties page with the Jumbo Packet option
highlighted. By setting the value to Jumbo 9000, you are enabling jumbo frames.
Figure 8.11 Windows VMXNET3 vNIC advanced properties page.
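If you prefer to script the guest side, Windows Server 2012 exposes the same property
through PowerShell. This is a sketch only; the adapter name Ethernet0 is a placeholder,
and the exact display name and value strings can vary with the VMXNET3 driver version:
# Set the VMXNET3 adapter's Jumbo Packet property to 9000-byte frames
Set-NetAdapterAdvancedProperty -Name "Ethernet0" `
    -DisplayName "Jumbo Packet" -DisplayValue "Jumbo 9000"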
In addition to ping, if you are using vSphere 5.1 or above, with an Enterprise Plus
license and a vSphere distributed switch, then you can make use of the vSphere
distributed switch health check feature to verify your jumbo frames configuration. The
vSphere distributed switch health check periodically checks the configuration of the
physical and virtual network, including the configured VLANs, MTU, network teaming,
and failover settings. Figure 8.13 shows the results of a vSphere distributed switch
health check configured on a sample vSphere distributed switch from the vSphere Web
Client.
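As referenced above, a quick way to test end-to-end jumbo frame support from inside a
Windows VM is a do-not-fragment ping sized just under the 9000-byte MTU (8972 bytes of
payload plus 28 bytes of ICMP and IP headers); the destination address below is a
placeholder:
# Succeeds only if every device in the path passes 9000-byte frames unfragmented
ping -f -l 8972 192.168.10.20
# The equivalent test from an ESXi host for VMkernel interfaces:
# vmkping -d -s 8972 192.168.10.20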
Tip
For additional information about the vSphere distributed switch health check,
including a video of the configuration settings, see VMware KB 2032878.
Virtual Switches
Virtual switches (vSwitches) connect virtual machines to virtual networks, as well as to
each other, on the same host. When a virtual switch is configured with a physical
adapter, it can connect virtual machines to the physical network and the outside world.
Each virtual machine connects to a virtual switch port on a port group. The port group
forms the boundary for communications on a virtual switch. This is an important concept
to understand, especially for security.
A virtual switch and a port group are layer 2 only; they do not perform any routing, and
there is no code that allows a VM connected to one port group to communicate with
another VM on another port group. Communications between different port groups are
prevented even if they are on the same virtual switch, unless the traffic goes via a router
or firewall VM or if traffic is sent out to the physical network and routed back into the
host.
A port group is where you define VLAN tags and other properties of the virtual
networks that are connected to the virtual switch. Communications between VMs that
are connected to the same port group and virtual switch remain within a host and are
performed at memory speed. They are not limited to the speed of any network adapter,
but instead are limited only to the speed of the host CPUs and memory bus.
There are two main types of virtual switch, as Table 8.4 shows.
Table 8.4 Virtual Switches
In addition to connecting VMs to virtual switches, you also connect vSphere Host
VMKernel Services, for things such as vMotion, management, NFS, and iSCSI.
Tip
When you are running business-critical SQL Server systems, using Enterprise
Plus licenses and the vSphere distributed switch is recommended. This allows
you to take advantage of the advanced quality of service features provided by
VMware vSphere to ensure you can meet your database SLAs. This is especially
important with very large databases, where network-based storage (NFS or
iSCSI) or database replication is being used.
Figure 8.15 shows an example of what a vSphere distributed switch configuration may
look like for a vSphere host with two 10Gb Ethernet NICs.
Figure 8.15 Example of vSphere host networking with a vSphere distributed switch.
We are often asked how many vSwitches should be used. The answer is simple: as few
as possible to meet your requirements. We recommend using a single vSwitch unless
you have a reason (such as security and physical separation across different switches)
to use more. There are no extra points for configuring more than one vSwitch.
Note
It has been common practice in the past to configure vSphere host
management networking on vSphere standard switches to avoid any
configuration errors impacting network availability and therefore host
management on a vSphere distributed switch. vSphere 5.1 and above
include features to protect against and detect vSphere distributed switch
configuration errors, and these features allow you to back up and restore
your vSwitch configuration. It is therefore much safer now to use a vSphere
distributed switch for all virtual networking, which allows all traffic to
benefit from the advanced features.
Number of Physical Network Adapters
A number of factors influence the number of physical network adapters required,
including vSphere License level, bandwidth requirements, latency and response time
requirements, whether database replication or AlwaysOn Availability Groups are used,
security policy for traffic separation, and storage networking requirements. The sizing
of your databases and hosts in terms of memory can also impact network adapter choice
due to vMotion requirements. It is important to balance these different requirements and
come up with a design that best meets your objectives. Our recommendation is to use as
few network adapters as necessary, to keep the design simple, and to use 10Gb Ethernet
if possible.
One of the biggest influences on network adapter selection is the requirement to separate
different types of traffic, such as management, vMotion, virtual machine, and storage
traffic. Before 10Gb Ethernet was widely available and cost effective, it was common
to see vSphere hosts configured with six or eight or more 1Gb Ethernet NICs. There
might be two for management, two for vMotion, and potentially two or four NICs for
virtual machine traffic, or two for VM traffic and two for storage traffic. This could
have been due to needing to support physical network separation to different switches
and where using VLAN trunk ports was not possible. This was also common where only
vSphere standard switches were in use and it was not possible to provide quality of
service or intelligent load balancing across the NIC ports. Figure 8.16 shows an
example of a common 1Gb Ethernet virtual networking configuration with vSphere
standard switches.
Figure 8.16 Example of vSphere host networking with vSphere standard switches.
With the cost of 10Gb Ethernet dropping rapidly, the availability of 40Gb Ethernet, and
the increasing popularity of convergence, it is much more common to see two or four
NICs total per host. The reduced number of ports reduces complexity and cost.
However, this means that separation of traffic through VLANs and quality of service
control becomes much more important, hence the increasing popularity of using
vSphere distributed switches.
Our recommendation is to use 10Gb Ethernet NICs wherever possible. Two 10Gb
Ethernet for SQL Server, vMotion, and/or storage, with two 1Gb Ethernet for
management, can be a good solution. With mega-monster SQL VMs (512GB RAM and
above) or where additional availability is required, we recommend the use of two dual-
port 10Gb Ethernet NICs per host, especially in the case of Ethernet-based storage
(iSCSI, NFS, or FCoE).
Note
Depending on the server and network adapter type being used, the physical
NIC adapters could be presented as multiple virtual adapters to the
vSphere host, such as a single 40Gb Ethernet interface being displayed in
the vSphere host as four 10Gb Ethernet NICs.
Tip
If you start to see a high number of pause frames on physical network interfaces
or dropped packets in ESXTOP or in guest operating systems, you will need to
investigate further. For ESXTOP, the key metrics to watch are %DRPTX and
%DRPRX.
Tip
In some cases, you may need more than two 10Gb Ethernet NICs for an extremely
large database. During the Software-Defined Datacenter Panel for Monster VM
Design at VMworld in 2013, a customer told a story of a SQL data warehouse
with 32 vCPUs, 512GB RAM, 60% read IO requiring 40K IOPS, using iSCSI
storage and CPU utilization of between 50% and 100%. The customer was
having difficulty when trying to perform vMotion operations for maintenance,
because only one 10Gb Ethernet NIC was being used. The recommended solution
in this case was to configure jumbo frames and use multi-NIC vMotion across at
least two 10Gb Ethernet NICs and use network IO control to ensure quality of
service. Each vSphere host had four 10Gb Ethernet NICs configured. To watch
the session, see https://www.youtube.com/watch?v=wBrxFnVp7XE.
Tip
LACP and static EtherChannel with the Route Based on IP Hash setting are not
generally recommended due to their configuration complexity and limitations.
Further information on LACP support and limitations can be found in VMware
KB 2051307, 2034277, and 2051826. Further discussion of the pros and cons of
EtherChannel, LACP, and load-based teaming can be found at
http://longwhiteclouds.com/2012/04/10/etherchannel-and-ip-hash-or-load-based-
teaming/.
In addition to high performance and resilient vSphere host networking, you should aim
to design a high-performance, scalable, and resilient data center network with no single
point of failure. Some oversubscription of network links may be possible to improve
efficiency and cost effectiveness, provided your throughput and latency requirements
can still be met. When you use virtualization, your network will experience higher
utilization, especially at the edge where vMotion, backup, replication, and virtual
machine traffic all combine. Where possible, Leaf-Spine network architecture is
recommended. Figure 8.19 shows an example of a simple Leaf-Spine architecture with
a vSphere host redundantly connected using two dual-port NICs.
Figure 8.19 uses Multi-Link Aggregation Group (MLAG) between the spine switches,
which is suitable on a small scale. Larger-scale designs would typically utilize Equal-
Cost Multi-Path (ECMP) between the Leaf and Spine nodes.
Note
Network I/O Control requires a virtual distributed switch. To learn more
about NIOC, read the vSphere Networking for 5.5 white paper:
http://pubs.vmware.com/vsphere-
55/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-55-
networking-guide.pdf and
http://www.vmware.com/files/pdf/techpaper/VMW_Netioc_BestPractices.pdf.
Caution
When a dependent or independent hardware iSCSI adapter or physical
Converged Network Adapter (CNA) is used for FCoE, none of the traffic will be
visible to vSphere, nor can it be managed or guaranteed by using NIOC. Physical
network-based quality of service, limits, or reservations may be needed to ensure
each type of traffic gets the bandwidth it is entitled to and that application SLAs
are met.
Multi-NIC vMotion
Multi-NIC vMotion, as the name implies, allows you to split vMotion traffic over
multiple physical NICs. This allows you to effectively load-balance any vMotion
operation, including single vMotions. This doesn't require any special physical switch
configuration or link aggregation to support because it's all built in to VMware
vSphere. This feature is available from vSphere 5.0 and above and allows vMotion to
be load-balanced across up to sixteen 1Gb Ethernet or four 10Gb Ethernet NICs. This is
particularly important when you have incredibly large memory configurations per host
(512GB and above) or where you have mega-monster VMs, because it will allow you
to migrate VMs or evacuate a host using maintenance mode much faster, while reducing
overall performance impacts.
Although Multi-NIC vMotion doesn't require a vSphere distributed switch, using it in
conjunction with a vSphere distributed switch and the Network I/O Control feature is
recommended. This will reduce the amount of configuration effort required per host.
vMotion will consume all bandwidth available to it, so using it with a vSphere standard
switch can negatively impact other traffic types because you aren't able to guarantee
quality of service. If you intend to use Multi-NIC vMotion with a vSphere standard
switch, then dedicated physical adapters for vMotion traffic are recommended.
You configure Multi-NIC vMotion by creating multiple VMKernel port groups, each
with a different active adapter, and any remaining adapters configured as standby or
unused. You need to ensure that vMotion is enabled for each of the VMKernel ports
assigned to the port groups. Figure 8.21 illustrates the configuration of Multi-NIC
vMotion on a host with two physical NICs using a vSphere distributed switch.
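As a rough PowerCLI sketch of the same idea on a vSphere standard switch (the host name,
port group names, IP addresses, and NIC names are all placeholders), you would create one
vMotion-enabled VMKernel port per port group and give each port group a different active
uplink:
$vmhost = Get-VMHost "esxi01.example.com"
$vmnic0 = Get-VMHostNetworkAdapter -VMHost $vmhost -Physical -Name "vmnic0"
$vmnic1 = Get-VMHostNetworkAdapter -VMHost $vmhost -Physical -Name "vmnic1"
# First vMotion VMKernel port, active on vmnic0 with vmnic1 standing by
New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch "vSwitch0" -PortGroup "vMotion-01" `
    -IP 192.168.50.11 -SubnetMask 255.255.255.0 -VMotionEnabled $true
Get-VirtualPortGroup -VMHost $vmhost -Name "vMotion-01" | Get-NicTeamingPolicy |
    Set-NicTeamingPolicy -MakeNicActive $vmnic0 -MakeNicStandby $vmnic1
# Second vMotion VMKernel port, active on vmnic1 with vmnic0 standing by
New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch "vSwitch0" -PortGroup "vMotion-02" `
    -IP 192.168.50.12 -SubnetMask 255.255.255.0 -VMotionEnabled $true
Get-VirtualPortGroup -VMHost $vmhost -Name "vMotion-02" | Get-NicTeamingPolicy |
    Set-NicTeamingPolicy -MakeNicActive $vmnic1 -MakeNicStandby $vmnic0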
Caution
Multi-NIC vMotion is only supported on a single nonrouted subnet, so you must
ensure that all vMotion VMKernel ports that are to participate in the vMotion
network have an IP address on the same network subnet.
As mentioned in the Jumbo Frames section of this chapter, Multi-NIC vMotion can
benefit from the increased MTU that jumbo frames provide. Figure 8.23 shows the
performance characteristics of single NIC vMotion and Multi-NIC vMotion recorded
during a performance test of vSphere 5.0.
Figure 8.23 Single-NIC and Multi-NIC vMotion performance.
Tip
Each 10Gb Ethernet NIC's worth of vMotion traffic consumes approximately one physical
CPU core on the vSphere host. To get optimal performance,
you need to ensure that you have sufficient physical CPU resources for the
vMotion traffic and virtual machines.
Caution
In the original release of vSphere 5.0, there was an issue that could impact the
wider network when Multi-NIC vMotion was used, due to one of the MAC
addresses timing out. This issue was fixed in vSphere 5.0 Update 2. For further
information, see http://longwhiteclouds.com/2012/07/15/the-good-the-great-and-
the-gotcha-with-multi-nic-vmotion-in-vsphere-5/.
Tip
For information on configuring iSCSI port binding, see VMware KB 2038869.
For a comparison of supported storage protocols, see
http://www.vmware.com/files/pdf/techpaper/Storage_Protocol_Comparison.pdf
and http://www.netapp.com/us/media/tr-3916.pdf.
Summary
In this chapter, we covered how to get the required network performance from your
SQL Server VMs by choosing the right virtual network adapter, VMXNET3, and how
using jumbo frames, where appropriate, can greatly increase performance and reduce
CPU utilization. We showed you some of the advanced tuning options that may be
required in certain situations to optimize the virtual network performance of SQL
Server databases, including when using AlwaysOn Availability Groups and other forms
of database replication.
You can't have a high-performing virtual network without a high-performance, scalable,
and reliable physical network. We have covered the critical VMware vSphere network
design components, including virtual switches, physical adapters, teaming and failover,
and the physical network. Our recommendation is to use vSphere distributed switches
and port groups with the Route Based on Physical NIC Load setting to ensure SQL
Server network performance. To provide quality of service, leverage the Network I/O
Control feature, and to enhance availability, use vDS Network Health Check and vDS
Backup and Restore. When required for large hosts and monster VMs, use Multi-NIC
vMotion.
Your storage network is also critical to the performance of your SQL databases and
their data availability and integrity. Having a high-performance and resilient network
includes your storage network. Your objective is to design the best network possible
within your constraints, while minimizing or eliminating single points of failure.
We concluded this chapter by discussing network virtualization and virtualized network
security and how security can be greatly enhanced in a virtualized environment.
Measured by virtual server access ports, VMware is the biggest networking company in
the world. By virtualizing your networking and security, you benefit from the
performance of distributed virtualized computing and web-scale architectures, with
performance increasing in line with Moore's Law.
Chapter 9. Architecting for Availability: Choosing the
Right Solution
There are many choices when it comes to protecting your database, but the database is
not the only application to consider. Too often, individuals become focused on a
technology rather than a solution. This chapter will not provide a silver-bullet solution
for database availability. The goal of this chapter, and the philosophy of the authors, is
to match the right solution to the right use case based on discussions with the customer.
Many solutions can work; the important thing is to design the one that fits the use case
at hand.
This chapter will walk through options available to DBAs and vSphere administrators
for high availability, business continuity, disaster recovery, and backing up their
databases and their infrastructure. What value have you provided the business if the
database is up and running but the rest of the application stack is down?
Tip
You want the business to define the application's requirements for availability,
recoverability, downtime, and so on. It is easier to justify the cost of an
availability solution when the business is driving the requirements.
Come at this from multiple angles. The application itself may not be that important on
the surface, but the systems it interacts with may be very important and therefore change
how the application should be protected. It is your job to be consultative during this
process.
How do you get started? Simple: generate a list of questions you need answered in order
to provide a consultative recommendation to the business. Make this a living document,
and do not be afraid to go back and interview an application owner again. Questions to
ask the business include the following:
Do people's lives depend on this application? (This would be the case for a
healthcare application, for example.)
Is this a revenue-generating application?
What other applications interact and/or depend on this application?
How long can the application be down before it affects the business? (You must
quantify the cost of the downtime.)
Is a crash-consistent copy acceptable? (Define this for them.)
Is this application (database) cluster aware? (Direct them to the vendor, if
necessary.)
Is this application subject to any regulatory requirements? If so, which ones?
What is your budget for this application?
As you become more familiar with scoping and providing solutions, your list will grow
and become more refined. An important point: The list should not be designed so there
is only one right answer.
With the interviews complete, you should have an idea of the importance of the
application as well as the availability, continuity, and recoverability requirements. You
can now provide a menu of options to the application owner along with your
consultative recommendation. In a world where you are competing with cloud providers
for better, faster, cheaper, do not forget that great customer service goes a very long
way.
Business Continuity
Business continuity is the planning done by a company to ensure its critical business
functions remain operational or can resume shortly after some sort of service
interruption, whether minor or major. An example of a minor outage would be the loss
of a motherboard inside a physical server. An example of a major outage would be the
loss of a data center due to a natural disaster. Developing business continuity plans
involves business impact analysis (BIA) studies to understand the impact systems have
and their importance to the company. BIAs often assign recovery priorities to
applications. The company's business continuity plan should describe how to recover
the environment, resume operations, and relocate assets (including people), and it
should include testing and validation of the plan. The goal of a business continuity
plan is to ensure critical operations of the business continue despite the outage
incurred. It answers the question, "How can the company be rebuilt after a massive
outage?"
Note
For more information, the National Institute of Standards and Technology
(NIST) has created some recommended practices that can be found by
searching for the most recent revision of NIST Special Publication 800-34:
Continuity Planning Guide for Information Technology Systems.
Disaster Recovery
Disaster recovery involves activities executed during a disaster, whether minor or
major. Disaster recovery encompasses the actions taken by individuals during the actual
event. Disaster recovery operates at the micro level, whereas business continuity
operates at the macro level. One additional difference is that a disaster recovery plan
tends to focus more on technology, whereas business continuity takes into account the
entire scope of resumption from an outage.
For the virtualized SQL Server environment, it is important to understand the pieces and
parts that make up the virtualized SQL Server stack and the applications these databases
support, because this will dictate their recovery procedure and priority. The disaster
recovery plan contains the steps necessary to recover the virtualized SQL Server after
the outage.
Note
More information on SRM can be found at
https://www.vmware.com/products/site-recovery-manager/.
"Fragmented" is the one-word answer we get from the majority of our customers when
we ask them to define their disaster recovery plans. They tell us they have 37 different
procedures and methods for recovering applications in their environment. Customers
are looking for simplification of their disaster recovery plans. This is where we suggest
implementing a tiered solution. The tiered approach involves providing four or five
tiers, each offering different capabilities. Take these capabilities, place them in a menu
format, and present them to the application owners after the interview process. Table
9.1 has an example of what this solution might look like. The first tier (Tier 1) leverages
synchronous replication at the storage layer, provides multiple point-in-time rollback
capability, and is tied to an orchestration and automation tool. This solution is tied to
RPOs and RTOs. Each tier is composed of multiple technologies to achieve the most
cost-effective solution possible. The tiering of solutions also simplifies and reduces the
fragmentation of recovering systems during an event.
Table 9.1 DRaaS Tiered Offering
Tiering the solution accomplishes many objectives. First, it simplifies the disaster
recovery planning process. It reduces the cost of disaster recovery while making that
cost more predictable and manageable. It flushes noncritical systems out of the
recovery plan, which is accomplished via Tier 5. The lowest tier, which provides no
RPO/RTO guarantee in the event of a disaster, also allows continued use of an existing
technology (for example, tape) in which investments have been made but which does not
figure into the future direction of the organization.
Finally, remember that we work in IT. Murphy (aka Murphy's Law) has a cubicle three
down from yours. Despite your best efforts (flawless designs, impeccable
implementations), something, somewhere will go wrong. Make sure you have plans to
recover, and be sure to regularly test these plans.
vMotion
Do you remember your first vMotion? I sure do. There are few things in our lives for
which we remember exactly when and where we were when they occurred. Throughout our
travels, one question that always gets a resounding yes is when we ask about people's
first vMotion moment. vMotion provides the ability to migrate (move) a powered-on
virtual machine from one physical host to another physical host with no downtime or
interruption to the services provided by that virtual machine.
Memory Lane
My first vMotion moment occurred in a data center in Akron, Ohio. My VMware
SE, Bob, came on site and we updated ESX (long live the MUI!) to the
appropriate version and we tested vMotion. I was sure it was a trick. I watched
as a nonbuffered video played in a virtual machine and was migrated from
ESX01 to ESX02. In fact, I was so sure it was a trick, I shut down ESX01. I
cannot recall the video that was playing, but I do recall that moment in time. What
was yours?
vMotion migrates the entire state of a virtual machine from one physical host to another
physical host. State information includes current memory content, BIOS, devices
attached to the virtual machine, CPU, MAC address, and other properties that make up
the virtual machine.
For your standalone SQL Servers, the value this brings is that if there is an issue with
the underlying hardware (for example, a NIC goes bad) in a virtual environment, the SQL
Server virtual machine's network traffic is seamlessly routed out another physical NIC,
and the SQL Server VM can then be vMotioned to another physical host. There is no
downtime incurred by SQL Server to fix the failed NIC. From an SLA perspective, SQL
Server continues to provide services. In the physical world, you would have to shut down
the SQL Server, replace the NIC, and then power on the SQL Server. This is either a
service interruption, or it means you are staying around to fix the NIC and ensure SQL
Server boots and resumes services during a change window. I know which option I like
better: vMotion the SQL Server, and let someone on the infrastructure team deal with the
failed NIC. vMotion is a battle-tested, tried-and-true core feature of the vSphere
platform; there is no reason not to use it (unless you are running SQL Server AlwaysOn
Failover Cluster Instances; more on that later).
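For those who prefer to script it, a vMotion can also be triggered from VMware PowerCLI. The following is a minimal sketch, assuming PowerCLI is installed and connected to vCenter; the vCenter, VM, and host names are placeholders for illustration.

# Migrate a running SQL Server VM to another host with no downtime (vMotion)
Connect-VIServer -Server 'vcenter.example.local'
$vm = Get-VM -Name 'SQL_2012_a'                              # hypothetical VM name
Move-VM -VM $vm -Destination (Get-VMHost -Name 'esx02.example.local')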
Storage vMotion
Storage vMotion allows an administrator to migrate the storage of a virtual machine
from one data store to another data store while the virtual machine is powered on, with
no disruption in service. For SQL Server virtual machines running on vSphere, this
provides many benefits. The first is that if the data store a virtual machine is running
on is running out of room and action cannot be taken to grow the current data store, the
SQL Server's virtual disks can be relocated onto a data store that has sufficient room.
Another use case is when a SQL Server virtual machine's I/O requirements exceed the
capabilities of the data store on which it resides; the virtual machine's disk files can
be relocated onto a data store that will satisfy the performance requirements.
Note
Storage vMotion operations are greatly enhanced when using VAAI-
compliant subsystems. For more information on VAAI, refer to Chapter 6.
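A Storage vMotion uses the same PowerCLI cmdlet as a host migration, just pointed at a data store instead of a host. This is a sketch; the VM and data store names are assumptions.

# Relocate a VM's virtual disks to a different data store while it stays running
$vm = Get-VM -Name 'SQL_2012_a'                              # hypothetical VM name
Move-VM -VM $vm -Datastore (Get-Datastore -Name 'Datastore02')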
Storage DRS
Much like how DRS pools together CPU and memory resources, Storage DRS pools
together storage resources. Storage DRS provides a means by which the management of
a group of data stores is automated based on variables such as latency and utilization.
As virtual machines are placed into a Storage DRSenabled cluster, Storage DRS will
monitor the individual data stores and make appropriate migration decisions based on
how Storage DRS is configured. The benefit for a virtualized SQL Server is that
policies can be put in place to manage the SQL Server virtual machines to protect
against a poor-performing data store as well as a data store that is running out of
capacity.
Note
For more information on X-vMotion, see
http://blogs.vmware.com/vsphere/2012/09/vmotion-without-shared-
storage-requirement-does-it-have-a-name.html.
vSphere HA
vSphere HA provides protection when running multiple virtual machines on the same
host. vSphere HA is designed to monitor the physical ESXi hosts for availability. If an
ESXi host experiences an outage, it is vSphere HA's job to restart the affected virtual
machines on another ESXi host in the cluster. This provides a recovery time measured by
the time it takes to reboot a virtual machine. vSphere HA is turned on by a check box for an entire
vSphere cluster: no complex configuration or special skill sets required. The value this
brings for SQL Server virtual machines is twofold.
First, by virtualizing standalone SQL Servers, this automatically provides them
hardware-level protection. In the physical world, if there is a hardware failure, there is
a service interruption until the hardware issue is resolved. This usually translates to
downtime measured in hours, not minutes. When the ESXi host a SQL Server virtual
machine is running on experiences a hardware failure, vSphere HA detects this outage
and restarts the SQL Server virtual machine on another host. Based on the ACID
properties of the SQL Server database (discussed in the ACID section, later in this
chapter) and the storage I/O crash-consistency properties within ESXi, the SQL Server
virtual machine is powered on, the Windows operating system boots, and SQL Server
loads and resumes operation. This is quite a handy feature if the failure occurs at 3 a.m.
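The "check box for an entire vSphere cluster" can also be flipped from PowerCLI. A one-line sketch follows; the cluster name is an assumption for illustration.

# Enable vSphere HA on an existing cluster (equivalent to the HA check box in the client)
Get-Cluster -Name 'Prod-Cluster' | Set-Cluster -HAEnabled:$true -Confirm:$false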
Note
ESXi will maintain the correct order of writes to allow for a proper restart
after a crash. ESXi acknowledges a read or write to the guest operating
system only after the read or write is verified by the hardware controller to
ESXi. When reading the following KB article, be sure to note the
difference between a Type 1, bare-metal hypervisor (ESXi) versus a Type
2, hosted hypervisor (VMware Workstation):
http://kb.vmware.com/kb/1008542. The exception is if vFRC is involved,
then reads originate from cache and writes are sent to storage.
vSphere App HA
vSphere App HA allows for the monitoring of applications running inside virtual
machines. vSphere App HA allows an administrator to monitor the location and status
of these applications, define remediation actions to take place if a service
(sqlservr.exe, for example) becomes disabled, and generate alerts and notifications that
a monitored service has been impacted. As of the writing of this chapter, vSphere App
HA supports SQL Server 2005 through 2012. To learn more or to check for updated
information, check http://www.vmware.com/products/vsphere/features-application-HA.
We have worked with a lot of customers who have reevaluated their existing SQL
Server clusters to determine which SQL Server databases could run on a standalone
SQL Server running on the vSphere platform. There are valid reasons to move and not
to move SQL Server databases off a clustered instance, but customers have used
vSphere HA and vSphere App HA to drive complexity and cost out of their
infrastructure while maintaining and improving the availability of their SQL Server
databases.
vSphere Replication
VMware's vSphere Replication is a hypervisor-integrated replication engine. vSphere
Replication is integrated into the kernel of the ESXi hypervisor. vSphere Replication
enables the protection of virtual machines by replicating them from one ESXi host to
another ESXi host. vSphere Replication is hardware independent, so the target server
can be different from the source server. In addition to being hardware independent,
vSphere Replication is also storage independent. This means you can run your
production virtual machines on a high-performing storage array and replicate your
virtual machines to a remote facility running three-year-old servers with direct attached
disks. vSphere Replication also allows an administrator to select the VMDK file type.
For example, on the production side, the VMDK file type can be Thick Provisioned
Eager Zeroed and on the recovery side the VMDK file type can be Thin Provisioned.
vSphere Replication integrates with Microsoft Windows Volume Shadow Copy Service
(VSS) via VMware Tools. Once this has been configured for the virtual machine, writers
are flushed and the application and operating system are quiesced to ensure full
application consistency for backups. If VSS fails for some reason, vSphere Replication
continues despite the failure, still provides an OS-consistent backup, and generates a
notification that a VSS-level backup was not achieved.
Why is this important for SQL Servers? This provides a hypervisor-level replication of
your SQL Server virtual machines. For some (maybe not all) SQL Servers, the recovery
procedure is restore from backup. With vSphere Replication, the ability exists to
replicate this virtual machine to a remote site. A company can buy standalone physical
servers, install ESXi on the direct attached storage, enable vSphere Replication, and
now DR exists for all SQL Servers. Finally, for additional cost savings, the VMDK file
format can be changed to Thin Provisioning for increased density and utilization of the
physical assets at the alternate site.
Note
vSphere Replication works against powered-on VMDKs, requires Virtual
Hardware 7 or later, and has an RPO of 15 minutes to 24 hours. For more
information on vSphere Replication, VMware published this white paper:
http://www.vmware.com/files/pdf/vsphere/VMware-vSphere-Replication-
Overview.pdf.
ACID
In the 1970s, Jim Gray defined the properties of a reliable transaction system.1 ACID is
an acronym that represents a method by which database transactions are processed
reliably, with a transaction defined as a single logical operation on the data.2
1 http://en.wikipedia.org/wiki/ACID
2 http://en.wikipedia.org/wiki/ACID
ACID stands for Atomicity, Consistency, Isolation, and Durability. Atomicity can be
equated to all or nothing. This means that if any part of a transaction fails, then the entire
transaction fails. For SQL Server, this means that if part of a transaction fails, then the
database itself is left unchanged; sometimes this involves rolling back changes so that
the database returns to its original state.
Consistency refers to the state the system is in before a transaction begins and after it
completes. If for any reason a transaction would cause the system to enter an invalid state
upon its completion, the transaction is stopped, any changes made to the system are
rolled back, and the system returns to a consistent state once again. The system will start
and end a transaction in a consistent state.
Isolation allows a transaction to believe it has sole access to the system, which may
not be the case. SQL Server is designed to be accessed by multiple individuals, and it is
imperative these individuals and their transactions believe they have exclusive use of
the system. Transactions may occur at the same time. If these transactions do not believe
they have dedicated use of the system, the system may not be deemed consistent, thus
causing a roll back. It is the isolation property that protects against the consistency
violation.
Durability is the last of the four transaction properties. Durability states that once a
transaction is committed, it is permanent. In other words, no matter what happens to the
system after a successful transaction, the transaction will persist. This includes
hardware failures, software failures, and so on. For SQL Server, this is accomplished
by writing information into a transaction log file prior to releasing the transaction.
Writing the transaction to physical media meets the durability requirement for a
transaction.
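The atomicity and durability behavior described above can be seen in a few lines of T-SQL, shown here wrapped in Invoke-Sqlcmd so it can be run from PowerShell (assuming the SQLPS module is loaded). The server, database, and Accounts table are hypothetical names used only for this sketch.

$tsql = @'
BEGIN TRY
    BEGIN TRANSACTION;
    -- Both updates succeed together or not at all (atomicity)
    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
    UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountId = 2;
    COMMIT TRANSACTION;   -- once committed, the change is durable
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;   -- any failure undoes both updates
    THROW;
END CATCH
'@
Invoke-Sqlcmd -ServerInstance 'SQL_2012_a' -Database 'TestDB' -Query $tsql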
How does the introduction of ESXi affect the ACID properties of a transaction? The
short answer is, it does not. The reason is that ESXi will only acknowledge a read or
write to the guest operating system after the read or write is verified by the storage
controller; if vFRC is involved, then reads are from the cache and writes go to the
storage. Once this read or write is verified, it is handed off to the guest operating system
to process, and this now becomes a Windows and SQL operation. From a physical
server or virtual server perspective, it is exactly the same when dealing with a Type 1
hypervisor such as vSphere.
Note
If you want to read Jim Gray's paper on the transaction concept, it can be
found at http://research.microsoft.com/en-
us/um/people/gray/papers/theTransactionConcept.pdf.
SQL Server AlwaysOn AGs support an Availability Group Listener (that is, a DNS
name and IP address) for each Availability Group. Point clients to the AG Listener for
them to access your SQL Server AG implementation, and the AG Listener will direct
them to the appropriate replica. The AG Listener is responsible for redirecting requests
when a SQL Server participating in a SQL Server AG is no longer available.
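From the client side, pointing at the listener is just a connection string change. The sketch below uses a hypothetical listener and database name; the MultiSubnetFailover keyword assumes a client stack that supports it (.NET Framework 4.5 or later, or SQL Server Native Client 11).

# Connect through the Availability Group listener rather than an individual replica
$connectionString = 'Server=tcp:sql2012agl01,1433;Database=SalesDB;' +
                    'Integrated Security=SSPI;MultiSubnetFailover=True'
$conn = New-Object System.Data.SqlClient.SqlConnection $connectionString
$conn.Open()
# ... run queries against whichever replica the listener directs us to ...
$conn.Close()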
So what does a SQL Server AG implementation look like on vSphere? Pretty much any
way you want it to look. Whereas a SQL Server FCI uses a shared SCSI bus, a SQL
Server AG does not, which frees us from the tyranny of the shared SCSI bus. Because
the shared SCSI bus is not a factor, VMware will support the use of VMDK files,
vMotion, DRS, Storage vMotion, Storage DRS, Enhanced vMotion, vSphere HA, and
other features. In short, this is a great stack on which to run your mission-critical SQL
Server databases.
For the most current, up-to-date information on what is supported by VMware for SQL
Server AlwaysOn AGs, reference http://kb.vmware.com/kb/1037959 and review the
Non-Shared Disk and SQL AlwaysOn AG section in the link.
Summary
In this chapter, we have discussed how to put together the right high availability solution
for your environment. As we said at the start of this chapter, we never send a customer a
white paper and say this is the right solution for their environment. Instead, we work to
understand the business requirements and map those to the features and functionality
present in products to derive the right solution for the customer.
We discussed how shadow IT is now getting a budget. We discussed the growing
importance of understanding the features and functionality present in the entire stack:
VMware and SQL Server. We talked about cross-education so that each team
understands the pros and cons of each solution as well as the creation of a menu to
simplify the offering for the business.
This marks the last of the architecting chapters. Chapter 10, "How to Baseline Your
Physical SQL Server System," will discuss the importance of baselining your SQL
Servers and provide examples of how to baseline SQL Server.
Chapter 10. How to Baseline Your Physical SQL Server
System
The title of this book is Virtualizing SQL Server with VMware: Doing IT Right. An
essential part of doing it right is having a good understanding of the workload
characteristics and configuration of the existing physical source systems that are to be
virtualized. Remember that unlike in a physical environment, where it's common
practice to oversize the server, in a virtualized infrastructure it's important that you
right-size the VM that houses your SQL Server database. You can always hot-plug a vCPU
and hot-add memory if more is needed.
Tip
As a DBA, it is very important you embrace this new way of managing the
environment, one where you right-size for today, knowing that in the future if you
need more resources, such as CPU, memory, and disk, they are just a click away.
In fact, oversizing VMs can actually degrade performance.
You can get this understanding of what is needed to properly configure the virtualized
environment by recording and analyzing a performance baseline of your existing
physical systems. This is one of the most critical success factors for physical-to-virtual
SQL Server migrations, so that you can prove the same or better performance
characteristics after virtualization.
Tip
It's very important to baseline your important physical systems. This is one of the
most important steps, if not the most important step, you need to take if you
want to properly virtualize your critical SQL Server databases.
This chapter covers both infrastructure and application baseline activities related to
SQL Server 2012 and provides you with the why, when, what, and how of measuring
the baseline successfully, as well as how to ensure you at least meet (if not exceed) your
system's required performance once it is virtualized. This applies even if the database
is being implemented first as a virtual machine, although in that case, you will design a
valid benchmark or load-test for that particular database to prove it meets your
requirements.
Tip
The baseline is a measurement of what the current system performance is as well
as the critical metrics that make up that performance.
Before you begin to baseline your existing system, ask yourself this question: Are you
happy with how the system performs today? If the answer is no, then what makes you
think that moving it to a virtualized infrastructure alone will make it better?
Virtualization is not a silver bullet that solves all problems. When you virtualize a poor-
performing system, you should expect poor performance unless something changes. This
is one of the many reasons establishing a proper baseline is so important.
To better illustrate the value of a proper baseline, let's talk about a situation we had
happen earlier this year. We had as a client a very large engineering firm that went out
and purchased state-of-the-art hardware (both a new server and storage array) to run
their entire environment on. They moved just the database onto the new infrastructure.
They expected everything to get faster, yet the opposite happened. The new
infrastructure ran substantially slower than the older infrastructure that was running the
database and a number of other applications. After several failed attempts to correct the
problem with the new infrastructure, the firm called us in to determine why.
The first thing we did was to baseline the existing system. We then baselined the
database sitting on the new infrastructure. When we compared the two baselines, what
jumped right out at us was the fact that the new disk storage array was substantially
slower than the old disk storage array. Slow disk drives always mean a slow database.
This is why it is so very important to baseline before you begin the journey of
virtualizing a database.
Tip
Over 80% of the problems in a virtualized environment have to do with storage.
The storage is either misconfigured, misused, or mis-sized.
The baseline is used as a reference to determine whether the virtualized system meets
or exceeds the performance of the physical system it was migrated from. You need to
capture a baseline of the existing live production system to act as this reference while at
the same time not impacting production performance.
Tip
It is important to capture a baseline of the existing production system while at the
same time not impacting production performance.
Later in this chapter, we show you what tools to use and how to properly baseline the
performance of your production system while not impacting performance. An example
of some of the metrics to consider that make up the performance baseline are displayed
in Figure 10.1. In this figure you see many other things you need to consider, from
security to operations, but they are not the focus of this chapter.
Figure 10.1 SQL Server migration: the big picture.
It's very important to gather a baseline that is representative of the workload. There is
no point in baselining a system at night when all the real work happens during the day,
and vice versa.
Tip
It's very important that the baseline sample you take is a representative
workload.
In addition to system and application performance metrics, the baseline should include
time reference data, such as time of day, day of week, week of month, and month of
year. This is to ensure that seasonal or cyclic anomalies in workload patterns are
captured and understood. If the analysis period doesn't include critical cyclical system
peaks, adjustments will need to be made based on a risk factor during the design and
validation of the virtual infrastructure. System logs can be useful to help provide the
delta of system performance between the baseline during the analysis period and
historical peaks. It is also important to understand the workload. If your sampling
interval is every 5 minutes but the system ramps up and back down within 3 minutes, you
might not capture this peak in your sample set, and you will not identify it until you are in
production.
Tip
Work with both the DBA and the application owners to understand the workload
and determine an appropriate sampling period and duration.
Our experience has taught us that when sampling a SQL Server database, the sampling
interval should be 5 minutes or less. We typically recommend 15-second intervals.
When sampling T-SQL, we recommend using a 1-minute interval. A lot can happen in a
database in a short amount of time.
Tip
When sampling a SQL Server database, we highly recommend using a very
frequent sampling interval of 5 minutes or lesswith 15 seconds being the
recommendation.
Note
Per-database stats on a consolidated database server are not easy to
measure when you are recording physical infrastructure performance
metrics. For this reason, you need to collect application-level performance
metrics in addition to the infrastructure.
A number of important metrics that are relevant to all SQL servers are available to
baseline. Table 10.1 illustrates the metrics you should monitor and the recommended
thresholds for them.
Table 10.1 SQL Server Baseline Infrastructure Metrics
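One way to capture metrics like these at the recommended 15-second interval is the Windows Get-Counter cmdlet, as sketched below. The counter list is only a sample, the SQL counter paths assume a default instance, and the one-hour duration and output path are assumptions; run it on (or point it at) the SQL Server being baselined and adjust for your environment.

# Collect a one-hour sample at 15-second intervals and save it as a .blg file
$counters = '\Processor(_Total)\% Processor Time',
            '\Memory\Available MBytes',
            '\PhysicalDisk(_Total)\Avg. Disk sec/Read',
            '\PhysicalDisk(_Total)\Avg. Disk sec/Write',
            '\SQLServer:Buffer Manager\Page life expectancy',
            '\SQLServer:SQL Statistics\Batch Requests/sec'
Get-Counter -Counter $counters -SampleInterval 15 -MaxSamples 240 |
    Export-Counter -Path 'C:\Baseline\sql_baseline.blg' -FileFormat BLG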
While you are collecting and analyzing the various infrastructure metrics, you need to
pay attention to averages and peaks. Ideally you will be capturing the existing system's
performance during a cyclical system peak workload period, which might be end-of-
month, end-of-quarter, or end-of-year processing. You need to determine when the
likely peaks are based on your business and your particular systems. However, if you
are not able to capture the peaks and the averages during a peak business and system
period, you will need to make adjustments based on your knowledge of those times. In
some organizations, peaks of system volumes can range from 20X to 100X normal
system volumes.
If your system is supporting an Internet-facing application, you will have to take
unexpected peak loads and spikes into account. This can be very hard to predict, so you
should take a conservative approach based on what you think the worst-case scenario
is, or the maximum that has been observed in the past. You can then, based on your
workload model and business knowledge, extrapolate what may be required, taking into
account known future activities.
Note
A vCPU of a virtual machine is only single threaded. Unlike a physical
CPU core, it does not include a hyper-thread. If you want to get the most out
of your system and do like-for-like comparisons with the physical source
system, you will need to configure your VM with as many vCPUs as you
have CPU threads on your physical platform.
If you happen to be reading this book in advance of virtualizing an SAP system using
SQL Server 2012, you may be interested to know that the largest SAP system we're
aware of using SQL Server as its database platform running on VMware vSphere is
approximately 10.2 million users and around 8 million SAPS, including production and
non-production environments. This is a substantial system for a very large government
department.
Note
When using SPECInt or SAPS Benchmarks as a comparison between CPUs
of different generations or types, you should take into consideration that
they were determined at close to 100% system utilization. This means they
are only good as a relative comparison and to translate a CPU utilization
figure on one system to another. You should allow some headroom for
peaks when doing your calculations.
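The relative comparison the note describes reduces to simple arithmetic. The ratings and utilization figures below are invented purely for illustration; substitute the published benchmark figures for your actual source and target CPUs.

# Translate observed CPU utilization from a source host to a target host
$sourceRating   = 400      # hypothetical SPECint_rate of the physical source host
$targetRating   = 800      # hypothetical SPECint_rate of the target vSphere host
$sourcePeakUtil = 0.60     # observed peak utilization on the source (60%)

$normalizedDemand = $sourcePeakUtil * $sourceRating      # 240 units of CPU demand
$targetUtil       = $normalizedDemand / $targetRating    # 0.30, or about 30% on the target
$withHeadroom     = $targetUtil / 0.8                    # keep ~20% headroom: plan for 37.5%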
Tip
Don't forget to exclude your data files and log files from your antivirus (AV)
scanning. It can have a huge impact on performance, especially when real-time
AV scanning is used.
Summary
This chapter covered both infrastructure and application baseline activities related to
SQL Server 2012. Much of the information covered could be applied to other versions
of SQL Server, or even completely different applications. It provided you with the why,
when, what, and how of measuring the baseline successfully, and how to ensure you at
least meet (if not exceed) your system's required performance when virtualized. When
virtualizing SQL Server databases, especially large and business-critical databases, it's
important that you reduce risk and eliminate guesswork as much as possible. You want
to virtualize, but you don't want to compromise, be it performance, availability,
recoverability, or any other SLA. If that is your goal, then baselining your workloads is
critical.
The real measure of your success will be when your databases are virtualized and meet
or exceed the requirements set out at the start of the project. You will never know that
without a good baseline to measure from. If you do this part of your job well, your
operational teams will also thank you for it. They will be able to leverage the data
during system maintenance and troubleshooting. It will form part of the ongoing
database capacity and performance management processes.
This chapter has given you the essential tools with which to successfully baseline any
SQL Server 2012 system. You now know how to compare between generations of
hardware platforms, even different hardware architectures, so you can right-size the
design and architecture of your systems, based on your requirements. This will allow
you to achieve optimal performance with service quality assurance.
Chapter 11. Configuring a Performance Test: From Beginning to End
To this point, we have provided deep dives into individual topics for virtualizing SQL
Server. We are often asked, How do I test SQL on vSphere? In this chapter, we are
going to put it all together and walk you, the reader, through setting up SQL 2012 on
Microsoft Windows Server 2012. We will configure the AlwaysOn Availability
Groups, and using an open source load-generation tool, Dell DVD Store, we will
simulate workload. Furthermore, it should be noted this configuration has also been
shown to work with Windows 2008 R2 as the operating system supporting SQL 2012
and Windows 8 as the desktop generating the workload.
Introduction
Before we begin discussing what is needed for the test, let's cover why we are running
this test:
Is this a test to show the DBAs in the organization how well virtualized SQL can
perform on a vSphere infrastructure?
Is the test part of a bakeoff between physical and virtual configurations?
Is this simply a test of functionality?
Once we understand the why, we can set proper expectations for the test. This means
creating the proper documentation, detailing the test plan, identifying and monitoring
key performance indicators, ensuring consistency between tests (for example, if
measuring physical versus virtual performance), and ensuring proper sponsorship.
So that we are on the same page, we are creating a performance test in this chapter that
has the ability to stress the infrastructure beyond its limits. Be mindful of where this
configuration is being stood up, the time of day, and the duration of the testing. We have
seen individuals set up performance tests using production equipment and bring the
production environment to its knees. Don't be that person.
Caution
To be clear, run the following test configuration against non-production
equipment.
It should be noted that some of the configuration options presented in this chapter do not
follow production best practices for implementing SQL Server. Be mindful of this when
you are configuring your implementation and make the appropriate changes to ensure
adherence to your companys policies. Be cognizant of the settings that are being chosen
and understand their impact so as not to generate any REGs (rsum-generating events).
It is important to know why you are making a particular setting change before you make
that change. What may initially be a harmless configuration change can have serious
downstream implications to the environment.
Tip
Do not work on these performance tests in a vacuum. Depending on the goals,
size, and configuration, assistance and buy-in may be necessary from the DBA,
Network, and SAN teams, and is critical to the success of this initiative. Use this
as an opportunity to educate your coworkers on the benefits of virtualizing SQL
Server.
We are creating this test in total isolation in our vSphere 5.5 environment. The vSphere
5.5 lab we used for this configuration consists of two IBM x3650 M2 hosts, each with
128GB of RAM. These hosts are connected via Fibre Channel to an EMC VNX. The
LUNs are configured in a Data Store Cluster configuration. Each data store is
approximately 1TB in size. Each physical host has seven 1Gb NICs available, and we
are using distributed virtual switches on the ESXi hosts. We have carved out a
dedicated VLAN for the purposes of this test so as not to affect other workloads running
on these hosts. We stood up the Active Directory Domain Services Server, SQL
Servers, and Windows 8.1 virtual machines from Microsoft ISOs. We downloaded
vCOPs, Hyperic, and Virtual Infrastructure Navigator (VIN) virtual appliances and have
these running to provide us telemetry of our virtual environment. VMware vCenter
Server is running as a virtual machine in our configuration.
What We Used: Software
Here is a list of all the software used:
vCenter Server 5.5
Two ESXi 5.5 hosts
One Windows Server 2012 Standard running Active Directory Domain Services
(AD DS) along with DNS
Two Windows Server 2012 Datacenter Edition Servers, each running SQL Server
2012 Enterprise Edition Service Pack 1
One Windows 8.1 x64 desktop
Dell DVD Store 2.1
Strawberry Perl for x64 Windows
Unix-to-DOS conversion utility
What You Will Need: Computer Names and IP Addresses
You will need the following computer names:
AD DS virtual machine name
Two SQL Server 2012 virtual machine names
Windows 8.1 virtual machine name
Windows Failover Cluster name (shows up as a Computer Name in AD)
SQL Server Listener name (shows up as a Computer Name in AD)
The following is a bulleted list of the IP addresses needed to stand up the lab. We also
included Table 11.1, which represents the name, operating system version, SQL
version, and IP addresses of all the virtual machines used in our lab:
One IP address for the Windows Server 2012 AD DS virtual machine
Four IP addresses for the SQL Server 2012 virtual machines
One IP address for the Windows 8.1 virtual machine
One IP address for the Windows Failover Cluster
One IP address for the SQL Server 2012 Listener
Figure 11.4 Successful deployment of the Hyperic agent to a group of virtual machines.
Now that we have successfully installed the vCenter Hyperic agent inside our virtual
machines, we will move on to configuring their VMDK files.
Note
The difference between shared disk and non-shared disk in VMware
KB 1037959 is based on Microsoft's requirement for a disk to be shared
among multiple systems. For SQL 2012 AlwaysOn Availability Groups,
this is not a requirement; however, SQL 2012 AlwaysOn Failover Cluster
Instances (FCIs) do have this requirement.
Note
Although additional settings can be made to increase performance of the
system, you should weigh the impact of these settings against running as close
to a default configuration as possible. Remember the Keep It Simple rule: it
scales better than having a bunch of one-off configurations. But if you
need it, use it.
To add additional virtual SCSI adapters to a virtual machine, browse to the virtual
machine and click Edit Settings. When the virtual machine's dialog box pops up, click
the down arrow next to New device: and select SCSI Controller (see Figure 11.6) and
then click Add.
Figure 11.6 Adding a virtual SCSI controller to a virtual machine.
After you click Add, you will notice a new entry for New SCSI controller has appeared
in the dialog box. Click the down arrow to expand this selection, and select VMware
Paravirtual as the controller type, as shown in Figure 11.7.
Figure 11.7 Changing the virtual controller type to VMware Paravirtual.
Click Add and repeat this process until three new SCSI controllers have been added.
For the purposes of this lab, we striped the VMDKs across the three available
datastores in our datastore cluster. All three of our datastores have the same
performance characteristics and are roughly the same size. We attempted to distribute
them as much as possible to spread out the load. When looking at VMDK placement on
the disk subsystem, it is important to match the VMDK's purpose (OS and binary versus
log drive) to the underlying storage. For more information on the VMDK-to-datastore
mapping, read Chapter 6, which goes into various configuration options available for
your configuration.
Next, we are going to add VMDKs to the virtual machine and strategically place them
on the new SCSI controllers. Click the down arrow next to New device and select New
Hard Disk and click Add. A new line will appear in the dialog window labeled New
Hard Disk. Click the down arrow and make the following changes:
Correct VMDK size based on the size of the test. Note that each VMDK may be a
different size based on its function (database, logs, tempdb, backup).
Select Thick provision eager zeroed.
Set Virtual Device Node (see Table 11.2 for information on how we striped the
VMDKs for these two virtual machines).
Note
SCSI0 is LSI Logic SAS since we are only putting the OS and backup
VMDKs on this adapter. For configurations in which we would put a
VMDK-hosting DB or log data, we would make this a Paravirtual SCSI
adapter. For more information on the differences between the LSI Logic
SAS and Paravirtual SCSI adapter and when to use them, read Chapter 6.
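The same controller and disk additions can also be scripted with PowerCLI. The sketch below assumes a 45GB data disk belongs on a new Paravirtual SCSI controller; the VM name matches our lab, while the datastore name, size, and repeat count are assumptions you should map to Table 11.2.

# Add an eager-zeroed data VMDK on a new Paravirtual SCSI controller
$vm   = Get-VM -Name 'SQL_2012_a'
$disk = New-HardDisk -VM $vm -CapacityGB 45 -StorageFormat EagerZeroedThick `
            -Datastore (Get-Datastore -Name 'Datastore01')   # datastore name is an assumption
New-ScsiController -HardDisk $disk -Type ParaVirtual
# Repeat for the remaining data, log, tempdb, and backup disks and controllers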
Tip
From a PowerShell prompt, type diskmgmt.msc to open the Disk Management utility.
The recently added disks need to be brought online and initialized. Right-click each of
the newly added disks and select Online. After bringing all the added disks online,
right-click one of the newly added disks and select Initialize Disk. A wizard opens; ensure
all the newly added disks are selected and select OK.
After the disks have been initialized, it is time to assign them drive letters. Before
clicking a disk, make sure you understand what the disk's purpose is so you can ensure
proper labeling to coincide with your VMDK layout. See Table 11.3 for the layout used.
You will notice we are using the same drive mapping for both virtual machines; ensure
that whatever your drive-letter-naming scheme is, both virtual machines are identical to
one another.
Table 11.3 VMDK File Layout on Virtual Machines with Drive Letter Mappings
It is important as you go through this process that you understand which disk inside
Windows is related to which VMDK file. As you can see from Figure 11.11,
SQL_2012_a and SQL_2012_b added the VMDK files in a different order, assigning
them as different disks. For example, SQL_2012_a added the 45GB drive as Disk 2
whereas SQL_2012_b added the 50GB drive as Disk 2.
Tip
To help identify which virtual disk is used by Windows, see this KB article:
http://kb.vmware.com/kb/1033105.
Once the disks have been initialized and brought online, right-click the appropriate disk
and select New Simple Volume to bring up the New Simple Volume Wizard. In the
wizard, click Next to begin. Click Next on the Specify Volume Size page. On the
Assign Drive Letter or Path page, select the correct drive letter and click Next. On the
Format Partition page, change the Allocation unit size to 64K, label the volume
appropriately, and click Next (see Figure 11.12). On the final page of the wizard, the
Completing the New Simple Volume Wizard, click Finish. Repeat these steps until all
the disks have been added for both SQL Server virtual machines.
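On Windows Server 2012, the bring-online, initialize, and 64K-format steps can be done in one pass from PowerShell, as sketched below. The disk number, drive letter, and volume label are examples only; match them to your layout in Table 11.3.

# Bring a newly added disk online, initialize it, and format it with a 64K allocation unit
Get-Disk | Where-Object { $_.IsOffline } | Set-Disk -IsOffline:$false
Get-Disk | Where-Object { $_.PartitionStyle -eq 'RAW' } | Initialize-Disk -PartitionStyle GPT
New-Partition -DiskNumber 2 -UseMaximumSize -DriveLetter R          # number/letter are examples
Format-Volume -DriveLetter R -FileSystem NTFS -AllocationUnitSize 65536 `
    -NewFileSystemLabel 'SQL_Data' -Confirm:$false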
Tip
If Thick Provision Eager Zeroed was not selected earlier, unchecking the Perform a quick
format option will force Windows to check every block of the disk, thereby
having ESXi touch every block. For information on the types of disks supported by
ESXi hosts, see http://kb.vmware.com/kb/1022242. For a detailed discussion on
these, read Chapter 6.
We have completed adding additional VMDKs to our SQL Server virtual machines and
configured that storage appropriately (see Figure 11.13). We will not be doing any of
the advanced configurations as detailed in Chapter 6, such as adjusting the PVSCSI
adapter queue depth. After the configuration has been stood up and tested, and the
results documented, then go back and modify accordingly so you are able to determine
the impact of the modifications.
Memory Reservations
Protecting memory around certain applications can have a positive impact on their
performance. VMware best practices state that for production, latency-sensitive
systems, you should reserve memory for the virtual machine. This setting can be enabled
or disabled while the virtual machine is running. There are two options when setting a
memory reservation. A fixed reservation can be configured, which is the amount of memory
that will always be held for that virtual machine. The second option is a dynamic
setting that adjusts as the memory assigned to the virtual machine changes. For a
detailed explanation, refer to Chapter 7, Architecting for Performance:
Memory.
To enable memory reservations for the full virtual machine, open the properties of the
virtual machine, expand Memory, check the Reserve all guest memory (All locked)
box (see Figure 11.14), and then click OK. Repeat these steps on all SQL Server virtual
machines participating in this test.
Figure 11.14 Enabling memory reservations.
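Both reservation options can be applied from PowerCLI as well. This is a sketch; the VM name matches our lab, the 32GB fixed value is only an example, and the "Reserve all guest memory" mapping uses the MemoryReservationLockedToMax property of the vSphere API config spec.

# Fixed reservation: reserve a specific amount of memory (value is an example)
$vm = Get-VM -Name 'SQL_2012_a'
$vm | Get-VMResourceConfiguration | Set-VMResourceConfiguration -MemReservationMB (32 * 1024)

# Dynamic option ("Reserve all guest memory (All locked)") via the vSphere API
$spec = New-Object VMware.Vim.VirtualMachineConfigSpec
$spec.MemoryReservationLockedToMax = $true
$vm.ExtensionData.ReconfigVM($spec)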
Note
Because this setting is dynamic, during the test, enable, disable, and adjust
the size of the reservation to observe the impact of the setting.
Note
Make sure the operating system and the application both support adding
CPU and memory while the system is running. Go to VMware.com's
VMware Compatibility Guide website, select Guest OS (What are you
looking for), ESXi 5.5 (Product Release Version), Microsoft (OS Vendor),
and under Virtual Hardware, select either Hot Add Memory or Hot Plug
vCPU. OS vendors will also only support this option at certain license
levels for both the OS and the application.
Right-click the virtual machine you want to change. Click the drop-down next to CPU
and check the box next to CPU Hot Plug / Enable CPU Hot Add, as shown in Figure
11.15, and before clicking OK, proceed to the next step.
Figure 11.15 Enabling CPU Hot Add.
Click the down arrow next to CPU to collapse the CPU options. Click the down arrow
next to Memory to expose the memory options. Check the Enable box to the right of
Memory Hot Plug (see Figure 11.16). Click OK to commit the changes. Repeat these
steps on the second SQL Server virtual machine.
Figure 11.16 Enabling Memory Hot Plug.
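If you prefer to script these two settings, the hot add flags are exposed in the vSphere API and can be set with PowerCLI while the virtual machine is powered off. This is a sketch; the property names come from the API, and the VM name is from our lab.

# Enable CPU and memory hot add via a reconfigure task (VM must be powered off)
$vm   = Get-VM -Name 'SQL_2012_a'
$spec = New-Object VMware.Vim.VirtualMachineConfigSpec
$spec.CpuHotAddEnabled    = $true
$spec.MemoryHotAddEnabled = $true
$vm.ExtensionData.ReconfigVM($spec)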
We have successfully configured Hot Add for CPU and Hot Add for Memory. In the
next section, we are going to configure affinity rules for the virtual machines.
Note
For more information on the Cluster Validation Wizard, see
http://technet.microsoft.com/library/jj134244.
On the Before You Begin page of the Validate a Configuration Wizard, click
Next to continue. On the Testing Options page, ensure the Run all tests
(recommended) radio button is selected, as shown in Figure 11.25, and click Next.
Figure 11.25 Run all tests for a cluster validation.
On the Confirmation page, verify the settings are correct and click Next to begin the
testing. After the test is finished, it will provide a status detailing the results of the
analysis. It is important for production environments that a copy of this report is
retained (somewhere other than the default location of the report). In addition, anytime
you modify the cluster, you should always rerun the report and save this report as well.
The reason for retaining these reports is that if you open a support ticket with Microsoft,
they will ask for these validation reports.
Tip
It is important a copy of the cluster validation report is retained. Also remember
to rerun the cluster validation each time you modify the cluster and save the new
report.
This information can come in handy when you are working through issues. You can do
this by clicking View Report (see Figure 11.26), and when the report opens in a
browser, save the report off to a location other than the virtual machine it is currently
running on. Return to the Validate a Configuration Wizard and click Finish to continue
creating the cluster.
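The validation can also be run, and its report preserved, from PowerShell using the FailoverClusters module. The sketch below is run from one of the cluster nodes; the destination share is an assumption.

# Run the full validation against both nodes and copy the report off the node
Test-Cluster -Node SQL_2012_a, SQL_2012_b
Copy-Item "$env:windir\Cluster\Reports\*.mht" '\\fileserver\ClusterReports\' -Force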
Note
Be sure to configure all the SQL Servers with the exact same storage
layout.
After configuring the directory paths, click Next to continue the installation.
On the Error Reporting page, shown in Figure 11.57, click Next.
Figure 11.61 Creating a shared folder on the Windows 8.1 virtual machine.
We are done with the Windows 8.1 virtual machine for the time being; now let's get
back to our SQL Servers.
Next, open the SQL Server Configuration Manager. Once this is open, click SQL
Server Services, SQL Server (MSSQLSERVER). Then right-click SQL Server
(MSSQLSERVER) and select Properties. When the dialog box opens, click the
AlwaysOn High Availability tab. On the AlwaysOn High Availability tab, place a
check mark next to Enable AlwaysOn Availability Groups and then click Apply. Click
OK to acknowledge the warning message stating the SQL Server service requires
restarting prior to the setting taking effect. Figure 11.62 displays what this looks like on
a Windows Server 2012 operating system.
Figure 11.62 Enabling AlwaysOn High Availability.
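If the SQLPS module is available, the same change can be scripted, which is handy when repeating it on every replica. This sketch assumes the default instance and that restarting the SQL Server service during the change window is acceptable.

# Enable the AlwaysOn Availability Groups feature (restarts the SQL Server service)
Import-Module SQLPS -DisableNameChecking
Enable-SqlAlwaysOn -ServerInstance 'SQL_2012_a' -Force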
Now we are going to enable large pages for SQL Server. This is an optional parameter
that can be part of your functional testing to determine the impact of enabling Large
Pages with SQL Server. This feature is automatically enabled in SQL 2012 when you
give the service account running sqlservr.exe Lock Pages in Memory permissions. For
versions previous to SQL 2012, you must enable this trace flag along with the Lock
Pages in Memory permission (configuration details in next step). Figure 11.63 shows
how to enable the trace flag. Click the Startup Parameters tab. After the tab opens,
type -T834 and then click Add, click OK, and then click OK again to acknowledge the
warning message. It should be noted this setting can only be turned on during startup and
requires the Lock Pages in Memory user right to be configured (we do this in the next
step).
Figure 11.63 Enabling Large Pages in SQL Server.
Note
Microsoft does not recommend using this Large Pages trace flag when
using the Column Store Index feature, per
http://support.microsoft.com/kb/920093.
Note
For more information on this and other trace flags available for SQL
Server, visit Microsofts KB article on the topic:
http://support.microsoft.com/kb/920093.
Open the Local Security Policy management console. A quick way to do this is to open
PowerShell and type secpol.msc. Once the console is open, locate and expand Local
Policies and then click User Rights Assignment. Under the Policy column, locate Lock
pages in memory. Right-click Lock pages in memory and select Properties. Add the
SQL Service account (svcSQL2012) and click OK to configure the setting. Figure 11.64
shows what this looks like for Windows Server 2012. Verify for the Lock pages in
memory policy that the SQL Service Account is listed under the Security Setting
column. For more information on this setting, refer to Chapter 5 and Chapter 7, as these
chapters both discuss this setting.
Note
For more information on the User Rights Assignment setting, refer to
http://msdn.microsoft.com/en-us/library/ms178067.aspx.
Without closing the Local Security Policy Management Console, find Perform volume
maintenance tasks. Right-click Perform volume maintenance tasks and select
Properties. Add the SQL Service account (svcSQL2012, in our case) and click OK to
commit the change. Figure 11.65 displays what this looks like on a Windows Server
2012 operating system. Verify the SQL Service account appears in the Security Setting
column to the right of Perform volume maintenance tasks in addition to the
Administrators account. For more information on this setting, review Chapter 6.
Figure 11.65 Enabling the Perform volume maintenance tasks setting.
Note
For more information on the Instant File Initialization setting, refer to
http://msdn.microsoft.com/en-us/library/ms175935.aspx.
Repeat the enabling of AlwaysOn Availability Groups, Large Pages (if necessary),
Lock Pages in Memory, and Perform Volume Maintenance steps on all SQL Servers
participating in the AlwaysOn Availability Group.
Tip
If you're using Hot Add Memory capabilities and memory is added dynamically,
ensure that the SQL Server max server memory setting is adjusted to the new value.
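A minimal sketch of adjusting that setting with T-SQL follows, run through Invoke-Sqlcmd; the 28GB value and server name are assumptions for illustration.

$tsql = @'
EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 28672; RECONFIGURE;   -- 28GB is only an example
'@
Invoke-Sqlcmd -ServerInstance 'SQL_2012_a' -Query $tsql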
Note
Make sure your network supports the larger packet sizes. Improper
configuration can cause performance problems. DBAs and VMware admins
should work with the Network team to understand the appropriate
configuration based on their input and information provided in Chapter 8.
Note
More information on the Instant File Initialization setting can be found at
http://technet.microsoft.com/en-us/library/ms175527(v=sql.105).aspx.
On the SQL Server virtual machines, open the SQL Server Management Studio,
expand Databases, expand System Databases, and click tempdb. Right-click tempdb
and select Properties. Once the Database Properties - tempdb dialog box opens, click
Files (located on the left). Then click Add and enter the proper number of additional
tempdb data files. Figure 11.68 displays the tempdb configuration we used in our lab.
When you are done configuring your tempdb files, click OK to build the files and close
the dialog box.
Figure 11.68 Adding tempdb data files.
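The same tempdb data files can be added with T-SQL, which is convenient when repeating the step on every replica. The file name, path, and sizes below are assumptions; match them to your drive layout from Table 11.3.

$tsql = @'
ALTER DATABASE tempdb
ADD FILE (NAME = tempdev2,
          FILENAME = 'T:\tempdb\tempdev2.ndf',
          SIZE = 4GB, FILEGROWTH = 512MB);
'@
Invoke-Sqlcmd -ServerInstance 'SQL_2012_a' -Query $tsql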
Repeat this step on all SQL Servers participating in the AlwaysOn Availability Group.
To determine whether the files were created successfully, browse to the path entered for
the additional tempdb data files to validate they were created. Figure 11.69 shows
successful creation of our tempdb data files.
Figure 11.69 Verifying successful creation of tempdb data files.
At this point, we rebooted each SQL Server virtual machine to ensure the settings we
just configured are applied. We reboot them individually, making sure the first server
was up and all services started before initiating the second reboot.
Note
If the SQL Server service is not restarted, AlwaysOn Availability Groups will not
work, and some of the configuration settings are applied only at boot time,
which is why we waited until now to reboot.
Note
Always back up your SQL Server database before it is joined to an
AlwaysOn Availability Group.
To back up the test database that was just created, return to Microsoft SQL Server
Management Studio, locate the newly created database (test01db), right-click this
database, and select Tasks, Backup... to open the Back Up Database interface. Verify
the information and click OK to begin the backup. This should only take a second or
two; then click OK on the successfully backed-up message to close out the Back Up
Database window. Figure 11.72 shows a successful backup of our test database,
test01db.
Figure 11.72 Backing up the test database.
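For those scripting the build, the same backup can be taken with T-SQL; the backup path below is an assumption, so point it at your backup drive.

$tsql = @'
BACKUP DATABASE [test01db]
TO DISK = 'B:\Backup\test01db.bak'
WITH INIT;
'@
Invoke-Sqlcmd -ServerInstance 'SQL_2012_a' -Query $tsql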
This completes configuration of the test database. In the next section, we are going to
create the Availability Group, add this database to the Availability Group, and verify
we have the Availability Group functioning properly.
Note
For more information on SQL Server 2012 AlwaysOn availability modes
and failover types, review this article: http://technet.microsoft.com/en-
us/library/ff877884.aspx.
On the Listener tab, shown in Figure 11.76, select the radio button next to Create an
availability group listener. Fill in a Listener DNS Name (sql2012agl01), Port
(1433), and ensure Static IP is selected for Network Mode. Under Network Mode,
click Add... and enter the IP information for the listener. Once this is complete, click
OK to close the Add IP Address dialog box. Once the listener has been configured
correctly, click Next to continue with the wizard.
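The listener can also be created with T-SQL once the Availability Group exists. In the sketch below, the listener name and port match the wizard values, while the Availability Group name, IP address, and subnet mask are placeholders to replace with your own.

$tsql = @'
ALTER AVAILABILITY GROUP [SQL2012AG01]
ADD LISTENER N'sql2012agl01'
    (WITH IP ((N'192.168.1.60', N'255.255.255.0')), PORT = 1433);
'@
Invoke-Sqlcmd -ServerInstance 'SQL_2012_a' -Query $tsql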
Note
ActiveState's ActivePerl also works; just be sure to read and understand
the EULA and ensure you are in compliance.
After Strawberry Perl (or ActivePerl) has been installed, download and extract the Dell
DVD Store binaries. The Dell DVD Store binaries are available from
https://github.com/dvdstore. Download the following two files from the repository:
ds21.tar.gz
ds21_sqlserver.tar.gz
For reasons of space, we downloaded these files to the VMDK we added earlier;
depending on the size of the custom test, there may not be enough room on the C:\
drive. After downloading the files, extract ds21 to the appropriate
location (R:\) and extract ds21_sqlserver inside the \ds2\ folder (R:\ds2\). It is very
important that this structure is maintained. See Figure 11.83 for more information.
With Strawberry Perl installed and Dell DVD Store files downloaded and extracted, it
is time to create our custom install file. To do this, browse to the \ds2\ directory and
double-click the Install_DVDStore.pl file. Table 11.5 contains the questions and
answers (for our configuration) for this wizard. After each entry, press Enter to move
on to the next question until the wizard finishes. For this configuration on the equipment
we used, the build time of this script took approximately 20 minutes, and you can watch
the progress by viewing the \ds2\data_files\cust, \ds2\data_files\orders, and
\ds2\data_files\prod directories for file creation. The wizard will automatically close
once it is finished.
Note
Don't worry about making a mistake, because the output of this wizard is a
text file that can be edited manually later. If DNS is not rock solid, use IP
addresses instead of hostnames.
Note
When entering the path for the database, make sure to enter the trailing
backslash (\).
Figure 11.84 shows us walking through the Dell DVD Store Wizard and the wizard
beginning the build of the custom files.
Figure 11.84 Creating the Custom Dell DVD Store install.
Next, we are going to create a custom configuration file for the workload driver. To do
this, navigate to \ds2\ and double-click the CreateConfigFile.pl file. Once this opens,
we will be asked a series of questions, which will then generate our custom
configuration file. The questions and answers are detailed in Table 11.6. The wizard
will automatically complete once finished. A file named DriverConfig.txt will be
created in \ds2 containing the configuration data entered. If a mistake was made or you
are troubleshooting the installation, you can edit this file manually.
Table 11.6 Dell DVD Store Custom Wizard for Workload Driver
Note
We are using the Availability Group listener for the target hostname.
Figure 11.85 shows us walking through the Workload Driver Configuration Wizard.
Figure 11.85 Configuring the custom workload driver.
A known issue we must address is the way the .csv and .txt files are created, which
causes them to fail to import into SQL Server 2012 (and SQL Server 2008). When the
configuration files are created, the line endings are not in the format SQL Server
expects, and the import fails. Running the files through a conversion utility to fix the
line endings resolves the issue, which is what the next section addresses.
To fix this issue, we need to run .txt and .csv files through a Unix-to-DOS conversion
utility. The utility we used was a free-to-use tool located at
http://www.efgh.com/software/unix2dos.htm. As of the writing of this chapter, the
author has placed no restrictions on its use. Download unix2dos.exe and run it by dragging
and dropping .txt and .csv files onto it. This must be done for every .txt and .csv file in
the hierarchy, starting at the ds2 folder level. Missing a file could result in the SQL load failing.
Tip
Copy the executable into each directory and leave it there; that way, you know
which directories you have completed. In addition, sort files by Type, highlight
all the .csv and .txt files, and then drag and drop them onto the executable. Use the
Date Modified column to verify all files have been converted.
Note
Not all .txt files need to be run through the conversion utility (for example,
the Read Me documentation), but converting them all keeps us on the safe side.
Once all the .csv and .txt files have been updated, copy the entire ds2 directory and its
subfolders to one of the SQL Servers. For this lab, we copied ds2 to the root of the R:\
drive (the backup drive) on the sql2012a virtual machine. Keep in mind the free space
on the drive you are copying the contents to; if there is not enough free space, find
another location. If you are following along with our VMDK sizes and our Dell DVD
Store test size, you will not see an issue. If you matched our VMDK sizes but chose a
larger Dell DVD Store test database, this is where you will want to pay attention to
available drive space.
Next, we will need to install the SQL Server Management Tools on the Windows 8.1
virtual machine. To do this, mount the SQL Server 2012 installation CD and select
Installation and then New SQL Server stand-alone installation or add features to an
existing installation. This brings up Setup Support Rules, and the wizard will perform
a preflight check, as shown in Figure 11.86. If any issues are identified, remediate them
and rerun the preflight check. When everything passes, click OK to continue.
Note
This file can be modified at any time prior to database creation; if you
later delete the DS2 database, this script can be used to rebuild it.
Open PowerShell on the SQL Server virtual machine, navigate to the
\ds2\sqlserverds2 directory, and issue the database build command shown in Figure 11.97.
Note
The transaction logs must be backed up prior to adding the DS2 database to
the AlwaysOn Availability Group.
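A log backup along the lines of the following T-SQL satisfies that requirement; a full backup must already exist and the database must be in the full recovery model. The destination shown is the shared backup location used later in this walkthrough, so substitute your own path if it differs:
BACKUP LOG [DS2]
TO DISK = N'\\LOADGEN\aobackup\DS2.trn'   -- shared location referenced later in this section
WITH INIT, STATS = 10;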
Now we are going to add the DS2 database to our AlwaysOn Availability Group. To do
this, open the Microsoft SQL Server Management Studio and expand AlwaysOn High
Availability, Availability Groups. Right-click the previously created Availability
Group (SQL2012AG01) and select Add Database.... On the Introduction page, shown
in Figure 11.110, click Next. On the Select Databases page, select DS2 and click
Next.
Figure 11.110 Adding DS2 to the Availability Group.
On the Select Initial Data Synchronization page, shown in Figure 11.111, leave Full
selected and ensure the path to the shared location created earlier is entered for the
location (\\LOADGEN\aobackup). Click Next to continue.
Figure 11.111 Central backup of DS2.
On the Connect to Existing Secondary Replicas page, click Connect... and connect to
the secondary instance. Click Next to continue. Figure 11.112 shows how we
configured SQL2012A to connect with SQL2012B.
Figure 11.112 Connecting to the secondary.
On the Validation page, shown in Figure 11.113, ensure all validation checks return a
successful result. If any issues are identified, remediate them and rerun the validation.
Figure 11.113 Validation check.
On the Results page, shown in Figure 11.114, verify the settings are correct and click
Finish. When the wizard finishes, click Close.
Figure 11.114 Successful addition of DS2 to the AlwaysOn Availability Group.
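If you prefer T-SQL to the wizard, the core steps it performs are roughly the following. This is a sketch that assumes the full and log backups have already been restored WITH NORECOVERY on the secondary replica:
-- On the primary replica:
ALTER AVAILABILITY GROUP [SQL2012AG01] ADD DATABASE [DS2];

-- On the secondary replica, after restoring the backups WITH NORECOVERY:
ALTER DATABASE [DS2] SET HADR AVAILABILITY GROUP = [SQL2012AG01];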
To view the status of the AlwaysOn Availability Group, open the AlwaysOn
Availability Group dashboard. Figure 11.115 shows the dashboard with the
configuration up and running.
Figure 11.115 AlwaysOn Availability Group dashboard with the DS2 database.
This completes the installation and configuration of the Dell DVD Store database into a
SQL Server 2012 AlwaysOn Availability Group. In the next section, we will execute
the load test.
Figure 11.117 Kicking off the Dell DVD Store load test.
Taking a quick peek at vCenter Infrastructure Navigator, shown in Figure 11.118, we
can see that the relationships have automatically updated to represent the AlwaysOn
Availability Group configuration as well as the initiation of the Dell DVD Store test.
We can see that the SQL Servers have established a relationship between themselves
and that the LoadGen virtual machine has established a connection to the SQL listener,
which is currently resident on sql2012a.
Figure 11.118 Updated screenshot from VIN.
When the run is complete, Dell DVD Store will present the final results and end the test,
as shown in Figure 11.119.
Note
If you see "User name newuserXXXXXXXX already exists" (where each X is
a digit), as shown in the first line of Figure 11.119, it means a simulated
user tried to register with a user name that is already taken. This behavior
is driven by the pct_newcustomers setting in the DriverConfig.txt file;
changing it to 0 eliminates the message.
Now that our run is complete, let's review the data presented in Figure 11.119. The first
value we see is et, which specifies the amount of time the test has been executing: either
the time since the test began (during the warm-up period, which by default is 1 minute)
or the time since the warm-up period ended. You will see this value updated
approximately every 10 seconds during the run. For our test, the Final results show
et = 7317.8. This value is presented in seconds, which means our test ran for 121.963
minutes after the stats were reset.
Note
Stats are reset once the warm-up time has elapsed. If your system requires
a longer ramp-up, this is a configurable value, warmup_time=X, located in
the DriverConfig.txt file.
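For example, changing this entry to warmup_time=5 would give the system a
five-minute ramp-up before the statistics are reset, assuming (as with the
default of 1) that the value is expressed in minutes.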
The next value we come across is n_overall. This value represents the total number
of orders processed after the stats were reset at the end of the warm-up period. For the
preceding test, we have a value of n_overall=461871, so we know that a total of
461,871 orders were processed during this test period.
Moving on, the next value is opm. The opm value indicates the orders per minute. For
the preceding test, we have a value of 3786, meaning we handled 3,786 orders per
minute. This value changes throughout the test, as you will see, and is a rolling figure
covering the last minute of the run.
The next value is rt_tot_lastn_max. This value represents, in milliseconds, the
ordering experience of the last 100 users. For our test, we have a value of
rt_tot_lastn_max=268, which means the last 100 users experienced up to 268
milliseconds of delay across the steps of their purchasing experience.
The next value we see is rt_tot_avg. This value represents the total response time
experienced by a user during their ordering cycle. This includes logging in, creating a
new user account, browsing inventory, and purchasing product. For our test, we had a
value of rt_tot_avg=15, which means users experienced an average response time of
15 milliseconds.
The n_login_overall value represents the total number of logins. For our run, the
result returned was n_login_overall=369343, meaning we had 369,343 logins
for the entire test. Unlike the previous results, which provide a snapshot of performance
at the current moment, this value represents the total number of logins for the duration of
the test.
The next cumulative value we are presented with is the n_newcust_overall value.
This value represents how many new customers registered during the test period. For
our test, the value we achieved was n_newcust_overall=92528, meaning we
had 92,528 new customers.
Next, we have the n_browse_overall value presented. This value represents the
total number of browses experienced during a run. For our run, the value returned was
n_browse_overall=1385324, meaning we had 1,385,324 browses.
The next value in our results chain is n_purchase_overall. This value represents,
as you might guess, the total number of purchases during a given run. For our run, the
value returned was n_purchase_overall=461871, meaning we had 461,871
purchases go through our system.
How about login experience? The next value, rt_login_avg_msec, provides us
with the average login time in milliseconds for a user. For our run, we received
rt_login_avg_msec=4, meaning our users' login time, on average, was 4
milliseconds.
What about new user experience? The rt_newcust_avg_msec metric tells us how
long it takes for a new user to register themselves with our service. For our test, the
value we received was rt_newcust_avg_msec=2, meaning a new user
registration took 2 milliseconds.
How about the browse time? The rt_browse_avg_msec metric represents the average
browse time in milliseconds. For our run, the value returned was rt_browse_avg_msec=0.
The average purchase time is represented in the rt_purchase_avg_msec value.
For our run, we received a result of rt_purchase_avg_msec=8, meaning it took
an average of 8 milliseconds for a purchase to complete.
What happens if a customer is trying to order something but there is not enough of that
product in stock and the order needs to roll back? This is represented as a total number
experienced during the entire run in the n_rollbacks_overall value. For our run,
we received a value of n_rollbacks_overall=9361, meaning we had 9,361
orders rolled back due to a lack of product.
The value rollback_rate represents the percentage of rollbacks and is derived by
the following formula, as described in ds2driver_doc.txt:
n_rollback_overall / n_overall * 100%
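Plugging in the numbers from our run, 9,361 rollbacks out of 461,871 total orders works
out to approximately a 2% rollback rate.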
Figure 11.120 Graphical results of OPM and logins from the Dell DVD Store test.
Now that we have the results from our initial test, it is time to determine which
variables we are going to manipulate and what impact those changes have on our test
results. A suggestion here is to also rerun the test with all the defaults in place, but
exercise vSphere and SQL Server-related functionality and measure the impact it has on
performance. For example, test that vMotion, vSphere HA, and your affinity rules
(configured previously) work as expected. Also test shutting down the active SQL Server,
disconnecting the network from the virtual machines, and so on.
Once you are ready to get started with additional testing, you will need to reset the Dell
DVD Store database. The first step in this process is to remove the DS2 database from
the AlwaysOn Availability Group. The reason for removal is that the script we use to reset
the database will fail if you attempt to run it while DS2 is part of an AlwaysOn
Availability Group. Once the DS2 database is removed, open the
sqlserverds2_cleanup_20GB.sql file located in the \ds2\sqlserverds2\build directory.
We will run the script from the SQL Server on which we built the database via the
Microsoft SQL Server Management Studio, as shown in Figure 11.121. Once the script
is loaded, click the Execute button to begin.
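If you want to script the removal step rather than use the Management Studio interface, a minimal T-SQL sketch, run on the primary replica, is:
ALTER AVAILABILITY GROUP [SQL2012AG01] REMOVE DATABASE [DS2];
Once the cleanup script has rebuilt the database, DS2 can be re-added to the availability group using the same steps described earlier, after taking fresh full and log backups.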
Summary
In this chapter, we walked through a complete, end-to-end configuration of a SQL
Server performance test on vSphere 5.5 with SQL 2012 as our database engine and
Windows Server 2012 as our guest operating system. This chapter builds on all the
previous chapters in this book; however, it does not include all the possible variations
and tweaks. We set up a base installation from which you are able to manipulate
various levers within vSphere, Windows, and SQL to find the optimal configuration for
your environment.
We discussed the importance of baselining these tests. It is important to baseline not
only the initial test but also all subsequent tests to understand the impact of the
configuration change made and to determine if the change provides enough value to be
rolled into production.
In addition to changing the levers within vSphere, Windows, or SQL, use this as an
opportunity to validate (or in some cases, demonstrate to others in the organization)
features within vSphere. For example, many DBAs are not familiar with vMotion,
Storage vMotion, HA, and many other vSphere technologies. Although these terms are
part of the everyday vocabulary of a vSphere administrator, they are not part of a DBA's
or management's vernacular. Use this environment to demonstrate these features and
how they work under load.
Finally, we want to thank you, the reader, for your interest in our book. We are all very
passionate about virtualization of high I/O workloads such as SQL Server on the
vSphere platform and appreciate the opportunity to share what we have learned over the
years with you. We hope we have provided value and that you are able to use the
knowledge in this book in your professional life. Best of luck to you on your
virtualization journey.
Michael Corey, Jeff Szastak, and Michael Webster
Appendix A. Additional Resources
With this book, we have attempted to create the most comprehensive guide to
virtualizing your most demanding databases. The key to success in virtualizing your
databases is knowledge. This appendix is loaded with additional resources available to
you.
Figure A.1 Navigating to key SQL Server documentation from the home page.
Figure A.2 Navigating to key SQL Server documentation, continued.
Starting with Figure A.1, you can see that step 1 instructs you to go to the VMware home
page. The trick then is to find the link Virtualizing Enterprise Applications (shown in
step 2) at the bottom of the home page and click it. This will take you to the web page
shown in step 3. VMware considers the SQL Server database a business-critical
application, just like it does Microsoft Exchange, Oracle, and SAP. Therefore, if you
were to perform a web search, you should use the terms SQL Server Business Critical
Application to locate the page holding the white papers.
At the bottom of the web page shown in Figure A.2, in step 4 you will see a section
named Microsoft SQL Server. In this section, click the link Learn More About SQL
Server Virtualization (this is indicated by the arrow in Figure A.2). Clicking this link
will take you to step 5. This section of the website is dedicated to virtualizing a SQL
Server database on vSphere.
Tip
The URL to some useful white papers on how to virtualize SQL Server is
http://www.vmware.com/business-critical-apps/sql-virtualization.
At the bottom of this page, you will see a section titled Related Resources. You have
finally arrived at the mother lode of additional white papers. VMware has done an
excellent job on many of these white papers, and it is well worth your time to read them.
Here are a few of my favorites from the VMware site:
DBA Guide to Databases on VMware (white paper):
http://www.vmware.com/files/pdf/solutions/DBA_Guide_to_Databases_on_VMware-WP.pdf
SQL Server on VMware: Availability and Recovery Options (white paper):
http://www.vmware.com/files/pdf/solutions/SQL_Server_on_VMware-Availability_and_Recovery_Options.pdf
SQL Server on VMware: Best Practices Guide (white paper):
http://www.vmware.com/files/pdf/solutions/SQL_Server_on_VMware-Best_Practices_Guide.pdf
Setup for Failover Clustering and Microsoft Cluster Service:
http://pubs.vmware.com/vsphere-55/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-551-setup-mscs.pdf
vSphere Storage:
http://pubs.vmware.com/vsphere-55/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-551-storage-guide.pdf
vSphere Resource Management:
http://pubs.vmware.com/vsphere-55/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-551-resource-management-guide.pdf
vSphere Monitoring & Performance:
http://pubs.vmware.com/vsphere-55/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-551-monitoring-performance-guide.pdf
Now that SQL Server 2014 is out, we are sure some excellent additions will be made to
this list soon. Therefore, we recommend you check back on this web page from time to
time, and keep an eye out for an updated version of our book.
User Groups
Industry user groups are one of the most important resources you have available to you
in support of technology. The best part is, no matter where you are in the world, odds
are there is a technology group near you. Here are a few technology user groups focused
on SQL Server and virtualization that you should take the time to learn about.
VMware Community
Another great source of information on database virtualization is the VMware
Community site. This resource can be reached at https://communities.vmware.com.
Membership in the VMware Community is free. Figure A.3 shows two sections of the
VMware Community home page to give you a sense of what's available on the site.
Well over 100 forums are available, or if you like, you can start a new one.
Facebook Groups
As of October 2013, over 500 million people use Facebook and over 70 different
languages are supported. Within the Facebook site are a number of Facebook Groups
focused on VMware. If you are not familiar with Facebook Groups, here's a description
from the Facebook site:
Facebook Groups are the place for small group communication and for people to
share their common interests and express their opinion. Groups allow people to
come together around a common cause, issue or activity to organize, express
objectives, discuss issues, post photos, and share related content.
With over 500 million people on Facebook, it's no wonder a number of groups have
emerged focused on VMware. One group I would like to point out is the VMware
vExpert group. This used to be a closed group, but on April 17, 2014 it was opened up
to the public. To access the VMware vExpert Facebook Group, go to
https://www.facebook.com/groups/57751806694/.
Note that you must first be logged in to your Facebook account. This is just one of many
Facebook Groups devoted to VMware.
Blogs
Twenty years ago, if you wanted to get timely high-quality technical information, your
options were limited:
Big industry tradeshows (such as VMworld and the PASS Summit)
The latest books on the topic
The vendor's newest class on the topic
Today, high-quality information comes out in near real time on blogs. A word of
caution, though: Information can become outdated over time, and blog authors are not
always good about going back and deleting or updating their posts. Therefore, always
take the time to use a little common sense when obtaining information from the Internet.
Look at the credentials of the blog's author and check when the information was last
updated.
An example of an excellent blog that contains useful information on virtualization is
Long White Virtual Clouds: All Things VMware, Cloud, and Virtualizing Business
Critical Applications, located at http://longwhiteclouds.com/. The author of this blog is
Michael Webster, one of the authors of this book. A sample of Michael Webster's blog
is shown in Figure A.4.
vLaunchPad
A useful site to know about is vLaunchPad: Your Gateway to the VMware Universe,
which is located at http://thevpad.com/. This website asks people each year to vote for
their favorite blogs on virtualization. Michael's Long White Virtual Clouds blog came in
at #13 out of over 300 blogs. The many blogs listed on the vLaunchPad site are an
excellent source of information on virtualization.
For example, here are the top five sites:
1. Yellow-Bricks, by Duncan Epping (http://www.yellow-bricks.com/)
2. virtuallyGhetto, by William Lam (http://www.virtuallyghetto.com/)
3. Frank Denneman Blog (http://frankdenneman.nl/)
4. Cormac Hogan (http://cormachogan.com/)
5. Scott Lowe Blog (http://blog.scottlowe.org/)
Numbers
10Gb Ethernet NICs, 269
A
ABRTS/s counter, 326
ACID (Atomicity, Consistency, Isolation, and Durability), 302-303
ActivePerl, 407
ActiveState ActivePerl, 407
adapter count, 95
adapters
CAN (Converged Network Adapter), 276
iSCSI, 276
LSI Logic SAS, 95
physical network adapters, 267-269
PVSCSI, 95
virtual network adapters, 100
choosing, 250-251
traffic types, 101-102
tuning, 252-254
addresses (IP), 341-342
Admission Control, 88
affinity rules, 358
AGs (Availability Groups), 306-308
alignment of partitions, 128-129
AlwaysOn Availability Groups
configuring, 387-391
creating, 399-405
AlwaysOn Failover Cluster Instance, 125
anti-affinity rules, 358
Application Dependency Planner, 110
AQLEN, 168
arrays, 98-99
atomicity, 302-303
ATS (Atomic Test Set), 99
Auto Grow, 114-115
availability, 135
ACID (Atomicity, Consistency, Isolation, and Durability), 302-303
business continuity, 291
determining availability requirements, 287-288
disaster recovery, 291-294
high availability, 14-16
providing a menu of options, 288-289
RPOs (recovery point objectives), 290
RTOs (recovery time objectives), 290
sample high availability chart, 308-309
SLAs (service-level agreements), 290
SQL Server AlwaysOn Failover Cluster Instance (FCI), 304-306
SQL Server Availability Groups (AGs), 306-308
vSphere high availability
DRS (Distributed Resource Scheduler), 297
hypervisor availability features, 294-296
Storage DRS, 297
Storage vMotion, 297
vCenter SRM (Site Recovery Manager), 301
vCHS (vCloud Hybrid Service), 302
vDP (vSphere Data Protection), 300
vMotion, 296-297
vSphere App HA, 299-300
vSphere HA, 298-299
vSphere Replication, 300-301
X-vMotion, 298
Availability Groups (AGs), 306-308
Available Mbytes metrics, 321
Average Latch Wait Time(ms) metric, 324
Average Wait Time(ms) metric, 324
B
background noise, lack of, 334
backing up networks, 103
ballooning, 230-232
bandwidth, vMotion traffic, 276
baselines
baseline performance reports, 332-333
benchmarks, 315-316
developing, 317-318
industry-standard benchmarks, 316
validating performance with, 318
vendor benchmarks, 316-317
common performance traps
blended peaks of multiple systems, 335
failure to consider SCU (Single Compute Unit) performance, 335
invalid assumptions, 334
lack of background noise, 334
shared core infrastructure between production and non-production, 333-334
vMotion slot sizes of monster database virtual machines, 336-337
comparing
different processor generations, 330-331
different processor types, 328-330
customer deployments, 71
database workload, 48-50
explained, 311-314
metrics
ESXTOP counters, 325-327
SQL Server baseline infrastructure metrics, 321-322
SQL Server Perfmon counters, 323-324
SQL Server Profiler counters, 324-325
non-production workload influences on performance, 331-332
reasons for, 319-320
validating performance with, 318
vSphere infrastructure, 46-48
when to record, 320
Batch Requests/sec metric, 324
Batch/ETL (Extract Transform Load) workloads, 64
benchmarks, 315-316
developing
benchmark model based on recorded production performance, 318
benchmark model based on system nonfunctional requirements, 317
industry-standard benchmarks, 316
validating performance with, 318
vendor benchmarks, 316-317
BIOS settings, 12-13
blended peaks of multiple systems, 335
blocks, Pointer Block Eviction Process, 163-164
blogs, 444
Thomas LaRock's blog, 445
vLaunchPad, 444-445
breaking down large pages, 238-239
buffer
Buffer Cache, 49
Buffer Cache Hit Ratio, 50, 323
Buffer Manager, 323
Buffer Pool, 129-130, 219-220
built-in in-memory, 246-247
business case for virtualization, 9
BIOS settings, 12-13
DBA (database administrator) advantages, 10-11
hardware refresh, 20-22
high availability, 14-16
large databases, 22-23
performance, 16-17
provisioning/DBaaS, 17-20
database tiering, 19-20
shared environments, 20
reduced expenses, 9-10
SLAs (service level agreements), 11-12
business continuity, 291
business transparency, 73
Bytes Total/sec metric, 322
C
cache
Buffer Cache, 49
Buffer Cache Hit Ratio, 50, 323
CACHEUSED counter, 326
Fusion-io ioTurbine, 201-203
vRFC (vSphere Flash Read Cache), 199-201
CACHEUSED counter, 326
CAN (Converged Network Adapter), 276
capacity, one-to-one relationships and unused capacity, 38-40
Center of Excellence (CoE), 61-63
charge back, 73
check it before you wreck it rule, 36
choosing virtual network adapters, 250-251
cloud, vCHS (vCloud Hybrid Service), 302
Cluster Validation Wizard, 363-364
clusters
failover cluster instance storage layout, 157
vSphere 5.5 failover clustering environments, 185-186
Windows Failover Clustering
configuring, 359-368
quorum mode, 369-374
validating, 368
WSFC (Windows Server Failover Clustering), 304
CMDS/s counter, 326
CoE (Center of Excellence), 61-63
Column Storage, 134-135
commands, SP_Configure, 246
communication, 58-59
communication responsiveness, 253
mutual understanding, 59-60
responsibility domains, 60-61
comparing performance baselines, 328
different processor generations, 330-331
different processor types, 328-330
Complete page (SQL Server installation), 387
compression, 133
compromise, virtualization without, 108-109
computer names, requirements for performance testing, 341-342
configuration
AlwaysOn Availability Groups, 387-391
Hot-Add Memory and Hot-Add CPU, 356-358
jumbo frames, 259-262, 393-394
max/min memory, 392
Max Server Memory, 236-237
performance test labs, 342
performance tests, 339-340
affinity and anti-affinity rules, 358
AlwaysOn Availability Groups configuration, 387-391
AlwaysOn Availability Groups creation, 399-405
computer name/IP address requirements, 341-342
Dell DVD Store installation and configuration, 406-430
Dell DVD Store load test, 430-436
Hot-Add Memory and Hot-Add CPU, 356-358
jumbo frame configuration, 393-394
max/min memory configuration, 392
memory reservations, 355
multiple tempdb files, 394-395
network connection validation, 359
performance test lab setup, 342-345
software requirements, 341
SQL Server 2012 installation, 374-387
test database creation, 396-398
VMDK file configuration, 345-354
Windows Failover Clustering configuration, 359-368
Windows Failover Clustering quorum mode, 369-374
Windows Failover Clustering validation, 368
trace flags, 215
Configure Cluster Quorum Wizard, 370-374
consistency, 302-303
consolidations, 53, 68
continuity (business), 291
controllers, virtual storage, 138-143
Converged Network Adapter (CAN), 276
Cores per Socket, 83
counters
ESXTOP, 325-327
Perfmon, 50, 323-324
Profiler, 324-325
CPUs, 74-76
CPU Scheduler, 86
hot-add CPUs, 3-4
CPU Scheduler, 86
Create Cluster Wizard, 366
CrossSubnetDelay, 255
CrossSubnetThreshold, 255
%CSTP counter, 326
customer deployment baselines, 71
D
DaaS (Database as a Service), 73
and database virtualization, 17-20
Darwin, Charles, 1-3
database administrators. See DBAs
Database as a Service. See DaaS
database availability, 287
ACID (Atomicity, Consistency, Isolation, and Durability), 302-303
business continuity, 291
determining availability requirements, 287-288
disaster recovery, 291-294
providing a menu of options, 288-289
RPOs (recovery point objectives), 290
RTOs (recovery time objectives), 290
sample high availability chart, 308-309
SLAs (service-level agreements), 290
SQL Server AlwaysOn Availability Groups (AGs), 306-308
SQL Server AlwaysOn Failover Cluster Instance (FCI), 304-306
vSphere high availability
DRS (Distributed Resource Scheduler), 297
hypervisor availability features, 294-296
Storage DRS, 297
Storage vMotion, 297
vCenter SRM (Site Recovery Manager), 301
vCHS (vCloud Hybrid Service), 302
vDP (vSphere Data Protection), 300
vMotion, 296-297
vSphere App HA, 299-300
vSphere HA, 298-299
vSphere Replication, 300-301
X-vMotion, 298
database availability design, 135
database buffer pool, 219-220
database consolidations, 53
Database Engine Configuration page (SQL Server installation), 382
database files
file system layout, 110-122
data files, 123-126
log files, 123-126
NTFS file system allocation unit size, 126-127
OS, application binaries, and page file, 122
partition alignment, 128-129
Temp DB files, 123-126
Instant File Initialization (IFI), 120-122
number of, 110-113
size of, 114-116
data files, 116
Temp DB files, 116
transaction log file sizing, 117-120
database indexes, 222-225
database installation guidelines, 32-36
database instances, number of, 244-245
database metrics, 324
database pages
explained, 219-220
large pages
breaking down into default page size, 238-239
explained, 237-238
locking pages in memory, 239-241
paging, 220-221
swapping, 220-221
TPS (transparent page sharing), 228-229
database statistics, updating, 130-132
with maintenance plan, 131-132
with trace flag 2371, 131
database storage. See storage
database tiering, 19-20
database virtualization. See virtualization
database workload, baselining, 48-50
Data Compression, 133
data files
file system layout, 123-126
sizing, 116
data protection (vDP), 300
data stores
number of, 165-169
virtual disks per data store, 170-173
DAVG/cmd counter, 326
DAVG/cmd metric, 48
DBA Guide to Databases on VMware (white paper), 439
DBAs (database administrators)
business case for virtualization, 10-11
BIOS settings, 12-13
hardware refresh, 20-22
high availability, 14-16
large databases, 22-23
performance, 16-17
provisioning/DBaaS, 17-20
SLAs (service level agreements), 11-12
Decision Support System. See DSS workload
default queue depth (QLogic HBA), 166
Dell DVD Store, 327
installing and configuring, 406-430
load test, 430-436
Dell DVD Store Custom Install Wizard, 408
Denneman, Frank, 444
deployment, 54, 63
design
deployment, 54, 63
networks. See network design
server-side flash acceleration, 198-199
Fusion-io ioTurbine, 201-203
PernixData FVP, 204-206
vSphere Flash Read Cache (vFRC), 199-201
SQL Server database and guest OS storage, 109
Buffer Pool, 129-130
Column Storage, 134-135
database availability, 135
database statistics, updating, 130-132
Data Compression, 133
file system layout, 110, 122-129
Instant File Initialization (IFI), 120-122
number of database files, 110-113
size of database files, 114-120
Storage Spaces, 136
Volume Managers, 136
SQL Server on hyperconverged infrastructure, 207-213
SQL Server virtual machine storage, 136
expanding, 158-159
Jumbo VMDKs, 159-164
layout, 152-157
virtual disk devices, 143-152
virtual storage controllers, 138-143
VM hardware version, 137
storage design principles
database as extension of storage, 106
KISS principle (Keep It Standardized and Simple), 109
performance and underlying storage devices, 107
sizing for performance before capacity, 107-108
virtualization without compromise, 108-109
vSphere storage, 164
multipathing, 184-185
number of data stores and data store queues, 165-169
number of virtual disks per data store, 170-173
RAID (Redundant Array of Independent Disks), 187-197
storage DRS, 177-183
Storage IO Control (SIOC), 173-177
storage policies, 177-183
vSphere 5.5 failover clustering environments, 185-186
determining availability requirements, 287-288
developing benchmarks
benchmark model based on recorded production performance, 318
benchmark model based on system nonfunctional requirements, 317
disaster recovery, 291-294
discovery, 53
Disk Management utility, 352
Disk Space Requirements page (SQL Server installation), 382
disks
disk layout, 95
Disk Management utility, 352
enterprise flash disks (EFDs), 195-197
RAID (Redundant Array of Independent Disks)
economics of RAID performance, 194-197
IO penalties, 189-194
randomness of IO pattern, 187-188
read/write bias, 188
virtual disk devices, 143
IO blender effect, 151-152
Raw Device Map (RDM), 149-151
Thick Eager Zero disks, 147-148
Thin versus Thick Lazy Zero disks, 144-146
Distributed Resource Scheduler (DRS), 297
distributed switches, 263
distributed virtual switches, 100
documentation
online resources, 437-440
reading, 43-44
dozing, 13
DQLEN, 168
DRaaS, 293-294
drivers (PVSCSI), 31
%DRPPX counter, 327
%DRPTX counter, 327
DRS (Distributed Resource Scheduler), 297
DSS (Decision Support System) workload, 64
durability, 302-303
dynamic threshold for automatic statistics update, 131
E
E1000 adapters, 250-252
E1000E adapters, 250-252
economics of RAID performance, 194-197
education, 60
EFDs (enterprise flash disks), 195-197
Effects of Min and Max Server Memory (article), 235
enabling. See configuration
encapsulation, 28
enterprise flash disks (EFDs), 195-197
Epping, Duncan, 444
Error Reporting page (SQL Server installation), 385
ESXi host swap file location, 78
ESXTOP counters, 325-327
ETL (Extract Transform Load), 64
expanding SQL virtual machine storage layout, 158-159
expenses, reducing, 9-10
F
Facebook groups, 443
Failover Cluster Instance (FCI), 98, 304-306
Failover Cluster Manager, 362
Failover Clustering
configuring, 359-368
Failover Cluster Instance (FCI), 98, 304-306
Failover Cluster Manager, 362
failover clustering environments, 185-186
network settings, 254-256
network teaming, 270-273
quorum mode, 369-374
storage layout, 157
validating, 368
FCI (Failover Cluster Instance), 98, 304-306
Feature Selection page (SQL Server installation), 379
file system layout, 110-122
data files, 123-126
log files, 123-126
NTFS file system allocation unit size, 126-127
OS, application binaries, and page file, 122
partition alignment, 128-129
Temp DB files, 123-126
files
database files
Instant File Initialization (IFI), 120-122
number of, 110-113
size of, 114-120
file system layout, 110-122
data files, 123-126
log files, 123-126
NTFS file system allocation unit size, 126-127
OS, application binaries, and page file, 122
partition alignment, 128-129
Temp DB files, 123-126
multiple tempdb files, 394-395
VLFs (Virtual Log Files), 118-120
VMDK files, configuring
inside guest operating system, 352-354
on virtual machines, 345-351
vswap, memory reservations and, 233-234
vswp files, 88
flash, server-side flash acceleration, 198-199
Fusion-io ioTurbine, 201-203
PernixData FVP, 204-206
vSphere Flash Read Cache (vFRC), 199-201
Flash Virtualization Platform (FVP), 204-206
frames
jumbo frames, 256-259
configuring, 259-262, 393-394
testing, 262-264
pause frames, 268
Free System Page Table Entries metric, 321
full virtualization, importance of, 36-38
Fusion-io ioTurbine, 201-203
Fusion-io ioTurbine Profiler, 203
FVP (Flash Virtualization Platform), 204-206
G
Gage, John, 281
GAVG/cmd for NFS Datastores counter, 326
General Statistics, 324
GPT (GUID Partition Table), 136
Gray, Jim, 302-303
groups
AlwaysOn Availability Groups
configuring, 387-391
creating, 399-405
database groups, 69
user groups
Facebook groups, 443
PASS (Professional Association of SQL Server), 441-442
VMUG (VMware Users Group), 440
VMWare Community, 442-443
guest OS storage, 27, 109
Buffer Pool, 129-130
Column Storage, 134-135
database availability, 135
database statistics, updating, 130-132
with maintenance plan, 131-132
with trace flag 2371, 131
Data Compression, 133
file system layout, 110, 122
data files, 123-126
log files, 123-126
NTFS file system allocation unit size, 126-127
OS, application binaries, and page file, 122
partition alignment, 128-129
Temp DB files, 123-126
Instant File Initialization (IFI), 120-122
number of database files, 110-112
size of database files, 114-116
data file sizing, 116
Temp DB file sizing, 116
transaction log file sizing, 117-120
Storage Spaces, 136
VMDK file configuration in, 352-354
Volume Managers, 136
GUID Partition Table (GPT), 136
H
HA (high availability)
ACID (Atomicity, Consistency, Isolation, and Durability), 302-303
DRS (Distributed Resource Scheduler), 297
hypervisor availability features, 294-296
sample high availability chart, 308-309
SQL Server AlwaysOn Availability Groups (AGs), 306-308
SQL Server AlwaysOn Failover Cluster Instance (FCI), 304-306
Storage DRS, 297
Storage vMotion, 297
vCenter SRM (Site Recovery Manager), 301
vCHS (vCloud Hybrid Service), 302
vDP (vSphere Data Protection), 300
vMotion, 296-297
vSphere App HA, 299-300
vSphere HA, 298-299
vSphere Replication, 300-301
X-vMotion, 298
hardware. See physical hardware
hardware independence, 28
hardware refresh and database virtualization, 20-22
Heap size, 160-162
heartbeat vNICs, 256
help. See resources
high availability. See HA (high availability)
high-level virtualization implementation plan, 50-51
phase 1: requirements gathering, 51-52
phase 2: discovery, 53
phase 2.1: database consolidations, 53
phase 3: infrastructure adjustments, 53
phase 4: validation and testing, 54
phase 5: migration and deployment, 54
phase 6: monitoring and management, 54
Hirt, Allan, 445
Hogan, Cormac, 445
host-local swap, 78
host memory, 225-226
host <servername> CPU% value, 434
Hot-Add CPU, 3-4, 356-358
Hot-Add Memory, 4-5, 356-358
HTT (Hyper-Threading Technology), 85-87
hyperconverged infrastructure, 207-213, 280
Hyper-Threading Technology (HTT), 85-87
Hyperic, 343-345
hypervisor, 25
availability features, 294-296
compared to OS, 26-27
explained, 25-27
importance of full virtualization, 36-38
one-to-many relationships, 40
one-to-one relationships and unused capacity, 38-40
paravirtualization, 29
PVSCSI (paravirtual SCSI driver), 31
Type-1 hypervisors, 30
Type-2 hypervisors, 31
virtualized database installation guidelines, 32-36
VMs (virtual machines), 28
VMware ESXi versions, 40-41
VMXNET3, 32
I
IFI (Instant File Initialization), 120-122
ILM (information life cycle management), 207
implementation plans
database workload baselines, 48-50
high-level plan, 50-51
items to consider, 44-45
phase 1: requirements gathering, 51-52
phase 2: discovery, 53
phase 2.1: database consolidations, 53
phase 3: infrastructure adjustments, 53
phase 4: validation and testing, 54
phase 5: migration and deployment, 54
phase 6: monitoring and management, 54
RPOs (recovery point objectives), 45-46
RTOs (recovery time objectives), 45-46
SLAs (service-level agreements), 45-46
vSphere infrastructure baselines, 46-48
Independent Persistent (SQL FCI), 157
indexes (database), 222-225
industry-standard benchmarks, 316
information life cycle management (ILM), 207
Infrastructure Navigator, 343-344
initialization, Instant File Initialization (IFI), 120-122
installation. See also configuration
Dell DVD Store, 406-430
SQL Server 2012, 374-377, 384-387
Complete page, 387
Database Engine Configuration page, 382
Disk Space Requirements page, 382
Feature Selection page, 379
Installation Configuration Rules page, 385
Installation Rules page, 380
Instance Configuration page, 380
License Terms page, 377
preflight check, 375
Product Key page, 375
Ready to Install page, 385
Server Configuration page, 382
Setup Role page, 379
virtualized database installation guidelines, 32-36
Installation Configuration Rules page (SQL Server installation), 385
Installation Rules page (SQL Server installation), 380
Instance Configuration page (SQL Server installation), 380
Instant File Initialization (IFI), 120-122
invalid assumptions, 334
IO blender effect, 151-152
IOBlazer, 327
IOMeter, 142-143, 327
ioTurbine (Fusion-io), 201-203
ioTurbine Profiler (Fusion-io), 203
IP addresses, requirements for performance testing, 342
iSCSI
adapters, 276
port binding, 281
isolation, 28, 302-303
J
jumbo frames, 256-259
configuring, 259-262, 393-394
testing, 262-264
Jumbo VMDKs, 159
Pointer Block Eviction Process, 163-164
VMFS Heap size considerations, 160-162
K
KAVG/cmd counter, 326
Keep It Standardized and Simple (KISS), 109
KISS principle (Keep It Standardized and Simple), 109
Klee, David, 445
L
LACP, 273
Lam, William, 444
large databases and database virtualization, 22-23
large pages, 79
breaking down into default page size, 238-239
explained, 237-238
locking pages in memory, 239-241
LaRock, Thomas, 445
latches, 324
layout
file system layout, 110
data files, 123-126
log files, 123-126
NTFS file system allocation unit size, 126-127
OS, application binaries, and page file, 122
partition alignment, 128-129
Temp DB files, 123-126
virtual machine storage layout, 152-157
Leaf-Spine network architecture, 273
licenses
License Terms page (SQL Server installation), 377
VMware vCloud Suite licenses, 285
load test (Dell DVD Store), 430-436
locking pages in memory, 92, 239-241
locks, 324
log files
file system layout, 123-126
sizing, 117-120
Log Flush Wait Time, 324
Log Flush Waits/sec, 324
LogicalDisk(*): Avg Disk Sec/Read, 322
LogicalDisk(*): Avg. Disk Sec/Write, 322
LogicalDisk Disk Bytes/sec, 321
Logins/sec, 324
Logout/sec, 324
Lowe, Scott, 445
LSI Logic SAS, 95, 137-142
LUN queue depth, 167
M
maintenance plans, updating database statistics with, 131-132
Maintenance Plan Wizard, 420-422
management, 54
Max Server Memory, 234-236
MaxAddressableSpaceTB, 163-164
maximum memory, configuring, 392
maximum storage capacity limits, 111
MbRX/s counter, 327
MbTx/s counter, 327
MCTLSZ (MB) counter, 47-48, 326
memory
Buffer Pool, 129-130
cache
Buffer Cache, 49
Buffer Cache Hit Ratio, 50, 323
CACHEUSED counter, 326
Fusion-io ioTurbine, 201-203
vRFC (vSphere Flash Read Cache), 199-201
host memory, 225-226
hot-add memory, 4-5
large pages
breaking down into default page size, 238-239
explained, 237-238
locking pages in memory, 239-241
memory ballooning, 230-232
Memory Grants Pending, 324
Memory Manager, 324
memory overcommitment, 87
memory reservation, 355
explained, 232-233
mixed workload environment with memory reservations, 226-228
VMware HA strict admission control, 233
vswap file, 233-234
memory trends and the stack, 218
database buffer pool, 219-220
database indexes, 222-225
database pages, 219-221
paging, 220-221
swapping, 220-221
min/max memory, configuring, 392
mixed workload environment with memory reservations, 226-228
NUMA (Non-uniform Memory Access)
explained, 241-243
vNUMA, 243
overview, 76, 217-218
RAM, 87
shared-resource world, 246
SQL Server 2014 in-memory, 246-247
SQL Server Max Server Memory, 234-236
SQL Server Min Server Memory, 235
TPS (transparent page sharing), 228
VMs (virtual machines), 225-226
number of, 244-245
sizing, 244
vRFC (vSphere Flash Read Cache), 199-201
xVelocity memory, 134
Memory Grants Pending, 324
Memory Manager, 324
metrics
baseline metrics
ESXTOP counters, 325-327
SQL Server baseline infrastructure metrics, 321-322
SQL Server Perfmon counters, 323-324
SQL Server Profiler counters, 324-325
Buffer Cache Hit Ratio, 50
Cache Hit Ratio, 50
DAVG/cmd, 48
MCTLSZ, 47-48
%MLMTD, 47-48
%RDY, 47
READs/s, 47-48
storage-specific metrics, 94
Microsoft Assessment and Planning Toolkit, 110
Microsoft Clustering on VMware vSphere: Guidelines for Supported Configurations, 345
Microsoft System Center, 110
migration, 54
Min Server Memory, 235
minimum memory, configuring, 392
mixed workload environment with memory reservations, 226-228
MLAG (Multi-Link Aggregation Group), 273
%MLMTD metric, 47-48
models (benchmark workload)
based on recorded production performance, 318
based on system nonfunctional requirements, 317
monitoring, 54
Multi-Link Aggregation Group (MLAG), 273
Multi-NIC vMotion, 276-278
multipathing of storage paths, 184-185, 280
multiple tempdb files, creating, 394-395
N
National Institute of Standards and Technology (NIST), 291
n_browse_overall value, 433
NDFS (Nutanix Distributed File System), 207
network connections, validating, 359
network design, 264
Multi-NIC vMotion, 276-278
NIOC (Network IO Control), 101, 274-276
physical network adapters, 267-269
storage, 279-280
teaming and failover, 270-273
virtual switches, 265-267
Network IO Control (NIOC), 101, 274-276
network paths, verifying, 262
network security, 103, 281-284
network teaming, 270-273
network virtualization, 281-284
New Availability Group Wizard, 399-405
NIOC (Network IO Control), 101, 274-276
NIST (National Institute of Standards and Technology), 291
N%L counter, 326
n_newcust_overall value, 433
non-production workload influences on performance, 331-332
non-shared disks, 345
non-uniform memory architecture. See NUMA
n_overall value, 432
n_purchase_from_start value, 434
n_purchase_overall value, 433
n_rollbacks_from_start value, 434
n_rollbacks_overall value, 434
NTFS allocation unit size, 126-127
NUMA (non-uniform memory architecture), 79-85
explained, 241-243
NUMA Scheduler, 81
vNUMA, 82, 243
Wide NUMA, 81
NUMA Scheduler, 81
Nutanix Bible (Poitras), 210
Nutanix Distributed File System (NDFS), 207
Nutanix Virtual Computing Platform, 207-213
O
OLAP (Online Analytical Processing), 64
OLTP (Online Transaction Processing), 64
one-to-many relationships, 40
one-to-one relationships and unused capacity, 38-40
Online Analytical Processing (OLAP), 64
Online Transaction Processing (OLTP), 64
operating systems. See OSs
opm value, 432
OSs (operating systems)
application binaries, and page file, 122
compared to hypervisor, 26-27
guest operating systems, 27
P
PAGEIOLATCH, 124
PAGELATCH_XX, 112
Page Life Expectancy metric, 323
pages
explained, 219-220
large pages
breaking down into default page size, 238-239
explained, 237-238
locking pages in memory, 239-241
paging, 78, 220-221
swapping, 78, 220-221
TPS (transparent page sharing), 228-229
Pages/Sec metrics, 321
paging, 78, 220-221
Paging File(_Total): %Usage metric, 321
parallelism of storage design, 106
paravirtualization, 29
Paravirtualized SCSI (PVSCI), 31, 95, 137-141
partition alignment, 128-129
partitioning, 28
patches, 68
pause frames, 268
penalties (RAID IO), 189-194
Perfmon counters, 50, 323-324
performance baselines, 311-315
baseline performance reports, 332-333
benchmarks, 315
developing, 317-318
industry-standard benchmarks, 316
validating performance with, 318
vendor benchmarks, 316-317
comparing, 319-327
different processor generations, 330-331
different processor types, 328-330
common performance traps
blended peaks of multiple systems, 335
failure to consider SCU (Single Compute Unit) performance, 335
invalid assumptions, 334
lack of background noise, 334
shared core infrastructure between production and non-production, 333-334
vMotion slot sizes of monster database virtual machines, 336-337
and database virtualization, 16-17
metrics
ESXTOP counters, 325-327
SQL Server baseline infrastructure metrics, 321-322
SQL Server Perfmon counters, 323-324
SQL Server Profiler counters, 324-325
non-production workload influences on performance, 331-332
server-side flash acceleration, 198-199
Fusion-io ioTurbine, 201-203
PernixData FVP, 204-206
vSphere Flash Read Cache (vFRC), 199-201
storage
KISS principle (Keep It Standardized and Simple), 109
performance and underlying storage devices, 107
sizing for performance before capacity, 107-108
virtualization without compromise, 108-109
validating performance with, 318
vSphere storage design, 164
multipathing, 184-185
number of data stores and data store queues, 165-169
number of virtual disks per data store, 170-173
RAID (Redundant Array of Independent Disks), 187-197
storage DRS, 177-183
Storage IO Control (SIOC), 173-177
storage policies, 177-183
vSphere 5.5 failover clustering environments, 185-186
when to record, 320
performance tests, 339
affinity and anti-affinity rules, 358
AlwaysOn Availability Groups configuration, 387-391
AlwaysOn Availability Groups creation, 399-405
Dell DVD Store installation and configuration, 406-430
Dell DVD Store load test, 430-436
Hot-Add Memory and Hot-Add CPU, 356-358
jumbo frame configuration, 393-394
lab configuration, 342-345
max/min memory configuration, 392
memory reservations, 355
multiple tempdb files, 394-395
network connection validation, 359
reasons for performance testing, 339-340
requirements
computer names and IP addresses, 341-342
resources, 342
software, 341
SQL Server 2012 installation, 374-377, 384-387
Complete page, 387
Database Engine Configuration page, 382
Disk Space Requirements page, 382
Error Reporting page, 385
Feature Selection page, 379
Installation Configuration Rules page, 385
Installation Rules page, 380
Instance Configuration page, 380
License Terms page, 377
preflight check, 375
Product Key page, 375
Ready to Install page, 385
Server Configuration page, 382
Setup Role page, 379
test database creation, 396-398
VMDK file configuration
inside guest operating system, 352-354
on virtual machines, 345-351
Windows Failover Clustering configuration, 359-368
Windows Failover Clustering quorum mode, 369-374
Windows Failover Clustering validation, 368
performance traps
blended peaks of multiple systems, 335
failure to consider SCU (Single Compute Unit) performance, 335
invalid assumptions, 334
lack of background noise, 334
shared core infrastructure between production and non-production, 333-334
vMotion slot sizes of monster database virtual machines, 336-337
PernixData FVP, 204-206
PernixData FVP Datasheet, 205
phases of virtualization implementation
database consolidations, 53
discovery, 53
infrastructure adjustments, 53
migration and deployment, 54
monitoring and management, 54
requirements gathering, 51-52
validation and testing, 54
physical hardware, 73
adapter count, 95
CPUs, 74-76
data stores, 99
disk layout, 95
hardware compatibility, 62
HTT (Hyper-Threading Technology), 85-87
large pages, 79
LSI Logic SAS adapters, 95
memory, 76
memory overcommitment, 87
NUMA (non-uniform memory architecture), 79-85
PVSCSI adapters, 95
reservations, 87-89
SQL Server Lock Pages in Memory, 92
SQL Server Min Server Memory/ Max Server Memory, 90-91
storage, 93
storage-specific metrics, 94
swapping files, 78
Thin Provisioning, 98-99
virtualization overhead, 76-77
VMDKs, 99
file size, 100
provisioning types, 96-98
versus RDM, 96
Physical Mode RDMs (pRDM), 159
physical network adapters, 267-269
PKTRX/s counter, 327
PKTTX/s counter, 327
plans. See implementation plans
Pointer Block Eviction Process, 163-164
Poitras, Steven, 210
policies (storage), 177-183
pool, database buffer, 219-220
port groups, 265
power management policies, 253
pRDM (Physical Mode RDMs), 159
processors baseline comparisons
between different processor generations, 330
between different processor types, 328-330
Processor(_Total): % Privileged Time metric, 321
Processor(_Total): % Processor Time metric, 321
Product Key page (SQL Server installation), 375
Product Updates page (SQL Server installation), 377
Professional Association of SQL Server (PASS)
PASS Virtualization Virtual Chapter, 441
PASS SQLSaturday, 441-442
Profiler counters, 324-325
protocols, 279-280
provisioning
and database virtualization, 17-20
VMDK, 96-98
PVSCSI (Paravirtualized SCSI), 31, 95, 137-141
Q
QFULL SCSI sense code, 166
QLogic HBA, 166
QoS, 102-103
Query Plan Optimizer, 130
queues, data store, 165-169
quorum mode (Failover Clustering), 369-374
R
RAID (Redundant Array of Independent Disks), 187
economics of RAID performance, 194-197
IO penalties, 189-194
randomness of IO pattern, 187-188
read/write bias, 188
RAM, 87
randomness of IO pattern, 187-188
Raw Device Map (RDM), 149-151
RDM (Raw Device Map), 149-151
versus VMDK, 96
%RDY counter, 47, 325
reading documentation, 43-44
READs/s counter, 47-48, 326
read/write bias, 188
Ready to Install page (SQL Server installation), 385
recovery
disaster recovery, 291-294
recovery point objectives (RPOs), 45-46, 290
recovery time objectives (RTOs), 45-46, 290
vCenter SRM (Site Recovery Manager), 301
recovery point objectives (RPOs), 45-46, 290
recovery time objectives (RTOs), 45-46, 290
reducing expenses, 9-10
Redundant Array of Independent Disks. See RAID
relationships
one-to-many relationships, 40
one-to-one relationships, 38-40
reorganizing SQL workloads, 68-69
replication (vSphere), 300-301
reports (performance), 332-333
requirements gathering, 51-52
reservation (memory), 87-89, 355
explained, 232-233
mixed workload environment with memory reservations, 226-228
VMware HA strict admission control, 233
vswap file, 233-234
RESETS/s counter, 327
resource pools, Network IO Control, 275
resources
blogs
Thomas LaRock's blog, 445
vLaunchPad, 444-445
documentation and white papers, 437-440
Twitter, 445-446
user groups, 440
Facebook groups, 443
PASS (Professional Association of SQL Server), 441-442
VMUG (VMware Users Group), 440
VMWare Community, 442-443
responsibility domains, 60-61
rollback_rate value, 434
Route Based on Physical NIC Load, 271
RPOs (recovery point objectives), 45-46, 290
rt_browse_avg_msec value, 433
rt_login_avg_msec value, 433
rt_newcust_avg_msec value, 433
RTOs (recovery time objectives), 45-46, 290
rt_purchase_avg_msec value, 433
rt_tot_avg, 433
rt_tot_avg value, 433
rt_tot_lastn_max value, 432
rules, affinity/anti-affinity, 358
S
SameSubnetDelay, 255
SameSubnetThreshold, 255
sample high availability chart, 308-309
SAP
benchmark examples between different processor generations, 330-331
benchmark examples between different processor types, 328-330
Data Compression with, 133
SCU (Single Compute Unit), 335
security, 281-284
Server Configuration page (SQL Server installation), 382
server-side flash acceleration, 198-199
Fusion-io ioTurbine, 201-203
PernixData FVP, 204-206
vSphere Flash Read Cache (vFRC), 199-201
service-level agreements (SLAs), 11-12, 45-46, 100, 290
settings (BIOS), 12-13
Setup for Failover Clustering and Microsoft Cluster Service (white paper), 439
Setup Role page (SQL Server installation), 379
shared core infrastructure between production and non-production, 333-334
shared disks, 345
shared environments, 20, 35
shared-resource world, 246
SIOC (Storage IO Control), 157, 173-177
Site Recovery Manager (SRM), 16, 301
sizing
Heap size, 160-162
database files, 114-116
data file sizing, 116
Temp DB file sizing, 116
transaction log files, 117-120
databases, 22-23
performance before capacity, 107-108
VMs (virtual machines), 244
SLAs (service-level agreements), 11-12, 45-46, 100, 290
software requirements for performance testing, 341
SP_Configure command, 246
SQL AlwaysOn Failover Cluster Instances, 157
SQL Compilations/sec metric, 324
SQL Re-Compilations/sec metric, 324
SQL Server 2012 installation, 374-387
Complete page, 387
Database Engine Configuration page, 382
Disk Space Requirements page, 382
Feature Selection page, 379
Installation Configuration Rules page, 385
Installation Rules page, 380
Instance Configuration page, 380
License Terms page, 377
preflight check, 375
Product Key page, 375
Ready to Install page, 385
Server Configuration page, 382
Setup Role page, 379
SQL Server 2014 & The Data Platform (data sheet), 247
SQL Server Best Practices (white paper), 210
SQL Server Max Server Memory, 90-91
SQL Server Min Server Memory, 90-91
SQL Server on VMware: Availability and Recovery Options (white paper), 439
SQL Server on VMware: Best Practices Guide (white paper), 439
SQL Server SysPrep, 71
SQL Statistics, 324
SQL workloads, 64-67
Batch/ETL workloads, 64
DSS workloads, 64
OLAP workloads, 64
OLTP workloads, 64
reorganization, 68-69
tiered database offering, 70-73
SQLIOsim, 327
SQLSaturday, 441-442
SRM (Site Recovery Manager), 16, 301
stack, memory trends and, 79, 218
database buffer pool, 219-220
database indexes, 222-225
database pages, 219-220
paging, 220-221
swapping, 220-221
storage
design principles
database as extension of storage, 106
KISS principle (Keep It Standardized and Simple), 109
performance and underlying storage devices, 107
sizing for performance before capacity, 107-108
virtualization without compromise, 108-109
overview, 93, 105-106
server-side flash acceleration, 198-199
Fusion-io ioTurbine, 201-203
PernixData FVP, 204-206
vSphere Flash Read Cache (vFRC), 199-201
SQL Server database and guest OS storage, 109
Buffer Pool, 129-130
Column Storage, 134-135
database availability, 135
database statistics, updating, 130-132
Data Compression, 133
file system layout, 110, 122-129
Instant File Initialization (IFI), 120-122
number of database files, 110-113
size of database files, 114-120
Storage Spaces, 136
Volume Managers, 136
SQL Server on hyperconverged infrastructure, 207-213
SQL Server virtual machine storage, 136
expanding, 158-159
Jumbo VMDKs, 159-164
layout, 152-157
virtual disk devices, 143-152
virtual storage controllers, 138-143
VM hardware version, 137
Storage Acceleration, 96
storage arrays, 98
Storage DRS, 177-183, 297
Storage IO Control (SIOC), 157, 173-176
storage networks, 279-280
storage policies, 177-183
storage protocols, 279-280
Storage Spaces, 136
storage-specific metrics, 94
vSphere storage design, 164
multipathing, 184-185
number of data stores and data store queues, 165-169
number of virtual disks per data store, 170-173
RAID (Redundant Array of Independent Disks), 187-197
storage DRS, 177-183
Storage IO Control (SIOC), 173-177
storage policies, 177-183
vSphere 5.5 failover clustering environments, 185-186
Storage Acceleration, 96
Storage DRS, 177-183, 297
Storage IO Control (SIOC), 157, 173-176
Storage Spaces, 136
Storage vMotion, 297
Strawberry Perl, 406
swap files, 78
Swapin counter, 326
Swapout counter, 326
swapping, 78, 220-221
switches
virtual switches, 265-267
distributed virtual switch, 100
port groups, 265
teaming methods, 270
vSphere distributed switches, 263
vSS, 265
%SWPWT counter, 326
%SYS counter, 325
Szastak, Jeff, 311
T
Target Server Memory (KB) metric, 324
TDE (Transparent Data Encryption), 122
teams
Center of Excellence (CoE), 61-63
communication, 58
mutual understanding, 59-60
responsibility domains, 60-61
Temp DB files
file system layout, 123-126
multiple tempdb files, creating, 394-395
sizing, 116
test databases, creating, 396-398
testing, 54
baseline. See performance baselines
benchmarks, 315
developing, 317-318
industry-standard benchmarks, 316
vendor benchmarks, 316-317
jumbo frames, 262-264
performance tests, 339
affinity and anti-affinity rules, 358
AlwaysOn Availability Groups configuration, 387-391
AlwaysOn Availability Groups creation, 399-405
Dell DVD Store installation and configuration, 406-430
Dell DVD Store load test, 430-436
Hot-Add Memory and Hot-Add CPU, 356-358
jumbo frame configuration, 393-394
lab configuration, 342-345
max/min memory configuration, 392
memory reservations, 355
multiple tempdb files, 394-395
network connection validation, 359
reasons for performance testing, 339-340
requirements, 341-342
SQL Server 2012 installation, 374-387
test database creation, 396-398
VMDK file configuration, 345-354
Windows Failover Clustering configuration, 359-368
Windows Failover Clustering quorum mode, 369-374
Windows Failover Clustering validation, 368
Thick Eager Zero disks, 147-148
Thick Lazy Zero disks, 144-146
Thick Provisioned Eager Zeroed, 97
Thick Provisioned Lazy Zeroed, 97
Thick Provisioned LUNs, 98
Thin disks, 144-146
Thin Provisioned LUNs, 98
Thin Provisioned VMDK, 97-99
Thin Provisioning, 98-99
tiered database offering, 70-73
tiers (database), 19-20
TLB (translation lookaside buffer), 79
TPC (Transaction Processing Performance Council), 316
TPS (transparent page sharing), 79, 228-229
trace flags
enabling, 215
list of, 215
trace flag 2371, 131
traffic types, 101-102
transaction log files, sizing, 117-120
Transaction Processing Performance Council (TPC), 316
translation lookaside buffer (TLB), 79
Transactions/sec metric, 324
Transparent Data Encryption (TDE), 122
transparent page sharing (TPS), 79, 228-229
troubleshooting common performance traps
blended peaks of multiple systems, 335
failure to consider SCU (Single Compute Unit) performance, 335
invalid assumptions, 334
lack of background noise, 334
shared core infrastructure between production and non-production, 333-334
vMotion slot sizes of monster database virtual machines, 336-337
tuning virtual network adapters, 252-254
Twitter, 445-446
Type-1 hypervisors, 30
Type-2 hypervisors, 31
U
Understanding Memory Resource Management in VMware vSphere 5.0 (study), 229-230
Understanding VMware vSphere 5.1 Storage DRS (white paper), 182
unused capacity and one-to-one relationships, 38-40
updating database statistics, 130-132
with maintenance plan, 131-132
with trace flag 2371, 131
%USED counter, 325
User Connections metric, 324
user groups, 440
Facebook groups, 443
PASS (Professional Association of SQL Server)
PASS Virtualization Virtual Chapter, 441
PASS SQLSaturday, 441-442
VMUG (VMware Users Group), 440
VMware Community, 442-443
V
VAAI (vStorage APIs for Array Integration), 96
Validate a Configuration Wizard, 364
validation, 54
cluster network configuration, 368
network connections, 359
performance with baselines/benchmarks, 318
vCenter Hyperic, 343-345
vCenter Infrastructure Navigator, 343-344
vCenter Operations Manager, 87
vCenter SRM (Site Recovery Manager), 301
vCHS (vCloud Hybrid Service), 302
vCloud Hybrid Service (vCHS), 302
vCPUs (virtual CPUs), 4
vDP (vSphere Data Protection), 266, 300
vDS (vSphere distributed switch), 262
Multi-NIC vMotion, 277
vendor benchmarks, 316-317
verifying network paths, 262
Virtual Computing Platform (Nutanix), 207-213
virtual CPUs (vCPUs), 4
virtual disks, number per data store, 170-173
Virtual Log Files (VLFs), 118-120
virtuallyGhetto, 444
virtual machine storage, 136
expanding, 158-159
Jumbo VMDKs, 159
Pointer Block Eviction Process, 163-164
VMFS Heap size considerations, 160-162
layout, 152-157
number of VMs, 244-245
sizing, 244
virtual disk devices, 143
IO blender effect, 151-152
Raw Device Map (RDM), 149-151
Thick Eager Zero disks, 147-148
Thin versus Thick Lazy Zero disks, 144-146
virtual storage controllers, 138-143
VM hardware version, 137
VM memory, 225-226
Virtual Mode RDMs (vRDM), 159
virtual network adapters, 100
choosing, 250-251
traffic types, 101-102
tuning, 252-254
virtual server access ports, 281
virtual switches, 265-267
distributed virtual switch, 100
port groups, 265
teaming methods, 270
virtualization, 93
advantages of, 3
hot-add CPUs, 3-4
hot-add memory, 4-5
business case for, 9
BIOS settings, 12-13
DBA (database administrator) advantages, 10-11
hardware refresh, 20-22
high availability, 14-16
large databases, 22-23
performance, 16-17
provisioning/DBaaS, 17-20
reduced expenses, 9-10
SLAs (service level agreements), 11-12
and compromise, 108-109
documentation, 43-44
explained, 1-2
hypervisor, 25
compared to OS, 26-27
explained, 25-27
importance of full virtualization, 36-38
one-to-many relationships, 40
one-to-one relationships and unused capacity, 38-40
paravirtualization, 29
PVSCSI (paravirtual SCSI driver), 31
Type-1 hypervisors, 30
Type-2 hypervisors, 31
virtualized database installation guidelines, 32-36
VMs (virtual machines), 28
VMware ESXi versions, 40-41
VMXNET3, 32
implementation plan
database workload baselines, 48-50
high-level plan, 50-51
items to consider, 44-45
phase 1: requirements gathering, 51-52
phase 2: discovery, 53
phase 2.1: database consolidations, 53
phase 3: infrastructure adjustments, 53
phase 4: validation and testing, 54
phase 5: migration and deployment, 54
phase 6: monitoring and management, 54
RPOs (recovery point objectives), 45-46
RTOs (recovery time objectives), 45-46
SLAs (service-level agreements), 45-46
vSphere infrastructure baselines, 46-48
importance of full virtualization, 36-38
overhead, 76-77
performance baselines
baseline performance reports, 332-333
benchmarks, 315-318
common performance traps, 333-337
comparing, 328-331
explained, 311-315
metrics, 321-327
non-production workload influences on performance, 331-332
reasons for, 319-320
validating performance with, 318
when to record, 320
power company example, 6
world before database virtualization, 5-6
Virtualization Overview (white paper), 21
virtualized database installation guidelines, 32-36
virtualized security zones, 283
vLaunchPad, 444-445
VLFs (Virtual Log Files), 118-120
VMDKs, 99
files, configuring
file size, 100
inside guest operating system, 352-354
on virtual machines, 345-351
Jumbo VMDKs, 159-160
Pointer Block Eviction Process, 163-164
VMFS Heap size considerations, 160-162
provisioning types, 96-98
versus RDM, 96
virtual machine storage layout, 152-157
VMFS heap size considerations, 160-162
vMotion, 296-297
slot sizes of monster database virtual machines, 336-337
traffic, 276
VMs (virtual machines). See virtual machine storage
VMUG (VMware Users Group), 440
%VMWait counter, 326
VMware App Director, 70
VMware Capacity Planner, 110
VMware Community, 442-443
VMware ESXi versions, 40-41
VMware HA strict admission control, 233
VMware NSX, 283
VMware PowerCLI, 253
VMware Site Recovery Manager (SRM), 16
VMware vCloud Suite licenses, 285
VMware vSphere on Nutanix Best Practices (white paper), 210
VMXNET3, 32, 100, 251, 257
vNIC, 256
vNUMA, 82, 243
Volume Managers, 136
vRAM, 87
vRDM (Virtual Mode RDMs), 159
vFRC (vSphere Flash Read Cache), 199-201
vSphere, 98-99
baselining, 46-48
ESXTOP counters, 325-327
failover clustering environments, 185-186
high availability
DRS (Distributed Resource Scheduler), 297
hypervisor availability features, 294-296
Storage DRS, 297
Storage vMotion, 297
vCenter SRM (Site Recovery Manager), 301
vCHS (vCloud Hybrid Service), 302
vDP (vSphere Data Protection), 300
vMotion, 296-297
vSphere App HA, 299-300
vSphere HA, 298-299
vSphere Replication, 300-301
X-vMotion, 298
storage design, 164
multipathing, 184-185
number of data stores and data store queues, 165-169
number of virtual disks per data store, 170-173
RAID (Redundant Array of Independent Disks), 187-197
storage DRS, 177-183
Storage IO Control (SIOC), 173-177
storage policies, 177-183
vSphere 5.5 failover clustering environments, 185-186
vDS (vSphere distributed switch), 262-263
vFRC (vSphere Flash Read Cache), 199-201
vNUMA, 243
vSphere App HA, 299-300
vSphere HA, 298-299
vSphere Hot Plug Memory, 91
vSphere Replication, 300-301
vSphere Monitoring & Performance (white paper), 440
vSphere Resource Management (white paper), 440
vSphere Storage (white paper), 440
vSphere Web Client, 260
vSS (vSphere standard switch), 265, 271
vStorage APIs for Array Integration (VAAI), 96
vswap file, memory reservations and, 233-234
W
Webster, Michael, 106, 170
white papers
DBA Guide to Databases on VMware, 439
Setup for Failover Clustering and Microsoft Cluster Service, 439
SQL Server Best Practices, 210
SQL Server on VMware: Availability and Recovery Options, 439
SQL Server on VMware: Best Practices Guide, 439
Understanding VMware vSphere 5.1 Storage DRS, 182
VMware vSphere on Nutanix Best Practices, 210
vSphere Monitoring & Performance, 440
vSphere Resource Management, 440
vSphere Storage, 440
Wide NUMA, 81
Windows Failover Cluster Heartbeat settings, 255
Windows Failover Clustering
configuring, 359-368
network settings, 254-256
quorum mode, 369-374
validating, 368
Windows Server Failover Clustering (WSFC), 304
Windows vNIC properties page, 261
wizards
Cluster Validation Wizard, 363-364
Configure Cluster Quorum Wizard, 370-374
Create Cluster Wizard, 366
Dell DVD Store Custom Install Wizard, 408
Maintenance Plan Wizard, 420-422
New Availability Group Wizard, 399-405
Validate a Configuration Wizard, 364
Workload Driver Configuration Wizard, 409
Workload Driver Configuration Wizard, 409
workloads
based on recorded production performance, 318
based on system nonfunctional requirements, 317
baselining, 48-50
mixed workload environment with memory reservations, 226-228
SQL workloads, 64-67
reorganization, 68-69
tiered database offering, 70-73
worlds (VMs), 86
Writes/s counter, 326
WSFC (Windows Server Failover Clustering), 304
X-Y-Z
xVelocity memory, 134
X-vMotion, 298
Yellow-Bricks, 444