SQL Server Replication
By: Brad McGehee
Apr 21, 2005
Replication allows you to "scale out" your SQL Server by distributing the processing
load over multiple servers. For example, you might want to move your reporting
workload from the main (transactional) server to a separate reporting server. This is
relatively easy to accomplish by implementing transactional replication.
Replication offers many opportunities to scale out your current SQL Server, helping
to boost the overall performance of your SQL Server-based applications. [6.5, 7.0,
2000, 2005] Updated 8-5-2005
*****
Add additional RAM beyond what you would need for running SQL Server without
replication; the replication agents and the distribution database add to the server's
memory requirements.
*****
Don't publish more data than you need, whether that means tables, rows, or
columns. You can use vertical and/or horizontal filtering to prevent specific rows or
columns of a table from being published.
If your tables include IMAGE, NTEXT, or TEXT columns, you should seriously
consider filtering out these columns (vertical filtering), because these data types can
greatly slow down replication. [6.5, 7.0, 2000, 2005] Updated 8-3-2003
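A minimal sketch of what this looks like when articles are scripted rather than created
through the wizard, using a hypothetical SalesPub publication and dbo.Customers table
(run in the publication database at the Publisher):

-- Horizontal filter (rows): only rows matching the filter clause are published.
-- Vertical filter (columns): with @vertical_partition = 'true', only columns
-- explicitly added with sp_articlecolumn are published, so wide
-- IMAGE/NTEXT/TEXT columns can simply be left out.
EXEC sp_addarticle
    @publication        = N'SalesPub',
    @article            = N'Customers',
    @source_owner       = N'dbo',
    @source_object      = N'Customers',
    @type               = N'logbased',
    @filter_clause      = N'Region = ''West''',
    @vertical_partition = N'true';

EXEC sp_articlecolumn
    @publication = N'SalesPub', @article = N'Customers',
    @column = N'CustomerID', @operation = N'add';   -- always include the primary key
EXEC sp_articlecolumn
    @publication = N'SalesPub', @article = N'Customers',
    @column = N'Region', @operation = N'add';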
*****
When creating the distribution database and its log, don't accept the default file
locations. Instead, manually specify that the data file and the log be placed on
appropriate disk arrays.
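If you script the Distributor instead of using the wizard, the same choice is made with
sp_adddistributiondb. A minimal sketch, with hypothetical drive letters and sizes:

-- Place the distribution database's data and log files on separate,
-- appropriate arrays instead of the default data directory.
EXEC sp_adddistributiondb
    @database       = N'distribution',
    @data_folder    = N'E:\SQLData',
    @data_file      = N'distribution.mdf',
    @data_file_size = 500,                 -- MB
    @log_folder     = N'F:\SQLLogs',
    @log_file       = N'distribution.ldf',
    @log_file_size  = 200,                 -- MB
    @security_mode  = 1;                   -- Windows authentication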
*****
When creating a publication, SQL Server gives you the ability to specify that
one or more of the subscribers will be Microsoft Jet databases. If you need
this feature, turn it on. But if you will not have any Microsoft Jet subscribers,
don't select this option. When you select it, the Snapshot Agent uses the BCP
character format instead of the faster BCP native format. This also means that any
SQL Server subscribers have to do extra work translating the BCP character format
when they receive the publication, which causes a slight performance hit.
[7.0, 2000] Updated 9-17-2004
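When the publication is created with T-SQL rather than the wizard, this choice
corresponds to the @sync_method parameter of sp_addpublication. A minimal sketch,
using a hypothetical publication name:

-- 'native' uses the faster native-mode BCP and supports only SQL Server
-- subscribers; 'character' is what the wizard chooses when non-SQL Server
-- (e.g., Jet) subscribers are allowed.
EXEC sp_addpublication
    @publication = N'SalesPub',
    @sync_method = N'native',     -- use N'character' only if you truly need Jet subscribers
    @repl_freq   = N'continuous',
    @status      = N'active';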
*****
If you are filtering the data on the publisher before it gets to any of the
subscribers, be sure that the column or columns you are filtering on have an
appropriate index. If not, SQL Server will perform a table scan to filter the data,
putting an increased load on the publisher and reducing performance. [6.5, 7.0,
2000, 2005] Updated 9-17-2004
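A minimal sketch, reusing the hypothetical filter column from the earlier filtering
example:

-- Index the column referenced in the article's row filter so filtering
-- does not require a scan of the whole table.
CREATE NONCLUSTERED INDEX IX_Customers_Region
    ON dbo.Customers (Region);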
*****
If the replication load is very heavy between the publisher and the distributor, or
between the distributor and the subscriber(s), each of these is on its own dedicated
SQL Server connected by a fast LAN, and the latency among the servers is longer
than you prefer, consider this option: instead of using public network connections
between the servers, use a dedicated private network. This can be accomplished by
adding an additional NIC to each server and connecting them with a private hub,
switch, or cross-over cable. This technique boosts performance because replication
traffic no longer has to compete with the data already traveling on your public
network. To ensure the highest performance, use 100 Mbps or 1 Gbps NICs and
matching connection hardware and cables. [6.5, 7.0, 2000, 2005] Updated 9-17-2004
*****
When you first click on the Replication Monitor group in Enterprise Manager, after
having installed Replication on your server, you see a message asking you if you
want to enable polling for Replication Monitor.
What this means is that the replication agent status information, found in the
Replication Monitor group in Enterprise Manager, will automatically be refreshed
periodically if you answer Yes, or it won't be automatically refreshed if you answer
No.
If you answer Yes, then you will see another screen that allows you to specify how
long the refresh interval will be.
As you might imagine, if you answer Yes, SQL Server incurs some additional
overhead to automatically refresh the status information. For the most part, the
default 10-second refresh interval doesn't produce a significant amount of overhead,
and it is a good place to start. If you enter a refresh interval of less than 10 seconds,
such as 1 or 2 seconds, you will probably notice the overhead and be unhappy with
the selection. If you are currently having performance problems, you may want to
increase the default 10 seconds to a larger value, such as 30 seconds or so.
Once you set the refresh interval, you can always change it by right-clicking on
Replication Monitor in Enterprise Manager and selecting "Refresh Rate and Settings."
[7.0, 2000] Updated 9-17-2004
*****
If the subscriber data is being indexed heavily for the decision support needs of
the organization, keep in mind that those same indexes slow down moving data
from the distributor to the subscriber. The more indexes there are, the slower the
replication process, because any new data moved to the subscriber must also be
indexed. In some cases, it might be beneficial to create an indexed view on the
publisher, and then publish it as a table to subscribers. [2000, 2005] Added 10-9-2001
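A minimal sketch of the indexed-view approach, assuming a hypothetical dbo.Orders
table on the publisher:

-- Pre-aggregate on the publisher so subscribers receive a small summary
-- table instead of indexing the full detail data themselves.
CREATE VIEW dbo.vw_OrderTotals
WITH SCHEMABINDING
AS
SELECT  CustomerID,
        COUNT_BIG(*)          AS OrderCount,
        SUM(ISNULL(Total, 0)) AS TotalAmount
FROM    dbo.Orders
GROUP BY CustomerID;
GO

-- The unique clustered index is what materializes the view.
CREATE UNIQUE CLUSTERED INDEX IX_vw_OrderTotals
    ON dbo.vw_OrderTotals (CustomerID);
GO

The view could then be added to a transactional publication as a table article (for
example, with sp_addarticle and @type = N'indexed view logbased').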
*****
The Distribution Agent, Log Reader Agent, Merge Agent, and Snapshot Agent all
log their activity as they work. The amount of data they log depends on their
verbosity settings: the more verbose the setting, the more overhead it incurs.
According to Microsoft, setting these to their lowest levels can boost replication
performance by 10 to 15 percent. In most cases, there are two verbosity settings to
change for each agent:
Distribution Agent
distrib -HistoryVerboseLevel 1
distrib -OutputVerboseLevel 0
Log Reader Agent
logread -HistoryVerboseLevel 1
logread -OutputVerboseLevel 0
Merge Agent
replmerg -HistoryVerboseLevel 1
replmerg -OutputVerboseLevel 0
Snapshot Agent
snapshot -HistoryVerboseLevel 1
snapshot -OutputVerboseLevel 0
The default value for the HistoryVerboseLevel is 2 (not 1), and the default value for
the OutputVerboseLevel is 2 (not 0).
Of course, if you are troubleshooting replication, you will not want to reduce these
values. But if you are in production, and all is going well, you should consider
changing the relevant default values in order to boost performance. [7.0,
2000] Updated 3-20-2006
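A minimal sketch of one way to apply these settings: modify the "Run agent." step of
the agent's SQL Server Agent job so its command includes the lower verbosity
switches. The job name and connection parameters below are hypothetical; keep
whatever parameters the step already uses and only add or change the two verbosity
switches.

USE msdb;
GO
-- The "Run agent." step is typically step 2 of a replication agent job.
EXEC dbo.sp_update_jobstep
    @job_name = N'SALESSRV-SalesDB-SalesPub-REPORTSRV-1',  -- hypothetical Distribution Agent job
    @step_id  = 2,
    @command  = N'-Publisher [SALESSRV] -PublisherDB [SalesDB] -Distributor [DISTSRV] -DistributorSecurityMode 1 -Subscriber [REPORTSRV] -SubscriberDB [SalesReporting] -SubscriberSecurityMode 1 -Publication [SalesPub] -HistoryVerboseLevel 1 -OutputVerboseLevel 0';
GO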
*****
If you need a replication agent to run frequently, say every minute or so, it is
more efficient to set it to run continuously than to run it on a frequent schedule,
because the agents incur overhead each time they start and stop. So if you have set
the replication agents to run often, consider setting them to run continuously
instead. On the other hand, if you only need the agents to run occasionally, say once
an hour or less often, then a scheduled run is more efficient than running them
continuously. You may have to perform tests to see which option best meets your
specific replication needs. [7.0, 2000, 2005] Updated 3-20-2006
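A minimal sketch of the continuous option when a push subscription's Distribution
Agent is created with T-SQL (sp_addpushsubscription_agent, SQL Server 2005);
server, database, and publication names are hypothetical:

-- A frequency type of 64 runs the Distribution Agent continuously rather
-- than starting and stopping it on a repeating schedule.
EXEC sp_addpushsubscription_agent
    @publication              = N'SalesPub',
    @subscriber               = N'REPORTSRV',
    @subscriber_db            = N'SalesReporting',
    @subscriber_security_mode = 1,      -- Windows authentication
    @frequency_type           = 64;     -- run continuously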
*****
To optimize the performance of the distribution database, manually size it to
the largest size you expect it will ever reach. This reduces overhead because SQL
Server will not have to grow the files automatically as needed. In addition, be sure
the "Auto Shrink" and "Auto Close" database options are turned off, as they add
unnecessary overhead to the database. [7.0, 2000, 2005] Updated 3-20-2006
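A minimal sketch of doing this with T-SQL; the logical file name and target size are
hypothetical (the distribution data file's logical name is usually 'distribution', but
check sysfiles/sys.database_files to be sure):

-- Pre-size the data file and make sure the costly options are off.
ALTER DATABASE distribution
    MODIFY FILE (NAME = distribution, SIZE = 2000MB);
ALTER DATABASE distribution SET AUTO_SHRINK OFF;
ALTER DATABASE distribution SET AUTO_CLOSE OFF;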
*****
If your replication configuration causes a large number of rows to be affected,
consider using stored procedure replication. Significant overhead can be avoided if
the execution of a stored procedure is replicated instead of the data changes caused
by that execution. This is generally accomplished in snapshot or transactional
replication by specifying one or more stored procedures as articles. [2000, 2005]
Added 10-9-2001
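A minimal sketch of publishing a stored procedure's execution as an article in a
transactional publication; the publication and procedure names are hypothetical:

-- Only the EXEC call (and its parameters) is replicated to subscribers,
-- not every individual row the procedure changes.
EXEC sp_addarticle
    @publication   = N'SalesPub',
    @article       = N'usp_ArchiveOldOrders',
    @source_owner  = N'dbo',
    @source_object = N'usp_ArchiveOldOrders',
    @type          = N'serializable proc exec';   -- or N'proc exec'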
*****
To help reduce overhead and the size of the distribution database, consider
reducing the retention period for agent history and replicated transactions. To do
this, right-click the "Replication" folder, select "Configure Publishing, Subscribers,
and Distributors," and then click the "Properties" button for the distribution
database to make your change. [2000] Added 10-9-2001
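The same settings can be changed with T-SQL. A minimal sketch, with example
retention values in hours:

-- Run at the Distributor.
EXEC sp_changedistributiondb
    @database = N'distribution',
    @property = N'max_distretention',
    @value    = 48;     -- keep replicated transactions at most 48 hours
EXEC sp_changedistributiondb
    @database = N'distribution',
    @property = N'history_retention',
    @value    = 24;     -- keep agent history for 24 hours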
*****
Avoid creating triggers on tables that contain subscribed data. They can
significantly add to overhead and slow replication performance, including snapshot
replication, transactional replication, and merge replication configurations. [7.0,
2000, 2005] Updated 3-20-2006
*****
To monitor how quickly the distributor is delivering data, watch these two System
Monitor (Performance Monitor) counters, found under the SQL Server Replication
Dist. object:
• Dist: Delivered Cmds/sec: Tracks the number of commands per second sent
to subscribers from the distributor.
• Dist: Delivered Trans/sec: Tracks the number of transactions per second sent
to subscribers from the distributor.
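These counters can also be read from inside SQL Server. A minimal sketch using the
sysperfinfo system table (SQL Server 2000; object and counter names can vary
slightly by version, so LIKE patterns are used):

SELECT object_name, counter_name, instance_name, cntr_value
FROM   master.dbo.sysperfinfo
WHERE  object_name  LIKE '%Replication Dist%'
  AND  counter_name LIKE 'Dist:Delivered%';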
Replication is a set of technologies for copying and distributing data and database objects from one
database to another and then synchronizing between databases to maintain consistency. Using
replication, you can distribute data to different locations and to remote or mobile users over local
and wide area networks, dial-up connections, wireless connections, and the Internet.
Transactional replication is typically used in server-to-server scenarios that require high throughput,
including: improving scalability and availability; data warehousing and reporting; integrating data
from multiple sites; integrating heterogeneous data; and offloading batch processing. Merge
replication is primarily designed for mobile applications or distributed server applications that have
possible data conflicts. Common scenarios include: exchanging data with mobile users; consumer
point of sale (POS) applications; and integration of data from multiple sites. Snapshot replication is
used to provide the initial data set for transactional and merge replication; it can also be used when
complete refreshes of data are appropriate. With these three types of replication, SQL Server
provides a powerful and flexible system for synchronizing data across your enterprise.
In addition to replication, in SQL Server 2008 you can synchronize databases by using Microsoft
Sync Framework and Sync Services for ADO.NET. Sync Services for ADO.NET provides an intuitive
and flexible API that you can use to build applications that target offline and collaboration scenarios.
For an overview of Sync Services for ADO.NET, see Microsoft Sync Framework. For complete
documentation, see this MSDN Web site.
SQL Constraints
Constraints are used to limit the type of data that can go into a table.
Constraints can be specified when a table is created (with the CREATE TABLE statement) or after
the table is created (with the ALTER TABLE statement). The most commonly used constraints are:
• NOT NULL
• UNIQUE
• PRIMARY KEY
• FOREIGN KEY
• CHECK
• DEFAULT
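A minimal sketch showing each of these constraint types on a single hypothetical table
(the referenced dbo.Orders table is also hypothetical):

CREATE TABLE dbo.OrderItems
(
    OrderItemID int         NOT NULL PRIMARY KEY,                -- NOT NULL + PRIMARY KEY
    OrderID     int         NOT NULL
        FOREIGN KEY REFERENCES dbo.Orders (OrderID),              -- FOREIGN KEY
    LineCode    varchar(20) NOT NULL UNIQUE,                      -- UNIQUE
    Quantity    int         NOT NULL
        CONSTRAINT CK_OrderItems_Quantity CHECK (Quantity > 0),   -- CHECK
    CreatedOn   datetime    NOT NULL
        CONSTRAINT DF_OrderItems_CreatedOn DEFAULT (GETDATE())    -- DEFAULT
);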