Chapter 9 • Managing Replication

Components

Snapshot replication uses two agents: the snapshot agent and the distribution agent.

Snapshot Agent

The snapshot agent is shared by all replication types, because every kind of replication needs initial schema and data scripts. The snapshot agent's job is to generate these scripts for the objects you are replicating. The scripts are written to a folder in the file system called the snapshot folder. Apart from writing scripts to the snapshot folder, the snapshot agent writes commands to the distribution database.

Distribution Agent

The distribution agent reads from the snapshot folder and the distribution database and propagates the schema and data to the subscriber. In snapshot replication the subscription database is not read-only, which means that you can modify data at the subscriber. However, your changes are overwritten the next time the publisher's snapshot agent runs and a new snapshot is applied.

Transactional

Transactional replication is the most widely used of the three available replication types. Its most valuable feature is the ability to replicate incremental changes rather than reapplying the entire data set. Transactional replication reads the transaction log to generate the transactions to replicate. However, your database recovery model does not have to be Full or Bulk-Logged; transactional replication works with every recovery model. Every table that you replicate must have a primary key. Transactional replication is much more scalable than snapshot replication, mainly because propagating incremental changes takes less time than reapplying a full snapshot.

Here are the scenarios where you can use transactional replication:

- Replicating huge volumes of data. Because transactional replication propagates only incremental changes to the subscribers, it can handle large volumes of data.
- Real-time applications. For real-time applications you need to replicate data with minimum latency.
  Because transactional replication reads changes from the transaction log, the latency between the time changes are made at the publisher and the time they arrive at the subscriber is low.
- Replicating data between SQL Server and non-SQL Server databases. Transactional replication allows you to replicate data between Oracle and SQL Server. An Oracle server instance can be a publisher or a subscriber. However, Oracle publishing is available only in the Enterprise and Developer editions of SQL Server 2008.

Components

Transactional replication uses three components: the snapshot agent, the log reader agent, and the distribution agent. The snapshot agent was described in a previous section.

The log reader agent's task is to monitor changes in the transaction log and copy them to the distribution database. When the log reader captures data from the transaction log, it updates the dbo.MSrepl_transactions table in the distribution database. It then generates the commands that need to be run at the subscriber and stores them in the dbo.MSrepl_commands table in the distribution database. These commands are stored in binary format; if you need to read them, you can use the sp_browsereplcmds stored procedure in the distribution database.

The distribution agent's task is to propagate the commands in the distribution database to all subscribers. After all the commands have been delivered to all subscribers, they are removed from the distribution database.

Two additional replication mechanisms come under transactional replication: updatable subscriptions and peer-to-peer replication.

Updatable Subscription

With SQL Server 2005, a new replication type was introduced called Transactional publication with updatable subscriptions. This is another transactional replication type, in which subscribers can be updated.
This feature is available through two options in transactional replication (when either is configured, a new column of type uniqueidentifier and a trigger are added to the table):

- Allow immediate updating subscriptions. Changes occurring at the subscriber are written to the publisher using the Microsoft Distributed Transaction Coordinator (MS DTC). Therefore, you need to make sure that the MS DTC service is started.
- Allow queued updating subscriptions. Changes occurring at the subscriber are replicated to the publisher using the queue reader agent. Transactional replication with queued updating works best when there are few subscribers and changes at the subscriber are infrequent.

Peer-to-Peer Replication

Peer-to-peer replication is an option with transactional replication. All nodes in a peer-to-peer replication topology publish to and subscribe from all other nodes. A transaction originating at one node is replicated to all other nodes, but is not replicated back to the originator. This replication model is intended for applications in which multiple databases or database servers participate in a scale-out solution. Refer to Figure 9.1.

Figure 9.1 Layout of Peer-to-Peer Replication

Clients may connect to one of many databases, which typically have the same data. One of the databases or database servers can be removed from the bank of servers participating in the scale-out solution, and the load will be distributed among the remaining servers.

New & Noteworthy…

Peer-to-Peer Replication Configuration and Features

SQL Server 2008 introduces a Peer-to-Peer Replication dialog that makes peer-to-peer replication easier to configure. In the previous version of SQL Server, you had to type the server name and database name, and you had no way to visualize the peer-to-peer replication configuration. Also, with SQL Server 2008, you can drop the connection between two servers. Configuration of peer-to-peer replication will be discussed later in the chapter.

In earlier versions of SQL Server, you can add a node to a topology and connect the new node to one existing node. To connect the new node to more than one existing node, you must suspend all activity in the topology and then make sure that all pending changes are delivered to all nodes. In SQL Server 2008, you can connect the new node to any number of existing nodes without suspending activity in the topology. This is made possible by the Configure Peer-to-Peer Topology Wizard or by specifying a value of initialize from lsn for the @sync_type parameter of sp_addsubscription.

Peer-to-peer replication in SQL Server 2008 can detect conflicts during synchronization. This option is enabled by default, and it allows the Distribution Agent to detect conflicts and to stop processing changes at the affected node.

However, peer-to-peer replication is available only in the SQL Server Enterprise and Developer editions, which is a drawback. Also, after configuring peer-to-peer replication you cannot disable it. Another major disadvantage of peer-to-peer replication is that it does not support filtering.

Exam Warning

There can be questions about selecting the correct replication type when there are updates at both the publisher and the subscriber. Most users may have used replication in SQL Server 2000, and since there was no possibility of updating at the subscriber, most users will think that the only relevant replication type is Merge Replication. However, you need to remember that Transactional Replication has an option for updatable subscribers.

Merge

Merge Replication, like the other replication technologies, starts from an initial snapshot. Afterward, changes at both the publisher and the subscriber(s) are tracked with triggers. Merge Replication does not propagate intermediate states of the data; instead, it propagates only the net changes.
For example, if a row changes three times at a Subscriber before it synchronizes with a Publisher, the row changes only once at the Publisher to reflect the net data change. This improves the performance of Merge Replication.

Here are the scenarios where you can use Merge Replication:

- Working offline. If you have a system in which you download a data set, work offline, and later connect to the publisher to synchronize data, Merge Replication is the most suitable method.
- When the subscriber requires a different partition of data.
- When there are other SQL Server versions. With Merge Replication you can replicate data between SQL Server 2008, 2005, and 2000.

Head of the Class…

Difference between Merge and Peer-to-Peer Replication

Many users confuse the differences between, and the uses of, Merge and Peer-to-Peer Replication, because both replication types allow users to update and insert data at any Subscriber or at the Publisher. In addition, both features support conflict resolution. However, Peer-to-Peer Replication does not support filtering, which Merge Replication does support. When you are considering Peer-to-Peer Replication, do not configure it with more than 10 nodes, because performance will degrade.

After you configure Merge Replication, new objects are created in the dbo schema:

- Insert, update, and delete triggers are added to published tables to track changes. The triggers are named in the form MSmerge_ins_<GUID>, MSmerge_upd_<GUID>, and MSmerge_del_<GUID>. The GUID value is taken from the entry for the article in the system table sysmergearticles.
- Stored procedures are created to handle inserts, updates, and deletes to published tables, and to perform a number of other replication-related operations.
- Views are created to manage inserts, updates, deletes, and filtering.
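The net-change behavior described above for Merge Replication can be illustrated with a small model. The Python sketch below is only a conceptual illustration and not how SQL Server implements change tracking: `Change` and `net_changes` are hypothetical names, and the sketch assumes that each change carries the full row image and that rows are identified by an integer primary key.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Change:
    """One tracked change to a row (hypothetical model, not a SQL Server structure)."""
    key: int                       # primary key of the changed row
    op: str                        # "insert", "update", or "delete"
    values: Optional[dict] = None  # full row image after the change (None for delete)


def net_changes(changes: List[Change]) -> Dict[int, Change]:
    """Collapse an ordered list of changes into at most one net change per row."""
    net: Dict[int, Change] = {}
    for change in changes:
        previous = net.get(change.key)
        if previous is None:
            # First change seen for this row since the last synchronization.
            net[change.key] = change
        elif previous.op == "insert" and change.op == "delete":
            # Row was created and removed between syncs: nothing to send.
            del net[change.key]
        elif previous.op == "insert" and change.op == "update":
            # Still a new row for the other side; send it with the latest values.
            net[change.key] = Change(change.key, "insert", change.values)
        elif previous.op == "delete" and change.op == "insert":
            # Row existed before, was deleted and re-created: net effect is an update.
            net[change.key] = Change(change.key, "update", change.values)
        else:
            # update then update keeps the last update; update then delete keeps the delete.
            net[change.key] = change
    return net


# Three updates to the same row collapse into a single net update.
pending = [
    Change(1, "update", {"qty": 5}),
    Change(1, "update", {"qty": 7}),
    Change(1, "update", {"qty": 9}),
]
to_send = net_changes(pending)
```

In this model the three pending updates to row 1 produce one entry in `to_send`, which mirrors the statement that the row changes only once at the Publisher.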
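The peer-to-peer delivery rule described earlier in this section (a transaction is replicated to every other node but not back to its originator) can also be sketched in a few lines. This is a simplified in-memory model under stated assumptions, not SQL Server's implementation: the class `PeerTopology` and its methods `commit` and `remove_node` are hypothetical names, delivery is treated as immediate, and conflicts are ignored.

```python
from typing import Dict, List, Tuple


class PeerTopology:
    """Toy model of a peer-to-peer topology (hypothetical, for illustration only)."""

    def __init__(self, node_names: List[str]) -> None:
        # Transactions applied at each node, in order.
        self.applied: Dict[str, List[str]] = {name: [] for name in node_names}
        # Record of replication deliveries as (source, target, transaction).
        self.deliveries: List[Tuple[str, str, str]] = []

    def commit(self, origin: str, txn: str) -> None:
        """Apply a transaction at its originating node and replicate it to all others."""
        if origin not in self.applied:
            raise KeyError(f"unknown node: {origin}")
        self.applied[origin].append(txn)
        for target in self.applied:
            if target == origin:
                continue  # never replicate a transaction back to its originator
            self.applied[target].append(txn)
            self.deliveries.append((origin, target, txn))

    def remove_node(self, name: str) -> None:
        """Take a node out of the topology; the remaining nodes keep their data."""
        del self.applied[name]


topology = PeerTopology(["NodeA", "NodeB", "NodeC"])
topology.commit("NodeA", "txn-1")
topology.remove_node("NodeC")
topology.commit("NodeB", "txn-2")
```

After these calls every remaining node holds both transactions, and no delivery record has the same node as source and target, which is the property the text describes.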