What Cannot Be Published - Microsoft SQL Server Black Book

A few things cannot be published with Microsoft SQL Server. With version 6.5 the list has shrunk a bit, but first let’s talk about version 6.0. You cannot publish your system catalog or the model, master, Msdb, or Tempdb. You also cannot publish any data from a table that does not have a defined primary key. In addition, three data types cannot be published: text, image, and time stamp. In most scenarios, these restrictions should cause only a small inconvenience and can be overcome through some creative programming.

Version 6.5 allows you to replicate text and image data between your servers. However, your text and image statements must be a logged operation. I do not

recommend replicating binary large object blocks (blobs). Because of their size, the Scheduled Table Refresh method is preferable. Scheduled Table Refresh is another option for moving data to remote servers. This method involves a block move of the entire table on a schedule between servers. It is time consuming and is usually incorporated into systems that make nightly updates during off-peak hours.

Keep in mind that you can replicate data between version 6.0 and version 6.5 databases as long as the publisher and the distribution server are the same version.

ODBC

You can replicate to ODBC databases only with Microsoft SQL Server version 6.5. To do this, you must first create a system data source through the ODBC Administrator, then create the subscriptions using the push or the publishing server controlled method. You will have no administrative control from the subscribers of replicated data through an ODBC connection. All administration must happen from Microsoft SQL Server.

Be careful of quoted identifiers with ODBC subscriptions. Some query tools do not like tables created with quoted names. Check your query tool to make sure it supports quoted identifiers before you end up chasing your tail and find you cannot view replicated data on your subscriber.

Previous Table of Contents Next

Brief Full Advanced

Search Search Tips

To access the contents, click the chapter and section titles. Microsoft SQL Server Black Book

(Publisher: The Coriolis Group)

Author(s): Patrick Dalton ISBN: 1576101495 Publication Date: 07/01/97

Search this book:

Previous Table of Contents Next

Central Publisher

The Central Publisher is the most basic scenario to implement. In Figure 4.3, notice that the publisher and distribution processes are located on the same server. This is adequate for enterprises that can put up a production server with a lot of horsepower and are not tasking the production machine heavily. For most of my consultations that concern replication, I recommend a separate distribution server—if not from the beginning, then as soon as budgets permit—to help balance the load across many servers.

FIGURE 4.3 The Central Publisher scenario.

Servers A, B, and C in the illustration are all subscribing to the data that

resides on the publisher. Each subscriber can be connected over different types of LAN/WAN connections at various speeds without slowing down the entire replication process. This is where Microsoft SQL Server shines in data

distribution. If the connection between Server A and the publisher goes down due to a hardware failure, the other servers can continue to receive up-to-date information without waiting for the communication failure to be resolved. When your link between Server A and the publisher is reestablished, the distribution process will make sure all modifications are applied to Server A until it reaches the same state of synchronization as Servers B and C.

http://www.itknowledge.com/reference/standard/1576101495/ch04/113-118.html (1 of 4) [1/27/2000 6:15:53 PM]

Go!

Keyword

---

Some of the possible real-life scenarios this model supports are executive management systems, report generation, ad hoc query databases, and

distribution of price lists from a main office. One of the best features of this scenario is that, unlike a two-phased commit, it will continue to work during a communication failure.

Central Publisher With Remote Distribution

I frequently recommend this model for companies that want to test replication to see if they would benefit from it. This scenario places a modular, balanced load on your system, which will make it easier to manage and configure. This configuration of servers is the natural progression from the Central Publisher. Although similar to the Central Publisher model, the distribution process is off-loaded to another server, as shown in Figure 4.4. This distribution server can be far smaller than your production server. You can configure it with 32MB of RAM, and it will still keep up with replication tasks quite nicely.

Figure 4.4 The Central Publisher With Remote Distribution scenario.

One important point to note is that many different publishers can use a single distribution server. In addition, many distribution servers can be used if your distribution load gets too high. This flexibility allows you to place as many subscribers online as you wish without an putting an extra load on your production server.

Publishing Subscriber

The next step up the ladder of complexity is the Publishing Subscriber scenario. This configuration can be used to push data across a slow link to a remote system, then re-publish and distribute the data to many subscribers without using all your available bandwidth for replication. See Figure 4.5 for an example.

Figure 4.5 The Publishing Subscriber scenario.

This scenario has all the benefits of other replication scenarios, yet it can support a slow communication link much better. Say, for instance, your

company wants to open an office and several stores across the country from its

current location. You can establish a 56K link between your two offices and replicate data across that link, then distribute the price lists or ad hoc query data to the stores as they open throughout the remote state. This will reduce network traffic and communication costs.

Central Subscriber

Now let me confuse you for a minute with a rather complex model. The Central Subscriber scenario is used commonly when a company has autonomous business divisions that must maintain local contact lists or inventory and report this information to a central office. This scenario is achieved on Microsoft SQL Server through the use of a location-specific key in the main office table that stores the “rolled-up” data from all the divisions. These regional keys separate the data and segment it in a single table so that effective queries can be written to analyze amounts or trends in one location. This scenario, illustrated in Figure 4.6, requires manual synchronization at the beginning of the process. The key is not to get into a replication loop at the main office. A replication loop occurs if you make changes to the data at the main office and replicate the data back out to the satellite offices. The insertion of replicated data at the satellite offices would trigger the replication process back up to the main office and would continue to create more replication tasks until your server was brought to its knees.

Figure 4.6 The Central Subscriber scenario.

Modify the data only at the source. If the main office needs to make a change to a piece of data, it should do so through remote procedure calls to the satellite offices, then allow the data to be replicated back up to the central office. This is a bit more difficult to program, but it allows you to modify your data.

Figure 4.7 shows how this data segmentation can be accomplished.

Administrators at each of the remote servers can modify their data all they want without affecting each other’s systems. The changes are replicated to the central subscriber and placed in a table that contains a primary key consisting of a region code and a part ID. The primary key in this example is a

combination of region and ID so that there will be no redundant data in the table. You could then write some queries or stored procedures that run periodically to check inventory levels and alert someone if certain thresholds are exceeded or if a business rule is violated.

Figure 4.7 Spreadsheet example of a replicated table.

Previous Table of Contents Next

Brief Full Advanced

Search Search Tips

To access the contents, click the chapter and section titles. Microsoft SQL Server Black Book

(Publisher: The Coriolis Group)

Author(s): Patrick Dalton ISBN: 1576101495 Publication Date: 07/01/97

Search this book:

Previous Table of Contents Next

Multiple Publishers Of A Single Table

This scenario is the one I see most often discussed among Internet news groups. For some reason, everyone wants to publish the same thing over and over again. The only way to share data between servers is to provide a primary key that is location-specific, set up replication tasks that horizontally partition the data by region, and publish only the data they control.

See Figure 4.8 for the server configuration and how it differs from the other models we’ve discussed. This scenario is best used to show data across a decentralized company, multiple warehouse inventory systems, airline

reservation systems, or regional order-processing systems. The only drawback is the amount of synchronization needed, since all sites contain both local and remote data.

Figure 4.8 The Multiple Publishers Of A Single Table scenario.

Publishing a table multiple times can be accomplished with some of the same techniques just covered in the Central Subscriber scenario—with one small twist. In Figure 4.9, notice that only the data each server has control of is published. Each of the two other servers would subscribe to the other two servers’ published data to make their lists complete. With this configuration,

http://www.itknowledge.com/reference/standard/1576101495/ch04/118-122.html (1 of 4) [1/27/2000 6:15:54 PM]

Go!

Keyword

---

you still should not modify the data of remote systems locally. Use remote stored procedures to make the changes at the source of the data and allow the changes to be replicated normally.

Figure 4.9 Single Table Publication spreadsheet.

Events And Processes

Now that we have covered the terminology and basic models, let’s go behind the scenes. As mentioned, replication is transaction log–based. The major components of replication are the published databases themselves, the

administration of the transaction log for any replicated database, the log reader task, and the synchronization process. A distribution database and the actual distribution process are also needed.

Log Reader Process

When a database is marked for replication, the transaction log is flagged so the log reader process knows to replicate these changes. Once you have completed a transaction in a replicated database, the actual transaction log entries are not removed until the log reader process moves them to the distribution database. This causes your transaction log to fill a bit faster than a nonreplicated

database with the same number of transactions. Once the transactions are moved to the distribution database, the entries in the publishing database transaction log can be flushed through the normal checkpoint process in Microsoft SQL Server.

The log reader process does not run continuously; it is an interval-based

process that is added to your server’s scheduled task list. Keep in mind that the SQL Executive service must be running for replication to work properly.

Synchronization Process

Synchronization can be done manually or through the Enterprise Manager. To perform manual synchronization, you would create an exact copy of the schema on the subscriber database and load the table or tables with exactly the same data as the publisher. If you select manual synchronization, Microsoft SQL Server assumes that both publisher and subscriber are identical. If you miss any records or some structural element, the replication process will generate errors.

If you choose automatic synchronization, Microsoft SQL Server will generate scripts that will create the structures for you on your subscribers and a BCP task to load the existing data into the new structure once it has been

successfully created. Upon completion of the BCP into the new structure, the distribution database is notified that the new subscriber is ready for any new transactions to be applied. Remember that transactions are applied on the

subscriber in the same order as they were applied on the publisher to maintain the integrity of the data.

Replication Distribution Process

As mentioned, replication distribution is best described as the traffic-cop process in replication. The distribution process keeps a list of the transactions that are to be replicated to subscribers, which subscribers have received the transactions, and whether the transactions were applied to each table

successfully. No transactions are purged from the distribution lists until all subscribers have successfully received and applied their changes.

In a multiple-server environment that merges two or three of the scenarios described in this chapter, the distribution of replicated data can get fairly difficult to manage. Take the time to test your configuration and make sure you have published your data only as much as needed to meet the requirements of replication.

Publish your data only one time. If your process requires that a particular piece of data be used in many different locations, publish it as an article one time and subscribe to it many times. Do not create a new publication of the same data each time it is used. If you fall into this trap, you will bring your system to its knees unnecessarily.

Do not forget to have a good backup strategy for the distribution server in your replication scenario. The transaction log and database should be backed up regularly and frequently. Should this server go down, your subscribers will not receive any transactions until you can bring it back online.

Communication Failures

Failures to communicate can cause problems in any environment. I have not found a system yet that does not require solid communications in order to function well—although Microsoft SQL Server comes close with regard to replication. If your link between publisher and subscriber fails for some

reason, Microsoft SQL Server will basically cache the transactions that need to be applied to each of the subscribers until they can be contacted and

successfully applied. Even the link between publisher and distribution server can be lost for a short period of time, and once reestablished, the publisher will still have the transactions in the transaction log for distribution. Using a

distribution server with processes whose only function is to keep servers synchronized is a very fault-tolerant method of distributing data.

Should your link between publisher and distribution server be lost, be careful of the transaction logs in any database marked for replication. Keep in mind that transactions in logs marked for replication are not purged until they are successfully written to the distribution server. If the transactions have nowhere to go, they will not be purged by the checkpoint process in Microsoft SQL Server.

Note: As a rule, always back up your transaction log more frequently than

normal if it is part of a database participating in replication. Do not allow the

log to get more than 75 percent full before making a backup.

Previous Table of Contents Next

Brief Full Advanced

Search Search Tips

To access the contents, click the chapter and section titles. Microsoft SQL Server Black Book

(Publisher: The Coriolis Group)

Author(s): Patrick Dalton ISBN: 1576101495 Publication Date: 07/01/97

Search this book:

Previous Table of Contents Next

In document Microsoft SQL Server Black Book (Page 102-112)