Repair a broken Replication Sql Server 2005 vs Sql Server 2008 – PART 2

NOTE OF CAUTION: In the end we decided to turn off the continue on conflict option, because too many conflicts were ignored and the tables got too far out of sync… So the continue on conflict is not such a great option afterall…

Part 1: http://bloggingabout.net/blogs/dries/archive/2009/04/18/repair-a-broken-replication-sql-server-2005-vs-sql-server-2008.aspx

So here’s the update: The option to to continue on conflict works fine!

 

I tested it on two virtual machines that have p2p replication with a test table. Stopped Replication by stopping the SQL agents and then inserted identical rows at all servers. Nothing happened, at least no error.

So I did it again with records with the same id, but different values. That also worked, but the first processed command was replaced by the second value. So both nodes were in sync again.

I decided to go live with it and inserted the missing records from one node in one of the others. No problems at all! 

A note of caution though: Not all conflicts are handled.

On Friday I marked a table for replication that did not exist on the other nodes. This did result in a replication error! Apparently the option doesn’t work for this situation. Well, no worries. I created the table on the other nodes and then the magic option did the rest: lots of conflicts, but no errors! Don’t forget to remove the table from the articles of the publication though…

 

NOTE OF CAUTION:

In the end we decided to turn off the continue on conflict option, because too many conflicts were ignored and the tables got too far out of sync… So the continue on conflict is not such a great option afterall…

 

Repair a broken Replication Sql Server 2005 vs Sql Server 2008

BEFORE YOU READ THIS:  In the end we decided to turn off the continue on conflict option, because too many conflicts were ignored and the tables got too far out of sync… So the continue on conflict is not such a great option afterall…

This is actually a personal note that I wrote down in june 2007.

That day we broke the peer to peer replication across our three servers. But I fixed it without having to restore databases and rebuild the replication.

 

The bad news: Today we broke replication again. This time the cause was a unique index on 2 columns that was being filled from 2 nodes simultaniously. That is BAD.

The good news: I fixed it.

Last time the replication broke I noticed a list of stored procedures that are responsible for theC(R)UD operations of replication. There are insert, update and delete procedures for all tables in replication and for every subscribing node. So for 1 table there are 2 insert, 2 update and 2 delete procedures.

To fix replication you can tweak these stored procedures to return success even in case of a failure. But be careful! If the databases get even more out of sync that’s even worse!

If a delete or update fails you can skip the error message that the row does not exist on the subscribing node.

However if an insert fails, you will have to write your own check to see if the insert will fail and than skip that insert.

 

Last friday the support team had a problem with peer-to-peer replication after installing SP 3 for Sql Server 2005. Even the update for SP3 did not resolve the issue. The issue was that every 30 seconds replication claimed that the process cannot allocate memory, although it was still replicating.

So with replication in a fault state and a wish to upgrade to Sql Server 2008 in the near future the support team decided to upgrade now and hope that that would fix the issue. Well… needless to say… it didn’t.

Non of the support people know how to set up replication, so I ended up doing it. The first time it all seemed to work, tracers worked fine and arrived after about 3 seconds. But the moment I started some of the processes replication failed.  That was a set back, because now the databases were out of sync…  I decided to ignore the fact that the databases were out of sync and set up replication all over again. I could do this, because I was and still am sure that these missing records on two of the nodes aren’t going to give us problems.

With a situation like this I would have had to spend at least a day to get back in sync.

As far as this article – http://technet.microsoft.com/en-us/library/bb934199.aspx – on technet claims: Once I turn on the option to continu on conflict, I do not have to manually change all the stored procedures to continu on conflict. If that is true it will save me a lot of time! First thing on monday I will try this on a test environment before I will use this on the production server.

I will definitely let you know how this went!

The Visual Studio performance and coverage logging engine is already running on the computer

Something really annoying happens in Visual Studio when I debug my tests. No tests run, because of a test run error:

Code coverage collection error: The Visual Studio performance and coverage logging engine is already running on the computer. Therefore, the test run cannot continue. Close the pending performance or code coverage session and then rerun the tests.

I open up the Task Manager and there is VSPerfMon.exe which is causing the problem. End the process and everything is fine again. But why won’t it shut down?

Well the reason for that  is that I stop debugging my tests while they are still executing. Let them run and fail and there is no problem.