5 Replies Latest reply: May 2, 2008 8:12 AM by Fredrik Johansson

Data Gravitation

Fredrik Johansson Newbie

Hi.
We have been experiencing an issue in our system when enabling buddy replication. The issue manifests itself in such a way that replication seems to be completely missing. We can turn the issue on and off by enabling/disabling buddy replication, so I have focused on isolating the problem in a stand-alone test.

In my test scenario I can now reproduce something I think is a bug. I use two caches with buddy replication enabled and force data gravitation between them. I then fail the secondary cache and check whether the data is recoverable on the primary cache. This works like a charm. However, when I do this a second time around, i.e. start up a new secondary cache and repeat the steps, the same scenario fails.

The second time, the objects gravitated from the primary cache are not removed as they were the first time around, and when we inspect the primary cache for recovered data after failing the second secondary cache, we get the wrong data. The primary cache has the correct data under its _buddy_backup node, but since it prefers its own data, it reads the 'wrong' version.
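To make the symptom above easier to follow, here is a small toy model in plain Java. It is not JBoss Cache code and not the test from the link; ToyCache, BuddyScenario, the key "/session/1" and the values "v1"/"v2" are all made up for illustration. The cleanup flag only simulates whether the gravitated object is removed from the old owner. It says nothing about why the removal is skipped the second time.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for one cache instance. Hypothetical; NOT the JBoss Cache API. */
final class ToyCache {
    final String name;
    // Data this instance currently owns.
    final Map<String, String> own = new HashMap<>();
    // Backup copies held for buddies, keyed by owner name (models the _buddy_backup subtree).
    final Map<String, Map<String, String>> buddyBackup = new HashMap<>();

    ToyCache(String name) {
        this.name = name;
    }

    /** Write locally and copy the value into the buddy's backup region for this owner. */
    void put(String key, String value, ToyCache buddy) {
        own.put(key, value);
        buddy.buddyBackup.computeIfAbsent(name, k -> new HashMap<>()).put(key, value);
    }

    /** Pull a key from its current owner. cleanup simulates removal of the old copies. */
    void gravitateFrom(ToyCache owner, String key, boolean cleanup) {
        String value = owner.own.get(key);
        if (value == null) {
            return;
        }
        // This instance becomes the owner; the old owner now holds the backup.
        put(key, value, owner);
        if (cleanup) {
            owner.own.remove(key);
            Map<String, String> oldBackup = buddyBackup.get(owner.name);
            if (oldBackup != null) {
                oldBackup.remove(key);
            }
        }
    }

    /** Read after a buddy failure: own data wins, backup regions are only a fallback. */
    String readPreferringOwn(String key) {
        String value = own.get(key);
        if (value != null) {
            return value;
        }
        for (Map<String, String> backup : buddyBackup.values()) {
            String recovered = backup.get(key);
            if (recovered != null) {
                return recovered;
            }
        }
        return null;
    }
}

final class BuddyScenario {
    /** Returns what the primary reads after the secondary gravitates, updates, and fails. */
    static String run(boolean cleanup) {
        ToyCache primary = new ToyCache("primary");
        ToyCache secondary = new ToyCache("secondary");
        primary.put("/session/1", "v1", secondary);
        secondary.gravitateFrom(primary, "/session/1", cleanup);
        secondary.put("/session/1", "v2", primary);
        // The secondary "fails" here: only the primary is left to answer reads.
        return primary.readPreferringOwn("/session/1");
    }

    public static void main(String[] args) {
        System.out.println("with cleanup:    " + BuddyScenario.run(true));
        System.out.println("without cleanup: " + BuddyScenario.run(false));
    }
}
```

In this model run(true) returns "v2", because the primary has no copy of its own and falls back to the backup region. run(false) returns "v1": the newer value is sitting in the backup region, but the stale local copy shadows it, which matches what I see the second time around.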

This is pretty complex to explain, so I have posted the complete test code here: http://www.cubeia.com/misc/buddyrep/

I've tried to add comments to the code to be as explanatory as possible. The test was written for 2.1.0.GA.

I do understand data affinity and what the concept implies. However, I believe that the test does not break or abuse data affinity, but rather tests fail-over scenarios when using buddy replication. Finally, why should the behavior change because the cache had a member in the past?

I hope you can run the test and try it out. Just ask away if the code does not make sense to you =)