8 Replies. Latest reply: Jan 28, 2012 5:41 PM by Manik Surtani

Write Skew issue (versioning)

Pedro Ruivo Newbie

Hi,

 

I think I have spotted a problem with the write skew check implementation based on versioning.

 

I've made this test to confirm:

 

I have a global counter that is incremented concurrently by two different nodes running ISPN with Repeatable Read isolation and the write skew check enabled. I expected each successful transaction to commit a different value.

 

In detail, each node does the following:

 

beginTx

Integer count = cache.get("counter");

count = count + 1;

cache.put("counter", count);

commitTx
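
A minimal, self-contained sketch of how such a test can detect repeated values. It does not use Infinispan: an `AtomicInteger` stands in for the cache entry, threads stand in for nodes, and `compareAndSet` stands in for the write skew check (it compares values, not versions). The names `CounterSkewTest` and `run` are hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the test described above; NOT Infinispan code.
class CounterSkewTest {

    // Starts `nodes` threads, each committing `txPerNode` increments.
    // Returns how many committed values were seen more than once.
    static int run(int nodes, int txPerNode, boolean skewCheck) {
        AtomicInteger counter = new AtomicInteger(0);
        Set<Integer> committed = ConcurrentHashMap.newKeySet();
        AtomicInteger duplicates = new AtomicInteger(0);

        Thread[] workers = new Thread[nodes];
        for (int n = 0; n < nodes; n++) {
            workers[n] = new Thread(() -> {
                int done = 0;
                while (done < txPerNode) {
                    int read = counter.get();   // cache.get("counter")
                    int next = read + 1;        // count = count + 1
                    boolean ok;
                    if (skewCheck) {
                        // Commit only if the entry is unchanged since the read.
                        ok = counter.compareAndSet(read, next);
                    } else {
                        // Blind write: a concurrent update may be overwritten.
                        counter.set(next);
                        ok = true;
                    }
                    if (ok) {
                        // Set.add returns false if the value was already committed.
                        if (!committed.add(next)) {
                            duplicates.incrementAndGet();
                        }
                        done++;
                    }
                    // On failure the "transaction" is simply retried.
                }
            });
            workers[n].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return duplicates.get();
    }
}
```

With `skewCheck` set to `true`, every committed value is unique. With `false`, repeated values may appear, depending on thread timing.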

 

To rule out errors, I ran this test on two ISPN versions: 5.1.0.CR4 and 5.0.1.Final. On 5.0.1.Final it works as expected. On 5.1.0.CR4, however, I get many repeated values. After a first look at the code, my impression is that the version numbers of the keys on which the write skew check should run are not sent with the prepare command.

 

The ISPN config file can be found here: http://pastebin.com/UCxGXw3K

 

Cheers,

Pedro Ruivo

  • 1. Re: Write Skew issue (versioning)
    Manik Surtani Master

    Hi Pedro. 

     

    I don't understand how this could have worked in 5.0.x, since write skew checks in a cluster were not supported until 5.1.

     

    Are you testing local mode?

     

    Cheers

    Manik

  • 2. Re: Write Skew issue (versioning)
    Mircea Markus Master

    One way or the other, there shouldn't be duplicate counter values, right?

  • 3. Re: Write Skew issue (versioning)
    Pedro Ruivo Newbie

    Hi,

     

    I'm testing in replicated mode (full replication).

     

    In 5.0.x it works because of the locking scheme. In more detail, two cases can happen (each given as a list of events):

     

    1) write skew is detected:

     

    localTx reads "counter" and gets the value x

    remote prepare (remoteTx) is received

    remoteTx acquires lock on "counter"

    localTx tries to acquire lock on "counter"

    remoteTx updates "counter" to x+1

    remoteTx releases the lock

    localTx acquires the lock

    localTx detects that "counter"'s value is x+1 and aborts (see [1])
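
    The event list for case 1 can be replayed in a small single-threaded model. This is only a sketch: `LockedEntry` and `LockSkewDemo` are hypothetical names, the lock is a plain owner field rather than Infinispan's lock manager, and the check uses `equals` and `IllegalStateException` instead of the reference comparison and `CacheException` quoted in [1].

```java
// Hypothetical single-JVM model of the events in case 1; not Infinispan code.
class LockedEntry {
    private Integer value;
    private String owner; // transaction currently holding the lock, or null

    LockedEntry(Integer initial) {
        this.value = initial;
    }

    Integer read() {
        return value;
    }

    // Non-blocking acquire: returns false if another transaction holds the lock.
    boolean tryLock(String tx) {
        if (owner != null && !owner.equals(tx)) {
            return false;
        }
        owner = tx;
        return true;
    }

    void unlock(String tx) {
        if (tx.equals(owner)) {
            owner = null;
        }
    }

    // Write under the lock; fails if the value changed since `readValue` was read.
    void write(String tx, Integer readValue, Integer newValue) {
        if (!tx.equals(owner)) {
            throw new IllegalStateException(tx + " does not hold the lock");
        }
        if (!value.equals(readValue)) {
            throw new IllegalStateException("Detected write skew");
        }
        value = newValue;
    }
}

class LockSkewDemo {
    // Replays the events in order; returns true if localTx was aborted.
    static boolean replay() {
        LockedEntry counter = new LockedEntry(0);
        Integer localRead = counter.read();       // localTx reads x
        Integer remoteRead = 0;                   // remoteTx read x on its own node
        counter.tryLock("remoteTx");              // remoteTx acquires the lock
        if (counter.tryLock("localTx")) {         // localTx tries and has to wait
            throw new IllegalStateException("lock should be held by remoteTx");
        }
        counter.write("remoteTx", remoteRead, remoteRead + 1); // x -> x+1
        counter.unlock("remoteTx");               // remoteTx releases the lock
        counter.tryLock("localTx");               // localTx acquires the lock
        try {
            counter.write("localTx", localRead, localRead + 1);
            return false;                         // would mean a duplicate value
        } catch (IllegalStateException e) {
            return true;                          // localTx sees x+1 and aborts
        } finally {
            counter.unlock("localTx");
        }
    }
}
```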

     

    2) deadlock/timeout while acquiring the locks:

     

    localTx reads "counter" and gets the value x

    localTx acquires the lock on "counter"

    remote prepare (remoteTx) is received

    remoteTx tries to acquire lock on "counter"

     

    deadlock is detected (or a timeout is triggered)

     

    For 5.1.x, I was expecting behavior like this:

     

    localTx reads "counter" and gets the value x (version y)

    remote prepare (remoteTx) is received and updates the "counter" to x+1 (version y+1)

    localTx sends the prepare command and the coordinator performs the write skew check

     

    The coordinator detects that the read version (y) is different from the actual version (y+1) and aborts the transaction
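
    A minimal sketch of the version-based check I was expecting. It is a model under stated assumptions, not Infinispan's implementation: `VersionedEntry`, `VersionSkewDemo`, and `prepare` are hypothetical names, and prepare and commit are collapsed into one step. The `readVersion` argument models the version being sent along with the prepare command.

```java
// Hypothetical model of a version-based write skew check; not Infinispan code.
class VersionedEntry {
    private int value;
    private long version;

    VersionedEntry(int value, long version) {
        this.value = value;
        this.version = version;
    }

    synchronized int value() {
        return value;
    }

    synchronized long version() {
        return version;
    }

    // Applies the write only if the version the transaction read is still
    // the current one. Returns false to signal that the transaction aborts.
    synchronized boolean prepare(long readVersion, int newValue) {
        if (readVersion != version) {
            return false; // write skew: another transaction committed in between
        }
        value = newValue;
        version++;
        return true;
    }
}

class VersionSkewDemo {
    // Returns { remoteTx committed?, localTx committed? }.
    static boolean[] replay() {
        VersionedEntry counter = new VersionedEntry(10, 1); // value x, version y

        // localTx reads "counter": value x, version y.
        int localValue = counter.value();
        long localVersion = counter.version();

        // remoteTx read the same snapshot and prepares first: x+1, version y+1.
        boolean remoteCommitted = counter.prepare(localVersion, localValue + 1);

        // localTx prepares with the stale read version y and is rejected.
        boolean localCommitted = counter.prepare(localVersion, localValue + 1);

        return new boolean[] { remoteCommitted, localCommitted };
    }
}
```

    If the read version were not available at prepare time, the comparison in `prepare` could not be made, and both transactions would commit x+1.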

     

    This is my "definition" of write skew.

     

    Cheers,

    Pedro

     

     

    [1] in RepeatableReadEntry:

    if (actualValue != null && actualValue != value) {

      log.unableToCopyEntryForUpdate(getKey());

      throw new CacheException("Detected write skew");

    }

  • 4. Re: Write Skew issue (versioning)
    Manik Surtani Master

    No, in 5.0.x you may still get dupes.

  • 5. Re: Write Skew issue (versioning)
    Manik Surtani Master

    BTW, is this unit test in a form that can be added to the Infinispan codebase?  If you could fork the project and create a pull request with a commit containing the test, that would be great.

  • 6. Re: Write Skew issue (versioning)
    Pedro Ruivo Newbie

    No. The code was implemented in a modified version of RadarGun... However, I can try to implement it as a unit test this weekend if you are interested.

     

    How hard is it to implement a unit test?

  • 7. Re: Write Skew issue (versioning)
    Pedro Ruivo Newbie

    I have made a pull request with the test case. This is the first time I have created a test case and a pull request. If anything is wrong, please let me know.

     

    Cheers,

    Pedro

  • 8. Re: Write Skew issue (versioning)
    Manik Surtani Master

    Thanks for the test case.  I've incorporated this into Infinispan's test suite.  The bug is documented here and fixed here.