failure situation using cluster-wide lock
yelin666 Nov 14, 2010 6:31 PM
I have a simple test for the cluster-wide lock; the code is attached. Basically, I created a timer and scheduled it to run a task every 5 seconds. When the task runs, it locks a key, gets the value, increments it, and puts it back. I tested with two instances running on my local node. Sometimes it works fine, with one instance updating the value at a time. Occasionally, however, after I killed the first instance, the second one failed to commit an update. At that point the thread was no longer associated with the transaction (the status was STATUS_NO_TRANSACTION, so a rollback wouldn't work either), so the lock could not be released and no one could acquire it anymore. I have attached the outputs from my two instances as well. For this standalone test, I was using the JBossTX transaction manager.
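For readers without the attachment, here is a minimal single-JVM sketch of the pattern described above. It is not the attached code: a `ReentrantLock` and a `ConcurrentHashMap` stand in for the cluster-wide lock and the cache, no transaction manager is involved, and the class and method names are made up for illustration.

```java
import java.util.Map;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class LockedCounterSketch {
    // Stand-in for the cache (assumption: the real test uses an Infinispan cache).
    private final Map<String, Integer> store = new ConcurrentHashMap<>();
    // Stand-in for the cluster-wide lock on the key.
    private final ReentrantLock lock = new ReentrantLock();

    /** Lock, read the value, increment it, write it back; always release the lock. */
    public int lockAndIncrement(String key) {
        lock.lock();
        try {
            int next = store.getOrDefault(key, 0) + 1;
            store.put(key, next);
            return next;
        } finally {
            // Releasing in finally is what the failing case cannot do once
            // the thread has lost its transaction.
            lock.unlock();
        }
    }

    public int get(String key) {
        return store.getOrDefault(key, 0);
    }

    public static void main(String[] args) throws InterruptedException {
        // 5 seconds by default, as in the test; pass a smaller value to run faster.
        long periodMs = args.length > 0 ? Long.parseLong(args[0]) : 5000L;
        LockedCounterSketch counter = new LockedCounterSketch();
        Timer timer = new Timer("counter-timer");
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                System.out.println("value = " + counter.lockAndIncrement("counter"));
            }
        }, 0, periodMs);
        Thread.sleep(2 * periodMs + 500); // let a few ticks run
        timer.cancel();
    }
}
```

In the real test the lock is tied to a transaction, so it is released at commit or rollback rather than by an explicit unlock; that is where the failure shows up.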
When I ran a similar test inside a ServiceMix container, it was easier to get into this situation: it happened after running two instances for a while, even without killing one of them. Inside ServiceMix, the Geronimo transaction manager is used. In case it's helpful, the exception at commit time inside ServiceMix is as follows:
com.ge.energy.ssi.core.datagrid.lock.DataGridLockExecutionException: Failed to commit locked transaction
at com.ge.energy.ssi.core.datagrid.infinispan.InfinispanDataGrid.lockAndWork(InfinispanDataGrid.java:252)
at com.ge.energy.ssi.core.datagrid.infinispan.InfinispanDataGrid.lockAndWork(InfinispanDataGrid.java:96)
at com.ge.energy.ssi.core.datagrid.example.counter.LockedCounter$CounterTask.run(LockedCounter.java:151)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)
Caused by: javax.transaction.RollbackException: Error during two phase commit
at org.apache.geronimo.transaction.manager.TransactionImpl.commitResource(TransactionImpl.java:742)
at org.apache.geronimo.transaction.manager.TransactionImpl.commit(TransactionImpl.java:304)
at org.apache.geronimo.transaction.manager.TransactionManagerImpl.commit(TransactionManagerImpl.java:250)
at com.ge.energy.ssi.core.datagrid.infinispan.InfinispanDataGrid.lockAndWork(InfinispanDataGrid.java:239)
... 4 more
Caused by: javax.transaction.xa.XAException
at org.infinispan.transaction.xa.TransactionXaAdapter.prepare(TransactionXaAdapter.java:90)
at org.infinispan.transaction.xa.TransactionXaAdapter.commit(TransactionXaAdapter.java:96)
at org.apache.geronimo.transaction.manager.TransactionImpl.commitResource(TransactionImpl.java:688)
... 7 more
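The trace above does not show the XA error code carried by the XAException. A small helper along these lines could be used to dig it out of the cause chain when logging the failure. This is only a sketch: `XaCauseInspector` and `findXaCause` are hypothetical names, and the exception chain built in `main` is simulated with plain exceptions and an arbitrary error code, not taken from the real run.

```java
import javax.transaction.xa.XAException;

public class XaCauseInspector {
    /** Walks the cause chain and returns the first XAException, or null if none. */
    public static XAException findXaCause(Throwable t) {
        // Bound the walk to guard against cyclic cause chains.
        for (int depth = 0; t != null && depth < 32; depth++, t = t.getCause()) {
            if (t instanceof XAException) {
                return (XAException) t;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Simulated chain shaped like the trace (stand-in exception types,
        // error code chosen arbitrarily for the demo).
        XAException xa = new XAException(XAException.XA_RBROLLBACK);
        Exception rollback = new Exception("Error during two phase commit", xa);
        Exception outer = new RuntimeException("Failed to commit locked transaction", rollback);

        XAException found = findXaCause(outer);
        System.out.println("XA errorCode = " + (found == null ? "none" : found.errorCode));
    }
}
```

Comparing the logged `errorCode` against the constants on `javax.transaction.xa.XAException` would show whether the resource reported a rollback or a resource manager error.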
Please suggest what went wrong here.
- instance2.out.zip (2.1 KB)
- instance1.out.zip (456 bytes)