JBossCacheDistributedLockManagerDesign

Distributed Lock Manager

 

Background

 

JBossCache locking is not distributed, but consists of a series of local locks.  When locks are obtained on a cache to write some state, the lock is not cluster-wide, but local to the instance.  When a transaction commits, the cache then attempts to acquire the corresponding locks on all remote instances in order to apply the changes.

 

The purpose behind this approach is scalability and performance, since acquiring locks on all instances in a cluster and holding them for the entire duration of a transaction leads to poor concurrency, as well as poor scalability.

 

On the other hand, using local locks opens the possibility of concurrent writes to the same state: a transaction fails at commit time if the remote locks cannot be obtained.  Worse, these failures are detected only at commit time, not earlier when the write takes place.
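A toy sketch of the problem described above.  The class and method names here are invented for illustration and are not the JBossCache API; the point is only that each write touches a purely local lock table, so two writers conflict only when one of them tries to lock the others at commit:

```java
import java.util.*;
import java.util.concurrent.*;

// Hypothetical stand-in for a cache instance with purely local locks.
class Instance {
    private final Map<String, String> localLocks = new ConcurrentHashMap<>();

    // A write acquires only the LOCAL lock, so concurrent writers on
    // different instances do not see each other at this point.
    boolean write(String fqn, String tx) {
        return localLocks.putIfAbsent(fqn, tx) == null;
    }

    // Only at commit time does the writer try to lock the remote
    // instances -- and only here can the conflict surface.
    boolean commit(String fqn, String tx, List<Instance> remotes) {
        for (Instance r : remotes) {
            if (!r.write(fqn, tx)) return false;  // detected only now
        }
        return true;
    }
}
```

Two transactions can thus both succeed locally and only one of them discovers the conflict, late, when it commits.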

 

With recent developments such as MVCC and partitioning, these drawbacks are mitigated since (1) reads are lock-free with MVCC and (2) state is owned by a limited subset of the cluster thanks to partitioning, reducing the impact of distributed locks on scalability.

 

Another recent refactoring, the creation of a LockManager in JBossCache 2.2.0, makes it easy to implement alternate lock managers.

 

Cooperative Distributed Lock Manager

 

This approach uses a LockCommand (implementing ReplicableCommand) that is broadcast to the entire cluster (or just the partition or buddy group).  The CooperativeDistributedLockManager performs this broadcast synchronously and waits for all responses.  The LockCommand's perform() method obtains a local lock on each receiving instance.  Once all responses have been received, the CooperativeDistributedLockManager returns from its lock method.

 

Similarly, unlock methods on CooperativeDistributedLockManager broadcast an UnlockCommand.
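The lock/unlock flow above can be sketched as follows.  This is a simplified, self-contained model, not the actual JBossCache 2.2.0 LockManager or ReplicableCommand interfaces: the "broadcast" is a plain synchronous loop over in-memory nodes, and the lock table is a map keyed by a string Fqn.

```java
import java.util.*;
import java.util.concurrent.*;

// Simplified stand-in for JBossCache's ReplicableCommand (assumption).
interface ReplicableCommand { boolean perform(Node node); }

// perform() obtains a local lock on the receiving instance.
class LockCommand implements ReplicableCommand {
    final String fqn, owner;
    LockCommand(String fqn, String owner) { this.fqn = fqn; this.owner = owner; }
    public boolean perform(Node node) { return node.localLock(fqn, owner); }
}

// perform() releases the local lock on the receiving instance.
class UnlockCommand implements ReplicableCommand {
    final String fqn, owner;
    UnlockCommand(String fqn, String owner) { this.fqn = fqn; this.owner = owner; }
    public boolean perform(Node node) { node.localUnlock(fqn, owner); return true; }
}

// Each cluster node keeps purely local locks, keyed by Fqn.
class Node {
    private final Map<String, String> locks = new ConcurrentHashMap<>();
    boolean localLock(String fqn, String owner) {
        String prev = locks.putIfAbsent(fqn, owner);
        return prev == null || prev.equals(owner);  // reentrant for same owner
    }
    void localUnlock(String fqn, String owner) { locks.remove(fqn, owner); }
}

// Broadcasts the command to every node synchronously and waits for all
// responses; lock() returns only once every node has granted the lock.
class CooperativeDistributedLockManager {
    private final List<Node> cluster;
    CooperativeDistributedLockManager(List<Node> cluster) { this.cluster = cluster; }

    boolean lock(String fqn, String owner) {
        for (Node n : cluster) {
            if (!new LockCommand(fqn, owner).perform(n)) {
                unlock(fqn, owner);  // release any partially acquired locks
                return false;
            }
        }
        return true;
    }

    void unlock(String fqn, String owner) {
        for (Node n : cluster) new UnlockCommand(fqn, owner).perform(n);
    }
}
```

In a real implementation the loop would be a synchronous RPC broadcast rather than direct method calls, but the contract is the same: the lock method does not return until all members have responded.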

 

One drawback to this approach is that on rollback, UnlockCommands still need to be broadcast.