Versioning Design Document

Discussion April 2011

I met with Max last week to discuss the need for versioning. Here are some notes I took:

  • First of all, Max was worried that in the current 2LC setup, concurrent sessions in a node could result in operations reaching another node in the cluster in a different order. For example, node A performs an update and then a delete, but node B receives the delete and then the update. In the current default configuration, this is not possible since entity/collection caches are configured with synchronous invalidation.
  • Reordering could only happen if asynchronous mode were used, and in particular asynchronous marshalling. The potential reordering problem is explained in the asynchronous options wiki.
  • However, if asynchronous modes were possible, the performance of the 2LC would increase vastly, but consistency would clearly need to be achieved in some other way, because with asynchronous communication entity/collection updates could get lost and 2LC instances could end up in an inconsistent state.
  • So, in such scenarios, instead of relying on cluster-wide synchronization to keep caches consistent, Max suggested relying on the database version information. As a first step, this could be tested with asynchronous entity/collection caching to see if it would work. Max indicated that to support this, some kind of tombstones might be needed to make sure updates don't override deletes.
  • From a Hibernate perspective, any type can be versioned, and the support for it comes via VersionType. This comes along with a Comparator that allows two instances of type T to be compared. So, Infinispan should be able to delegate version comparison to a Comparator instance. In the 2LC use case, Infinispan does not need to generate any new versions because the versions are provided along with the data (it's a cache, remember!).
  • An evolution of this would be for the Infinispan 2LC to support other concurrency strategies such as NONSTRICT_READ_WRITE and READ_WRITE. Currently, Infinispan only supports READ_ONLY and TRANSACTIONAL. This could result in big performance improvements on the 2LC.
  • Something that needs to be investigated is whether merges are being dealt with in the 2LC case. I don't think they are, so this should be implemented to make sure that the 2LC is cleared on merge.
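
To make the version and tombstone idea more concrete, here is a minimal, self-contained Java sketch. It is not Infinispan or Hibernate code: the class VersionedCacheSketch and its put/remove/get methods are hypothetical names made up for illustration. It assumes that each update carries the new database version of the row, that a delete carries the version of the row being deleted, and that the Comparator is the one Hibernate would provide for the version type. Tombstones never expire here, which a real implementation would have to address.

```java
import java.util.Comparator;
import java.util.Objects;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch of a version-aware cache; not Infinispan or Hibernate API. */
final class VersionedCacheSketch<K, V, T> {

    /** A cached value plus its database version; a null value marks a tombstone. */
    private static final class Entry<V, T> {
        final V value;
        final T version;

        Entry(V value, T version) {
            this.value = value;
            this.version = version;
        }

        boolean isTombstone() {
            return value == null;
        }
    }

    private final ConcurrentMap<K, Entry<V, T>> store = new ConcurrentHashMap<>();
    private final Comparator<T> versionComparator;

    VersionedCacheSketch(Comparator<T> versionComparator) {
        this.versionComparator = Objects.requireNonNull(versionComparator);
    }

    /** Stores the value unless a newer version, or a tombstone of the same version, is present. */
    boolean put(K key, V value, T version) {
        return apply(key, new Entry<>(Objects.requireNonNull(value), version));
    }

    /** Records a tombstone unless a strictly newer version is already present. */
    boolean remove(K key, T version) {
        return apply(key, new Entry<V, T>(null, version));
    }

    /** Returns the cached value, or empty if the key is absent or deleted. */
    Optional<V> get(K key) {
        Entry<V, T> entry = store.get(key);
        return entry == null ? Optional.empty() : Optional.ofNullable(entry.value);
    }

    /** Atomically keeps whichever entry wins the version comparison; true if incoming won. */
    private boolean apply(K key, Entry<V, T> incoming) {
        Entry<V, T> result = store.merge(key, incoming, (current, next) -> {
            int cmp = versionComparator.compare(next.version, current.version);
            // Strictly newer versions win; on a tie, a delete beats an update.
            boolean wins = cmp > 0 || (cmp == 0 && next.isTombstone());
            return wins ? next : current;
        });
        return result == incoming;
    }
}
```

With integer versions and Comparator.naturalOrder(), the reordering case from the notes plays out as follows: node B holds the entity at version 1, receives the delete for version 2 first and records a tombstone, and then discards the late update to version 2 because it is not newer than the tombstone. The tie-breaking rule (a delete beats an update of the same version) is a design choice of this sketch, not something agreed in the discussion.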

 

So, action items here include:

  • Implement MergeView handling to deal with merges, so that the 2LC is cleared on merge.
  • Implement version handling in either Infinispan or 2LC code to keep entity/collection caches consistent in asynchronous mode.
  • Once versioning is in place, implement other cache concurrency strategies such as NONSTRICT_READ_WRITE and READ_WRITE.