I wrote a simple test program based on https://docs.jboss.org/author/display/ISPN/Using+Infinispan+as+a+JCache+provider#UsingInfinispanasaJCacheprovider-ClusteringJCacheinstances.
I changed the cache type to an invalidation cache, but it is not clear to me how it should work with the JCache API. I cannot find a way to have the same data available in both cache instances at the same time. Here is what I tried:
- I inserted a key-value pair into cache1
- I checked the key in cache1 and it was there (good behaviour)
- I checked the key in cache2 and it was not there (good behaviour)
- I inserted the same key-value pair into cache2
- I checked the key in cache1 and it was not there (???)
The JCache API has several methods to manipulate data: put, putIfAbsent, replace, remove, ...
I have the feeling that, with an invalidation cache, calling replace or remove should delete the key from the other caches, whereas calling put should not.
Could you please let me know what the right behaviour should be based on the javax.cache.Cache interface in case of invalidation cache?
The way Infinispan is working is correct. Invalidation means that when a node receives a put/replace/putIfAbsent/remove call, it sends a message to the other nodes to remove the entry. So, when the entry is stored in cache2, it is removed from cache1. Infinispan does not check whether the value is the same or not.
JSR-107 does not specify how caches should behave in a cluster. JSR-107 only focuses on local caches. The behaviour of Infinispan caches, even under JCache API, for invalidated, distributed and replicated caches is specific to Infinispan.
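The invalidation behaviour described in this reply can be illustrated with a toy model. This is only a sketch in plain Java, not Infinispan code: `InvalidationNode`, `connect` and the in-memory maps are hypothetical names invented for the illustration, and the "network message" is a direct method call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of one node in an invalidation-mode cluster (hypothetical, not Infinispan code).
class InvalidationNode {
    private final Map<String, String> local = new HashMap<>();
    private final List<InvalidationNode> peers = new ArrayList<>();

    // Wire two nodes together so that each can invalidate the other.
    static void connect(InvalidationNode a, InvalidationNode b) {
        a.peers.add(b);
        b.peers.add(a);
    }

    // A write stores the entry locally, then tells every peer to drop the key.
    // The values are never compared.
    void put(String key, String value) {
        local.put(key, value);
        invalidateOnPeers(key);
    }

    // A remove also drops the key on every peer.
    void remove(String key) {
        local.remove(key);
        invalidateOnPeers(key);
    }

    boolean containsKey(String key) {
        return local.containsKey(key);
    }

    private void invalidateOnPeers(String key) {
        for (InvalidationNode peer : peers) {
            peer.local.remove(key);
        }
    }
}

class InvalidationDemo {
    public static void main(String[] args) {
        InvalidationNode cache1 = new InvalidationNode();
        InvalidationNode cache2 = new InvalidationNode();
        InvalidationNode.connect(cache1, cache2);

        cache1.put("k", "v");
        System.out.println(cache1.containsKey("k")); // true
        System.out.println(cache2.containsKey("k")); // false

        cache2.put("k", "v");                        // same key, same value
        System.out.println(cache1.containsKey("k")); // false: invalidated by cache2's put
        System.out.println(cache2.containsKey("k")); // true
    }
}
```

The demo follows the same steps as the question: the second put removes the entry from cache1 even though the value is identical.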
Thanks for the reply!
As an example I saw how Alfresco CRM worked with EHCache in a cluster. There were the following use-cases:
- Read: Data was requested from the cache. If it was not there, it was read from the database, and the programmer called put so that the key-value pair was available in that node's cache. The cache did not send any message over the network, because the cache engine trusted the programmer that a put is simply an entry to insert. The entry is probably in the caches of other nodes too, but who cares?
- Update: A node updated the data in the database. The node then called remove on the key (even if it was not in that node's cache), which sent a message to clear the key from every node.
- Insert: Nothing to do in the cache, as an invalidation cache is used for read-intensive situations.
- Delete: Same as update. The node deleted the entry from the database and then sent the message for the other nodes to delete the key.
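To make the use cases above concrete, here is a minimal sketch of that pattern as I understand it. It is plain Java and does not use EHCache or Alfresco code: `CacheAsideNode`, the shared `database` map and the `cluster` list are hypothetical stand-ins, and the broadcast is a simple loop over the nodes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical node: a local cache in front of a shared "database" map.
class CacheAsideNode {
    private final Map<String, String> database;  // shared by all nodes
    private final List<CacheAsideNode> cluster;  // every node, including this one
    private final Map<String, String> local = new HashMap<>();

    CacheAsideNode(Map<String, String> database, List<CacheAsideNode> cluster) {
        this.database = database;
        this.cluster = cluster;
        cluster.add(this);
    }

    // Read: on a miss, load from the database and cache locally. No message is sent.
    String read(String key) {
        String value = local.get(key);
        if (value == null) {
            value = database.get(key);
            if (value != null) {
                local.put(key, value);
            }
        }
        return value;
    }

    // Update: write to the database, then clear the key on every node.
    void update(String key, String value) {
        database.put(key, value);
        broadcastRemove(key);
    }

    // Insert: database only, the cache is filled lazily by later reads.
    void insert(String key, String value) {
        database.put(key, value);
    }

    // Delete: same as update, but the database entry is removed.
    void delete(String key) {
        database.remove(key);
        broadcastRemove(key);
    }

    boolean isCached(String key) {
        return local.containsKey(key);
    }

    private void broadcastRemove(String key) {
        for (CacheAsideNode node : cluster) {
            node.local.remove(key);
        }
    }
}

class CacheAsideDemo {
    public static void main(String[] args) {
        Map<String, String> database = new HashMap<>();
        List<CacheAsideNode> cluster = new ArrayList<>();
        CacheAsideNode node1 = new CacheAsideNode(database, cluster);
        CacheAsideNode node2 = new CacheAsideNode(database, cluster);

        node1.insert("user:1", "Alice");
        node1.read("user:1");
        node2.read("user:1");
        // Both nodes keep the entry, because a read-triggered put sends nothing.
        System.out.println(node1.isCached("user:1") && node2.isCached("user:1")); // true

        node1.update("user:1", "Bob");
        System.out.println(node2.isCached("user:1")); // false
        System.out.println(node2.read("user:1"));     // Bob
    }
}
```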
Now the problem:
If a "put" request deletes the same key on other nodes, an invalidation cache becomes unusable for read-intensive situations, which is what it is meant for. If there are one billion records in the database and multiple nodes need the same data at the same time, they will not be able to use the cache at all, because they will keep deleting the entry from each other's caches. Worse, this will cause a significant performance penalty.
My second question is: could an invalidation cache perform better than a distributed cache in any situation? I guess it can perform better if it is cheaper to ask the database (or some other source of the data) than to ask other nodes in the cluster. Based on my experience, I always thought that the best way to do read-intensive caching is the way Alfresco does it with EHCache. However, if Infinispan has a very good distributed cache implementation that never performs worse than that logic, an invalidation cache has no reason to exist anymore.
If you want to do a cache put without sending a message over the network to other nodes, you can call the following regardless of whether the cache is configured with invalidation, distribution or replication:
After that, you can call cache.put/cache.remove as normal and it will send messages to other nodes in the cluster.
We also have a put operation called putForExternalRead that works very well in these caching use cases. We use it extensively in the Hibernate second level cache Infinispan integration code.
In fact, if you want tips on how to get the most out of Infinispan for caching, the Hibernate 2LC code base uses a lot of these tricks to improve performance and avoid being wasteful.
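As a rough illustration of why putForExternalRead suits the read-from-database case, here is a toy model of the semantics as I understand them from the Infinispan documentation: the entry is stored only if the key is absent, and in invalidation mode the other nodes are not asked to drop it. This is a sketch in plain Java, not Infinispan code. `ExternalReadNode` and `connect` are hypothetical, and the real operation has further properties (such as failing fast and quietly) that are not modelled here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model contrasting a regular put with a putForExternalRead-style put (hypothetical).
class ExternalReadNode {
    private final Map<String, String> local = new HashMap<>();
    private final List<ExternalReadNode> peers = new ArrayList<>();

    static void connect(ExternalReadNode a, ExternalReadNode b) {
        a.peers.add(b);
        b.peers.add(a);
    }

    // Regular write: store locally and invalidate the key on every peer.
    void put(String key, String value) {
        local.put(key, value);
        for (ExternalReadNode peer : peers) {
            peer.local.remove(key);
        }
    }

    // Assumed semantics: store only if the key is absent, and send nothing to peers.
    void putForExternalRead(String key, String value) {
        local.putIfAbsent(key, value);
    }

    String get(String key) {
        return local.get(key);
    }
}

class ExternalReadDemo {
    public static void main(String[] args) {
        ExternalReadNode node1 = new ExternalReadNode();
        ExternalReadNode node2 = new ExternalReadNode();
        ExternalReadNode.connect(node1, node2);

        // Both nodes load the same row from the database and cache it.
        node1.putForExternalRead("k", "v1");
        node2.putForExternalRead("k", "v1");
        System.out.println(node1.get("k")); // v1
        System.out.println(node2.get("k")); // v1

        // A late load of an old value does not overwrite what is already cached.
        node2.putForExternalRead("k", "stale");
        System.out.println(node2.get("k")); // v1

        // A real update still invalidates the other node.
        node1.put("k", "v2");
        System.out.println(node2.get("k")); // null
    }
}
```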