I am currently investigating options for our project. I have spent some time playing with Infinispan, and I don't seem to be able to achieve the desired effect. So I would like to ask whether it's actually possible, and how to do it:
I would like to have a distributed cache consisting of several nodes.
I need to be able to influence how many elements can be stored on each node. Some of the nodes will behave as 'clients' - they are our applications that put and retrieve data from the cache. The other nodes will serve purely as cache storage.
I can't use Hot Rod for the clients to connect to the cache, as I need support for transactions. And I don't want the 'client' nodes to store the same amount of data as the other nodes - they already have quite a heavy memory load from the application's activity.
The ideal configuration for my purposes would be if the 'client' nodes served only as an L1 cache.
Does this make any sense, and is it possible with Infinispan?
The easiest thing is probably to have the client nodes access Infinispan in embedded mode (in-VM), and then adjust the eviction settings per node, depending on how many entries each node should hold.
You probably want to use distribution as the clustering mode.
The key thing about eviction is to figure out where to store data once it's evicted (if you want to keep a backup of it). It could be persisted to a shared JDBC database, a file-based cache store, etc.
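As an illustration, evicted entries can be passivated to a file-based store so they are not lost. Roughly, per the Infinispan 5.x configuration schema (the store location here is just a placeholder):

```xml
<namedCache name="myCache">
   <eviction strategy="LIRS" maxEntries="50"/>
   <!-- passivation="true": entries are written to the store only when
        evicted from memory, and loaded back into memory on access -->
   <loaders passivation="true" shared="false">
      <loader class="org.infinispan.loaders.file.FileCacheStore">
         <properties>
            <property name="location" value="/var/tmp/myCacheStore"/>
         </properties>
      </loader>
   </loaders>
</namedCache>
```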
Another possibility is a near-cache style setup. This is not natively supported in Hot Rod, but there's a demo I built using JMS that gives clients embedded Infinispan access, with Hot Rod servers behind them. See my preso in:
And the code for it can be found in our distribution, under the near cache client and server demo code.
I'll use a simple example to show my problem here.
If I have a node C (client) configured as follows:
<namedCache name="myCache">
   <eviction strategy="LIRS" maxEntries="50"/>
   <expiration lifespan="-1" maxIdle="-1" reaperEnabled="false" wakeUpInterval="-1"/>
   <clustering mode="dist">
      <hash numVirtualNodes="100" numOwners="1"/>
      <async/>
   </clustering>
</namedCache>
and another node S (server) configured as follows:
<namedCache name="myCache">
   <eviction strategy="LIRS" maxEntries="10000"/>
   <expiration lifespan="-1" maxIdle="-1" reaperEnabled="false" wakeUpInterval="-1"/>
   <clustering mode="dist">
      <hash numVirtualNodes="100" numOwners="1"/>
      <async/>
   </clustering>
</namedCache>
When I put 1000 elements in via node C, eventually there will only be about 550 elements in the cluster, because the hash distributes them evenly: roughly 500 keys map to S, while of the ~500 keys that map to C only about 50 survive its eviction limit. So even though the cluster's capacity is much larger, nearly half of the elements get evicted.
I am looking for a configuration where most of the elements will stay on node S.
Right, the only way I can currently think of is to generate the cache keys using the key affinity service.
This enables you to generate a key that is guaranteed to be stored on a particular node. So you could code against this service in such a way that node S takes N times more keys than C.
If you have particular keys that you need to use (in other words, you cannot simply use the generated keys), you'd need to map your natural keys to the generated ones. A bit complex, but that's the only way I can think of getting this to work right now.
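The natural-key-to-generated-key mapping could be sketched as below. In Infinispan the generator would be the key affinity service asked for keys landing on node S; here it is stubbed with a plain `Supplier` (the class and method names are illustrative, not Infinispan API) so the example is self-contained:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Sketch: map application ("natural") keys to affinity-generated keys.
// The generator stands in for Infinispan's key affinity service.
class AffinityKeyMapping {
    private final Map<String, String> naturalToGenerated = new HashMap<>();
    private final Supplier<String> affinityKeyGenerator;

    AffinityKeyMapping(Supplier<String> affinityKeyGenerator) {
        this.affinityKeyGenerator = affinityKeyGenerator;
    }

    // Cache key to use for a natural key: generated on first use, then
    // remembered so the same natural key always maps to the same entry.
    String cacheKeyFor(String naturalKey) {
        return naturalToGenerated.computeIfAbsent(
                naturalKey, k -> affinityKeyGenerator.get());
    }
}
```

You would then put and get with `mapping.cacheKeyFor(naturalKey)` instead of the natural key itself; note the mapping must itself live somewhere all clients can reach, which is part of the complexity mentioned above.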
Yes, but this will work only as long as I can limit the number of 'C' type nodes.
Is there any other way to influence the hash distribution over the cluster?
I am going to try playing with machineId in the transport configuration, together with numOwners.
What if I set numOwners=2, set the same machineId for all my 'C' type nodes, and different machineIds for my 'S' type nodes? Does that guarantee that each element I put into the cluster has at least one replica stored on an 'S' node?
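For reference, the machineId in question is set on the global transport element. A sketch per the Infinispan 5.x schema (cluster name and machineId values are placeholders; note that machineId only influences placement when the topology-aware consistent hash is in use):

```xml
<global>
   <!-- All 'C' nodes would share one machineId;
        each 'S' node would get its own -->
   <transport clusterName="myCluster" machineId="client-machine"/>
</global>
```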