6 Replies Latest reply on Mar 3, 2010 3:52 AM by mircea.markus

Local file persistence of partitioned data with full preload

teroheinonen Feb 26, 2010 7:37 AM

I guess this is a read-the-source-code question, but let's see if somebody already has the answer:

Assuming setup:

- cluster of multiple nodes, in ASYNC mode, numOwners >=2

- all data is planned to be kept in memory at all times

- local write-behind file storage on each node (shared = false, preload = true, passivation = false, fetchPersistentState = true)

- full HA is required (for any single node failure, or for multiple node failures when numOwners > 2)

Scenario:

- cluster of three nodes A,B,C

- numOwners = 2

- assuming Nodes A and B store key K1

- node A fails, the keys get rehashed and the related state transfer takes place

Q1: I assume the rehashing and state transfer take place fully automatically, right?

- now nodes B and C store key K1

Q2: right?

- nodes B and C are updated (by any node) and the CacheStore updates the local files in nodes B and C

- node A is brought back on-line and joins the cluster

Q3: how is the value for K1 reconciled between the permanent store on A and the cache state on B and C? Which information prevails?

- rehashing takes place for nodes A and B to store K1 again

Q4: right?

Q5: is K1 automatically removed from node C's permanent storage (along with being removed from node C's in-memory cache)?

------------

Q6: What would be the appropriate pattern to start a cluster most efficiently with the above setup? That is: local distributed storage (file storage is only an example; the implementation could be something else, but local to each node), all data (from all nodes) must be preloaded into the distributed cache memory, and the application must be notified when all data is loaded (to signal system availability).

Thanks again.

Tero

1. Re: Local file persistence of partitioned data with full preload

mircea.markus Mar 2, 2010 6:39 AM (in response to teroheinonen)

Q1: yes
Q2: yes
Q3: If you have fetchPersistentState="true" then the cache store of A will be cleared and all the data will be fetched from B and C.
Q4: true
Q5: that's what I would expect, perhaps Manik can confirm
Re: Q6, you should use preload="true", fetchPersistentState="true" and a @ViewChanged listener. The listener is called whenever a new member joins, with an event that contains the size of the cluster. Once the cluster has the desired size, the whole cluster has started and you can inform your application about it.
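
To make the Q3 answer concrete, here is a minimal toy model in plain Java, not Infinispan code. The class RejoinSketch, the method rejoinWithFetchPersistentState and the plain maps standing in for A's local store and for the state received from B and C are all hypothetical names chosen for illustration. It only shows the rule described above: the rejoining node's local store is cleared and then filled with what the current owners send, so stale values and stale keys on A do not survive.

```java
import java.util.HashMap;
import java.util.Map;

final class RejoinSketch {

    /**
     * Toy model of a node rejoining with fetchPersistentState="true":
     * the local store is cleared, then replaced by the state received
     * from the current owners. (Assumption: maps stand in for the real
     * cache store and the real state transfer.)
     */
    static Map<String, String> rejoinWithFetchPersistentState(
            Map<String, String> localStore,
            Map<String, String> stateFromOwners) {
        Map<String, String> result = new HashMap<>(localStore);
        result.clear();                  // old on-disk content is discarded
        result.putAll(stateFromOwners);  // cluster state prevails
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> storeOnA = new HashMap<>();
        storeOnA.put("K1", "stale");     // written before A went down
        storeOnA.put("K9", "removed-while-A-was-down");

        Map<String, String> fromBandC = new HashMap<>();
        fromBandC.put("K1", "fresh");    // updated while A was down

        Map<String, String> afterRejoin =
                rejoinWithFetchPersistentState(storeOnA, fromBandC);
        System.out.println(afterRejoin.get("K1"));         // prints "fresh"
        System.out.println(afterRejoin.containsKey("K9")); // prints "false"
    }
}
```

Clearing first, rather than merging, is what keeps keys that were deleted while A was offline from reappearing.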
HTH,
Mircea
2. Re: Local file persistence of partitioned data with full preload

teroheinonen Mar 2, 2010 8:38 AM (in response to mircea.markus)

Hi, a couple of more clarification questions. When is @ViewChanged actually called, in case of a new node?
A) after JGroups view has changed, or
B) after the new node is preloaded and state transferred?

Can I rely on all state transfers resulting from the rehashing having been performed by the time @ViewChanged is called?

How are overlapping rehashing/state transfers managed? E.g. one node's JVM crashes -> rehashing/state transfers start, and the crashed node comes back to life (e.g. through automatic restart) before the rehashing/state transfer is completed. What is the policy to manage this situation?

Thanks for your support.
3. Re: Local file persistence of partitioned data with full preload

mircea.markus Mar 3, 2010 3:27 AM (in response to teroheinonen)

"Hi, a couple of more clarification questions. When is @ViewChanged actually called, in case of a new node?
A) after JGroups view has changed, or
B) after the new node is preloaded and state transferred?"

Good point. ViewChanged signals the JGroups join (A), not "node joined and cluster size is now n". This is a bit more complex than I thought. A NodeJoined event would make perfect sense for this scenario, but as we don't have one, here is a workaround that I think will do the job: register a CacheStarted listener, which will be called once the local cache is fully started. N.B. this does not mean that the cluster is formed; other caches might still be joining. Once the cache is started you can do cache.put(localAddress, "started"). Then, in a loop, do:

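    // Wait until the expected number of members has joined the view.
    while (cacheManager.getMembers().size() < expectedClusterSize) {
        Thread.sleep(1000);
    }
    // Then poll until every member has published its "started" marker.
    boolean clusterFormed = false;
    while (!clusterFormed) {
        clusterFormed = true;
        for (Address addr : cacheManager.getMembers()) {
            if (cache.get(addr) == null) {
                clusterFormed = false;
                break;
            }
        }
        if (!clusterFormed) Thread.sleep(1000);
    }
    // Now inform your app that the cluster was formed.
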
Looking back, this looks a lot like a workaround, but it should do the job.
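
For anyone who wants to try the polling idea without a running cluster, here is a self-contained sketch in plain Java. It is an illustration under stated assumptions, not Infinispan API: a Supplier of member names stands in for cacheManager.getMembers(), a Map stands in for the cache holding the "started" markers, and the class name ClusterReadiness, the method awaitClusterFormed and the timeout parameters are hypothetical additions. Compared with the loop above it adds a deadline, so a node that never publishes its marker does not block the caller forever.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class ClusterReadiness {

    /**
     * Polls until at least expectedClusterSize members are visible and each
     * of them has a non-null marker in the cache, or until the timeout expires.
     * Returns true if the cluster formed, false on timeout or interruption.
     */
    static boolean awaitClusterFormed(Supplier<Collection<String>> members,
                                      Map<String, String> cache,
                                      int expectedClusterSize,
                                      long timeoutMillis,
                                      long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            Collection<String> current = members.get();
            if (current.size() >= expectedClusterSize && allStarted(current, cache)) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up waiting
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }

    private static boolean allStarted(Collection<String> members, Map<String, String> cache) {
        for (String member : members) {
            if (cache.get(member) == null) {
                return false; // this member has not published its marker yet
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> cache = new ConcurrentHashMap<>();
        List<String> members = List.of("A", "B", "C");
        cache.put("A", "started");
        cache.put("B", "started");
        // C has not published its marker, so this times out.
        System.out.println(awaitClusterFormed(() -> members, cache, 3, 200, 50)); // prints "false"
        cache.put("C", "started");
        System.out.println(awaitClusterFormed(() -> members, cache, 3, 200, 50)); // prints "true"
    }
}
```

In a real deployment the marker would be written from the CacheStarted listener on each node, and the check would run on whichever node needs to signal availability to the application.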

"How are the overlapping rehashing/state transfers managed. E.g. one node JVM crashes -> rehashing/state transfers, the crashed node comes back to alive (e.g. through automatic restart) before the rehashing/state transfer is completed. What is the policy to manage this situation?"

This is something that needs to be looked into in more detail. I'll add a JIRA for it.
4. Re: Local file persistence of partitioned data with full preload

mircea.markus Mar 3, 2010 3:28 AM (in response to mircea.markus)

For more details on how Infinispan handles state transfer in a non-blocking manner, here is our design document.
5. Re: Local file persistence of partitioned data with full preload

mircea.markus Mar 3, 2010 3:47 AM (in response to mircea.markus)

For the first issue I've created SPN-360. Feel free to review and comment on it; any feedback is much appreciated.
6. Re: Local file persistence of partitioned data with full preload

mircea.markus Mar 3, 2010 3:52 AM (in response to mircea.markus)

Here is the JIRA for investigating and documenting multiple nodes state transfers.