6 Replies Latest reply on Mar 3, 2010 3:52 AM by mircea.markus

    Local file persistence of partitioned data with full preload

      I guess this is a read-the-source-code question, but let's see if somebody already has the answer:

       

      Assuming setup:

      - cluster of multiple nodes, in ASYNC mode, numOwners >=2

      - all data is planned to be kept in memory at all times

      - local write-behind file storage on each node (shared = false, preload = true, passivation = false, fetchPersistentState = true)

      - full HA is required (surviving any single node failure, or multiple node failures when numOwners > 2)

       

      Scenario:

       

      - cluster of three nodes A,B,C

      - numOwners = 2

      - assuming Nodes A and B store key K1

      - node A fails; the keys get rehashed and the related state transfer takes place

      Q1: I assume the rehashing and state transfer take place fully automatically, right?

      - now nodes B and C store key K1

      Q2: right?

      - nodes B and C are updated (by any node) and the CacheStore updates the local files in nodes B and C

      - node A is brought back on-line and joins the cluster

      Q3: what is the logic with respect to value for K1 between the permanent store in A, and cache state in B and C? Which information prevails?

      - rehashing takes place for nodes A and B to store K1 again

      Q4: right?

      Q5: is K1 automatically removed from node C's permanent storage (along with being removed from node C's memory cache)?
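To make the scenario above concrete, here is a toy sketch of how the owners of K1 could move as node A leaves and rejoins. It is only an illustration: the `OwnershipSketch` class, the sorted-ring placement rule, and the idea that K1 "starts" at A's position are all assumptions made for the example, and this is not Infinispan's actual consistent hash.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class OwnershipSketch {
    // Toy rule (assumption): nodes sit on a ring in sorted order, and a key is owned by
    // the first numOwners live nodes found at or after the key's start position.
    public static List<String> owners(TreeSet<String> nodes, String start, int numOwners) {
        List<String> ring = new ArrayList<>(nodes.tailSet(start, true));
        ring.addAll(nodes.headSet(start, false)); // wrap around the ring
        return new ArrayList<>(ring.subList(0, Math.min(numOwners, ring.size())));
    }

    public static void main(String[] args) {
        TreeSet<String> nodes = new TreeSet<>(Arrays.asList("A", "B", "C"));
        System.out.println(owners(nodes, "A", 2)); // prints [A, B]
        nodes.remove("A");                         // node A fails
        System.out.println(owners(nodes, "A", 2)); // prints [B, C]
        nodes.add("A");                            // node A rejoins
        System.out.println(owners(nodes, "A", 2)); // prints [A, B]
    }
}
```

The sketch only tracks who owns the key. What happens to the value in each node's local store on rejoin is exactly what Q3 and Q5 ask about.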

       

      ------------

       

      Q6: What would be the appropriate pattern to start a cluster most efficiently with the above setup? That is: storage local to each node (file storage is only an example; the implementation could be something else, as long as it is local to each node), a need to preload all data (from all nodes) into the distributed cache memory, and a notification when all data is loaded (to signal system availability to the application).

       

      Thanks again.

       

      Tero

        • 1. Re: Local file persistence of partitioned data with full preload
          mircea.markus

          Q1: yes

          Q2: yes

          Q3: If you have fetchPersistentState="true" then the cache store of A will be cleared and all the data will be fetched from B and C.

          Q4: true

          Q5: that's what I would expect, perhaps Manik can confirm

          Re: Q6, you should use preload="true", fetchPersistentState="true" and a @ViewChanged listener. The listener will be called whenever a new member joins, with information (an event) that contains the size of the cluster. Once the cluster has reached the desired size, it means that the whole cluster was started and you can inform your application about it.
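A minimal sketch of the "wait until the cluster has the desired size" part, kept free of Infinispan types so it can be read on its own. `ClusterSizeGate` and its method names are hypothetical. The assumption is that a method annotated with @ViewChanged calls `onViewSize(...)` with the member count taken from the event it receives.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ClusterSizeGate {
    private final int expectedSize;
    private final CountDownLatch reached = new CountDownLatch(1);

    public ClusterSizeGate(int expectedSize) {
        this.expectedSize = expectedSize;
    }

    // Assumed wiring: call this from the view-change listener with the new member count.
    public void onViewSize(int memberCount) {
        if (memberCount >= expectedSize) {
            reached.countDown(); // further calls are harmless no-ops
        }
    }

    // Returns true once the expected size has been seen, false on timeout or interrupt.
    public boolean await(long timeout, TimeUnit unit) {
        try {
            return reached.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The application thread would block in `await(...)` and announce availability when it returns true. Note that this only reflects how many members are in the view, not whether each member has finished loading its data.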

          HTH,

          Mircea

          • 2. Re: Local file persistence of partitioned data with full preload

            Hi, a couple more clarification questions. When is @ViewChanged actually called, in case of a new node?

            A) after JGroups view has changed, or

            B) after the new node is preloaded and state transferred?

             

            Can I rely on all state transfers resulting from the rehashing having been performed by the time @ViewChanged is called?

             

            How are overlapping rehashing/state transfers managed? E.g. one node's JVM crashes -> rehashing/state transfers, and the crashed node comes back alive (e.g. through automatic restart) before the rehashing/state transfer is completed. What is the policy for managing this situation?

             

            Thanks for your support.

            • 3. Re: Local file persistence of partitioned data with full preload
              mircea.markus

              Hi, a couple more clarification questions. When is @ViewChanged actually called, in case of a new node?

              A) after JGroups view has changed, or

              B) after the new node is preloaded and state transferred?

              Good point. ViewChanged will signal the JGroups join (A) and not "Node joined and cluster size is now n". I think this might be a bit more complex than I thought. I think a NodeJoined event would make perfect sense for this scenario. But as we don't have one, here is a workaround that I think will do the job: register a CacheStarted listener, which will be called once the local cache is fully started. N.B. this does not mean that the cluster is formed; other caches might still be joining. Now, as the cache is started, you can do cache.put(localAddress, "started"). Then, in a loop, do:


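              // wait until the view contains the expected number of members
              while (cacheManager.getMembers().size() < expectedClusterSize) {
                Thread.sleep(1000);
              }
              // then wait until every member has put its "started" marker
              boolean clusterFormed = false;
              while (!clusterFormed) {
                clusterFormed = true;
                for (Address addr : cacheManager.getMembers()) {
                  if (cache.get(addr) == null) {
                    clusterFormed = false;
                    break;
                  }
                }
                if (!clusterFormed) {
                  Thread.sleep(1000);
                }
              }
              // now inform your app that the cluster was formed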

              Looking back, this looks a lot like a workaround, but it should do the job.
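The same polling idea can be sketched as a reusable helper with a timeout, so the application does not wait forever if a node never shows up. Everything here is an assumption for illustration: `ClusterReadiness`, `awaitFormed`, and the parameter names are hypothetical, and the member list and the marker lookup are passed in as plain functions rather than tied to any Infinispan class.

```java
import java.util.Collection;
import java.util.function.Predicate;
import java.util.function.Supplier;

public class ClusterReadiness {
    /**
     * Polls until at least expectedSize members are visible and every visible member
     * has published its "started" marker. Returns false on timeout or interrupt.
     */
    public static <A> boolean awaitFormed(Supplier<Collection<A>> members,
                                          Predicate<A> hasMarker,
                                          int expectedSize,
                                          long timeoutMillis,
                                          long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            Collection<A> current = members.get();
            if (current.size() >= expectedSize && current.stream().allMatch(hasMarker)) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With the names from the snippet above, the members function would return cacheManager.getMembers() and the marker check would be cache.get(addr) != null.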

              How are overlapping rehashing/state transfers managed? E.g. one node's JVM crashes -> rehashing/state transfers, and the crashed node comes back alive (e.g. through automatic restart) before the rehashing/state transfer is completed. What is the policy for managing this situation?

              This is something that needs to be looked into in more detail. I'll add a JIRA for it.

              • 4. Re: Local file persistence of partitioned data with full preload
                mircea.markus
                For more details on how Infinispan handles state transfer in a non-blocking manner, here is our design document.
                • 5. Re: Local file persistence of partitioned data with full preload
                  mircea.markus
                  For the first issue I've created SPN-360. Feel free to review and comment on it; any feedback is much appreciated.
                  • 6. Re: Local file persistence of partitioned data with full preload
                    mircea.markus
                    Here is the JIRA for investigating and documenting multiple nodes state transfers.