51 Replies Latest reply on May 23, 2006 11:23 AM by manik
      • 45. Re: Buddy Replication in JBoss Cache (JBCACHE-61)
        brian.stansberry

         

        "manik.surtani@jboss.com" wrote:
        "bstansberry@jboss.com" wrote:


        a) Cache discovers its buddy has failed; therefore it needs to walk through the tree taking ownership of all data owned by the old owner.
        b) Cache has no idea its buddy has failed, but receives a get() call targeting a node it doesn't own. E.g. load balancer fails over a webapp request before the JGroups suspicion process is complete. Here the cache should just take ownership of the single node.


        Do we even consider a) anymore? Just let the backup data remain as backups - ownerless - until a get() call comes into the cluster for this data. Then let the cache that deals with this get() become the owner of this single node. I.e., let ownership shift on a node-by-node basis. (And let data that never gets requested again gradually get evicted.)

        There is little point in a single cache ("primary buddy") taking ownership of all the failed cache's data if the load balancer will be distributing requests for the failed cache across the cluster anyway.


        (Our last chat moved away from the shared tree approach and back to _buddy_backup_, but I'm replying to document this part of our conversation).

        If the backup doesn't take ownership of the data, the number of caches on which the data is stored is reduced by 1 (the old data owner). So, there is a reduction in the reliability guarantee.

        That's the downside. The upside is there is no sudden mass movement of data around the cluster when a server goes down. The data just gets gravitated as it gets requested.
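
A minimal sketch of the node-by-node gravitation described above, not the actual JBoss Cache implementation. The class and method names (GravitatingCache, storeBackup, owns) are hypothetical, and a flat map stands in for the cache tree. It only shows that a get() for a backed-up node promotes that single node and leaves the remaining backups ownerless.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; none of these names come from JBoss Cache.
class GravitatingCache {
    // Data this cache owns, keyed by node name (e.g. "/session/42").
    private final Map<String, String> owned = new HashMap<>();
    // Backup copies held on behalf of buddies, keyed by node name.
    private final Map<String, String> backups = new HashMap<>();

    void put(String fqn, String value) {
        owned.put(fqn, value);
    }

    void storeBackup(String fqn, String value) {
        backups.put(fqn, value);
    }

    // On a get() for a node we do not own, promote just that one backup node.
    Optional<String> get(String fqn) {
        String value = owned.get(fqn);
        if (value == null) {
            value = backups.remove(fqn);
            if (value != null) {
                owned.put(fqn, value); // ownership shifts for this node only
            }
        }
        return Optional.ofNullable(value);
    }

    boolean owns(String fqn) {
        return owned.containsKey(fqn);
    }

    int backupCount() {
        return backups.size();
    }
}
```

Backups that are never requested stay in the backup map here; in the scheme discussed above they would eventually be evicted.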

        • 46. Re: Buddy Replication in JBoss Cache (JBCACHE-61)
          brian.stansberry

           

          "manik.surtani@jboss.com" wrote:
          "bstansberry@jboss.com" wrote:

          Will these be a hard, or a suggestion?


          Perhaps this could be configurable as well. For heterogenous replication groups to make any sense, this would have to be hard - but to achieve the same levels of backup security, it would need to be flexible to utilise servers from other repl groups. A tradeoff we could pass on to the user I suppose.


          You must be related to the owners of my previous employer. "Make it a config option" is what they always said as well :-)

          All kidding aside, what you say makes sense.

          • 47. Re: Buddy Replication in JBoss Cache (JBCACHE-61)
            brian.stansberry

            (Notes from a chat Manik and I had yesterday; all good ideas are his, all bad ideas and poor explanation are mine)

            We were discussing state transfer issues if we went with a "shared tree" approach rather than the original design of using a special _buddy_backup_ region. Previous posts discuss the process a data owner goes through in preparing state to transfer to a new buddy. Some issues there, but not too bad. But now consider things from the new buddy's perspective, particularly one who has an existing tree but has been named as the buddy of another cache following a topology change. This new buddy would have to take the received state and integrate it into his own existing tree. The received state may not be a nicely grouped set of nodes that can easily be tacked onto a few branches of the existing tree -- the nodes may be scattered around.

            Integrating this state is a solvable problem, but it's a pain and likely to be ugly.

            The reason we moved away from the _buddy_backup_ region approach was the need to ensure a get() sent out from a ClusteredCacheLoader would be able to find the data if the data owner had died, but the primary buddy had not recognized that yet and taken ownership of the data.

            A solution to the get() problem is to have caches, when they receive a get() with a special "checkBuddyTree" option, first check their main tree and then all their buddy backup trees, returning the first data they find.

            This is kind of ugly (checking multiple trees), but actually less ugly than all the state integration issues we've discussed above. So, we're inclined to move back to the original _buddy_backup_ region approach.
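
A rough sketch of that lookup order, under stated assumptions: the "checkBuddyTree" option is modelled as a boolean parameter, trees are flat maps, and BuddyAwareCache and putBackup are made-up names rather than anything in the JBoss Cache API.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the JBoss Cache API.
class BuddyAwareCache {
    // The cache's own data.
    private final Map<String, String> mainTree = new HashMap<>();
    // One backup tree per data owner this cache backs up, in insertion order.
    private final Map<String, Map<String, String>> buddyBackupTrees = new LinkedHashMap<>();

    void put(String fqn, String value) {
        mainTree.put(fqn, value);
    }

    void putBackup(String owner, String fqn, String value) {
        buddyBackupTrees.computeIfAbsent(owner, k -> new HashMap<>()).put(fqn, value);
    }

    // Checks the main tree first; only if checkBuddyTree is set does it then
    // search each backup tree, returning the first value found (or null).
    String get(String fqn, boolean checkBuddyTree) {
        String value = mainTree.get(fqn);
        if (value != null || !checkBuddyTree) {
            return value;
        }
        for (Map<String, String> tree : buddyBackupTrees.values()) {
            value = tree.get(fqn);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}
```

An ordinary get() (checkBuddyTree false) never sees backup data, so the extra tree walk is only paid by calls that ask for it.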

            Thoughts?

            • 48. Re: Buddy Replication in JBoss Cache (JBCACHE-61)
              manik

              After this has gone through many iterations here and on IM, etc. I've updated the designs on the wiki.

              Please do have a look and comment accordingly - I would really like your feedback on this as this will eventually become the documentation for buddy replication.

              Cheers,
              Manik

              • 49. Re: Buddy Replication in JBoss Cache (JBCACHE-61)
                manik
                • 50. Re: Buddy Replication in JBoss Cache (JBCACHE-61)
                  galder.zamarreno

                  I arrived at this discussion late and I was very pleased to read
                  through the wiki. It is very well explained and the pictures do help in
                  visualising the buddy replication design/implementation.

                  In the Implementation Overview section, I would probably change the
                  order of the sections from:

                  Configuring buddy replication
                  Gravitation of data
                  Finding your buddy
                  Backing up data

                  To:

                  Configuring buddy replication
                  Finding your buddy
                  Backing up data
                  Gravitation of data

                  I guess having the sections in chronological order would make more
                  sense?

                  In the Diagram 1: Operational cluster section, there's two spelling
                  mistakes: gravidates and loking. Manik, you should be ashamed! ;-)

                  Finally, in the section of buddyPoolName, regarding the example of having
                  3 power sources, just to confirm that I understood correctly, each of the
                  power sources would have a different buddyPoolName and any other
                  nodes would have one of these buddyPoolNames as set by the user.

                  • 51. Re: Buddy Replication in JBoss Cache (JBCACHE-61)
                    manik

                     


                    In the Diagram 1: Operational cluster section, there's two spelling
                    mistakes: gravidates and loking. Manik, you should be ashamed! ;-)


                    Picky bugger. Fixed. :-)


                    Finally, in the section of buddyPoolName, regarding the example of having
                    3 power sources, just to confirm that I understood correctly, each of the
                    power sources would have a different buddyPoolName and any other
                    nodes would have one of these buddyPoolNames as set by the user.


                    Nope. Each buddy pool will have one member on each power supply. So that if one power supply dies, all buddy pools still have living members.
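
To make that layout concrete, here is a small sketch that derives a pool name for each server so that every pool gets one member per power supply. BuddyPoolPlanner, assignPools and the "pool-N" names are invented for illustration. In practice the user would simply set buddyPoolName on each server by hand, and the sketch assumes every supply hosts the same number of servers.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; only buddyPoolName itself comes from the wiki design.
class BuddyPoolPlanner {
    // Input: power supply name -> servers running on that supply.
    // Output: server -> buddyPoolName, where pool i holds the i-th server
    // of each supply, so each pool spans all power supplies.
    static Map<String, String> assignPools(Map<String, List<String>> serversBySupply) {
        Map<String, String> poolByServer = new LinkedHashMap<>();
        for (List<String> servers : serversBySupply.values()) {
            for (int i = 0; i < servers.size(); i++) {
                poolByServer.put(servers.get(i), "pool-" + i);
            }
        }
        return poolByServer;
    }
}
```

With three supplies of two servers each, this yields two pools of three servers, and losing any one supply still leaves two living members in each pool.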

