8 Replies Latest reply on Jun 13, 2011 9:21 AM by kcbabo

    Plans for clustering and load balancing support

    mvecera

      Hello,

       

      What are the basic plans for clustering and load balancing support in SwitchYard?

       

      Let's consider the following example:

      I have two services (ServiceA and ServiceB) both deployed on several SwitchYard instances in a cluster.

      ServiceA calls ServiceB.

      How could I configure ServiceA to call ServiceB on any node in the cluster, choosing the node according to its current load?

       

      Thanks,

      Martin

        • 1. Re: Plans for clustering and load balancing support
          kcbabo

          Hey Martin,

           

          Clustering will be achieved through a bus provider which can be distributed (e.g. HornetQ) and a federated runtime registry which associates services with the instances in which they run.  From an end-user configuration standpoint, both of these details should be transparent.  If you set up a cluster of SwitchYard instances, you should not have to write code which selects a specific instance of a service.  If an instance is unavailable, the addressing handler will not send messages to it.  In theory, we could enhance the addressing logic to factor in load metrics and other data, but it's not something we've tackled yet.  Do you have a specific use case in mind?
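
To make the registry and addressing idea concrete, here is a minimal sketch in plain Java. It is not SwitchYard code: `ServiceInstance`, `RuntimeRegistry`, and `AddressingHandler` are hypothetical names invented for illustration, and the round-robin choice among available instances is an assumption, since the thread only says that unavailable instances are skipped.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical: one deployment of a service on one cluster node.
final class ServiceInstance {
    final String serviceName;
    final String nodeId;
    volatile boolean available = true;

    ServiceInstance(String serviceName, String nodeId) {
        this.serviceName = serviceName;
        this.nodeId = nodeId;
    }
}

// Hypothetical: associates service names with the instances that run them.
final class RuntimeRegistry {
    private final Map<String, List<ServiceInstance>> byService = new ConcurrentHashMap<>();

    void register(ServiceInstance instance) {
        byService.computeIfAbsent(instance.serviceName, k -> new CopyOnWriteArrayList<>())
                 .add(instance);
    }

    // Returns only the instances currently marked available.
    List<ServiceInstance> availableInstances(String serviceName) {
        List<ServiceInstance> result = new ArrayList<>();
        List<ServiceInstance> all =
                byService.getOrDefault(serviceName, Collections.<ServiceInstance>emptyList());
        for (ServiceInstance instance : all) {
            if (instance.available) {
                result.add(instance);
            }
        }
        return result;
    }
}

// Hypothetical: picks a target so the calling service never names a node itself.
final class AddressingHandler {
    private final RuntimeRegistry registry;
    private final AtomicInteger counter = new AtomicInteger();

    AddressingHandler(RuntimeRegistry registry) {
        this.registry = registry;
    }

    // Assumption: plain round-robin over the available instances.
    ServiceInstance resolve(String serviceName) {
        List<ServiceInstance> candidates = registry.availableInstances(serviceName);
        if (candidates.isEmpty()) {
            throw new IllegalStateException("No available instance of " + serviceName);
        }
        int index = Math.floorMod(counter.getAndIncrement(), candidates.size());
        return candidates.get(index);
    }
}
```

In this sketch ServiceA would call `resolve("ServiceB")` and never refer to a node directly, which is the transparency described above.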

           

          cheers,

          keith

          • 2. Re: Plans for clustering and load balancing support
            ldimaggio

            So, in a 12-node cluster, if Service A calls Service B, would the addressing handler determine that Service B should be called on node #12 because nodes 1-11 are busier (CPU, network load, disk I/O, or other defined load metrics) than node 12?

             

            I'm guessing that the goal of the use case would be to maximize the *total* throughput across the whole cluster - and maybe to be able to monitor the entire cluster by watching any one node?

            • 3. Re: Plans for clustering and load balancing support
              kcbabo

              Personally, I think the determination of which node is best suited for a request is a really challenging problem.  Anything beyond the most basic metrics (e.g. size of work queue) can turn out to be a mess.  For example, how do we weight any given metric against the others?  Network speed, CPU utilization, disk I/O, memory utilization, and response time are the typical bounds, and even with these it could be a dog's breakfast.  The way I was hoping to approach this is more of a multi-step deal:

               

              1) Allow for clustering in the first place. :-)

              2) Collect appropriate work metrics for service instances and make them available.

              3) Create an extensible/pluggable mechanism for determining which node a message is routed to.

              4) Fine tune default load balancing policy to take into account load metrics.

               

              Does that seem reasonable?
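
Steps 2 to 4 above could be sketched roughly as follows. This is only an illustration under stated assumptions: `NodeMetrics`, `RoutingStrategy`, `ShortestQueueStrategy`, and `WeightedScoreStrategy` are hypothetical names, not SwitchYard API, and the linear weighted score is one arbitrary way to combine metrics, which is exactly the part described above as hard to get right.

```java
import java.util.List;

// Hypothetical snapshot of the work metrics collected for one node (step 2).
final class NodeMetrics {
    final String nodeId;
    final int queueSize;          // pending work items
    final double cpuUtilization;  // 0.0 to 1.0

    NodeMetrics(String nodeId, int queueSize, double cpuUtilization) {
        this.nodeId = nodeId;
        this.queueSize = queueSize;
        this.cpuUtilization = cpuUtilization;
    }
}

// Hypothetical extension point (step 3): users can plug in their own policy.
interface RoutingStrategy {
    NodeMetrics select(List<NodeMetrics> candidates);
}

// The "most basic metric" policy: shortest work queue wins.
final class ShortestQueueStrategy implements RoutingStrategy {
    @Override
    public NodeMetrics select(List<NodeMetrics> candidates) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no candidates");
        }
        NodeMetrics best = candidates.get(0);
        for (NodeMetrics m : candidates) {
            if (m.queueSize < best.queueSize) {
                best = m;
            }
        }
        return best;
    }
}

// A possible tuned policy (step 4): lowest weighted score wins.
// The weights are assumptions the operator would have to choose.
final class WeightedScoreStrategy implements RoutingStrategy {
    private final double queueWeight;
    private final double cpuWeight;

    WeightedScoreStrategy(double queueWeight, double cpuWeight) {
        this.queueWeight = queueWeight;
        this.cpuWeight = cpuWeight;
    }

    private double score(NodeMetrics m) {
        return queueWeight * m.queueSize + cpuWeight * m.cpuUtilization;
    }

    @Override
    public NodeMetrics select(List<NodeMetrics> candidates) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no candidates");
        }
        NodeMetrics best = candidates.get(0);
        for (NodeMetrics m : candidates) {
            if (score(m) < score(best)) {
                best = m;
            }
        }
        return best;
    }
}
```

Note that the two policies can disagree on the same input: a node with a short queue but a saturated CPU wins under `ShortestQueueStrategy` and may lose under `WeightedScoreStrategy`, depending on the weights. That sensitivity is why the choice is left pluggable here.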

              • 4. Re: Plans for clustering and load balancing support
                ldimaggio

                Sounds sane to me.   ;-)

                 

                Seriously - step 3 is an important idea - as people will always have their own (heavily customized) approaches to handling this.

                • 5. Re: Plans for clustering and load balancing support
                  mvecera

                  Hello Keith,

                   

                  Thanks for your reply. So the clustering abilities will rely on a bus provider? What about services with a WS interface? Of course I could ask UDDI to give me an endpoint from the cluster, but it cannot decide based on any metrics (not even message queue size)... Or am I missing anything here?

                   

                  Thanks,

                  Martin

                  • 6. Re: Plans for clustering and load balancing support
                    kcbabo

                    Sorry, I thought you were referring to the clustering available directly in the ESB.  External load balancing options are always available (e.g. mod_cluster for SOAP/HTTP) and we need to make sure we integrate well with them.  Also, we will have several bus providers that allow for various delivery policies (remotable, transactional, by-reference, etc.), so it's best not to think about the bus provider as a specific technology like HornetQ.  Sorry for giving that impression earlier.

                     

                    We will not be using UDDI as a runtime registry as is the case with JBoss ESB.  Our runtime registry contains the actual runtime state of services in the cluster and will be able to store (or correlate with) service metadata and metrics.

                    • 7. Re: Plans for clustering and load balancing support
                      mvecera

                      Actually, I wanted to ask both questions. Thanks for your clarification.

                       

                      What registry do you plan to use? What about WS endpoints? What about integration with third-party registries?

                      • 8. Re: Plans for clustering and load balancing support
                        kcbabo

                        The registry question boils down to two basic use cases:

                         

                        1) A runtime registry which reflects the current state of the services available in the ESB.

                        2) A publication registry which allows the address of a service to be advertised and queried by external consumers.

                         

                        These are two distinct use cases with two different registries.  The runtime registry is internal only - users will not interact with it directly at all.  The publication registry will most likely be a blend of UDDI and a service repository.  The user will interact with this registry and will choose which services are published in it.  The runtime state of the service is not a material detail to the publication registry.

                         

                        I see no reason why we should not continue to support publication of services to third-party registries for the second use case.
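
The separation between the two registries described in this reply can be sketched as two unrelated types. The names `RuntimeStateRegistry` and `PublicationRegistry`, and the address used in the usage note, are hypothetical and chosen only for illustration; nothing here reflects an actual SwitchYard or UDDI API.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical internal registry: tracks whether a service is running right now.
// In the design described above, users would never touch this directly.
final class RuntimeStateRegistry {
    private final Map<String, Boolean> running = new ConcurrentHashMap<>();

    void setRunning(String serviceName, boolean isRunning) {
        running.put(serviceName, isRunning);
    }

    boolean isRunning(String serviceName) {
        return running.getOrDefault(serviceName, false);
    }
}

// Hypothetical user-facing registry: advertises addresses the user chose to publish.
// It holds no runtime state, so a stopped service can still be listed.
final class PublicationRegistry {
    private final Map<String, String> published = new ConcurrentHashMap<>();

    void publish(String serviceName, String address) {
        published.put(serviceName, address);
    }

    void unpublish(String serviceName) {
        published.remove(serviceName);
    }

    Optional<String> lookup(String serviceName) {
        return Optional.ofNullable(published.get(serviceName));
    }
}
```

For example, after `publish("ServiceB", "http://example.invalid/ServiceB")`, a call to `lookup("ServiceB")` keeps returning that address even if the runtime registry records the service as stopped, which mirrors the point that runtime state is not a material detail to the publication registry.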