6 Replies Latest reply on Dec 6, 2011 2:30 PM by alexc099

    Failover with static clusters?

    alexc099

      Something else I had working with broadcast discovery, but can't seem to get working in a static cluster environment, is HA failover.

       

      My hornetq-configuration.xml defines the cluster like so. The netty connector is the primary and netty-backup is the backup. (I've also tried it with just netty-backup.) When I bring them both up I see the backup announcement and when I kill the primary the backup takes over.

       

         <!-- Clustering configuration -->
         <cluster-connections>
            <cluster-connection name="my-cluster">
               <address>jms</address>
               <connector-ref>netty</connector-ref>
               <retry-interval>500</retry-interval>
               <use-duplicate-detection>true</use-duplicate-detection>
               <max-hops>1</max-hops>
               <static-connectors>
                  <connector-ref>netty</connector-ref>
                  <connector-ref>netty-backup</connector-ref>
               </static-connectors>
            </cluster-connection>
         </cluster-connections>
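
      The connector definitions themselves aren't shown above. As a rough sketch only, the two connectors named in static-connectors would sit in the same hornetq-configuration.xml along these lines. The host names and ports are placeholder assumptions, not values taken from this setup:

```xml
<!-- Sketch only: host names and ports below are placeholder assumptions -->
<connectors>
   <!-- Connector pointing at the primary (live) server -->
   <connector name="netty">
      <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
      <param key="host" value="primary.example.com"/>
      <param key="port" value="5445"/>
   </connector>
   <!-- Connector pointing at the backup server -->
   <connector name="netty-backup">
      <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
      <param key="host" value="backup.example.com"/>
      <param key="port" value="5445"/>
   </connector>
</connectors>
```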

       

      I've defined my connection factory in hornetq-jms.xml like so:

       

         <connection-factory name="NettyConnectionFactory">
            <connectors>
               <connector-ref connector-name="netty"/>
            </connectors>
            <entries>
               <entry name="/ConnectionFactory"/>
            </entries>
            <ha>true</ha>

            <confirmation-window-size>1</confirmation-window-size>
            <client-failure-check-period>100</client-failure-check-period>

            <retry-interval>1000</retry-interval>
            <retry-interval-multiplier>1.0</retry-interval-multiplier>
            <reconnect-attempts>10</reconnect-attempts>
         </connection-factory>

       

      I've tried it both with and without confirmation-window-size (which the docs say is required, but which isn't set in the sample), and that doesn't seem to help. With reconnect-attempts set to 10 I eventually get

       

      org.hornetq.core.logging.impl.JULLogDelegate warn WARNING: Tried 10 times to connect. Now giving up on reconnecting it.

       

      But if I set it to -1 then my producer just hangs continuously attempting to reconnect to the dead server. Did I miss something in the config?
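
      One variant that may be worth testing is a connection factory that lists both connectors, on the assumption that the client needs the backup in its initial static list rather than learning it from the cluster topology. This is only a sketch and is untested against this setup. The factory name and JNDI entry are hypothetical, and both connectors are assumed to be defined in hornetq-configuration.xml:

```xml
<!-- Sketch only: factory name and JNDI entry are hypothetical -->
<connection-factory name="NettyHAConnectionFactory">
   <connectors>
      <!-- Initial static list: primary first, then backup -->
      <connector-ref connector-name="netty"/>
      <connector-ref connector-name="netty-backup"/>
   </connectors>
   <entries>
      <entry name="/HAConnectionFactory"/>
   </entries>
   <ha>true</ha>
   <retry-interval>1000</retry-interval>
   <retry-interval-multiplier>1.0</retry-interval-multiplier>
   <reconnect-attempts>10</reconnect-attempts>
</connection-factory>
```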