6 Replies Latest reply on Mar 9, 2011 9:11 AM by massios

    Clustering and failover question

    massios

      Dear all,

       

      We are working on an ESB 4.6 system clustered across 2 nodes. It runs on JBoss 5.1 with Oracle as the underlying database. We are facing the following problem related to JMS clustering.

      1. Initially both node1 and node2 are up and running with node1 as the master node.

      2. The ESB application posts a message to the queue named "queue_a" through node1.
      3. We stop node1. Now only node2 is running and it has taken over as the master node.

      4. We try, UNSUCCESSFULLY, to consume the message in "queue_a" through node2. No error message appears, but the message remains in the queue. The client simply waits in the blocking call that reads the message.

       

      If we run a select on the messaging database, we can find the message we posted to "queue_a", but "queue_a" appears twice in the post_office table, once per node_id.

       

      Is there a way to configure JBoss Messaging so that a running node can read the messages of a node that has failed or has been shut down?

       

      Thanks in advance,

       

      Nikos.

        • 1. Clustering and failover question
          gaohoward

          It seems the cluster wasn't properly set up.

           

          Please read the user manual. My guess is that the special "ClusterPullConnectionFactory" wasn't configured.

           

          Howard

          • 2. Re: Clustering and failover question
            massios

            Hello Howard,

             

            We had a look at the manual and at the following files

            messaging-service.xml

            connection-factories-service.xml

            but cannot find what is wrong.

             

            We have a ClusterPullConnectionFactory defined in connection-factories-service.xml

             

            <mbean code="org.jboss.jms.server.connectionfactory.ConnectionFactory"
                   name="jboss.messaging.connectionfactory:service=ClusterPullConnectionFactory"
                   xmbean-dd="xmdesc/ConnectionFactory-xmbean.xml">
              <depends optional-attribute-name="ServerPeer">jboss.messaging:service=ServerPeer</depends>
              <depends optional-attribute-name="Connector">jboss.messaging:service=Connector,transport=bisocket</depends>
              <depends>jboss.messaging:service=PostOffice</depends>
              <attribute name="SupportsFailover">true</attribute>
              <attribute name="SupportsLoadBalancing">true</attribute>
            </mbean>
            
            

             

            and the line

             

            <attribute name="ClusterPullConnectionFactoryName">jboss.messaging.connectionfactory:service=ClusterPullConnectionFactory</attribute>
            

             

             

            in our messaging-service.xml

             

             

            I am also attaching the complete files.

             

             

            Nikos

            • 3. Clustering and failover question
              massios

              Hello again Howard,

               

               

              I was looking at the code for

               

              org.jboss.messaging.core.impl.clusterconnection.MessageSucker

               

              and for

               

              org.jboss.messaging.core.impl.clusterconnection.ClusterConnectionManager

               

               

              Can you or someone confirm that:

               

              a) The ClusterPullConnectionFactory is only used when a client looks up the ClusteredConnectionFactory.

               

              b) That the ClusterPullConnectionFactory only works when one defines the queue as  Clustered=true  (we are doing this in our case)

               

              c) That the message suckers created by the ClusterConnectionManager only suck messages from the other nodes IF AND ONLY IF the other nodes are up? (they are not in our example).

               

               

              Our problem is that

               

              a) the messages are stored in the DB

               

              b) the node that wrote them is taken out

               

              c) The messages are readable only if the node that wrote them is up.

               

               

              Nikos

              • 4. Clustering and failover question
                gaohoward

                Hi Nikos,

                 

                The 'ClusterPullConnectionFactory' doesn't need its 'SupportsFailover' and 'SupportsLoadBalancing' set to 'true'. They need to be 'false'.

                • 5. Clustering and failover question
                  gaohoward

                  a) The ClusterPullConnectionFactory is only used when a client looks up the ClusteredConnectionFactory.

                   

                  --No, it's used internally for pulling messages between nodes in a cluster.

                   

                  b) That the ClusterPullConnectionFactory only works when one defines the queue as  Clustered=true  (we are doing this in our case)

                   

                  --yes, and make sure you deploy the queue on each node.

                   

                  c) That the message suckers created by the ClusterConnectionManager only suck messages from the other nodes IF AND ONLY IF the other nodes are up? (they are not in our example).

                   

                  --Yes, if a node is down, its messages won't be sucked. However, they may be failed over to other nodes.
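
                  For reference, a clustered queue declaration could look roughly like the sketch below. It assumes the queue is declared in a destinations-service.xml file deployed identically on every node; the file name is an assumption, and "queue_a" is the queue from the original question.

                  ```xml
                  <!-- Assumed location: destinations-service.xml, deployed on BOTH nodes -->
                  <mbean code="org.jboss.jms.server.destination.QueueService"
                         name="jboss.messaging.destination:service=Queue,name=queue_a"
                         xmbean-dd="xmdesc/Queue-xmbean.xml">
                    <depends optional-attribute-name="ServerPeer">jboss.messaging:service=ServerPeer</depends>
                    <depends>jboss.messaging:service=PostOffice</depends>
                    <!-- Marks the queue as clustered so other nodes can pull its messages -->
                    <attribute name="Clustered">true</attribute>
                  </mbean>
                  ```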

                   

                  Our problem is that

                   

                  a) the messages are stored in the DB

                   

                  b) the node that wrote them is taken out

                   

                  c) The messages are readable only if the node that wrote them is up.

                   

                  I guess this is a 'failover' configuration issue. When a node fails, there is always a node in the cluster that will 'merge' the dead node's messages over and deliver them. However, if a node is shut down normally, it may or may not be failed over, depending on a property called 'FailoverOnNodeLeave'.
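
                  A minimal sketch of that setting, assuming 'FailoverOnNodeLeave' is an attribute of the PostOffice MBean (usually found in the database-specific persistence service file, which is an assumption about your layout; check your own deployment):

                  ```xml
                  <!-- Fragment to place inside the existing
                       <mbean ... name="jboss.messaging:service=PostOffice" ...> element.
                       All other attributes of that MBean stay as they are. -->
                  <attribute name="Clustered">true</attribute>
                  <!-- Fail over a node's messages even when that node is shut down cleanly -->
                  <attribute name="FailoverOnNodeLeave">true</attribute>
                  ```

                  The same value would need to be set on every node, and a restart is probably required for it to take effect.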

                   

                  By the way, JBoss Messaging is now in maintenance mode, i.e. there are no more community releases (new releases appear only in EAP).

                   

                  I'd suggest you have a look at HornetQ, the successor to JBM, with richer functionality and better performance.

                  • 6. Clustering and failover question
                    massios

                    Hello Howard,

                     

                    We tried setting 'FailoverOnNodeLeave' to true and our problems were solved.

                     

                    I waited a while before marking this as answered because we wanted to confirm that it was indeed solved and because we ran into some further problems. After setting 'FailoverOnNodeLeave' to true, we still occasionally had problems with the ClusterPullConnectionFactory after many restarts. It was not working properly and was not pulling messages from servers that were up again.

                     

                    The problem was solved by ignoring an earlier recommendation

                     

                    Yong Hao Gao wrote:

                     

                    The 'ClusterPullConnectionFactory' doesn't need its 'SupportsFailover' and 'SupportsLoadBalancing' set to 'true'. They need to be 'false'.

                     

                    and setting 'SupportsFailover' and 'SupportsLoadBalancing' to 'true' on the 'ClusterPullConnectionFactory'.

                     

                    But basically our problems are solved now.

                     

                    Thanks,

                     

                    Nikos