6 Replies Latest reply on Jan 16, 2012 4:21 AM by sri4jb4rel

    JBoss Clustering 5.1 (Discovery Interrupt Failure)

    sri4jb4rel

      I am setting up JBoss clustering with the community version 5.1.0, following the instructions at http://docs.jboss.org/jbossclustering/cluster_guide/5.1/html-single/index.html. I deployed a 2-node cluster twice: first with both nodes on a single machine, then across two machines. In both cases I get the error below continuously after the servers have started. I can reach the applications through each node's standalone HTTP URL, but because of this error I cannot tell whether my instances are really part of the cluster.

       

      Moreover, I never see anything in the server logs showing that two nodes have joined my custom cluster.

       

      My questions are:

      1) Has the cluster formed properly?

      2) How can I check whether an instance is part of a custom cluster, and see its member nodes? Is there a way to do this in the JMX console?

      3) How can I verify whether multicast is working properly?

       

      The error in both server logs is:

      ##########################################################################################################################

      2012-01-11 11:52:25,370 ERROR [org.jgroups.protocols.MPING] (Timer-2,10.185.254.80:7901) failed sending discovery request

      java.io.InterruptedIOException: operation interrupted

              at java.net.PlainDatagramSocketImpl.send(Native Method)

              at java.net.DatagramSocket.send(DatagramSocket.java:625)

              at org.jgroups.protocols.MPING.sendMcastDiscoveryRequest(MPING.java:341)

              at org.jgroups.protocols.PING.sendGetMembersRequest(PING.java:259)

              at org.jgroups.protocols.Discovery$PingSenderTask$1.run(Discovery.java:407)

              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)

              at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:280)

              at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:135)

              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:65)

              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:146)

              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:170)

              at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651)

              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676)

              at java.lang.Thread.run(Thread.java:637)

      ##########################################################################################################################

       

      When I checked the IP and port combination, I found that it is not related to my cluster. See the GMS output below; it belongs to Messaging:

      ##########################################################################################################################

      ---------------------------------------------------------

      GMS: address is 10.185.254.80:7901 (cluster=MessagingPostOffice-DATA)

      ---------------------------------------------------------

      2012-01-11 11:16:08,061 INFO  [org.jboss.jms.server.connectionfactory.ConnectionFactory] (main) Connector bisocket://10.185.254.80:4557 has leasing enabled, lease period 10000 milliseconds

      2012-01-11 11:16:08,061 INFO  [org.jboss.jms.server.connectionfactory.ConnectionFactory] (main) org.jboss.jms.server.connectionfactory.ConnectionFactory@15d8350 started

      2012-01-11 11:16:08,072 INFO  [org.jboss.jms.server.connectionfactory.ConnectionFactory] (main) Connector bisocket://10.185.254.80:4557 has leasing enabled, lease period 10000 milliseconds

      ##########################################################################################################################

       

       

       

      With respect to my cluster 'DEVCLUST', I don't see different nodes participating. Its related logs are below:

      ##########################################################################################################################

      "2012-01-11 11:15:26,436 INFO  [org.jboss.mail.MailService] (main) Mail Service bound to java:/Mail

      2012-01-11 11:15:28,692 INFO  [org.jboss.jmx.adaptor.snmp.agent.SnmpAgentService] (main) SNMP agent going active

      2012-01-11 11:15:35,855 INFO  [org.jboss.ha.framework.interfaces.HAPartition.DEVCLUST] (main) Initializing partition DEVCLUST

      2012-01-11 11:15:35,966 INFO  [STDOUT] (JBoss System Threads(1)-3)

      ---------------------------------------------------------

      GMS: address is 10.185.254.80:58862 (cluster=DEVCLUST)

      ---------------------------------------------------------

      2012-01-11 11:15:36,335 INFO  [org.jboss.cache.jmx.PlatformMBeanServerRegistration] (main) JBossCache MBeans were successfully registered to the platform mbean server.

      2012-01-11 11:15:36,502 INFO  [STDOUT] (main)

      ---------------------------------------------------------

      GMS: address is 10.185.254.80:58862 (cluster=DEVCLUST-HAPartitionCache)

      ---------------------------------------------------------

      2012-01-11 11:15:38,065 INFO  [org.jboss.ha.framework.interfaces.HAPartition.DEVCLUST] (JBoss System Threads(1)-3) Number of cluster members: 1

      2012-01-11 11:15:38,065 INFO  [org.jboss.ha.framework.interfaces.HAPartition.DEVCLUST] (JBoss System Threads(1)-3) Other members: 0

      2012-01-11 11:15:38,513 INFO  [org.jboss.cache.RPCManagerImpl] (main) Received new cluster view: [10.185.254.80:58862|0] [10.185.254.80:58862]

      2012-01-11 11:15:38,521 INFO  [org.jboss.cache.RPCManagerImpl] (main) Cache local address is 10.185.254.80:58862

      2012-01-11 11:15:38,529 INFO  [org.jboss.cache.RPCManagerImpl] (main) state was retrieved successfully (in 2.02 seconds)

      2012-01-11 11:15:38,625 INFO  [org.jboss.cache.factories.ComponentRegistry] (main) JBoss Cache version: JBossCache 'Cascabel' 3.1.0.GA

      2012-01-11 11:15:38,626 INFO  [org.jboss.ha.framework.interfaces.HAPartition.DEVCLUST] (main) Fetching serviceState (will wait for 30000 milliseconds):

      2012-01-11 11:15:38,631 INFO  [org.jboss.ha.framework.interfaces.HAPartition.DEVCLUST] (main) State could not be retrieved (we are the first member in group)

      2012-01-11 11:15:38,796 INFO  [org.jboss.ha.jndi.HANamingService] (main) Started HAJNDI bootstrap; jnpPort=1200, backlog=50, bindAddress=/10.185.254.80

      2012-01-11 11:15:38,816 INFO  [org.jboss.ha.jndi.DetachedHANamingService$AutomaticDiscovery] (main) Listening on /10.185.254.80:1102, group=239.255.100.101, HA-JNDI address=10.185.254.80:1200

      2012-01-11 11:15:42,220 INFO  [org.jboss.invocation.unified.server.UnifiedInvokerHA] (main) Service name is jboss:service=invoker,type=unifiedha

      2012-01-11 11:15:45,498 WARN  [org.jboss.jms.server.jbosssx.JBossASSecurityMetadataStore] (main) WARNING! POTENTIAL SECURITY RISK. It has been detected that the MessageSucker component which sucks messages from one node to another has not had its password changed from the installation default. Please see the JBoss Messaging user guide for instructions on how to do this.

      2012-01-11 11:15:45,575 WARN  [org.jboss.annotation.factory.AnnotationCreator] (main) No ClassLoader provided, using TCCL: org.jboss.managed.api.annotation.ManagementComponent

      2012-01-11 11:15:46,041 WARN  [org.jboss.annotation.factory.AnnotationCreator] (main) No ClassLoader provided, using TCCL: org.jboss.managed.api.annotation.ManagementComponent

      2012-01-11 11:15:46,246 INFO  [com.arjuna.ats.jbossatx.jta.TransactionManagerService] (main) JBossTS Transaction Service (JTA version - tag:JBOSSTS_4_6_1_GA) - JBoss Inc.

      2012-01-11 11:15:46,247 INFO  [com.arjuna.ats.jbossatx.jta.TransactionManagerService] (main) Setting up property manager MBean and JMX layer

      2012-01-11 11:15:46,966 INFO  [com.arjuna.ats.jbossatx.jta.TransactionManagerService] (main) Initializing recovery manager

      2012-01-11 11:15:47,352 INFO  [com.arjuna.ats.jbossatx.jta.TransactionManagerService] (main) Recovery manager configured

      2012-01-11 11:15:47,353 INFO  [com.arjuna.ats.jbossatx.jta.TransactionManagerService] (main) Binding TransactionManager JNDI Reference

      2012-01-11 11:15:47,463 INFO  [com.arjuna.ats.jbossatx.jta.TransactionManagerService] (main) Starting transaction recovery manager"

      ##########################################################################################################################

       

       

      This is how I started the two nodes on a single machine:

      ./run_a.sh -c MYCLUSTER -g DEVCLUST -u 239.255.100.100 -b 10.185.254.80 -Djboss.messaging.ServerPeerID=1 -Djboss.service.binding.set=ports-default &

      ./run_b.sh -c MYCLUSTER2 -g DEVCLUST -u 239.255.100.101 -b 10.185.254.80 -Djboss.messaging.ServerPeerID=2 -Djboss.service.binding.set=ports-01 &

       

      In the setup on different machines I omitted the -Djboss.service.binding.set option.

       

      Configuration I changed:

      1) ../server/<node1>/deploy/messaging/messaging-service.xml

           Set the ServerPeerID to the value used on the run.sh command line.

       

      2) ../server/<node1>/deploy/jbossweb.sar/service.xml

           In the Engine XML element, specified jvmRoute="MYCLUSTER".

        • 1. Re: JBoss Clustering 5.1 (Discovery Interrupt Failure)
          wdfink

          I don't understand your start commands.

          What do MYCLUSTER and DEVCLUST mean?

          Also, if you use different multicast addresses with the -u option, the nodes will not see each other.

          • 2. Re: JBoss Clustering 5.1 (Discovery Interrupt Failure)
            wdfink

            You must use the same multicast address and partition name if you want the nodes to form a cluster.

            The bind address must be different (on the same machine), or you have to use port binding.
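
To make the rule above concrete, here is a minimal sketch in Java that compares the run.sh options of two nodes. The class name `StartOptionsCheck` and its methods are hypothetical helpers written for this thread, not part of JBoss; the sketch only looks at `-g`, `-u`, `-b` and `jboss.service.binding.set`, and it assumes both nodes run on the same machine when it checks for a port clash.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper: checks two run.sh command lines against the clustering rule.
class StartOptionsCheck {

    // Collects "-x value" pairs and "-Dname=value" system properties from a command line.
    static Map<String, String> parse(String commandLine) {
        Map<String, String> options = new HashMap<>();
        String[] tokens = commandLine.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            String token = tokens[i];
            int eq = token.indexOf('=');
            if (token.startsWith("-D") && eq > 2) {
                options.put(token.substring(2, eq), token.substring(eq + 1));
            } else if (token.length() == 2 && token.charAt(0) == '-' && i + 1 < tokens.length) {
                options.put(token, tokens[++i]);
            }
        }
        return options;
    }

    // Returns a description of every rule the two command lines violate (empty if none).
    static List<String> problems(String nodeA, String nodeB) {
        Map<String, String> a = parse(nodeA);
        Map<String, String> b = parse(nodeB);
        List<String> found = new ArrayList<>();
        if (!Objects.equals(a.get("-g"), b.get("-g"))) {
            found.add("partition names differ: " + a.get("-g") + " vs " + b.get("-g"));
        }
        if (!Objects.equals(a.get("-u"), b.get("-u"))) {
            found.add("multicast addresses differ: " + a.get("-u") + " vs " + b.get("-u"));
        }
        // Assumption: both nodes are on one machine, so they need different
        // bind addresses or different port sets.
        boolean sameBind = Objects.equals(a.get("-b"), b.get("-b"));
        boolean samePorts = Objects.equals(a.get("jboss.service.binding.set"),
                                           b.get("jboss.service.binding.set"));
        if (sameBind && samePorts) {
            found.add("same bind address and same port set");
        }
        return found;
    }

    public static void main(String[] args) {
        // The two start commands from the original post.
        String nodeA = "./run_a.sh -c MYCLUSTER -g DEVCLUST -u 239.255.100.100 -b 10.185.254.80 "
                + "-Djboss.messaging.ServerPeerID=1 -Djboss.service.binding.set=ports-default";
        String nodeB = "./run_b.sh -c MYCLUSTER2 -g DEVCLUST -u 239.255.100.101 -b 10.185.254.80 "
                + "-Djboss.messaging.ServerPeerID=2 -Djboss.service.binding.set=ports-01";
        for (String problem : problems(nodeA, nodeB)) {
            System.out.println(problem);
        }
    }
}
```

Run against the two commands from the original post, the only rule it flags is the differing `-u` value; the partition name matches and the port sets already differ.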

             

            1+2)

            If nodes join a cluster you will see messages like 'New members' and 'All members' on all nodes. (If you start only one node, it shows the message 'Number of cluster members: 1' and its own data.)

            You can also check via the JMX console: look for the server-cluster beans, where you will find the same information as in the log messages.
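
As a rough way to answer questions 1 and 2 from the log alone, the sketch below scans a server log for the membership line that the HAPartition logger writes (the format is taken from the log excerpt in the original post). `ClusterViewLog` is a hypothetical helper name, and the regular expression assumes each log entry sits on a single line, as it does in an unwrapped server.log.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reports the last logged member count per partition.
class ClusterViewLog {

    // Matches e.g. "[...HAPartition.DEVCLUST] (...) Number of cluster members: 1".
    private static final Pattern MEMBERS = Pattern.compile(
            "\\[org\\.jboss\\.ha\\.framework\\.interfaces\\.HAPartition\\.([^\\]]+)\\]"
            + ".*Number of cluster members:\\s*(\\d+)");

    // Later lines overwrite earlier ones, so the map holds the most recent count.
    static Map<String, Integer> latestMemberCounts(List<String> logLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = MEMBERS.matcher(line);
            if (m.find()) {
                counts.put(m.group(1), Integer.parseInt(m.group(2)));
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ClusterViewLog <path-to-server.log>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (Map.Entry<String, Integer> e : latestMemberCounts(lines).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue() + " member(s)");
        }
    }
}
```

A count of 1 for DEVCLUST on every node, as in the posted log, means each node is running in a cluster of its own.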

             

            3)

            See the wiki for a simple test: https://community.jboss.org/wiki/TestingJBoss
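
For readers who want to see what such a multicast test boils down to, here is a minimal sender/receiver sketch that uses only the JDK. It is a stand-in, not the test from the wiki page. The class name `McastCheck`, the command-line layout, and the TTL of 2 are assumptions; pick any free UDP port, and use the same `-u` address your nodes share.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.UnknownHostException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: start "receive" on one host, then "send" from another.
class McastCheck {

    // True only for addresses in the multicast range (224.0.0.0 to 239.255.255.255 for IPv4).
    static boolean isMulticast(String address) {
        try {
            return InetAddress.getByName(address).isMulticastAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    // Joins the group and prints every datagram that arrives, until interrupted.
    static void receive(InetAddress group, int port) throws IOException {
        try (MulticastSocket socket = new MulticastSocket(port)) {
            // A null interface lets the JDK pick the default multicast interface.
            socket.joinGroup(new InetSocketAddress(group, port), null);
            byte[] buffer = new byte[1024];
            while (true) {
                DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
                socket.receive(packet);
                String text = new String(packet.getData(), 0, packet.getLength(),
                        StandardCharsets.UTF_8);
                System.out.println(packet.getAddress().getHostAddress() + " -> " + text);
            }
        }
    }

    // Sends one datagram to the group.
    static void send(InetAddress group, int port, String text) throws IOException {
        try (MulticastSocket socket = new MulticastSocket()) {
            socket.setTimeToLive(2); // assumption: small TTL, enough for one router hop
            byte[] payload = text.getBytes(StandardCharsets.UTF_8);
            socket.send(new DatagramPacket(payload, payload.length, group, port));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3 || !isMulticast(args[1])) {
            System.err.println("usage: java McastCheck send|receive <multicast-address> <port>");
            return;
        }
        InetAddress group = InetAddress.getByName(args[1]);
        int port = Integer.parseInt(args[2]);
        if ("receive".equals(args[0])) {
            receive(group, port);
        } else {
            send(group, port, "hello from McastCheck");
        }
    }
}
```

If the receiver on the second machine prints nothing while the sender runs, multicast traffic is probably not getting through between the two hosts. On a multi-homed host the default interface may not be the one JBoss is bound to, so a silent receiver is a hint rather than proof.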

            • 3. Re: JBoss Clustering 5.1 (Discovery Interrupt Failure)
              sri4jb4rel

              Thanks for the response, Fink. Your suggestion worked when both nodes are on the same physical IP. When I tried adding another node on a different physical machine to the same cluster, with the same multicast address (-u), node3 runs as a separate node and is not part of the cluster formed by the other 2 nodes.

               

              Moreover, the discovery error still occurs in the MessagingPostOffice-DATA cluster. Note that this happens a few minutes after startup. Am I doing something wrong?

              2012-01-12 10:34:15,550 ERROR [org.jgroups.protocols.MPING] (Timer-3,10.185.254.80:7900) failed sending discovery request
              java.io.InterruptedIOException: operation interrupted
                      at java.net.PlainDatagramSocketImpl.send(Native Method)
                      at java.net.DatagramSocket.send(DatagramSocket.java:625)
                      at org.jgroups.protocols.MPING.sendMcastDiscoveryRequest(MPING.java:341)
                      at org.jgroups.protocols.PING.sendGetMembersRequest(PING.java:259)
                      at org.jgroups.protocols.Discovery$PingSenderTask$1.run(Discovery.java:407)
                      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
                      at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:280)
                      at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:135)
                      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:65)
                      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:146)
                      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:170)
                      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651)
                      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676)
                      at java.lang.Thread.run(Thread.java:637)
              2012-01-12 10:35:17,660 ERROR [org.jgroups.protocols.MPING] (Timer-4,10.185.254.80:7900) failed sending discovery request
              java.io.InterruptedIOException: operation interrupted
                      at java.net.PlainDatagramSocketImpl.send(Native Method)
                      at java.net.DatagramSocket.send(DatagramSocket.java:625)
                      at org.jgroups.protocols.MPING.sendMcastDiscoveryRequest(MPING.java:341)
                      at org.jgroups.protocols.PING.sendGetMembersRequest(PING.java:259)
                      at org.jgroups.protocols.Discovery$PingSenderTask$1.run(Discovery.java:407)
                      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
                      at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:280)
                      at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:135)
                      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:65)
                      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:146)
                      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:170)
                      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651)
                      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676)
                      at java.lang.Thread.run(Thread.java:637)
              2012-01-12 10:36:19,770 ERROR [org.jgroups.protocols.MPING] (Timer-4,10.185.254.80:7900) failed sending discovery request
              java.io.InterruptedIOException: operation interrupted
                      at java.net.PlainDatagramSocketImpl.send(Native Method)
                      at java.net.DatagramSocket.send(DatagramSocket.java:625)
                      at org.jgroups.protocols.MPING.sendMcastDiscoveryRequest(MPING.java:341)
                      at org.jgroups.protocols.PING.sendGetMembersRequest(PING.java:259)
                      at org.jgroups.protocols.Discovery$PingSenderTask$1.run(Discovery.java:407)
                      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
                      at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:280)
                      at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:135)
                      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:65)
                      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:146)
                      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:170)
                      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651)
                      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676)
                      at java.lang.Thread.run(Thread.java:637)

               

              • 4. Re: JBoss Clustering 5.1 (Discovery Interrupt Failure)
                sri4jb4rel

                Hi Fink, please ignore the clustering issue across different physical machines that I mentioned earlier. That was because the machines were in different subnets and multicast was not working between them.

                 

                But the timer thread still continuously logs the discovery failure error mentioned in my earlier post. Any suggestion on how to avoid that error? Will it cause any issues?

                • 5. Re: JBoss Clustering 5.1 (Discovery Interrupt Failure)
                  wdfink

                  Have you checked with the JGroups test?

                  I remember seeing such an issue on a misconfigured network.

                  • 6. Re: JBoss Clustering 5.1 (Discovery Interrupt Failure)
                    sri4jb4rel

                    Hi Fink,

                    Thanks for your response. I have tested with the code from the URL you mentioned. This discovery issue is not caused by the cluster: I also see it when I run a JBoss instance standalone with the "all" configuration.