8 Replies Latest reply on Jul 27, 2012 6:02 AM by rajendra.atmuri

    jboss5.1 cluster on Solaris :  MPING Exception

    zsobolsky

      Hi,

       

      I have one JBoss instance that I start with "run.sh -c all -b $hostname"

      Nothing is deployed or modified in the configuration; it is just an off-the-shelf JBoss.

       

      HP server : Startup is ok

      10:23:50,533 INFO [ServerImpl] JBoss (Microcontainer) [5.1.0.GA (build: SVNTag=JBoss_5_1_0_GA date=200905221053)] Started in 3m:24s:907ms

       

      Solaris server :

      A few minutes after startup, I get the following exception. It recurs every 1 or 2 minutes.

       

      10:30:50,014 INFO [ServerImpl] JBoss (Microcontainer) [5.1.0.GA (build: SVNTag=JBoss_5_1_0_GA date=200905221053)] Started in 2m:21s:346ms

       

      10:35:36,516 ERROR [MPING] failed sending discovery request

      java.io.InterruptedIOException: operation interrupted

      at java.net.PlainDatagramSocketImpl.send(Native Method)

      at java.net.DatagramSocket.send(DatagramSocket.java:612)

      at org.jgroups.protocols.MPING.sendMcastDiscoveryRequest(MPING.java:341)

      at org.jgroups.protocols.PING.sendGetMembersRequest(PING.java:259)

      at org.jgroups.protocols.Discovery$PingSenderTask$1.run(Discovery.java:407)

      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)

      at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:280)

      at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:135)

      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:65)

      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:146)

      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:170)

      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651)

      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676)

      at java.lang.Thread.run(Thread.java:595)

       

      10:41:33,519 ERROR [MPING] failed sending discovery request

      10:44:32,016 ERROR [MPING] failed sending discovery request

      10:45:31,515 ERROR [MPING] failed sending discovery request

      10:46:31,013 ERROR [MPING] failed sending discovery request

      10:47:30,512 ERROR [MPING] failed sending discovery request

      10:49:29,510 ERROR [MPING] failed sending discovery request

      10:51:28,508 ERROR [MPING] failed sending discovery request

      10:53:27,518 ERROR [MPING] failed sending discovery request

      10:54:27,014 ERROR [MPING] failed sending discovery request

      10:55:26,513 ERROR [MPING] failed sending discovery request

      10:56:26,012 ERROR [MPING] failed sending discovery request

      10:58:25,009 ERROR [MPING] failed sending discovery request

       

      If I redo this test under the same conditions with jboss-4.2.3:

      HP-UX is OK

      Solaris is OK !!!

       

      How can I get rid of the MPING exception in jboss5.1 for Solaris?

       

      thx

       

       

       

        • 1. Re: jboss5.1 cluster on Solaris :  MPING Exception
          brian.stansberry

          Interesting. There was an earlier post about a similar condition -- see my earlier comment at http://community.jboss.org/message/6747#6747

           

          What's somewhat different in your case is the problem is repeating. But it doesn't happen 100% of the time -- MERGE2 schedules a fixed delay task to send a discovery message, and your log messages aren't happening with fixed spacing. It is happening often though.
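To make the spacing concrete, here is a minimal sketch (plain Java, not part of JGroups) that computes the gaps between the ERROR [MPING] timestamps copied from the first post. The class and method names are made up for illustration. The gaps come out as roughly whole multiples of the shortest one, which would fit a periodic task that fails on only some of its runs, though that reading is an inference from the log rather than something confirmed in the thread.

```java
import java.util.Arrays;

class MpingLogGaps {
    // ERROR [MPING] timestamps copied from the log in the first post.
    static final String[] TIMESTAMPS = {
        "10:35:36,516", "10:41:33,519", "10:44:32,016", "10:45:31,515",
        "10:46:31,013", "10:47:30,512", "10:49:29,510", "10:51:28,508",
        "10:53:27,518", "10:54:27,014", "10:55:26,513", "10:56:26,012",
        "10:58:25,009"
    };

    // Converts "HH:mm:ss,SSS" to milliseconds since midnight.
    static long toMillis(String ts) {
        String[] p = ts.split("[:,]");
        long hours = Long.parseLong(p[0]);
        long minutes = Long.parseLong(p[1]);
        long seconds = Long.parseLong(p[2]);
        long millis = Long.parseLong(p[3]);
        return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis;
    }

    // Returns the gap in milliseconds between each pair of consecutive timestamps.
    static long[] gapsMillis(String[] timestamps) {
        long[] gaps = new long[timestamps.length - 1];
        for (int i = 1; i < timestamps.length; i++) {
            gaps[i - 1] = toMillis(timestamps[i]) - toMillis(timestamps[i - 1]);
        }
        return gaps;
    }

    public static void main(String[] args) {
        long[] gaps = gapsMillis(TIMESTAMPS);
        long shortest = Arrays.stream(gaps).min().getAsLong();
        for (long gap : gaps) {
            // Express each gap as a rounded multiple of the shortest gap.
            long multiple = Math.round((double) gap / shortest);
            System.out.printf("%d ms (~%d x shortest gap)%n", gap, multiple);
        }
    }
}
```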

           

          JGroups should probably treat this InterruptedIOException differently from other IOExceptions, perhaps log it at WARN without a stack trace. I'll file a JIRA to do that, but first I want to explore more why this happens.
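A minimal sketch of the kind of handling suggested above, assuming a hypothetical `Sender` callback in place of the real socket call. It is not the JGroups code and not necessarily how any eventual fix looks; it only shows catching `InterruptedIOException` ahead of the more general `IOException` so the two can be logged at different levels.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.logging.Level;
import java.util.logging.Logger;

class DiscoverySendSketch {
    private static final Logger LOG = Logger.getLogger("MPING-sketch");

    /** Hypothetical stand-in for the multicast socket send. */
    interface Sender {
        void send() throws IOException;
    }

    /** Attempts the send and returns a label for how the outcome was handled. */
    static String sendDiscoveryRequest(Sender sender) {
        try {
            sender.send();
            return "sent";
        } catch (InterruptedIOException e) {
            // Interrupted send: WARN, message only, no stack trace.
            LOG.warning("discovery request interrupted: " + e.getMessage());
            return "warn";
        } catch (IOException e) {
            // Any other I/O failure: keep the ERROR-level log with the stack trace.
            LOG.log(Level.SEVERE, "failed sending discovery request", e);
            return "error";
        }
    }

    public static void main(String[] args) {
        System.out.println(sendDiscoveryRequest(() -> {
            throw new InterruptedIOException("operation interrupted");
        }));
    }
}
```

The order of the catch clauses matters: `InterruptedIOException` is a subclass of `IOException`, so it has to be caught first.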

           

           

          1) You say 4.2.3 is OK. But nothing in 4.2.3 by default uses MPING. MPING comes into AS 5.x via one of the JGroups channels used by the messaging server. Have you changed the 4.2.3 configuration so it uses a JGroups channel with MPING? Clarifying this prevents us from going down the wrong path of looking for differences in JGroups between the version used in 4.2.3 and the one in 5.1.0.

           

          2) Have you adjusted the JGroups protocol stack configurations? If so, what's the value of the MPING protocol's "timeout" attribute on any configuration that's using MPING?

           

          3) Is your server under load, particularly load that generates lots of clustering traffic? I'm wondering if the processing of scheduled tasks is being delayed.

           

          4) If the answer to question 3 is yes, was the HP-UX machine that didn't have this problem also under equivalent load?

          • 2. Re: jboss5.1 cluster on Solaris :  MPING Exception
            zsobolsky

            Hi

             

            In response to your questions:

             

            1) You say 4.2.3 is OK. But nothing in 4.2.3 by default uses MPING. MPING comes into AS 5.x via one of the JGroups channels used by the messaging server. Have you changed the 4.2.3 configuration so it uses a JGroups channel with MPING? Clarifying this prevents us from going down the wrong path of looking for differences in JGroups between the version used in 4.2.3 and the one in 5.1.0.

            I didn't know that 4.2.3 doesn't use MPING, but my jboss4 uses the off-the-shelf configuration. I didn't change a thing.

             

            2) Have you adjusted the JGroups protocol stack configurations? If so, what's the value of the MPING protocol's "timeout" attribute on any configuration that's using MPING?

            No, my jboss5 uses the off-the-shelf configuration. I didn't change a thing.

             

            3) Is your server under load, particularly load that generates lots of clustering traffic? I'm wondering if the processing of scheduled tasks is being delayed.

            No, JBoss is the only app running on this server (it's a new server). Attached you will find the stdout of probe.sh.

             

             

            PS. Don't hesitate to ask for more info.

            • 3. Re: jboss5.1 cluster on Solaris :  MPING Exception
              brian.stansberry

              Thanks for the responses. I've opened https://jira.jboss.org/jira/browse/JGRP-1161

               

              I don't see any probe.sh output attached. Could you attach it to the JIRA?

               

              Are you using clustered JBoss Messaging? If not we can explore workarounds to eliminate the usage of MPING. (Even if you're using clustered JBM we can do that; the workaround is just simpler if you're not.)

               

              Also, please confirm that you're not seeing any unusual (i.e. WARN or ERROR) logging from anything other than MPING. The AS opens other channels that periodically send multicast messages; if MPING is the only place that is logging, that helps shift suspicion away from a general problem with sending multicast and more toward how MPING specifically does it.

              • 4. Re: jboss5.1 cluster on Solaris :  MPING Exception
                zsobolsky

                I don't see any probe.sh output attached. Could you attach it to the JIRA?

                 

                Ok, done.

                 

                Are you using clustered JBoss Messaging? If not we can explore workarounds to eliminate the usage of MPING. (Even if you're using clustered JBM we can do that; the workaround is just simpler if you're not.)

                 

                No, only SLSB.

                 

                Also, please confirm that you're not seeing any unusual (i.e. WARN or ERROR) logging from anything other than MPING. The AS opens other channels that periodically send multicast messages; if MPING is the only place that is logging, that helps shift suspicion away from a general problem with sending multicast and more toward how MPING specifically does it.

                 

                 

                Messages other than the MPING ERROR: see the attached server.log.

                • 5. Re: jboss5.1 cluster on Solaris :  MPING Exception
                  brian.stansberry

                  Thanks, nothing in that server.log is relevant (which is what I expected). And that's helpful information as it tells me the 3 other JGroups channels that the AS "all" config starts by default aren't reporting problems. Those use a slightly different approach to send multicast discovery messages (PING protocol + UDP protocol instead of just MPING protocol).

                   

                  As to the workaround, since you aren't using clustered JBoss Messaging, you can make this problem go away by switching JBM to non-clustered operation. To do that, edit the <somename>-persistence-service.xml file in server/all/deploy/messaging and, in the jboss.messaging:service=PostOffice mbean configuration:

                   

                  • turn off clustering:
                  <attribute name="Clustered">false</attribute>  
                  • delete or comment out the dependency on the JGroups channel factory
                  <depends optional-attribute-name="ChannelFactoryName">jboss.jgroups:service=ChannelFactory</depends>

                   

                  The "<somename>" depends on whether you've changed the standard JBM config to persist to some DB other than HSQL. The default file is hsqldb-persistence-service.xml.

                  • 6. Re: jboss5.1 cluster on Solaris :  MPING Exception
                    zsobolsky

                    The MPING Exception has disappeared from the logging.

                    Thanks for the quick response and workaround.
                    • 7. Re: jboss5.1 cluster on Solaris :  MPING Exception
                      bogdan.tindeche

                      Hi,

                       

                      I have the same problem described in this thread, but I use clustered JBoss Messaging.

                      Can you share the workaround for this case?

                       

                      Thanks

                      • 8. Re: jboss5.1 cluster on Solaris :  MPING Exception
                        rajendra.atmuri

                        Hi Zac,

                         

                        I have faced a similar issue with a JBoss 5.1.0 cluster on Solaris Zones. After reading this discussion and going through https://issues.jboss.org/browse/JGRP-1161 , I downloaded the latest jgroups.jar (2.6.15) and applied it on both JBoss nodes. I restarted the servers, but the problem is still not resolved for me.

                         

                        On the same Solaris Zone machines, I set up a JBoss 4.2.3 cluster and didn't see the error mentioned in the post.

                         

                        Can you please help me out?

                         

                        Thanks