5 Replies Latest reply: Mar 1, 2011 10:22 AM by Wolf-Dieter Fink

JBoss instance halts in cluster

Muhammad Irfan Masood Newbie

Hi,

 

Can someone let me know the procedure for starting and stopping JBoss instances in a cluster? Last week I faced this issue: the two nodes in the cluster had been working fine for a long time, and then we applied a patch containing a change to a .class file. We restarted the master node after applying the patch, but it stopped receiving live traffic. We then stopped the child node to apply the patch; as soon as the child stopped, the master node started receiving traffic. Later, when we started the child node, it went into a halted state. We then stopped both nodes and started them at the same time, but still only one node received traffic and the other did not.

 

Please help me out with the valid procedure for starting nodes in a JBoss cluster. When we need to apply a patch containing changes to binaries, how should we stop the instances and start them again?

 

Please note that we are not using farm deployment; we have deployed our application as an exploded .ear in the JBOSS_HOME/server/all/deploy directory. JBoss 3.2.5 is running on Windows Server 2008, and JBoss is installed as a Windows service using Java Wrapper. The JDK is 1.4.2.

 

Any help will be highly appreciated.

 

Best Regards!

 

Irfan

  • 1. JBoss instance halts in cluster
    Wolf-Dieter Fink Master

    Did you check multicast?

    see http://community.jboss.org/wiki/JGroups

    Especially http://community.jboss.org/wiki/TestingJBoss

     

    Do the cluster nodes find each other?

    Which instance is configured within the client for lookup?

    Does the behaviour change when you change the client's JNP configuration?
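
On the client-lookup question, here is a minimal sketch of a JNDI environment that names both nodes instead of one. It assumes HA-JNDI is enabled and listening on port 1100 (I believe that is the default in the "all" configuration, but check cluster-service.xml), and it borrows the host names sdes-app01 and sdes-app02 from the logs posted later in this thread. The class name `JnpClientEnv` is made up for illustration.

```java
import java.util.Properties;
import javax.naming.Context;

// Hypothetical helper; only the property keys and values matter.
final class JnpClientEnv {

    // Builds a JNDI environment whose provider URL lists every node,
    // so the lookup does not depend on a single instance being up.
    static Properties build(String[] hosts, int port) {
        StringBuilder url = new StringBuilder();
        for (int i = 0; i < hosts.length; i++) {
            if (i > 0) {
                url.append(',');
            }
            url.append(hosts[i]).append(':').append(port);
        }
        Properties env = new Properties();
        env.setProperty(Context.INITIAL_CONTEXT_FACTORY,
                "org.jnp.interfaces.NamingContextFactory");
        env.setProperty(Context.URL_PKG_PREFIXES,
                "org.jboss.naming:org.jnp.interfaces");
        env.setProperty(Context.PROVIDER_URL, url.toString());
        return env;
    }

    public static void main(String[] args) {
        // Host names and port are assumptions, see the note above.
        Properties env = build(new String[] {"sdes-app01", "sdes-app02"}, 1100);
        System.out.println(env.getProperty(Context.PROVIDER_URL));
    }
}
```

The resulting properties would be passed to `new InitialContext(env)` with the JBoss client jars on the classpath. If the client currently points at a single host and port, that alone could explain why only one instance ever receives traffic.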

  • 2. JBoss instance halts in cluster
    Muhammad Irfan Masood Newbie

    Thanks for your answer.

     

    Please note that after a few tries at restarting both instances at the same time, it worked by luck and both instances started receiving traffic. So there seems to be some specific procedure for starting / restarting instances.

     

    I mean its behaviour is unpredictable: sometimes it starts correctly and both instances receive traffic, and sometimes one of the instances halts.

     

    Please help me out. Thanks.

     

    Irfan.

  • 3. JBoss instance halts in cluster
    Muhammad Irfan Masood Newbie

    Below are the logs from when it was working fine (instance 2 logs):

     

    2011-02-15 00:00:01,501 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to sdes-app01:60528 (additional data: 16 bytes) (own address=SDES-APP02:55268 (additional data: 16 bytes))

    2011-02-15 00:00:01,501 DEBUG [org.jgroups.protocols.UDP] sending message to sdes-app01:60528 (additional data: 16 bytes) (src=SDES-APP02:55268 (additional data: 16 bytes)), headers are {FD=[FD: heartbeat], UDP=[UDP:group_addr=DefaultPartition]}

    2011-02-15 00:00:01,501 DEBUG [org.jgroups.protocols.UDP] received (ucast) 166 bytes from /10.3.21.111:60528

    2011-02-15 00:00:01,501 DEBUG [org.jgroups.protocols.UDP] message is [dst: SDES-APP02:55268 (additional data: 16 bytes), src: sdes-app01:60528 (additional data: 16 bytes) (2 headers), size = 0 bytes], headers are {FD=[FD: heartbeat ack], UDP=[UDP:group_addr=DefaultPartition]}

    2011-02-15 00:00:01,501 DEBUG [org.jgroups.protocols.FD] received ack from sdes-app01:60528 (additional data: 16 bytes)

    2011-02-15 00:00:01,564 DEBUG [org.jgroups.protocols.UDP] received (ucast) 133 bytes from /10.3.21.111:60528

    2011-02-15 00:00:01,564 DEBUG [org.jgroups.protocols.UDP] message is [dst: SDES-APP02:55268 (additional data: 16 bytes), src: sdes-app01:60528 (additional data: 16 bytes) (2 headers), size = 0 bytes], headers are {FD=[FD: heartbeat], UDP=[UDP:group_addr=DefaultPartition]}

    2011-02-15 00:00:01,564 DEBUG [org.jgroups.protocols.UDP] sending message to sdes-app01:60528 (additional data: 16 bytes) (src=SDES-APP02:55268 (additional data: 16 bytes)), headers are {FD=[FD: heartbeat ack], UDP=[UDP:group_addr=DefaultPartition]}

    2011-02-15 00:00:04,013 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to sdes-app01:60528 (additional data: 16 bytes) (own address=SDES-APP02:55268 (additional data: 16 bytes))

    2011-02-15 00:00:04,013 DEBUG [org.jgroups.protocols.UDP] sending message to sdes-app01:60528 (additional data: 16 bytes) (src=SDES-APP02:55268 (additional data: 16 bytes)), headers are {FD=[FD: heartbeat], UDP=[UDP:group_addr=DefaultPartition]}

    2011-02-15 00:00:04,013 DEBUG [org.jgroups.protocols.UDP] received (ucast) 166 bytes from /10.3.21.111:60528

    2011-02-15 00:00:04,013 DEBUG [org.jgroups.protocols.UDP] message is [dst: SDES-APP02:55268 (additional data: 16 bytes), src: sdes-app01:60528 (additional data: 16 bytes) (2 headers), size = 0 bytes], headers are {FD=[FD: heartbeat ack], UDP=[UDP:group_addr=DefaultPartition]}

    2011-02-15 00:00:04,013 DEBUG [org.jgroups.protocols.FD] received ack from sdes-app01:60528 (additional data: 16 bytes)

    2011-02-15 00:00:04,060 DEBUG [org.jgroups.protocols.UDP] received (ucast) 133 bytes from /10.3.21.111:60528

     

    And afterwards, when it did not work in the cluster (instance 1):

     

    2011-02-22 23:58:44,956 DEBUG [org.jgroups.protocols.MERGE2] initial_mbrs=[[own_addr=SDES-APP01:63710 (additional data: 16 bytes), coord_addr=SDES-APP01:63710 (additional data: 16 bytes)]]

    2011-02-22 23:58:59,969 DEBUG [org.jgroups.protocols.PING] FIND_INITIAL_MBRS

    2011-02-22 23:58:59,969 DEBUG [org.jgroups.protocols.PING] waiting for initial members: time_to_wait=2000, got 0 rsps

    2011-02-22 23:58:59,969 DEBUG [org.jgroups.protocols.UDP] sending message to 228.1.2.3:45566 (src=SDES-APP01:63710 (additional data: 16 bytes)), headers are {PING=[PING: type=GET_MBRS_REQ, arg=null], UDP=[UDP:group_addr=DefaultPartition]}

    2011-02-22 23:58:59,969 DEBUG [org.jgroups.protocols.UDP] received (mcast) 120 bytes from /10.3.21.111:63710 (size=120 bytes)

    2011-02-22 23:58:59,969 DEBUG [org.jgroups.protocols.UDP] message is [dst: 228.1.2.3:45566, src: SDES-APP01:63710 (additional data: 16 bytes) (2 headers), size = 0 bytes], headers are {PING=[PING: type=GET_MBRS_REQ, arg=null], UDP=[UDP:group_addr=DefaultPartition]}

    2011-02-22 23:58:59,969 DEBUG [org.jgroups.protocols.PING] received GET_MBRS_REQ from SDES-APP01:63710 (additional data: 16 bytes), returning [PING: type=GET_MBRS_RSP, arg=[own_addr=SDES-APP01:63710 (additional data: 16 bytes), coord_addr=SDES-APP01:63710 (additional data: 16 bytes)]]

    2011-02-22 23:58:59,985 DEBUG [org.jgroups.protocols.UDP] sending message to SDES-APP01:63710 (additional data: 16 bytes) (src=SDES-APP01:63710 (additional data: 16 bytes)), headers are {PING=[PING: type=GET_MBRS_RSP, arg=[own_addr=SDES-APP01:63710 (additional data: 16 bytes), coord_addr=SDES-APP01:63710 (additional data: 16 bytes)]], UDP=[UDP:group_addr=DefaultPartition]}

    2011-02-22 23:58:59,985 DEBUG [org.jgroups.protocols.UDP] received (ucast) 314 bytes from /10.3.21.111:63710

    2011-02-22 23:58:59,985 DEBUG [org.jgroups.protocols.UDP] message is [dst: SDES-APP01:63710 (additional data: 16 bytes), src: SDES-APP01:63710 (additional data: 16 bytes) (2 headers), size = 0 bytes], headers are {PING=[PING: type=GET_MBRS_RSP, arg=[own_addr=SDES-APP01:63710 (additional data: 16 bytes), coord_addr=SDES-APP01:63710 (additional data: 16 bytes)]], UDP=[UDP:group_addr=DefaultPartition]}

    2011-02-22 23:58:59,985 DEBUG [org.jgroups.protocols.PING] received FIND_INITAL_MBRS_RSP, rsp=[own_addr=SDES-APP01:63710 (additional data: 16 bytes), coord_addr=SDES-APP01:63710 (additional data: 16 bytes)]

    2011-02-22 23:58:59,985 DEBUG [org.jgroups.protocols.PING] waiting for initial members: time_to_wait=1984, got 1 rsps

    2011-02-22 23:59:01,997 DEBUG [org.jgroups.protocols.PING] initial mbrs are [[own_addr=SDES-APP01:63710 (additional data: 16 bytes), coord_addr=SDES-APP01:63710 (additional data: 16 bytes)]]

    2011-02-22 23:59:01,997 DEBUG [org.jgroups.protocols.MERGE2] initial_mbrs=[[own_addr=SDES-APP01:63710 (additional data: 16 bytes), coord_addr=SDES-APP01:63710 (additional data: 16 bytes)]]

    2011-02-22 23:59:21,922 DEBUG [org.jgroups.protocols.PING] FIND_INITIAL_MBRS

     

    Any idea whether something is wrong in the logs?

     

    Best Regards!

     

    Irfan
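
One thing the second excerpt does show: every GET_MBRS_RSP that instance 1 receives comes from its own address, and its "initial mbrs" list contains only SDES-APP01:63710. The following is a minimal sketch for checking that mechanically across a larger log file. It assumes the line format shown above; the class name `InitialMembers` is made up for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading JGroups PING debug lines.
final class InitialMembers {

    // Captures the host:port that follows each "own_addr=" entry.
    private static final Pattern OWN_ADDR =
            Pattern.compile("own_addr=([^\\s,\\]]+)");

    // Returns the distinct member addresses named in one log line,
    // lower-cased because the logs mix SDES-APP01 and sdes-app01.
    static Set<String> members(String logLine) {
        Set<String> found = new LinkedHashSet<String>();
        Matcher m = OWN_ADDR.matcher(logLine);
        while (m.find()) {
            found.add(m.group(1).toLowerCase(Locale.ROOT));
        }
        return found;
    }

    public static void main(String[] args) {
        // Line copied from the instance 1 excerpt above.
        String line = "2011-02-22 23:59:01,997 DEBUG [org.jgroups.protocols.PING] "
                + "initial mbrs are [[own_addr=SDES-APP01:63710 (additional data: 16 bytes), "
                + "coord_addr=SDES-APP01:63710 (additional data: 16 bytes)]]";
        Set<String> seen = members(line);
        System.out.println(seen.size() + " member(s): " + seen);
    }
}
```

A single member here means the node found nobody but itself during discovery, which is consistent with it acting as its own coordinator in a one-node cluster.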

  • 4. JBoss instance halts in cluster
    Muhammad Irfan Masood Newbie

    I ran the special test given below:

     

    Special http://community.jboss.org/wiki/TestingJBoss

     

    The nodes in the cluster are not discovering each other; on both nodes the view lists only one node, i.e. itself.

     

    Any idea what could be wrong, or where to look for possible problems?
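
One way to narrow this down is to test raw multicast between the two hosts with JBoss stopped, so that JGroups is out of the picture. The following is a minimal sketch, assuming the group address 228.1.2.3 and port 45566 that appear in the UDP log lines above. The class name `McastProbe` and its `send` / `listen` arguments are made up for illustration, and it sticks to java.net calls that should be available on JDK 1.4.

```java
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;

// Hypothetical stand-alone probe; not part of JBoss or JGroups.
final class McastProbe {
    // Group address and port copied from the UDP lines in the logs above.
    static final String GROUP = "228.1.2.3";
    static final int PORT = 45566;

    // True if the given literal is a multicast (class D) address.
    static boolean isMulticast(String address) {
        try {
            return InetAddress.getByName(address).isMulticastAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1 || !("send".equals(args[0]) || "listen".equals(args[0]))) {
            System.out.println("usage: java McastProbe send|listen");
            return;
        }
        InetAddress group = InetAddress.getByName(GROUP);
        MulticastSocket socket = new MulticastSocket(PORT);
        try {
            if ("send".equals(args[0])) {
                byte[] data = ("probe from "
                        + InetAddress.getLocalHost().getHostName()).getBytes();
                socket.send(new DatagramPacket(data, data.length, group, PORT));
                System.out.println("sent probe to " + GROUP + ":" + PORT);
            } else {
                socket.joinGroup(group);     // subscribe to the group
                socket.setSoTimeout(30000);  // give up after 30 seconds
                byte[] buf = new byte[512];
                DatagramPacket packet = new DatagramPacket(buf, buf.length);
                try {
                    socket.receive(packet);
                    System.out.println("received \""
                            + new String(packet.getData(), 0, packet.getLength())
                            + "\" from " + packet.getAddress());
                } catch (SocketTimeoutException e) {
                    System.out.println("nothing received within 30 s");
                }
            }
        } finally {
            socket.close();
        }
    }
}
```

Run `java McastProbe listen` on one host, then `java McastProbe send` on the other, and repeat in the opposite direction. If the listener times out, that would point at the network, the Windows firewall, or the interface chosen for multicast rather than at the JBoss start order.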

  • 5. JBoss instance halts in cluster
    Wolf-Dieter Fink Master

    Are you finished with this thread, given the new thread you started: http://community.jboss.org/thread/163240