0 Replies Latest reply on Nov 23, 2010 2:05 PM by manohar.parelly

    JBoss node hangs frequently

    manohar.parelly

      Dear All,

      We have a problem in our production environment: one of our 4 JBoss nodes hangs frequently.

      We have 4 JBoss nodes clustered across 2 physical servers. They ran without problems for almost 2 years, until about 4 months ago.

      After we start the 4 nodes one after the other, they run fine for some time (sometimes 10 days, sometimes 2 days).
      Then, all of a sudden, one of the JBoss nodes hangs and stops receiving requests while the other nodes keep working. If we identify the hanging node immediately and restart it, everything works fine. Otherwise it gradually affects the other 3 nodes and causes them to hang as well.

      Normally the thread count does not exceed 200 on any node, but when a node hangs, the thread dump shows more than 500 threads.
      I have attached the thread dump. Please check it.
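
      To catch the growth in thread count before a node hangs completely, something like the following could be used. This is only a minimal sketch, not what we currently run: the class name ThreadCountCheck and the threshold of 200 are assumptions taken from the numbers above, and it only inspects the JVM it runs in, so it would have to be deployed inside the JBoss node (or adapted to a remote JMX connection) to be useful.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper: reports the live thread count of the current JVM,
// grouped by thread state, and flags counts above a chosen threshold.
class ThreadCountCheck {

    // Assumed threshold: the post says a healthy node stays at or below 200 threads.
    static final int WARN_THRESHOLD = 200;

    // Counts threads per state; null entries (threads that ended meanwhile) are skipped.
    static Map<Thread.State, Integer> countByState(ThreadInfo[] infos) {
        Map<Thread.State, Integer> counts =
                new EnumMap<Thread.State, Integer>(Thread.State.class);
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue;
            }
            Thread.State state = info.getThreadState();
            Integer old = counts.get(state);
            counts.put(state, old == null ? 1 : old + 1);
        }
        return counts;
    }

    static boolean exceedsThreshold(int liveThreads) {
        return liveThreads > WARN_THRESHOLD;
    }

    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        int live = mx.getThreadCount();
        ThreadInfo[] infos = mx.getThreadInfo(mx.getAllThreadIds());
        System.out.println("live threads: " + live + " by state: " + countByState(infos));
        if (exceedsThreshold(live)) {
            System.out.println("WARNING: thread count above " + WARN_THRESHOLD);
        }
    }
}
```

      A large number of threads in BLOCKED or WAITING state would be consistent with request threads piling up behind a lock, but the sketch itself does not prove that.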

      We found the exception below in our server.log file just before the JBoss node goes down.
      Please let me know if you need any more information, and please suggest how we can resolve the issue.


      java.lang.RuntimeException: JBossCacheService: exception occurred in cache get after retry ...
      at org.jboss.web.tomcat.service.session.JBossCacheWrapper.get(JBossCacheWrapper.java:94)
      at org.jboss.web.tomcat.service.session.JBossCacheService.loadSession(JBossCacheService.java:251)
      at org.jboss.web.tomcat.service.session.JBossCacheManager.loadSession(JBossCacheManager.java:1010)
      at org.jboss.web.tomcat.service.session.JBossCacheManager.findSession(JBossCacheManager.java:782)
      at org.apache.catalina.connector.Request.doGetSession(Request.java:2283)
      at org.apache.catalina.connector.Request.getSession(Request.java:2075)
      at org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:99)
      at org.jboss.web.tomcat.service.session.ClusteredSessionValve.invoke(ClusteredSessionValve.java:87)
      at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:84)
      at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
      at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
      at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:157)
      at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
      at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:262)
      at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
      at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
      at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:446)
      at java.lang.Thread.run(Thread.java:595)
      Caused by: org.jboss.cache.lock.TimeoutException: failure acquiring lock: fqn=/JSESSION/localhost/atheebweb/MhPmOOs88X4+gQY0qxzIHA**, caller=GlobalTransaction:<172.16.64.21:7811>:678309, lock=write owner=Thread[IncomingMessageHandler (channel=Tomcat-AtheebWeb),5,JGroups threads] (activeReaders=0, activeWriter=Thread[IncomingMessageHandler (channel=Tomcat-AtheebWeb),5,JGroups threads], waitingReaders=0, waitingWriters=0, waitingUpgrader=0)
      at org.jboss.cache.Node.acquire(Node.java:500)
      at org.jboss.cache.interceptors.PessimisticLockInterceptor.acquireNodeLock(PessimisticLockInterceptor.java:381)
      at org.jboss.cache.interceptors.PessimisticLockInterceptor.lock(PessimisticLockInterceptor.java:309)
      at org.jboss.cache.interceptors.PessimisticLockInterceptor.invoke(PessimisticLockInterceptor.java:183)
      at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
      at org.jboss.cache.interceptors.UnlockInterceptor.invoke(UnlockInterceptor.java:32)
      at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
      at org.jboss.cache.interceptors.ReplicationInterceptor.invoke(ReplicationInterceptor.java:39)
      at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
      at org.jboss.cache.interceptors.TxInterceptor.handleNonTxMethod(TxInterceptor.java:365)
      at org.jboss.cache.interceptors.TxInterceptor.invoke(TxInterceptor.java:160)
      at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
      at org.jboss.cache.interceptors.CacheMgmtInterceptor.invoke(CacheMgmtInterceptor.java:138)
      at org.jboss.cache.TreeCache.invokeMethod(TreeCache.java:5877)
      at org.jboss.cache.TreeCache.get(TreeCache.java:3641)
      at org.jboss.cache.TreeCache.get(TreeCache.java:3622)
      at org.jboss.cache.TreeCache.get(TreeCache.java:3418)
      at sun.reflect.GeneratedMethodAccessor143.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:585)
      at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
      at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
      at org.jboss.mx.server.Invocation.invoke(Invocation.java:86)
      at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
      at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
      at org.jboss.mx.util.MBeanProxyExt.invoke(MBeanProxyExt.java:210)
      at $Proxy443.get(Unknown Source)
      at org.jboss.web.tomcat.service.session.JBossCacheWrapper.get(JBossCacheWrapper.java:78)
      ... 17 more
      Caused by: org.jboss.cache.lock.TimeoutException: read lock for /JSESSION/localhost/atheebweb/MhPmOOs88X4+gQY0qxzIHA** could not be acquired by GlobalTransaction:<172.16.64.21:7811>:678309 after 15000 ms. Locks: Read lock owners: []
      Write lock owner: Thread[IncomingMessageHandler (channel=Tomcat-AtheebWeb),5,JGroups threads]
      , lock info: write owner=Thread[IncomingMessageHandler (channel=Tomcat-AtheebWeb),5,JGroups threads] (activeReaders=0, activeWriter=Thread[IncomingMessageHandler (channel=Tomcat-AtheebWeb),5,JGroups threads], waitingReaders=0, waitingWriters=0, waitingUpgrader=0)
      at org.jboss.cache.lock.IdentityLock.acquireReadLock(IdentityLock.java:262)
      at org.jboss.cache.Node.acquireReadLock(Node.java:512)
      at org.jboss.cache.Node.acquire(Node.java:474)