2 Replies Latest reply on Jul 21, 2015 10:34 AM by saranya.guna

    Infinispan cache UDP

    saranya.guna

      Hi All,

       

      We faced the issue below in our production environment.

       

      When the application tried to acquire a lock on the cache, we got a replication timeout exception.

       

      We have four nodes in a clustered setup (node1, node2, node3 and node4). Node3 went down and came back up within two minutes. When the node went down, the message below was received only by the coordinator node, i.e. node1.

       

      16:56:24,584 DEBUG [org.jgroups.protocols.pbcast.NAKACK2] (Incoming-6,shared=udp) removed node3:server/test from xmit_table (not member anymore)

       

      The above message was not received on node2 and node4. We got the error below on the coordinator node.

       

      16:56:29,584 WARN  [org.jgroups.protocols.pbcast.GMS] (ViewHandler,test,node1:server/test) node1:server/test: failed to collect all ACKs (expected=3) for view [node1:server/test|4] after 5000ms, missing ACKs from [node2:server/test, node4:server/test]

       

       

      So, when the node came back up, the messages below were received only by the coordinator.

       

      16:57:57,642 DEBUG [org.jgroups.protocols.pbcast.GMS] (Incoming-12,shared=udp) node1:server/test installing view [node1:server/test|5] [node1:server/test,node2:server/test, node4:server/test, node3:server/test]

      16:57:57,642 DEBUG [org.jgroups.protocols.FD_SOCK] (Incoming-12,shared=udp) VIEW_CHANGE received: [node1:server/test, node2:server/test, node4:server/test, node3:server/test]

       

      While checking the logs on node2, I could see that FD detected through missing heartbeat messages that node3 had gone down.

       

      16:56:59,764 DEBUG [org.jgroups.protocols.FD] (Timer-3,shared=udp) node2:server/test: received no heartbeat from node3:server/test for 5 times (30000 milliseconds), suspecting it
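To compare what each node actually saw, the GMS lines quoted above can be pulled out of every node's log with a small script. This is only a minimal sketch and not part of the original setup: the function names are hypothetical, and the regular expressions are assumed to match only the two message formats shown in this post.

```python
import re

# Assumption: matches GMS lines shaped like
# "... installing view [node1:server/test|5] [node1:server/test,node2:server/test, ...]"
VIEW_RE = re.compile(
    r"installing view \[(?P<coord>[^|\]]+)\|(?P<id>\d+)\]\s*\[(?P<members>[^\]]*)\]"
)

# Assumption: matches GMS warnings shaped like
# "... failed to collect all ACKs (expected=3) for view [node1:server/test|4]
#  after 5000ms, missing ACKs from [node2:server/test, node4:server/test]"
ACK_RE = re.compile(
    r"failed to collect all ACKs \(expected=(?P<expected>\d+)\) "
    r"for view \[(?P<coord>[^|\]]+)\|(?P<id>\d+)\] after (?P<ms>\d+)ms, "
    r"missing ACKs from \[(?P<missing>[^\]]*)\]"
)


def _split_members(text):
    """Split a comma-separated member list and trim whitespace."""
    return [member.strip() for member in text.split(",") if member.strip()]


def installed_views(lines):
    """Return (view_id, members) for every 'installing view' line found."""
    views = []
    for line in lines:
        match = VIEW_RE.search(line)
        if match:
            views.append((int(match.group("id")), _split_members(match.group("members"))))
    return views


def missing_acks(lines):
    """Return (view_id, members_without_ack) for every 'failed to collect all ACKs' line."""
    result = []
    for line in lines:
        match = ACK_RE.search(line)
        if match:
            result.append((int(match.group("id")), _split_members(match.group("missing"))))
    return result
```

Running `installed_views(open(path))` against each node's log file and comparing the view ids would show whether node2 and node4 ever installed the views that node1 logged.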

       

      Below are the cache configurations:

       

      <cache-container name="test" aliases="test" default-cache="test">

                          <transport lock-timeout="60000"/>

                          <replicated-cache name="test" mode="SYNC" start="EAGER" batching="true">

                              <transaction mode="NON_XA" locking="PESSIMISTIC"/>

                              <locking isolation="READ_COMMITTED" striping="false" acquire-timeout="600000"/>

                          </replicated-cache>

      </cache-container>

       

      <subsystem xmlns="urn:jboss:domain:jgroups:1.1" default-stack="udp">

                      <stack name="udp">

                          <transport type="UDP" socket-binding="jgroups-udp"/>

                          <protocol type="PING"/>

                          <protocol type="MERGE3"/>

                          <protocol type="FD_SOCK" socket-binding="jgroups-udp-fd"/>

                          <protocol type="FD"/>

                          <protocol type="VERIFY_SUSPECT"/>

                          <protocol type="pbcast.NAKACK"/>

                          <protocol type="UNICAST2"/>

                          <protocol type="pbcast.STABLE"/>

                          <protocol type="pbcast.GMS"/>

                          <protocol type="UFC"/>

                          <protocol type="MFC"/>

                          <protocol type="FRAG2"/>

                          <protocol type="RSVP"/>

                      </stack>

      </subsystem>

        • 1. Re: Infinispan cache UDP
          wdfink

          It seems that you might have an issue with UDP (network) or the configuration.

           

          You might check with the JGroups tools whether it works correctly. See https://developer.jboss.org/wiki/TestingJBoss

          This should not be related to Infinispan. Besides that, note that the Infinispan subsystem is not recommended for caches that are accessed directly by an application.

          You should add the Infinispan modules to the server and use them to access application caches. See the Infinispan documentation for this.
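For the UDP check suggested above, a plain multicast send/receive test that does not involve JGroups can help narrow down whether the network drops multicast traffic between nodes. The following is a rough sketch using only the Python standard library. The group address and port are assumptions (they are not given in this thread) and should be replaced with the values from the `jgroups-udp` socket binding; the function names are hypothetical.

```python
import socket

GROUP = "230.0.0.4"  # assumption: replace with the multicast address of the jgroups-udp binding
PORT = 45688         # assumption: replace with the multicast port of the jgroups-udp binding


def membership_request(group, interface="0.0.0.0"):
    """Build the 8-byte ip_mreq structure: group address followed by local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def receive(group=GROUP, port=PORT):
    """Join the multicast group and print every datagram received (runs until interrupted)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership_request(group))
    while True:
        data, sender = sock.recvfrom(1024)
        print(f"received {data!r} from {sender[0]}:{sender[1]}")


def send(message, group=GROUP, port=PORT, ttl=2):
    """Send one datagram to the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    sock.sendto(message.encode("utf-8"), (group, port))
    sock.close()
```

Calling `receive()` on node2 and node4 and then `send("hello from node1")` on node1 would show whether datagrams from the coordinator reach the other members. If they do not arrive, the problem is more likely in the network or interface binding than in the cache configuration.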

          • 2. Re: Infinispan cache UDP
            saranya.guna

            DEBUG [org.jgroups.protocols.pbcast.GMS] (Incoming-12,shared=udp) node1:server/test installing view [node1:server/test|5] [node1:server/test,node2:server/test, node4:server/test, node3:server/test]

             

            Does it work correctly only when this "installing view" message is present on all nodes?