0 Replies Latest reply on Aug 16, 2010 7:08 PM by snacker

    Delayed Replication

    snacker

      We occasionally see replication between nodes take as long as 500 ms.

       

      Usually it is "immediate", but occasionally it takes 70 ms, and we have seen it take up to 500 ms.

       

      The behavior we expect is that when an EJB request to server 111 completes, all of the cache data that was changed has already been replicated to server 222 before the EJB call returns to the client.

       

      We have a monitoring application which constantly checks the JBoss instances.

       

      So here is the process:

       

      Step  Server      Action
      1     Server 111  Monitor invokes EJB to set a NEW cache value for key ABC123.
      2     Server 111  Monitor invokes EJB to read the value for key ABC123 and verifies it is the same value it set.
      3     Server 222  Monitor invokes EJB to read the value for key ABC123 and verifies it is the same value that was sent to server 111.
      4     Server 222  If the value is not the same, retry for a maximum of 5 seconds or until it gets the correct value.
      5     Server 111  Monitor invokes EJB to remove key ABC123.
      6     Server 111  Monitor invokes EJB to make sure key ABC123 has been removed.
      7     Server 222  Monitor invokes EJB to make sure key ABC123 has been removed from this server as well.
      8     Server 222  If the value still exists, retry for a maximum of 5 seconds or until key ABC123 is found to have been removed.
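
      To make the check concrete, here is a rough sketch of the monitor loop in Java. It is not our actual monitor code: CacheClient is a hypothetical stand-in for the remote EJB on each server, InMemoryClient exists only so the sketch runs on its own, and the 10 ms poll interval is an assumption (only the 5 second limit comes from the steps above).

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class ReplicationProbe {

    /** Hypothetical stand-in for the remote EJB that fronts one server's cache. */
    public interface CacheClient {
        void put(String key, String value);
        String get(String key);   // returns null when the key is absent
        void remove(String key);
    }

    /** Local fake used only to make this sketch runnable without a cluster. */
    public static class InMemoryClient implements CacheClient {
        private final Map<String, String> data = new ConcurrentHashMap<>();
        public void put(String key, String value) { data.put(key, value); }
        public String get(String key) { return data.get(key); }
        public void remove(String key) { data.remove(key); }
    }

    /**
     * Polls until read.get() equals expected (null means "key removed").
     * Returns the elapsed milliseconds, or -1 if timeoutMs passed first.
     */
    public static long await(Supplier<String> read, String expected,
                             long timeoutMs, long pollMs) throws InterruptedException {
        long start = System.nanoTime();
        while (true) {
            boolean matched = Objects.equals(read.get(), expected);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            if (matched) {
                return elapsedMs;
            }
            if (elapsedMs >= timeoutMs) {
                return -1;
            }
            Thread.sleep(pollMs);
        }
    }

    /** Runs steps 1-8 and returns { put lag ms, remove lag ms } as seen on server 222. */
    public static long[] runCheck(CacheClient server111, CacheClient server222,
                                  String key, String value) throws InterruptedException {
        server111.put(key, value);                                          // step 1
        if (!value.equals(server111.get(key))) {                            // step 2
            throw new IllegalStateException("write not visible on server 111");
        }
        long putLag = await(() -> server222.get(key), value, 5_000, 10);    // steps 3-4
        server111.remove(key);                                              // step 5
        if (server111.get(key) != null) {                                   // step 6
            throw new IllegalStateException("remove not visible on server 111");
        }
        long removeLag = await(() -> server222.get(key), null, 5_000, 10);  // steps 7-8
        return new long[] { putLag, removeLag };
    }

    public static void main(String[] args) throws InterruptedException {
        // Both "servers" share one fake here, so the reported lag is not meaningful;
        // against a real cluster each client would talk to a different node.
        CacheClient shared = new InMemoryClient();
        long[] lags = runCheck(shared, shared, "ABC123", "value-1");
        System.out.println("put lag ms: " + lags[0] + ", remove lag ms: " + lags[1]);
    }
}
```

      A result of -1 means the change never showed up on server 222 within the 5 second window; any positive value is the delay we are asking about.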

       

      Most of the time this works correctly.

      However, we see delays at steps 3-4 and 7-8, where the change has not been replicated to server 222 until some time afterwards.

       

      Are there any settings that would guarantee the cache data has been replicated to the other nodes (barring an error) before the client transaction completes and control returns to the client?

       

      I have attached an example of our cache configuration.

      We have many caches; each uses a different cluster name and port number.