2 Replies Latest reply on Feb 4, 2013 6:54 AM by james-muso

    Intermittent cache invalidation failures

    james-muso

      Hi, I'm getting regular but intermittent invalidation failures from my distributed JBoss Cache / Hibernate setup, and I'm hoping someone can give me a few pointers on how to track down the problem.

       

      I can easily recreate the problem in our live environment by updating a cached entity on one node and then checking the same entity on another node. Approximately 50% of the time the cache entry hasn't been invalidated on the second node, so the older version of the entity is shown.

       

      I'm using a TCP-based JGroups stack (this is on EC2, so I can't use UDP multicast), and there are currently 8 nodes in the cache cluster.

       

      Debug logs aren't showing any messages that would indicate a problem, and the logs are identical whether the invalidation succeeds or fails. Can anyone suggest suitable log settings that might show something more useful? (Switching everything to trace level output far too much information to make sense of.)

       

      Thanks in advance for any help or suggestions!

       

       

       

      JBoss Cache v3.1.0

      JGroups v2.6.7

      Hibernate v3.3.2

       

       

      The debug logs I get on the node that is updating the entity are as follows:

       

      23 Jan 2013 17:04:38,536 [http-bio-8443-exec-8] DEBUG InvalidationInterceptor:244 - Is a CRUD method

      23 Jan 2013 17:04:38,550 [http-bio-8443-exec-8] DEBUG InvalidationInterceptor:244 - Is a CRUD method

      23 Jan 2013 17:04:38,550 [http-bio-8443-exec-8] DEBUG InvalidationInterceptor:381 - Cache [XX.XX.XX.XX:6800] replicating InvalidateCommand{fqn=/mbCache/configData/ENTITY/com.package.EntityName#4881}

       

       

       

      My cache settings are:

       

          <!-- A config appropriate for entity/collection caching that

               uses pessimistic locking -->

          <cache-config name="pessimistic-entity">

       

              <!-- Node locking scheme -->

              <attribute name="NodeLockingScheme">PESSIMISTIC</attribute>

       

              <!--

                  READ_COMMITTED is as strong as necessary for most

                  2nd Level Cache use cases.

              -->

              <attribute name="IsolationLevel">READ_COMMITTED</attribute>

       

              <!-- Mode of communication with peer caches.

             

                   INVALIDATION_SYNC is highly recommended as the mode for use

                   with entity and collection caches.

              -->

              <attribute name="CacheMode">INVALIDATION_SYNC</attribute>

       

              <!-- Name of cluster. Needs to be the same for all members, in order

                   to find each other -->

              <attribute name="ClusterName">pessimistic-entity</attribute>

             

              <!-- Use a TCP based stack (UDP multicast is not available

                   on EC2). -->

              <attribute name="MultiplexerStack">tcp</attribute>

       

              <!-- Whether or not to fetch state on joining a cluster. -->

              <attribute name="FetchInMemoryState">false</attribute>

       

              <!--

                The max amount of time (in milliseconds) we wait until the

                state (ie. the contents of the cache) are retrieved from

                existing members at startup. Ignored if FetchInMemoryState=false.

              -->

              <attribute name="StateRetrievalTimeout">20000</attribute>

       

              <!--

                  Number of milliseconds to wait until all responses for a

                  synchronous call have been received.

              -->

              <attribute name="SyncReplTimeout">20000</attribute>

       

              <!-- Max number of milliseconds to wait for a lock acquisition -->

              <attribute name="LockAcquisitionTimeout">15000</attribute>

       

             <!--

                Indicate whether to use marshalling or not. Set this to true if you

                are running under a scoped class loader, e.g., inside an application

                server.

             -->

             <attribute name="UseRegionBasedMarshalling">true</attribute>

             <!-- Must match the value of "useRegionBasedMarshalling" -->

             <attribute name="InactiveOnStartup">true</attribute>

        • 1. Re: Intermittent cache invalidation failures
          hebergentilin

          I don't know if this will help you, but try uncommenting this tag:

           

          <Valve className="org.apache.catalina.valves.RequestDumperValve" />

           

          in the server.xml file.

          There are other Valve tags in there that may help too. Try uncommenting some of them.

          • 2. Re: Intermittent cache invalidation failures
            james-muso

            Thanks - I'll give that a try.

             

            After adding some trace-level logging, I discovered that the Hibernate query cache was sending invalidation messages for all Hibernate entities (not just my cached config data objects), which produced a huge volume of invalidation messages. Deactivating the query cache seems to have solved the problem, so it looks like the failures were caused by the very high frequency of invalidation messages.
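
            For anyone hitting the same thing, here is a minimal sketch of the two changes described above. It assumes log4j 1.2 with an XML configuration and a hibernate.cfg.xml; neither is confirmed in this thread. The logger names are assumptions as well: the InvalidationInterceptor package is inferred from the class name in the debug output quoted earlier, and org.hibernate.cache is simply the package that holds Hibernate's cache classes. These are not necessarily the exact settings used here.

```xml
<!-- log4j.xml fragment (assumed logging setup): raise only the
     cache-related loggers to TRACE instead of switching everything -->
<logger name="org.jboss.cache.interceptors.InvalidationInterceptor">
    <level value="TRACE"/>
</logger>
<logger name="org.hibernate.cache">
    <level value="TRACE"/>
</logger>
```

```xml
<!-- hibernate.cfg.xml fragment (assumed config file), placed inside
     <session-factory>: turn off the query cache while leaving the
     entity second-level cache enabled -->
<property name="hibernate.cache.use_query_cache">false</property>
```

            Limiting TRACE to a couple of loggers keeps the output readable, and comparing the number of InvalidateCommand lines before and after disabling the query cache should show whether message volume really was the cause.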

             

            It seems odd that no error messages are being logged, so perhaps the errors are occurring at a higher level, in which case your suggestion of logging Tomcat errors would have helped!

             

            Cheers