14 Replies Latest reply on Mar 6, 2007 2:27 AM by msteiner

    Endless loop in JBoss Cache 1.4.1.GA

      My application works perfectly with 1.4.0.SP1 but not with 1.4.1.GA. I have a node " / a / b".
      Everything works fine until I call remove (TreeCacheMBean.remove(Fqn root);) and then put " / a / b" again (TreeCacheMBean.put(Fqn fqn, String key, Object object);). That is where the problem starts. I debugged a little and found the problematic place: org.jboss.cache.interceptors.PessimisticLockInterceptor, line 170:

      (...)
      if (createIfNotExists)
      {
          do
          {
              lock(fqn, ctx.getGlobalTransaction(), lock_type, recursive, zeroLockTimeout ? 0 : lock_timeout, createIfNotExists, storeLockedNode);
          }
          while (!cache.exists(fqn)); // keep trying until we have the lock (fixes concurrent remove())
          // terminates successfully, or with (Timeout)Exception
      }
      
      (...)
      

      cache.exists(fqn) always returns false because node "a" in fqn " / a / b" has been marked for removal (data={__JBOSS_MARKED_FOR_REMOVAL=null}),
      and org.jboss.cache.Node.getOrCreateChild(..) (line 322) does nothing with this node: it only creates a new node if children().get(child_name)==null, but this node is not null. It exists and carries the __JBOSS_MARKED_FOR_REMOVAL flag.
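
      The mechanism described above can be reproduced in a small standalone model. This is only a sketch: SimpleNode, exists() and attemptsUntilVisible() are hypothetical stand-ins written for illustration, not the real org.jboss.cache classes, and the rule that a marked node makes exists() return false is assumed from the behaviour reported in this post. The retry loop is bounded so the sketch terminates.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the cache tree; NOT the real org.jboss.cache.Node.
class MarkedNodeSketch {
    static final String REMOVAL_MARKER = "__JBOSS_MARKED_FOR_REMOVAL";

    static class SimpleNode {
        final Map<String, Object> data = new HashMap<>();
        final Map<String, SimpleNode> children = new HashMap<>();

        // As described in the post: a child is created only when it is absent,
        // so a child that is present but marked for removal is returned unchanged.
        SimpleNode getOrCreateChild(String name) {
            SimpleNode child = children.get(name);
            if (child == null) {
                child = new SimpleNode();
                children.put(name, child);
            }
            return child;
        }

        boolean isMarkedForRemoval() {
            return data.containsKey(REMOVAL_MARKER);
        }
    }

    // Assumed semantics: a path is visible only if every node on it
    // is present and not marked for removal.
    static boolean exists(SimpleNode root, String... path) {
        SimpleNode current = root;
        for (String name : path) {
            current = current.children.get(name);
            if (current == null || current.isMarkedForRemoval()) {
                return false;
            }
        }
        return true;
    }

    // Bounded version of the do/while loop: returns the number of attempts
    // needed, or -1 if the path never became visible.
    static int attemptsUntilVisible(SimpleNode root, int maxAttempts, String... path) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            SimpleNode current = root;
            for (String name : path) {
                current = current.getOrCreateChild(name);
            }
            if (exists(root, path)) {
                return attempt;
            }
        }
        return -1;
    }

    // Builds / a / b and then marks "a" for removal without detaching it.
    static SimpleNode treeWithMarkedParent() {
        SimpleNode root = new SimpleNode();
        SimpleNode a = root.getOrCreateChild("a");
        a.getOrCreateChild("b");
        a.data.put(REMOVAL_MARKER, null);
        return root;
    }
}
```

      With a tree from treeWithMarkedParent(), attemptsUntilVisible(root, 1000, "a", "b") gives up and returns -1, because every pass finds the marked "a" and reuses it. On an empty tree the same call succeeds on the first attempt.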

      Why is node 'a' not removed, but only marked with __JBOSS_MARKED_FOR_REMOVAL?
      Is this connected with JBCACHE-871?

      My TransactionManager configuration:

      <attribute name="TransactionManagerLookupClass">
       org.jboss.cache.JBossTransactionManagerLookup
       </attribute>
      
      <attribute name="IsolationLevel">NONE</attribute>






        • 1. Re: Endless loop in JBoss Cache 1.4.1.GA

          Upgrading to 1.4.1.SP1 doesn't help. Does anyone have an idea where the problem could be? I can't believe that a production version has such a serious bug.

          • 2. Re: Endless loop in JBoss Cache 1.4.1.GA
            manik

            Thanks for spotting this. Yes, this is related to JBCACHE-871, because since fixing that, nodes aren't removed immediately but are marked as removed instead and are only cleaned up during tx commit.

            The bug is the result of a parent being removed in the same tx as a child is being implicitly created.

            http://jira.jboss.com/jira/browse/JBCACHE-974

            • 3. Re: Endless loop in JBoss Cache 1.4.1.GA

              Thanks for the reply. Are you sure this happens only when the delete and the add are in the same tx? My application listens on a socket: the server gets a first request that deletes the root node (with a stateless EJB), then after a while I make another socket request that puts a new node into the cache, and this causes the endless loop.

              • 4. Re: Endless loop in JBoss Cache 1.4.1.GA
                manik

                This should only happen within the same tx. The whole concept of marking nodes for removal does not exist if no txs are used.

                If things happen in different txs, e.g., remove in tx 1, and put in tx 2, the looping in tx 2 is *supposed* to happen until tx 1 commits and finishes the remove operation and actually removes nodes.

                If your app causes tx 1 to remove, suspends the tx, and then starts tx 2 with a put, then you have a deadlock because tx 2 will never come out of the loop until tx 1 finishes.
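
                The two-transaction case can be illustrated with a toy model. This is a sketch under stated assumptions, not JBoss Cache code: DeferredRemovalSketch and all of its methods are hypothetical, transactions are reduced to explicit removeInTx()/commitTx() calls, and the retry is capped at maxAttempts where the real loop is meant to end successfully or with a timeout.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of "mark on remove, clean up on commit"; not JBoss Cache code.
class DeferredRemovalSketch {
    private final Map<String, String> nodes = new HashMap<>(); // path -> value
    private final Set<String> markedForRemoval = new HashSet<>();

    void put(String path, String value) {
        nodes.put(path, value);
    }

    // Inside a transaction, remove only marks the node.
    void removeInTx(String path) {
        if (nodes.containsKey(path)) {
            markedForRemoval.add(path);
        }
    }

    // Commit performs the real removal and clears the markers.
    void commitTx() {
        for (String path : markedForRemoval) {
            nodes.remove(path);
        }
        markedForRemoval.clear();
    }

    boolean exists(String path) {
        return nodes.containsKey(path) && !markedForRemoval.contains(path);
    }

    // A second writer creates the node only if it is absent, then checks
    // whether it is visible; it gives up after maxAttempts.
    boolean putWithRetry(String path, String value, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            if (!nodes.containsKey(path)) {
                nodes.put(path, value);
            }
            if (exists(path)) {
                return true;
            }
        }
        return false;
    }
}
```

                In this model, putWithRetry() on a path that was passed to removeInTx() keeps failing until commitTx() has run, after which the same call succeeds. If the removing transaction is suspended and never commits, the second writer cannot make progress.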

                • 5. Re: Endless loop in JBoss Cache 1.4.1.GA

                  This is how I get the loop. I delete the root node:

                  15:24:25,709 DEBUG [TxInterceptor] local transaction exists - registering global tx if not present for Thread[pool-1-thread-3,5,jboss]
                  15:24:25,715 DEBUG [TxInterceptor] local transaction exists - registering global tx if not present for Thread[pool-1-thread-3,5,jboss]
                  15:24:25,716 DEBUG [TxInterceptor] Transaction TransactionImpl:XidImpl[FormatId=257, GlobalId=p_1.dom.pl/21, BranchQual=, localId=21] is already registered.
                  15:24:25,717 DEBUG [TxInterceptor] Running commit phase. One phase? true
                  15:24:25,717 DEBUG [TxInterceptor] Finished local commit/rollback method for GlobalTransaction:<192.168.24.106:34109>:2
                  15:24:25,717 DEBUG [TxInterceptor] Finished commit phase
                  15:24:25,740 DEBUG [BaseEvictionAlgorithm] processRemoveNodes(): Can't find node associated with fqn: /Could have been evicted earlier. Will just continue.
                  


                  Look at: Running commit phase

                  Now I don't have any nodes in the cache (according to the CacheMgmt MBean). I put a node in the cache:

                  15:26:22,976 DEBUG [TxInterceptor] local transaction exists - registering global tx if not present for Thread[pool-1-thread-5,5,jboss]
                  15:26:22,994 DEBUG [TxInterceptor] local transaction exists - registering global tx if not present for Thread[pool-1-thread-5,5,jboss]
                  15:26:22,994 DEBUG [TxInterceptor] Transaction TransactionImpl:XidImpl[FormatId=257, GlobalId=p_1.dom.pl/24, BranchQual=, localId=24] is already registered.
                  



                  Oops, the loop has started...

                  After a while:

                  15:31:23,069 WARN [TransactionImpl] Transaction TransactionImpl:XidImpl[FormatId=257, GlobalId=programista_1.gdynia.4drivers.pl/24, BranchQual=, localId=24] timed out. status=STATUS_ACTIVE
                  


                  The second transaction timed out.

                  • 6. Re: Endless loop in JBoss Cache 1.4.1.GA
                    manik

                    Do both the txs start on the same cache instance? Or are they on different instances in the cluster?

                    • 7. Re: Endless loop in JBoss Cache 1.4.1.GA

                      This is the same cache instance. I edited the hostname in only part of the log, sorry.

                      • 8. Re: Endless loop in JBoss Cache 1.4.1.GA
                        manik

                        Hi - sorry, have been out of the loop in this for a while - do you still see this problem?

                        • 9. Re: Endless loop in JBoss Cache 1.4.1.GA

                          Yes. I have tried 1.4.1.SP2 and the problem still exists.

                          • 10. Re: Endless loop in JBoss Cache 1.4.1.GA
                            manik

                            Have you got a unit test that recreates this? One that works on the cache directly, rather than the MBean?

                            • 11. Re: Endless loop in JBoss Cache 1.4.1.GA

                              Add this method to org.jboss.cache.transaction.TransactionTest from the cache 1.4.1.SP2 source distribution. It causes an endless loop:

                              public void testEndlessLoop() {
                                  try {
                                      Fqn root = new Fqn();
                                      Fqn fqn = new Fqn(root, 1L);

                                      // put for the first time
                                      tx.begin();
                                      this.cache.put(fqn, "k", "v");
                                      tx.commit();

                                      // get works fine
                                      tx.begin();
                                      assertEquals("v", this.cache.get(fqn, "k"));
                                      tx.commit();

                                      // remove everything
                                      tx.begin();
                                      this.cache.remove(root);
                                      tx.commit();

                                      // get returns null - ok
                                      // put - endless loop
                                      tx.begin();
                                      assertNull(this.cache.get(fqn, "k"));
                                      this.cache.put(fqn, "k", "v");
                                      tx.commit();
                                  } catch (Throwable t) {
                                      t.printStackTrace();
                                      fail(t.toString());
                                  }
                              }


                              • 12. Re: Endless loop in JBoss Cache 1.4.1.GA
                                manik

                                Thanks for this. Interestingly, this only happens when you try and remove the root node. Try changing it such that:

                                 Fqn root = Fqn.fromString("/my/SubRoot");
                                


                                and it works fine.

                                • 13. Re: Endless loop in JBoss Cache 1.4.1.GA
                                  manik

                                  See JBCACHE-999 - your bug got the lucky 999!

                                  • 14. Re: Endless loop in JBoss Cache 1.4.1.GA

                                     

                                    See JBCACHE-999 - your bug got the lucky 999!


                                    :-)


                                    Thanks for the quick response.