9 Replies Latest reply on Aug 20, 2012 9:04 AM by galder.zamarreno

    "Unable to acquire lock" under load

    ninethirty

      I think there's something I just don't understand here.  I'm testing Infinispan within a batch processing framework.  I'm not positive that heavy load is a factor here, but when running a job that takes a lot of CPU and a fair amount of memory churn, I'll consistently see "unable to acquire lock" exceptions in the log (full stack trace below).   I don't understand this because I'm running with just a two-node cluster, and each node is working on its own subset of keys.  So, there are never two nodes that are accessing the same key at the same time.  For that matter, there aren't even two threads within the same JVM  accessing the same key.  Why would this lock be held by another thread?

       

      My configuration is DIST_SYNC, without L1 enabled, and num_owners = 2 (so all keys should, in practice, be on both machines).  I've had to increase a couple of other timeouts in config so far:

       

config.setRehashRpcTimeout(30000);                     // rehash RPCs: 30s
config.setStateRetrievalTimeout(60, TimeUnit.SECONDS); // initial state transfer
config.setSyncReplTimeout(60, TimeUnit.SECONDS);       // synchronous replication calls
gc.setDistributedSyncTimeout(60000);                   // global (transport-level) sync: 60s
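
For reference, the distribution settings described above are set along these lines (I'm paraphrasing my actual code here, so take the exact setter names as approximate):

config.setCacheMode(Configuration.CacheMode.DIST_SYNC); // distributed, synchronous
config.setNumOwners(2);          // with 2 nodes, every key lives on both
config.setL1CacheEnabled(false); // no L1 (near) cache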

       

Everything else is default. I first saw this on 4.2.0, but have since upgraded to 4.2.1.CR3 with no luck.

      The exception is:

       

org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [SYS_P68762] for requestor [Thread[OOB-395,Infinispan-Cluster,kdr-devb-978,5,Thread Pools]]! Lock held by [(another thread)]
    at org.infinispan.container.EntryFactoryImpl.acquireLock(EntryFactoryImpl.java:228)
    at org.infinispan.interceptors.CacheLoaderInterceptor.loadIfNeeded(CacheLoaderInterceptor.java:132)
    at org.infinispan.interceptors.CacheLoaderInterceptor.visitPutKeyValueCommand(CacheLoaderInterceptor.java:79)
    at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:80)
    at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
    at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132)
    at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:58)
    at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:80)
    at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
    at org.infinispan.interceptors.TxInterceptor.enlistWriteAndInvokeNext(TxInterceptor.java:184)
    at org.infinispan.interceptors.TxInterceptor.visitPutKeyValueCommand(TxInterceptor.java:132)
    at org.infinispan.interceptors.DistTxInterceptor.visitPutKeyValueCommand(DistTxInterceptor.java:79)
    at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:80)
    at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
    at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:95)
    at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:62)
    at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:58)
    at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:80)
    at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:273)
    at org.infinispan.commands.remote.BaseRpcInvokingCommand.processVisitableCommand(BaseRpcInvokingCommand.java:49)
    at org.infinispan.commands.remote.SingleRpcCommand.perform(SingleRpcCommand.java:44)
    at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:142)
    at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithWaitForBlocks(InboundInvocationHandlerImpl.java:156)
    at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithRetry(InboundInvocationHandlerImpl.java:266)
    at org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:129)
    at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommand(CommandAwareRpcDispatcher.java:159)
    at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:144)
    at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:578)
    at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:489)
    at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:365)
    at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:771)
    at org.jgroups.JChannel.up(JChannel.java:1465)
    at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:954)
    at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.up(STREAMING_STATE_TRANSFER.java:265)
    at org.jgroups.protocols.FRAG2.unfragment(FRAG2.java:311)
    at org.jgroups.protocols.FRAG2.up(FRAG2.java:171)
    at org.jgroups.protocols.FlowControl.up(FlowControl.java:419)
    at org.jgroups.protocols.FlowControl.up(FlowControl.java:419)
    at org.jgroups.protocols.pbcast.GMS.up(GMS.java:888)
    at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:234)
    at org.jgroups.protocols.UNICAST.up(UNICAST.java:310)
    at org.jgroups.protocols.pbcast.NAKACK.handleMessage(NAKACK.java:806)
    at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:671)
    at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:132)
    at org.jgroups.protocols.FD.up(FD.java:266)
    at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:269)
    at org.jgroups.protocols.MERGE2.up(MERGE2.java:210)
    at org.jgroups.protocols.Discovery.up(Discovery.java:292)
    at org.jgroups.protocols.TP.passMessageUp(TP.java:1093)
    at org.jgroups.protocols.TP.access$100(TP.java:56)
    at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1633)
    at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1615)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:619)

        • 1. "Unable to acquire lock" under load
          brenuart

          Try with <locking useLockStriping="false"/>
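
Since you configure programmatically, the equivalent should be something like config.setUseLockStriping(false) on the legacy Configuration API; I'm inferring the setter name from the XML attribute, so double-check it against the Configuration javadoc.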

          • 2. "Unable to acquire lock" under load
            ninethirty

Thanks, I'll try that. Any idea why disabling lock striping would affect this? Assuming lock striping in Infinispan works like it does in other concurrent collections, I'd expect it to mean there is a set of locks, each covering a subset of keys, so that concurrent updates can proceed without contending on a single global lock.

In concept that's certainly a good thing, and I'd expect a performance hit with it off. Is there an algorithmic reason that lock striping might cause this error, or do you just want me to try it to help debug the problem?

            • 3. "Unable to acquire lock" under load
              brenuart

With lock striping enabled, an attempt to lock a single key acquires a lock that covers multiple keys (each lock is shared by a whole stripe of keys).

So even if two processes attempt to lock different keys, they may race for the same lock.
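
To make it concrete, striping is conceptually something like this. Just a sketch to show the collision, not Infinispan's actual lock container code:

import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of lock striping: a fixed pool of locks, each key mapped to one by hash.
public class StripedLocks {

    private final Lock[] stripes;

    public StripedLocks(int concurrencyLevel) {
        stripes = new Lock[concurrencyLevel];
        for (int i = 0; i < stripes.length; i++)
            stripes[i] = new ReentrantLock();
    }

    // Distinct keys whose hashes land on the same stripe get the SAME lock
    // object, so they contend even though they are logically unrelated.
    public Lock lockFor(Object key) {
        int h = key.hashCode() & Integer.MAX_VALUE; // force non-negative
        return stripes[h % stripes.length];
    }
}

With a fixed pool of stripes and lots of keys in flight on two busy nodes, unrelated keys will regularly map to the same stripe, which is exactly the cross-key contention you are seeing.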

              • 4. "Unable to acquire lock" under load
                ninethirty

                Oh, I see.  I didn't read the docs:

                 

The alternative is to disable lock striping - which would mean a new lock is created per entry. This approach may give you greater concurrent throughput, but it will be at the cost of additional memory usage, garbage collection churn, etc.

                 

                I assumed that disabling lock striping meant one global lock per cache, not one per entry.  That makes sense.   Thanks!
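
So the per-entry alternative would conceptually look like this. Again just a sketch of the idea, not Infinispan's real implementation:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of per-entry locking: one lock per key, created lazily, so two
// distinct keys never share a lock.
public class PerEntryLocks {

    private final ConcurrentMap<Object, Lock> locks = new ConcurrentHashMap<Object, Lock>();

    public Lock lockFor(Object key) {
        Lock lock = locks.get(key);
        if (lock == null) {
            Lock created = new ReentrantLock();
            lock = locks.putIfAbsent(key, created); // returns the winner's lock, or null
            if (lock == null)
                lock = created; // we won the race
        }
        return lock;
    }
}

Which also explains the memory usage and GC churn the docs mention: one lock object per live entry instead of a small fixed pool.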

                • 5. "Unable to acquire lock" under load
                  ninethirty

                  Sure enough, that worked.  Thanks!

                  • 6. "Unable to acquire lock" under load
                    brenuart

He he... I got caught by the same problem several weeks ago.

                    • 7. "Unable to acquire lock" under load
                      galder.zamarreno

Guys, I've created a poll at http://community.jboss.org/polls/1073?pollSuccess=true to discuss whether lock striping should remain enabled by default. Please vote!

                      • 8. Re: "Unable to acquire lock" under load
                        kowshikns

I still get this error with the following configuration:

private DefaultCacheManager createCacheManagerProgramatically() {
   GlobalConfigurationBuilder gcb = new GlobalConfigurationBuilder();
   gcb.classLoader(this.getClass().getClassLoader());
   GlobalConfiguration gc = gcb.transport().defaultTransport()
         .clusterName("GLC")
         .globalJmxStatistics().enabled(true)
         .build();

   // Working:
   return new DefaultCacheManager(gc, new ConfigurationBuilder()
         .clustering().cacheMode(CacheMode.DIST_SYNC)
         .l1().enabled(false).invalidationThreshold(-1).lifespan(120000)
         .hash().numOwners(3).rehashRpcTimeout(120000)
         .jmxStatistics().enabled(true)
         .transaction()
         .locking().useLockStriping(false)
         .build());
}

I am trying to update the cache concurrently.

Using Infinispan 5.1.2.Final in JBoss AS 7.1.1.Final.

                        • 9. Re: "Unable to acquire lock" under load
                          galder.zamarreno

                          This is an old thread. If you have any issues, open a new one with the test case as well as the configuration.