1 Reply Latest reply: Oct 27, 2011 10:28 AM by slorg1 RSS

Issues with Locking when running transactional replicated cache

slorg1 Newbie

Hi every one,

 

I am working on deploying my current web application at Amazon EC2.

 

First, I have successfully got Infinispan and JGroups to replicated my cache on my local machine with 2 tomcats.

Now, a little of context, I am using infinispan as Hibernate's L2 cache in my application. It is being used in conjunction with Atomikos for the transaction management side.

 

All that said, I am trying to set up project in a production like environment at Amazon EC2 over SSH tunnels.

So far so good, I got JGroups in, what I believe to be, a happy place. The TCP PING seems to work, both instances hand shake and all.

How ever I run into issue when:

 

1. starting a transaction on one side, loading objects in the cache which should be replication

2. modifying them

3. commit the transaction

4. the 2nd tomcat receives the objects modified to replicate the new states.

 

Then I get the following :

 

 

2011-10-25 10:32:57,090 [Incoming-12,infinispan-cluster,tomcat-test1-1593] ERROR org.infinispan.interceptors.InvocationContextInterceptor - ISPN000136: Execution error

org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [0 milliseconds] on key [com.XXXX..DBObjectl#6694706] for requestor [Thread[Incoming-12,infinispan-cluster,tomcat-test1-1593,5,Thread Pools]]! Lock held by [GlobalTransaction:<tomcat-test2-28268>:467:remote]

          at org.infinispan.container.EntryFactoryImpl.acquireLock(EntryFactoryImpl.java:240)

          at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:157)

          at org.infinispan.container.EntryFactoryImpl.wrapEntryForWriting(EntryFactoryImpl.java:109)

          at org.infinispan.interceptors.LockingInterceptor.visitPutKeyValueCommand(LockingInterceptor.java:293)

          at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:77)

          at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:119)

          at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:133)

          at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:60)

          at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:77)

          at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:119)

          at org.infinispan.interceptors.TxInterceptor.enlistWriteAndInvokeNext(TxInterceptor.java:214)

          at org.infinispan.interceptors.TxInterceptor.visitPutKeyValueCommand(TxInterceptor.java:162)

          at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:77)

          at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:119)

          at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:104)

          at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:64)

          at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:60)

          at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:77)

          at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:274)

          at org.infinispan.commands.remote.BaseRpcInvokingCommand.processVisitableCommand(BaseRpcInvokingCommand.java:71)

          at org.infinispan.commands.remote.SingleRpcCommand.perform(SingleRpcCommand.java:62)

          at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:173)

          at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithWaitForBlocks(InboundInvocationHandlerImpl.java:187)

          at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithRetry(InboundInvocationHandlerImpl.java:297)

          at org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:159)

          at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommand(CommandAwareRpcDispatcher.java:165)

          at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:144)

          at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:577)

          at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:488)

          at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:364)

          at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:770)

          at org.jgroups.JChannel.up(JChannel.java:1484)

          at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1074)

          at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.up(STREAMING_STATE_TRANSFER.java:263)

          at org.jgroups.protocols.FRAG2.up(FRAG2.java:189)

          at org.jgroups.protocols.FlowControl.up(FlowControl.java:418)

          at org.jgroups.protocols.pbcast.GMS.up(GMS.java:908)

          at org.jgroups.protocols.VIEW_SYNC.up(VIEW_SYNC.java:170)

          at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:246)

          at org.jgroups.protocols.UNICAST.up(UNICAST.java:309)

          at org.jgroups.protocols.pbcast.NAKACK.handleMessage(NAKACK.java:838)

          at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:667)

          at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:133)

          at org.jgroups.protocols.FD.up(FD.java:275)

          at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:275)

          at org.jgroups.protocols.MERGE2.up(MERGE2.java:209)

          at org.jgroups.protocols.Discovery.up(Discovery.java:293)

          at org.jgroups.protocols.TP.passMessageUp(TP.java:1109)

          at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1665)

          at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1642)

          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

          at java.lang.Thread.run(Thread.java:619)

 

 

I am not very sure about what is wrong with my set up. I have reached a point where I am not sure what to try to fix it any more.

 

Here is my JGroups set up:

 

<config xmlns="urn:org:jgroups"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/JGroups-2.12.xsd">
   <TCP_NIO
      bind_addr="localhost"
      bind_port="7800"
      port_range="5"
      loopback="true"
      recv_buf_size="20000000"
      send_buf_size="640000"
      discard_incompatible_packets="true"
      max_bundle_size="64000"
      max_bundle_timeout="30"
      enable_bundling="true"
      use_send_queues="true"
      sock_conn_timeout="300"

      thread_pool.enabled="true"
      thread_pool.min_threads="1"
      thread_pool.max_threads="25"
      thread_pool.keep_alive_time="5000"
      thread_pool.queue_enabled="false"
      thread_pool.queue_max_size="100"
      thread_pool.rejection_policy="Run"

      oob_thread_pool.enabled="true"
      oob_thread_pool.min_threads="1"
      oob_thread_pool.max_threads="8"
      oob_thread_pool.keep_alive_time="5000"
      oob_thread_pool.queue_enabled="false"
      oob_thread_pool.queue_max_size="100"
      oob_thread_pool.rejection_policy="Run" level="fatal"/>


   <TCPPING timeout="1000"
      initial_hosts="localhost[7800]"
      port_range="5"
      num_initial_members="2" level="fatal" />
   <MERGE2 max_interval="100000"
      min_interval="20000"/>
   <FD_SOCK bind_addr="localhost" start_port="7700" port_range="1"/>
   <FD timeout="500" max_tries="2"/>
   <VERIFY_SUSPECT timeout="200"
        bind_addr="localhost" />
   <pbcast.NAKACK
      use_mcast_xmit="false" gc_lag="0"
      retransmit_timeout="300,600,1200,2400,4800"
      discard_delivered_msgs="true"/>
   <UNICAST timeout="300,600,1200" />
   <pbcast.STABLE stability_delay="1000" desired_avg_gossip="500"
      max_bytes="400000"/>
   <VIEW_SYNC avg_send_interval="60000"/>
   <pbcast.GMS print_local_addr="true" join_timeout="3000"
      view_bundling="true"/>
   <UFC max_credits="2000000"
      min_threshold="0.10"/>
   <FRAG2 frag_size="60000"  />
   <pbcast.STREAMING_STATE_TRANSFER bind_addr="localhost" bind_port="7900"/>
</config>

 

 

 

And some of my infinispan config:

 

  <!--  Define the default cache settings. -->
   <default>
      <locking
         isolationLevel="REPEATABLE_READ"
         lockAcquisitionTimeout="20000"
         writeSkewCheck="false"
         concurrencyLevel="500"
         useLockStriping="false"
      />


      <transaction transactionManagerLookupClass="com.XXXX.InfinispanCacheBeanFactory$TransactionManagerLookupImpl"
            useSynchronization="true"
            syncRollbackPhase="false"
            syncCommitPhase="false"
            useEagerLocking="false"
            eagerLockSingleNode="false"
            cacheStopTimeout="30000" />
      <jmxStatistics enabled="false" />


      <clustering mode="replication">
         <stateRetrieval
            timeout="20000"
            fetchInMemoryState="false"
            alwaysProvideInMemoryState="false"
         />


         <sync replTimeout="20000"/>
      </clustering>
      <expiration maxIdle="1800000" />
   </default>

 

 

 

 

Again, this works locally:

- works when both tomcats are on 1 machine

- works when both tomcats are on 2 machines if I set the IP to not localhost (obviously).

 

Thus, I am not quite sure what is going wrong.

 

I am running infinispan 5.0.0.CR8 with JGroups 2.12.1.1.Final

 

When I have run the debugger through the spot where it crashes, I saw that the lock time out selected using:

 

 
protected long getLockAcquisitionTimeout(InvocationContext ctx) {
      return ctx.hasFlag(Flag.ZERO_LOCK_ACQUISITION_TIMEOUT) ?
            0 : configuration.getLockAcquisitionTimeout();
   }

 

 

 

 

Always yields true for

ctx.hasFlag(Flag.ZERO_LOCK_ACQUISITION_TIMEOUT)

 

So my assumption here is that infinispan is expecting the lock to be acquirable right away.

 

Thank you in advance.