StateTransferException with infinispan 5.0.0FINAL
shohou Sep 6, 2011 8:39 AMHi!
I'm getting StateTransferException with 5+ nodes in cluster on 3 different computers:
This one is on starting instance:
2011-09-02 10:11:20,822 - INFO - GlobalComponentRegistry - ISPN000128: Infinispan version: Infinispan 'Pagoa' 5.0.0.FINAL 2011-09-02 10:11:21,083 - INFO - RpcManagerImpl - ISPN000074: Trying to fetch state from SERVER-1-43204 2011-09-02 10:11:23,093 - INFO - InboundInvocationHandlerImpl - ISPN000067: Will try and wait for cache named api-reports-cache to start 2011-09-02 10:11:53,122 - INFO - InboundInvocationHandlerImpl - ISPN000068: Cache named api-reports-cache does not exist on this cache manager! 2011-09-02 10:11:53,241 - INFO - InboundInvocationHandlerImpl - ISPN000067: Will try and wait for cache named api-reports-cache to start 2011-09-02 10:12:23,265 - INFO - InboundInvocationHandlerImpl - ISPN000068: Cache named api-reports-cache does not exist on this cache manager! 2011-09-02 10:12:23,282 - INFO - InboundInvocationHandlerImpl - ISPN000067: Will try and wait for cache named api-reports-cache to start 2011-09-02 10:12:53,306 - INFO - InboundInvocationHandlerImpl - ISPN000068: Cache named api-reports-cache does not exist on this cache manager! 2011-09-02 10:12:53,324 - INFO - InboundInvocationHandlerImpl - ISPN000067: Will try and wait for cache named api-reports-cache to start 2011-09-02 10:13:53,392 - INFO - InboundInvocationHandlerImpl - ISPN000068: Cache named api-reports-cache does not exist on this cache manager! 2011-09-02 10:13:53,406 - INFO - InboundInvocationHandlerImpl - ISPN000067: Will try and wait for cache named api-reports-cache to start 2011-09-02 10:14:23,432 - INFO - InboundInvocationHandlerImpl - ISPN000068: Cache named api-reports-cache does not exist on this cache manager! 2011-09-02 10:14:23,448 - INFO - InboundInvocationHandlerImpl - ISPN000067: Will try and wait for cache named api-reports-cache to start 2011-09-02 10:14:53,474 - INFO - InboundInvocationHandlerImpl - ISPN000068: Cache named api-reports-cache does not exist on this cache manager! 2011-09-02 10:14:53,486 - INFO - InboundInvocationHandlerImpl - ISPN000067: Will try and wait for cache named api-reports-cache to start 2011-09-02 10:15:21,554 - ERROR - JGroupsTransport - ISPN000096: Caught while requesting or applying state org.infinispan.statetransfer.StateTransferException: java.io.EOFException: The stream ended unexpectedly. Please check whether the source of the stream encountered any issues generating the stream. at org.infinispan.statetransfer.StateTransferManagerImpl.applyState(StateTransferManagerImpl.java:292) at org.infinispan.remoting.InboundInvocationHandlerImpl.applyState(InboundInvocationHandlerImpl.java:235) at org.infinispan.remoting.transport.jgroups.JGroupsTransport.setState(JGroupsTransport.java:607) at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.handleUpEvent(MessageDispatcher.java:711) at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:771) at org.jgroups.JChannel.up(JChannel.java:1441) at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1074) at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.connectToStateProvider(STREAMING_STATE_TRANSFER.java:523) at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.handleStateRsp(STREAMING_STATE_TRANSFER.java:462) at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.up(STREAMING_STATE_TRANSFER.java:223) at org.jgroups.protocols.FRAG2.up(FRAG2.java:189) at org.jgroups.protocols.FlowControl.up(FlowControl.java:418) at org.jgroups.protocols.FlowControl.up(FlowControl.java:400) at org.jgroups.protocols.pbcast.GMS.up(GMS.java:908) at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:246) at org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:613) at org.jgroups.protocols.UNICAST.up(UNICAST.java:294) at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:703) at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:133) at org.jgroups.protocols.FD.up(FD.java:275) at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:275) at org.jgroups.protocols.MERGE2.up(MERGE2.java:209) at org.jgroups.protocols.Discovery.up(Discovery.java:293) at org.jgroups.protocols.TP.passMessageUp(TP.java:1109) at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1665) at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1647) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.EOFException: The stream ended unexpectedly. Please check whether the source of the stream encountered any issues generating the stream. at org.infinispan.marshall.VersionAwareMarshaller.objectFromObjectStream(VersionAwareMarshaller.java:193) at org.infinispan.statetransfer.StateTransferManagerImpl.processCommitLog(StateTransferManagerImpl.java:215) at org.infinispan.statetransfer.StateTransferManagerImpl.applyTransactionLog(StateTransferManagerImpl.java:237) at org.infinispan.statetransfer.StateTransferManagerImpl.applyState(StateTransferManagerImpl.java:281) ... 28 more Caused by: java.io.EOFException: Read past end of file at org.jboss.marshalling.SimpleDataInput.eofOnRead(SimpleDataInput.java:126) at org.jboss.marshalling.SimpleDataInput.readUnsignedByteDirect(SimpleDataInput.java:263) at org.jboss.marshalling.SimpleDataInput.readUnsignedByte(SimpleDataInput.java:224) at org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:209) at org.jboss.marshalling.AbstractObjectInput.readObject(AbstractObjectInput.java:37) at org.infinispan.marshall.jboss.GenericJBossMarshaller.objectFromObjectStream(GenericJBossMarshaller.java:194) at org.infinispan.marshall.VersionAwareMarshaller.objectFromObjectStream(VersionAwareMarshaller.java:191) ... 31 more
And this are on the instance that provides state:
2011-09-06 12:07:25,725 -- ERROR [::] - RpcManagerImpl - ISPN000073: Unexpected error while replicating java.util.concurrent.TimeoutException: Timed out waiting for a cluster-wide sync to be released. (timeout = 240 seconds) at org.infinispan.remoting.transport.jgroups.JGroupsDistSync.blockUntilReleased(JGroupsDistSync.java:78) at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:436) at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:132) at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:156) at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:265) at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:252) at org.infinispan.remoting.rpc.RpcManagerImpl.broadcastRpcCommand(RpcManagerImpl.java:235) at org.infinispan.remoting.rpc.RpcManagerImpl.broadcastRpcCommand(RpcManagerImpl.java:228) at org.infinispan.interceptors.ReplicationInterceptor.handleCrudMethod(ReplicationInterceptor.java:116) at org.infinispan.interceptors.ReplicationInterceptor.visitPutKeyValueCommand(ReplicationInterceptor.java:79) ...
org.infinispan.CacheException: java.util.concurrent.TimeoutException: Timed out waiting for a cluster-wide sync to be released. (timeout = 240 seconds) at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:145) at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:156) at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:265) at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:252) at org.infinispan.remoting.rpc.RpcManagerImpl.broadcastRpcCommand(RpcManagerImpl.java:235) at org.infinispan.remoting.rpc.RpcManagerImpl.broadcastRpcCommand(RpcManagerImpl.java:228) at org.infinispan.interceptors.ReplicationInterceptor.handleCrudMethod(ReplicationInterceptor.java:116) at org.infinispan.interceptors.ReplicationInterceptor.visitPutKeyValueCommand(ReplicationInterceptor.java:79) at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:77) at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:119)
2011-09-06 12:18:16,295 -- DEBUG - JGroupsDistSync - Failed to acquire exclusive processing lock. Read lock holders are [Thread[RetryQueueProcessor-sessions-cache@DFS1-27866,10,Thread Pools], Thread[RetryQueueProcessor-api-reports-cache@DFS1-27866,10,Thread Pools], Thread[Timer-2,5,main]] 2011-09-06 12:18:16,297 -- ERROR - JGroupsTransport - ISPN000095: Caught while responding to state transfer request org.infinispan.statetransfer.StateTransferException: java.util.concurrent.TimeoutException: STREAMING_STATE_TRANSFER-sender-10,Infinispan-Cluster,DFS1-27866 could not obtain exclusive processing lock after 240 seconds. Locks in question are java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock@19977e8[Read locks = 3] and java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock@8f243d[Unlocked] at org.infinispan.statetransfer.StateTransferManagerImpl.generateState(StateTransferManagerImpl.java:162) at org.infinispan.remoting.InboundInvocationHandlerImpl.generateState(InboundInvocationHandlerImpl.java:253) at org.infinispan.remoting.transport.jgroups.JGroupsTransport.getState(JGroupsTransport.java:590) at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.handleUpEvent(MessageDispatcher.java:690) at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:771) at org.jgroups.JChannel.up(JChannel.java:1484) at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1074) at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER$StateProviderHandler.process(STREAMING_STATE_TRANSFER.java:651) at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER$StateProviderThreadSpawner$1.run(STREAMING_STATE_TRANSFER.java:580) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:619) Caused by: java.util.concurrent.TimeoutException: STREAMING_STATE_TRANSFER-sender-10,Infinispan-Cluster,DFS1-27866 could not obtain exclusive processing lock after 240 seconds. Locks in question are java.util.concurrent.locks.ReentrantR eadWriteLock$ReadLock@19977e8[Read locks = 3] and java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock@8f243d[Unlocked] at org.infinispan.remoting.transport.jgroups.JGroupsDistSync.acquireProcessingLock(JGroupsDistSync.java:104) at org.infinispan.statetransfer.StateTransferManagerImpl.generateTransactionLog(StateTransferManagerImpl.java:189) at org.infinispan.statetransfer.StateTransferManagerImpl.generateState(StateTransferManagerImpl.java:152) ... 11 more 2011-09-06 12:18:16,303 -- DEBUG - STREAMING_STATE_TRANSFER - State writer is closing the socket
There was no load at all, all caches had maybe 5 objects in them
Version 4.2.1.FINAL works without any problems with at least 10 nodes in cluster