I upgraded to Infinispan 5.1.3.FINAL and now intermittently get the following error message in the log. T00696119-60477 is the existing node reporting the error, and T00696119-53825 is a new node joining the cluster. It looks like the existing node received messages from the new node before the new node had joined the view, and rejected them. Is there a way to avoid this error?
|20:37:05,154 | ERROR | OOB-15,null||| UNICAST||| 157 - com.ge.energy.ssi.core.datagrid.core - 2.0.0.rc1 | T00696119-60477: sender window for T00696119-53825 not found|
|20:37:05,247 | INFO | Incoming-8,null | JGroupsTransport||| - - | ISPN000094: Received new cluster view: [T00696119-60477|57] [T00696119-60477, T00696119-53825]|
I don't suppose you can reproduce this, can you?
Without more information, here's why this could happen: let's call the members A and B.
- A sends a few unicasts to B
- B receives them, but the first unicast from A is dropped and will later get retransmitted
- B drops the unicast and asks A for its first unicast
- A receives the request and tries to furnish the first unicast, but meanwhile the table was dropped. This can happen on a view change (in which B was not a member)
- A logs the error message (I changed this to a WARN, as it shouldn't affect the system)
- When A sends another unicast to B, it will re-establish the connection; this time both A and B will have the same connection-id
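The sequence above can be sketched as a toy Java model. This is not the actual JGroups UNICAST code: the class `UnicastSketch`, the `Member` type and all of its methods are hypothetical names made up for illustration, and the model assumes only that each member keeps one sender window per peer and drops the windows of peers that are missing from a new view.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: every name here is hypothetical, none of it is JGroups API.
class UnicastSketch {

    /** Sender-side state of one member: peer name -> messages not yet acked. */
    static final class Member {
        final String name;
        final Map<String, List<String>> senderWindows = new HashMap<>();
        final List<String> log = new ArrayList<>();

        Member(String name) {
            this.name = name;
        }

        // Sending a unicast creates the sender window for the peer on demand.
        void send(String peer, String msg) {
            senderWindows.computeIfAbsent(peer, p -> new ArrayList<>()).add(msg);
        }

        // A view change drops the windows of peers that are not in the new view.
        void viewChange(Set<String> members) {
            senderWindows.keySet().retainAll(members);
        }

        // The peer asks for a retransmission; log and return false if the
        // window was dropped in the meantime.
        boolean onRetransmitRequest(String peer) {
            if (!senderWindows.containsKey(peer)) {
                log.add(name + ": sender window for " + peer + " not found");
                return false;
            }
            return true;
        }
    }

    public static void main(String[] args) {
        Member a = new Member("A");
        a.send("B", "m1");                  // A sends a few unicasts to B
        a.send("B", "m2");
        a.viewChange(Set.of("A"));          // view without B: window for B is dropped
        a.onRetransmitRequest("B");         // B asks for m1, A has no window left
        a.viewChange(Set.of("A", "B"));     // B is now part of the view
        a.send("B", "m3");                  // next unicast re-creates the window
        a.log.forEach(System.out::println);
        System.out.println("window for B exists again: "
                + a.senderWindows.containsKey("B"));
    }
}
```

In this model the only visible effect of the race is the single log line; the next `send` simply re-creates the window, which mirrors the last step in the list.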
This *should* not affect the correctness of the system. It would be nice if you could reproduce it though, to make sure I'm right on this...
I couldn't reproduce this issue with a simple example. However, I did reproduce it frequently with our production system and a testing program. This happens with Infinispan 5.1.3. When I reverted to 5.1.0, it disappeared.
I am not sure if what you described is my case, but it may be, as it happened around the time of a view change. And I don't think it affects the correctness of the system either. In which version of JGroups did you change it to a WARN? A WARN would make me feel more comfortable with it, and would make it easier to convince our QA team.