2 Replies. Latest reply: Apr 25, 2012 1:12 PM by Lin Ye

Error message from JGroups

Lin Ye Novice

I upgraded to Infinispan 5.1.3.FINAL and now intermittently get the following error message in the log. T00696119-60477 is the current node, which reports the error, and T00696119-53825 is a new node joining the cluster. It looks like the current node received messages from the new node before the new node had joined the view, and rejected them. Is there a way to avoid this error?

20:37:05,154 | ERROR | OOB-15,null  | UNICAST                      | 157 - com.ge.energy.ssi.core.datagrid.core - 2.0.0.rc1 | T00696119-60477: sender window for T00696119-53825 not found
20:37:05,247 | INFO  | Incoming-8,null  | JGroupsTransport             |  -  -  | ISPN000094: Received new cluster view: [T00696119-60477|57] [T00696119-60477, T00696119-53825]
  • 1. Re: Error message from JGroups
    Bela Ban Master

    I don't suppose you can reproduce this, can you?

     

    Without more information, here's why this could happen: let's call the members A and B.

    - A sends a few unicasts to B

    - B receives them, but the first unicast from A is dropped and will later get retransmitted

    - B drops the unicast and asks A for its first unicast

    - A receives the request and tries to furnish the first unicast, but meanwhile its sender window (table) for B was dropped. This can happen on a view change in which B was not a member

    - A logs the error message (I changed this to a WARN, as it shouldn't affect the system)

    - When A sends another unicast to B, it will re-establish the connection; this time both A and B will have the same connection-id

     

    This *should* not affect the correctness of the system. It would be nice if you could reproduce it though, to make sure I'm right on this...
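
    To make the sequence above concrete, here is a rough Java sketch of A's side only. It is a toy model, not the actual JGroups UNICAST code: the class name SenderTableSketch, its methods, and the use of an incrementing integer as connection-id are all assumptions made for illustration. It shows a sender window being dropped on a view change that excludes B, a retransmit request then finding no window (which is only logged), and the next send to B re-creating the window under a fresh connection-id.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of member A's per-destination sender state.
// Hypothetical names throughout; this is NOT the real JGroups implementation.
class SenderTableSketch {
    // destination -> messages sent so far (stands in for the "sender window")
    private final Map<String, List<String>> sendWindows = new HashMap<>();
    // destination -> connection-id of the current window (assumed to be a counter)
    private final Map<String, Integer> connIds = new HashMap<>();
    private int nextConnId = 1;
    // Collected log lines, so the behaviour can be inspected
    final List<String> log = new ArrayList<>();

    // Sending creates the window on demand; this is how the connection
    // gets re-established after the window was dropped.
    int send(String dest, String msg) {
        if (!sendWindows.containsKey(dest)) {
            sendWindows.put(dest, new ArrayList<>());
            connIds.put(dest, nextConnId++);
        }
        sendWindows.get(dest).add(msg);
        return connIds.get(dest);
    }

    // A view change drops the state kept for anyone who is not in the new view.
    void viewChange(Set<String> members) {
        sendWindows.keySet().retainAll(members);
        connIds.keySet().retainAll(members);
    }

    // A retransmit request for a destination without a window cannot be served;
    // the sketch logs it and returns null instead of failing.
    String retransmit(String dest, int index) {
        List<String> window = sendWindows.get(dest);
        if (window == null) {
            log.add("WARN sender window for " + dest + " not found");
            return null;
        }
        return window.get(index);
    }
}
```

    In this model the missing window costs one log line and nothing else: the following send("B", ...) starts a new window, which is the same reasoning as in the last step of the list above.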

  • 2. Re: Error message from JGroups
    Lin Ye Novice

    I couldn't reproduce this issue with a simple example. However, I did reproduce it frequently with our production system and a testing program. This happens with Infinispan 5.1.3. When I reverted to 5.1.0, it disappeared.

     

    I am not sure whether what you described is my case, but it may be, as it happened around the time of a view change. I don't think it affects the correctness of the system either. In which version of JGroups did you change it to a WARN? A WARN makes me more comfortable with it, and makes it easier to convince our QA team.