9 Replies Latest reply on Oct 7, 2009 6:39 AM by prange

    ProducerWindow and reconnection

    prange

      Hi,

      I have a client that automatically reconnects when a failure listener detects a failure on a session. I am doing this manually because the client has to reconnect even if the server has been restarted, and the session cannot reconnect to the original session on the server.

      I am also using the SendAcknowledgementHandler to make sure all my messages reach the server, in case any are lost during a connection failure.

      To make sure that every message is acked as soon as it arrives, I have set the producerWindowSize to 0, and I have set all blocking to false to enable high throughput.

      When I restart the server my failure listener gets called, but the thread that sent the message hangs on the send method of the ClientSession. If I set the window to -1, I can reconnect, but the messages are not being acked. Am I doing something wrong?

        • 1. Re: ProducerWindow and reconnection
          timfox

          What version are you using?

          Setting the producer window size to zero is going to make things go really slow!

          I don't understand why you are doing that. Can you explain your reasoning more?

          • 2. Re: ProducerWindow and reconnection
            prange

            I am using BETA5.

            I want to ack every message every time so that I know that only the unsent messages are being resent after a reconnection. If the acks are batched up, some unacked messages may have arrived at the server before the shutdown, and when the server comes back up, some messages that have already been received are resent.

            Looking at the code I suspect (but am not sure) that this can in fact happen regardless of window size: if the server goes down just after the buffer is full, the client blocks waiting for an ack (forever?).

            • 3. Re: ProducerWindow and reconnection
              prange

              Another way of explaining why I am doing this is that server uptime is not expected to be high, so we have to be sure every message is sent exactly once even if the server goes down and the client is not able to reconnect automatically.

              • 4. Re: ProducerWindow and reconnection
                timfox

                You're misunderstanding how blocking on sends and send acknowledgements work.

                IIRC this is all discussed in the following chapter in the user manual:

                http://hornetq.sourceforge.net/docs/hornetq-2.0.0.BETA5/user-manual/en/html/send-guarantees.html

                But I will summarise it here:

                If you want to know if a message you sent has reached the server or not, there are two ways to do this:

                1) Set block-on-persistent-send (or block-on-non-persistent-send) to true.

                If you set this to true then, by the time your call to send() has returned successfully you know that the message has definitely reached the server and been persisted (if appropriate).

                However, if your send() call throws an exception because the connection or server has failed, then you don't actually know whether that exception occurred after your message was persisted or before it was persisted.

                In this case, you can enable duplicate detection (see chapter in user manual) and resend the message. If it has been persisted the server will just ignore it when it receives it again.

                Blocking on send is the standard way most messaging systems work. The problem is that it is slow, since it requires an entire network round trip for every message sent.

                2) Using asynchronous send acknowledgements.

                In this method we can get the same guarantees as 1) but without having to do a network round trip on every send().

                The idea here is as you send messages you store them in a list on the sender, then you send them with block-on-persistent-send set to *false*. This means you can send many messages in quick succession without waiting for a network round trip on each one ==> much faster!

                Then, some time later, the server calls your send acknowledgement handler saying that "message 1234" has been received and persisted on the server. At that point you can remove the message from your list and ack it to the source you obtained it from.

                You don't (and you shouldn't) change producer window size to do this.
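
                The bookkeeping behind option 2) can be sketched in plain Java. This is only a minimal sketch under assumptions, not HornetQ code: PendingSends and its methods are hypothetical names, and the string IDs stand in for whatever identifier your application attaches to each message. In a real client, acknowledged() would be called from your send acknowledgement handler.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of HornetQ): remembers every message that
// has been sent but not yet acknowledged by the server.
class PendingSends {
    // Insertion-ordered, so a resend keeps the original send order.
    private final Map<String, String> unacked = new LinkedHashMap<>();

    // Call just before a non-blocking send.
    synchronized void track(String id, String body) {
        unacked.put(id, body);
    }

    // Call from the send acknowledgement callback.
    // Returns true if the id was still pending, false otherwise.
    synchronized boolean acknowledged(String id) {
        return unacked.remove(id) != null;
    }

    // Snapshot of the ids that would have to be resent after a reconnect.
    synchronized List<String> unacknowledgedIds() {
        return new ArrayList<>(unacked.keySet());
    }
}
```

                The methods are synchronized because the acknowledgement callback will typically run on a different thread from the one that is sending.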

                Is this any clearer now?

                • 5. Re: ProducerWindow and reconnection
                  timfox

                  BTW you will also need duplicate detection turned on for 2)
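
                  The reason is that after a failure, a message still sitting in your list may or may not have been persisted, so you resend everything in the list and let the server drop what it has already seen. The following is a small simulation of that idea, not HornetQ's implementation: DedupServer and its methods are hypothetical, and the real mechanism is described in the duplicate detection chapter of the user manual.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a server with duplicate detection enabled.
class DedupServer {
    private final Set<String> seenIds = new HashSet<>();
    private final List<String> stored = new ArrayList<>();

    // Returns true if the message was stored, false if it was ignored
    // because a message with the same duplicate id was already received.
    boolean receive(String duplicateId, String body) {
        if (!seenIds.add(duplicateId)) {
            return false;
        }
        stored.add(body);
        return true;
    }

    // Copy of everything stored so far, in arrival order.
    List<String> stored() {
        return new ArrayList<>(stored);
    }
}
```

                  With this in place, resending a message whose ack was lost is harmless: the second receive() call for the same id is ignored and the body is stored only once.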

                  • 6. Re: ProducerWindow and reconnection
                    prange

                    I think I got it; I am just horrible at explaining what I mean.

                    But what I didn't think of was this:


                    However, if your send() call throws an exception because the connection or server has failed, then you don't actually know whether that exception occurred after your message was persisted or before it was persisted.


                    Luckily I just found the part on duplicate detection, and that will solve all my problems.


                    But I am still curious: when the window is full, the client blocks while waiting for an ack for the last message. But if the server goes down just after the last send before the buffer became full, the ack will never come. I could not find any timeout in the code waiting for the condition. Will the client wait forever? Is there a tiny window where the client might hang forever?

                    • 7. Re: ProducerWindow and reconnection
                      timfox

                       

                      "prange" wrote:
                      I think I got it; I am just horrible at explaining what I mean.

                      But what I didn't think of was this:

                      However, if your send() call throws an exception because the connection or server has failed, then you don't actually know whether that exception occurred after your message was persisted or before it was persisted.


                      Luckily I just found the part on duplicate detection, and that will solve all my problems.


                      That depends if you want the best performance or not.

                      If you're not too bothered about perf then just send blocking and use duplicate detection. This is what most messaging systems do (actually not all even have duplicate detection).

                      If you want the best performance *and* 100% guaranteed acknowledgement that messages reached the server then use send acknowledgements. This is a feature unavailable in other messaging systems, and allows HornetQ performance to far surpass them.
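
                      For the simpler blocking option, the retry logic might look roughly like the sketch below. BlockingSender and RetryingSender are hypothetical names introduced for illustration and are not HornetQ types; the assumption is that the send call blocks until the server has the message, throws on a connection failure, and that the server ignores a second message carrying the same duplicate id.

```java
import java.io.IOException;

// Hypothetical abstraction of a blocking send that may fail mid-flight.
interface BlockingSender {
    void send(String duplicateId, String body) throws IOException;
}

class RetryingSender {
    // Resends with the SAME duplicate id until one attempt succeeds.
    // Returns the number of attempts used, or rethrows the last failure
    // once maxAttempts is exhausted.
    static int sendWithRetry(BlockingSender sender, String duplicateId,
                             String body, int maxAttempts) throws IOException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                sender.send(duplicateId, body);
                return attempt;
            } catch (IOException e) {
                // We cannot tell whether the message was persisted before
                // the failure, so we retry and rely on duplicate detection.
                last = e;
            }
        }
        throw last != null ? last : new IOException("maxAttempts must be at least 1");
    }
}
```

                      Reusing the same duplicate id on every attempt is the important part: a fresh id per attempt would defeat duplicate detection.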


                      But I am still curious: when the window is full, the client blocks while waiting for an ack for the last message. But if the server goes down just after the last send before the buffer became full, the ack will never come. I could not find any timeout in the code waiting for the condition. Will the client wait forever? Is there a tiny window where the client might hang forever?


                      That should never happen. If you can create a test case demonstrating it, please create a JIRA and we'll fix it.

                      • 8. Re: ProducerWindow and reconnection
                        prange

                        The combination of (persistent) duplicate detection and acks is fantastic. I agree with your bold statement:


                        This is a feature unavailable in other messaging systems, and allows HornetQ performance to far surpass them

                        It outperforms them not only on performance, but also on development speed!

                        • 9. Resolved: Re: ProducerWindow and reconnection
                          prange

                          Good job, thanks