74 Replies Latest reply on May 17, 2006 1:02 AM by mskonda
      • 60. Re: XARecovery: Messaging integration with JBoss Transaction

        yup

        • 61. Re: XARecovery: Messaging integration with JBoss Transaction
          marklittle

          "timfox" wrote : I feel like we're going around in circles :)
          |

          You're telling me ;)

          When I get out of this J1 session, I'll see if I can write the whole thing up again in a single entry. All I can say now is that this has NOTHING to do with XAResourceRecovery ;-)

          anonymous wrote :
          | It's my understanding that we cannot guarantee that the tm has a serialized XAResource since the tm node might have failed after the transaction on the server was prepared and logged but before it was logged on the TM node, so relying on serialized XAResources is not an option.
          |

          That's true, but you should implement an entire solution which works in other failure cases too.

          anonymous wrote :
          | Therefore we need to provide an XAresource somehow that allows the TM to call recover() so it can get a list of prepared txs on the remote node so it can call commit() / rollback() as appropriate.
          |

          No, you need to implement an XARecoveryModule that is run by the recovery manager on the machine where your XAResource state resides. This recovery module will scan your datastore and recreate the XAResource state (somehow - that'll be up to you) and then figure out whether to commit or roll it back (by calling the transaction it was registered with).
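
A minimal sketch of the shape such a recovery module might take, assuming the prepared state is kept in a simple keyed store. Every name here (`DatastoreRecoveryModule`, `TransactionLookup`, `Outcome`) is hypothetical and is not a JBossTS API; `TransactionLookup` merely stands in for "calling the transaction it was registered with".

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; this is not the JBossTS recovery API.
enum Outcome { COMMIT, ROLLBACK, NOT_YET_KNOWN }

// Stand-in for asking the transaction the participant was registered with.
interface TransactionLookup {
    Outcome outcomeOf(String txId);
}

final class DatastoreRecoveryModule {
    private final Map<String, String> preparedStore; // txId -> persisted participant state
    private final TransactionLookup lookup;
    final List<String> actions = new ArrayList<>();  // what was done, kept for inspection

    DatastoreRecoveryModule(Map<String, String> preparedStore, TransactionLookup lookup) {
        this.preparedStore = preparedStore;
        this.lookup = lookup;
    }

    // One periodic pass: scan the store, resolve what can be resolved, keep the rest.
    void periodicPass() {
        Iterator<Map.Entry<String, String>> it = preparedStore.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            Outcome outcome = lookup.outcomeOf(entry.getKey());
            if (outcome == Outcome.NOT_YET_KNOWN) {
                continue; // leave it in the store for the next pass
            }
            actions.add((outcome == Outcome.COMMIT ? "commit " : "rollback ") + entry.getKey());
            it.remove(); // resolved entries leave the datastore
        }
    }
}
```

The recovery manager would call `periodicPass()` on a timer; how the participant state is actually recreated from the store is left out, since that part is implementation specific.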

          anonymous wrote :
          | It was my understanding that XAResourceRecovery is a way of providing that XAResource to the TM.

          It is, but only if the TM knows it needs one. In this case it doesn't because there is no TM log entry. What you need to tie into is the recovery subsystem, which is separate from the TM.

          • 62. Re: XARecovery: Messaging integration with JBoss Transaction
            timfox

            What is so different about our XAResources that we can't use the standard XARecoveryManager?

            • 63. Re: XARecovery: Messaging integration with JBoss Transaction
              timfox

               

              "mark.little@jboss.com" wrote:


              When I get out of this J1 session, I'll see if I can write the whole thing up again in a single entry.


              :) That would be great. I have to say that right now I am utterly confused...

              • 64. Re: XARecovery: Messaging integration with JBoss Transaction
                marklittle

                 

                "timfox" wrote:
                mskonda - right, so what you have implemented is basically what I was expecting.

                At this point, I don't really understand what is wrong with the way you have done it.


                Well if it's based on XAResourceRecovery and uses it in the way you think it works, that'll be the reason it doesn't work ;-)

                • 65. Re: XARecovery: Messaging integration with JBoss Transaction
                  marklittle

                  Here is the scenario for all possible recovery cases (irrespective of what the XAResource implementation looks like). This won't talk about JBossTS specifics. I want to get the description of recovery out first, then I'll post a separate entry on how it works in JBossTS (probably later in the day, as I've got J1 stuff to do.)

                  For the avoidance of doubt, keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in RFC2119 [S. Bradner, "Key words for use in RFCs to Indicate Requirement Levels," RFC 2119, Harvard University, March 1997 ].

                  When a participant is enrolled with the transaction, it MAY get a reference to the transaction coordinator in order that it MAY then call back to the coordinator later to determine the transaction outcome (e.g., if it has not received a termination notification in some time period.) If the participant does not receive a reference to the coordinator then in some recovery situations it has no choice but to either wait for the coordinator to eventually inform it of the outcome (which may never happen if the coordinator does not have a reference to the participant), or it can be flagged to some administrator to deal with manually. This would require the sys admin to be able to figure out which transaction the participant was enrolled with in the first place and then to see what happened. The sys admin knows that if the transaction is not running then it MUST have rolled back (we use presumed abort). If the transaction is running, then the sys admin SHOULD wait and do nothing.

                  If a participant fails before it gets prepare, then there is no recovery for it or the coordinator to do. Presumed abort semantics mean that the coordinator will roll back the transaction and the participant can unilaterally do likewise (if there is any representation of the participant after the failure is fixed).

                  If the participant received a prepare message then it MUST have recorded its after-commit state durably. Along with this state MAY be a reference to the coordinator, as mentioned above.

                  Likewise, if all of the prepare phase completes successfully, the transaction coordinator will write a log entry containing references to the participants.

                  Now if we have a failure, it could happen such that the participant and transaction are fully prepared (scenario A), or only the participant is prepared (scenario B).

                  In scenario A, we MAY have both top-down and bottom-up recovery. If the participant has a reference to the coordinator, it can call up and ask about the status. If the coordinator has not yet recovered, then the bottom-up recovery needs to back off and try again. Obviously this back off could keep happening, so some manual intervention may be necessary in the edge case (e.g., recovery still hasn't happened after 10 years!) In the top-down recovery case, the coordinator goes through its intentions list (the log) and informs each participant that they should commit (if there's a log entry, then the outcome MUST be to commit, with presumed abort). If any participant doesn't answer, then the entry for them remains in the log and recovery will try again periodically. Participants who do respond are pruned from the log.
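
The top-down case could be sketched roughly as follows. `LoggedParticipant` and `TopDownRecovery` are hypothetical names used for illustration only, and a `false` return from `commit()` is an assumed way of modelling "the participant didn't answer".

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical participant handle; commit() returns false if the participant did not answer.
interface LoggedParticipant {
    boolean commit();
}

final class TopDownRecovery {
    // One recovery pass over a single transaction's log entry.
    // A log entry only exists if prepare completed, so the outcome to deliver is commit.
    // Returns true when every participant has answered and the entry can be deleted.
    static boolean pass(List<LoggedParticipant> logEntry) {
        Iterator<LoggedParticipant> it = logEntry.iterator();
        while (it.hasNext()) {
            if (it.next().commit()) {
                it.remove(); // responders are pruned from the log
            }
            // non-responders stay in the log for the next periodic pass
        }
        return logEntry.isEmpty();
    }
}
```

Calling `pass` repeatedly on the same entry models the periodic retry: the list shrinks as participants come back, and the entry is finished once it is empty.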

                  In scenario B, we can only have bottom-up recovery because the coordinator hasn't written anything about the participant in the log (there is no log). In this case, it's all down to the participant and the amount of information it recorded during prepare. If it didn't save the coordinator reference, then there is nothing that can be done automatically: the participant (and any recovery system) cannot determine the transaction outcome because without a coordinator reference, you can't differentiate this case from rollback or commit - either is just as likely. Hence manual intervention is necessary.

                  If the participant did record the coordinator reference, then it SHOULD use it to ascertain the transaction outcome. It'll either be told that the transaction is committing or that it has rolled back. It can then act accordingly.
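
As a rough illustration of the bottom-up decision described above, here is a sketch in Java. The types (`CoordinatorRef`, `CoordinatorAnswer`, `Action`) are hypothetical, and the `maxAttempts` cut-off is an assumption added to model the point at which back-off gives way to manual intervention; the description itself sets no such limit.

```java
import java.util.Optional;

// Hypothetical types for illustration; not part of JTA or JBossTS.
enum CoordinatorAnswer { COMMITTING, ROLLED_BACK, NOT_RECOVERED_YET }

enum Action { COMMIT, ROLLBACK, BACK_OFF_AND_RETRY, MANUAL_INTERVENTION }

// The coordinator reference a participant MAY have saved at prepare time.
interface CoordinatorRef {
    CoordinatorAnswer query(String txId);
}

final class BottomUpRecovery {
    // Decide what a prepared participant should do next.
    // attempts/maxAttempts are an assumed policy for giving up on back-off.
    static Action decide(String txId, Optional<CoordinatorRef> ref, int attempts, int maxAttempts) {
        if (!ref.isPresent()) {
            // No coordinator reference: commit and rollback cannot be told apart.
            return Action.MANUAL_INTERVENTION;
        }
        switch (ref.get().query(txId)) {
            case COMMITTING:
                return Action.COMMIT;
            case ROLLED_BACK:
                return Action.ROLLBACK;
            default:
                // Coordinator has not recovered yet: back off, or escalate eventually.
                return attempts < maxAttempts ? Action.BACK_OFF_AND_RETRY : Action.MANUAL_INTERVENTION;
        }
    }
}
```

Note that the transaction id is passed to the coordinator reference rather than used in place of it; the id alone gives the participant nobody to ask.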

                  Hopefully we all agree that this summary is good enough to analyse the problem against?

                  • 66. Re: XARecovery: Messaging integration with JBoss Transaction
                    marklittle

                    BTW, scenario B without a coordinator reference is pretty much what XA gives you in general. It's not presumed abort. Many XA participants in this case will unilaterally abort after a set period of time, which can result in heuristic outcomes. We should try to avoid that here.

                    • 67. Re: XARecovery: Messaging integration with JBoss Transaction
                      marklittle

                      Oh, and a transaction id (XID) is not the same thing as a coordinator reference ;-)

                      • 68. Re: XARecovery: Messaging integration with JBoss Transaction
                        timfox

                         

                        "mark.little@jboss.com" wrote:

                        Hopefully we all agree that this summary is good enough to analyse the problem against?


                        Looks good :)

                        • 69. Re: XARecovery: Messaging integration with JBoss Transaction
                          timfox

                          Since we don't record references to the co-ordinator, then for Scenario B I was assuming that we just mark transactions as in doubt after a certain amount of time, then the sysadmin can decide what to do with them.

                          For Scenario A, JBoss TS will have the XAResource serialized in its logs (if we make it serializable) so it can just deserialize it and call commit/rollback as appropriate?

                          • 70. Re: XARecovery: Messaging integration with JBoss Transaction

                            Can I interject and attempt to refocus this? ;-)

                            I think 7! pages is enough discussion :-)

                            What is required is a roadmap that produces
                            a set of tests that shows this stuff working.

                            As an aside, we also need to cope with
                            scenarios where JBossTS is NOT the transaction manager.
                            Things like XAResourceRecovery or serializable XAResources while
                            useful as building blocks are not the full solution since neither
                            is part of the spec.
                            The fundamental issue being that there is no JTA spec equivalent of
                            XAResource.open(...)

                            I'd break the issue down to a number of problems/issues:

                            1) How does JBoss Messaging's XAResource integrate into a
                            transaction manager's recovery mechanism?
                            2) What features and information should JBoss Messaging
                            provide to cater for recovery heuristics or the "bottom up" recovery?
                            3) Do we have tests that show all this works?

                            Possible answers:

                            1) Implement both a serializable XAResource and a JBoss Messaging
                            specific XAResourceFactory. If the underlying TM does not handle
                            serializable XAResources then the XAResourceFactory can
                            be used to build TM specific integrations like XAResourceRecovery.

                            2) Information about inflight transactions needs to be exposed
                            at the management layer, including configurations for placing
                            transactions "in doubt" and options for the admin to take decisions.

                            3) Somebody needs to write the basic tests.

                            Issues:

                            1a) The XAResourceFactory/Recovery is really a strategy thing.
                            For example mskonda assumes that naming is configured correctly
                            and there is a predefined name.
                            In the case where the machine is talking to two different JMS clusters
                            or different jndi bindings, this solution is not going to work.

                            1b) If you are going to provide a serializable XAResource, you can
                            include the connection information in the serialized form.
                            However, this also has the possibility that this information
                            may become orphaned from the server/cluster.
                            e.g. Transaction prepared against cluster nodes A, B
                            but sometime later at recovery the cluster is made up of C, D, E
                            cf. http://wiki.jboss.org/wiki/Wiki.jsp?page=RetryInterceptor

                            2) The "bottom up" recovery needing a reference to the co-ordinator
                            and retrieving the state is another "out of spec" feature.
                            The JTA spec only provides a XID to the XAResource.
                            The tx context might not be the same transaction.
                            The tx context might not be serializable/referenceable.

                            3) The tests would be better written against a basic "mock" TM,
                            JBoss Messaging needs to work against a TM with a minimal feature set.
                            Testing against JBossTS is an issue for additional integration tests.
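
A hedged sketch of what such a basic "mock" TM test fixture might look like. Only `XAResource`, `Xid` and `XAException` come from the standard `javax.transaction.xa` package; `SimpleXid`, `FakeXAResource` and `MockTransactionManager` are hypothetical names, and the in-memory resource is only a stand-in for the real JBoss Messaging XAResource.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Hypothetical Xid with value equality so it can be looked up in the mock TM's log.
final class SimpleXid implements Xid {
    private final byte[] gtrid;
    SimpleXid(String id) { this.gtrid = id.getBytes(StandardCharsets.UTF_8); }
    public int getFormatId() { return 1; }
    public byte[] getGlobalTransactionId() { return gtrid.clone(); }
    public byte[] getBranchQualifier() { return new byte[0]; }
    @Override public boolean equals(Object o) {
        return o instanceof SimpleXid && Arrays.equals(gtrid, ((SimpleXid) o).gtrid);
    }
    @Override public int hashCode() { return Arrays.hashCode(gtrid); }
}

// In-memory stand-in for a messaging XAResource that remembers prepared branches.
final class FakeXAResource implements XAResource {
    final Set<Xid> prepared = new HashSet<>();
    final Set<Xid> committed = new HashSet<>();
    final Set<Xid> rolledBack = new HashSet<>();

    public void start(Xid xid, int flags) { }
    public void end(Xid xid, int flags) { }
    public int prepare(Xid xid) { prepared.add(xid); return XA_OK; }
    public void commit(Xid xid, boolean onePhase) { prepared.remove(xid); committed.add(xid); }
    public void rollback(Xid xid) { prepared.remove(xid); rolledBack.add(xid); }
    public Xid[] recover(int flag) { return prepared.toArray(new Xid[0]); }
    public void forget(Xid xid) { }
    public boolean isSameRM(XAResource other) { return other == this; }
    public int getTransactionTimeout() { return 0; }
    public boolean setTransactionTimeout(int seconds) { return false; }
}

// Minimal mock TM: its "log" is just the set of Xids it had decided to commit.
final class MockTransactionManager {
    final Set<Xid> commitLog = new HashSet<>();

    // Ask the resource for in-doubt branches and resolve each one against the log.
    void recover(XAResource resource) throws XAException {
        Xid[] inDoubt = resource.recover(XAResource.TMSTARTRSCAN | XAResource.TMENDRSCAN);
        for (Xid xid : inDoubt) {
            if (commitLog.contains(xid)) {
                resource.commit(xid, false);
            } else {
                resource.rollback(xid); // no log entry: this mock presumes abort
            }
        }
    }
}
```

A test would prepare a couple of branches on the fake resource, add only some of them to `commitLog`, call `recover`, and then check that the logged branches were committed and the rest rolled back. How the mock TM obtains the XAResource after a crash is exactly the open question in 1) and is deliberately not modelled here.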

                            • 71. Re: XARecovery: Messaging integration with JBoss Transaction
                              marklittle

                               

                              "adrian@jboss.org" wrote:
                              Can I interject and attempt to refocus this? ;-)

                              <lots of text deleted> ;-)



                              All of this is fine and good and correct: there is no standard way to approach this. At no time over the past 40 years (even predating XA) has there been a standard way in which to drive recovery (even ignoring a distributed situation). However, I hope the discussion as a whole has been useful and maybe someone should take the time to capture the gist of the topic (including <lots of text deleted>) in a wiki for messaging - it will be useful from a conceptual perspective when designing and implementing the solution.

                              • 72. Re: XARecovery: Messaging integration with JBoss Transaction
                                ovidiu.feodorov

                                If somebody really, really wants to write the wiki, then I'll yield, otherwise, I'll give it a try :)

                                • 73. Re: XARecovery: Messaging integration with JBoss Transaction
                                  marklittle

                                  I've written far too much about transactions and recovery over the past 20+ years. I'll step back from this one ;-)

                                  • 74. Re: XARecovery: Messaging integration with JBoss Transaction

                                    Think I can try :)
