7 Replies Latest reply on Jun 3, 2014 11:20 AM by tomjenkinson

    Problem during recovery of aborted transaction - recovery changes status to committed

    tomasz.lewandowski

      I have a simple MDB (JBoss 7.1 with jbossts-4.16.5) that consumes a message from HornetQ (2.2.14) and inserts it into a database (PostgreSQL 9.1.10), all in a global transaction. Most "crash" situations are handled OK, but there is a problem when the database crashes just after preparing the transaction but before sending the response to the TM. The TM decides to abort the transaction. First it rolls back the message reception, then it tries to roll back the database branch, but the database is still unavailable, so it writes a log file with status ABORTED. So far it looks fine.

      The problem starts during recovery. AtomicActionRecoveryModule checks the status of this transaction in the ObjectStore using com.arjuna.ats.arjuna.recovery.ActionStatusService.getObjectStoreStatus(Uid, String) and gets ActionStatus.COMMITTED (sic!), because this method returns ActionStatus.COMMITTED whenever the log file exists.

      I think that ActionStatusService.getObjectStoreStatus should check the action status in the file contents and not just check whether the file exists. Can you explain this situation, please?

        • 1. Re: Problem during recovery of aborted transaction - recovery changes status to committed
          tomjenkinson

          Hi Tomasz,

           

          It's a nuance of the implementation. The only way it would be an issue is if we were to then go and call commit on the database XAResource. That would be impossible, though, as we are a write-behind logger: because the database was not fully prepared, our save_state method will not record information about it (it should still be in the pending list if you have it in the debugger).

           

          There are two statuses in play here: the transaction status (ABORTED, meaning the transaction has rolled back) and the object store file system status (COMMITTED, meaning there is a file).
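
The distinction between the two statuses can be illustrated with a small, self-contained sketch. Everything in it is hypothetical (the class `TwoStatusSketch`, its enums, and the in-memory map standing in for the object store); it is not Narayana's API, only a model of "a log exists" versus "what outcome the log records".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model, not Narayana code: an in-memory map stands in for the object store.
public class TwoStatusSketch {
    // "Store" status only says whether a log entry exists for the uid.
    public enum StoreStatus { COMMITTED, NOT_FOUND }

    // "Transaction" status is the outcome recorded inside the log entry.
    public enum TxStatus { ABORTED, COMMITTED, UNKNOWN }

    private final Map<String, TxStatus> logs = new HashMap<>();

    // Persist a log entry carrying the transaction outcome.
    public void writeLog(String uid, TxStatus outcome) {
        logs.put(uid, outcome);
    }

    // Existence check only: any entry yields COMMITTED, whatever it contains.
    public StoreStatus storeStatus(String uid) {
        return logs.containsKey(uid) ? StoreStatus.COMMITTED : StoreStatus.NOT_FOUND;
    }

    // Content check: returns the outcome recorded in the entry, if any.
    public TxStatus transactionStatus(String uid) {
        return logs.getOrDefault(uid, TxStatus.UNKNOWN);
    }
}
```

In this model, a log written with `TxStatus.ABORTED` still reports `StoreStatus.COMMITTED` from `storeStatus`, which mirrors the situation described in this thread.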

           

          Hope that helps,

          Tom

          • 2. Re: Problem during recovery of aborted transaction - recovery changes status to committed
            tomasz.lewandowski

            Thanks for the explanation.

            You are right that recovery doesn't call commit on the XAResource, but it also doesn't call rollback on it, as I would expect. It doesn't call rollback because recovery always calls phase2Commit whenever the object store file system status is COMMITTED (which is true in this case).

            In my case the problem arises when committing the transaction for the first time, because the database prepared the transaction and went down just afterwards. As one of the prepares failed, the TM executes phase2Abort, rolls back HornetQ, and tries to roll back Postgres. Postgres is down, so the JDBC driver returns XAException.XAER_RMERR, which the TM treats as HEURISTIC_HAZARD and inserts into the heuristicList. I would expect recovery to repeat phase2Abort, but it doesn't.

            My question is - is it:

            1) a JDBC driver bug (because it should return something different than XAException.XAER_RMERR in this case)?

            2) a TM bug, as it should replay phase2Abort instead of phase2Commit during recovery?

            3) a feature that I should live with?

            • 3. Re: Problem during recovery of aborted transaction - recovery changes status to committed
              tomjenkinson

              Hi Tomasz,

               

              If it's stored as a heuristic, an administrator is required to go in, look at the logs manually, and resolve it themselves. That said, I can't see the bit of code that treats XAException.XAER_RMERR as HEURISTIC_HAZARD. I am assuming you have it in the debugger; can you point out where in this file it converts it to a HEURISTIC_HAZARD: narayana/ArjunaJTA/jta/classes/com/arjuna/ats/internal/jta/resources/arjunacore/XAResourceRecord.java at master · jbosst… If it's not in that file it could be in BasicAction.


              4.16.5 is really old, so I can't be certain this isn't something we have already fixed, sorry. Can you try to replicate it in WildFly 8.1.0?


              Thanks,

              Tom

              • 4. Re: Problem during recovery of aborted transaction - recovery changes status to committed
                tomasz.lewandowski

                In com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord.topLevelAbort() there is a line "_theXAResource.rollback(_tranID);" that calls rollback, and a catch for XAException that maps (in a switch block) XAException.XAER_RMERR and XAException.XA_HEURHAZ to TwoPhaseOutcome.HEURISTIC_HAZARD. The _prepared attribute is true because it is set just before prepare.
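
The mapping described above can be sketched roughly as follows. This is not the Narayana source: the class `RollbackOutcomeSketch`, the `Outcome` enum, and the `OTHER` fallback are assumptions made for illustration, and only the two error codes named in this post are mapped. The `XAException` constants are the standard ones from `javax.transaction.xa`.

```java
import javax.transaction.xa.XAException;

// Hypothetical sketch, not Narayana code: shows a switch that maps
// rollback error codes to an outcome, as described in the post.
public class RollbackOutcomeSketch {
    // OTHER is a placeholder for every code this sketch does not model.
    public enum Outcome { HEURISTIC_HAZARD, OTHER }

    public static Outcome mapRollbackError(int errorCode) {
        switch (errorCode) {
            case XAException.XAER_RMERR:
            case XAException.XA_HEURHAZ:
                // Both codes end up as a heuristic hazard.
                return Outcome.HEURISTIC_HAZARD;
            default:
                return Outcome.OTHER;
        }
    }
}
```

With this mapping, a driver that reports XAER_RMFAIL instead of XAER_RMERR would no longer land in the heuristic branch of the sketch.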

                Nothing has changed in newer releases for this case.

                • 5. Re: Problem during recovery of aborted transaction - recovery changes status to committed
                  tomjenkinson

                  _prepared is set before we call prepare because the branch may or may not be prepared in the resource manager.

                   

                  You were onto something when you said the driver was incorrect to return RMERR in this scenario:

                   

                  This is from the XA spec:

                  [XAER_RMERR]
                  An error occurred in rolling back the transaction branch. The resource manager is free to forget about the branch when returning this error so long as all accessing threads of control have been notified of the branch’s state.

                  [XAER_RMFAIL]
                  An error occurred that makes the resource manager unavailable.

                   

                  We are treating it as a heuristic because we just don't know what the resource manager has done in this circumstance. I don't think there is much we can do. The only valid response for the resource manager is XAER_RMFAIL. You could probably patch the driver using Byteman, if you want to try that.
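
As an illustration of the translation such a patch would perform, here is a minimal sketch that uses a dynamic proxy around an XAResource instead of Byteman. This is a different technique from the one suggested above, and it only applies where you control how the XAResource is handed to the transaction manager. The class `RmFailTranslator` and the `rmUnavailable` check are hypothetical; how to detect that the resource manager is actually down is left to the caller.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.function.BooleanSupplier;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;

// Hypothetical sketch: rethrows XAER_RMERR from rollback() as XAER_RMFAIL
// when the caller-supplied check says the resource manager is unavailable.
public class RmFailTranslator {
    public static XAResource wrap(XAResource delegate, BooleanSupplier rmUnavailable) {
        InvocationHandler handler = (proxy, method, args) -> {
            try {
                // Forward every XAResource call to the real resource.
                return method.invoke(delegate, args);
            } catch (InvocationTargetException e) {
                Throwable cause = e.getCause();
                if ("rollback".equals(method.getName())
                        && cause instanceof XAException
                        && ((XAException) cause).errorCode == XAException.XAER_RMERR
                        && rmUnavailable.getAsBoolean()) {
                    XAException translated = new XAException(XAException.XAER_RMFAIL);
                    translated.initCause(cause); // keep the driver's original error
                    throw translated;
                }
                // Anything else is rethrown unchanged.
                throw cause;
            }
        };
        return (XAResource) Proxy.newProxyInstance(
                RmFailTranslator.class.getClassLoader(),
                new Class<?>[] { XAResource.class },
                handler);
    }
}
```

A caller would pass the driver's XAResource and a check of its own, for example `RmFailTranslator.wrap(driverResource, () -> connectionIsDown())`, where `connectionIsDown()` is a placeholder for whatever availability test fits the environment.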

                   

                  Thanks,

                  Tom

                  • 6. Re: Problem during recovery of aborted transaction - recovery changes status to committed
                    tomasz.lewandowski

                    Great thanks,

                    After patching the JDBC driver (as you suggested) this situation is handled fine: the TM doesn't create a log file (as pendingList, preparedList and heuristicList are all empty), and the database holds the prepared transaction until recovery kicks in and does bottom-up recovery.

                    • 7. Re: Problem during recovery of aborted transaction - recovery changes status to committed
                      tomjenkinson

                      Great news - if you file an issue with Postgres, please do include the link here so that those interested can follow it.

                       

                      Thanks!

                      Tom