80 Replies Latest reply on Jan 27, 2010 4:51 PM by marklittle

    Jboss transaction recovery issue

      Hi, all:

           My scenario:

      gif_1.gif

      Above is my scenario. As we all know, in the two-phase commit process:

      First, the XAResources are prepared one by one.

      Second, the XAResources are committed in order.

      During the commit step, I did the following:

      There are two XAResources in the transaction. After the commit of oracleResource_1 completed, I stopped the database server on Machine_3 before the commit of oracleResource_2 started. This raises an exception, and the transaction then needs recovery.

      Because oracleResource_1 has already been committed, its data is persisted to the file system and it is no longer possible to roll back. In other words, to preserve the transaction's ACID properties, oracleResource_2 must also be committed during recovery. However, the XARecoveryModule performs a rollback instead.

      Is this a bug, or have I misunderstood something?
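
The two steps described above can be sketched as a toy coordinator. This is only an illustration of the general two-phase commit idea, not JBossTS code: the `Participant` interface, the `Outcome` enum and the in-memory `decisionLog` are hypothetical names introduced here. The point it tries to show is that once the commit decision is recorded, a failure while committing the second participant leaves that participant to be committed later rather than rolled back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy two-phase commit coordinator. All names are hypothetical, not JBossTS classes. */
final class TwoPhaseCommitSketch {

    /** Minimal stand-in for a participant such as a wrapped XAResource. */
    interface Participant {
        boolean prepare();               // true = vote to commit
        void commit() throws Exception;  // may fail if the resource manager is down
        void rollback();
    }

    enum Outcome { COMMITTED, ROLLED_BACK, COMMIT_PENDING_RECOVERY }

    /** Runs 2PC; decisionLog stands in for the coordinator's durable log. */
    static Outcome run(List<Participant> participants, List<String> decisionLog) {
        // Phase 1: prepare one by one; any "no" vote aborts the whole transaction.
        for (Participant p : participants) {
            if (!p.prepare()) {
                for (Participant q : participants) {
                    q.rollback();
                }
                decisionLog.add("ROLLBACK");
                return Outcome.ROLLED_BACK;
            }
        }
        // The commit decision is recorded before any participant is told to commit.
        decisionLog.add("COMMIT");

        // Phase 2: commit in order. A failure here does not change the decision;
        // the failed participant is left for recovery to commit later.
        boolean pending = false;
        for (Participant p : participants) {
            try {
                p.commit();
            } catch (Exception e) {
                pending = true;
            }
        }
        return pending ? Outcome.COMMIT_PENDING_RECOVERY : Outcome.COMMITTED;
    }

    /** Helper for the demo: always votes yes, optionally fails on commit. */
    static Participant participant(final boolean failOnCommit) {
        return new Participant() {
            public boolean prepare() { return true; }
            public void commit() throws Exception {
                if (failOnCommit) throw new Exception("resource manager unavailable");
            }
            public void rollback() { }
        };
    }

    public static void main(String[] args) {
        // Second participant fails during phase 2, as in the scenario above.
        List<String> log = new ArrayList<>();
        Outcome outcome = run(Arrays.asList(participant(false), participant(true)), log);
        System.out.println(outcome + " " + log);
    }
}
```

In this sketch the decision log still says COMMIT after the second participant fails, which is the information a recovery pass would need in order to finish the commit.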

          • 2. Re: Jboss transaction recovery issue
            marklittle

            " I stop the database server at Machine_2 before start committing the oracleResource_2. Then there will be an exception, and this transaction need recovery."

             

            Since in your diagram oracleResource_1 is on Machine_2 and you've already said that oracleResource_1 was committed, why is there going to be an exception?

            • 3. Re: Jboss transaction recovery issue
              Sorry, it should be "stop the database server at Machine_3 before start committing the oracleResource_2"...
              • 4. Re: Jboss transaction recovery issue
                marklittle
                OK, so just to make sure there are no further confusions, can you please restate your problem and make sure it is accurate this time? Thanks.
                • 5. Re: Jboss transaction recovery issue

                  Thanks for looking into my question.

                  In my transactional application there are two database servers on different machines, and the application on my own machine terminates the transaction, so there are 3 machines in all. There are two XAResources in the transaction, and they come from two different database servers, both of which are ORACLE.

                  First of all, I added a breakpoint in the commit() method of class OracleResource, which wraps the class oracle.jdbc.xa.client.OracleXAResource (see the attached src.zip).

                   

                    public void commit() throws NotPrepared, HeuristicHazard, HeuristicMixed,
                        HeuristicRollback {
                      try {
                        System.err.println("Start to commit XA Transaction XID[" + xid + "]");
                        // Second argument is onePhase; false means this is the second phase of 2PC.
                        xaRes.commit(xid, false); // add breakpoint on this line
                        System.err.println("Commit XA Transaction Sucessfully XID[" + xid + "]");
                      } catch (XAException e) {
                        // Note: the XAException is only logged here, it is not rethrown.
                        System.err.println("Commit XA Transaction Failure XID[" + xid + "],Msg[" + e.getMessage() + "]");
                        e.printStackTrace();
                      }
                    }

                   

                  Start the program. In the two-phase commit protocol:

                  No.1, the XAResources enlisted in the transaction are prepared in order. For example, OracleResource_1 from oracle_1 (machine_2) is prepared first and OracleResource_2 from oracle_2 (machine_3) is prepared second. The prepare method of class OracleResource is invoked for each.

                  No.2, OracleResource_1 and OracleResource_2 are committed in order, so the commit method of each is invoked. This hits the breakpoint; because there are 2 XAResources, the breakpoint is hit twice. The first time, I let it step over. The second time, I stop the database server oracle_2, so the program throws an exception and the transaction is not completed.

                   

                  After I restart the database server oracle_2, the recovery service recovers the uncompleted transaction. The XAResource's recover method returns the uncompleted transaction ID. Because OracleResource_1 of this transaction was already committed in No.2, its result is persisted in the database and cannot be compensated, so the transaction returned from oracle_2 must be committed too. Otherwise the result of the transaction is not consistent (Consistency is one of the ACID properties).

                   

                  However, the recovery service currently performs a rollback instead of a commit, so I think this is a problem or a bug.

                   

                  Are we clear now? I am Chinese and my English is not very good, so please bear with me.
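
One detail in the wrapper shown above may be worth checking: its catch block prints the XAException and then returns normally, so the caller cannot tell that the second commit failed. Below is a minimal sketch of how a wrapper might classify the error code instead. `CommitErrorClassifier` and its `Action` values are hypothetical names, and the mapping is only an assumption about a reasonable policy, not what JBossTS or the attached OracleResource actually does. Only `javax.transaction.xa.XAException` and its constants come from the JDK.

```java
import javax.transaction.xa.XAException;

/** Hypothetical helper: decide what to do after XAResource.commit(xid, false) throws. */
final class CommitErrorClassifier {

    enum Action {
        TREAT_AS_COMMITTED,   // the branch ended up committed anyway
        RETRY_COMMIT_LATER,   // resource manager unreachable or busy: leave for recovery
        REPORT_HEURISTIC,     // the branch was completed independently of the decision
        REPORT_FAILURE        // anything else: surface the error to the caller
    }

    /** Maps the XA error code of a failed commit to a coarse action (assumed policy). */
    static Action classify(XAException e) {
        switch (e.errorCode) {
            case XAException.XA_HEURCOM:
                return Action.TREAT_AS_COMMITTED;
            case XAException.XA_RETRY:
            case XAException.XAER_RMFAIL:
                return Action.RETRY_COMMIT_LATER;
            case XAException.XA_HEURRB:
            case XAException.XA_HEURMIX:
            case XAException.XA_HEURHAZ:
                return Action.REPORT_HEURISTIC;
            default:
                return Action.REPORT_FAILURE;
        }
    }

    public static void main(String[] args) {
        // A stopped database server would typically surface as a resource manager failure.
        System.out.println(classify(new XAException(XAException.XAER_RMFAIL)));
    }
}
```

Whatever policy is chosen, the result would need to reach the transaction manager (for example by rethrowing or by mapping to one of the declared heuristic exceptions) rather than being swallowed.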

                  • 6. Re: Jboss transaction recovery issue
                    scarceller

                    Shaohua,

                     

                    I have also tested almost exactly the same setup. In my case the DBs are DB2v9.5 and Oracle10g in XA mode. I have correctly configured both XAResourceRecovery entries to use AppServerJDBCXARecovery.class. I know the TM can build the recovery connection(s) because it can roll back in-doubts just fine.

                     

                    But I also have the exact same issue in the scenario you describe:

                    - t1.prepare

                    - t2.prepare

                    - t1.commit

                    - network failure to the t2 database at this exact time.

                     

                    - I then power down the app server (to be 100% sure I have NO active transactions during recovery).

                    - reconnect t2 database

                    - check the state of the transactions (ORDERS) in both databases: t1 has one extra order committed, while t2 has the matching order in an in-doubt state, as it should be.

                     

                    Then I simply restart the AppServer with full arjuna debug turned on and can clearly see the TM contact both DBs. t1 says no in-doubts found, then t2 says 1 in-doubt found. The logs indicate it will recover/fix the issue, but it doesn't, and the in-doubt remains in the t2 DB as well as in the TM logs.

                     

                    Then, I tested the following just for fun:

                    - leave the in-doubt in t2

                    - stop the app server

                    - delete the tran logs (not recommended! I only did this as a test). This leaves the TM logs empty and the in-doubt in the t2 DB. The XA spec is clear here: in this case, during recovery phase 2, the TM connects to t2, sees the in-doubt, checks its logs, finds no match (nothing in the log), and does a rollback as the spec dictates. I then checked t2 myself and it had resolved the in-doubt by a rollback. However, this was just a test, and in our case it results in the wrong action because the matching ORDER was already committed to t1 and not to t2.
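
The recovery rule described in the last step can be sketched as follows. This is a simplified illustration under stated assumptions, not the Arjuna recovery module: `RecoveryDecisionSketch`, its `Action` enum and the use of plain strings as transaction IDs are all hypothetical. An in-doubt branch whose ID has a commit record in the TM log is committed; one with no matching record is rolled back.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the commit-or-rollback choice made for in-doubt branches. */
final class RecoveryDecisionSketch {

    enum Action { COMMIT, ROLLBACK }

    /**
     * @param inDoubtXids            IDs reported as in-doubt by a resource manager
     * @param commitDecisionsInTmLog IDs for which the TM log holds a commit decision
     * @return the action to take for each in-doubt ID, in the order reported
     */
    static Map<String, Action> decide(Collection<String> inDoubtXids,
                                      Set<String> commitDecisionsInTmLog) {
        Map<String, Action> plan = new LinkedHashMap<>();
        for (String xid : inDoubtXids) {
            // Commit record present: finish the commit. No record: roll back.
            plan.put(xid, commitDecisionsInTmLog.contains(xid) ? Action.COMMIT : Action.ROLLBACK);
        }
        return plan;
    }

    public static void main(String[] args) {
        // TM log intact: the in-doubt branch should be committed.
        System.out.println(decide(Arrays.asList("tx-1"), Collections.singleton("tx-1")));
        // TM log deleted, as in the test above: the same branch is rolled back.
        System.out.println(decide(Arrays.asList("tx-1"), Collections.<String>emptySet()));
    }
}
```

With the log intact the sketch chooses COMMIT for the in-doubt branch; with the log deleted it chooses ROLLBACK, which matches the behavior observed in the test above.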

                     

                    I reported this issue via this forum some months back. The answers I received did not help.

                    In the end, I was evaluating JBoss for XA recovery using a very detailed set of test cases, and once we saw this large failure we decided not to use JBoss.

                     

                    I'm glad I'm not the only one who has hit this issue.

                    Either we have something not configured correctly in the TM or JBoss has a bug.

                    As I said it's only in the case when the only right choice during recovery is a commit (not a rollback). I have NEVER seen the Arjuna TM commit an in-doubt left behind in either DB (I've swapped DB2 and Oracle).

                    Also of interest is that the detailed arjuna TRACE seems to indicate it thinks it should do a rollback, which is clearly wrong. So I'm glad it ends up not doing anything, because if it really did do a rollback on t2 then the ORDER would be sitting in t1 (shipping system) but not in t2 (billing system).

                     

                    I then decided to test another app server (different vendor) against the same EXACT DBs and all worked well. I did this to verify that I was not having a DB issue.

                     

                    Now that I described my failure, does this sound exactly like yours?

                    I have full arjuna TRACE output logs.

                     

                    The other tool I saw was a Java Swing GUI, ArjunaToolsFrameWork, in appserverdir\docs\examples\transactions\jbossts-tools.sar, and the readme seems to indicate this tool is for looking at the TM logs. I tried to run this tool but can't figure out how to load it. I think this tool could uncover some more details by simply showing what's in the TM logs and what state it thinks the failed tran is in.

                    I would like to get that GUI working. Does anyone know where the docs for starting that tool might be, or how to load it?

                     

                    Hope this helps you.

                    • 7. Re: Jboss transaction recovery issue
                      jhalliday
                      Please send me the trace, along with the test case if it's easy to do so, and exact details of the versions (AS, Oracle, drivers) you are using. The existing crash rec test cases we have pass except for mysql (known limitation in their xa impl) and MSSQL (broken drivers) so I'm a bit baffled to see this on other dbs, but if I have enough information to recreate it I should be able to sort it out.
                      • 8. Re: Jboss transaction recovery issue
                        scarceller

                        see this old thread http://community.jboss.org/thread/145372

                        last 2 posts.

                        It has the trace details from a run several months ago.

                         

                        That trace was after a server cold restart, so NO active trans were in the TM at the time of recovery.

                         

                        The test case is simple: cause a failure before the commit to the second DB. In this case the work has committed to DB1 and the only choice during recovery is to commit it to DB2, and it simply never works.

                         

                        Could you think of any reason like some wrong setting in the JDBC def or the TM config that would cause this?

                         

                        EDIT:

                        The DBs are

                        DB2 V8.1 with java drivers from that exact DB

                        Oracle 10g with drivers from that exact DB

                         

                        Also, these same exact drivers work on the other AppServer just fine.

                        • 9. Re: Jboss transaction recovery issue
                          scarceller

                          Any tips on starting the Java Swing GUI to inspect the tran logs?

                          I think this tool could really be useful.

                           

                          Thanks.

                          • 10. Re: Jboss transaction recovery issue
                            jhalliday

                            Ahh yes, I remember now: the trace has the dodgy com.arjuna.ats.internal.jta.recovery.info.rollingback statements. Try AS trunk, which has the new JTA with clearer trace logging, or run under a debugger to see what it is actually doing. Based on the behavior described I'd hazard a guess at JBTM-602, in which case dropping in the TS 4.6.1.CP03 jar files may help with the behavior, although not with the dodgy log output.

                            • 11. Re: Jboss transaction recovery issue
                              scarceller

                              I see the latest is JBossTS 4.9.0.GA. Does this have the fix for JBTM-602?

                               

                              I can't find TS 4.6.1.CP03. I'm guessing the latest 4.9.0.GA should have it, but I don't want to be wrong.

                               

                              I'll run the test again if I can figure out what JTA to download.

                               

                              Thanks.

                              • 12. Re: Jboss transaction recovery issue
                                jhalliday

                                yes, but 4.9 integrates with AS 6, not AS 5.1.

                                 

                                http://repository.jboss.com/maven2-brew/jboss/jbossts/

                                • 13. Re: Jboss transaction recovery issue
                                  mmusgrov

                                  scarceller wrote:

                                   

                                  Any tips on starting the Java Swing GUI to inspect the tran logs?

                                  I think this tool could really be useful.

                                   

                                  Thanks.

                                   

                                  There should be a file called INSTALL in the distribution that describes how to install the product. The last section says how to run the embedded tools (basically, the tools are packaged as a sar file: just drop the sar into the app server deploy directory and then navigate to the EmbeddedTools service in the JMX console). The JBossTS core manual shows how to use it, but it is quite straightforward: it shows the Object Store hierarchy as a tree view from which you can drill down to see details of individual transactions.

                                   

                                  Note that this tool is provided as is. We are working on providing similar functionality as a JMX MBean so the Swing version of the tool will eventually be removed.

                                  • 14. Re: Jboss transaction recovery issue
                                    scarceller

                                    Thanks for the link.

                                     

                                    I downloaded the jbossjts-4.6.1.GA_CP03.jar

                                     

                                    Is it as simple as replacing the existing jbossjts.jar in the jboss-as\client and jboss-as\common\lib with this new version?

                                     

                                    Is this all that I need to try?

                                     

                                    Thanks.
