35 Replies Latest reply: Oct 20, 2010 5:24 PM by henk de boer
  • 16. Re: when to use xa-datasource
    arjan tijms Novice

    testrot wrote:

     

    http://docs.redhat.com/docs/en-US/JBoss_Enterprise_Application_Platform/5/html-single/Administration_And_Configuration_Guide/index.html#id2666889

     

    Table 9.2

     

    com.arjuna.ats.arjuna.coordinator.commitOnePhase

    Thanks a lot! Now why didn't this show up in Google? It does contain a lot of the keywords I searched for.

     

    Anyway, the default value thus seems to be true (YES):

     

    Property: com.arjuna.ats.arjuna.coordinator.commitOnePhase
    Default: YES
    Description: Determines whether the transaction manager automatically applies the one-phase commit optimization to the transaction completion protocol, when only a single resource is registered with the transaction. Enabled by default to prevent writing transaction logs needlessly.

     

    I think the previous question then remains interesting. If the one-phase commit optimization is enabled by default, is there still any extra overhead when using the xa-datasource as the only participant in a transaction?

  • 17. Re: when to use xa-datasource
    Jonathan Halliday Master
    If the one-phase commit optimization is enabled by default, is there still any extra overhead when using the xa-datasource as the only participant in a transaction?

     

    no, the execution path is pretty much identical.
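
    For readers who want to picture what the one-phase commit optimization does, here is a minimal sketch. It is a toy model only: the Resource interface, RecordingResource and complete() are hypothetical names made up for illustration and are not the JBossTS/Arjuna API. The commitOnePhase flag merely mirrors the property discussed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy illustration only: hypothetical types, NOT the JBossTS/Arjuna API.
class OnePhaseSketch {

    /** Minimal stand-in for a transactional resource (hypothetical interface). */
    interface Resource {
        boolean prepare();              // phase 1: vote
        void commit(boolean onePhase);  // phase 2, or the only phase when onePhase is true
        void rollback();
    }

    /** Records the calls it receives so the protocol path can be inspected. */
    static class RecordingResource implements Resource {
        final List<String> calls = new ArrayList<>();
        public boolean prepare() { calls.add("prepare"); return true; }
        public void commit(boolean onePhase) { calls.add(onePhase ? "commit(1PC)" : "commit(2PC)"); }
        public void rollback() { calls.add("rollback"); }
    }

    /** Completes the transaction; returns true on commit, false on rollback. */
    static boolean complete(List<? extends Resource> enlisted, boolean commitOnePhase) {
        if (commitOnePhase && enlisted.size() == 1) {
            // Single participant: skip the prepare phase (and, in a real
            // transaction manager, the log write that goes with it).
            enlisted.get(0).commit(true);
            return true;
        }
        List<Resource> prepared = new ArrayList<>();
        for (Resource r : enlisted) {
            if (r.prepare()) {
                prepared.add(r);
            } else {
                // Simplification: only undo the participants that already voted yes.
                for (Resource p : prepared) p.rollback();
                return false;
            }
        }
        for (Resource r : prepared) r.commit(false);
        return true;
    }

    public static void main(String[] args) {
        RecordingResource only = new RecordingResource();
        complete(Arrays.asList(only), true);
        System.out.println("one resource:  " + only.calls);

        RecordingResource a = new RecordingResource();
        RecordingResource b = new RecordingResource();
        complete(Arrays.asList(a, b), true);
        System.out.println("two resources: " + a.calls + " " + b.calls);
    }
}
```

    With a single enlisted resource the sketch never calls prepare(), which is the sense in which the XA path collapses to roughly the same work as a local commit.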

  • 18. Re: when to use xa-datasource
    arjan tijms Novice

    Jonathan Halliday wrote:

     

    no, the execution path is pretty much identical.

     

    Ok, thanks for the confirmation.

     

    One difference that I noted though. With an xa-datasource, I get the following log lines in my console every 7:29 minutes:

     

    14:28:49,780 INFO  [STDOUT] DriverManager.getConnection("jdbc:postgresql://my.testdomain:5432/dev_db?loginTimeout=0&socketTimeout=0&prepareThreshold=5&unknownLength=2147483647&tcpkeepalive=false")
    14:28:49,781 INFO  [STDOUT]     trying driver[className=sun.jdbc.odbc.JdbcOdbcDriver,sun.jdbc.odbc.JdbcOdbcDriver@1f6dd397]
    14:28:49,781 INFO  [STDOUT] *Driver.connect (jdbc:postgresql://my.testdomain:5432/dev_db?loginTimeout=0&socketTimeout=0&prepareThreshold=5&unknownLength=2147483647&tcpkeepalive=false)
    14:28:49,781 INFO  [STDOUT]     trying driver[className=org.postgresql.Driver,org.postgresql.Driver@53c12192]
    14:28:49,929 INFO  [STDOUT] getConnection returning driver[className=org.postgresql.Driver,org.postgresql.Driver@53c12192]
    

     

    This repeats for exactly the number of connections I have configured as min-pool-size in the datasource configuration. If I change xa-datasource in my -ds.xml file to local-tx-datasource, and leave the min/max pool settings exactly the same, this doesn't happen.

  • 19. Re: when to use xa-datasource
    henk de boer Master

    Jonathan Halliday wrote:

     

    no, the execution path is pretty much identical.

     

    Which thus basically means an application can pretty much use an xa-datasource as its default, can't it?

     

    Most tutorials and even the JBoss in Action book mentioned in this thread make it sound like you should always use local-tx-datasource as the default and use xa-datasource only if you really need it. Many then go into some detail about how you probably don't need XA and how it's very heavyweight, etc. I think a lot of developers have become scared of XA because of that.

  • 20. Re: when to use xa-datasource
    Jonathan Halliday Master
    Which thus basically means an application can pretty much use an xa-datasource as its default, can't it?

     

    Only if the underlying database actually supports it.

    Most tutorials and even the JBoss in Action book mentioned in this thread make it sound like you should always use local-tx-datasource as the default and use xa-datasource only if you really need it.

    If you're considering the problem as one of local-tx-datasource vs. xa-datasource configuration then you're approaching it the wrong way. The real issue is one of application design: distributed data consistency based on XA or based on custom application logic. It breaks down into:

     

    a) apps with only one datasource, in which case there is nothing to keep consistent with and hence no appreciable difference between XA and non-XA at runtime, since XA will auto optimise to 1PC. You'll pay a slight performance overhead for the extra round trips to the db, but it's unlikely to be significant unless your transactions are very small, e.g. a single SQL statement. In most ORM (i.e. JPA) based transactions you'll already be making multiple round trips to the db for SQL execution and two more won't make an appreciable difference. Unfortunately many microbenchmarks don't use the actual business logic transactions in comparisons, making the XA overhead appear larger than it typically is in practice. That said, there is no advantage to be gained from that slight overhead, except in so far as it may make life easier for whoever has to manage the app server config.

     

    b) apps with two datasources where ease of development trumps runtime performance, in which case XA is the way to go. This is most of them.

     

    c) apps with two datasources where the runtime overhead of a 2PC is intolerable and the pain of developing and testing custom data consistency management code is thus inescapable. This is a small minority of cases.

     

    XA is heavyweight only at runtime, not at design or test time. And at runtime it's heavy only if it is needed - it will largely optimise out in 1PC or read only cases.
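
    To put rough numbers on the round-trip argument in case (a), here is a back-of-the-envelope sketch. It assumes two extra round trips per transaction (as stated in case (a)) and, as a simplification of mine, that every round trip costs the same. The class and method names are hypothetical and nothing here is measured.

```java
// Back-of-the-envelope model only; the equal-cost assumption is a
// simplification for illustration, not a measurement.
class XaRoundTripEstimate {

    /** The two extra round trips per transaction mentioned in case (a). */
    static final int EXTRA_XA_ROUND_TRIPS = 2;

    /**
     * Relative increase in round trips for a transaction that executes
     * sqlStatements statements plus one commit, assuming every round trip
     * costs the same.
     */
    static double relativeOverhead(int sqlStatements) {
        if (sqlStatements < 1) {
            throw new IllegalArgumentException("need at least one statement");
        }
        int baseline = sqlStatements + 1; // the statements plus the commit
        return (double) EXTRA_XA_ROUND_TRIPS / baseline;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 5, 20}) {
            System.out.printf("%d statement(s): +%.0f%% round trips%n",
                    n, 100 * relativeOverhead(n));
        }
    }
}
```

    Under these assumptions a single-statement transaction doubles its round trips, while a 20-statement transaction adds roughly 10%, which is why microbenchmarks built on tiny transactions tend to exaggerate the cost.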

     

    Many developers rule out XA based on hearsay without actually trying it or fully costing the custom alternatives. The price of dev and test time being what it is, it's often more economic to throw additional hardware at the problem and use XA rather than rolling your own tx protocol or rearchitecting the app to not require one at all. The only cases where that won't work are ones where scaling and latency requirements make it impossible. With the exception of some financial services apps I've yet to come across a real world use case where a proper cost-benefit analysis has ruled against XA. Frankly when it comes right down to the bottom line, most apps just are not as large or performance critical as their developers' egos may imagine.

     

    The rules are simple:

     

    If your app only uses a single datasource, use local-tx

    If your app uses two datasources, use xa-datasource

    If your xa-datasource app is too slow in performance testing and the transaction overhead is identified as the bottleneck, tune the app server. If it's still too slow, roll your own tx management solution.

     

    It's the last step you should really be afraid of and try hard to avoid. Getting data consistency protocols right is hard. Leave it to the specialists if you possibly can - a robust transaction management solution is many person years of dev and test effort. XA is thirty seconds of xml wiring, although you do still need to test your crash recovery configuration.

  • 21. Re: when to use xa-datasource
    henk de boer Master

    Jonathan Halliday wrote:

     

    Which thus basically means an application can pretty much use an xa-datasource as its default, can't it?

     

    Only if the underlying database actually supports it.

     

    Of course, but are there any real world databases today that don't support this? But I hear you and it's a valid point.

     

    You'll pay a slight performance overhead for the extra round trips to the db, but it's unlikely to be significant unless your transactions are very small e.g. a single sql statement.

     

    With this you mean the extra BEGIN; and COMMIT; sql statements, right?

     

    It breaks down into

     

    a) apps with only one datasource [...]

    b) apps with two datasources where ease of development trumps runtime performance [...]

    c) apps with two datasources where the runtime overhead of a 2PC is intolerable [...]

     

    Maybe there's a d:

     

    d) apps with two datasources where ease of development trumps runtime performance and where one of those datasources is used in the overwhelming majority of cases.

     

    I've seen many applications that have a so-called "main database" where most of the activity takes place, but where occasionally some operations are done that involve a second datasource. In that case, it may still be an optimization (albeit as you indicated a small one) to declare your primary datasource as local-tx?

  • 22. Re: when to use xa-datasource
    Jonathan Halliday Master
    With this you mean the extra BEGIN; and COMMIT; sql statements, right?

    More or less. The actual implementation is an XA start (for begin) and an XA end. Depending on the driver impl and the db's wire protocol it is possible neither of those is actually SQL: they may bypass the parse step and definitely will bypass the query plan step that regular SQL has to go through. The commit round trip is not extra; it exists in the non-XA case too, as ORM solutions use the driver with autocommit off.

    I've seen many applications that have a so-called "main database" where most of the activity takes place, but where occasionally some operations are done that involve a second datasource. In that case, it may still be an optimization (albeit as you indicated a small one) to declare your primary datasource as local-tx?

    As long as you can tolerate the reduced guarantees that come with LRCO (last resource commit optimization). If you can't and you still want the performance for the general case then you actually deploy three datasources, the primary db having two: a local one for most use and an XA one for the special cases.
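
    For anyone unfamiliar with the last resource commit optimization, the general idea can be sketched as below. This is a toy model with hypothetical types (TwoPhase, OnePhase, complete()), not the JBossTS implementation: the XA resources are prepared first, the single one-phase resource is committed last, and the outcome of that local commit is used as the global decision.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy illustration of the ordering only; hypothetical types, NOT the JBossTS API.
class LrcoSketch {

    /** Two-phase capable participant (stand-in for an XA resource). */
    interface TwoPhase {
        boolean prepare();
        void commit();
        void rollback();
    }

    /** One-phase-only participant (stand-in for a local-tx datasource). */
    interface OnePhase {
        boolean commit();   // true if the local commit succeeded
        void rollback();
    }

    /** Returns true if the transaction committed, false if it rolled back. */
    static boolean complete(List<? extends TwoPhase> xaResources, OnePhase lastResource) {
        List<TwoPhase> prepared = new ArrayList<>();
        for (TwoPhase r : xaResources) {
            if (r.prepare()) {
                prepared.add(r);
            } else {
                for (TwoPhase p : prepared) p.rollback();
                lastResource.rollback();
                return false;
            }
        }
        // The local commit doubles as the global decision. If the process dies
        // while this call is in flight, the outcome of the local resource is
        // unknown to the coordinator, which is where the weaker guarantee comes from.
        boolean committed = lastResource.commit();
        for (TwoPhase r : prepared) {
            if (committed) r.commit(); else r.rollback();
        }
        return committed;
    }

    /** Recording stand-ins so the call order can be inspected. */
    static class RecordingXa implements TwoPhase {
        final List<String> calls = new ArrayList<>();
        public boolean prepare() { calls.add("prepare"); return true; }
        public void commit() { calls.add("commit"); }
        public void rollback() { calls.add("rollback"); }
    }

    static class RecordingLocal implements OnePhase {
        final boolean succeed;
        final List<String> calls = new ArrayList<>();
        RecordingLocal(boolean succeed) { this.succeed = succeed; }
        public boolean commit() { calls.add("commit"); return succeed; }
        public void rollback() { calls.add("rollback"); }
    }

    public static void main(String[] args) {
        RecordingXa xa = new RecordingXa();
        RecordingLocal local = new RecordingLocal(false); // simulate a failed local commit
        boolean ok = complete(Collections.singletonList(xa), local);
        System.out.println(ok + " xa=" + xa.calls + " local=" + local.calls);
    }
}
```

    In the sample run the local commit fails, so the already prepared XA resource is rolled back. The interesting case is the one the sketch cannot show: a crash between the local commit and the coordinator recording the decision.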

  • 23. Re: when to use xa-datasource
    henk de boer Master

    As long as you can tolerate the reduced guarantees that come with LRCO.

     

    Okay, I didn't know that. The JBoss AS manual didn't mention anything about reduced guarantees, although it did mention that interposition support needed to be disabled for a distributed environment.

     

    If you can't and you still want the performance for the general case then you actually deploy three datasources, the primary db having two: a local for most use and an XA for the special cases.

     

    I was actually thinking about that, but I'm not sure how well this plays out in combination with JPA and EJB. I would also have to define a second persistence unit for the XA datasource, and I'm not sure whether I maybe also have to duplicate all session beans that are used in both the normal and the special cases. The problem here is that session beans reference a PU explicitly and I'm not sure whether it's possible to use the same bean with different PUs somehow.

  • 24. Re: when to use xa-datasource
    testrot Newbie
    Jonathan Halliday wrote:
    [...]
    If your app only uses a single datasource, use local-tx

    If your app uses two datasources, use xa-datasource

    [...]

    There is one thing to emphasize. It's all about datasources (as Jonathan wrote) and not databases. In our scenario we use only one database and nevertheless have to use xa-datasources. Why? Because we use JBoss Messaging and manipulate JPA Entities within Message Driven Beans (consuming messages). Furthermore we also produce messages in transactional Session Bean methods. So although we use only one database (JBoss Messaging is configured to use the same database as JPA) we still have to use xa-datasources to get transactional integrity for the whole logic (consume/produce message and manipulate Entities via JPA). At least this is what I think must be done. Am I wrong?

     

    Henk wrote:

    Which thus basically means an application can pretty much use an xa-datasource as its default, can't it?

    You wouldn't configure an XA datasource unless you really have to, because many DBMSs require extra steps. MS SQL Server, for example, requires some extra configuration steps many database admins don't like; it can't support XA transactions in an out-of-the-box/standard installation. Furthermore, the XA behavior of MS SQL Server is, let's say, improvable ;-). If a running transaction is prepared but not yet committed and the sqlserver process is stopped (in the regular way, via Windows service administration), the still running MS DTC (automatically used for XA transactions in MS SQL Server) can't commit the transaction, and the database will eventually be marked "suspect", which means it can't be used until the transaction is manually corrected via DTC.

    Another thing to mention is that the required JBoss configuration isn't trivial as soon as XA transactions are involved:

    - transaction recovery has to be configured

    - JTA or JTS: which implementation do I need?

    Jonathan, I don't want to offend you. I think everyone involved in implementing JBoss transactions does good work, but as someone who has to get the whole thing working, I would really appreciate comprehensive, easy-to-read and complete documentation which explains everything I need to know to get XA transactions working with all major databases (Oracle, MS SQL, DB2, ...). So far I have had to read a lot of different JBoss documentation, forum threads and wiki pages to gather the information I need. And I am still not sure if the whole configuration is correct and works like I expect.

  • 25. Re: when to use xa-datasource
    Jonathan Halliday Master
    Another thing to mention is that the required JBoss configuration isn't trivial as soon as XA transactions are involved:

    - transaction recovery has to be configured

    - JTA or JTS: which implementation do I need?

    Recovery config is automatic in newer versions. The JTA vs. JTS thing is about tx context propagation, which is orthogonal to XA. The 'what version to choose' wiki page states where you need one or the other. We've just finished planning for the next six months and documentation work will be receiving a lot of focus. The existing doc set was developed for JBossTS standalone, not embedded in AS and that's a gap we need to plug.

  • 26. Re: when to use xa-datasource
    testrot Newbie
    Recovery config is automatic in newer versions.

    Good thing to hear, this will make things easier! I guess this applies to releases after EAP 5.1.0.

    The JTA vs. JTS thing is about tx context propagation, which is orthogonal to XA. The 'what version to choose' wiki page states where you need one or the other.

    I know this wiki page. The following statement confuses me:

    The JTS is required only where a transaction needs to span multiple server JVMs, such as in an application server cluster using transactional EJBs.

    I wondered if JTS is also required in homogenous clusters. I would guess it isn't because a transaction (EJB Call, message processing) doesn't leave the initial cluster node and stays in a single JVM (see my comment on the page).

    We've just finished planning for the next six months and documentation work will be receiving a lot of focus. The existing doc set was developed for JBossTS standalone, not embedded in AS and that's a gap we need to plug.

    Again this is good news for me. Indeed the standalone doc is a little confusing when using JBoss transactions in the AS.

  • 27. Re: when to use xa-datasource
    arjan tijms Novice

    testrot wrote:

    Another thing to mention is that the required JBoss configuration isn't trivial as soon as XA transactions are involved:

    - transaction recovery has to be configured

    - JTA or JTS: which implementation do I need?

     

    I'm not really sure I follow all of this.

     

    I have for instance a Stateless session bean in JBoss AS 5.1 that is injected with:

     

    1. An entity manager for a persistence unit with an XA datasource for a PG DBMS
    2. A direct reference to a datasource that is also configured as an XA datasource (to another DBMS than in 1. but also PG)
    3. A JMS connection from JBoss AS' JmsXA connection factory
    4. A reference to an EJB from a remote JBoss AS 5.1 instance (no clustering, just a remote instance on another server, I inject this via a federated JNDI or I just look up the remote bean programmatically via JNDI)

     

    I don't explicitly configure any transaction recovery and don't choose between JTA or JTS. When I call a method on the session bean, the container by default automatically starts a JTA transaction. When I access all four of the above-mentioned resources within a single method in the session bean, it seems that the correct thing happens.

     

    Am I missing something?

  • 28. Re: when to use xa-datasource
    Jonathan Halliday Master

    > I guess this applies to releases after EAP 5.1.0.

     

    EAP 5.1 has auto recovery config for datasources. It's not auto wired for messaging, but then again that manual config entry should be present by default already anyhow.

     

    > I wondered if JTS is also required in homogenous clusters

     

    If you can guarantee the tx scope won't leave the jvm then you don't need it. The problem is all the abstraction gets in your way and makes that guarantee hard. As a general rule the 'cluster' will keep the tx local, but what if e.g. you have a pool of ejb servers and a centralized messaging server, or a pool of web containers and a separate pool of ejb container machines? Or a sticky load balancer that may failover to a different node in some cases? The term 'clustering' can apply to a number of protocols and tiers and it's important to think through the transaction flow that will apply to your specific architecture. In general the JTS applies to EJB<->EJB calls only. You may need e.g. WS-AT/WS-BA instead if you want to span a tx on web services calls.

     

     

    > Am I missing something?

     

    yup - crash recovery testing. You didn't test it, so you don't know if it will work. Hint: it won't - crash rec wiring is not automatic in that version.

  • 29. Re: when to use xa-datasource
    testrot Newbie
    EAP 5.1 has auto recovery config for datasources. It's not auto wired for messaging, but then again that manual config entry should be present by default already anyhow.

    So this

     

    <property name="com.arjuna.ats.jta.recovery.XAResourceRecoveryJDBC.MY_DS" value="com.arjuna.ats.internal.jbossatx.jta.AppServerJDBCXARecovery;jndiname=MY_DS,username=test,password=test"/>

     

    in jbossts.properties is obsolete in EAP 5.1.0?

     

    And this

     

    <property name="com.arjuna.ats.jta.recovery.XAResourceRecovery.JBMESSAGING1" value="org.jboss.jms.server.recovery.MessagingXAResourceRecovery;java:/DefaultJMSProvider"/>

     

    is still required?

     

    If you can guarantee the tx scope won't leave the jvm then you don't need it. The problem is all the abstraction gets in your way and makes that guarantee hard. As a general rule the 'cluster' will keep the tx local, but what if e.g. you have a pool of ejb servers and a centralized messaging server, or a pool of web containers and a separate pool of ejb container machines? Or a sticky load balancer that may failover to a different node in some cases? The term 'clustering' can apply to a number of protocols and tiers and it's important to think through the transaction flow that will apply to your specific architecture. In general the JTS applies to EJB<->EJB calls only. You may need e.g. WS-AT/WS-BA instead if you want to span a tx on web services calls.

    I think I got your point. It's more complicated than I expected. However, I only have cluster nodes which are really identical (same JBoss configuration and same EAR). JBoss Messaging (Destinations and ConnectionFactory are clustered) and EJBs are identical on all nodes. No web services. Transactions are always started by EJB calls or MDBs. So I can imagine only one situation in which a tx leaves the JVM: a Stateful Session Bean call interrupted by a node failure, followed by a transparent failover to a second node. Would I get an error saying that the tx couldn't be propagated to the failover node?