4 Replies. Latest reply: Nov 18, 2011 6:54 AM by anuj bhatia

Retry and Timeout settings

anuj bhatia Newbie

Hi,

 

I have a web service request that takes around 10 minutes to complete. This web service is being invoked from an INVOKE activity in a BPEL process running in RiftSaw.

 

I find that RiftSaw re-invokes the web service every 5 minutes, even though I've set mex.timeout=900000 and added the faultOnFailure extension element:

 

     <ext:failureHandling xmlns:ext="http://ode.apache.org/activityRecovery">

            <ext:faultOnFailure>true</ext:faultOnFailure>

      </ext:failureHandling>

 

After around 5 minutes I see a connection reset exception in the JBoss logs from org.apache.cxf.transport.http.HTTPConduit. But shouldn't the process fail at that point instead of retrying the request?
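One way to rule out a namespace or spelling slip in the element quoted above is to parse it with a namespace-aware parser and look the flag up by namespace URI. The following is only a minimal sketch: the class name `FailureHandlingCheck` and its method are hypothetical, it uses the JDK's DOM parser and nothing from RiftSaw or ODE, and it checks only the snippet text, not whether the engine actually honours the setting.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper: verifies that the snippet is well-formed and that
// faultOnFailure is "true" in the namespace used in the post.
class FailureHandlingCheck {

    static final String RECOVERY_NS = "http://ode.apache.org/activityRecovery";

    static final String SNIPPET =
        "<ext:failureHandling xmlns:ext=\"" + RECOVERY_NS + "\">"
        + "<ext:faultOnFailure>true</ext:faultOnFailure>"
        + "</ext:failureHandling>";

    /** Returns true if a faultOnFailure element in RECOVERY_NS has the text "true". */
    static boolean faultOnFailure(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for lookups by namespace URI
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagNameNS(RECOVERY_NS, "faultOnFailure");
            return nodes.getLength() > 0
                && "true".equals(nodes.item(0).getTextContent().trim());
        } catch (Exception e) {
            throw new IllegalArgumentException("Snippet is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("faultOnFailure set: " + faultOnFailure(SNIPPET));
    }
}
```

A misspelled namespace URI makes the lookup return false even though the element looks correct at a glance.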

 

Thanks

Anuj

  • 1. Re: Retry and Timeout settings
    Marek Baluch Newbie

    Hi Anuj,

     

    I recommend you use an asynchronous scenario for such a web service invocation. It's a far better solution than changing the timeout for such long-running services.

     

    Just to be sure: you changed mex.timeout by adding a name.endpoint properties file into the jar right next to the process definition, correct? Thanks

     

    Best regards

    Marek.

  • 2. Re: Retry and Timeout settings
    anuj bhatia Newbie

    Hi Marek,

     

    I agree an asynchronous approach would be better for this scenario and I plan to implement that in the long run. I was wondering if there was a quick fix or whether there was something wrong in my settings.

     

    Also, I've set the mex.timeout property in a file called service-config.endpoint that is placed in the jar right next to the process definition. I'm pretty sure it's taking effect.

     

    On further investigation, I found that a probable cause of the problem is that the JBoss transaction timeout is set to a value smaller than the web service invoke timeout. This is causing some unexpected behavior in RiftSaw. Here's what I observed:

     

    1. Assume that the web service returns a response in 7 minutes, the transaction timeout is set to 5 minutes, and mex.timeout is set to, say, 15 minutes.

     

    2. The BPEL process invokes the web service.

     

    3. After 5 minutes there's a transaction rollback warning in the JBoss logs: [com.arjuna.ats.arjuna.coordinator.CheckedAction_2] - CheckedAction::check - atomic action a282a58:a20:4ec5f30d:3db aborting with 1 threads active!

     

    4. After 7 minutes (when the invoked web service returns) there's an error from RiftSaw (I assume because the corresponding transaction has been aborted):

     

    org.hibernate.LazyInitializationException: could not initialize proxy - no Session

        at org.hibernate.proxy.AbstractLazyInitializer.initialize(AbstractLazyInitializer.java:86)

        at org.hibernate.proxy.AbstractLazyInitializer.getImplementation(AbstractLazyInitializer.java:140)

        at org.hibernate.proxy.pojo.javassist.JavassistLazyInitializer.invoke(JavassistLazyInitializer.java:190)

        at org.apache.ode.dao.jpa.bpel.ProcessInstanceDAOImpl_$$_javassist_22.getInstanceId(ProcessInstanceDAOImpl_$$_javassist_22.java)

        at org.apache.ode.bpel.engine.PartnerRoleMessageExchangeImpl.continueAsync(PartnerRoleMessageExchangeImpl.java:136)

        at org.apache.ode.bpel.engine.PartnerRoleMessageExchangeImpl.reply(PartnerRoleMessageExchangeImpl.java:88)

        at org.jboss.soa.bpel.runtime.ws.WebServiceClient$TwoWayCallable$1.call(WebServiceClient.java:298)

        at org.apache.ode.scheduler.simple.SimpleScheduler.execTransaction(SimpleScheduler.java:294)

     

    5. After 15 minutes (at end of mex.timeout) the process is marked as failed, with the following messages in the JBoss logs:

     

    [org.apache.ode.bpel.runtime.INVOKE] (ODEServer-3) Failure during invoke: No response received for invoke (mexId=hqejbhcnphr6rgm1w5uw09), forcing it into a failed state.

     

    6. At this point I intermittently find that the web service is re-invoked. I haven't found the exact scenario in which this happens, but I think it's probably because the ode_job table is somehow left in an inconsistent state (see next point).

     

    7. At the end of this process it's not possible to shut down the JBoss server normally using the shutdown.sh command. The last message logged is:

     

    [org.jboss.soa.bpel.runtime.engine.service.BPELEngineService] (JBoss Shutdown Hook) Stopping JBoss BPEL Engine

     

    and it keeps waiting for the BPEL Engine to stop (I think there's some lock that's not released correctly), so I have to terminate the JBoss process using kill -9. I think this sometimes leaves the ode_job table inconsistent, and in the next test run I see the web service being re-invoked even though it shouldn't be, because I've set faultOnFailure to true.
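    The ordering of the three timers in steps 1 to 5 can be sketched with plain arithmetic. This is purely illustrative: the class `InvokeTimeline` and its labels are made up for this example, and it calls no RiftSaw, ODE, or JBoss code. It only shows which timer fires first for a given set of values.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: orders the three timers from the scenario above.
class InvokeTimeline {

    static final class Event {
        final long minute;
        final String label;

        Event(long minute, String label) {
            this.minute = minute;
            this.label = label;
        }
    }

    /** Returns the event labels sorted by the minute at which each one fires. */
    static List<String> order(long txTimeoutMin, long responseMin, long mexTimeoutMin) {
        List<Event> events = new ArrayList<>();
        events.add(new Event(txTimeoutMin, "transaction timeout"));
        events.add(new Event(responseMin, "web service response"));
        events.add(new Event(mexTimeoutMin, "mex.timeout"));
        // List.sort is stable, so equal minutes keep the insertion order above.
        events.sort(Comparator.comparingLong(e -> e.minute));

        List<String> labels = new ArrayList<>();
        for (Event e : events) {
            labels.add(e.label);
        }
        return labels;
    }

    public static void main(String[] args) {
        // Values from step 1: 5-minute transaction timeout,
        // 7-minute response, 15-minute mex.timeout.
        System.out.println(order(5, 7, 15));
    }
}
```

    With the values from step 1, the transaction timeout comes first, so the response at 7 minutes arrives after the transaction has already been aborted.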

     

     

    I think there should be some check in RiftSaw to detect that mex.timeout is set to a value greater than the JBoss transaction timeout and report it as an error. Also, there definitely seems to be a bug with some locks not being released properly, which prevents a clean JBoss shutdown.
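    A rough sketch of the kind of check suggested here, written as a standalone helper rather than as anything RiftSaw provides. The class `TimeoutConfigCheck` and its methods are hypothetical. It assumes mex.timeout is in milliseconds (as the value 900000 for 15 minutes implies), that the transaction timeout is supplied in seconds, and that the property appears under the plain key quoted in this thread. The 120000 ms fallback is a placeholder, not a confirmed default.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of RiftSaw or ODE: compares mex.timeout
// (milliseconds) with a transaction timeout (seconds).
class TimeoutConfigCheck {

    /** Reads mex.timeout from properties text; returns fallbackMs if the key is absent. */
    static long readMexTimeoutMs(String propertiesText, long fallbackMs) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String raw = props.getProperty("mex.timeout");
        return raw == null ? fallbackMs : Long.parseLong(raw.trim());
    }

    /** True when the invoke is allowed to wait longer than the transaction can live. */
    static boolean exceedsTransactionTimeout(long mexTimeoutMs, long txTimeoutSeconds) {
        return mexTimeoutMs > txTimeoutSeconds * 1000L;
    }

    public static void main(String[] args) {
        // Values from this thread: mex.timeout=900000 and a 5-minute transaction timeout.
        long mexMs = readMexTimeoutMs("mex.timeout=900000\n", 120000L);
        long txSeconds = 300L;
        if (exceedsTransactionTimeout(mexMs, txSeconds)) {
            System.err.println("mex.timeout (" + mexMs + " ms) exceeds the transaction timeout ("
                + txSeconds + " s)");
        }
    }
}
```

    Run against the settings described above, the check flags the mismatch, since 900000 ms is greater than 300 s.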

     

    I'm testing with JBoss 5.1.0 and RiftSaw 2.3.0.Final and JBoss WS is using CXF 3.4.0.

     

    Do you think it's worth logging a Jira for this or am I missing something?

     

    Thanks

    Anuj

  • 3. Re: Retry and Timeout settings
    Gary Brown Master

    Hi Anuj

     

    Yes, please raise a Jira outlining this scenario and, if possible, include a simple test case that demonstrates the problem.

     

    Thanks.

     

    Regards

    Gary