16 Replies Latest reply: Feb 20, 2012 10:37 AM by Erhard Siegl

Error 503 for several seconds until session failover

Erhard Siegl Newbie

Hi,

 

I tried a simple demo application and democlient (see attachments) that opens a session and increments a session variable on each hit. The democlient hits the server once every second. I have an Apache httpd with mod_cluster and two JBoss servers. When I shut down the active JBoss, I get Error 503 for about 10 seconds, then the other server gets the requests and continues the session:

 

$ ./democlient.py http://devjava/demo7/

0 unknown

1 ef730190

2 ef730190

3 ef730190

4 ef730190

...

28 ef730190

29 ef730190

30 ef730190 <--- Now I shut down the active server

Failed to open "http://devjava/demo7/". Error code - 503.

Broken between 30 and 0 for 0 seconds.

Failed to open "http://devjava/demo7/". Error code - 503.

Broken between 0 and 0 for 1 seconds.

Failed to open "http://devjava/demo7/". Error code - 503.

Broken between 0 and 0 for 2 seconds.

Failed to open "http://devjava/demo7/". Error code - 503.

Broken between 0 and 0 for 3 seconds.

Failed to open "http://devjava/demo7/". Error code - 503.

Broken between 0 and 0 for 4 seconds.

Failed to open "http://devjava/demo7/". Error code - 503.

Broken between 0 and 0 for 5 seconds.

Failed to open "http://devjava/demo7/". Error code - 503.

Broken between 0 and 0 for 6 seconds.

Failed to open "http://devjava/demo7/". Error code - 503.

Broken between 0 and 0 for 7 seconds.

Failed to open "http://devjava/demo7/". Error code - 503.

Broken between 0 and 0 for 8 seconds.

Failed to open "http://devjava/demo7/". Error code - 503.

Broken between 0 and 0 for 9 seconds.

Broken between 0 and 31 for 10 seconds. <-- After about 10 seconds the second server continues with the session

32 dc1ca871

33 dc1ca871

34 dc1ca871

35 dc1ca871

...
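For readers without the attachments, here is a minimal sketch of a polling client of this kind. It is not the attached democlient.py: the names `OutageTracker`, `fetch_status` and `main` are made up for illustration, and it assumes the demo page simply returns a short text body (for example the counter and a node id). It keeps the session cookie between requests and reports how long an outage lasted once a request succeeds again.

```python
import http.cookiejar
import time
import urllib.error
import urllib.request


class OutageTracker:
    """Measures how long a run of failed requests lasts (hypothetical helper)."""

    def __init__(self):
        self.outage_start = None

    def record(self, status, now):
        """Feed one result; return the outage length in seconds when an
        outage has just ended, otherwise None."""
        if status != 200:
            if self.outage_start is None:
                self.outage_start = now
            return None
        if self.outage_start is None:
            return None
        length = now - self.outage_start
        self.outage_start = None
        return length


def fetch_status(opener, url):
    """Return (status, body); body is None when the request failed."""
    try:
        with opener.open(url, timeout=5) as resp:
            return resp.status, resp.read().decode("utf-8", "replace").strip()
    except urllib.error.HTTPError as err:
        return err.code, None      # e.g. 503 from the proxy
    except OSError:
        return 0, None             # no HTTP response at all (refused, timeout)


def main(url, interval=1.0):
    # A cookie-aware opener resends the session cookie on every request,
    # which is what makes the session sticky from the proxy's point of view.
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    tracker = OutageTracker()
    while True:
        status, body = fetch_status(opener, url)
        gap = tracker.record(status, time.monotonic())
        if status == 200:
            if gap is not None:
                print("Recovered after %.0f seconds." % gap)
            print(body)
        else:
            print('Failed to open "%s". Error code - %d.' % (url, status))
        time.sleep(interval)
```

Calling `main("http://devjava/demo7/")` would poll once per second until interrupted. The outage bookkeeping is kept separate from the network code so it can be checked without a running cluster.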

 

Especially when I shut down JBoss gracefully I would have expected that a failover occurs without errors.

Do I expect too much?

Is it a configuration error or a bug?

Anybody got this running without errors with mod_cluster?

 

I tried this with JBoss 4.2.3, the latest JBoss 7 Snapshot (with AJP), mod_cluster 1.0.0, 1.1.3 and the latest 1.1.4-SNAPSHOT, with the same results. The Apache configuration is the one from the mod_cluster download.

The problem might be related to

http://community.jboss.org/message/625242

http://community.jboss.org/message/643133

 

Greetings

Erhard

  • 1. Re: Error 503 for several seconds until session failover
    Erhard Siegl Newbie

    It seems that the problem is in mod_proxy_cluster.c:

     

            if (domain == NULL) {
                /*
                 * We have a route provided that doesn't match the
                 * balancer name. See if the provider route is the
                 * member of the same balancer in which case return 503
                 */
                ap_log_error(APLOG_MARK, APLOG_ERR, 0, r->server,
                             "proxy: CLUSTER: (%s). All workers are in error state for route (%s)",
                             (*balancer)->name, route);
        ...
    

     

     

    I don't use domain-mode and domain is only set when ou->mess.Domain[0] != '\0'. Clustering should be independent from domain-mode. The following helps with this problem:

     

    Index: mod_proxy_cluster.c
    ===================================================================
    --- mod_proxy_cluster.c          (revision 663)
    +++ mod_proxy_cluster.c          (working copy)
    @@ -1858,9 +1858,7 @@
     #endif
         if (node_storage->find_node(&ou, route) == APR_SUCCESS) {
             if (!strcmp(balancer, ou->mess.balancer)) {
    -            if (ou->mess.Domain[0] != '\0') {
    -                *domain = ou->mess.Domain;
    -            }
    +            *domain = ou->mess.Domain;
                 return APR_SUCCESS;
             }
         }
    

     

     

    Greetings

    Erhard

  • 2. Re: Error 503 for several seconds until session failover
    Jean-Frederic Clere Master

    Hm it seems you are using stickySessionForce = true, aren't you?

  • 3. Re: Error 503 for several seconds until session failover
    Erhard Siegl Newbie

    Yes, I use the defaults.

    ssl                                                                                  advertise=true

    advertise-socket=modcluster                                                          auto-enable-contexts=true

    balancer=mycluster                                                                   excluded-contexts=ROOT,admin-console,invoker,jbossws,jmx-console,juddi,web-console

    flush-packets=false                                                                  flush-wait=-1

    max-attemps=1                                                                        node-timeout=-1

    ping=10                                                                              proxy-list=/

    socket-timeout=20                                                                    sticky-session=1

    sticky-session-force=true                                                            sticky-session-remove=false

    stop-context-timeout=10                                                              ttl=60

    worker-timeout=-1

  • 4. Re: Error 503 for several seconds until session failover
    Jean-Frederic Clere Master

    try with stickySessionForce = false

  • 5. Re: Error 503 for several seconds until session failover
    Erhard Siegl Newbie

    No noticeable difference. In the logfile:

    [Mon Jan 02 11:38:09 2012] [error] proxy: CLUSTER: (balancer://mycluster). All workers are in error state for route (2b727b8c-faf0-37d7-9cab-e2af94cd7bea)
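To see which routes are affected and how often, lines like the one above can be pulled out of the httpd error log with a small script. This is only a sketch: the regular expression is written against the single log line quoted here, the function names are hypothetical, and the timestamp parsing assumes English day and month names.

```python
import re
from collections import Counter
from datetime import datetime

# Pattern derived from the one error line quoted in this thread.
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[error\] proxy: CLUSTER: \((?P<balancer>[^)]+)\)\. "
    r"All workers are in error state for route \((?P<route>[^)]+)\)"
)


def parse_line(line):
    """Return (timestamp, balancer, route) or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%a %b %d %H:%M:%S %Y")
    return ts, m.group("balancer"), m.group("route")


def count_by_route(lines):
    """Count matching error lines per route."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            counts[parsed[2]] += 1
    return counts
```

With the log open as a file object `f`, `count_by_route(f)` gives the number of such errors per session route.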

     

     

    ls subsystem=modcluster/mod-cluster-config=configuration   

    ssl                                                                                  advertise=true

    advertise-socket=modcluster                                                          auto-enable-contexts=true

    balancer=mycluster                                                                   excluded-contexts=ROOT,admin-console,invoker,jbossws,jmx-console,juddi,web-console

    flush-packets=false                                                                  flush-wait=-1

    max-attemps=1                                                                        node-timeout=-1

    ping=10                                                                              proxy-list=/

    socket-timeout=20                                                                    sticky-session=1

    sticky-session-force=false                                                           sticky-session-remove=false

    stop-context-timeout=10                                                              ttl=60

    worker-timeout=-1

  • 6. Re: Error 503 for several seconds until session failover
    Jean-Frederic Clere Master

    Please try with the original mod_cluster code and HAVE_CLUSTER_EX_DEBUG 1 (mod_proxy_cluster/mod_proxy_cluster.c), I am not able to reproduce the problem.

  • 7. Re: Error 503 for several seconds until session failover
    Erhard Siegl Newbie

    Attached the error log with HAVE_CLUSTER_EX_DEBUG 1. The requests with the democlient look like this:

     

    ./democlient.py http://devjava/demo/

    0 unknown

    Switch to node cluster1

    1 cluster1

    2 cluster1

    3 cluster1

    4 cluster1

    5 cluster1

    6 cluster1

    7 cluster1

    8 cluster1

    9 cluster1

    10 cluster1

    11 cluster1

    12 cluster1

    Failed to open "http://devjava/demo/". Error code - 503.

    Broken between 12 and 0 for 0 seconds.

    Failed to open "http://devjava/demo/". Error code - 404.

    Broken between 0 and 0 for 1 seconds.

    Failed to open "http://devjava/demo/". Error code - 503.

    Broken between 0 and 0 for 2 seconds.

    Failed to open "http://devjava/demo/". Error code - 503.

    Broken between 0 and 0 for 3 seconds.

    Failed to open "http://devjava/demo/". Error code - 503.

    Broken between 0 and 0 for 4 seconds.

    Failed to open "http://devjava/demo/". Error code - 503.

    Broken between 0 and 0 for 5 seconds.

    Failed to open "http://devjava/demo/". Error code - 503.

    Broken between 0 and 0 for 6 seconds.

    Failed to open "http://devjava/demo/". Error code - 503.

    Broken between 0 and 0 for 7 seconds.

    Failed to open "http://devjava/demo/". Error code - 503.

    Broken between 0 and 0 for 8 seconds.

    Failed to open "http://devjava/demo/". Error code - 503.

    Broken between 0 and 0 for 9 seconds.

    Failed to open "http://devjava/demo/". Error code - 503.

    Broken between 0 and 0 for 10 seconds.

    Broken between 0 and 13 for 11 seconds.

    Switch to node cluster2

    14 cluster2

    15 cluster2

    16 cluster2

    17 cluster2

     

     

    These tests were done with JBoss 4 because something strange happened. When I tried with JBoss 7, I suddenly couldn't reproduce the problem anymore. After some testing with JBoss 4 and JBoss 7, it looks like a clean restart of Apache and the JBoss 4 application leads to the error, a clean restart of Apache and JBoss 7 is OK, but stopping all JBoss 4 instances and starting JBoss 7 instances without restarting Apache leads to the error. (It seems that starting JBoss 4 after stopping JBoss 7 without an Apache restart doesn't lead to the error, but it's too late right now to confirm this for sure.) In other words:

    Stop Apache and all JBoss instances

    start Apache

    start JBoss4-1

    start JBoss4-2

    start democlient

    stop JBoss4-1 -> Error 503

    stop democlient

    stop JBoss4-2

    start JBoss7-1

    start JBoss7-2

    start democlient

    stop JBoss7-1 -> Error 503

    stop democlient

    stop JBoss7-2

    restart Apache

    start JBoss7-1

    start JBoss7-2

    start democlient

    stop JBoss7-1 -> No Error!

     

    A colleague of mine reproduced the error today solely with JBoss 7; I will investigate the details tomorrow. The error also occurred consistently with jboss-as-7.1.0.CR1-SNAPSHOT; in the meantime I upgraded to jboss-as-7.1.0.Final-SNAPSHOT (because of another bug). Maybe this fixed the problem with JBoss 7 (ou->mess.Domain[0] != '\0' ???). I also installed the new mod_cluster.jar in JBoss 4, but it didn't help either.

     

    If JBoss 4 is not supposed to work with mod_cluster 1.1.4, it's not too much of a problem, since I have a workaround with my patch, otherwise I would be happy to help with more information if necessary.

     

    Erhard

     

    Message edited by Erhard Siegl

  • 8. Re: Error 503 for several seconds until session failover
    Erhard Siegl Newbie

    Apparently it was too late yesterday to think straight. The reason it suddenly worked was the restart of Apache after setting sticky-session-force=false, and in JBoss 4 I still had sticky-session-force=true. So it seems that after changing sticky-session-force one has to stop all servers and restart Apache.

    Since sticky-session-force=true is the default, what are the plans? Change the defaults, make it work or change the documentation?

    Do you still need a logfile with HAVE_CLUSTER_EX_DEBUG 1?

     

    Erhard

  • 9. Re: Error 503 for several seconds until session failover
    Jean-Frederic Clere Master

    The documentation says that sticky-session-force=true is the default, so it is correct. If you think the default should be false, open a JIRA.

    I don't need logfile with HAVE_CLUSTER_EX_DEBUG 1 if it works.

    Anyway the 404 you have is weird it may need some investigation. Did it occur with AS7?

  • 10. Re: Error 503 for several seconds until session failover
    Erhard Siegl Newbie

    The 404 occurred with JBoss 4. (At first glance it looks like http://community.jboss.org/message/643850, but you said this came from my patch.)

     

    I think the defaults should work, and I think getting a 503 is not OK. It took me a couple of days and your (much appreciated) help to get a simple demo running (I still have to fix it for JBoss 4). I think mod_cluster and AS 7 are great, but I ran into about 5 problems (I still have open issues) since I started to play around with it. I want my customers to use mod_cluster, but in order to recommend it, it has to work out of the box. That's why I try to help with these issues.

    I think it should be fixed that there is a 503 with sticky-session-force=true. It seems to be the same problem as in https://issues.jboss.org/browse/MODCLUSTER-257, which is still unresolved.

     

    I hope this doesn't sound like a rant; it's not meant to be negative.

     

    Erhard

  • 11. Re: Error 503 for several seconds until session failover
    Jean-Frederic Clere Master

    According to the latest trace you provided, the 404 is triggered by a bug in the remove / update logic. I will create a JIRA for it.

     

    MODCLUSTER-257 is probably several small bugs and misconfiguration (read bad defaults too).

     

    503 with sticky-session-force=true won't be fixed for the moment (most of them are expected).

  • 12. Re: Error 503 for several seconds until session failover
    Erhard Siegl Newbie

    I got AS4 and AS7 both working with stickySessionForce = false. Thank you. I didn't understand stickySessionForce properly; making it the default is questionable.
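For anyone else who trips over this, the way I now read the setting can be written down as a toy model. This is not mod_cluster's code, and `choose_worker` and its arguments are invented for the illustration: with sticky sessions on, a request that carries a session route goes to that node; if the node is unavailable, stickySessionForce=true makes the proxy answer 503 instead of failing over, while stickySessionForce=false lets another node take the request.

```python
def choose_worker(session_route, available,
                  sticky_session=True, sticky_session_force=True):
    """Toy model of the routing decision; returns (http_status, node).

    session_route: node the session is bound to, or None for a new session.
    available:     names of the nodes that can currently serve requests.
    """
    if not available:
        return 503, None
    if sticky_session and session_route is not None:
        if session_route in available:
            return 200, session_route
        if sticky_session_force:
            # The sticky node is gone and failover is not allowed.
            return 503, None
    # New session, or failover allowed. A real balancer would pick by load;
    # this model just takes the first available node.
    return 200, available[0]
```

In this model, `choose_worker("cluster1", ["cluster2"])` yields a 503 with the defaults, and the same call with `sticky_session_force=False` fails over to `cluster2`, which matches what I saw once the setting was really active.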

     

    About having to restart Apache in order to activate the property: is that a bug or intended behaviour?

  • 13. Re: Error 503 for several seconds until session failover
    Jean-Frederic Clere Master

    The need to restart Apache httpd is a bug: need a JIRA.

  • 14. Re: Error 503 for several seconds until session failover
    Radoslav Husar Master

    BTW here is a link to the required apache restart issue https://issues.jboss.org/browse/MODCLUSTER-273
