-
1. Re: AS 7 domain robustness - process controller doesn't autorestart app server node after a fail
emuckenhuber Jun 2, 2012 5:25 AM (in response to maksymg)
Yes, that is the intended behavior. The process-controller will only restart the host-controller if that process exits unexpectedly. This is important so that potentially remote hosts stay manageable. We think that starting a server is an administrative task, though, and shouldn't be done silently. So a crashed server should rather be detected by a monitoring solution, which usually provides a more sophisticated set of tools to properly handle such an event.
-
2. Re: AS 7 domain robustness - process controller doesn't autorestart app server node after a fail
maksymg Jun 4, 2012 12:04 PM (in response to emuckenhuber)
Emanuel,
Can you clarify whether a Java process terminating because it ran out of memory qualifies as an unexpected exit, and as a result would be restarted automatically?
Thanks,
Maksym
-
3. Re: AS 7 domain robustness - process controller doesn't autorestart app server node after a fail
wdfink Jun 5, 2012 3:41 AM (in response to maksymg)
Some of my experience.
If the JVM process is dead, it is mostly due to an accident (yes, I've seen faulty scripts and admins kill processes) or a JVM bug.
That is a reason to force a restart immediately.
An OOM or, e.g., slow responses might be caused by a bug (memory leak) or by a temporary overload of your (cluster) system.
In that case, restarting the JBoss instance might blow up your system. Let's assume you have 3 nodes handling your throughput and one of them gets into such an OOM situation.
Your automation will shut down or kill the instance (a clean shutdown may not be possible if the GC is running crazy).
The load will be distributed to the other two nodes, which will be overloaded as well; the system is then completely down, and it will be hard for it to come back to work automatically.
So the decision whether and how to restart in such a situation is very difficult and might require complex automation or even an administrator.
-
4. Re: AS 7 domain robustness - process controller doesn't autorestart app server node after a fail
b.eckenfels Jun 28, 2012 7:44 PM (in response to wdfink)
Yes, the admin has to decide what to do. But if he decides that a dead server is a signal that it needs a restart, the PC (or HC) does need an "auto-restart" setting, with parameters for the sleep time before a restart, the maximum restart count, and the time after which the count is reset. For example: "Wait 10s and restart, do this a maximum of 3 times in 1h".
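
To make the proposed setting concrete, here is a minimal sketch in Java of how such a restart budget could be tracked. It is not part of the AS 7 process-controller or host-controller; the class name `RestartPolicy` and its methods are hypothetical. It also assumes one particular reading of "time the count is reset": a sliding window, in which each restart stops counting once it is older than the window.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical restart budget; illustration only, not an AS 7 API. */
final class RestartPolicy {
    private final Duration delay;      // sleep time before each restart
    private final int maxRestarts;     // maximum restarts allowed per window
    private final Duration window;     // age after which a restart no longer counts
    private final Deque<Instant> restarts = new ArrayDeque<>();

    RestartPolicy(Duration delay, int maxRestarts, Duration window) {
        this.delay = delay;
        this.maxRestarts = maxRestarts;
        this.window = window;
    }

    /** How long the caller should wait before actually restarting. */
    Duration delay() {
        return delay;
    }

    /**
     * Returns true and records the restart if the budget allows one at
     * time 'now'; returns false if the maximum count has been reached.
     */
    boolean tryRestart(Instant now) {
        Instant cutoff = now.minus(window);
        // Forget restarts that have aged out of the window.
        while (!restarts.isEmpty() && !restarts.peekFirst().isAfter(cutoff)) {
            restarts.removeFirst();
        }
        if (restarts.size() >= maxRestarts) {
            return false; // budget exhausted: leave the decision to an admin
        }
        restarts.addLast(now);
        return true;
    }
}
```

With `new RestartPolicy(Duration.ofSeconds(10), 3, Duration.ofHours(1))`, a supervisor would call `tryRestart(Instant.now())` each time the server dies, sleep for `delay()` when it returns true, and raise an alert instead of restarting when it returns false. That matches the "wait 10s and restart, maximum 3 times in 1h" example.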