2 Replies Latest reply on Apr 2, 2007 5:57 PM by quentena

    possible bug in run.sh on Solaris

    quentena

      We've been using JBoss 4.0.5.GA and we think we've found a bug in run.sh on Solaris (9 or 10).

      This link is the patch that seems to have introduced this problem:
      http://jira.jboss.com/jira/browse/JBAS-3748

      This link is where the patch was discussed:
      http://www.jboss.com/index.html?module=bb&op=viewtopic&t=92156

      To trigger the bug, set LAUNCH_JBOSS_IN_BACKGROUND, start the JBoss server normally and, once it has started, stop it. The JVM stops, but the run.sh script hangs around, consuming 100% of a single CPU.

      The problem seems to be this bit of the script, combined with the fact that the script's shebang is #!/bin/sh:

      while [ "$WAIT_STATUS" -ne 127 ]; do
       JBOSS_STATUS=$WAIT_STATUS
       wait $JBOSS_PID 2>/dev/null
       WAIT_STATUS=$?
      done
      


      On Solaris, /bin/sh is a *real* Bourne shell, and its wait built-in returns 0 (not 127) if the PID passed as an argument doesn't exist. The man page for wait states that this is the correct behaviour.

      So once the JVM has gone, wait returns 0 every time, WAIT_STATUS never becomes 127, and the while loop continues forever, burning CPU (until you kill it with one of the signals that isn't being trapped).
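
      A quick way to see which behaviour a given shell has is to ask it to wait on a PID that cannot be one of its children. This is only a sketch: the function name probe_wait and the variable WAIT_RC are made up for illustration, and PID 1 is used on the assumption that init is never a child of the script.

```shell
#!/bin/sh
# Hypothetical helper (not part of run.sh): record what this shell's
# wait built-in returns for a PID that is not one of its children.
probe_wait() {
    WAIT_RC=0
    # Assumption: PID 1 (init) is never a child of this script.
    wait 1 2>/dev/null || WAIT_RC=$?
    if [ "$WAIT_RC" -eq 127 ]; then
        echo "wait returned 127: the run.sh loop can terminate"
    else
        echo "wait returned $WAIT_RC: the run.sh loop would never see 127"
    fi
}

probe_wait
```

      With bash this should report 127; going by the description above, Solaris /bin/sh would report 0.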

      We've thought of two possible fixes for this:
      1) Change the shebang to #!/bin/bash. This works OK on Linux and Solaris (provided bash is installed), but I can't speak for other OSes.

      2) Test for the program version of wait and, if it exists, alias wait to it.

      Something like this near the top of the script:
      if [ -x /bin/wait ]; then
       alias wait='/bin/wait'
      fi
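
      Another possibility, offered only as a sketch that has not been verified on Solaris, is to stop depending on what wait returns for an unknown PID and instead use kill -0 to ask whether the process still exists. The function name wait_for_pid is hypothetical; JBOSS_STATUS mirrors the variable in the loop quoted earlier.

```shell
#!/bin/sh
# Hypothetical replacement for the wait loop (a sketch, not the shipped run.sh).
# It loops on process existence (kill -0) instead of on wait returning 127.
wait_for_pid() {
    pid=$1
    JBOSS_STATUS=0
    # First wait: may return early if a trapped signal interrupts it.
    wait "$pid" 2>/dev/null || JBOSS_STATUS=$?
    # Wait again only while the process is still there.
    while kill -0 "$pid" 2>/dev/null; do
        JBOSS_STATUS=0
        wait "$pid" 2>/dev/null || JBOSS_STATUS=$?
    done
}
```

      It would be called as wait_for_pid $JBOSS_PID in place of the loop. One caveat: if the PID exists but is not a child of the script, wait returns immediately and this loop would spin too.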
      



      Has anyone else run across this problem? Can anyone think of a better solution?