10 Replies Latest reply on Nov 30, 2010 1:23 PM by peterj

    Full Garbage Collection every few seconds – GC frees very little memory (JBoss 4.2.3 with JDK 1.5.0 64-bit)

    kopin

      Setup

      8-core Xeon / 10 GB RAM

      RedHat EL 5.3 (64-bit)

      JDK 1.5.0._22 (64-bit)

      JBoss AS 4.2.3.GA with JBoss Messaging installed

      JAVA_OPTS=”-server -Xrs -Xms2048m -Xmx2048m -XX:MaxPermSize=512m -XX:NewRatio=1 -XX:ThreadStackSize=160 -XX:+UseParallelGC -Djava.awt.headless=true -Djava.library.path=$JBOSS_NATIVE_DIR -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000”

       

      Description

      We are running an EE web application on JBoss with an Apache httpd as a reverse proxy (using mod_jk) on another machine.  Our stress tests have shown that the server can handle up to 400 concurrent requests, and we have therefore set the Tomcat AJP pool to that size.

       

      When usage is normal, we can see about 40 active threads out of 400 being used, and the server works fine.  Sometimes (we have not been able to discover a pattern – it may happen anytime from every couple of days to every couple of weeks) the number of active AJP threads reaches the maximum (400) and from that time onwards, server performance degrades significantly and does not get back to normal, even after the active AJP thread count has fallen to 10-20.  Eventually, performance reaches unacceptable levels and we have to restart JBoss.

       

      We have observed that when the problem occurs, the garbage collector behaves like this (and never recovers) :

      /usr/java/jdk_5_64/jdk1.5.0_22/bin/jstat -gcutil 21801 5000 10

        S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT

        0.00   0.00  46.39  99.81  99.97  43111  839.352 14271 38382.902 39222.254

        0.00   0.00  99.81  99.78  99.97  43111  839.352 14273 38385.821 39225.173

        0.00   0.00  99.98  99.78  99.99  43111  839.352 14274 38388.653 39228.005

        0.00   0.00  99.96  99.78  99.98  43111  839.352 14276 38394.327 39233.679

        0.00   0.00  99.80  99.75  99.97  43111  839.352 14277 38397.220 39236.572

        0.00   0.00  29.83  99.75  99.96  43111  839.352 14278 38402.896 39242.249

        0.00   0.00  99.09  99.75  99.98  43111  839.352 14280 38405.740 39245.093

        0.00   0.00  99.80  99.70  99.97  43111  839.352 14281 38408.664 39248.016

        0.00   0.00  99.41  99.70  99.77  43111  839.352 14283 38414.305 39253.657

        0.00   0.00  99.82  99.70  99.79  43111  839.352 14284 38417.120 39256.473

       

      When all is normal, its behaviour is much better :

      /usr/java/jdk_5_64/jdk1.5.0_22/bin/jstat -gcutil 28174  5000 10

        S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT

        0.00  96.31  17.02  50.89  99.06  28939  522.830  1777 3073.229 3596.059

      97.80   0.00  11.21  51.04  99.06  28942  522.874  1777 3073.229 3596.103

      90.52   0.00  87.36  51.19  99.06  28944  522.904  1777 3073.229 3596.133

        0.00  99.05  43.58  51.35  99.06  28947  522.951  1777 3073.229 3596.180

        0.00  97.12  95.89  51.47  99.06  28949  522.983  1777 3073.229 3596.211

        0.00  84.73  47.50  51.72  99.06  28953  523.045  1777 3073.229 3596.274

        0.00  98.39  68.94  51.87  99.06  28955  523.078  1777 3073.229 3596.307

        0.00  97.86  56.04  52.01  99.07  28957  523.111  1777 3073.229 3596.340

        0.00  99.08  63.76  52.15  99.07  28959  523.141  1777 3073.229 3596.370

        0.00  96.79   1.02  52.29  99.07  28961  523.171  1777 3073.229 3596.400

       

      The whole Java heap space is all in RAM, i.e. the OS swap file is not used at all.  We suspected a memory leak in the application, since the old generation space becomes almost full, but the JVM does not throw an OutOfMemory error; instead, it seems to free very little memory each time, resulting in a full GC.

       

      Any ideas?

       

      Thanks.

      Constantine

        • 1. Re: Full Garbage Collection every few seconds – GC frees very little memory (JBoss 4.2.3 with JDK 1.5.0 64-bit)
          peterj

          Don't set NewRatio to 1. This means that the young gen and old gen are the same sizes. The JVM will do a full collection if the old gen does not have enough to hold all of the object in the young gen (note I wrote 'all', and not 'alive', the determination of whether to do a full vs minor collection takesp lace before the JVM figures out how many objects will survive a minor collection).  I recommend setting it to 3 or 4.

           

          From the data you posted, the "bad" run seems to be tenuring objects immediately.

           

          If you suspect a memory leak, take a heap dump and anlayze it with VisualVM.

           

          You might want to gather and plot the GC stastics. See these white papers:

          Java Garbage Collection Statistical Analysis 101

          Java Garbage Collection Performance Analysis 201

          at http://www.cmg.org/cgi-bin/search.cgi?q=java+peter+johnson&x=30&y=10

           

          Also, dig through this forum, it contains many tuning suggestions (including more details about using VisualVM and spotting memory leaks)

          1 of 1 people found this helpful
          • 2. Re: Full Garbage Collection every few seconds – GC frees very little memory (JBoss 4.2.3 with JDK 1.5.0 64-bit)
            kopin

            Thanks very much for your prompt response.

             

            Either the JVM is tenuring objects immediately or the Eden-Survivor-Old cycle is so fast it does not get logged in the 5-second period jstat is collecting info.  Also, I did not know how the JVM decides whether to do a full or minor collection - that explains a lot.

             

            I have already taken heap dumps during "good" and "bad" runs, but have not been able to spot anything interesting.  The "good" run has about 650 threads total, while the "bad" about 1000, and the main difference was a lot more Object.wait() threads during the "bad" run.  Not much to work on, really.

             

            Have also used jmap to create a histogram, but most of the memory seemed (not surprisingly) to be allocated to primitive arrays and other JDK classes, so no useful information there either.  I was hoping to find a large chunk of the heap allocated to some of our application's objects, which would indicate a memory leak, but this was not the case.

             

            Anyway, I'll change NewRatio, read the papers you suggested and post back when I have some results.

             

            C.

            • 3. Re: Full Garbage Collection every few seconds – GC frees very little memory (JBoss 4.2.3 with JDK 1.5.0 64-bit)
              wdfink

              I start the JVM always with GC logfile, it will help often to see problems in production.

              There is a nice GUI to analyse it http://www.tagtraum.com/gcviewer.html

               

              If there is no application memory leak you might check whether a different strategy of GC help.

              For our application we switch to CMS with :

              -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing -XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=10 -XX:NewRatio=1 -XX:SurvivorRatio=20

               

              Also in this case we set the YoungGen=1400M and OldGen=600M to ensure that the old generation must never have an full-gc, all objects are dead before moved to ldGen.

              1 of 1 people found this helpful
              • 4. Re: Full Garbage Collection every few seconds – GC frees very little memory (JBoss 4.2.3 with JDK 1.5.0 64-bit)
                kopin

                Just a quick update :

                 

                I set NewRatio=3 and total GC time on the application server is now about 30% less than before.

                 

                The JVM performs more minor GCs (which was something to be expected, since the new generation now has less heap space), but the total minor GC time has not increased.  Also, it performs much fewer major GCs, and while the time for each full GC is a little longer than before (which was also expected, since the heap space allocated to the old generation has increased), the overall time spent in major GCs is less.

                 

                So it looks like this setting did do some good.  I still don't know whether the original problem (full GCs all the time) will reappear, but will post again when I have something of interest to report.

                 

                Thanks again.

                C.

                • 5. Re: Full Garbage Collection every few seconds – GC frees very little memory (JBoss 4.2.3 with JDK 1.5.0 64-bit)
                  wdfink

                  Unfortunately the GC will be a never ending story.

                  We have different settings for the same application in different scenarios or requirements.

                  The adjustment is a balance between troughput <=> full stop time.

                   

                  For our application I observe that the OldGeneration is after a start phase, more or less, static.

                  We adjust young generation experimental that only a small part will be alive if the area become full and no object will survive more than once.

                  With this behaviour the GC is stable a good bandwidth, after a threshold the application become slower and because of this object alive more often.

                  That you will see in the GC logfile and this indicate that the application is short before overloaded.

                  • 6. Re: Full Garbage Collection every few seconds – GC frees very little memory (JBoss 4.2.3 with JDK 1.5.0 64-bit)
                    peterj
                    The JVM performs more minor GCs (which was something to be expected,  since the new generation now has less heap space), but the total minor  GC time has not increased.  Also, it performs much fewer major GCs, and  while the time for each full GC is a little longer than before (which  was also expected, since the heap space allocated to the old generation  has increased), the overall time spent in major GCs is less.

                    For a more striking example of this phenomenon, see the example in my "101" paper. Actually, in that example I avoid full GCs altogether!

                     

                    You really should also total the amount of time spent in GC over the course of the run and determine the percentage of time spent in GC. That is one of the data points I have in my latest white paper that I am presenting at CMG in two weeks. I am also using R to analyze the data, beats Excel!

                    • 7. Re: Full Garbage Collection every few seconds – GC frees very little memory (JBoss 4.2.3 with JDK 1.5.0 64-bit)
                      jimyliu

                      Some possibilities:

                      - Maybe there is memory leak in your application. Since you are using 64 bit JVM, you can increase the heap size to 4G or even 6G to see if heap usage keeps increasing.

                      - If heap usage is stabilized at some point (ex: 3GB), then you use 4G heap size and avoid full GC.

                      - If heap usage keep increasing, then several memory heap dump and analyze if there is any leak

                      - Of course, you can also dump the memory and investigate if you can reduce the memory footprint in the application code. This takes more time than just increases the heap size.

                      • 8. Re: Full Garbage Collection every few seconds – GC frees very little memory (JBoss 4.2.3 with JDK 1.5.0 64-bit)
                        kopin

                        If I had a memory leak, wouldn't I always get an OutOfMemoryError eventually?  As I wrote on the first post, the JVM is never throwing one.

                         

                        C.

                        • 9. Re: Full Garbage Collection every few seconds – GC frees very little memory (JBoss 4.2.3 with JDK 1.5.0 64-bit)
                          kopin

                          Update:

                           

                          It seems that the problem persists even after setting NewRatio=3 with the throughput GC.

                           

                          We therefore tried a different approach altogether by using the CMS collector (in order to minimize stop times) with the following settings on one of the servers :

                           

                          JAVA_OPTS="-server -Xrs -Xms2048m -Xmx2048m -XX:MaxPermSize=512m -XX:NewSize=300m -XX:ThreadStackSize=160  -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseCMSCompactAtFullCollection -Djava.awt.headless=true -Djava.library.path=$JBOSS_NATIVE_DIR -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000"

                           

                          I had to set NewSize explicitly, since NewRatio is ignored by the CMS collector (I read a post that said it is some sort of JVM bug).

                           

                          We noticed that during the time the server with the Throughput GC was having a "bad" run, the one with the CMS collector was healthy, although they were both under significant load.  The server with the CMS collector recovered fully and the GC was freeing a lot of memory, while the one with the throughput GC did not and had to be restarted.

                           

                          CMSInitiatingOccupancyFraction was set to 60, which is somewhat conservative; will probably need to experiment on that.

                          • 10. Re: Full Garbage Collection every few seconds – GC frees very little memory (JBoss 4.2.3 with JDK 1.5.0 64-bit)
                            peterj
                            If I had a memory leak, wouldn't I always get an OutOfMemoryError  eventually?  As I wrote on the first post, the JVM is never throwing  one.

                            That is true, but continual full GCs means that the app's working set size is a very large percentage of the heap and that you have sized the heap too small.  And when you use the CMS collector you need to size the heap even larger, if you don't then you run into the possibility of the CMS collector not being able to free up memory in a timely fashion, in which case a very expensive (time-wise) stop-the-world collection takes place. Right-sizing the heap usually takes a lot of GC monitoring and experimentation to get right, though the easiest way is to allocate a very large heap to begin with (like Jimy Liu suggested) and monitor that, then you can usually come up with a decent heap size.