1 2 Previous Next 17 Replies Latest reply on Mar 8, 2012 5:18 AM by arumilli

Cpu 100% and GC is running

arumilli Jan 5, 2012 7:07 AM

Hi,

We have observered the CPU is touch 100% usage while GC running in JVM for couple of secs and (daemon)threads are increasing from 200 to 500.

Following are the GC logs:

2012-01-05T09:55:04.131000: 9748.990: [GC [PSYoungGen: 1269344K->24844K(1274944K)] 1644968K->410570K(3931200K), 0.1539430 secs]
2012-01-05T09:55:06.901000: 9751.760: [GC [PSYoungGen: 1266572K->16787K(1275200K)] 1652298K->412469K(3931456K), 0.1403010 secs]
2012-01-05T09:55:09.826000: 9754.685: [GC [PSYoungGen: 1258515K->28069K(1272640K)] 1654197K->427708K(3928896K), 0.0863350 secs]
2012-01-05T09:55:12.573000: 9757.432: [GC [PSYoungGen: 1267749K->28986K(1268672K)] 1667388K->442158K(3924928K), 0.1925500 secs]
2012-01-05T09:55:15.415000: 9760.274: [GC [PSYoungGen: 1268666K->29813K(1269760K)] 1681838K->456553K(3926016K), 0.1983170 secs]
2012-01-05T09:55:18.400000: 9763.259: [GC [PSYoungGen: 1263605K->25547K(1271232K)] 1690345K->465817K(3927488K), 0.0741820 secs]
2012-01-05T09:55:21.030000: 9765.889: [GC [PSYoungGen: 1259339K->28609K(1270464K)] 1699609K->479890K(3926720K), 0.0903450 secs]
2012-01-05T09:55:23.691000: 9768.550: [GC [PSYoungGen: 1262145K->23413K(1271104K)] 1713426K->486897K(3927360K), 0.0705770 secs]
2012-01-05T09:55:26.348000: 9771.207: [GC [PSYoungGen: 1256949K->20455K(1273152K)] 1720433K->492161K(3929408K), 0.0597660 secs]
2012-01-05T09:55:28.934000: 9773.793: [GC [PSYoungGen: 1257383K->25634K(1272448K)] 1729089K->504501K(3928704K), 0.0641520 secs]
2012-01-05T09:55:31.628000: 9776.487: [GC [PSYoungGen: 1262562K->24623K(1274304K)] 1741429K->513884K(3930560K), 0.0649770 secs]
2012-01-05T09:55:34.698000: 9779.557: [GC [PSYoungGen: 1263791K->34357K(1273536K)] 1753052K->531695K(3929792K), 0.0710200 secs]
2012-01-05T09:55:37.427000: 9782.286: [GC [PSYoungGen: 1273525K->32264K(1261248K)] 1770863K->550162K(3917504K), 0.0734510 secs]
2012-01-05T09:55:40.007000: 9784.866: [GC [PSYoungGen: 1258376K->36728K(1262848K)] 1776274K->572739K(3919104K), 0.0913590 secs]
2012-01-05T09:55:42.520000: 9787.379: [GC [PSYoungGen: 1262840K->21809K(1265728K)] 1798851K->578910K(3921984K), 0.0735170 secs]
2012-01-05T09:55:44.996000: 9789.855: [GC [PSYoungGen: 1243889K->21991K(1265024K)] 1800990K->585886K(3921280K), 0.0672800 secs]
2012-01-05T09:55:47.506000: 9792.365: [GC [PSYoungGen: 1244071K->25800K(1268032K)] 1807966K->597725K(3924288K), 0.0715630 secs]
2012-01-05T09:55:50.101000: 9794.960: [GC [PSYoungGen: 1251720K->20716K(1266560K)] 1823645K->603125K(3922816K), 0.0660460 secs]
2012-01-05T09:55:52.527000: 9797.386: [GC [PSYoungGen: 1246636K->17876K(1268672K)] 1829045K->606765K(3924928K), 0.0609200 secs]
2012-01-05T09:55:54.990000: 9799.849: [GC [PSYoungGen: 1246420K->22689K(1268544K)] 1835309K->615465K(3924800K), 0.0649620 secs]
2012-01-05T09:55:57.464000: 9802.323: [GC [PSYoungGen: 1251233K->16658K(1269824K)] 1844009K->617066K(3926080K), 0.0643570 secs]
2012-01-05T09:55:59.967000: 9804.826: [GC [PSYoungGen: 1247570K->15072K(1269760K)] 1847978K->619016K(3926016K), 0.0599230 secs]
2012-01-05T09:56:02.499000: 9807.358: [GC [PSYoungGen: 1245984K->16908K(1270400K)] 1849928K->623478K(3926656K), 0.0640130 secs]
2012-01-05T09:56:05.053000: 9809.912: [GC [PSYoungGen: 1248460K->13682K(1269824K)] 1855030K->625222K(3926080K), 0.0592950 secs]
2012-01-05T09:56:07.780000: 9812.639: [GC [PSYoungGen: 1245234K->7300K(1269056K)] 1856774K->623595K(3925312K), 0.0493350 secs]
2012-01-05T09:56:10.469000: 9815.328: [GC [PSYoungGen: 1238148K->5620K(1236480K)] 1854443K->623920K(3892736K), 0.0413330 secs]
2012-01-05T09:56:13.238000: 9818.097: [GC [PSYoungGen: 1236468K->7042K(1269248K)] 1854768K->627137K(3925504K), 0.0379700 secs]
2012-01-05T09:56:15.948000: 9820.807: [GC [PSYoungGen: 1236802K->7586K(1269184K)] 1856897K->629529K(3925440K), 0.0296320 secs]
2012-01-05T09:56:18.617000: 9823.476: [GC [PSYoungGen: 1237346K->10997K(1271104K)] 1859289K->635764K(3927360K), 0.0349820 secs]
2012-01-05T09:56:21.153000: 9826.012: [GC [PSYoungGen: 1243189K->9572K(1269760K)] 1867956K->638488K(3926016K), 0.0305350 secs]
2012-01-05T09:56:54.537000: 9859.396: [GC [PSYoungGen: 1241764K->11066K(1273536K)] 1870680K->642061K(3929792K), 0.0356580 secs]
2012-01-05T09:58:26: 9950.859: [GC [PSYoungGen: 1248058K->16924K(1272128K)] 1879053K->649544K(3928384K), 0.0543600 secs]

and my JVM settings are

-Dprogram.name=run.sh
-Dprogram.name=run.sh
-Xms3872m
-Xmx3872m
-Dsun.rmi.dgc.client.gcInterval=3600000
-Dsun.rmi.dgc.server.gcInterval=3600000
-XX:NewSize=1278m
-XX:MaxNewSize=1278m
-Djava.net.preferIPv4Stack=true
-XX:MaxPermSize=256m
-Djavax.net.ssl.trustStore=/opt/properties/.keystore
-Djavax.net.ssl.trustStorePassword=changeit
-Dcom.sun.management.jmxremote.port=8888
-Djava.rmi.server.hostname=xx.xx.xx.xx
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote
-XX:+PrintGCDetails
-Xloggc:/opt/gc.log
-XX:+UseTLAB
-Xss256k

Regards,

Arumilli.

1. Re: Cpu 100% and GC is running

rhusar Jan 5, 2012 7:21 AM (in response to arumilli)

So whats the question? :-) GC could be a CPU intensive activity, so reaching 100% CPU usage is plausible.
Actions
2. Re: Cpu 100% and GC is running

arumilli Jan 5, 2012 7:21 AM (in response to arumilli)

adding screenshot
Actions
3. Re: Cpu 100% and GC is running

arumilli Jan 5, 2012 7:23 AM (in response to rhusar)

Sorry..I missed my question. Why the Daemon threads are increasing and how I can control this.

Arumilli
Actions
4. Re: Cpu 100% and GC is running

rhusar Jan 5, 2012 7:40 AM (in response to arumilli)

If you are using parrallel GC the number of garbage collector threads can be controlled with the ParallelGCThreads command line option -XX:ParallelGCThreads=<your desired number>. Or maybe you want to stop clients from invoking GC directly via .gc() call, then you can -XX:+DisableExplicitGC.
Actions
5. Re: Cpu 100% and GC is running

arumilli Jan 5, 2012 7:59 AM (in response to rhusar)

Thanks for your suggestion..I missed to add DisableExplicitGsC parameter. I will add this now and will post you the results.

Thanks for your help
Actions
6. Re: Cpu 100% and GC is running

rhusar Jan 5, 2012 8:35 AM (in response to arumilli)

GC tuning is very specific to your application and your workflow. Of course there are lots of general recommendations, but you will mostly have to experiment yourself.

Check the official pages for reference: http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html
Actions
7. Re: Cpu 100% and GC is running

wdfink Jan 5, 2012 10:10 AM (in response to arumilli)

I suppose that it will not help to disable ExplicitGC because JBoss do not use this normally.
And if your application do such your developer should know often it is a workaround for a high memory consumption.
Actions
8. Re: Cpu 100% and GC is running

wdfink Jan 5, 2012 10:22 AM (in response to arumilli)

Often I see a behaviour that after reaching a threshold in memory consumption the GC become more active, this force more parallel access (because of longer duration) and this will end in more GC to handle the small rest of memory

So what you might check
- is the overall memory use increased => you will have a leak
you might use jstat to monitor the JBoss JVM it will be not a big overhead and its possible in production
- you might have a traffic peak that end in such behaviour => check whether more memory or clustering helps
also jstat will help

Also you should activate the GC logging (see link gc-tuning from Rado above):
-verbose:gc -Xloggc:/tmp/jboss_gc.log -XX:+PrintGCDetails ...
The interesting log is 'Total Time where application stop' see Java GC tuning
In your case I suppose a that ~1% app time and 99% of such stop time, this mean the GC must work hard to provide memory for allocation
Actions
9. Re: Cpu 100% and GC is running

peterj Jan 5, 2012 11:22 AM (in response to arumilli)

Looking at the GC data you posted, here is what I see:

a) A minor collection is taking place about every 3 second

b) The minor collection is taking anywhere from 1/10 to 1/20 of a second. Even with 0.1 sec per collection, out of 3 seconds, that amounts to only about 3% of the wall-clock time is spent in GC, which is very good.

c) The minor collection cleans up about 1.2GB of objects, leaving about 20MB in the survivor space, and promoting about 5MB to the tenured generation. That also looks very good.

d) Based on (a) and (c), your app is generating 400MB of garbage per second. That is a lot of garbage. You might want to look at why that is an if there is anything you can do about it.

e) Your heap increased from 400MB to 650MB in about 3.5 minutes. That is an increase of about 70MB per minute. Based on the acitivity of this app, that doesn't look that bad either.

In all, this looks fairly good to me. It will be interesing, though, to see what happens when you hit a full GC. Cleaning up a 2.5GB tenured generation will be very painful (possibly very long pause time). There are two things you could do about that. First, reduce your overall heap size because based on the data you posted, I don't think that you really need a large heap. Though based on the amount of garbage being generated per second, you might really need a large heap. Second, switch to using the CMS collector. And when you do so, reset the tenuring threshold and survivor space sizes so that you can continue to clean up the 400MB of garbage per second with the minro collector rather than putting the burden for that on the CMS collector.
1 of 1 people found this helpful
Actions
10. Re: Cpu 100% and GC is running

arumilli Jan 6, 2012 7:02 AM (in response to peterj)

Thanks for your quick response..

I done the modification to JVM parameters and still threads getting incresing with GC.

-Dprogram.name=run.sh -Dprogram.name=run.sh -server -Xms3872m -Xmx3872m -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -XX:NewSize=1278m -XX:MaxNewSize=1278m -Djava.net.preferIPv4Stack=true -XX:MaxPermSize=256m -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -Djavax.net.ssl.trustStore=/opt/properties/.keystore -Djavax.net.ssl.trustStorePassword=changeit -Dcom.sun.management.jmxremote.port=8888 -Djava.rmi.server.hostname=192.168.100.39 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote -XX:+PrintGCDetails -Xloggc:/opt/gc.log -XX:+UseTLAB -XX:+UseParNewGC -XX:ParallelGCThreads=4 -Xss256k -Djava.net.preferIPv4Stack=true

Do i need to add survivor ratio ,if so how I can decide the ratio value.
Actions
11. Re: Cpu 100% and GC is running

peterj Jan 6, 2012 11:12 AM (in response to arumilli)

Why are you assuming that the daemon thread increase has anything to do with GC? Based on the GC data you provided, GCs happen every 3 seconds and are very short. The thread graph from VisualVM shows something completely different - the thread count is steady for a long time until suddenly it grows and then once again is steady. I think that there is another explanation for the growth in threads. It is almost as if, for example, a new EJB was accessed and the app server instantiated a number of them, each on its own thread.

I suggest that you take a thread dump, one before the daemon thread increase and another after, and examine the thread dump looking threads labelled "daemon".

By the way, the GC threads are always running. Take a thread dump and you should see 4 or 5 of them (based on your settings). They are there even when GC is not running. I hope that you have at least 4 cores - you should never set the number of GC threads to a larger number than the number of cores.
Actions
12. Re: Cpu 100% and GC is running

peterj Jan 6, 2012 11:19 AM (in response to arumilli)

As for setting the survivor ratio, look at the GC data you posted. What is the ratio of the young generation size before GC and the largest size after GC? The after GC size is entirely within a survivor space. I would assume a max survivor space size of about 1.5 times the largest after GC size you see (1.5 x 36MB ~= 50MB), then 1278MB / 50MB ~= 25, which is what I would set the ratio to.

You need to also set the tenuruing threshold. These white papers might help:

Java Garbage Collection Statistical Analysis 101
Java Garbage Collection Performance Analysis 201
at http://www.cmg.org/cgi-bin/search.cgi?q=java+peter+johnson&x=30&y=10
Actions
13. Re: Cpu 100% and GC is running

rhusar Jan 6, 2012 12:15 PM (in response to peterj)

It is almost as if, for example, a new EJB was accessed and the app server instantiated a number of them, each on its own thread.
EJB invocations wouldnt really create daemon threads, would they...
Actions
14. Re: Cpu 100% and GC is running

peterj Jan 6, 2012 12:50 PM (in response to rhusar)

>>EJB invocations wouldnt really create daemon threads, would they...
I don't know (which is why I said "almost as if"), I'm not sure under which circumstances a thread gets marked as daemon and I'm too busy to try some experiments to find out. But the idea is probably correct - something new got started and that caused the threads to be created. Initially I thought opening a database connection, but I don't think that each connection gets its own thread. But it has to be something.
Actions

1 2 Previous Next

Go to original post