8 Replies Latest reply on Dec 10, 2010 9:20 AM by robertjlee

How to configure HornetQ to be faster at loading the journal?

robertjlee Nov 12, 2010 7:27 AM

Can we please ask some advice on how to get HornetQ to start up more quickly? Currently, it can take several hours for our server to start up.

We have a HornetQ server with a single topic, with a typical message size of around 3-5K, and the following address settings:

<max-size-bytes>4294967296</max-size-bytes>
<page-size-bytes>1048576</page-size-bytes>

The JVM maximum heap size (-Xmx) is set to 10240m, which we thought would be plenty to cover the running of HornetQ with a 4096MiB topic.

When we start HornetQ, it very quickly slows down as it reads each file in the journal. The slowdown seems to be the checkDeleteSize() anonymous class in JournalImpl.load().

For most of the journal files, the free memory is not less than 20% of the maximum memory, so this method does nothing. But when it gets about 80% of the way through, the loading process slows right down. We have seen start-up times of several hours in total.

Some more observations:

- most of the files have a deleteCount of 20001, exactly one more than the threshold.

- the amount of memory saved by this loop does not make any significant difference to the heap usage shown in the JMX console (which also shows Eden space and Old Gen space as full)

- this method loops over all messages in memory for each journal file after a certain threshold. It is therefore O(M*N), where M is the number of messages and N the number of journal files.

- this method is single-threaded, which means we are limited by the speed of each CPU core; if it used a multi-threaded approach (even just to find the records), it would give us a significant speed boost.
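
To make the O(M*N) observation above concrete, here is a rough cost model. It is not HornetQ code: the class and method names are hypothetical, and it assumes that every journal file read after some point triggers a full pass over all records loaded so far, with no records being freed by that pass.

```java
public final class LoadCostSketch {

    /**
     * Counts how many in-memory records get visited in total, assuming
     * (hypothetically) that each file from index firstSlowFile onwards
     * triggers one full scan of everything loaded up to and including it.
     */
    public static long recordVisits(int files, int recordsPerFile, int firstSlowFile) {
        long visits = 0;
        long inMemory = 0;
        for (int f = 0; f < files; f++) {
            inMemory += recordsPerFile;   // records added by reading this file
            if (f >= firstSlowFile) {
                visits += inMemory;       // full pass over the records held so far
            }
        }
        return visits;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 1000 files, 20001 records each,
        // with the scan kicking in about 80% of the way through.
        System.out.println(recordVisits(1000, 20001, 800));
    }
}
```

Under these assumptions the work in the last part of the load grows with both the number of remaining files and the number of records already held, which would match a load that is quick at first and then slows right down.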

The main problem this causes is that we need to artificially increase the -Xmx heap setting for Java in order to start HornetQ in a sensible time-frame. But there is no way to reduce this setting without restarting HornetQ, which means waiting for the journal directory to become small. (We think we need between 12 and 14GB to start up while avoiding this loop, for a single 4GB address.)

Would it be better to have fewer, larger journal files, or would this cause a performance problem during normal use?

Would it be better to have more, smaller journal files, to try and avoid having 20000 deletes in each file?

Is there any way we can "compact" the journal (i.e. remove deleted records) while HornetQ is stopped?

Many thanks for any help or advice.

1. Re: How to configure HornetQ to be faster at loading the journal?

clebert.suconic Nov 12, 2010 11:14 AM (in response to robertjlee)

There has been an update on trunk that will make it faster. Can you check on trunk?
2. Re: How to configure HornetQ to be faster at loading the journal?

robertjlee Nov 15, 2010 5:20 AM (in response to clebert.suconic)

I think we're currently running TRUNK version #9716; has the journal load code been improved since then?
3. Re: How to configure HornetQ to be faster at loading the journal?

clebert.suconic Nov 15, 2010 12:11 PM (in response to robertjlee)

No.. it hasn't...

Do you think you could share your data with me? (I never look at the data itself... to me it is all random bytes.) We can exchange emails about how to get access.

Or, if you can't, can you produce a test case that reproduces similar data?
4. Re: How to configure HornetQ to be faster at loading the journal?

robertjlee Nov 16, 2010 11:31 AM (in response to clebert.suconic)

Sadly, we deleted our backup of the data journal after we got it restarted last time.

Hopefully we'll be able to reproduce a test case in the next few days.
5. Re: How to configure HornetQ to be faster at loading the journal?

clebert.suconic Nov 16, 2010 12:19 PM (in response to robertjlee)

You should preserve the whole data set (if you can share the data).

A testcase will be the best option though.
6. Re: How to configure HornetQ to be faster at loading the journal?

robertjlee Dec 9, 2010 12:26 PM (in response to clebert.suconic)

We are working on a test case, and think we've reproduced something very similar to the issue that we saw on the live server.

It looks like part of the problem is the garbage collection method. When we run our unit tests with the default JVM settings, it works fine, but when we use the defaults from run.sh (-XX:+UseParallelGC -XX:+AggressiveOpts -XX:+UseFastAccessorMethods -XX:ParallelGCThreads=6), the system slows right down during journal load. We haven't seen checkDeleteSize() take progressively longer in the test yet, though.

We will hopefully be in a position to create a JIRA issue tomorrow.
7. Re: How to configure HornetQ to be faster at loading the journal?

clebert.suconic Dec 9, 2010 5:08 PM (in response to robertjlee)

What probably happened was: There was a bug fix on trunk...

And currently there's an issue with loading data from previous versions. There are probably two issues.

I would be interested in a test case that reproduces the probable issue within the same version on trunk.
8. Re: How to configure HornetQ to be faster at loading the journal?

robertjlee Dec 10, 2010 9:20 AM (in response to clebert.suconic)

The data was generated by TRUNK version #9716, and read back by the same version. We're still working on a unit test to replicate the problem; we think we know how to do it but the amount of data involved is significant enough that it's taking longer than we expected.

We did find one issue though: the default run.sh file contains settings for Xmx but not Xms, so the total memory will likely be smaller than the maximum memory at the point where the journal loads. JournalImpl line 1480 tests to see if memory is critical by comparing freeMemory with maxMemory, but freeMemory is the amount of total memory that is free, even if total memory is less than max memory. So the code could end up flushing deletes even if memory is not critical, but only a small amount of total memory is available.

Perhaps you are deliberately trying to limit heap expansion during the journal load, but heap expansion is already controlled by -XX:MinHeapFreeRatio and -XX:MaxHeapFreeRatio, so it would seem strange to attempt that here?
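
To illustrate the freeMemory/maxMemory point, here is a minimal sketch. It is not the HornetQ code: the class name, method names and the 20% constant are assumptions based on the behaviour described earlier in this thread. It compares a "free versus max" check with one that also counts heap the JVM has not yet committed.

```java
public final class HeapHeadroomSketch {

    // Assumed threshold, taken from the "less than 20% of maximum memory" description in this thread.
    static final double CRITICAL_RATIO = 0.20;

    /** Compares free memory inside the committed heap against the maximum heap. */
    public static boolean looksCriticalFreeVsMax(long free, long total, long max) {
        return free < max * CRITICAL_RATIO;
    }

    /** Also counts heap that could still be committed (max - total) as available. */
    public static boolean looksCriticalWithHeadroom(long free, long total, long max) {
        long available = free + (max - total);
        return available < max * CRITICAL_RATIO;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long free = rt.freeMemory();    // free bytes within the committed heap
        long total = rt.totalMemory();  // bytes currently committed to the heap
        long max = rt.maxMemory();      // upper bound the heap may grow to
        System.out.println("free vs max says critical:     " + looksCriticalFreeVsMax(free, total, max));
        System.out.println("with headroom says critical:   " + looksCriticalWithHeadroom(free, total, max));
    }
}
```

With no -Xms set, total memory can start well below max, so the first check may report "critical" while most of the heap has not even been committed yet; the second check only does so when the heap is genuinely close to its limit.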