36 Replies — Latest reply: Mar 22, 2012 8:04 AM by Manik Surtani
  • 15. Re: InfiniSpan 5.1.0CR2 with Jgroups 3.0.1 performance
    Galder Zamarreño Master

    Something else for you to try: add useReplQueue="true" replQueueMaxElements="10000" attributes to the <async> element in order to use a replication queue.
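    In the Infinispan 5.1 XML, that suggestion would land inside the <clustering> element roughly like this (the cache name and the replQueueInterval value are illustrative additions, not from Galder's post):

    ```xml
    <namedCache name="myReplCache">
       <clustering mode="replication">
          <!-- Batch async updates into a queue that is flushed every
               replQueueInterval ms, or whenever replQueueMaxElements
               updates have accumulated, whichever comes first -->
          <async useReplQueue="true"
                 replQueueMaxElements="10000"
                 replQueueInterval="100"/>
       </clustering>
    </namedCache>
    ```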

  • 16. Re: InfiniSpan 5.1.0CR2 with Jgroups 3.0.1 performance
    Bela Ban Master

    OK, so I ran your test on my laptop, but changed the test slightly (removed unnecessary auto-boxing/unboxing). I also use a repl-queue, as described by Galder.

    I got 24 TXs / millisecond.

    I actually used the latest JGroups and Infinispan, but this didn't make a great diff (ca. 10%).

  • 17. Re: InfiniSpan 5.1.0CR2 with Jgroups 3.0.1 performance
    Galder Zamarreño Master

    Btw, 275 tx/ms for ehcache looks way too high. In fact, I was getting a similar number with Infinispan earlier... when the passive and active nodes did not actually cluster.

     

    I don't see where in that ehcache configuration is clustering set up.

     

    AFAIK, you either need some kind of JGroups magic that ehcache used to plug in, or Terracotta magic, which is the clustering voodoo they add these days.

  • 18. Re: InfiniSpan 5.1.0CR2 with Jgroups 3.0.1 performance
    Bela Ban Master

    Yes, 275'000 TXs / sec seems too good to be true. The thing you don't do in your test is wait for all of the data to arrive at the recipient ("passive"). So if an implementation simply buffers the puts and removes (the gets are local, so no clustering traffic is generated there), your test would finish (if the buffer is large enough to hold all of the puts and removes) without *any* replication happening at all!

     

    I can't speak for ehcache, but in our system, the PUTs and REMOVEs are sent asynchronously (in this config), or even placed into a replication queue (if enabled), and sent whenever 2000 messages have been queued. It is very likely that the passive cache won't have received all updates when your perf test finishes...

     

    You could modify your test to put a special moniker into the cache when all sender threads are done, and poll at the recipient until you see this moniker, and only *then* return and measure the time taken.
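    A minimal sketch of that measurement pattern, runnable without a cluster: a plain ConcurrentHashMap plus a queue-draining thread stands in for the async replication to the passive node (the real test would use org.infinispan.Cache, which implements ConcurrentMap). All class and method names here are illustrative, not Infinispan APIs.

    ```java
    import java.util.Map;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.LinkedBlockingQueue;

    class EndMarkerDemo {
        static final String END = "__END__";

        // Sends numPuts async modifications, then an END marker, and only
        // stops the clock once the "passive" side has applied the marker.
        static int replicateAndWait(int numPuts) throws Exception {
            BlockingQueue<String> replQueue = new LinkedBlockingQueue<>();
            Map<String, String> passive = new ConcurrentHashMap<>();

            // Stand-in for the async replication thread feeding the passive node
            Thread replicator = new Thread(() -> {
                try {
                    for (;;) {
                        String key = replQueue.take();
                        passive.put(key, "value");
                        if (key.equals(END)) return;
                    }
                } catch (InterruptedException ignored) {}
            });
            replicator.start();

            long start = System.nanoTime();
            for (int i = 0; i < numPuts; i++)
                replQueue.put("key-" + i);   // async put: returns immediately
            replQueue.put(END);              // marker, after all senders are done

            // The crucial part: poll until the passive cache has seen the marker
            while (!passive.containsKey(END))
                Thread.sleep(1);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(numPuts + " puts replicated in " + elapsedMs + " ms");
            return passive.size() - 1;       // replicated entries, minus the marker
        }

        public static void main(String[] args) throws Exception {
            replicateAndWait(10_000);
        }
    }
    ```

    Without the final polling loop, the clock would stop right after the last put, while the queue may still hold unreplicated entries — which is exactly the effect being discussed here.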

  • 19. Re: InfiniSpan 5.1.0CR2 with Jgroups 3.0.1 performance
    Liron Tepper Newbie

    Hi Galder,

     

    First, thank you for your answers.

     

    Apologies for the typo. In the lower example, I meant

     

    <async asyncMarshalling="false"/>

     

    which solved the RejectedExecutionException I had.

     

     

    Regarding the ehcache version, it's ehcache-core-2.5.1.jar.

    One more important thing - I've made ehcache run synchronously by setting

    properties="replicateAsynchronously=false"

    on the cacheEventListenerFactory.

     

     

    and the results were 2 transactions / millisecond.

     

    This explains why in asynchronous cache mode I got the memory problem -

    In fact, further tests showed that when the asynchronous cache mode test finished, millions of RMI objects were still running.

    This means that the 275 transactions / millisecond test was actually misleading and should be ignored.

    Regarding the use of TCP in Infinispan with JGroups, I actually started with this configuration and got similar results (a different test setup, but I got to 14 transactions / millisecond).

    Galder - with the TCP setup, can you please mention what rates you achieved?

    Also here is my network configuration:

    =================================================

    [root@smp128 classes]# ethtool eth0
    Settings for eth0:
            Supported ports: [ TP ]
            Supported link modes:   10baseT/Half 10baseT/Full
                                    100baseT/Half 100baseT/Full
                                    1000baseT/Full
            Supports auto-negotiation: Yes
            Advertised link modes:  10baseT/Half 10baseT/Full
                                    100baseT/Half 100baseT/Full
                                    1000baseT/Full
            Advertised auto-negotiation: Yes
            Speed: 1000Mb/s
            Duplex: Full
            Port: Twisted Pair
            PHYAD: 1
            Transceiver: internal
            Auto-negotiation: on
            Supports Wake-on: g
            Wake-on: g
            Link detected: yes

    =================================================

    Is it possible that the 1000 Mb/s link is my bottleneck?

    Regards,

    Liron Tepper


     

  • 20. Re: InfiniSpan 5.1.0CR2 with Jgroups 3.0.1 performance
    Liron Tepper Newbie

    Hi Bela,

     

    Thank you very much for your answers.

     

    24 transactions / ms sounds way better, and may be good enough for us.

     

    Can you please send me the code changes you've made?  I would like to try to get to these rates.

     

    Another question - in your asynchronous setup, are you sure that the replication queue is empty at the moment the test ends?

     

    Regards,

    Liron Tepper

  • 21. Re: InfiniSpan 5.1.0CR2 with Jgroups 3.0.1 performance
    Bela Ban Master

    With sync replication I get 3 transactions / millisecond in my test. This will naturally always be much slower than async repl

  • 22. Re: InfiniSpan 5.1.0CR2 with Jgroups 3.0.1 performance
    Bela Ban Master

    Send me your email address (to belaban at yahoo dot com), and I'll send you the changes. Note that I used Infinispan 5.2 snapshot and JGroups 3.1 snapshot, I can ship those 2 JARs as well if you want.

     

    With this setup and <async asyncMarshalling="false" useReplQueue="true" replQueueMaxElements="1000"/>, I get (for 60 threads, each doing 100'000 requests) 37 transactions / millisecond!

     

    No, I don't know that the replication queue is empty when the test ends; I actually suspect there are still elements in it. That's why I suggested the addition of an END marker, taking the stop time only when the END marker is in the passive cache.

  • 23. Re: InfiniSpan 5.1.0CR2 with Jgroups 3.0.1 performance
    Liron Tepper Newbie

    Hi Bela,

     

    I've used your code, the new configuration files and the new jars, and got an improvement to 17 transactions / millisecond.

    I'm now trying to understand what is limiting the performance in my test.

     

    Regards,

    Liron Tepper

  • 24. Re: InfiniSpan 5.1.0CR2 with Jgroups 3.0.1 performance
    Bela Ban Master

    Hi Liron,

     

    The config I sent you uses UDP. You might want to tune the TCP/IP stack a bit (e.g. increase the net.core.rmem_max buffer) to increase the perf, or try using TCP instead of UDP.
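    On Linux, Bela's buffer suggestion can be applied roughly like this (the concrete values are illustrative figures commonly used for JGroups UDP tuning, not numbers from this thread):

    ```shell
    # Raise the kernel's maximum socket receive/send buffer sizes (needs root);
    # JGroups' UDP transport can then request larger buffers via mcast_recv_buf_size etc.
    sysctl -w net.core.rmem_max=26214400
    sysctl -w net.core.wmem_max=1048576
    # To persist across reboots, add the same keys to /etc/sysctl.conf
    ```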

     

    For me it's not so important whether you get 17 or 37 TXs/ms; what I wanted to look into was the 250+ TX/ms number you got on ehcache, versus the low number on Infinispan. I was concerned about an order-of-magnitude perf diff between ehcache and Infinispan, but now this seems to be gone. Do you have any recent numbers on your tests with ehcache?

     

    As I mentioned before, async replication gives you the best numbers, but your test doesn't really measure the time to send and receive N modifications; it only measures the time to *send* N modifications, without waiting until they have been received on the passive node. I suggest modifying your test slightly: each sender thread should place an END marker key into the cache, and - when done sending - the test should block until *all* END markers are seen in the passive cache. IMO, measuring the time it takes to *replicate* N items is more meaningful than measuring the time it takes to *send* N items.

    Cheers,

  • 25. Re: InfiniSpan 5.1.0CR2 with Jgroups 3.0.1 performance
    Liron Tepper Newbie

    Hi Bela,

     

    With ehcache I got only 2 TX / ms in synchronous mode, and in asynchronous mode it has a memory consumption problem, which means the test failed.

    I actually did what you suggested with the END marker, and this is how I got the 17 TX / ms (sent one marker after the "join" operation on all threads).

     

    Regards,

    Liron Tepper

  • 26. Re: InfiniSpan 5.1.0CR2 with Jgroups 3.0.1 performance
    Bela Ban Master

    Liron Tepper wrote:

     

    Hi Bela,

     

    With ehcache I got only 2 TX / ms in synchronous mode, and in asynchronous mode it has a memory consumption problem, which means the test failed.

    I actually did what you suggested with the END marker, and this is how I got the 17 TX / ms (sent one marker after the "join" operation on all threads).

     

    OK, now you should run Infinispan in synchronous mode as well (add <sync/> and remove <async ... />), and see what numbers you're getting. I'd say this should be about the same as ehcache, maybe a bit higher. In real life, your app will probably want to group the modifications made to the cache into a real (JTA) transaction.
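    For reference, the synchronous variant of the clustering section might look like this in the Infinispan 5.1 XML (the cache name and the replTimeout value are illustrative, not from this thread):

    ```xml
    <namedCache name="myReplCache">
       <clustering mode="replication">
          <!-- Each put blocks until all cluster nodes have acknowledged it,
               or fails after replTimeout ms; replaces the <async .../> element -->
          <sync replTimeout="15000"/>
       </clustering>
    </namedCache>
    ```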

     

    I see - with an END marker, you'll probably get lower numbers because you now include the time to apply all modifications, and that takes longer than just measuring the time to send them. This would also be lower than 37 TX/ms on my machine. The diff shouldn't be dramatic though, as my repl queue was only 1000, so I would only have had to subtract the time it takes to deliver 1000 modifications (tops) to the application.

     

    Re ehcache: you might run out of memory because they're buffering modifications (similar to our replQueue) and only flush the queue every now and then. You might be able to set this interval, thus preventing the OOME. I'm not an expert though; consult the ehcache forums for details on how to do this.

  • 27. Re: InfiniSpan 5.1.0CR2 with Jgroups 3.0.1 performance
    Liron Tepper Newbie

    Hi Bela,

     

    Thanks for your answer.

    I have another question for you.

     

    Is it possible to use a setup in which every machine runs two instances of the Infinispan server (a total of two machines)?

    If so, will the rate be 17 tx / ms in each of the instances, or 8.5 tx / ms each (a total rate of 17)?

    This would double the performance for us if possible.

     

     

    Regards,

    Liron Tepper

  • 28. Re: InfiniSpan 5.1.0CR2 with Jgroups 3.0.1 performance
    Bela Ban Master

    You can run as many instances as you want, as long as the instances form a cluster. So, for example, you could run 2 passive instances on one box, and 1 passive and 1 active on another box.

     

    Running 1 passive / 1 active instance, I get 35 TXs / ms.

     

    Running 3 passive / 1 active instances, I get 28 TXs / ms.

     

    Remember, we're using replication, so

    - every key is stored in every instance (including the active cache)

    - every modification triggers a message send to all nodes in the cluster

     

    The first item makes scalability a function of the average data size and the cluster size. For example, if every node contributes an average of 100MB of data, then with 10 nodes every node will have to store roughly 1GB. As you can see, this does not scale well as either the cluster size or the data size increases. However, reads are very fast, as they are always local (no network round trip).

  • 29. Re: InfiniSpan 5.1.0CR2 with Jgroups 3.0.1 performance
    Liron Tepper Newbie

    Hi Bela,

     

    Thank you for your reply.

     

    I might not have explained myself so well.

    What I actually meant was forming 2 different clusters, each cluster holding different cache instances.

     

    host A                     host B
    ======                     ======

               cluster 1
    passive1 <------------>  active1

               cluster 2
    passive2 <------------>  active2

     

     

    Note that passive1 and passive2 are different java processes.

     

     

    I have just managed to run such a setup, using separate configuration files for each cluster (with different ports etc.).

    The results were very good - I got an actual rate of 16 tx / ms in each cluster!

     

    Does that sound possible?  I wonder what the bottleneck of a single cluster is.

     

    Regards,

    Liron Tepper