9 Replies Latest reply on Sep 28, 2015 11:09 AM by mmr11408

    Significant performance degradation between 6.0 and 8.01

    mmr11408

       

      App: Two embedded Infinispan distributed, clustered cache programs running in Tomcat 8 on Java 8. One loads and manages the cache (the cache manager); the other reads the cache (the app server).

       

      I have been running with 6.0.0.Final on Windows for many months. The cache manager, which loads several caches, is started first; the app server is then started in another instance of Tomcat. After moving to 8.0.1, which was announced today, I noticed that the app server takes about 4 times longer to come up. There is no code change on my side between those two versions.

       

      Upgrading to 8.0.1 requires a new flag (-Djava.net.preferIPv4Stack=true) that I did not specify with 6.0.0; without it, a “cannot set ip_ttl” exception is thrown and the app dies.

       

      Below are the startup times using different versions:

       

      With -Djava.net.preferIPv4Stack=true

       

      1. 6.0 manager: Server startup in 39633 ms, Server startup in 40762 ms, Server startup in 38340 ms
      2. 6.0 server:  Server startup in 110445 ms, Server startup in 101195 ms, Server startup in 104344 ms

       

      Without -Djava.net.preferIPv4Stack=true

       

      1. 6.0 manager: Server startup in 37567 ms, Server startup in 35919 ms, Server startup in 36344 ms
      2. 6.0 server:  Server startup in 103732 ms, Server startup in 101272 ms, Server startup in 100887 ms

       

      Requires -Djava.net.preferIPv4Stack=true

       

      1. 7.2.5 manager: Server startup in 53831 ms
      2. 7.2.5 server:  Server startup in 448290 ms

       

      Requires -Djava.net.preferIPv4Stack=true

       

      1. 8.0 manager: Server startup in 52939 ms
      2. 8.0 server:  Server startup in 430250 ms

       

      The new flag seems to slow down both the manager and the server slightly in 6.0.0, but not significantly. Syncing the cache between cluster members is significantly slower in versions 7.x and 8.x. Is that expected, or is this news?

       

      Is it platform dependent (Windows vs. Linux), or is the same performance expected everywhere?

       

       

       

        • 1. Re: Significant performance degradation between 6.0 and 8.01
          dan.berindei

          We don't expect Windows to be slower than Linux, or 8.0.1 to be slower than 6.0.2 in general.

           

          Can you give more details as to what exactly happens during startup? Does the manager preload data from a database, and does the app server read that data explicitly or does it just receive it via state transfer? Can you post your Infinispan and JGroups configuration?

          • 2. Re: Significant performance degradation between 6.0 and 8.01
            mmr11408

             

            Thank you for the response. Normally, the manager preloads the cache from the DB during startup. In the test mode, it loads the cache with structured data using for loops, so it is not actually reading the DB. The app server receives the data via state transfer, it does not explicitly read the data.

             

            The JGroups configuration is the basic one shipped with Infinispan and is the same for both the cache manager and the app server. The Infinispan configuration of the manager tells it to load the full data into the cache, whereas the app server holds only 20,000 records in the cache and uses a file store to hold the rest. The manager also has an extra cache that only the manager instances use. All three files are uploaded (jgroups.xml, jbossDataGridConfigurationManager.xml, jbossDataGridConfigurationServer.xml). Let me know if I can provide any additional information. Thanks in advance.

             

            • 3. Re: Significant performance degradation between 6.0 and 8.01
              mmr11408

              Looks like I did not link the files to this discussion properly, sorry. Please see:

               

              jbossDataGridConfigurationServer.xml

              jbossDataGridConfigurationManager.xml

              jgroups.xml

              • 4. Re: Significant performance degradation between 6.0 and 8.01
                wdfink

                Did you use the default JGroups configuration shipped with 6 or with 8 for both runs, or did you use the one matching each version? JGroups may have changed in several ways between versions.

                 

                Also, are you able to see where the time is spent, or is it spread across all the startup phases?

                • 5. Re: Significant performance degradation between 6.0 and 8.01
                  mmr11408

                  Thanks for the reply.

                   

                  I used the same version of the JGroups configuration file that I was using for 6.0; I was not aware of any changes that had to be made to it.

                   

                  In a day or two I will attempt the upgrade again, collect the information, and post it.

                  • 6. Re: Significant performance degradation between 6.0 and 8.01
                    dan.berindei

                    Updating the JGroups configuration is not mandatory, but we highly recommend keeping your configuration as close as possible to the default for the particular version of Infinispan you're using. Of course, we also like to hear when the defaults don't work for you for some reason...
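
                    As a rough sketch of what "staying close to the default" can look like, the stack-file entry of the Infinispan configuration can point at the stack bundled with the release instead of a copied jgroups.xml. The path below is the one later reported in this thread for the 8.0.1 jar; that it can be resolved from the classpath of the web application is an assumption to verify in your own setup.

                    ```xml
                    <!-- Sketch only: reference the JGroups stack bundled with the Infinispan jar. -->
                    <!-- Assumes default-configs/default-jgroups-udp.xml is reachable on the classpath. -->
                    <jgroups transport="org.infinispan.remoting.transport.jgroups.JGroupsTransport">
                       <stack-file name="udp" path="default-configs/default-jgroups-udp.xml"/>
                    </jgroups>
                    ```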

                     

                    One thing sticking out is that we upgraded to UNICAST3 in the default configuration, which removed the need to RSVP state transfer-related commands. If you use UNICAST2 instead, it might take longer to re-transmit dropped messages, because we no longer set the RSVP flag on those commands.

                     

                    Regarding the Infinispan configuration, I see the state transfer chunkSize is set to 0 for most caches. That means "transfer everything in a single message" in 6.0, but is no longer supported since 7.0 because it too often leads to performance problems (and even OutOfMemoryErrors). We don't have a lot of hard data on this, but I'd say the ideal size for a state transfer "chunk" is around 1MB, so chunkSize should be 1000000/averageEntrySize. An order-of-magnitude variation shouldn't change the transfer speed too much, but setting the chunkSize to 1 when your entries are very small could slow things down a lot.
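
                    To make the rule of thumb concrete, here is a minimal sketch. The cache name and the average entry size of about 55 bytes are hypothetical, chosen only to illustrate the arithmetic; the attribute names are the ones used by the 8.0 XML configuration.

                    ```xml
                    <!-- Sketch only: hypothetical cache whose serialized entries average about 55 bytes. -->
                    <!-- Target roughly 1 MB per chunk: 1000000 / 55 = 18181 entries (rounded down). -->
                    <replicated-cache name="ExampleCache" mode="SYNC">
                       <state-transfer enabled="true" chunk-size="18181" await-initial-transfer="true"/>
                    </replicated-cache>
                    ```

                    Since an order-of-magnitude variation should not matter much, a rough estimate of the average entry size is enough.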

                     

                    If you try the update again, please enable DEBUG logging for org.infinispan and post the logs here; that might help track down where the slowdown is coming from.
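
                    How to enable that depends on the logging backend in use, which is not stated in this thread. As one possible sketch, assuming Log4j 2 is the backend and that it reads a log4j2.xml from the classpath, the org.infinispan category could be raised to DEBUG like this (the appender name and pattern are arbitrary choices):

                    ```xml
                    <?xml version="1.0" encoding="UTF-8"?>
                    <!-- Sketch only: assumes Log4j 2; adapt to whatever logging backend the application uses. -->
                    <Configuration status="WARN">
                       <Appenders>
                          <Console name="Console" target="SYSTEM_OUT">
                             <PatternLayout pattern="%d{HH:mm:ss,SSS} %-5p [%c{1}] (%t) %m%n"/>
                          </Console>
                       </Appenders>
                       <Loggers>
                          <!-- DEBUG for Infinispan only; everything else stays at INFO -->
                          <Logger name="org.infinispan" level="DEBUG"/>
                          <Root level="INFO">
                             <AppenderRef ref="Console"/>
                          </Root>
                       </Loggers>
                    </Configuration>
                    ```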

                    • 7. Re: Significant performance degradation between 6.0 and 8.01
                      mmr11408

                       

                      Thanks for the response. The chunk-size indeed seems to make a significant difference in version 8. Now I don’t think there is a performance degradation between release 6 and 8.

                       

                      Using default-configs/default-jgroups-udp.xml from infinispan-embedded-query-8.0.1.Final.jar (if that is not the one I should be using please let me know) I collected the following data:

                       

                      Using jgroups.xml from Infinispan 6.0.0 and chunk-size="1"

                       

                      Cache manager: Server startup in 44743 ms, Server startup in 54445 ms

                      App server: Server startup in 379250 ms, Server startup in 337161 ms

                       

                       

                      Debug enabled, chunk-size of 1 and using default-configs/default-jgroups-udp.xml from infinispan-embedded-query-8.0.1.Final.jar:

                       

                      Cache manager: Server startup in 60644 ms

                      App server: Server startup in 500309 ms

                       

                       

                      Debug enabled, chunk-size of 1000000/averageEntrySize and using default-configs/default-jgroups-udp.xml from infinispan-embedded-query-8.0.1.Final.jar:

                       

                      Cache Manager: Server startup in 49811 ms, Server startup in 49313 ms

                      App Server: Server startup in 89160 ms, Server startup in 93951 ms, Server startup in 92919 ms, Server startup in 93811 ms, Server startup in 95832 ms

                       

                       

                      One issue that I have not been able to resolve is the errors I get in Eclipse in my 8.0 Infinispan configuration files. It must be something very simple, but I have been staring at it for a while. In the file below, Eclipse displays this error on each line containing “locking”.

                       

                      vc-complex-type.2.4.d: Invalid content was found starting with element 'locking'. No child element is expected at this point.

                       

                      I suspect the issue is related to the state-transfer entry because if I move the lines around the error is always on the entry after “state-transfer”. If I take that XML and run a validator against it, I get the same error, but if I move “state-transfer” to the bottom of the “replicated-cache” section it becomes valid, but Eclipse complains earlier that a “state-transfer” is needed. Can anyone see what I am missing?

                       

                      <?xml version="1.0" encoding="UTF-8"?>
                      <infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                                  xsi:schemaLocation="urn:infinispan:config:8.0 http://www.infinispan.org/schemas/infinispan-config-8.0.xsd"
                                  xmlns="urn:infinispan:config:8.0">

                       

                         <jgroups transport="org.infinispan.remoting.transport.jgroups.JGroupsTransport">
                            <stack-file name="udp" path="jgroups.xml"/>
                            <!-- <stack-file name="tcp" path="jgroups-tcp.xml"/> -->
                         </jgroups>

                         <cache-container default-cache="default">

                            <replicated-cache name="MailboxCache" mode="SYNC" remote-timeout="90000" statistics="true" >
                                <state-transfer enabled="true" timeout="940000" chunk-size="18181" await-initial-transfer="true" />
                                <locking isolation="READ_COMMITTED" />
                                <partition-handling enabled="false" />
                                <transaction mode="NONE" />
                                <eviction strategy="LIRS" max-entries="20000" />
                                <expiration max-idle="-1" lifespan="-1" />
                                <store-as-binary keys="false" values="false" />
                                <persistence passivation="false" >
                                   <file-store path="Infinispan-SingleFileCacheStore" />
                                </persistence>
                                <versioning scheme="NONE" />
                                <indexing index="NONE" />
                            </replicated-cache>

                            <replicated-cache name="EdiCache" mode="SYNC" remote-timeout="90000" statistics="true">
                                <state-transfer enabled="true" timeout="940000" chunk-size="16666" await-initial-transfer="true"/>
                                <locking isolation="READ_COMMITTED" />
                                <partition-handling enabled="false" />
                                <transaction mode="NONE" />
                                <eviction strategy="LIRS" max-entries="20000" />
                                <expiration max-idle="-1" lifespan="-1" />
                                <store-as-binary keys="false" values="false" />
                                <persistence passivation="false" >
                                   <file-store path="Infinispan-SingleFileCacheStore" />
                                </persistence>
                                <versioning scheme="NONE" />
                                <indexing index="NONE" />
                            </replicated-cache>

                            <replicated-cache name="BuCache" mode="SYNC" remote-timeout="90000" statistics="true">
                                <state-transfer enabled="true" timeout="940000" chunk-size="25000" await-initial-transfer="true"/>
                                <locking isolation="READ_COMMITTED" />
                                <partition-handling enabled="false" />
                                <transaction mode="NONE" />
                                <eviction strategy="LIRS" max-entries="20000" />
                                <expiration max-idle="-1" lifespan="-1" />
                                <store-as-binary keys="false" values="false" />
                                <persistence passivation="false" >
                                   <file-store path="Infinispan-SingleFileCacheStore" />
                                </persistence>
                                <versioning scheme="NONE" />
                                <indexing index="NONE" />
                            </replicated-cache>

                            <replicated-cache name="ClusterCache" mode="SYNC" remote-timeout="90000" statistics="true">
                                <state-transfer enabled="true" timeout="940000" chunk-size="100000" await-initial-transfer="true"/>
                                <locking isolation="READ_COMMITTED" />
                                <partition-handling enabled="false" />
                                <transaction mode="NONE" />
                                <eviction strategy="NONE" max-entries="10" />
                                <expiration max-idle="-1" lifespan="-1" />
                                <store-as-binary keys="false" values="false" />
                                <persistence passivation="false" >
                                </persistence>
                                <versioning scheme="NONE" />
                                <indexing index="NONE" />
                            </replicated-cache>

                            <replicated-cache name="StatsCache" mode="SYNC" remote-timeout="20000" statistics="true">
                                <state-transfer enabled="true" timeout="240000" chunk-size="10000" await-initial-transfer="true"/>
                                <locking isolation="READ_COMMITTED" />
                                <partition-handling enabled="false" />
                                <transaction mode="NONE" />
                                <eviction strategy="LIRS" max-entries="1000" />
                                <expiration max-idle="-1" lifespan="-1" />
                                <store-as-binary keys="false" values="false" />
                                <persistence passivation="false" >
                                   <file-store path="Infinispan-SingleFileCacheStore" />
                                </persistence>
                                <versioning scheme="NONE" />
                                <indexing index="NONE" />
                            </replicated-cache>

                         </cache-container>

                      </infinispan>

                       

                      • 8. Re: Significant performance degradation between 6.0 and 8.01
                        dan.berindei

                        That's great news, Mehdi! I think the difference between 6.0 and 8.0 with chunkSize=1 may be because the newer JGroups is more aggressive about bundling messages, and bundling introduces latency when you have lots of small, synchronous messages.

                         

                        The configuration validation error is because we use xs:sequence in our XSDs, and that means the elements always have to be in the same order. xs:all would allow reordering, but it doesn't work with inheritance, and we use inheritance a lot in our XSDs.
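
                        As an illustration only, here is a trimmed version of one cache from the file above with state-transfer moved to the end, which is the placement reported in this thread to validate. The position of the elements left out here (for example partition-handling) is not confirmed in the thread and should be checked against the 8.0 XSD.

                        ```xml
                        <!-- Sketch only: state-transfer placed last, after the other child elements. -->
                        <!-- Omitted elements must be placed according to the order declared in the 8.0 XSD. -->
                        <replicated-cache name="MailboxCache" mode="SYNC" remote-timeout="90000" statistics="true">
                            <locking isolation="READ_COMMITTED" />
                            <transaction mode="NONE" />
                            <eviction strategy="LIRS" max-entries="20000" />
                            <expiration max-idle="-1" lifespan="-1" />
                            <state-transfer enabled="true" timeout="940000" chunk-size="18181" await-initial-transfer="true" />
                        </replicated-cache>
                        ```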

                        • 9. Re: Significant performance degradation between 6.0 and 8.01
                          mmr11408

                          Thanks Dan. I was using urn:infinispan:config:8.0, which was throwing me off. After looking at http://infinispan.org/schemas/infinispan-config-8.0.xsd and rearranging the order of the elements, the error went away.