8 Replies Latest reply on Jul 3, 2012 10:46 AM by clebert.suconic

    HornetQ startup fails during initialization with null pointer exception if large number of page files are present.

    labdhi

      I am using the standalone HornetQ 2.2.5.Final version. Paging is enabled and the scheduled delivery feature is in use.

      The journal size is 400 MB and the page file size is 10 MB.

      We use plain JMS transactions instead of JTA. The application is multithreaded, with dozens of producers and more than 50 concurrent consumers across around 20 queues.

       

      Spring is used to produce and consume the messages.

      CachingConnectionFactory is used to wrap the HornetQQueueConnectionFactory, and various JMS objects such as connections, sessions and message producers are cached by Spring.

      DefaultMessageListenerContainers provided by Spring are used to pull messages from HornetQ. When a message is pulled, the JMS session in use is transacted and the transaction is a local JMS transaction. In short, JTA is not used anywhere.

      ConsumerWindowSize is set to 0 to ensure that only one message is pulled at a time per consumer thread. As soon as a consumer thread is done with one message (either commit or rollback), it pulls another message from HornetQ.
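
For readers who define the connection factory on the server side rather than through Spring, the same window setting can be sketched in hornetq-jms.xml as shown here. This is not the configuration used in this setup, where the value is set on the factory that Spring wraps; the factory name, the entry name and the "netty" connector reference are assumptions for illustration.

```xml
<!-- hornetq-jms.xml (sketch): connection factory with client-side buffering disabled -->
<configuration xmlns="urn:hornetq"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation="urn:hornetq /schema/hornetq-jms.xsd">
   <connection-factory name="ConnectionFactory">
      <connectors>
         <!-- assumed connector name; it must match a connector in hornetq-configuration.xml -->
         <connector-ref connector-name="netty"/>
      </connectors>
      <entries>
         <!-- assumed JNDI entry name -->
         <entry name="/ConnectionFactory"/>
      </entries>
      <!-- 0 disables the client-side message buffer, so each consumer fetches one message at a time -->
      <consumer-window-size>0</consumer-window-size>
   </connection-factory>
</configuration>
```

With a window of 0, a slow consumer does not hold messages that another consumer could process, at the cost of a network round trip per message.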

      HornetQ is running as a standalone process on a different node. Producers and consumers connect to it remotely.

       

      Under heavy load, HornetQ starts paging incoming messages to page files.
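
As a rough illustration of what drives this behaviour, paging is controlled per address through settings like the ones below in hornetq-configuration.xml. This is only a sketch: the match value and the max-size-bytes figure are assumptions and are not taken from the property values used in this setup; the page size shown corresponds to the 10 MB page files mentioned above.

```xml
<address-settings>
   <!-- the match value and max-size-bytes are illustrative assumptions -->
   <address-setting match="jms.queue.ExampleQueue">
      <!-- once the address holds this many bytes in memory, further messages are paged -->
      <max-size-bytes>104857600</max-size-bytes>
      <!-- size of each page file on disk: 10 MiB -->
      <page-size-bytes>10485760</page-size-bytes>
      <!-- PAGE is the default policy; alternatives include DROP and BLOCK -->
      <address-full-policy>PAGE</address-full-policy>
   </address-setting>
</address-settings>
```

With these values, a backlog well above the in-memory limit on one address produces many 10 MiB page files in the paging directory, which is the situation described here.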

       

       

      If too many page files accumulate on disk, I also noticed that the HornetQ instance does not come up properly when the process is restarted; it fails during initialization. The stack trace for this scenario is given below.

       

      [main] 06:50:31,490 SEVERE [org.hornetq.core.server.impl.HornetQServerImpl]  Failure in initialisation
      java.lang.NullPointerException
              at org.hornetq.core.paging.cursor.impl.PageSubscriptionImpl.getPageInfo(PageSubscriptionImpl.java:726)
              at org.hornetq.core.paging.cursor.impl.PageSubscriptionImpl.getPageInfo(PageSubscriptionImpl.java:712)
              at org.hornetq.core.paging.cursor.impl.PageSubscriptionImpl.processReload(PageSubscriptionImpl.java:655)
              at org.hornetq.core.paging.cursor.impl.PageCursorProviderImpl.processReload(PageCursorProviderImpl.java:241)
              at org.hornetq.core.paging.impl.PagingStoreImpl.processReload(PagingStoreImpl.java:355)
              at org.hornetq.core.paging.impl.PagingManagerImpl.processReload(PagingManagerImpl.java:250)
              at org.hornetq.core.persistence.impl.journal.JournalStorageManager.loadMessageJournal(JournalStorageManager.java:1229)
              at org.hornetq.core.server.impl.HornetQServerImpl.loadJournals(HornetQServerImpl.java:1619)
              at org.hornetq.core.server.impl.HornetQServerImpl.initialisePart2(HornetQServerImpl.java:1469)
              at org.hornetq.core.server.impl.HornetQServerImpl.access$100(HornetQServerImpl.java:132)
              at org.hornetq.core.server.impl.HornetQServerImpl$SharedStoreLiveActivation.run(HornetQServerImpl.java:356)
              at org.hornetq.core.server.impl.HornetQServerImpl.start(HornetQServerImpl.java:570)
              at org.hornetq.jms.server.impl.JMSServerManagerImpl.start(JMSServerManagerImpl.java:275)
              at com.qpass.service.jms.HornetQService.main(HornetQService.java:51)

       

       

      My hornetq-configuration.xml is given below.

       

       

      <configuration xmlns="urn:hornetq"
                     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                     xsi:schemaLocation="urn:hornetq /schema/hornetq-configuration.xsd">

         <acceptors>
            <acceptor name="netty">
               <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>
               <param key="host"  value="${hostname}"/>
               <param key="port"  value="${livePort}"/>
               <param key="use-nio"  value="true"/>
            </acceptor>
            <acceptor name="netty_localhsot">
               <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>
               <param key="host"  value="localhost"/>
               <param key="port"  value="${livePort}"/>
               <param key="use-nio"  value="true"/>
            </acceptor>
         </acceptors> 

         <address-settings>
            <!--default for catch all-->
            <address-setting match="jms.queue.ConsumerUpdateRequestQueue">
               <dead-letter-address>jms.queue.consumerupdate.ExceptionQueue</dead-letter-address>
               <max-delivery-attempts>1</max-delivery-attempts>
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
            </address-setting>
            <address-setting match="jms.queue.ConsumerUpdateRetryQueue">
               <dead-letter-address>jms.queue.chargesandpayments.ExceptionQueue</dead-letter-address>
               <max-delivery-attempts>1</max-delivery-attempts>
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
            </address-setting>
            <address-setting match="jms.queue.EventDistributionQueue">
               <dead-letter-address>jms.queue.notification.ExceptionQueue</dead-letter-address>
               <max-delivery-attempts>1</max-delivery-attempts>
               <max-size-bytes>${EDQ_queueDataInMemory}</max-size-bytes>
            </address-setting>
            <address-setting match="jms.queue.consumerupdate.DeadUpdateQueue">
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
               <max-delivery-attempts>-1</max-delivery-attempts>
               <redelivery-delay>60000</redelivery-delay>
            </address-setting>
            <address-setting match="jms.queue.notification.DeadEventQueue">
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
               <max-delivery-attempts>-1</max-delivery-attempts>
               <redelivery-delay>60000</redelivery-delay>
            </address-setting>
            <address-setting match="jms.queue.mail.DeadLetterQueue">
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
               <max-delivery-attempts>-1</max-delivery-attempts>
               <redelivery-delay>60000</redelivery-delay>
            </address-setting>
            <address-setting match="jms.queue.chargesandpayments.DeadMessageQueue">
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
               <max-delivery-attempts>-1</max-delivery-attempts>
               <redelivery-delay>60000</redelivery-delay>
            </address-setting>
            <address-setting match="jms.queue.chargesandpayments.DeadMessageQueues">
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
               <max-delivery-attempts>-1</max-delivery-attempts>
               <redelivery-delay>60000</redelivery-delay>
            </address-setting>
            <address-setting match="jms.queue.osn.ExternalNotificationDeliveryDeadQueue">
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
               <max-delivery-attempts>-1</max-delivery-attempts>
               <redelivery-delay>60000</redelivery-delay>
            </address-setting>
            <address-setting match="jms.queue.extnotification.ExternalNotificationRedeliveryDeadQueue">
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
               <max-delivery-attempts>-1</max-delivery-attempts>
               <redelivery-delay>60000</redelivery-delay>
            </address-setting>
            <address-setting match="jms.queue.extnotification.ExternalNotificationRedeliveryQueue">
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
               <max-delivery-attempts>-1</max-delivery-attempts>
               <redelivery-delay>60000</redelivery-delay>
            </address-setting>
            <address-setting match="jms.queue.HeuristicHazardsQueue">
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
            </address-setting>
            <address-setting match="jms.queue.osn.ExternalNotificationDeliveryQueue">
               <dead-letter-address>jms.queue.extnotification.ExternalNotificationRedeliveryQueue</dead-letter-address>
               <max-delivery-attempts>1</max-delivery-attempts>
               <max-size-bytes>${ENDQ_queueDataInMemory}</max-size-bytes>
            </address-setting>
            <address-setting match="jms.queue.BillingRequestQueue">
               <max-size-bytes>${BRQ_queueDataInMemory}</max-size-bytes>
            </address-setting>
            <address-setting match="jms.queue.MailQueue">
               <dead-letter-address>jms.queue.mail.ExceptionQueue</dead-letter-address>
               <max-delivery-attempts>3</max-delivery-attempts>
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
            </address-setting>
            <address-setting match="jms.queue.PaymentAdaptorRequestQueue">
               <dead-letter-address>jms.queue.chargesandpayments.ExceptionQueue</dead-letter-address>
               <max-delivery-attempts>-1</max-delivery-attempts>
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
            </address-setting>
            <address-setting match="jms.queue.PaymentAdaptorResponseQueue">
               <dead-letter-address>jms.queue.chargesandpayments.DeadMessageQueues</dead-letter-address>
               <max-delivery-attempts>1</max-delivery-attempts>
               <max-size-bytes>${PARQ_queueDataInMemory}</max-size-bytes>
            </address-setting>
            <address-setting match="jms.queue.PaymentAdaptorRetryQueue">
               <dead-letter-address>jms.queue.chargesandpayments.DeadMessageQueues</dead-letter-address>
               <max-delivery-attempts>1</max-delivery-attempts>
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
            </address-setting>
            <address-setting match="jms.queue.notification.RedeliveryQueue">
               <dead-letter-address>jms.queue.notification.ExceptionQueue</dead-letter-address>
               <max-delivery-attempts>1</max-delivery-attempts>
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
            </address-setting>
            <address-setting match="jms.queue.consumerupdate.ExceptionQueue">
               <dead-letter-address>jms.queue.consumerupdate.DeadUpdateQueue</dead-letter-address>
               <max-delivery-attempts>1</max-delivery-attempts>
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
            </address-setting>
            <address-setting match="jms.queue.chargesandpayments.ExceptionQueue">
               <dead-letter-address>jms.queue.chargesandpayments.DeadMessageQueue</dead-letter-address>
               <max-delivery-attempts>1</max-delivery-attempts>
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
            </address-setting>
            <address-setting match="jms.queue.extnotification.ExceptionQueue">
               <dead-letter-address>jms.queue.extnotification.ExternalNotificationRedeliveryDeadQueue</dead-letter-address>
               <max-delivery-attempts>1</max-delivery-attempts>
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
            </address-setting>
            <address-setting match="jms.queue.mail.ExceptionQueue">
               <dead-letter-address>jms.queue.mail.DeadLetterQueue</dead-letter-address>
               <max-delivery-attempts>1</max-delivery-attempts>
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
            </address-setting>
            <address-setting match="jms.queue.notification.ExceptionQueue">
               <dead-letter-address>jms.queue.notification.DeadEventQueue</dead-letter-address>
               <max-delivery-attempts>1</max-delivery-attempts>
               <max-size-bytes>${queueDataInMemory}</max-size-bytes>
            </address-setting>
            ${customHornetQCoreQueueConfiguration}
         </address-settings>
         <jmx-domain>com.qpass</jmx-domain>
         <paging-directory>${data_dir}/paging</paging-directory>
         <bindings-directory>${data_dir}/bindings</bindings-directory>
         <journal-directory>${data_dir}/journal</journal-directory>
         <large-messages-directory>${data_dir}/large-messages</large-messages-directory>
         <journal-file-size>${journalSize}</journal-file-size>
         <journal-min-files>3</journal-min-files>
         <journal-type>NIO</journal-type>
         <id-cache-size>1</id-cache-size>
         <persist-id-cache>false</persist-id-cache>
         <security-enabled>false</security-enabled>
         <remoting-interceptors>
            <class-name>com.qpass.service.jms.ScheduledDeliveryInterceptor</class-name>
         </remoting-interceptors>
         <message-counter-enabled>true</message-counter-enabled>
         <persist-delivery-count-before-delivery>false</persist-delivery-count-before-delivery>
         <message-counter-sample-period>300000</message-counter-sample-period>
         <message-counter-max-day-history>2</message-counter-max-day-history>
         <message-expiry-scan-period>600000</message-expiry-scan-period>
         <message-expiry-thread-priority>1</message-expiry-thread-priority>
         <memory-measure-interval>180000</memory-measure-interval>
         ${customHornetQCoreConfiguration}
      </configuration>

        • 1. Re: HornetQ startup fails during initialization with null pointer exception if large number of page files are present.
          clebert.suconic

          I have fixed something on Branch_2_2_EAP that will be merged to the other branches next week.

           

          But I could only replicate it by deleting the page folder and not the journal (or vice versa; I don't remember now).

          • 2. Re: HornetQ startup fails during initialization with null pointer exception if large number of page files are present.
            rahook

            We've just had a similar sort of issue, which left HornetQ (inside JBoss 6.1.0 Final) unable to start up cleanly, with pretty well the same exception (shown below). Since we had to get the service back up in a hurry, we copied aside all the journal and paging files, hosed out the data directories, and went off. The trouble is that we have 200,000+ messages in the paging file that we really would like to recover (yes, I know, we need a HA solution - our data volume is growing faster than we can develop the full solution). So not an entirely related issue: is there any way to recover the messages from the backed up paging and journal files?

             

             

             

            14:34:23,739 INFO  [HornetQServerImpl] live server is starting with configuration HornetQ Configuration (clustered=false,backup=false,sharedStore=true,journalDirectory=/mnt/data1/servers/jboss-.1.0.Final/server/default/data/hornetq/journal,bindingsDirectory=/mnt/data1/servers/jboss-.1.0.Final/server/default/data/hornetq/bindings,largeMessagesDirectory=/mnt/data1/servers/jboss-6.1.0.Final/server/default/data/hornetq/largemessages,pagingDirectory=/mnt/data1/servers/jboss-6.1.0.Final/server/default/data/hornetq/paging)
            14:34:23,741 INFO  [HornetQServerImpl] Waiting to obtain live lock
            14:34:23,831 INFO  [JournalStorageManager] Using NIO Journal
            14:34:23,856 WARNING [HornetQServerImpl] Security risk! It has been detected that the cluster admin user and password have not been changed from the installation default. Please see the HornetQ user guide, cluster chapter, for instructions on how to do this.
            14:34:24,348 INFO  [FileLockNodeManager] Waiting to obtain live lock
            14:34:24,349 INFO  [FileLockNodeManager] Live Server Obtained live lock
            14:34:26,893 SEVERE [HornetQServerImpl] Failure in initialisation: java.lang.NullPointerException
                      at org.hornetq.core.paging.cursor.impl.PageSubscriptionImpl.getPageInfo(PageSubscriptionImpl.java:726) [:6.1.0.Final]
                      at org.hornetq.core.paging.cursor.impl.PageSubscriptionImpl.getPageInfo(PageSubscriptionImpl.java:712) [:6.1.0.Final]
                      at org.hornetq.core.paging.cursor.impl.PageSubscriptionImpl.processReload(PageSubscriptionImpl.java:655) [:6.1.0.Final]
                      at org.hornetq.core.paging.cursor.impl.PageCursorProviderImpl.processReload(PageCursorProviderImpl.java:241) [:6.1.0.Final]
                      at org.hornetq.core.paging.impl.PagingStoreImpl.processReload(PagingStoreImpl.java:355) [:6.1.0.Final]
                      at org.hornetq.core.paging.impl.PagingManagerImpl.processReload(PagingManagerImpl.java:250) [:6.1.0.Final]

            • 3. Re: HornetQ startup fails during initialization with null pointer exception if large number of page files are present.
              ronnys

              Hi Clebert,

               

              We recently encountered the same NPE with HornetQ 2.2.5 on startup with some existing page files. After porting the changes made in the HornetQ SVN repository at r11150 (for PageSubscriptionImpl.java only), things improved, but startup still failed with another NPE for 2 transactions. The same applies to HornetQ 2.2.14; the stack trace is:

               

              [main] 2-Jul 16:23:55,814 SEVERE [HornetQServerImpl]  Failure in initialisation
              java.lang.NullPointerException
                      at org.hornetq.core.paging.cursor.impl.PageSubscriptionImpl.installTXCallback(PageSubscriptionImpl.java:826)
                      at org.hornetq.core.paging.cursor.impl.PageSubscriptionImpl.reloadPreparedACK(PageSubscriptionImpl.java:561)
                      at org.hornetq.core.persistence.impl.journal.JournalStorageManager.loadPreparedTransactions(JournalStorageManager.java:1955)
                      at org.hornetq.core.persistence.impl.journal.JournalStorageManager.loadMessageJournal(JournalStorageManager.java:1273)
                      at org.hornetq.core.server.impl.HornetQServerImpl.loadJournals(HornetQServerImpl.java:1603)
                      at org.hornetq.core.server.impl.HornetQServerImpl.initialisePart2(HornetQServerImpl.java:1445)
                      at org.hornetq.core.server.impl.HornetQServerImpl.access$1200(HornetQServerImpl.java:138)
                      at org.hornetq.core.server.impl.HornetQServerImpl$SharedStoreLiveActivation.run(HornetQServerImpl.java:1919)
                      at org.hornetq.core.server.impl.HornetQServerImpl.start(HornetQServerImpl.java:366)
                      [...]

               

              I had to export the journal, remove the offending record, and adapt the associated transaction manually to start the server. It would be nice if this could be fixed in the next HornetQ version.

               

              Thanks & Best regards,

              Ronny

              • 4. Re: HornetQ startup fails during initialization with null pointer exception if large number of page files are present.
                clebert.suconic

                Did you manually remove any pages?

                 

                 

                I wonder how it got into a situation with prepared TX and no Pages.

                 

                Do you still have the data from before the occurrence? Any chance I could look into it to try to understand how it got into that state?

                • 5. Re: HornetQ startup fails during initialization with null pointer exception if large number of page files are present.
                  clebert.suconic

                  If you manually removed pages, this could happen (although it shouldn't prevent the server from starting up, which I can fix).

                   

                   

                  I'm just trying to understand how the data got into this situation. Maybe there's something else I can do besides just avoiding the startup failure. I will wait for your input before I do any work on this.

                   

                   

                  Also, please open a JIRA for this if you can.

                  • 6. Re: HornetQ startup fails during initialization with null pointer exception if large number of page files are present.
                    clebert.suconic

                    Hmmm... maybe it got into this situation with 2.2.5 and now you're trying to open it with 2.2.14?

                     

                    Again, I'm in wait mode.

                    • 7. Re: HornetQ startup fails during initialization with null pointer exception if large number of page files are present.
                      ronnys

                      Hi Clebert,

                       

                      Thanks for the quick reply.

                       

                      We haven't removed pages manually, and HornetQ was shut down properly. I still have the full data directory, but unfortunately this is production data with customer information inside, so I cannot share it that easily. The data was written by 2.2.5, but neither 2.2.5 nor 2.2.14 was able to start up with it out of the box. The ticket is https://issues.jboss.org/browse/HORNETQ-964

                       

                      Best regards,

                      Ronny

                      • 8. Re: HornetQ startup fails during initialization with null pointer exception if large number of page files are present.
                        clebert.suconic

                        OK, I will check and fix the possible NPE and let the server restart in that case, with a warning instead of a crash.

                         

                         

                        I think the data got into that state on 2.2.5. I have no indication that 2.2.14 would make it like that. (There were a few issues with proper shutdown on 2.2.5 that were fixed in the later version. With 2.2.5 it would be better to stop the load before shutdown, or simply kill -9, if there's a lot of load going through the server.)