59 Replies — Latest reply: Jun 20, 2011 1:58 AM by Jamshed Katta

Possible Deadlock Sending to JMS Queue

David Hosier Newbie

I'm using HornetQ 2.2.2.Final on JBoss 6.0.Final, and I am trying to stress test my application. What I'm seeing is that at some point the code just stops sending messages and appears to be stuck in the HornetQ code waiting to acquire locks. I've just let the server sit here for about 4 hours now, and it's still just parked in the same place on this one thread in a WAITING state. The stack for this thread is as follows:

Thread: akka:event-driven:dispatcher:global-4 : priority:5, daemon:true, threadId:101, threadState:WAITING
- waiting on <0x2f9c3beb> (a java.util.concurrent.Semaphore$NonfairSync)
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
java.util.concurrent.Semaphore.acquire(Semaphore.java:441)
org.hornetq.core.client.impl.ClientProducerCreditsImpl.acquireCredits(ClientProducerCreditsImpl.java:67)
org.hornetq.core.client.impl.ClientProducerImpl.doSend(ClientProducerImpl.java:305)
org.hornetq.core.client.impl.ClientProducerImpl.send(ClientProducerImpl.java:142)
org.hornetq.jms.client.HornetQMessageProducer.doSend(HornetQMessageProducer.java:451)
org.hornetq.jms.client.HornetQMessageProducer.send(HornetQMessageProducer.java:199)
org.hornetq.ra.HornetQRAMessageProducer.send(HornetQRAMessageProducer.java:221)
com.osi.notification.NotificationHelper.sendNotification(NotificationHelper.java:69)
com.osi.notification.NotificationHelper.sendNotificationWithTx(NotificationHelper.java:40)
com.osi.notification.actors.DestinationSenderActor.onReceive(DestinationSenderActor.java:30)
akka.actor.UntypedActor$$anonfun$receive$1.apply(UntypedActor.scala:70)
akka.actor.UntypedActor$$anonfun$receive$1.apply(UntypedActor.scala:69)
akka.actor.Actor$$anonfun$4.apply(Actor.scala:481)
akka.actor.Actor$$anonfun$4.apply(Actor.scala:464)
akka.actor.Actor$class.apply(Actor.scala:435)
akka.actor.UntypedActor.apply(UntypedActor.scala:63)
akka.actor.LocalActorRef.akka$actor$LocalActorRef$$dispatch(ActorRef.scala:1012)
akka.actor.LocalActorRef$$anonfun$invoke$1.apply$mcV$sp(ActorRef.scala:832)
akka.actor.LocalActorRef$$anonfun$invoke$1.apply(ActorRef.scala:828)
akka.actor.LocalActorRef$$anonfun$invoke$1.apply(ActorRef.scala:828)
akka.util.ReentrantGuard.withGuard(LockUtil.scala:19)
akka.actor.LocalActorRef.invoke(ActorRef.scala:827)
akka.dispatch.MessageInvocation.invoke(MessageHandling.scala:23)
akka.dispatch.ExecutableMailbox$class.processMailbox(ExecutorBasedEventDrivenDispatcher.scala:190)
akka.dispatch.ExecutorBasedEventDrivenDispatcher$$anon$1.processMailbox(ExecutorBasedEventDrivenDispatcher.scala:109)
akka.dispatch.ExecutableMailbox$class.run(ExecutorBasedEventDrivenDispatcher.scala:166)
akka.dispatch.ExecutorBasedEventDrivenDispatcher$$anon$1.run(ExecutorBasedEventDrivenDispatcher.scala:109)
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
java.lang.Thread.run(Thread.java:680)
akka.dispatch.MonitorableThread.run(ThreadPoolBuilder.scala:185)


Locked synchronizers : 
- locked <0x3b06e6f2> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
- locked <0x42243824> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
- locked <0x57e1e568> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

Does anyone have any idea what may be going on here? It seems to happen pretty consistently somewhere between 5500-6000 messages, and it happens every single time I run my test. I'm just sending out messages as quickly as I possibly can. I'm using Akka with Actors, so the sending is potentially happening on different threads over time, but it is always only ever trying to send one message at a time due to the nature of the application. I will attach the class that is sending the notifications, as that may provide a clue. I suppose it's possible I'm using some kind of anti-pattern in my code and that I'm lucky it even sends out the number of messages it does.
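When chasing a hang like this, a full thread dump (`jstack <pid>`) with lock ownership is the key evidence; the same information can also be pulled programmatically from inside the JVM. A small pure-JDK helper, nothing HornetQ-specific:

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Print every WAITING thread together with the object it is parked on and
// the ownable synchronizers it already holds -- the same data as the dump above.
public class WaitingThreads {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // true, true => include locked monitors and locked ownable synchronizers
        ThreadInfo[] infos = mx.dumpAllThreads(true, true);
        for (ThreadInfo ti : infos) {
            if (ti.getThreadState() == Thread.State.WAITING) {
                System.out.println(ti.getThreadName() + " waiting on " + ti.getLockName());
                for (LockInfo li : ti.getLockedSynchronizers()) {
                    System.out.println("  holds " + li);
                }
            }
        }
    }
}
```

`ThreadMXBean.findDeadlockedThreads()` is also worth calling: for a thread parked on a semaphore permit that nobody releases it would likely return null, which distinguishes a stall like this one from a true lock-ordering deadlock.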

  • 1. Re: Possible Deadlock Sending to JMS Queue
    David Hosier Newbie

    One interesting side note here. In my application, I am using the REST interface to create subscriptions. I am not subscribing directly from clients, but rather through my own service, so that I can do a few things prior to establishing the subscription. One of those things is to create a dedicated JMS Queue for that subscription. That is all working fine. Another thing I do is destroy that temporary queue when the client unsubscribes. When I do an unsubscribe and remove the queue, that seems to jar something loose, and the stuck thread wakes up. The server then proceeds to try to send all the backed-up messages, but of course it can't, because I destroyed the queue. Here is the TRACE level logging associated with destroying the queue and what looks to be the resumption of my stuck thread. It's helpful having the thread names in the log file. Nothing really jumps out at me in this logging, but I figured I'd post it in case a better trained eye could spot something.

     

    2011-04-24 21:10:27,079 DEBUG [com.osi.notification.test.NotificationTestBean] (http-0.0.0.0-8080-2) Unsubscribe Request for 1
    2011-04-24 21:10:27,132 DEBUG [com.osi.notification.subscription.ActorBasedSubscriptionManager] (akka:event-driven:dispatcher:global-12) Received Message: com.osi.notification.subscription.UnsubscribeMessage@54fc218f
    2011-04-24 21:10:27,134 DEBUG [com.osi.notification.subscription.SubscriptionManager] (akka:event-driven:dispatcher:global-12) Removing Subscription: 1
    2011-04-24 21:10:27,182 DEBUG [com.osi.notification.subscription.SubscriptionHelper] (akka:event-driven:dispatcher:global-8) Removing subscription: http://localhost:8080/hornetq-rest/queues/jms.queue.testQueue1/push-consumers/1-1303691322274
    2011-04-24 21:10:27,218 DEBUG [com.osi.notification.destination.DestinationHelper] (akka:event-driven:dispatcher:global-8) Destroying Queue: testQueue1
    2011-04-24 21:10:27,266 TRACE [org.hornetq.ra.HornetQRAMessageProducer] (akka:event-driven:dispatcher:global-4) sent org.hornetq.ra.HornetQRAMessageProducer@17fa711f result=HornetQMessage[ID:20d61254-6ed3-11e0-b206-001c42000009]:PERSISTENT
    2011-04-24 21:10:27,266 TRACE [org.hornetq.ra.HornetQRASession] (akka:event-driven:dispatcher:global-4) unlock()
    2011-04-24 21:10:27,266 TRACE [org.hornetq.ra.HornetQRAManagedConnection] (akka:event-driven:dispatcher:global-4) unlock()
    2011-04-24 21:10:27,266 TRACE [org.hornetq.ra.HornetQRAMessageProducer] (akka:event-driven:dispatcher:global-4) close org.hornetq.ra.HornetQRAMessageProducer@17fa711f
    2011-04-24 21:10:27,266 TRACE [org.hornetq.ra.HornetQRASession] (akka:event-driven:dispatcher:global-4) removeProducer(org.hornetq.ra.HornetQRAMessageProducer@17fa711f)
    2011-04-24 21:10:27,266 TRACE [org.hornetq.ra.HornetQRASession] (akka:event-driven:dispatcher:global-4) close()
    2011-04-24 21:10:27,266 TRACE [org.hornetq.ra.HornetQRASessionFactoryImpl] (akka:event-driven:dispatcher:global-4) closeSession(org.hornetq.ra.HornetQRASession@1051694a)
    2011-04-24 21:10:27,266 TRACE [org.hornetq.ra.HornetQRASession] (akka:event-driven:dispatcher:global-4) Closing session
    2011-04-24 21:10:27,266 TRACE [org.hornetq.ra.HornetQRAManagedConnection] (akka:event-driven:dispatcher:global-4) stop()
    2011-04-24 21:10:27,266 TRACE [org.hornetq.ra.HornetQRAManagedConnection] (akka:event-driven:dispatcher:global-4) removeHandle(org.hornetq.ra.HornetQRASession@1051694a)
    2011-04-24 21:10:27,266 TRACE [org.hornetq.ra.HornetQRAManagedConnection] (akka:event-driven:dispatcher:global-4) sendEvent(javax.resource.spi.ConnectionEvent[source=org.hornetq.ra.HornetQRAManagedConnection@cee2beb])
    2011-04-24 21:10:27,266 TRACE [org.hornetq.ra.HornetQRASessionFactoryImpl] (akka:event-driven:dispatcher:global-4) close() org.hornetq.ra.HornetQRASessionFactoryImpl@393e0cc5
    2011-04-24 21:10:27,266 TRACE [org.hornetq.ra.HornetQRAManagedConnection] (Transaction Reaper Worker 0) unlock()
    2011-04-24 21:10:27,266 TRACE [org.hornetq.ra.HornetQRAXAResource] (Transaction Reaper Worker 0) rollback(< formatId=131076, gtrid_length=29, bqual_length=28, tx_uid=0:ffffc0a80145:126a:4db4c024:9ebc, node_name=1, branch_uid=0:ffffc0a80145:126a:4db4c024:9ebe, eis_name=unknown eis name >)
    2011-04-24 21:10:27,269 TRACE [org.hornetq.utils.UTF8Util] (Thread-2 (group:HornetQ-client-global-threads-2133005797)) Reading string with utfSize=235
    2011-04-24 21:10:27,280 WARN  [com.arjuna.ats.jta] (Transaction Reaper Worker 0) ARJUNA-16045 attempted rollback of < formatId=131076, gtrid_length=29, bqual_length=28, tx_uid=0:ffffc0a80145:126a:4db4c024:9ebc, node_name=1, branch_uid=0:ffffc0a80145:126a:4db4c024:9ebe, eis_name=unknown eis name > (org.hornetq.ra.HornetQRAXAResource@1a376e79) failed with exception code XAException.XAER_NOTA: javax.transaction.xa.XAException
              at org.hornetq.core.client.impl.ClientSessionImpl.rollback(ClientSessionImpl.java:1488) [:]
              at org.hornetq.ra.HornetQRAXAResource.rollback(HornetQRAXAResource.java:166) [:]
              at com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord.topLevelAbort(XAResourceRecord.java:336) [:6.0.0.Final]
              at com.arjuna.ats.arjuna.coordinator.BasicAction.doAbort(BasicAction.java:2902) [:6.0.0.Final]
              at com.arjuna.ats.arjuna.coordinator.BasicAction.doAbort(BasicAction.java:2881) [:6.0.0.Final]
              at com.arjuna.ats.arjuna.coordinator.BasicAction.Abort(BasicAction.java:1602) [:6.0.0.Final]
              at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.cancel(TwoPhaseCoordinator.java:119) [:6.0.0.Final]
              at com.arjuna.ats.arjuna.AtomicAction.cancel(AtomicAction.java:212) [:6.0.0.Final]
              at com.arjuna.ats.arjuna.coordinator.TransactionReaper.doCancellations(TransactionReaper.java:367) [:6.0.0.Final]
              at com.arjuna.ats.internal.arjuna.coordinator.ReaperWorkerThread.run(ReaperWorkerThread.java:79) [:6.0.0.Final]
    
    
    2011-04-24 21:10:27,281 TRACE [org.hornetq.ra.HornetQRAManagedConnection] (Transaction Reaper Worker 0) cleanup()
    2011-04-24 21:10:27,281 TRACE [org.hornetq.ra.HornetQRAManagedConnection] (Transaction Reaper Worker 0) destroyHandles()
    2011-04-24 21:10:27,281 WARN  [com.arjuna.ats.arjuna] (akka:event-driven:dispatcher:global-4) ARJUNA-12077 Abort called on already aborted atomic action 0:ffffc0a80145:126a:4db4c024:9ebc
    2011-04-24 21:10:27,289 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA-12113 TransactionReaper::doCancellations worker Thread[Transaction Reaper Worker 0,5,jboss] missed interrupt when cancelling TX 0:ffffc0a80145:126a:4db4c024:9ebc -- exiting as zombie (zombie count decremented to 0)
    2011-04-24 21:10:27,284 ERROR [com.osi.notification.NotificationHelper] (akka:event-driven:dispatcher:global-4) Transaction Error sending notification: javax.transaction.RollbackException: ARJUNA-16063 The transaction is not active!
              at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1148) [:6.0.0.Final]
              at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:119) [:6.0.0.Final]
              at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:75) [:6.0.0.Final]
              at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:162) [:6.0.0.Final]
              at com.osi.notification.NotificationHelper.sendNotificationWithTx(NotificationHelper.java:41) [:]
              at com.osi.notification.actors.DestinationSenderActor.onReceive(DestinationSenderActor.java:30) [:]
              at akka.actor.UntypedActor$$anonfun$receive$1.apply(UntypedActor.scala:70) [:]
              at akka.actor.UntypedActor$$anonfun$receive$1.apply(UntypedActor.scala:69) [:]
              at akka.actor.Actor$$anonfun$4.apply(Actor.scala:481) [:]
              at akka.actor.Actor$$anonfun$4.apply(Actor.scala:464) [:]
              at akka.actor.Actor$class.apply(Actor.scala:435) [:]
              at akka.actor.UntypedActor.apply(UntypedActor.scala:63) [:]
              at akka.actor.LocalActorRef.akka$actor$LocalActorRef$$dispatch(ActorRef.scala:1012) [:]
              at akka.actor.LocalActorRef$$anonfun$invoke$1.apply$mcV$sp(ActorRef.scala:832) [:]
              at akka.actor.LocalActorRef$$anonfun$invoke$1.apply(ActorRef.scala:828) [:]
              at akka.actor.LocalActorRef$$anonfun$invoke$1.apply(ActorRef.scala:828) [:]
              at akka.util.ReentrantGuard.withGuard(LockUtil.scala:19) [:]
              at akka.actor.LocalActorRef.invoke(ActorRef.scala:827) [:]
              at akka.dispatch.MessageInvocation.invoke(MessageHandling.scala:23) [:]
              at akka.dispatch.ExecutableMailbox$class.processMailbox(ExecutorBasedEventDrivenDispatcher.scala:190) [:]
              at akka.dispatch.ExecutorBasedEventDrivenDispatcher$$anon$1.processMailbox(ExecutorBasedEventDrivenDispatcher.scala:109) [:]
              at akka.dispatch.ExecutableMailbox$class.run(ExecutorBasedEventDrivenDispatcher.scala:166) [:]
              at akka.dispatch.ExecutorBasedEventDrivenDispatcher$$anon$1.run(ExecutorBasedEventDrivenDispatcher.scala:109) [:]
              at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [:1.6.0_22]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [:1.6.0_22]
              at java.lang.Thread.run(Thread.java:680) [:1.6.0_22]
              at akka.dispatch.MonitorableThread.run(ThreadPoolBuilder.scala:185) [:]

  • 2. Re: Possible Deadlock Sending to JMS Queue
    David Hosier Newbie

    So I grabbed the HornetQ source and took a peek at the code where the waiting is happening. It's stuck waiting on a Semaphore. Also, after adding some more information to my notifications so that I could tell exactly how many messages are sent, it gets stuck after exactly the same number of messages each time. So I think it's pretty clear that I have something wrong in my code. I am having trouble understanding what the Semaphore is for, so I'm not sure exactly what my problem may be.
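    For what it's worth, that Semaphore is HornetQ's producer-credit flow control: each send must first acquire "credits" that the server hands back as it processes messages, so if the server stops granting credits, the sender parks forever. A rough pure-JDK sketch of the behavior (illustrative only, not the actual HornetQ code):

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Rough sketch of credit-based producer flow control, loosely modelled on what
// the stack trace shows (ClientProducerCreditsImpl.acquireCredits). Names and
// numbers here are illustrative, not the real HornetQ API.
public class CreditSketch {
    public static void main(String[] args) throws InterruptedException {
        int producerWindow = 3;                          // credits granted by the server
        Semaphore credits = new Semaphore(producerWindow);

        // Each send consumes one credit; the server normally replenishes
        // credits as it consumes and persists messages.
        for (int i = 0; i < producerWindow; i++) {
            credits.acquire();
            System.out.println("sent message " + (i + 1));
        }

        // If the address is full and the policy is BLOCK, the server stops
        // granting credits, so a plain acquire() would park the thread forever
        // -- exactly the WAITING state in the thread dump.
        boolean got = credits.tryAcquire(100, TimeUnit.MILLISECONDS);
        System.out.println("credit available: " + got);  // prints "credit available: false"
    }
}
```

    In the real client, acquireCredits() is the semaphore.acquire() visible in the stack trace; blocking simply means the server stops replenishing the semaphore.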

  • 3. Re: Possible Deadlock Sending to JMS Queue
    Clebert Suconic Master

    I will just give you a quick hint as I'm about to sleep.

     

    Take a look at paging/blocking.

  • 4. Re: Possible Deadlock Sending to JMS Queue
    David Hosier Newbie

    Thanks for the hint. I searched around the documentation for those terms, and I have things working now. I enabled paging in the hornetq-configuration.xml file and tweaked some of the other settings. I now get all the messages delivered to my client. I guess I'm a little confused still as to why it ceased to deliver messages under the BLOCK address-full-policy. If the address is full, that would imply that there are messages in the queue waiting to be delivered to the subscription. Under the BLOCK policy, it will fail to allow further sending to the queue. However, I have a REST push subscription setup, so I would assume that messages would be getting cleared out as each one in the queue is pushed to my subscription, thus allowing for further sending on the queue. What I was seeing is that neither happens. I stop getting the push messages, and of course therefore the server code ceases to send to the queue.
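    For reference, these settings live in the <address-settings> section of hornetq-configuration.xml. The element names below are the real HornetQ 2.2 ones, but the match pattern and sizes are illustrative and need tuning per deployment:

```xml
<address-settings>
   <!-- illustrative values; tune max-size-bytes / page-size-bytes for your load -->
   <address-setting match="jms.queue.#">
      <max-size-bytes>10485760</max-size-bytes>       <!-- 10 MiB per address -->
      <page-size-bytes>1048576</page-size-bytes>      <!-- 1 MiB page files -->
      <address-full-policy>PAGE</address-full-policy> <!-- BLOCK stalls producers instead -->
   </address-setting>
</address-settings>
```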

  • 5. Re: Possible Deadlock Sending to JMS Queue
    Clebert Suconic Master

    When you configure blocking, you are telling the system to pause producing whenever the memory used on the server is greater than the configured maximum.  You have to consume your messages in all the subscriptions; otherwise you will just block, or page until you fill the disk.

  • 6. Re: Possible Deadlock Sending to JMS Queue
    David Hosier Newbie

    Right, I think that jibes with my understanding from the documentation. However, in this test, I only have a single REST push consumer. When the blocking was happening, I also stopped getting messages pushed to my consumer. At that point, there is nothing I can do to consume messages if the server stops pushing them to me. Since the server stops pushing messages to my consumer, I can't consume any messages, and therefore the memory never goes down and the producer blocks forever.
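    The stalemate described here can be modelled with nothing more than a bounded buffer: once the consumer stops draining, a blocking producer can never make progress. A toy pure-JDK analogy (not HornetQ code):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Toy analogy for address-full-policy=BLOCK: a bounded buffer whose consumer
// has stopped. The producer's put() would block indefinitely, just like the
// send() stuck on the credits semaphore.
public class BlockAnalogy {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> address = new ArrayBlockingQueue<>(2); // tiny "max size"

        address.put("msg-1");
        address.put("msg-2");          // the address is now "full"

        // The consumer is wedged, so nothing is removed. put() would hang;
        // offer() with a timeout shows the producer cannot make progress.
        boolean accepted = address.offer("msg-3", 100, TimeUnit.MILLISECONDS);
        System.out.println("accepted: " + accepted);   // prints "accepted: false"
    }
}
```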

  • 7. Re: Possible Deadlock Sending to JMS Queue
    Clebert Suconic Master

    You have to ack the messages. You can keep consuming, but the message will remain on the server until you ack it.

     

    I'm not sure how Bill Burke did that on REST, but look for options on acking the message.

    If you can't find it... then come here and open a new thread on how to ACK a message on REST. I'm sure Bill B will answer it if you can't find it in the docs.

  • 8. Re: Possible Deadlock Sending to JMS Queue
    Clebert Suconic Master

    That is because until you ack the message, the message could be redelivered in case of a failure.

  • 9. Re: Possible Deadlock Sending to JMS Queue
    Bill Burke Master

    Push subscription is a listener.  It receives the message from queue or topic, pushes it to the URL, then acknowledges the message.  Is the URL you are pushing to blocking or failing?  In the beta version of HornetQ REST, the push subscription would continuously try to resend the message.  In 2.2, HornetQ REST can be configured to disable the subscription if it fails to deliver the message a certain number of times.
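    The listener flow Bill describes -- receive, push to the URL, then acknowledge -- can be sketched as the loop below. This is a simplified model with made-up names, not the actual org.hornetq.rest push code; the point is that a message that has been delivered but never acked stays on the server:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

// Simplified model of a REST push consumer: take a message, push it to a URL,
// and acknowledge only on success. Unacked messages stay "delivering" on the
// server, which is why DeliveringCount can equal MessageCount.
public class PushConsumerSketch {
    static int delivered = 0, acked = 0;

    static void drain(Deque<String> queue, Predicate<String> pushToUrl) {
        while (!queue.isEmpty()) {
            String msg = queue.peek();     // message stays on the server for now
            delivered++;
            if (pushToUrl.test(msg)) {     // HTTP PUT/POST to the push target
                queue.poll();              // acknowledge: remove from the server
                acked++;
            } else {
                break;                     // no ack -> message remains, may redeliver
            }
        }
    }

    public static void main(String[] args) {
        Deque<String> queue = new ArrayDeque<>();
        queue.add("a"); queue.add("b"); queue.add("c");
        drain(queue, msg -> !msg.equals("c"));  // pretend the push of "c" fails
        System.out.println("delivered=" + delivered + " acked=" + acked
                + " remaining=" + queue.size());
        // prints: delivered=3 acked=2 remaining=1
    }
}
```

    That delivered-but-unacked state is one plausible reading of the DeliveringCount numbers posted below.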

  • 10. Re: Possible Deadlock Sending to JMS Queue
    David Hosier Newbie

    Right, so I can see log messages like the following in the server when I send a message:

     

    12:46:53,095 DEBUG [org.hornetq.rest.queue.push.UriStrategy] Success

    12:46:53,095 DEBUG [org.hornetq.rest.queue.push.PushConsumer] Acknowledging: 2147688584

     

    However, I am not sure I know how to tell if the message is truly acked or not. I am looking at the Queue in the jmx-console trying to see if anything gives me a clue. I'm 99.9% positive my endpoint is not going away. I have seen the logging that happens when the message cannot be pushed to the endpoint, and I am definitely *not* seeing that in this case. I've run the test again with the address-full-policy set back to BLOCK. This is what I am seeing for the queue in the jmx-console:

     

    MessageCount: 5821 (my test sends 10,000 messages, and it always stops here)

    DeliveringCount: 5821

    MessagesAdded: 5821

     

    I'm not 100% certain what each of those means. MessageCount seems to indicate that there are 5821 messages actually in the queue that would be contributing to the memory size of the queue. However, when I invoke the listMessagesAsJSON operation, I just get an empty array. That seems to tell me that there are actually no messages left in the queue at all, and thus the memory size of the queue should be zero. Is there any other useful information I could capture from the jmx-console that I may be overlooking? Is there somewhere that I can see how much memory the queue thinks it is consuming? I could have missed it, but I've looked at basically all the MBeans in the org.hornetq group for that information.

     

    One last tidbit. The number of messages that I send (as indicated by the RECEIVED NOTIFICATION log message) is equal to the number of acknowledgments that I see (as indicated by the Acknowledging: #### log message).

     

    $ grep -e "RECEIVED NOTIFICATION" server.log |wc -l

        3302

    $ grep -e "Acknowledging:" server.log |wc -l

        3302

     

    But the number of PUT log messages from UriStrategy is 1 less.

    $ grep -e "PUT http://localhost:8080/notification/test/receive" server.log |wc -l

        3301

  • 11. Re: Possible Deadlock Sending to JMS Queue
    Bill Burke Master

    The HornetQ REST code is doing an acknowledgement immediately following the "Acknowledging:" debug message.

     

    Why are there 3302 REST receives (and mirrored PUTs)?  Where are the other 6400 messages?

     

    Is there any way to reduce the problem and take out a moving part, i.e. remove HornetQ REST and simulate it through a simple JMS connection?

     

    I don't understand what's going on here. Did you add paging, so that your 10K-message test now works, and you're trying to figure out why it didn't work before?  I also don't understand the setup of your application.  You talk about subscriptions.  Are you using topics or a queue?  For topics, if you have a dead subscription (meaning one that hasn't been shut down), you would see the behavior you are seeing (overflow).

  • 12. Re: Possible Deadlock Sending to JMS Queue
    Bill Burke Master

    Ok, I also found a stupid error if you have disableOnError set to true on your push registration.  If it is set to true, the message is pushed, but the consumer is shut down.

  • 13. Re: Possible Deadlock Sending to JMS Queue
    David Hosier Newbie

    What does this setting default to? It doesn't say in the User Guide, but it seems to imply a default value of false. Also, if this is set to true, should I see anything in the log? I tried setting it to true, but I just get the same behavior with no additional log messages.

     

    In terms of my test setup: I am currently just trying to prove out some architectural decisions, so my test setup is pretty simple. I am using a REST endpoint inside my server as a means to send in messages, which the endpoint then turns around and sends on to a single JMS Queue that has one REST push subscriber. The push subscription endpoint is another method in the same resource (but I've had the same results with a node.js script that was set up as an endpoint). The "missing" 6400 messages are in an internal queue (a data structure) waiting their turn to be sent to the JMS Queue, which of course is not happening.

     

    It "works" when I enable paging. I say "works" because I get all the messages sent to the Queue and they end up at my consumer, but all the paging files that were created in the process still exist after all the messages get delivered. According to the HornetQ User Guide, "The system will navigate on the files as needed, and it will remove the page file as soon as all the messages are acknowledged up to that point." That indicates to me that the acknowledge the REST API is doing does not seem to be working, since all the page files still exist after all messages are delivered. Would you agree that this seems to be the case?

  • 14. Re: Possible Deadlock Sending to JMS Queue
    Bill Burke Master

    I don't know what to tell you, the REST code isn't doing anything special.  As I said, a Listener is set up with autoAck = false.  The received message is forwarded to a URL, then the message is acknowledged.  That's it.

     

    I'll set up a test that pushes 10k messages through a push subscription.  How large are these messages, can you guess?
