We are currently dealing with Messaging problems in the JBoss 4.3 CP08 container. We realize these issues are well known and that the best move is to upgrade. However, our client (a very large corporation) is reluctant to do so at this time. As a result, we are trying to mitigate the problems as best we can.
Basically, our application uses Queues for a lot of asynchronous processing, and it uses Topics to send notifications to subscribed clients. The server is a single instance running in a retail location, and there are anywhere from 4 to 12 subscribed clients (on the same network, in the same retail location). It is a very simple setup and there is no clustering of any kind.
In November of last year, the stores started to experience network disruptions. We don't know what the root cause of the network disruptions is (and being the vendor, we may never know), but they are seriously impacting the JBoss Messaging component. The main problem we see is that the Post Office blocks indefinitely when it tries to write a message to a subscribed client workstation. This ends up deadlocking the entire Post Office, which in turn blocks all of the other Threads that are trying to write to the Queues. The entire server comes to a halt and all of the workstations eventually freeze up.
We have been dealing with Red Hat support on this and we have a very solid grasp of the problems with the Messaging implementation in 4.3. As I mentioned, we are looking for workarounds to the problem. One workaround we have come up with is to configure 2 separate Post Office implementations in the same JBoss instance. We assign the Queues to one instance, and we assign the Topics to the other. The purpose of this change is to reduce the impact of a failure when writing a Topic message to a subscriber. If the Topic Post Office freezes up, the Queue Post Office is unaffected and can continue processing requests from other Threads on the server.
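To make the reasoning concrete, here is a toy Java sketch of the isolation idea. It is only a model of the failure as described above, not JBoss Messaging code: `ToyPostOffice`, `route`, and `queueSendCompletes` are hypothetical names, and the assumption that a single lock guards all routing in a post office is a simplification made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class PostOfficeIsolationSketch {

    /** Toy stand-in for a post office (hypothetical, not the JBoss Messaging API). */
    static final class ToyPostOffice {
        // Assumption: one lock serializes all routing done by this post office.
        private final ReentrantLock routingLock = new ReentrantLock();

        /** Runs a delivery while holding the routing lock; the delivery may block. */
        void route(Runnable delivery) {
            routingLock.lock();
            try {
                delivery.run();
            } finally {
                routingLock.unlock();
            }
        }
    }

    /**
     * Returns true if a queue send finishes within 500 ms while a topic
     * delivery is stuck (simulating a write to an unreachable subscriber).
     */
    public static boolean queueSendCompletes(boolean separatePostOffices) {
        ToyPostOffice topicOffice = new ToyPostOffice();
        ToyPostOffice queueOffice = separatePostOffices ? new ToyPostOffice() : topicOffice;

        CountDownLatch topicStuck = new CountDownLatch(1); // topic delivery now holds its lock
        CountDownLatch release = new CountDownLatch(1);    // stands in for the blocked network write
        CountDownLatch queueDone = new CountDownLatch(1);  // queue send finished

        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            pool.submit(() -> topicOffice.route(() -> {
                topicStuck.countDown();
                try {
                    release.await(); // blocks like a write to a dead subscriber
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
            topicStuck.await();
            pool.submit(() -> queueOffice.route(queueDone::countDown));
            return queueDone.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            release.countDown(); // unblock the stuck delivery so the worker threads can exit
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared post office, queue send completes: " + queueSendCompletes(false));
        System.out.println("separate post offices, queue send completes: " + queueSendCompletes(true));
    }
}
```

In this model the queue send stalls behind the stuck topic delivery when both share one post office, and it completes when each destination type has its own. Whether the real Post Office behaves the same way depends on what state the two instances still share inside the server (for example the database or the remoting connector), which this sketch does not model.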
Red Hat support has indicated that this configuration has never been through their QA process and therefore is not supported. I respect their position.
What I am interested in knowing is whether anyone else in the community has ever tried this type of configuration, and/or whether anyone knows why this is a bad idea. We have run this configuration through our automated regression and performance tests and have not seen any problems. And it does reduce the impact of a failure in the Messaging system. Any thoughts?