Writing a Clustered HA Singleton Service (AS 5.x)

The Problem

Writing high-availability software is relatively simple on a J2EE platform (that's what the darn thing was invented for in the first place), but only if you know where the pitfalls are. It usually starts with googling simple examples, picking them up from here and there, and playing around with them until they do what you want. The good news is that 90% of the cases are pretty well covered in the literature, in various articles and on the Internet; the problem is the obscure 10% that remains uncovered. That's when your headaches begin!

 

Let's take the following scenario: You have an application providing a critical business service to your customers. You are bound by an SLA that obliges you to pay a fine to your client for each minute the service is offline. The service is heavily used, and you can hardly find a time frame with no traffic in which to restart it, let alone stop it for maintenance or upgrades. The business service is provided as a web service running on a 2-node JBoss 5 cluster, with an Apache server acting as a load balancer.

 

That's no problem, you say - you can stop one node and Apache will use just the remaining node, with no impact on the client. When you're done with the updates and the node is restarted, it will kick in again, so you can stop the other one. Right?

 

Indeed, it would be that easy, if it weren't for one catch: The system internally uses a cryptographic service that has to be manually initialized from a web console before it can be used. Alas, when the node starts, the service becomes reachable before it has been manually initialized, so all client requests that end up on this non-initialized instance will fail.

 

The solution is to use a highly available singleton service. Basically, it is a service running on just one node of the cluster; if that node fails (gracefully or by force), JBoss will start a new instance on another node.

 

Node-Wise Singleton

The Basics

In order to get a cluster-wise singleton, we must first master a node-wise singleton. Unlike a stateless session bean, which can be instantiated more than once, a service is instantiated exactly once on each node. This concept is "legally" introduced in EJB 3.1, but until JBoss AS 6.0 is released, we are covered by the JBoss annotation @org.jboss.ejb3.annotation.Service (we'll just call it @Service here), which does exactly that. Each class annotated with @Service is instantiated exactly once on server startup and its public void start() method gets executed. That's where you want to start your timers, load resources and do whatever your service does on start. When the server stops, or wants to undeploy or redeploy the service, it will call its public void stop() method. Services have four reserved methods, but not all of them have to be present in your service:

 

    public void create() throws Exception;
    public void start() throws Exception;
    public void stop();
    public void destroy();

 

Let's start with an example:

 

@Service
public class CryptoServiceBean {
    private MyCrypto crypto = ...;

    /*
     * Automatically called by JBoss on startup
     */
    public void start() {
        System.out.println("CryptoServiceBean started");
    }

    /*
     * Automatically called by JBoss before shutdown or undeployment
     */
    public void stop() {
        System.out.println("CryptoServiceBean stopped");
    }

    public byte[] encrypt(String str) throws CryptoException {
        return crypto.encrypt(str);
    }
   
    public String decrypt(byte[] data) throws CryptoException {
        return crypto.decrypt(data);
    }
}

 

Exposing The Service

A good way to use the service as a component is to make it a session bean with a remote or local interface. Since we'll make our service roam from node to node, we will have to use a @Remote interface. Since it's defined as a @Service, this session bean will be instantiated just once on each node. Let's write the interface with business methods and make the service bean implement it.

 

@Remote
public interface CryptoService {
    public byte[] encrypt(String str) throws CryptoException;
    public String decrypt(byte[] data) throws CryptoException;
}

@Service
public class CryptoServiceBean implements CryptoService {
    ...
}

 

Making it a Cluster-Wise Singleton

If we deploy it on a cluster, one instance will be running on each node, but we want it operational on just one. There are several ways to do that, but my favorite and the most elegant is the one described here. In this scenario, the singleton is still instantiated on all nodes, but is operational only on the master node. When we fetch the service from HA-JNDI, we expect it to return the remote proxy to the currently operational instance. So there are two things we have to take care of:

 

  1. How to trigger the right instance to make it active;
  2. How to make HA-JNDI always return the active instance.

 

NOTE: HA-JNDI is short for High Availability JNDI. It is a cluster-wise implementation of JNDI. Looking up an object using regular JNDI (the one injected in the context) searches only the local node's registry, while doing the same using HA-JNDI automatically expands the search to all connected nodes. You will see a bit later how to get HA-JNDI instead of the plain old JNDI.

Triggering the Right One

Regarding the first issue, that's where JBoss steps in with something called HASingletonController. It's a gadget controlled by JBoss that you use to specify the services that should run as clustered singletons. A clustered singleton is operational only on the current master node; when that node fails, another available node becomes the master and all clustered singletons are activated on that node. The HASingletonController is configured in the META-INF/jboss-service.xml of the JAR holding the CryptoServiceBean. There you have to specify the following:

 

  • a reference to the singleton service
  • a reference to your method for starting the singleton (not to be confused with the @Service's start() method!)
  • a reference to your method for stopping the singleton (not to be confused with the @Service's stop() method!)

 

The latter two are not a problem, but the problem is the first one - referencing the service. In order to do so, you have to make the service available from the JBoss kernel by specifying a @Service parameter called objectName and by specifying the JMX Management interface:

 

@Service(objectName = "myexample:service=CryptoService")
@Management(CryptoService.class)     // otherwise it won't work!
public class CryptoServiceBean implements CryptoService {
    ...
}

 

Now you can write your META-INF/jboss-service.xml (or whatever *-service.xml you want to call it):

 

<?xml version="1.0" encoding="UTF-8"?>

<server>
    <mbean name="jboss.examples:service=CryptoService-Controller"
           code="org.jboss.ha.singleton.HASingletonController">
        <depends>myexample:service=CryptoService</depends>
        <attribute name="HAPartition"><inject bean="HAPartition" /></attribute>
        <attribute name="TargetName">myexample:service=CryptoService</attribute>
        <attribute name="TargetStartMethod">startSingleton</attribute>
        <attribute name="TargetStopMethod">stopSingleton</attribute>
     </mbean>
</server>

 

NOTE 1: The startSingleton and stopSingleton method names are not reserved - you can call them whatever you like.

 

NOTE 2: The start method of the @Service will be executed on all nodes, but the startSingleton method defined above will be called only on the master node. To be totally frank here, startSingleton is sometimes called even on slave nodes after server startup, but it is then immediately followed by a call to stopSingleton. Why and when exactly this happens, I haven't figured out yet - but you have to be aware of that possibility.

 

Now we're almost there: the server will trigger the startSingleton method on the instance found on the current master node, while stopSingleton will be called:

 

  • always on graceful shutdown of the master node
  • on a member (slave) node, immediately after startSingleton, if the singleton happened to be started there (see NOTE 2)
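
Given the start-then-stop behaviour mentioned in NOTE 2, it seems prudent to write both callbacks so that repeated or unexpected calls do no harm. The following is only a plain-Java sketch of that idea, not JBoss code: the class name TolerantSingleton and the activations counter are made up for illustration, and the comments mark where the real start and stop work would go.

```java
/** Plain-Java sketch: singleton callbacks that tolerate repeated or spurious calls. */
class TolerantSingleton {
    private boolean active = false;
    private int activations = 0; // how many times the real work was actually started

    /** The method you would name in TargetStartMethod; safe to call more than once. */
    public synchronized void startSingleton() {
        if (active) {
            return; // already active on this node: nothing to do
        }
        active = true;
        activations++;
        // start timers, open connections, etc. would go here
    }

    /** The method you would name in TargetStopMethod; safe to call without a prior start. */
    public synchronized void stopSingleton() {
        if (!active) {
            return; // never started (or already stopped): nothing to undo
        }
        active = false;
        // release whatever startSingleton acquired
    }

    public synchronized boolean isActive() { return active; }

    public synchronized int activations() { return activations; }
}

class TolerantSingletonDemo {
    public static void main(String[] args) {
        TolerantSingleton s = new TolerantSingleton();
        s.stopSingleton();  // stop without a prior start: ignored
        s.startSingleton();
        s.startSingleton(); // duplicate start: ignored
        s.stopSingleton();  // the start-then-stop pair a slave node may see
        System.out.println(s.isActive() + " " + s.activations());
    }
}
```

The guard flag costs almost nothing and keeps the node in a consistent state whichever order the controller happens to call the two methods in.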

 

Registering the Right One

One question remains: if the service technically exists on each node, how do we make HA-JNDI return only the active one?

 

You should already know that HA-JNDI first looks in the local registry and if a requested object is not found, it asks the next node. The next node does the same, until the object is found somewhere, or there are no more nodes to ask.
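
To make that lookup order concrete, here is a toy model in plain Java. It is not the JBoss implementation and uses no JBoss classes: ToyNode and ToyHaJndi are invented names, and each node's registry is just a map. It only illustrates the rule "local registry first, then the other nodes", and shows that a name missing from the local registry makes the lookup fall through to another node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for one cluster node and its local JNDI registry (not a JBoss class). */
class ToyNode {
    final String name;
    final Map<String, Object> localRegistry = new HashMap<>();

    ToyNode(String name) { this.name = name; }
}

/** Toy model of the lookup order described above. */
class ToyHaJndi {
    private final List<ToyNode> nodes = new ArrayList<>();

    void join(ToyNode node) { nodes.add(node); }

    /** Look in the local node first, then ask the remaining nodes in turn. */
    Object lookup(ToyNode local, String jndiName) {
        Object found = local.localRegistry.get(jndiName);
        if (found != null) {
            return found;
        }
        for (ToyNode other : nodes) {
            if (other == local) {
                continue;
            }
            found = other.localRegistry.get(jndiName);
            if (found != null) {
                return found;
            }
        }
        return null; // a real JNDI lookup would throw NameNotFoundException instead
    }
}

class ToyHaJndiDemo {
    public static void main(String[] args) {
        ToyNode a = new ToyNode("node-a");
        ToyNode b = new ToyNode("node-b");
        ToyHaJndi ha = new ToyHaJndi();
        ha.join(a);
        ha.join(b);

        String name = "MyExampleEAR/CryptoServiceBean/remote";
        a.localRegistry.put(name, "proxy-on-a");
        b.localRegistry.put(name, "proxy-on-b");
        System.out.println(ha.lookup(b, name)); // bound locally, so the local binding wins

        b.localRegistry.remove(name);
        System.out.println(ha.lookup(b, name)); // not bound locally, so another node answers
    }
}
```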

 

The idea is to temporarily remove the object from the local registry until the node becomes the master. We can do that in the start, startSingleton and stopSingleton methods:

 

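import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

import org.jboss.naming.Util;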

@Service(objectName = "myexample:service=CryptoService")
@Management(CryptoService.class)
public class CryptoServiceBean implements CryptoService {
    private static final String JNDI_NAME = "MyExampleEAR/CryptoServiceBean/remote";

    private static final MyCrypto crypto = MyCrypto.getInstance();
    private CryptoService reference;
    private boolean masterNode = false;

    public void start() throws NamingException {
        /*
         * Remove from local JNDI until started as a singleton,
         * but keep the reference locally so it could be restored later.
         */
        Context ctx = new InitialContext();
        reference = (CryptoService) ctx.lookup(JNDI_NAME);
        Util.unbind(ctx, JNDI_NAME);
    }

    /**
     * Called by HASingletonController on singleton start
     */
    public void startSingleton() throws NamingException {
        masterNode = true;
        /*
         * Rebind to JNDI when started as singleton
         */
        Context ctx = new InitialContext();
        Util.rebind(ctx, JNDI_NAME, reference);
    }

    /**
     * Called by HASingletonController on singleton stop
     */
    public void stopSingleton() throws NamingException {
        masterNode = false;
        /*
         * Unbind from local JNDI when stopped
         */
        Context ctx = new InitialContext();
        Util.unbind(ctx, JNDI_NAME);
    }
    ...
}

 

Furthermore, if you write an isMasterNode() method and expose it on the interface, you will be able to see the status of the instance in the JMX console.
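
A minimal plain-Java sketch of such a status method follows. The names SingletonStatus and StatusAwareBean are made up so that the snippet compiles without JBoss; in the setup above, the method would instead be declared on CryptoService (the interface named in @Management) and implemented in CryptoServiceBean. Marking the flag volatile is my own assumption, on the grounds that the console most likely reads it from a different thread than the one calling startSingleton.

```java
/** Hypothetical stand-in for the part of the management interface that reports status. */
interface SingletonStatus {
    boolean isMasterNode();
}

/** Stand-in for the service bean; only the status-related parts are shown. */
class StatusAwareBean implements SingletonStatus {
    // volatile (an assumption): the flag is written by the controller callbacks
    // and may be read from another thread
    private volatile boolean masterNode = false;

    public void startSingleton() { masterNode = true; }

    public void stopSingleton() { masterNode = false; }

    @Override
    public boolean isMasterNode() { return masterNode; }
}

class StatusAwareBeanDemo {
    public static void main(String[] args) {
        StatusAwareBean bean = new StatusAwareBean();
        bean.startSingleton();
        System.out.println("master node: " + bean.isMasterNode());
    }
}
```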

 

The Deployment

When you write your service, it will most likely be a part of an EAR package. The service bean can be placed together with other beans in a JAR file containing the META-INF/jboss-service.xml. The interfaces can be placed in another JAR. You can deploy the EAR manually to each node's deploy directory, or you can drop it into the farm directory of any one node and let JBoss do the distribution. In the latter case, however, the application would be restarted on all nodes at once, leaving us without a single properly initialized crypto service instance! And in our case that's very bad.

 

The proper way of doing an update without downtime would be:

 

  1. Stop a node, leaving the other(s) running;
  2. Copy the EAR to the deploy directory;
  3. Start the node.
  4. If required by the service, initialize it using a local interface (e.g. you might have built a web page for entering activation codes). Make sure you connect directly to the node, by specifying the node IP and the port address - do NOT access through the httpd, because it does not guarantee which node you will get to.

 

The node will now be ready when its turn to become the master node comes.

 

Fetching the Singleton Service

In order to fetch our cluster-wise singleton service, you have to address the required partition by passing the jnp.partitionName property to the InitialContext constructor. A context injected through @Resource SessionContext ctx would return a local instance of the session bean, because it contains only the local JNDI, and that's not what we want here. So you'll have to do the following:

 

import java.util.Properties;
import javax.naming.*;

public class HAUtils {
    private static final Properties p = new Properties();
    static {
        p.put(Context.INITIAL_CONTEXT_FACTORY,
              "org.jnp.interfaces.NamingContextFactory");
        p.put(Context.URL_PKG_PREFIXES, "org.jboss.naming:org.jnp.interfaces");
        p.put("jnp.partitionName", System.getProperty("jboss.partition.name", "DefaultPartition"));
    }

    public static Context getHAContext() {
        try {
            return new InitialContext(p);
        } catch (NamingException e) {
            return null;
        }
    }

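    public static Object lookup(String name) {
        Context ctx = getHAContext();
        if (ctx == null) {
            return null; // the HA-JNDI context could not be created
        }
        try {
            return ctx.lookup(name);
        } catch (NamingException e) {
            return null; // the name was not found or the lookup failed
        }
    }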

    public static String nodeName() {
        return System.getProperty("jboss.server.name");
    }
}

 

Now you just fetch and use the CryptoService like any other session bean, except you do it from the HA-JNDI instead of the local JNDI.

 

REMEMBER: The HA-JNDI is not injected by the container - only the local one is! You have to get the HA-JNDI manually.

 

    CryptoService cs = (CryptoService)
                       HAUtils.lookup("MyExampleEAR/CryptoServiceBean/remote");
    byte[] data = cs.encrypt("blahblah");
    ....
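
Note that HAUtils.lookup returns null when the lookup fails. My assumption is that this can happen for a short while during a failover, after the old master is gone and before the new one has rebound the service. A caller may therefore want to retry for a moment instead of failing at once. The helper below is a hypothetical sketch in plain Java (RetryingLookup is not a JBoss class); the demo uses a fake lookup so that it runs without a cluster.

```java
import java.util.function.Supplier;

/** Hypothetical helper (not part of JBoss): retries a lookup that may return null. */
class RetryingLookup {
    static <T> T lookupWithRetry(Supplier<T> lookup, int attempts, long pauseMillis) {
        if (attempts < 1) {
            throw new IllegalArgumentException("attempts must be at least 1");
        }
        for (int i = 0; i < attempts; i++) {
            T result = lookup.get();
            if (result != null) {
                return result;
            }
            if (i < attempts - 1) {
                try {
                    Thread.sleep(pauseMillis); // give a new master time to take over
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting for the service", e);
                }
            }
        }
        throw new IllegalStateException("service not available after " + attempts + " attempts");
    }
}

class RetryingLookupDemo {
    public static void main(String[] args) {
        // Stand-in for HAUtils.lookup(...): returns null twice, then a fake proxy.
        int[] calls = {0};
        Supplier<Object> fakeLookup = () -> ++calls[0] < 3 ? null : "fake-proxy";

        Object proxy = RetryingLookup.lookupWithRetry(fakeLookup, 5, 10L);
        System.out.println(proxy + " after " + calls[0] + " calls");
    }
}
```

With the classes from this article, the call would look something like RetryingLookup.lookupWithRetry(() -> HAUtils.lookup("MyExampleEAR/CryptoServiceBean/remote"), 5, 1000L); the attempt count and pause are arbitrary values that you would tune to how long a failover takes in your cluster.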

 

Conclusion

Now you know how to make a High Availability Singleton Service. There are several other ways and countless variations that eventually lead to the same result, but in my experience this is the most elegant one, until someone comes up with something better.