Remote txinflow: XID changes | JBoss.org Content Archive (Read Only)

60. Re: Remote txinflow: XID changes

tomjenkinson Oct 21, 2011 8:15 AM (in response to marklittle)

Unfortunately my brain failed me on my previous comment :s

64 - 4 - 12 is 48 not 58

This leaves 24 (I hope!) bytes per node name (less a variable length short key of 2).

Therefore David, if you can confirm Integers are OK with you then we can start the ball rolling on getting this to AS7 for you and I can show you the finalised example I have been working on that should prove very helpful when implementing the transport interceptors...

61. Re: Remote txinflow: XID changes

dmlloyd Oct 21, 2011 10:42 AM (in response to tomjenkinson)

If I understand correctly, you're using integer values internally but the user doesn't have to be aware of it? If so I'm in favor. Can you enumerate what additional configuration items the user will require with this solution?

62. Re: Remote txinflow: XID changes

tomjenkinson Oct 21, 2011 11:21 AM (in response to dmlloyd)

Sorry, it's not quite as you say. The user has to configure various items for the transaction server and most of those remain unchanged. The one change that must be made is that the argument to CoreEnvironmentBean.setNodeIdentifier() must be changed from a String to an Integer (and likewise for the corresponding JTAEnvironmentBean.setXARecoveryNodes).

This is app server configuration and should be unique for nodes sharing an object store (or sharing a remoting domain in the new scenario).

63. Re: Remote txinflow: XID changes

dmlloyd Oct 21, 2011 12:31 PM (in response to tomjenkinson)

OK, unfortunately it isn't acceptable to require the user to configure int values like this. But... what about this idea?

Basically you're looking for approximately 32 bits of uniqueness right? Well a node name is based on host name, which is ultimately based off of DNS labels. A DNS label can contain one of 26 letters, one of 10 digits, a "-", or a "." (but cannot start or end in "-" or "."). This means that a single character has only 38 possible values, which means you can fit about 6 of them into 32 bits, if you squeeze really hard.

Perhaps we can set it up so that we have just a couple more bits - maybe 42, which would be just enough for 8 characters - and default it to the compressed node name, and if the node name is greater than 8 characters we can print a warning and suggest that the user choose a unique 8-character name?
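To make the arithmetic above concrete, here is a minimal sketch of how an 8-character node name could be packed into 42 bits using a 38-symbol alphabet (38^8 is just under 2^42). The class NodeNameCodec and its methods are hypothetical and not part of JBossTS or AS7. The sketch also assumes that names are case-insensitive and that short names are padded on the right with ".", which is unambiguous only because a name cannot end in ".".

```java
import java.util.Locale;

/** Hypothetical helper (not part of JBossTS or AS7): packs a short node name into 42 bits. */
final class NodeNameCodec {
    // 26 letters + 10 digits + '-' + '.' = 38 symbols, as counted in the post above.
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-.";
    private static final int RADIX = 38;
    static final int MAX_CHARS = 8;
    // Assumption: names cannot end in '.', so '.' is safe to use as right-hand padding.
    private static final char PAD = '.';

    /** Returns a value below 38^8, which is just under 2^42. */
    static long encode(String nodeName) {
        String name = nodeName.toLowerCase(Locale.ROOT);
        if (name.isEmpty() || name.length() > MAX_CHARS) {
            throw new IllegalArgumentException(
                    "node name must be 1.." + MAX_CHARS + " characters: " + nodeName);
        }
        char first = name.charAt(0);
        char last = name.charAt(name.length() - 1);
        if (first == '-' || first == '.' || last == '-' || last == '.') {
            throw new IllegalArgumentException(
                    "node name cannot start or end with '-' or '.': " + nodeName);
        }
        long packed = 0;
        for (int i = 0; i < MAX_CHARS; i++) {
            char c = i < name.length() ? name.charAt(i) : PAD;
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) {
                throw new IllegalArgumentException(
                        "unsupported character '" + c + "' in " + nodeName);
            }
            packed = packed * RADIX + digit; // treat the name as a base-38 number
        }
        return packed;
    }

    /** Reverses encode(); the result is always lower case. */
    static String decode(long packed) {
        char[] chars = new char[MAX_CHARS];
        for (int i = MAX_CHARS - 1; i >= 0; i--) {
            chars[i] = ALPHABET.charAt((int) (packed % RADIX));
            packed /= RADIX;
        }
        int end = MAX_CHARS;
        while (end > 0 && chars[end - 1] == PAD) {
            end--; // strip the padding added by encode()
        }
        return new String(chars, 0, end);
    }

    public static void main(String[] args) {
        long packed = encode("foo1");
        System.out.println(decode(packed) + " -> " + packed);
    }
}
```

A name longer than 8 characters is rejected here with an exception; the proposal above would instead print a warning and ask the user to choose a shorter unique name.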

64. Re: Remote txinflow: XID changes

tomjenkinson Oct 23, 2011 4:23 AM (in response to dmlloyd)

Hi David,

I was wondering about something like that (i.e. reducing the "domain" of the character space). The main reason I backed out is a point I raised earlier:

"node names must be unique per instance"

This means that the host name (or a derivative thereof) is not unique enough. Let's say you have two instances on the same machine, and let's say we got the name into 42 bits (although to be fair we could allocate 6 bytes per name if that helped: 28 for the Uid and 6 per node name, leaving 24 bytes for the EIS name). Anyway, let's say the name was "foo".

If you start up two servers with the node name "foo" you can't flow transactions between them, as the node identifier is the same. That is fairly clear if "foo" is on a different server, but potentially it could be on the same server, i.e. the user could start up more than one instance on host "foo". Hence you would need to configure node identifiers to be "foo1" and "foo2" anyway, leading to an integer and therefore to needing to configure this integer somewhere (presumably the same place where the integer I was talking about would need to be: a central configuration repository). You see, the node identifier defines who "owns" transaction logs (this is an existing requirement, not new for this release). If you are thinking about some kind of "launch counter" where we could launch servers with a locally incremented suffix (as I would be) then that isn't really good enough, as the user would want to know deterministically which server they are launching in order to know which transactions are going to be restored.

The other issue is that hostnames may use non-Roman alphabets, leading to quite a few more options; take the servers listed on http://en.wikipedia.org/wiki/List_of_Internet_top-level_domains#Test_TLDs or the ones from http://www.w3.org/2003/Talks/0425-duerst-idniri/slide12-0.html such as http://räksmörgås.josefsson.org/

Also, arguably by asking the user to "choose a unique 8-character name" we might as well ask them to choose a unique int instead?

Just to clarify, this requirement for unique node names is an existing requirement of JBoss Transactions, but in the JTA non-shared object store scenario it has been sufficient to give the same node identifier.

Tom

65. Re: Remote txinflow: XID changes

dmlloyd Oct 23, 2011 2:49 PM (in response to tomjenkinson)

I understand, it isn't perfect, but we do already have the requirement that our (hostname-based) node names be unique for EJB, management, clustering, etc. to function correctly. When more than one server is started up on the same host via domain management, the node name is based on the host name plus the server name (which is unique within a server group, which is a very manageable namespace as it is centrally defined).

Basically if we start up two servers with the same node name, there are a whole host of services that won't function correctly if the user makes the servers interact. It's still a lesser burden to choose a name which can mostly be defaulted to the node name which already must be unique than to choose an integer. Otherwise we're back to the drawing board to come up with a scheme which can use either a full name or topology information (or some other scheme, like topology + full name hash) to fill in the knowledge gap that each node needs for recovery. Using a plain int simply is not an option.

As for the non-Roman alphabet-based names, they encode back to the same character set as any other domain name, so we don't really need any special handling for this case as we can perform the same encoding. It should be noted, however, that such names would have a significantly reduced maximum length, potentially as few as 2 or 3 characters.
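As a rough illustration of the encoding mentioned above, the standard java.net.IDN class converts an internationalized host name to its ASCII form, which uses only letters, digits, "-" and ".". The wrapper class IdnNodeName is hypothetical. The sketch only shows that the ASCII form is longer than the Unicode original, which is why such names would have less room in a fixed-size field.

```java
import java.net.IDN;

/** Hypothetical wrapper showing how an internationalized host name maps to ASCII. */
final class IdnNodeName {

    /** Converts a Unicode host name to its ASCII-compatible (Punycode) form. */
    static String toAsciiHost(String unicodeHost) {
        return IDN.toASCII(unicodeHost);
    }

    public static void main(String[] args) {
        // "räksmörgås.josefsson.org", written with escapes to avoid source-encoding issues.
        String unicode = "r\u00e4ksm\u00f6rg\u00e5s.josefsson.org";
        String ascii = toAsciiHost(unicode);
        System.out.println(ascii);
        System.out.println("first label: " + unicode.indexOf('.') + " characters before encoding, "
                + ascii.indexOf('.') + " after");
    }
}
```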

66. Re: Remote txinflow: XID changes

tomjenkinson Oct 24, 2011 11:16 AM (in response to dmlloyd)

Just to round off the discussion:

Basically, I determined that the subordinate node name is not strictly a "node identifier" in the same sense as we are requiring it for a JTA gtrid value (which must be unique per object store). Instead it need only be unique per propagated transaction, and the server does not need it to be the same value for different transactions flowed through it, nor indeed must the value be globally unique (i.e. different servers can reuse the same id so long as it is not used by a different server within the scope of the same transaction).

All that does need to happen is that the transport dynamically assigns a unique integer to each node participating in a transaction (e.g. a counter starting at 1), and that a recoverable mapping file is stored *by the transport* so that for any given transaction a server (keyed by the traditional String node identifier) knows what its own (now dynamic) "subordinate node name" was for the transaction, thereby assisting it with recovery.

To that end I have only had to change:

XID {
int:formatId
byte[]:gtrid { UID:sequence, String:nodeName }
byte[]:bqual { UID:sequence, String:EIS name }
}

To:

XID {
int:formatId
byte[]:gtrid { UID:sequence, String:nodeName }
byte[]:bqual { UID:sequence, int:subordinateNodeName, String:EIS name }
}

(i.e. added that single integer into the bqual).
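As a sketch of what the new bqual layout might look like in bytes: the class BqualLayoutSketch below is hypothetical, it assumes the 28-byte Uid size quoted earlier in the thread and a plain 4-byte int, and it is not the actual JBossTS encoding (which is not shown in this thread and may be more compact). The 64-byte limit is the XA branch qualifier maximum, the same value as javax.transaction.xa.Xid.MAXBQUALSIZE.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical byte layout for bqual { UID:sequence, int:subordinateNodeName, String:EIS name }. */
final class BqualLayoutSketch {
    static final int MAX_BQUAL_SIZE = 64; // XA limit (same value as Xid.MAXBQUALSIZE)
    static final int UID_SIZE = 28;       // assumption: Uid size quoted earlier in the thread

    static byte[] pack(byte[] uid, int subordinateNodeName, String eisName) {
        if (uid.length != UID_SIZE) {
            throw new IllegalArgumentException("uid must be " + UID_SIZE + " bytes");
        }
        byte[] eis = eisName.getBytes(StandardCharsets.UTF_8);
        int size = UID_SIZE + Integer.BYTES + eis.length;
        if (size > MAX_BQUAL_SIZE) {
            throw new IllegalArgumentException("bqual would be " + size + " bytes, limit is " + MAX_BQUAL_SIZE);
        }
        return ByteBuffer.allocate(size)
                .put(uid)                    // UID:sequence
                .putInt(subordinateNodeName) // int:subordinateNodeName
                .put(eis)                    // String:EIS name, takes whatever space is left
                .array();
    }

    static int subordinateNodeName(byte[] bqual) {
        return ByteBuffer.wrap(bqual).getInt(UID_SIZE);
    }

    static String eisName(byte[] bqual) {
        int offset = UID_SIZE + Integer.BYTES;
        return new String(bqual, offset, bqual.length - offset, StandardCharsets.UTF_8);
    }
}
```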

It does mean the transport needs to work a little harder to maintain this mapping and dynamically track available subordinate node names for each transaction flow but has solved the issue with dealing with dynamically allocated subordinate node names. It also means that setNodeIdentifier is totally compatible with previous versions of AS (i.e. any String is allowed for a node identifier, not just an Integer encoded as a String) and has therefore further minimized the changeset required from the core JTA side so is safer from an existing functionality point of view.
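The mapping the transport has to maintain could look roughly like the sketch below. SubordinateNameAllocator is a hypothetical class and the in-memory map is only a stand-in: a real transport would have to force each new entry to stable storage before using it, so that the mapping survives a crash and can be consulted during recovery.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the transport's recoverable mapping. */
final class SubordinateNameAllocator {
    // transaction key (e.g. an encoded gtrid) -> (String node identifier -> dynamic int name)
    private final Map<String, Map<String, Integer>> byTransaction = new HashMap<>();

    /** Returns the node's name within this transaction, allocating the next counter value if needed. */
    synchronized int nameFor(String gtridKey, String nodeIdentifier) {
        Map<String, Integer> names = byTransaction.computeIfAbsent(gtridKey, k -> new HashMap<>());
        Integer existing = names.get(nodeIdentifier);
        if (existing != null) {
            return existing;
        }
        int next = names.size() + 1; // counter starting at 1, unique only within this transaction
        names.put(nodeIdentifier, next);
        // A real transport would persist (gtridKey, nodeIdentifier, next) here.
        return next;
    }

    /** Drops the mapping once the transaction has completed (the matching log delete). */
    synchronized void forget(String gtridKey) {
        byTransaction.remove(gtridKey);
    }
}
```

Note that the same int can be reused by different servers in different transactions, which is exactly the relaxation described above.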

Thanks to everyone for their suggestions on this. Also, clearly the restriction about the (now reverted to String) node identifier being unique within the domain is all the more vital.

For those tracking this discussion, the reason the parent node name is no longer required is that (a) only one proxy can be added for any particular transaction at any particular server and (b) the proxy XA resource transport addition must track transactions in flight for each server, so it takes the place of this particular value.

For those wondering why the subordinate node name can now be dynamic, the reason is that we now rely on the transport storing this data rather than reading it from one of our configuration files.

Essentially the requirements of the solution are the same as they have always been, it is just that a little bit more data is required to be stored by the transport than was the case before.

I do have examples and tests for all of this: https://svn.jboss.org/repos/labs/labs/jbosstm/branches/JBOSSTS_4_15_0_Final/atsintegration/tests and https://svn.jboss.org/repos/labs/labs/jbosstm/branches/JBOSSTS_4_15_0_Final/atsintegration/examples.

In particular, the example should prove to be extremely useful when developing the transport required for this as it shows and describes the points where relevant transaction data must be reliably persisted.

As I say, by reverting to a String node identifier, the solution is even closer to the original mechanism, so this new approach should have no impact on existing JTA/JCA work. The only difference now is that the space available to EIS names is one byte shorter (from 36 to 35 bytes).

67. Re: Remote txinflow: XID changes

tomjenkinson Oct 24, 2011 11:28 AM (in response to tomjenkinson)

I should point out that the impact of moving from static subordinate node identifiers to dynamic ones is that an extra log write/delete per transaction is required, which will clearly have an impact on performance.

68. Re: Remote txinflow: XID changes

mmusgrov Oct 24, 2011 12:10 PM (in response to tomjenkinson)

Couldn't the mapping be stored with the transaction log record, thereby avoiding the need for two separate writes? A further benefit would be that the transport would be using our optimised logging mechanisms.

69. Re: Remote txinflow: XID changes

tomjenkinson Oct 24, 2011 1:05 PM (in response to tomjenkinson)

I just had an interesting chat over on our project's freenode IRC channel #jbossts where some legitimate concerns around performance of the solution were discussed. I thought it might be useful to reiterate some of the points I tried to address there for the rest of the community.

Persistence points of the solution:

1. The transport is required to maintain a persistent store of all Xids that it proxies to the remote side. It has two points where it needs to manipulate this store. First, during the initial contact with the remote server, we must record at least the fact that we have talked to a remote server for a particular gtrid. Then, once we have access to the proxy's actual bqual, we need to update this record to ensure that the proxy can be recovered correctly in the case of a viable commit (in the current example I delayed this to prepare). We need to do this because we cannot filter the remote server's XIDs, as XIDs do not have the parent node name encoded in them.

2. When a transaction is first propagated into a new server we must persist the dynamic subordinate node identifier for this server so that we can reliably recover transactions for this particular instance of server.

Removing these additional log writes is feasible, but would require the change to allocate static node identifiers that can be compressed into the bqual for each node.

Removing the log writes associated with point 2 above can be done by using the node's existing node identifier and compressing this into the bqual. This should be a relatively painless operation as we could use the earlier suggestion around compressing the EIS name into the integer, leaving ample space to encode a single node identifier in the bqual.

The bqual would move from byte[]:bqual { UID:sequence, int:subordinateIdentifier, String:EISname } to byte[]:bqual { UID:sequence, String:subordinatesNodeName, int:EISname }. It would require minor changes as explained earlier to ensure that the converted EIS name was logged in much the same way I outlined logging the subordinate name but is fairly trivial and is essentially compatible with the current approach.

Removing the log writes associated with point 1 is slightly more involved. It is this scenario that necessitated the introduction of the parent node name. In this scenario the server is solely responsible for keeping a record of all transactions it has seen (as it does anyway), but this list can then be filtered by parent node name in recovery. Furthermore, to remove the requirement of keeping a record of the XIDs the proxy has seen, you would also need to propagate a list of all nodes (the remoting transport node name is fine) that a transaction has flowed through, to ensure that you can register a proxy *before* you propagate to the remote server. The alternative is to always put a proxy in place, but beware of the scenario where you have created a diamond: with a flow going 1,2,3,1,3, for example, you would get two "recover/completions" called on node 3, the second recover/completion sequence causing an error as the subordinate's XID is already recovered. Yet another alternative to remove one of the log writes related to point 1 above would be for proxy XA resources to have deterministic branch IDs, but we would still need the parent node name attribute adding to the XID.
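The "propagate a list of all nodes" idea above might be sketched as follows. FlowPathSketch is a hypothetical class; it assumes the list of visited node names travels with the transaction context and that a proxy is only registered the first time a node is reached, which is one reading of the paragraph above rather than the implemented behaviour.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical record of the nodes a transaction has flowed through, carried with the transaction. */
final class FlowPathSketch {
    private final Set<String> visited = new LinkedHashSet<>();

    FlowPathSketch(String rootNode) {
        visited.add(rootNode);
    }

    /**
     * Records a propagation to remoteNode.
     * Returns true if this is the first visit, i.e. a proxy should be registered before the call.
     */
    boolean recordPropagation(String remoteNode) {
        return visited.add(remoteNode);
    }

    /** The nodes visited so far, in first-visit order. */
    List<String> visitedNodes() {
        return new ArrayList<>(visited);
    }
}
```

With the flow 1,2,3,1,3 from the example, only the first arrivals at nodes 2 and 3 would register a proxy, so node 3 would see a single recover/completion sequence.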

Hope that clarifies some of the investigations I have done. The final approach was driven mainly by two key factors:

1. Ensure that the performance of existing JTA/JCA is not impacted

2. The server's node identifier must be encoded as a String

Unfortunately there are performance impacts on the new functionality as a result but at least we have a basis for further discussion when we come back to review this.

70. Re: Remote txinflow: XID changes

tomjenkinson Oct 24, 2011 1:21 PM (in response to mmusgrov)

Hi Mike,

Hopefully my previous answer helps to clarify some of the motivations for the current writes. It should also be pointed out that I don't mandate how this data is persisted, indeed potentially there are some items that the transport must store which can be persisted using TS optimised logging mechanisms, that is left to the transport to determine.

Thinking about your suggestion, one example where TS logging could possibly help to optimize this is the dynamic subordinate node identifier. As this Xid is for one of our own transactions, potentially David could provide his own XidImple implementation which packs the static subordinate node identifier and its parent's node identifier after the bqual (these would be String node identifiers instead of the transport-allocated dynamic int I was describing earlier) as extra attributes to be persisted. It is likely, though, that doing so would require work throughout the code base to ensure it was feasible. Also, for example, I assume the JDBC action store assumes a schema that can take a BLOB(128) for the data part of an XidImple, so we would need to determine which object stores this would work with.

NOTE: This is a spontaneous response to your point, I am not entirely certain it would work!

If you didn't go down that route (or it is proved not to work) then the issue you have is that the transport needs to persist items at a different time to that at which the transaction service currently does. As Jonathan has pointed out, the transaction service is heavily optimized in this regard, and with the particular set of requirements in hand it seems that we have to add a few of these persistence points back (well, the transport adds them, specifically for this feature).

I will think more about providing a bespoke XidImple that packs in more data than a Xid typically allows. Thanks for the suggestion.

71. Re: Remote txinflow: XID changes

dmlloyd Oct 25, 2011 12:18 AM (in response to tomjenkinson)

Tom Jenkinson wrote:

Just to round off the discussion:

Basically, I determined that the subordinate node name is not strictly a "node identifier" in the same sense as we are requiring it for a JTA gtrid value (which must be unique per object store). Instead it need only be unique per propagated transaction, and the server does not need it to be the same value for different transactions flowed through it, nor indeed must the value be globally unique (i.e. different servers can reuse the same id so long as it is not used by a different server within the scope of the same transaction).

All that does need to happen is that the transport dynamically assigns a unique integer to each node participating in a transaction (e.g. a counter starting at 1), and that a recoverable mapping file is stored *by the transport* so that for any given transaction a server (keyed by the traditional String node identifier) knows what its own (now dynamic) "subordinate node name" was for the transaction, thereby assisting it with recovery.

To that end I have only had to change:

XID {
int:formatId
byte[]:gtrid { UID:sequence, String:nodeName }
byte[]:bqual {UID:sequence, String:EIS name }
}

To:
XID {
int:formatId
byte[]:gtrid { UID:sequence, String:nodeName }
byte[]:bqual {UID:sequence, int:subordinateNodeName, String:EIS name }
}

(i.e. added that single integer into the bqual).

OK here's my first wave of dumb questions:

Why do you need a UID in the bqual?
EIS name in this case is the local node's unique name for the subordinate it's planning on talking to, right?
If so why do we also need subordinateNodeName? Just to create a way to differentiate between the "first" inflow versus subsequent?

Tom Jenkinson wrote:

I do have examples and tests for all of this: https://svn.jboss.org/repos/labs/labs/jbosstm/branches/JBOSSTS_4_15_0_Final/atsintegration/tests and https://svn.jboss.org/repos/labs/labs/jbosstm/branches/JBOSSTS_4_15_0_Final/atsintegration/examples.

In particular, the example should prove to be extremely useful when developing the transport required for this as it shows and describes the points where relevant transaction data must be reliably persisted.

As I say, by reverting to a String node identifier, the solution is even closer to the original mechanism, so this new approach should have no impact on existing JTA/JCA work. The only difference now is that the space available to EIS names is one byte shorter (from 36 to 35 bytes).

OK I'm trying to make sense of this code now w.r.t. our stuff. More dumb questions:

It really looks like you're "handing off" a transaction between nodes so that only one has the transaction at a time, but in the real world many nodes in the graph may be performing useful work at the same time (potentially even from multiple threads) under the same (global) transaction. How does this impact this scheme (from a TM perspective, not a transport perspective)?
You talk about not committing at the root node... should we be tracking the transactions by gtid and detecting the circular flow at the transport layer? I assume that if that is the case, we'd use the original XID instead of the new alias and we'd treat commit requests of the alias XID as a "no-op" sort of thing.
It seems to me that we'd only have to persist the XID and enough information to revive a connection to the node corresponding to the XID, correct?

In the meantime I'm doing more reading, but the example code doesn't quite appear to be an accurate analog of our environment.

72. Re: Remote txinflow: XID changes

tomjenkinson Oct 25, 2011 3:34 AM (in response to dmlloyd)

Hi David,

Sorry the example doesn't map directly to your use case. I condensed it from the test suite and it is really more guidance on what tasks are required from the transport, so hopefully it has proved at least useful in that regard. Indeed, as a talking point for questions at least, it seems to have helped.

NOTE: My answers are below but they are caveated by the statement that they are broadly based on the current approach.

David Lloyd wrote:

Tom Jenkinson wrote:

Just to round off the discussion:

Basically, I determined that the subordinate node name is not strictly a "node identifier" in the same sense as we are requiring it for a JTA gtrid value (which must be unique per object store). Instead it need only be unique per propagated transaction, and the server does not need it to be the same value for different transactions flowed through it, nor indeed must the value be globally unique (i.e. different servers can reuse the same id so long as it is not used by a different server within the scope of the same transaction).

All that does need to happen is that the transport dynamically assigns a unique integer to each node participating in a transaction (e.g. a counter starting at 1), and that a recoverable mapping file is stored *by the transport* so that for any given transaction a server (keyed by the traditional String node identifier) knows what its own (now dynamic) "subordinate node name" was for the transaction, thereby assisting it with recovery.

To that end I have only had to change:

XID {
int:formatId
byte[]:gtrid { UID:sequence, String:nodeName }
byte[]:bqual {UID:sequence, String:EIS name }
}

To:
XID {
int:formatId
byte[]:gtrid { UID:sequence, String:nodeName }
byte[]:bqual {UID:sequence, int:subordinateNodeName, String:EIS name }
}

(i.e. added that single integer into the bqual).
OK here's my first wave of dumb questions:
Why do you need a UID in the bqual?
EIS name in this case is the local node's unique name for the subordinate it's planning on talking to, right?
If so why do we also need subordinateNodeName? Just to create a way to differentiate between the "first" inflow versus subsequent?

Here is the answer to these questions:

Interesting suggestion! It is indeed possible that for this type of XA resource we could look at creating a bespoke XID with the format:

XID {
int:formatId
byte[]:gtrid { UID:sequence, String:nodeName }
byte[]:bqual { String:subordinateNodeName, String:parentNodeName }
}

We know we can do that because we know that there is only going to be (by convention) a single proxy to the remote server for this node, therefore the branch is unique without the Uid.

TS is not really geared up to support generating XIDs differently per XA resource, but we can probably change that!

Not to say we shouldn't do that, but I was thinking about this overnight (well, a close derivative of it, based on ints again, sorry, but in my mind they are both tokens), and there is still an issue with storing the node identifiers in the XID, which the transport has to work around by persisting them.

Basically, if you did what you are saying (which is a neat idea), the proxy XA resource would be enlisted with the ID (assuming nodes named 1, 2, 3 etc.; Strings though):

ProxyXAResource - bqual{"2","1"}

Remote subordinate transaction - bqual{"3", "2"}

That means in recovery you can't ask the remote server for a list of XIDs it knows about, because it will return {"3","2"} but the local server doesn't know about 3,2; in this scenario it knows about 2,1.

Also, "normal" XIDs would need to be able to fit a String subordinate node name in them (for orphan detection). Meaning that we would definitely need to make space for a normal bqual:

byte[]:bqual {UID:sequence, String:EISname}

As Jonathan suggested we can basically do this by changing EISname to be an int leading to:

byte[]:bqual {UID:sequence, String:subordinatenodename, int:EISnameKey}

David Lloyd wrote:

Tom Jenkinson wrote:

I do have examples and tests for all of this: https://svn.jboss.org/repos/labs/labs/jbosstm/branches/JBOSSTS_4_15_0_Final/atsintegration/tests and https://svn.jboss.org/repos/labs/labs/jbosstm/branches/JBOSSTS_4_15_0_Final/atsintegration/examples.

In particular, the example should prove to be extremely useful when developing the transport required for this as it shows and describes the points where relevant transaction data must be reliably persisted.

As I say, by reverting to a String node identifier, the solution is even closer to the original mechanism, so this new approach should have no impact on existing JTA/JCA work. The only difference now is that the space available to EIS names is one byte shorter (from 36 to 35 bytes).
OK I'm trying to make sense of this code now w.r.t. our stuff. More dumb questions:
It really looks like you're "handing off" a transaction between nodes so that only one has the transaction at a time, but in the real world many nodes in the graph may be performing useful work at the same time (potentially even from multiple threads) under the same (global) transaction. How does this impact this scheme (from a TM perspective, not a transport perspective)?
You talk about not committing at the root node... should we be tracking the transactions by gtid and detecting the circular flow at the transport layer? I assume that if that is the case, we'd use the original XID instead of the new alias and we'd treat commit requests of the alias XID as a "no-op" sort of thing.
It seems to me that we'd only have to persist the XID and enough information to revive a connection to the node corresponding to the XID, correct?

In the meantime I'm doing more reading, but the example code doesn't quite appear to be an accurate analog of our environment.

In terms of these questions:

1. It shouldn't really impact things, except to say that access to *obtaining a reference or creating a new instance* on the subordinate transaction should be synchronized (in the example this is getAndResumeTransaction). This use case does add further legitimacy to registering the proxy *after* you return from the remote call.

Jonathan may have a different perspective on this though? But as I understood it multiple threads accessing the same transaction is fine in JBoss TS.

2. Yes you are broadly correct, I do actually demonstrate this in the example. Take a look at the operations (and usages of) LocalServer::storeRootTransaction LocalServer::removeRootTransaction and LocalServer::getAndResumeTransaction.

3. There are two things that need persisting. One is the information you indicated; you can see that this must be done in two places, once when you talk to the remote server (before enlisting an actual proxy) and once again when the proxy is enlisted and you have a real Xid. If we did use your approach above we would only need to do this once, as we can calculate the proxy's XID before it is actually enlisted. In the example this is done when the proxy is created (ServerImpl::generateProxyXAResource) and updated when the proxy is enlisted (I delayed it to ProxyXAResource::prepare; this is the persist that could be eliminated). That covers your point 3 above, but you also need to persist the dynamic subordinate node identifier at the moment.
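For point 1 above, the synchronization requirement can be sketched as an atomic "get or create" keyed by transaction. SubordinateRegistrySketch and SubordinateTx are hypothetical stand-ins (the example's real entry point is getAndResumeTransaction, whose implementation is not reproduced here); the sketch only shows that two threads arriving with the same transaction key end up with the same subordinate instance.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical registry: one subordinate transaction per inflowed global transaction. */
final class SubordinateRegistrySketch {

    /** Stand-in for the real subordinate transaction object. */
    static final class SubordinateTx {
        final String gtridKey;

        SubordinateTx(String gtridKey) {
            this.gtridKey = gtridKey;
        }
    }

    private final ConcurrentMap<String, SubordinateTx> active = new ConcurrentHashMap<>();
    final AtomicInteger created = new AtomicInteger(); // counts how many subordinates were created

    /** Atomically returns the existing subordinate for this transaction or creates it once. */
    SubordinateTx getOrCreate(String gtridKey) {
        return active.computeIfAbsent(gtridKey, key -> {
            created.incrementAndGet();
            return new SubordinateTx(key);
        });
    }

    /** Removes the subordinate once the transaction has completed. */
    void remove(String gtridKey) {
        active.remove(gtridKey);
    }
}
```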

Based on the discussions so far, the most appropriate changes would appear to be to the XID, to make special ones for proxy resources and to key EIS names as int so we have room for subordinate names for normal Xids. Making these changes would:

1. Impact existing users from a usability perspective that are already using the EIS name and expecting to read this from the persisted Xid somehow.

2. Require investigation to determine if customizing the bqual for a particular type of resource (say, one implementing a ProxyXAResource interface or similar) was even possible.

3. Still require the transport to make an additional log write per transaction, recording the list of servers it talks to, in order to recover transactions at them.

That said, I am due to go on paternity leave any time now for two weeks, so I would be reluctant to alter the code too much further now (e.g. keying the EIS name or bespoke XID creation), but I can certainly investigate both of those in a branch.

73. Re: Remote txinflow: XID changes

marklittle Oct 25, 2011 9:45 AM (in response to tomjenkinson)

Some quick notes and thoughts:

yes, multiple threads in the same TX are fine.
we can create different XID formats as long as we change the format id (check out how JTS does this with different types of interposition).
leave this until you come back from paternity - it's an EAP 6.1 thing.
if we are getting into circularities then we're moving back to a general distributed transaction protocol and we already agreed that that would require JTS.

74. Re: Remote txinflow: XID changes

marklittle Oct 25, 2011 9:48 AM (in response to tomjenkinson)

This concerns me greatly. We've spent years optimising TS. Presumed abort semantics help here obviously, but there's been a lot of other work. Let's not negate that!