6 Replies Latest reply on Dec 20, 2011 11:38 AM by nfilotto

What is the best way to launch a distributed update?

nfilotto Dec 19, 2011 2:16 PM

Hi all,

I use ISPN 5.1.0.CR1 in distribution mode and I would like to be able to launch updates on each node. In other words, I have entries in my cache that I may need to change in some special use cases. My original idea was to ask each node to apply the change locally using a MapReduceTask with eagerLockSingleNode set to true: the MapReduceTask only provides a key to the mapper if the local node is the main owner, so with eagerLockSingleNode set to true the locking would be local too.

Unfortunately, I did not find any way to retrieve, from the Mapper and/or the Reducer, the cache instance on which the MapReduceTask has been launched, so it looks like only read operations are allowed so far. Is this an intended limitation? What do you think about this approach? Does it make sense? If it doesn't, what is the best way to launch a distributed update?
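To make the idea concrete, here is a minimal plain-Java simulation of "each node updates only the entries it primarily owns". It does not use the Infinispan API at all: the `Cluster` class, the hash-based ownership rule and the "next node is the backup" rule are assumptions made up for this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

/** Plain-Java simulation; none of these types are Infinispan classes. */
final class OwnerLocalUpdateSketch {

    /** Hypothetical in-memory "cluster": one map per node, two owners per key. */
    static final class Cluster {
        private final List<Map<String, Integer>> stores = new ArrayList<>();

        Cluster(int nodeCount) {
            for (int i = 0; i < nodeCount; i++) {
                stores.add(new HashMap<>());
            }
        }

        /** Assumed ownership rule for the demo: primary owner chosen by key hash. */
        int primaryOwner(String key) {
            return Math.floorMod(key.hashCode(), stores.size());
        }

        /** Writes to the primary owner and to the next node, which acts as backup. */
        void put(String key, int value) {
            int primary = primaryOwner(key);
            stores.get(primary).put(key, value);
            stores.get((primary + 1) % stores.size()).put(key, value);
        }

        /** Reads the value from the primary owner's store. */
        int get(String key) {
            return stores.get(primaryOwner(key)).get(key);
        }

        /** Runs on one node: scans its local entries, rewrites only those it primarily owns. */
        int updateLocally(int nodeId, Predicate<Integer> needsChange, UnaryOperator<Integer> change) {
            int updated = 0;
            // Copy the key set so that writes cannot disturb the iteration.
            for (String key : new ArrayList<>(stores.get(nodeId).keySet())) {
                int current = stores.get(nodeId).get(key);
                if (primaryOwner(key) == nodeId && needsChange.test(current)) {
                    put(key, change.apply(current));
                    updated++;
                }
            }
            return updated;
        }
    }

    public static void main(String[] args) {
        Cluster cluster = new Cluster(3);
        for (int i = 1; i <= 5; i++) {
            cluster.put("k" + i, i);
        }
        int total = 0;
        for (int node = 0; node < 3; node++) {
            total += cluster.updateLocally(node, v -> v % 2 == 0, v -> v * 10);
        }
        System.out.println("entries updated: " + total + ", k2 = " + cluster.get("k2"));
    }
}
```

Because every key has exactly one primary owner, each matching entry is rewritten once even though backup copies are also visible to the other nodes' scans.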

Thank you in advance,

BR,

Nicolas

1. Re: What is the best way to launch a distributed update?

sannegrinovero Dec 19, 2011 1:58 PM (in response to nfilotto)

Hi Nicolas,
yes indeed, we have only thought about read operations, which accumulate some result and return it to the (remote) invoker.

Your approach is definitely interesting and I think we could make it work in a transaction as well, so that each node prepares its updates but they are committed only if all nodes succeed. This is not implemented though.
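As a rough illustration of "prepare everywhere, commit only if all nodes succeed", here is a plain-Java sketch. It is not Infinispan code and not a real transaction manager: the `Node` class, the "values must not go negative" validation rule and the method names are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration of prepare-then-commit; not a real transaction manager. */
final class PrepareThenCommitSketch {

    /** Hypothetical node that can stage an update before applying it. */
    static final class Node {
        final Map<String, Integer> store = new HashMap<>();
        private Map<String, Integer> pending;

        /** Stages "add delta to every value"; refuses if a value would go negative (demo rule). */
        boolean prepare(int delta) {
            Map<String, Integer> staged = new HashMap<>();
            for (Map.Entry<String, Integer> e : store.entrySet()) {
                int next = e.getValue() + delta;
                if (next < 0) {
                    return false;
                }
                staged.put(e.getKey(), next);
            }
            pending = staged;
            return true;
        }

        /** Applies the staged values, if any. */
        void commit() {
            if (pending != null) {
                store.putAll(pending);
                pending = null;
            }
        }

        /** Discards the staged values. */
        void rollback() {
            pending = null;
        }
    }

    /** Prepares on every node, then commits everywhere or rolls back everywhere. */
    static boolean updateAll(List<Node> nodes, int delta) {
        boolean allPrepared = true;
        for (Node node : nodes) {
            if (!node.prepare(delta)) {
                allPrepared = false;
                break;
            }
        }
        for (Node node : nodes) {
            if (allPrepared) {
                node.commit();
            } else {
                node.rollback();
            }
        }
        return allPrepared;
    }

    public static void main(String[] args) {
        Node a = new Node();
        Node b = new Node();
        a.store.put("x", 5);
        b.store.put("y", 1);
        System.out.println("delta -2 applied: " + updateAll(Arrays.asList(a, b), -2));
        System.out.println("delta +3 applied: " + updateAll(Arrays.asList(a, b), 3));
    }
}
```

The point of the two loops is that no node's store changes until every node has agreed it can apply the update.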

Don't you think this should be provided by the DistributedExecutorService rather than the Map/Reduce API?
It makes sense for the DistributedExecutorService to be provided with some context, for example the cache and maybe more, as you suggest. Ideally the remoted Callable should receive services by injection via CDI.. would that work for you?
2. Re: What is the best way to launch a distributed update?

nfilotto Dec 19, 2011 2:09 PM (in response to sannegrinovero)

Hi Sanne,

Thanks for your quick answer; see my replies below.
>> Don't you think this should be provided by the DistributedExecutorService rather than the Map/Reduce API?
<< What I like about the Map/Reduce approach is that the local node is the owner of the key/value pairs given to the mapper. With the DistributedExecutorService, it seems that I would have to reimplement the filter in MapReduceCommand.perform that keeps only the key/value pairs owned by the local node, which I would like to avoid as it is internal logic. Moreover, in products like Hadoop and HBase, we can do RW operations through Map/Reduce (using the OutputFormat), so why not have it in ISPN too? It would be awesome, don't you agree?

>> receive services by injection via CDI.. would that work for you?
<< That would be perfect
3. Re: What is the best way to launch a distributed update?

sannegrinovero Dec 20, 2011 8:24 AM (in response to nfilotto)

Totally agree.

For 5.1 : https://issues.jboss.org/browse/ISPN-1634
For 6.0 (?) : https://issues.jboss.org/browse/ISPN-1636

These are optimistic targets - you're very welcome to help define the API or even contribute code and/or tests.
4. Re: What is the best way to launch a distributed update?

vblagojevic Dec 20, 2011 8:38 AM (in response to nfilotto)

Nicolas, if you use the DistributedExecutorService submitEverywhere and submit methods with input keys, then the Callable task(s) will be executed only on nodes where the input keys are local!
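The routing idea behind key-targeted submission can be sketched in plain Java as follows. This is only an illustration of grouping input keys by owning node so that nodes without any input key get no task. It does not call DistributedExecutorService, and the hash-based `ownerOf` rule is an assumption of the sketch, not Infinispan's actual consistent hash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration of routing a task by input keys; not the Infinispan API. */
final class KeyRoutedTaskSketch {

    /** Assumed ownership rule for the demo: owner chosen by key hash. */
    static int ownerOf(String key, int nodeCount) {
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    /** Groups the input keys by owning node; nodes that own none of the keys do not appear. */
    static Map<Integer, List<String>> route(int nodeCount, String... inputKeys) {
        Map<Integer, List<String>> perNode = new TreeMap<>();
        for (String key : inputKeys) {
            perNode.computeIfAbsent(ownerOf(key, nodeCount), id -> new ArrayList<>()).add(key);
        }
        return perNode;
    }

    public static void main(String[] args) {
        // Each map entry stands for one task sent to one node, with that node's local keys.
        for (Map.Entry<Integer, List<String>> task : route(4, "a", "e", "b").entrySet()) {
            System.out.println("node " + task.getKey() + " handles " + task.getValue());
        }
    }
}
```

Note that this style of routing requires knowing the keys up front.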
5. Re: What is the best way to launch a distributed update?

nfilotto Dec 20, 2011 10:28 AM (in response to vblagojevic)

Thanks Vladimir for your remark. I had not realized that, since I only looked at what best matches my use case, which does not involve providing a set of keys. In my use case I don't know which entries to modify and I have no idea how many will be modified: I need to iterate over all the keys to find the ones to change. That's why the Map/Reduce approach with access to the cache would be perfect for me.
6. Re: What is the best way to launch a distributed update?

nfilotto Dec 20, 2011 11:38 AM (in response to sannegrinovero)

@Sanne IMHO, if you don't want to modify the API now (which makes sense as ISPN 5.1 is already a CR), it would be better to do ISPN-1636 directly for ISPN 5.1 and skip ISPN-1634, don't you agree?

Moreover, maybe I'm naive, but it does not look so hard to implement. We would need to pass the ComponentRegistry to the init method of the class MapReduceCommand (which also means that CommandsFactoryImpl and MapReduceTask must be modified to add this new parameter), then call componentRegistry.wireDependencies(object) on the mapper and the reducer in the perform method of MapReduceCommand. Don't you agree?
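To show what "wiring dependencies into the mapper" could look like in principle, here is a self-contained plain-Java sketch using reflection. The `Registry`, `InjectHere`, `Store` and `UpperCasingMapper` names are hypothetical and exist only in this example. It makes no claim about how ComponentRegistry.wireDependencies is actually implemented.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

/** Plain-Java illustration of setter injection; not Infinispan's ComponentRegistry. */
final class WiringSketch {

    /** Hypothetical marker for setter-style injection points. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface InjectHere {}

    /** Hypothetical writable store, standing in for the cache a mapper would need. */
    static final class Store {
        final Map<String, String> data = new HashMap<>();
    }

    /** Hypothetical registry that hands registered components to annotated setters. */
    static final class Registry {
        private final Map<Class<?>, Object> components = new HashMap<>();

        <T> void register(Class<T> type, T instance) {
            components.put(type, instance);
        }

        /** Invokes every one-argument @InjectHere method whose parameter type is registered. */
        void wire(Object target) {
            for (Method m : target.getClass().getDeclaredMethods()) {
                if (!m.isAnnotationPresent(InjectHere.class) || m.getParameterCount() != 1) {
                    continue;
                }
                Object dependency = components.get(m.getParameterTypes()[0]);
                if (dependency == null) {
                    continue;
                }
                try {
                    m.setAccessible(true);
                    m.invoke(target, dependency);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Could not wire " + m.getName(), e);
                }
            }
        }
    }

    /** Mapper-like object that can write once its store has been injected. */
    static final class UpperCasingMapper {
        private Store store;

        @InjectHere
        void setStore(Store store) {
            this.store = store;
        }

        void map(String key, String value) {
            store.data.put(key, value.toUpperCase());
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        Store store = new Store();
        registry.register(Store.class, store);

        UpperCasingMapper mapper = new UpperCasingMapper();
        registry.wire(mapper); // done by the framework before map() is first called
        mapper.map("k", "v");
        System.out.println(store.data);
    }
}
```

The mapper itself never looks up the store; whoever runs it is responsible for calling `wire` first, which is the role the perform method would play in the proposal above.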