How to calculate the average value of the items in a cache?
mariusneo Feb 8, 2016 2:48 AM
Hello,
I was trying to use streams (instead of the obsolete map-reduce model) to calculate the average temperature for each country in the weatherapp Infinispan sample code:
https://github.com/infinispan/infinispan-embedded-tutorial/compare/step-9...step-10
To calculate the average, I want to use a double summarizer (to get the sum as well as the count of the elements):
Source code:
{code}
public void averageCountryTemperatures() {
    System.out.println("---------- Average countries temperature ---------");
    Map<String, WeatherDataStatistics> resultMap = cache.values().stream()
        .collect(CacheCollectors.serializableCollector(() -> groupingBy(LocationWeather::getCountry,
            collectingAndThen(
                summarizingDouble(locationWeather -> (double) locationWeather.getTemperature()),
                dss -> new WeatherDataStatistics(dss.getAverage(), dss.getSum())
            )))
        );
    System.out.printf("Avg of the countries temperatures is : %s \n", resultMap);
}
{code}
(WeatherDataStatistics is serializable; DoubleSummaryStatistics, on the other hand, is not.)
Unfortunately, the following stack trace appears, which is not very helpful to me as an Infinispan user:
{code}
ERROR: ISPN000073: Unexpected error while replicating
org.infinispan.commons.marshall.NotSerializableException: java.util.DoubleSummaryStatistics
Caused by: an exception which occurred:
in object java.util.DoubleSummaryStatistics@590d97f5
in object java.util.HashMap@8b788f28
in object org.infinispan.stream.impl.StreamResponseCommand@5ffec2a6
Feb 08, 2016 8:41:08 AM org.infinispan.remoting.inboundhandler.BasePerCacheInboundInvocationHandler exceptionHandlingCommand
WARN: ISPN000071: Caught exception when handling command StreamRequestCommand{cacheName='___defaultcache'}
org.infinispan.commons.marshall.NotSerializableException: java.util.DoubleSummaryStatistics
Caused by: an exception which occurred:
in object java.util.DoubleSummaryStatistics@590d97f5
in object java.util.HashMap@8b788f28
in object org.infinispan.stream.impl.StreamResponseCommand@5ffec2a6
{code}
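The trace shows a HashMap of DoubleSummaryStatistics values being marshalled inside StreamResponseCommand, so one direction I could imagine is a small Serializable accumulator used in place of DoubleSummaryStatistics. Below is a minimal sketch of that idea on a plain java.util.stream. Reading, SumCount and AverageSketch are hypothetical names (Reading stands in for LocationWeather, and the float temperature is an assumption), and I have not verified this against an Infinispan cache stream, where the collector would presumably still have to be wrapped in CacheCollectors.serializableCollector.

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

// Hypothetical stand-in for LocationWeather: only the two fields used here.
class Reading implements Serializable {
    private static final long serialVersionUID = 1L;
    final String country;
    final float temperature; // assumed to be a float, as the cast in the original code suggests

    Reading(String country, float temperature) {
        this.country = country;
        this.temperature = temperature;
    }
}

// Hypothetical Serializable accumulator: keeps sum and count so that
// partial results can be merged before the average is derived.
class SumCount implements Serializable {
    private static final long serialVersionUID = 1L;
    double sum;
    long count;

    void accept(double value) {
        sum += value;
        count++;
    }

    SumCount combine(SumCount other) {
        sum += other.sum;
        count += other.count;
        return this;
    }

    double average() {
        return count == 0 ? 0.0 : sum / count;
    }
}

class AverageSketch {
    // Collector with no finisher: the intermediate container is the result,
    // so every value placed in the grouping map is a Serializable SumCount.
    static Collector<Reading, SumCount, SumCount> sumCount() {
        return Collector.of(
                SumCount::new,
                (acc, reading) -> acc.accept(reading.temperature),
                SumCount::combine);
    }

    static Map<String, SumCount> statsByCountry(List<Reading> readings) {
        return readings.stream()
                .collect(Collectors.groupingBy(reading -> reading.country, sumCount()));
    }

    public static void main(String[] args) {
        List<Reading> readings = Arrays.asList(
                new Reading("RO", 10.0f),
                new Reading("RO", 20.0f),
                new Reading("DE", 5.0f));
        statsByCountry(readings).forEach((country, stats) ->
                System.out.println(country + " average: " + stats.average()));
    }
}
```

The average is only computed at the very end from the merged sum and count, because per-node averages cannot be combined correctly without their counts.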
Since streams can be used in so many different cases, it would make a lot of sense to provide a solid set of code samples showing how to use them.
Two questions arise here:
- How can the problem of calculating averages in the cache be solved via streams?
- It seems to me (I'm a newbie with Infinispan) that the MapReduce utilities still make sense in Infinispan, because from a developer's perspective it is much more straightforward to calculate the average with them. Isn't the deprecation of the MapReduce utilities in Infinispan unjustified?
Thanks in advance for the support,
Marius.