All about the Java Stream collect operation

Let's start with a problem statement: “Given a collection of employees, find the average, maximum and minimum employee age by country.”

public static void avgMaxMinEmployeeAgeByCountry() {
    String AVG = "AVG";
    String MAX = "MAX";
    String MIN = "MIN";
    String SUM = "SUM";
    String COUNT = "COUNT";

    List<Employee> employees = Employee.EMPLOYEES; // list of all employees (POJOs)
    // Maps country name to the running sum, count, max and min age of its employees
    Map<String, Map<String, Double>> summary = new HashMap<>();

    for(Employee e: employees) {
        if(summary.containsKey(e.getCountry())){
            Map<String, Double> map = summary.get(e.getCountry());
            map.put(SUM, (map.get(SUM) + e.getAge()));
            map.put(COUNT, map.get(COUNT) +1);
            map.put(MAX, map.get(MAX) < e.getAge() ? e.getAge():map.get(MAX));
            map.put(MIN, map.get(MIN) > e.getAge() ? e.getAge():map.get(MIN));
        } else {
            Map<String, Double> map = new HashMap<>();
            map.put(AVG, Double.valueOf(e.getAge()));
            map.put(MAX,Double.valueOf(e.getAge()));
            map.put(MIN, Double.valueOf(e.getAge()));
            map.put(SUM, Double.valueOf(e.getAge()));
            map.put(COUNT, Double.valueOf(1));
            summary.put(e.getCountry(), map);
        }
    }
    //Printing result
    Set<String> keySet = summary.keySet();
    for(String country: keySet) {
        System.out.println(country);
        System.out.println("\t Average "+summary.get(country).get(SUM)/summary.get(country).get(COUNT));
        System.out.println("\t Max "+summary.get(country).get(MAX));
        System.out.println("\t MIN "+summary.get(country).get(MIN));
    }
}


Output:
US
    Average 31.8
    Max 45.0
    MIN 25.0
IND
    Average 30.0
    Max 30.0
    MIN 30.0

Before the Java 8 Stream API, we had to write this many lines of code for such a task. The Stream API changes the whole perspective on how we solve a problem like this: instead of the imperative style, it gives us the flexibility of declarative programming. We tell the machine what we want rather than giving step-by-step instructions for how to compute it. For the problem above, we do not need to spell out how to perform the group-by or how to calculate the min, max or average.

Let's see how we can solve the same problem using the Java 8 stream collect operation:

Map<String, DoubleSummaryStatistics> collect = Employee.EMPLOYEES
    .stream()
    .collect(Collectors.groupingBy(Employee::getCountry, Collectors.summarizingDouble(Employee::getAge)));

// Map.forEach hands us each key and its statistics directly, so no repeated lookups are needed
collect.forEach((country, stats) -> {
    System.out.println(country);
    System.out.println("\t Average - " + stats.getAverage());
    System.out.println("\t Max - " + stats.getMax());
    System.out.println("\t Min - " + stats.getMin());
});

Output:
IND
    Average - 30.0
    Max - 30.0
    Min - 30.0
US
    Average - 31.8
    Max - 45.0
    Min - 25.0

We can see that the number of lines of code drops to less than half. The API of interest here is ‘collect’, which is a terminal operation. The argument passed to collect is an object of type java.util.stream.Collector. What does a Collector object do? It essentially describes a recipe for accumulating the elements of a stream into a final result. The Collectors utility class provides many ready-made recipes for consolidating streams, such as:

  • transforming a stream to a list, set, etc.
  • grouping a stream by any attribute
  • partitioning
  • multi-level grouping
  • summarizing
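
As a rough illustration of the recipes listed above, here is a minimal sketch with one method per recipe. The nested Employee class, its getters and the sample data are assumptions made up for this example; they are not the Employee POJO or the data set used elsewhere in this post.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class CollectRecipes {

    // Hypothetical POJO for this sketch; the Employee class used in the post may differ.
    static class Employee {
        private final String name, country, city;
        private final int age;

        Employee(String name, String country, String city, int age) {
            this.name = name; this.country = country; this.city = city; this.age = age;
        }
        String getName() { return name; }
        String getCountry() { return country; }
        String getCity() { return city; }
        int getAge() { return age; }
    }

    // Made-up sample data, not the data behind the outputs shown in the post.
    static List<Employee> sample() {
        return Arrays.asList(
            new Employee("Ann", "US", "Boston", 25),
            new Employee("Bob", "US", "Austin", 45),
            new Employee("Cid", "US", "Boston", 31),
            new Employee("Dev", "IND", "Pune", 30));
    }

    // Transform the stream to a List.
    static List<String> names(List<Employee> employees) {
        return employees.stream().map(Employee::getName).collect(Collectors.toList());
    }

    // Transform the stream to a Set (duplicate countries collapse).
    static Set<String> countries(List<Employee> employees) {
        return employees.stream().map(Employee::getCountry).collect(Collectors.toSet());
    }

    // Group by a single attribute.
    static Map<String, List<Employee>> byCountry(List<Employee> employees) {
        return employees.stream().collect(Collectors.groupingBy(Employee::getCountry));
    }

    // Partition: a grouping with exactly two keys, true and false.
    static Map<Boolean, List<Employee>> olderThan30(List<Employee> employees) {
        return employees.stream().collect(Collectors.partitioningBy(e -> e.getAge() > 30));
    }

    // Multi-level grouping: country first, then city, counting each leaf group.
    static Map<String, Map<String, Long>> headcountByCountryAndCity(List<Employee> employees) {
        return employees.stream().collect(
            Collectors.groupingBy(Employee::getCountry,
                Collectors.groupingBy(Employee::getCity, Collectors.counting())));
    }

    // Summarize: count, sum, min, max and average gathered in one pass.
    static IntSummaryStatistics ageStats(List<Employee> employees) {
        return employees.stream().collect(Collectors.summarizingInt(Employee::getAge));
    }

    public static void main(String[] args) {
        List<Employee> employees = sample();
        System.out.println(names(employees));
        System.out.println(countries(employees));
        System.out.println(headcountByCountryAndCity(employees));
        System.out.println(ageStats(employees));
    }
}
```

Each method differs only in the Collector handed to collect, which is the point of the recipe idea: the stream pipeline stays the same and the collector decides the shape of the result.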

In the example above, we use Collectors.groupingBy to first group the stream of employees by country and then summarize each group of records to find the average, min and max. It's that simple!
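
To make the "recipe" idea a little more concrete, the sketch below hand-rolls a collector that behaves roughly like Collectors.summarizingDouble, using Collector.of with a supplier, an accumulator and a combiner, and then plugs it into groupingBy as the downstream collector. This is only an illustration of the moving parts, not how the JDK implements it; the summarizing helper, the Employee class and the sample data are all hypothetical.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class HandRolledSummary {

    // Hypothetical POJO for this sketch; the Employee class used in the post may differ.
    static class Employee {
        private final String country;
        private final int age;

        Employee(String country, int age) { this.country = country; this.age = age; }
        String getCountry() { return country; }
        int getAge() { return age; }
    }

    // Made-up sample data for illustration only.
    static List<Employee> sample() {
        return Arrays.asList(
            new Employee("US", 25),
            new Employee("US", 45),
            new Employee("IND", 30));
    }

    // Hypothetical helper: a collector assembled from its individual parts.
    static <T> Collector<T, DoubleSummaryStatistics, DoubleSummaryStatistics> summarizing(
            ToDoubleFunction<? super T> mapper) {
        return Collector.of(
            // supplier: creates an empty result container
            DoubleSummaryStatistics::new,
            // accumulator: folds one stream element into the container
            (stats, item) -> stats.accept(mapper.applyAsDouble(item)),
            // combiner: merges two partial containers (used by parallel streams)
            (left, right) -> { left.combine(right); return left; });
    }

    // groupingBy builds one container per country and feeds it that country's employees.
    static Map<String, DoubleSummaryStatistics> ageStatsByCountry(List<Employee> employees) {
        return employees.stream()
            .collect(Collectors.groupingBy(Employee::getCountry, summarizing(Employee::getAge)));
    }

    public static void main(String[] args) {
        ageStatsByCountry(sample()).forEach((country, stats) ->
            System.out.println(country + " -> " + stats));
    }
}
```

In everyday code the built-in Collectors.summarizingDouble is the better choice; the hand-written version is only meant to show what a downstream collector does for each group.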

I have gone through most of the “collect” offerings and applied them to typical grouping/summarizing problems. For example:

  • Employee count by country or city
  • Get the name of the employee with the highest salary in a given country
  • Partition employees by employment mode, by country
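
One possible way to approach the three problems above is sketched here. It is not the code from the repository linked in this post. The Employee class, its salary field and the boolean "permanent" flag standing in for employment mode are assumptions invented for this example, as is the sample data.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class EmployeeQueries {

    // Hypothetical POJO; "permanent" is an assumed stand-in for employment mode.
    static class Employee {
        private final String name, country;
        private final double salary;
        private final boolean permanent;

        Employee(String name, String country, double salary, boolean permanent) {
            this.name = name; this.country = country;
            this.salary = salary; this.permanent = permanent;
        }
        String getName() { return name; }
        String getCountry() { return country; }
        double getSalary() { return salary; }
        boolean isPermanent() { return permanent; }
    }

    // Made-up sample data for illustration only.
    static List<Employee> sample() {
        return Arrays.asList(
            new Employee("Ann", "US", 100.0, true),
            new Employee("Bob", "US", 150.0, false),
            new Employee("Dev", "IND", 90.0, true));
    }

    // Employee count by country: group, then count each group.
    static Map<String, Long> countByCountry(List<Employee> employees) {
        return employees.stream()
            .collect(Collectors.groupingBy(Employee::getCountry, Collectors.counting()));
    }

    // Name of the highest-paid employee in a country; empty if the country has no employees.
    static Optional<String> topEarnerName(List<Employee> employees, String country) {
        return employees.stream()
            .filter(e -> e.getCountry().equals(country))
            .collect(Collectors.maxBy(Comparator.comparingDouble(Employee::getSalary)))
            .map(Employee::getName);
    }

    // Per country, split employee names into permanent (true) and non-permanent (false).
    static Map<String, Map<Boolean, List<String>>> employmentModeByCountry(List<Employee> employees) {
        return employees.stream().collect(
            Collectors.groupingBy(Employee::getCountry,
                Collectors.partitioningBy(Employee::isPermanent,
                    Collectors.mapping(Employee::getName, Collectors.toList()))));
    }

    public static void main(String[] args) {
        List<Employee> employees = sample();
        System.out.println(countByCountry(employees));
        System.out.println(topEarnerName(employees, "US"));
        System.out.println(employmentModeByCountry(employees));
    }
}
```

Note that partitioningBy always produces both the true and the false key, so a country with only permanent employees still gets an empty list under false.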

You can find the complete code sample in my GitHub repository: https://github.com/yogeshdevatraj/java8/blob/master/StreamAPIPrj/src/com/stream/EmployeeFactory.java
