What are collectors in Java Stream API?

In the Java Stream API, a Collector describes how the elements of a stream are accumulated into a final result, such as a List, a Map, or a single summary value.

It is used in conjunction with the collect method of the Stream interface. The collect method allows you to accumulate the elements of the stream into a summary result.

Here’s an example of how you can use a Collector:

package org.kodejava.stream;

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CollectToList {
    public static void main(String[] args) {
        List<String> stringList = Arrays.asList("A", "AA", "AAA", "B", "BB", "BBB");
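        // Keep only the strings that start with "A" and gather them into a List.
        List<String> collectedList = stringList.stream()
                .filter(s -> s.startsWith("A"))
                .collect(Collectors.toList());

        System.out.println(collectedList);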
    }
}

In the above example, the Collector used is Collectors.toList(), which will accumulate the stream’s elements into a List.

java.util.stream.Collectors is a utility class that contains various methods for creating common kinds of Collectors.

A Collector can perform transformations on the input elements, accumulation of processed input elements into a container, and combining of two result containers. In fact, Collector is extremely flexible, and you can supply Collector with your own functions for these purposes if you need to.

Collector is an interface in the java.util.stream package. It is passed to the collect() terminal operation, which consumes the elements of a stream and stores them in a collection or another kind of result container. Its core methods are:

public interface Collector<T, A, R> {
    Supplier<A> supplier();
    BiConsumer<A, T> accumulator();
    BinaryOperator<A> combiner();
    Function<A, R> finisher();
    Set<Characteristics> characteristics();
}

Each Collector contains four functions: supplier(), accumulator(), combiner(), and finisher(), and a characteristics set which provides hints for the implementation to optimize processing.

  1. Supplier: It creates a new mutable result container of type A. In Collector<T, A, R>, T is the type of the items in the stream to be collected, A is the type of the mutable accumulation container, and R is the type of the final result.
  2. Accumulator: It incorporates an additional input element into a result container.
  3. Combiner: It combines two result containers into one. This is used in parallel processing.
  4. Finisher: It performs the final transformation from the intermediate accumulation type A to the final result type R.
  5. Characteristics: It returns a Set of Collector.Characteristics indicating the characteristics of this Collector. This can be CONCURRENT, UNORDERED or IDENTITY_FINISH.

Here’s an example demonstrating how to create a custom collector:

package org.kodejava.stream;

import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Collector;

public class CustomCollector {
    public static void main(String[] args) {
        Collector<String, ?, LinkedList<String>> toLinkedList =
                Collector.of(
                        LinkedList::new,              // The Supplier
                        LinkedList::add,              // The Accumulator
                        (left, right) -> {            // The Combiner
                            left.addAll(right);
                            return left;
                        },
                        Collector.Characteristics.IDENTITY_FINISH
                );

        List<String> strings = Arrays.asList("a", "b", "c", "d");
        LinkedList<String> collectedStrings = strings.stream()
                .collect(toLinkedList);

        System.out.println(collectedStrings);
    }
}
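
The collector above is marked IDENTITY_FINISH, so it needs no finisher. As a minimal sketch of the four-function form, the following collector accumulates into an ArrayList and uses the finisher to wrap the result in an unmodifiable view. The class name FinisherCollector and the helper toReadOnlyList() are made up for this illustration and are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

class FinisherCollector {
    // Hypothetical helper: collects into an ArrayList, then wraps it read-only.
    static <T> Collector<T, List<T>, List<T>> toReadOnlyList() {
        return Collector.of(
                ArrayList::new,                 // The Supplier
                List::add,                      // The Accumulator
                (left, right) -> {              // The Combiner
                    left.addAll(right);
                    return left;
                },
                Collections::unmodifiableList   // The Finisher
        );
    }

    public static void main(String[] args) {
        List<String> result = Stream.of("a", "b", "c")
                .collect(toReadOnlyList());
        System.out.println(result);
    }
}
```

Because a finisher is supplied and no characteristics are passed, the stream implementation always applies the finisher before returning the result, so callers only ever see the read-only view.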

In the built-in java.util.stream.Collectors class, there are various static methods which return Collector instances for common use cases, such as toList(), toSet(), joining(), groupingBy(), partitioningBy(), and others.
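
As a rough illustration of a few of these built-in collectors, the sketch below applies joining(), groupingBy(), and partitioningBy() to the same strings used in the first example. The class and method names are hypothetical and chosen only for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class BuiltInCollectors {
    static final List<String> WORDS = List.of("A", "AA", "AAA", "B", "BB", "BBB");

    // Concatenates all elements with a delimiter, prefix, and suffix.
    static String joined() {
        return WORDS.stream()
                .collect(Collectors.joining(", ", "[", "]"));
    }

    // Groups the strings by their length.
    static Map<Integer, List<String>> byLength() {
        return WORDS.stream()
                .collect(Collectors.groupingBy(String::length));
    }

    // Splits the strings into those that start with "A" and those that do not.
    static Map<Boolean, List<String>> byPrefix() {
        return WORDS.stream()
                .collect(Collectors.partitioningBy(s -> s.startsWith("A")));
    }

    public static void main(String[] args) {
        System.out.println(joined());
        System.out.println(byLength());
        System.out.println(byPrefix());
    }
}
```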

How do I use sorted method in Java Stream API?

In the Java Stream API, the sorted() method is used to sort elements in a stream. The sorted() method returns a stream consisting of the elements of the original stream, sorted according to natural order. If the elements of the stream are not Comparable, a java.lang.ClassCastException may be thrown when the terminal operation is executed.

Take a look at this example:

package org.kodejava.stream;

import java.util.stream.Stream;

public class SortedString {
    public static void main(String[] args) {
        Stream<String> stream = Stream.of("d", "a", "b", "c", "e");
        stream.sorted().forEach(System.out::println);
    }
}

In this code, we create a stream of String objects and sort it using the sorted() operation. The forEach method is a terminal operation that processes the sorted stream.
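
To see the ClassCastException mentioned earlier, the following sketch sorts a stream of objects that do not implement Comparable. The Point class is a hypothetical type invented for this example. Note that calling sorted() by itself does not fail; the exception only surfaces once the terminal operation runs.

```java
import java.util.stream.Stream;

class SortedNotComparable {
    // Hypothetical class that deliberately does not implement Comparable.
    static class Point {
        final int x;

        Point(int x) {
            this.x = x;
        }
    }

    static boolean sortingFails() {
        // Building the pipeline does not throw, because sorted() is lazy.
        Stream<Point> sorted = Stream.of(new Point(2), new Point(1)).sorted();
        try {
            // The terminal operation triggers the actual sort.
            sorted.forEach(p -> { });
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("ClassCastException thrown: " + sortingFails());
    }
}
```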

If you would like to sort objects of a custom class, you may need to supply your own comparator:

package org.kodejava.stream;

import java.util.Comparator;
import java.util.stream.Stream;

public class SortedCustomComparator {

    public static void main(String[] args) {
        Stream<User> usersStream = Stream.of(
                new User("John", 30),
                new User("Rosa", 25),
                new User("Adam", 23));

        usersStream
                .sorted(Comparator.comparing(User::getAge))
                .forEach(System.out::println);
    }

    static class User {
        String name;
        int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() {
            return name;
        }

        int getAge() {
            return age;
        }

        @Override
        public String toString() {
            return "User{" + "name='" + name + '\'' + ", age=" + age + '}';
        }
    }
}

In this case, the sorted() method takes a Comparator argument, which is created by passing the method reference User::getAge to Comparator.comparing(). This comparator orders User objects by age, in ascending order.
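
Comparators can also be combined. As a sketch under the assumption that you want the oldest users first, with ties broken by name, the example below chains reversed() and thenComparing(). The User class here is a simplified stand-in for the one above, and the extra user "Beth" is made-up data added to show the tie-break.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class SortedChainedComparator {
    // Simplified stand-in for the User class used in the example above.
    static class User {
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() {
            return name;
        }

        int getAge() {
            return age;
        }
    }

    // Sorts by age in descending order, then by name for users of equal age.
    static List<String> namesOldestFirst(List<User> users) {
        return users.stream()
                .sorted(Comparator.comparingInt(User::getAge).reversed()
                        .thenComparing(User::getName))
                .map(User::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<User> users = List.of(
                new User("John", 30),
                new User("Rosa", 25),
                new User("Adam", 23),
                new User("Beth", 25));

        System.out.println(namesOldestFirst(users)); // [John, Beth, Rosa, Adam]
    }
}
```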

In the Java Stream API, you can use the sorted() and limit() methods together, but the order in which you call them affects both the result and the performance. The sorted() method sorts all the elements in the stream, whereas limit(n) truncates the stream to at most n elements.

If you call limit() before sorted(), like:

stream.limit(10).sorted()

The operation will only sort the first 10 elements from the stream.

But if you call sorted() before limit(), like:

stream.sorted().limit(10)

The operation will sort the entire stream, which may be much larger and more time-consuming, and then cut down the result to only keep the first 10 items.

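So, if your task is to ‘find the smallest (or largest) n elements’, you must sort the stream first and then limit it; limiting first would pick from only the first n elements and could give a wrong answer. If you want to ‘sort the first n elements’, limit the stream first and then sort it, which is also cheaper because only n elements are sorted.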
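
The difference is easiest to see on a small list. The following sketch runs both orderings on the same made-up numbers; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

class SortedLimitOrder {
    static final List<Integer> NUMBERS = List.of(9, 4, 7, 1, 8, 2);

    // Sorts everything, then keeps the n smallest values.
    static List<Integer> smallest(int n) {
        return NUMBERS.stream()
                .sorted()
                .limit(n)
                .collect(Collectors.toList());
    }

    // Keeps the first n values in encounter order, then sorts only those.
    static List<Integer> firstSorted(int n) {
        return NUMBERS.stream()
                .limit(n)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(smallest(3));    // [1, 2, 4]
        System.out.println(firstSorted(3)); // [4, 7, 9]
    }
}
```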

What is the peek method in Java Stream API and how to use it?

The peek method in Java’s Stream API is an intermediate operation. It lets you inspect the elements of a stream as they flow past a given point in the pipeline. The API documentation notes that peek exists mainly to support debugging, so it should not be used for essential program logic. In a parallel stream, the action may be called at whatever time and in whatever thread an element becomes available, so the order of its output is not predictable.

The key point of the peek operation is that it is not a terminal operation, so on its own it does not trigger any processing. Instead, it sits inside the operation chain and runs its action on each element as that element is consumed by the rest of the pipeline.

Here’s a simple way of using it with Java:

package org.kodejava.util;

import java.util.stream.Stream;

public class PeekMethodStream {
    public static void main(String[] args) {
        Stream.of(1, 2, 3, 4, 5)
                .peek(i -> System.out.println("Number: " + i))
                .map(i -> i * i)
                .forEach(i -> System.out.println("Squared: " + i));
    }
}

Output:

Number: 1
Squared: 1
Number: 2
Squared: 4
Number: 3
Squared: 9
Number: 4
Squared: 16
Number: 5
Squared: 25

In this code snippet:

  • We create a stream with Stream.of(1, 2, 3, 4, 5).
  • Then we use peek to print each element in its current state: “Number: 1”, “Number: 2”, etc.
  • After that, we use map to square each element.
  • Finally, we use forEach to print the squared numbers: “Squared: 1”, “Squared: 4”, etc.

Remember, use peek carefully and preferably only for debugging purposes.
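
Because peek is an intermediate operation, its action never runs unless a terminal operation follows. The sketch below, with hypothetical class and method names, records the elements that peek sees with and without a terminal operation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class PeekIsLazy {
    // No terminal operation: the pipeline is only described, never executed.
    static List<Integer> seenWithoutTerminal() {
        List<Integer> seen = new ArrayList<>();
        Stream<Integer> pipeline = Stream.of(1, 2, 3).peek(seen::add);
        return seen;
    }

    // forEach is a terminal operation, so peek runs once per element.
    static List<Integer> seenWithTerminal() {
        List<Integer> seen = new ArrayList<>();
        Stream.of(1, 2, 3)
                .peek(seen::add)
                .forEach(i -> { });
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(seenWithoutTerminal()); // []
        System.out.println(seenWithTerminal());    // [1, 2, 3]
    }
}
```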

How do I use map, filter, reduce in Java Stream API?

The map(), filter(), and reduce() methods are key operations of the Java Stream API, which processes collections of objects in a functional programming style. Together they cover the most common ways of transforming, selecting, and combining data.

  • map: The map() function is used to transform one type of Stream to another. It applies a function to each element of the Stream and then returns the function’s output as a new Stream. The number of input and output elements is the same, but the type or value of the elements may change.

Here’s an example:

package org.kodejava.basic;

import java.util.Arrays;
import java.util.List;

public class MapToUpperCase {
    public static void main(String[] args) {
        List<String> myList = Arrays.asList("a1", "a2", "b1", "c2", "c1");
        myList.stream()
                .map(String::toUpperCase)
                .sorted()
                .forEach(System.out::println);
    }
}

Output:

A1
A2
B1
C1
C2

Here is another example that uses map() to convert a list of Strings to a list of their lengths:

package org.kodejava.basic;

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MapStringToLength {
    public static void main(String[] args) {
        List<String> words = Arrays.asList("Java", "Stream", "API");
        List<Integer> lengths = words
                .stream()
                .map(String::length)
                .collect(Collectors.toList());

        System.out.println("Lengths = " + lengths);
    }
}

Output:

Lengths = [4, 6, 3]

  • filter: The filter() function selects the elements of a Stream that match a Predicate. It is an intermediate operation and returns a new stream consisting of the elements of the current stream that satisfy the predicate condition.

Here’s an example:

package org.kodejava.basic;

import java.util.Arrays;
import java.util.List;

public class FilterStartWith {
    public static void main(String[] args) {
        List<String> myList = Arrays.asList("a1", "a2", "b1", "c2", "c1");
        myList.stream()
                .filter(s -> s.startsWith("c"))
                .map(String::toUpperCase)
                .sorted()
                .forEach(System.out::println);
    }
}

Output:

C1
C2

Another example:

package org.kodejava.basic;

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FilterEvenNumber {
    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        List<Integer> evens = numbers
                .stream()
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList());

        System.out.println("Even numbers = " + evens);
    }
}

Output:

Even numbers = [2, 4, 6]
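
  • reduce: The reduce() function combines the elements of a Stream into a single value by repeatedly applying a BinaryOperator. It has several overloads: the one used below takes only the BinaryOperator and returns an Optional, because the stream might be empty, while another takes an initial (identity) value as its first parameter and returns the result directly.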

Here’s an example:

package org.kodejava.basic;

import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class ReduceSum {
    public static void main(String[] args) {
        List<Integer> myList = Arrays.asList(1, 2, 3, 4, 5);
        Optional<Integer> sum = myList
                .stream()
                .reduce((a, b) -> a + b);

        sum.ifPresent(System.out::println);
    }
}

Output:

15

In the above example, the reduce method sums all the integers in the stream. Because no initial value is supplied, the result is an Optional, and ifPresent prints the sum only if the Optional is not empty.
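
When a sensible starting value exists, the two-argument form of reduce() avoids the Optional. The following is a minimal sketch with hypothetical names; it uses 0 as the identity for addition, so an empty list simply sums to 0.

```java
import java.util.List;

class ReduceWithIdentity {
    static int sum(List<Integer> numbers) {
        // 0 is the identity value; Integer::sum adds two ints.
        return numbers.stream()
                .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sum(List.of(1, 2, 3, 4, 5))); // 15
        System.out.println(sum(List.of()));              // 0
    }
}
```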

The map() and filter() operations can be chained together to build complex data processing pipelines. They are intermediate operations and therefore “lazy”, meaning they don’t perform any computation until a terminal operation (like reduce(), collect(), or forEach()) is invoked on the stream.
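
As a small sketch of such a chain, the example below filters the even numbers, squares them with map(), and adds them up with reduce(). The class and method names are made up for this illustration.

```java
import java.util.List;

class ChainedPipeline {
    static int sumOfEvenSquares(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)   // keep 2, 4, 6
                .map(n -> n * n)           // square them: 4, 16, 36
                .reduce(0, Integer::sum);  // add them up
    }

    public static void main(String[] args) {
        System.out.println(sumOfEvenSquares(List.of(1, 2, 3, 4, 5, 6))); // 56
    }
}
```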

How do I handle exceptions in Stream.forEach() method?

When using Java’s Stream.forEach() method, you might need to call code that throws checked exceptions. A lambda passed to forEach() cannot let a checked exception escape, because forEach() expects a Consumer, and Consumer.accept() does not declare any checked exceptions in its signature. The exception must therefore be caught inside the lambda.

Here is an example of how you might deal with an exception in a forEach operation:

package org.kodejava.basic;

import java.util.List;

public class ForEachException {
    public static void main(String[] args) {
        List<String> list = List.of("Java", "Kotlin", "Scala", "Clojure");
        list.stream().forEach(item -> {
            try {
                // methodThatThrowsExceptions can be any method that throws a 
                // checked exception
                methodThatThrowsExceptions(item);
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
    }

    public static void methodThatThrowsExceptions(String item) throws Exception {
        // Method implementation
    }
}

In the above code, we have a method methodThatThrowsExceptions that can throw a checked exception. In the forEach operation, we have a lambda in which we use a try-catch block to handle potential exceptions from methodThatThrowsExceptions.

However, this approach is not generally recommended, because it swallows the exception where it occurs and does not interrupt the stream processing: the remaining elements are still processed. If you need to handle the exception properly and perhaps stop processing, you may need to use a traditional for or for-each loop.
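
For comparison, here is a minimal sketch of the loop-based alternative. The process() method is a hypothetical stand-in that fails for one particular item; with a plain for-each loop, the checked exception leaves the loop immediately and the remaining items are not processed.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class ForLoopException {
    // Hypothetical method that throws a checked exception for one input.
    static void process(String item) throws IOException {
        if (item.equals("Scala")) {
            throw new IOException("Cannot process " + item);
        }
    }

    // Returns the items that were processed before the first failure.
    static List<String> processUntilFailure(List<String> items) {
        List<String> processed = new ArrayList<>();
        try {
            for (String item : items) {
                process(item);        // an exception here ends the loop
                processed.add(item);
            }
        } catch (IOException e) {
            System.err.println("Stopped: " + e.getMessage());
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> list = List.of("Java", "Kotlin", "Scala", "Clojure");
        System.out.println(processUntilFailure(list)); // [Java, Kotlin]
    }
}
```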

There are several reasons why exception handling within lambda expressions in Java Streams is not generally recommended.

  1. Checked Exceptions: The functional interfaces used by the Stream API, such as Consumer and Function, do not declare checked exceptions, so a lambda passed to a stream operation must handle them itself. This often results in bloated, less readable lambda expressions due to the necessity of a try-catch block.

  2. Suppressed Exceptions: If you catch the exception within the lambda and print the stack trace – or worse, do nothing at all – the exception is effectively suppressed. This could lead to silent failures in your code, where an error condition is not properly propagated up the call stack. This can make it harder to debug issues, as you may be unaware an exception has occurred.

  3. Robust Error Handling: Handling the exception within the lambda expression means you’re dealing with it right at the point of failure, which might not be the best place to handle it. Often, you’ll want to stop processing the current operation when an exception occurs and propagate the error up to a higher level in your software where it can be handled properly (e.g., by displaying an error message to the user, logging the issue, or retrying the operation).

  4. Side Effects: Lambdas in stream pipelines work best when they are free of side effects. Catching an exception and then logging it or updating external state inside the lambda adds side effects, which goes against the functional style that the Stream API encourages.

In summary, while you can handle exceptions within forEach lambda expressions in Java, doing so can create challenges in how the software handles errors, potentially leading to suppressed exceptions, less readable code, and deviations from functional programming principles. Better approaches are often to handle exceptions at a higher level, to return Optional values, or to use third-party libraries designed to handle exceptions in functional programming contexts. For asynchronous code, CompletableFuture.exceptionally offers a comparable recovery hook, but it does not apply to streams.
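
One way to handle the exception at a higher level while still using forEach is to rethrow the checked exception as an unchecked one and catch it outside the stream. The sketch below is only one possible approach: the IOConsumer interface, the unchecked() adapter, and the process() method are all hypothetical names written for this example, while UncheckedIOException is a standard JDK class.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.function.Consumer;

class UncheckedWrapper {
    // Hypothetical interface: like Consumer, but allowed to throw IOException.
    @FunctionalInterface
    interface IOConsumer<T> {
        void accept(T t) throws IOException;
    }

    // Hypothetical adapter: rethrows IOException as UncheckedIOException.
    static <T> Consumer<T> unchecked(IOConsumer<T> action) {
        return t -> {
            try {
                action.accept(t);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
    }

    // Hypothetical method that throws a checked exception for empty input.
    static void process(String item) throws IOException {
        if (item.isEmpty()) {
            throw new IOException("Empty item");
        }
    }

    // Returns true if every item was processed, false if processing stopped.
    static boolean processAll(List<String> items) {
        try {
            items.stream().forEach(unchecked(UncheckedWrapper::process));
            return true;
        } catch (UncheckedIOException e) {
            System.err.println("Failed: " + e.getCause().getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(processAll(List.of("Java", "Kotlin"))); // true
        System.out.println(processAll(List.of("Java", "")));       // false
    }
}
```

The lambda body stays short, the first failure stops the stream, and the original IOException remains available through getCause().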