How do I use Files.lines() to process a text file line by line?

To process a text file line by line with Files.lines() in Java, open the returned Stream<String> in a try-with-resources block and consume it inside that block. This approach is memory-efficient because the file is read lazily: lines are pulled from disk as the stream is consumed, so the entire file is never loaded into memory at once.

Basic Implementation

package org.kodejava.nio;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class FileStreamExample {
    public static void main(String[] args) {
        Path path = Paths.get("example.txt");

        // Use try-with-resources to ensure the stream (and underlying file) is closed
        try (Stream<String> lines = Files.lines(path)) {
            lines.forEach(System.out::println);
        } catch (IOException e) {
            System.err.println("Error reading file: " + e.getMessage());
        }
    }
}

Advanced Processing (Filtering and Mapping)

The power of Files.lines() comes from the Stream API, allowing you to filter, transform, or search through the file content easily:

// "path" is the same Path used in the basic example above
try (Stream<String> lines = Files.lines(path)) {
    long count = lines
        .filter(line -> line.contains("ERROR")) // Only keep lines with "ERROR"
        .map(String::trim)                       // Trim leading and trailing whitespace
        .peek(System.out::println)               // Print each matching line
        .count();                                // Count the matching lines

    System.out.println("Total errors found: " + count);
} catch (IOException e) {
    e.printStackTrace();
}
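
The same pipeline can be wrapped in a small reusable method. The following is a minimal sketch, not a definitive implementation: the class name FirstMatchesExample, the helper firstMatches, and the sample log lines are made up for illustration. It uses limit() so the stream stops reading once enough matches have been found.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FirstMatchesExample {

    // Hypothetical helper: returns at most "max" trimmed lines that contain "keyword".
    public static List<String> firstMatches(Path path, String keyword, int max) throws IOException {
        try (Stream<String> lines = Files.lines(path)) {
            return lines
                    .filter(line -> line.contains(keyword)) // keep matching lines only
                    .map(String::trim)                      // trim surrounding whitespace
                    .limit(max)                             // stop after "max" matches
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample data written to a temporary file so the example is self-contained
        Path tmp = Files.createTempFile("sample", ".log");
        try {
            Files.write(tmp, Arrays.asList(
                    "INFO start", "  ERROR disk full ", "ERROR timeout", "ERROR again"));
            System.out.println(firstMatches(tmp, "ERROR", 2));
            // prints [ERROR disk full, ERROR timeout]
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Collecting into a List is reasonable here only because the result is capped by limit(); collecting every line of a large file would give up the memory benefit.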

Key Considerations:

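  1. Closing the Stream is Required: Unlike most streams, the stream returned by Files.lines() holds an open resource (the file handle), and try-with-resources is the simplest way to guarantee it is released. If you don’t close it, file handles leak and you may eventually run into a “too many open files” error.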
  2. Character Encoding: By default, Files.lines() uses UTF-8. If your file uses a different encoding (like ISO-8859-1), you can specify it as a second argument:
    Files.lines(path, StandardCharsets.ISO_8859_1)
    
  3. Memory Usage: For very large files, Files.lines() needs far less memory than Files.readAllLines(), which stores every line in a List<String> and can cause an OutOfMemoryError.
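
To illustrate point 2, here is a small sketch that writes a file in ISO-8859-1 and reads it back with the matching charset. The class name CharsetLinesExample, the helper readLines, and the sample words are assumptions made for this example. Reading the same file with the UTF-8 default would be expected to fail with a malformed-input error while the stream is consumed, because the accented bytes are not valid UTF-8.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CharsetLinesExample {

    // Hypothetical helper: reads all lines of a small file using the given charset.
    public static List<String> readLines(Path path, Charset charset) throws IOException {
        try (Stream<String> lines = Files.lines(path, charset)) {
            return lines.collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("latin1", ".txt");
        try {
            // Write two words containing non-ASCII characters, encoded as ISO-8859-1
            byte[] bytes = "caf\u00e9\nna\u00efve\n".getBytes(StandardCharsets.ISO_8859_1);
            Files.write(tmp, bytes);

            // Decode with the same charset that was used to write the file
            List<String> lines = readLines(tmp, StandardCharsets.ISO_8859_1);
            System.out.println(lines.size() + " lines read: " + lines);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The helper collects into a List only because the sample file is tiny; for large files, keep the processing inside the stream pipeline.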

How do I read large files with streams?

Large files in Java are best read with stream-based APIs that process the file line by line or chunk by chunk. Because the whole file is never held in memory at once, this avoids an OutOfMemoryError.

Here are the most common and efficient ways to do this:

1. Using Files.lines() (Recommended)

This is the most modern and idiomatic way in Java. It returns a Stream<String> where each element is a line from the file. It reads the lines lazily, meaning it only keeps a small portion of the file in memory at any given time.

Important: Always use a try-with-resources block to ensure the file handle is closed.

package org.kodejava.nio;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class LargeFileReader {
    public static void main(String[] args) {
        Path path = Paths.get("D:/large-file.txt");

        try (Stream<String> lines = Files.lines(path)) {
            lines.filter(line -> line.contains("Error")) // Example processing
                    .forEach(System.out::println);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

2. Using BufferedReader.lines()

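If you already have a BufferedReader (for example, one opened with a specific character encoding), you can use its lines() method. This also returns a lazy stream. Note that an I/O error during iteration is reported as an UncheckedIOException rather than an IOException.

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Files.newBufferedReader takes the charset explicitly; replace ISO_8859_1
// with the actual encoding of your file
try (BufferedReader br = Files.newBufferedReader(
        Paths.get("large-file.txt"), StandardCharsets.ISO_8859_1)) {
    br.lines()
      .map(String::toLowerCase)
      .forEach(line -> {
          // Process each line here
      });
} catch (IOException e) {
    e.printStackTrace();
}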
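
Because BufferedReader can wrap any Reader, the same technique works for sources other than files. The sketch below assumes a hypothetical helper countNonBlank in a made-up class ReaderLineCounter, and feeds it a StringReader so it runs without any file on disk.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReaderLineCounter {

    // Hypothetical helper: counts lines that contain at least one non-whitespace character.
    public static long countNonBlank(Reader source) throws IOException {
        // Closing the BufferedReader also closes the wrapped Reader
        try (BufferedReader br = new BufferedReader(source)) {
            return br.lines()
                     .filter(line -> !line.trim().isEmpty())
                     .count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Four lines: "a", an empty line, a whitespace-only line, and "b"
        System.out.println(countNonBlank(new StringReader("a\n\n  \nb\n"))); // prints 2
    }
}
```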

3. Using Scanner (For Tokens)

If you need to read tokens (like words or numbers) rather than full lines, Scanner is useful. However, it is generally slower than BufferedReader.

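import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

try (Scanner scanner = new Scanner(new File("large-file.txt"))) {
    while (scanner.hasNextLine()) {
        String line = scanner.nextLine();
        // Process line
    }
} catch (FileNotFoundException e) {
    // new Scanner(File) throws FileNotFoundException if the file cannot be opened
    e.printStackTrace();
}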
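
The snippet above reads whole lines; reading tokens uses hasNext() and the typed next methods instead. As a rough sketch, the hypothetical helper sumIntegers below (class name TokenSumExample is also made up) adds up every integer token and skips everything else. It takes a Scanner as its argument so the same code works for a file, a string, or any other source.

```java
import java.util.Locale;
import java.util.Scanner;

public class TokenSumExample {

    // Hypothetical helper: sums all whitespace-separated integer tokens, ignoring other tokens.
    public static long sumIntegers(Scanner scanner) {
        scanner.useLocale(Locale.ROOT); // avoid locale-specific number formats
        long sum = 0;
        while (scanner.hasNext()) {
            if (scanner.hasNextLong()) {
                sum += scanner.nextLong(); // consume a numeric token
            } else {
                scanner.next();            // skip a non-numeric token
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // A Scanner over a String keeps the example self-contained;
        // new Scanner(new File(...)) would work the same way for a file
        try (Scanner scanner = new Scanner("10 apples 20\n30 pears")) {
            System.out.println(sumIntegers(scanner)); // prints 60
        }
    }
}
```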

Summary of Tips for Large Files:

  • Lazy Evaluation: Operations like filter and map on Java Streams are lazy. They don’t process the data until a terminal operation (like forEach or collect) is called.
  • Memory Efficiency: As long as you process lines inside the pipeline instead of collecting them all, the Stream API avoids storing the whole file in a List<String>, which could exhaust the heap for multi-gigabyte files.
  • Parallelism: For huge files, you can call .parallel() on the stream. Be careful, though: IO-bound tasks often don’t benefit much from parallel streams unless the processing logic per line is very heavy.
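
The lazy-evaluation point can be observed directly. The following sketch assumes a hypothetical class LazyPipelineDemo whose inspect method counts how many lines have passed through peek() before and after the terminal operation runs; the sample file content is made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class LazyPipelineDemo {

    // Hypothetical helper: returns {lines seen before the terminal operation, lines seen after}.
    public static int[] inspect(Path path) throws IOException {
        AtomicInteger seen = new AtomicInteger();
        try (Stream<String> lines = Files.lines(path)) {
            // Building the pipeline does not read any line yet
            Stream<String> pipeline = lines
                    .peek(line -> seen.incrementAndGet())
                    .filter(line -> line.contains("ERROR"));
            int before = seen.get();

            // The terminal operation triggers the actual reading
            pipeline.findFirst();
            int after = seen.get();
            return new int[] {before, after};
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("lazy", ".log");
        try {
            Files.write(tmp, Arrays.asList("INFO a", "INFO b", "ERROR c", "INFO d"));
            int[] counts = inspect(tmp);
            System.out.println("seen before terminal op: " + counts[0]);
            System.out.println("seen after terminal op:  " + counts[1]);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The first count is zero because no terminal operation has run yet. The second count is at least one, and since findFirst() is a short-circuiting operation, it does not need to exceed the number of lines up to the first match.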