Reading large files in Java efficiently is best achieved with stream-based APIs that process the file line by line or chunk by chunk. Because the whole file is never loaded into memory at once, this avoids an OutOfMemoryError.
Here are the most common and efficient ways to do this:
1. Using Files.lines() (Recommended)
Files.lines() (available since Java 8) is the most idiomatic approach. It returns a Stream<String> in which each element is one line of the file, decoded as UTF-8 unless you pass a Charset. The lines are read lazily as the stream is consumed, so only a small portion of the file is in memory at any given time.
Important: Always use a try-with-resources block to ensure the file handle is closed.
package org.kodejava.nio;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class LargeFileReader {
    public static void main(String[] args) {
        Path path = Paths.get("D:/large-file.txt");

        try (Stream<String> lines = Files.lines(path)) {
            lines.filter(line -> line.contains("Error")) // Example processing
                 .forEach(System.out::println);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
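To make the laziness concrete, here is a minimal sketch that counts how many lines the stream actually pulls from the file before a short-circuiting findFirst() stops it. The class name LazyLinesDemo, the method linesVisitedUntilFirstMatch, and the temporary file contents are assumptions made for illustration only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical demo class, not part of any library.
class LazyLinesDemo {

    // Returns how many lines the stream pulled from the file before
    // findFirst() found a line containing `needle` (or reached end of file).
    static int linesVisitedUntilFirstMatch(Path file, String needle) throws IOException {
        AtomicInteger visited = new AtomicInteger();
        try (Stream<String> lines = Files.lines(file)) {
            lines.peek(line -> visited.incrementAndGet()) // count each line as it is pulled
                 .filter(line -> line.contains(needle))
                 .findFirst();                            // stops at the first match
        }
        return visited.get();
    }

    public static void main(String[] args) throws IOException {
        // Small temporary file standing in for a large one.
        Path tmp = Files.createTempFile("lazy-demo", ".txt");
        try {
            Files.write(tmp, Arrays.asList("ok", "ok", "Error: disk full", "ok", "ok"));
            System.out.println(linesVisitedUntilFirstMatch(tmp, "Error"));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

With this sequential stream, the two lines after the first match never reach the pipeline, which is why the approach scales to files far larger than the heap.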
2. Using BufferedReader.lines()
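If you already have a BufferedReader, or you need a reader for a specific character encoding, you can use its lines() method. This also returns a lazy stream. Note that FileReader on its own uses the platform default charset, so the example below creates the reader with an explicit charset instead.

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class BufferedReaderExample {
    public static void main(String[] args) {
        // ISO_8859_1 is only an example; use the encoding your file is written in.
        try (BufferedReader br = Files.newBufferedReader(
                Paths.get("large-file.txt"), StandardCharsets.ISO_8859_1)) {
            br.lines()
              .map(String::toLowerCase)
              .forEach(line -> {
                  // Process each line here
              });
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}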
3. Using Scanner (For Tokens)
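If you need to read tokens (such as words or numbers) rather than full lines, Scanner is useful. However, it is generally slower than BufferedReader, largely because it tokenizes input with regular expressions. The Scanner(File) constructor throws FileNotFoundException; read errors that occur later are not thrown, but can be checked with scanner.ioException().

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class ScannerExample {
    public static void main(String[] args) {
        try (Scanner scanner = new Scanner(new File("large-file.txt"))) {
            while (scanner.hasNext()) {
                String token = scanner.next(); // Next whitespace-delimited token
                // Process token
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}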
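The three options above are all line- or token-oriented. For the chunk-by-chunk case mentioned in the introduction, one possible sketch reads the file through an InputStream into a fixed-size buffer, so at most one chunk is held in memory at a time. The class name ChunkedReadDemo, the method countNewlines, and the 8 KiB buffer size are assumptions chosen for illustration, not recommendations from the original text.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of any library.
class ChunkedReadDemo {

    // Counts newline bytes while holding at most one 8 KiB chunk in memory.
    static long countNewlines(Path file) throws IOException {
        byte[] buffer = new byte[8192]; // chunk size is an arbitrary choice for this sketch
        long newlines = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int bytesRead;
            // read() returns the number of bytes placed in the buffer, or -1 at end of file.
            while ((bytesRead = in.read(buffer)) != -1) {
                for (int i = 0; i < bytesRead; i++) {
                    if (buffer[i] == '\n') {
                        newlines++;
                    }
                }
            }
        }
        return newlines;
    }

    public static void main(String[] args) throws IOException {
        // Small temporary file standing in for a large one.
        Path tmp = Files.createTempFile("chunk-demo", ".txt");
        try {
            Files.write(tmp, "a\nb\nc\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(countNewlines(tmp));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Only the bytes reported by read() are inspected on each pass, since the final chunk is usually shorter than the buffer.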
Summary of Tips for Large Files:
- Lazy Evaluation: Operations like filter and map on Java Streams are lazy. They don’t process the data until a terminal operation (like forEach or collect) is called.
- Memory Efficiency: Streaming means you aren’t storing the whole file in a List<String> (as Files.readAllLines() does), which would quickly crash your app for multi-gigabyte files. Collecting the stream into a list gives up this benefit.
- Parallelism: For huge files, you can use .parallel() on the stream. However, be careful, as IO-bound tasks often don’t benefit much from parallel streams unless the processing logic per line is very heavy.
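As a rough illustration of the parallelism tip, the sketch below counts matching lines with a parallel stream. The class name ParallelLinesDemo and the method countMatches are hypothetical, and whether this runs faster than the sequential version depends on the per-line work and the storage device, so it is worth measuring before adopting it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

// Hypothetical demo class, not part of any library.
class ParallelLinesDemo {

    // Counts the lines containing `needle`, letting the stream split the work across threads.
    static long countMatches(Path file, String needle) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.parallel()
                        .filter(line -> line.contains(needle))
                        .count(); // count() does not depend on encounter order
        }
    }

    public static void main(String[] args) throws IOException {
        // Small temporary file standing in for a large one.
        Path tmp = Files.createTempFile("parallel-demo", ".txt");
        try {
            Files.write(tmp, Arrays.asList("ok", "Error: one", "ok", "Error: two"));
            System.out.println(countMatches(tmp, "Error"));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

A terminal operation such as count() is used here because forEach on a parallel stream does not preserve line order.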
