A friend of mine told me that he has a large Excel file, and he asked me if I could split the file into multiple smaller Excel files. So I write this little program using Apache POI to do it.
The code snippet below basically contains the following steps:
- Load the Excel as an
InputStream
from the classpath, if the file was not found the program will exit. - Create
XSSFWorkbook
from the input stream, and get the firstXSSFSheet
from the workbook. - Iterate the rows of data from the source worksheet.
- On the first
rownum
for each split file we create a new workbook usingSXSSFWorkbook
and also create theSXSSFSheet
. - Read the first row from the source worksheet, store it in
headerRow
to be used for creating header row in each new sheet. - If we are at the first row, write the header.
- Write each row from the source sheet to the destination sheet untuk the max rows is reached.
- Write the workbook into a new file.
And here is the complete code that you can try.
package org.kodejava.poi;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.streaming.SXSSFCell;
import org.apache.poi.xssf.streaming.SXSSFRow;
import org.apache.poi.xssf.streaming.SXSSFSheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
public class SplitExcelDemo {
public static final int MAX_ROWS_PER_FILE = 1000 - 1;
public static final boolean WITH_HEADER = true;
public static void main(String[] args) {
LocalDateTime startTime = LocalDateTime.now();
String filename = "/stock.xlsx";
try (InputStream is = SplitExcelDemo.class.getResourceAsStream(filename)) {
if (is == null) {
System.out.println("Source file was not found!");
System.exit(1);
}
XSSFWorkbook srcWorkbook = new XSSFWorkbook(is);
XSSFSheet srcSheet = srcWorkbook.getSheetAt(0);
int physicalNumberOfRows = srcSheet.getPhysicalNumberOfRows();
int rownum = 0;
int splitCounter = 0;
SXSSFWorkbook destWorkbook = null;
SXSSFSheet destSheet = null;
XSSFRow headerRow = null;
for (Row srcRow : srcSheet) {
if (rownum == 0) {
// At the beginning let's create a new workbook and worksheet
destWorkbook = new SXSSFWorkbook();
destSheet = destWorkbook.createSheet();
}
if (srcRow.getRowNum() == 0 && WITH_HEADER) {
// Copy header row to be use in each split file
headerRow = (XSSFRow) srcRow;
}
if (rownum == 0 && WITH_HEADER) {
// Add row header to each split file
if (headerRow != null) {
SXSSFRow firstRow = destSheet.createRow(rownum);
int index = 0;
for (Cell cell : headerRow) {
SXSSFCell headerCell = firstRow.createCell(index++, cell.getCellType());
if (cell.getCellType() == CellType.STRING) {
headerCell.setCellValue(cell.getStringCellValue());
}
}
}
} else {
// Copy rows from source worksheet into destination worksheet
SXSSFRow descRow = destSheet.createRow(rownum);
int index = 0;
for (Cell cell : srcRow) {
SXSSFCell destCell = descRow.createCell(index++, cell.getCellType());
switch (cell.getCellType()) {
case NUMERIC -> destCell.setCellValue(cell.getNumericCellValue());
case STRING -> destCell.setCellValue(cell.getStringCellValue());
}
}
}
// When a max number of rows copied are reached, or when we are at the end of worksheet,
// write data into a new file
if (rownum == MAX_ROWS_PER_FILE || srcRow.getRowNum() == physicalNumberOfRows - 1) {
rownum = -1;
String output = String.format("split-%03d.xlsx", splitCounter++);
System.out.println("Writing " + output);
try (OutputStream os = new FileOutputStream(output)) {
destWorkbook.write(os);
} catch (IOException e){
e.printStackTrace();
}
}
rownum = rownum + 1;
}
// Display processing time
LocalDateTime endTime = LocalDateTime.now();
long minutes = startTime.until(endTime, ChronoUnit.MINUTES);
startTime = startTime.plusMinutes(minutes);
long seconds = startTime.until(endTime, ChronoUnit.SECONDS);
System.out.printf("Splitting finished in %d minutes and %d seconds %n", minutes, seconds);
} catch (Exception e) {
e.printStackTrace();
}
}
}
The output of running this program will look like this:
Writing split-000.xlsx
Writing split-001.xlsx
Writing split-002.xlsx
Writing split-003.xlsx
Writing split-004.xlsx
Writing split-005.xlsx
Writing split-006.xlsx
Writing split-007.xlsx
Writing split-008.xlsx
Writing split-009.xlsx
Splitting finished in 0 minutes and 8 seconds
Maven Dependencies
<dependencies>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi</artifactId>
<version>5.2.5</version>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>5.2.5</version>
</dependency>
</dependencies>
Latest posts by Wayan (see all)
- How do I get number of each day for a certain month in Java? - September 8, 2024
- How do I get operating system process information using ProcessHandle? - July 22, 2024
- How do I sum a BigDecimal property of a list of objects using Java Stream API? - July 22, 2024