Introduction to the Apache POI library for creating Excel file

Apache POI is a robust open-source library that provides APIs for manipulating various file formats based upon the Office Open XML standards (OOXML) and Microsoft’s OLE2 compound document format (OLE2). Being Java-based, it’s readily used by Java developers for creating, reading, and modifying Excel files.

To create a new Excel file using Apache POI, you will want to follow these basic steps:

1. Create a Workbook

You create an instance of Workbook which represents an Excel file. Workbook is an interface and the instance can be created by using one of its implementing classes like HSSFWorkbook (for .xls) or XSSFWorkbook (for .xlsx).

package org.kodejava.poi;

import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.FileOutputStream;
import java.io.IOException;

public class CreateWorkbook {
   public static void main(String[] args) {
      // Create Blank workbook
      try (XSSFWorkbook workbook = new XSSFWorkbook()) {
         // Write the workbook to a file
         try (FileOutputStream fileOut = new FileOutputStream("workbook.xlsx")) {
            workbook.write(fileOut);
         } catch (Exception e) {
            e.printStackTrace();
         }
      } catch (IOException e) {
         e.printStackTrace();
      }
   }
}

This block of code will create a new Excel file called “workbook.xlsx” in your project directory.

2. Create a Sheet

You create a Sheet in the Workbook. A sheet is a collection of rows where your data resides. To create a new sheet using Apache POI, you can use the createSheet() method of the Workbook interface. This method returns an object of the Sheet interface.

Below is an example of creating a new sheet in an Excel workbook:

package org.kodejava.poi;

import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.FileOutputStream;
import java.io.IOException;

public class CreateSheetExample {
    public static void main(String[] args) {
        // Create a workbook
        try (XSSFWorkbook workbook = new XSSFWorkbook()) {
            // Create a blank sheet within the workbook
            XSSFSheet sheet = workbook.createSheet("New Sheet");

            // Write the workbook in a file system
            try (FileOutputStream out = new FileOutputStream("New.xlsx")) {
                workbook.write(out);
            } catch (IOException e) {
                e.printStackTrace();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

This code will create a new sheet named “New Sheet” in an Excel file named New.xlsx.

3. Create Rows and Cells

Inside your Sheet, you create Row. Inside your Row, you create Cell.

Creating new rows and cells with Apache POI is quite straightforward. Here’s a step-by-step way to do this:

  • Create a workbook.
  • Within the workbook, create a sheet.
  • Within the sheet, create a row.
  • Within the row, create a cell.

Each Cell can then be populated with data. Apache POI supports various data types including a text, numeric, date, boolean and formula.

Here is an example using Apache POI in Java:

package org.kodejava.poi;

import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.FileOutputStream;
import java.io.IOException;

public class CreateCellExample {
    public static void main(String[] args) {
        // Create a workbook
        try (XSSFWorkbook workbook = new XSSFWorkbook()) {
            // Create a blank sheet within the workbook
            XSSFSheet sheet = workbook.createSheet("New Sheet");

            // Create a row within the sheet
            XSSFRow row = sheet.createRow(0);

            // Create a cell within the row
            XSSFCell cell = row.createCell(0);

            // Set the value of the cell
            cell.setCellValue("Hello, Apache POI!");

            // Write the workbook to the file system
            try (FileOutputStream out = new FileOutputStream("Cell.xlsx")) {
                workbook.write(out);
            } catch (IOException e) {
                e.printStackTrace();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

This code will create a new workbook with a sheet named New Sheet. In that sheet, it will create a row at index 0. In that row, it will create a cell at index 0 and set its value to “Hello, Apache POI!”

This will all be saved to a new Excel file named Cell.xlsx in the current project directory.

4. Creating a Date cell

Using Apache POI involves a few steps outlined below:

  • First, you create a new Workbook, Sheet, and Row, as shown previously.
  • Then, you create a cell in any given row and set its type to CellType.NUMERIC.
  • You then use CellStyle to format the date according to your needs.

Here’s an example:

package org.kodejava.poi;

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Date;

public class CreateDateCellExample {
    public static void main(String[] args) {
        // Create a workbook
        try (Workbook workbook = new XSSFWorkbook()) {
            // Create a blank sheet within the workbook
            Sheet sheet = workbook.createSheet("Date Example");

            // Create a row within the sheet
            Row row = sheet.createRow(0);

            // Create a cell within the row and set its type to NUMERIC
            Cell cell = row.createCell(0, CellType.NUMERIC);

            // Create a CellStyle
            CellStyle cellStyle = workbook.createCellStyle();
            CreationHelper creationHelper = workbook.getCreationHelper();
            // Set the built-in date format style
            cellStyle.setDataFormat(creationHelper.createDataFormat().getFormat("m/d/yy h:mm"));

            cell.setCellStyle(cellStyle);

            // Set the value of the cell as today's date
            cell.setCellValue(new Date());

            // Write the workbook to the file system
            try (FileOutputStream out = new FileOutputStream("DateExample.xlsx")) {
                workbook.write(out);
            } catch (IOException e) {
                e.printStackTrace();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

In this example, a date cell has been created, and the date format has been set to m/d/yy h:mm, but there are many other ways to format the date or time. You can use a custom format if you prefer, so long as it follows the rule of Excel custom date format.

5. Creating various types of cell

Using Apache POI, you need to create a cell and set the cell type and value accordingly.

Let’s take a look at some examples:

package org.kodejava.poi;

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.FileOutputStream;
import java.io.IOException;

public class CreateVariousCellsExample {
    public static void main(String[] args) throws IOException {
        Workbook workbook = new XSSFWorkbook();
        Sheet sheet = workbook.createSheet("New Sheet");

        Row row = sheet.createRow(0);

        // Create a numeric cell
        Cell numericCell = row.createCell(0, CellType.NUMERIC);
        numericCell.setCellValue(123.456);

        // Create a string cell
        Cell stringCell = row.createCell(1, CellType.STRING);
        stringCell.setCellValue("Hello, POI!");

        // Create a boolean cell
        Cell booleanCell = row.createCell(2, CellType.BOOLEAN);
        booleanCell.setCellValue(true);

        // Create a formula cell
        Cell formulaCell = row.createCell(3, CellType.FORMULA);
        formulaCell.setCellFormula("A1*B1");

        try (FileOutputStream fileOut = new FileOutputStream("./differentCellTypes.xlsx")) {
            workbook.write(fileOut);
        }
        workbook.close();
    }
}

This code will create an Excel file named “differentCellTypes.xlsx” with a sheet named “New Sheet”. In the first row of this sheet, there will be four cells:

  • A numeric cell containing the number 123.456
  • A string cell containing the value “Hello, POI!”
  • A boolean cell containing the boolean value of true
  • A formula cell containing a formula which multiplies the values of the first and second cells.

And that’s the basic of creating an Excel file using Apache POI. More advanced features include formatting and styling cells, adding images, creating charts, formulas, hyperlinks and so forth. You would find Apache POI’s API and its official documentation very resourceful for covering those.

Do note, it’s important to always close the Workbook instance at the end to free up resources and to prevent memory leaks.

Maven Dependencies

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>5.2.5</version>
</dependency>

Maven Central

How do I create XLSX files using Apache POI?

First of all, make sure that you have the Apache POI library included in your project. If you are using Maven, you can add the following to your pom.xml:

Maven Dependencies

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>5.2.5</version>
</dependency>

Maven Central

Then, you can use the following code to create an Excel .xlsx file:

package org.kodejava.poi;

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.FileOutputStream;
import java.io.IOException;

public class ExcelExample {
    public static void main(String[] args) {
        // Create a new Workbook
        Workbook workbook = new XSSFWorkbook();

        // Create a new sheet in the workbook with a name
        Sheet sheet = workbook.createSheet("MySheet");

        // Create a row at index 0
        Row row = sheet.createRow(0);

        // Create a cell at index 0
        Cell cell = row.createCell(0, CellType.STRING);
        // Assign the string "Hello" to the cell
        cell.setCellValue("Hello");

        // Add some other rows and cells similarly...
        // ...

        try (FileOutputStream fos = new FileOutputStream("MyExcel.xlsx")) {
            workbook.write(fos);
        } catch (IOException e) {
            e.printStackTrace();
        }

        System.out.println("Excel file has been generated!");

        try {
            workbook.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

This program will create an Excel .xlsx file named MyExcel.xlsx with a single sheet called “MySheet.” The sheet has a single cell in the first row which contains the string “Hello.” The output file will be generated in the root directory of your project.

The above-given code is only a basic example. Apache POI allows advanced features like applying styles to cells, merging cells, and working with formulas among others. The library also supports other Microsoft Office file formats.

Here is another slightly more complex example where we create multiple rows and cells:

package org.kodejava.poi;

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.FileOutputStream;
import java.io.IOException;

public class ExcelAnotherExample {
    public static void main(String[] args) {
        Workbook workbook = new XSSFWorkbook();
        Sheet sheet = workbook.createSheet("Students");

        Row rowHeader = sheet.createRow(0);
        Cell cellHeader1 = rowHeader.createCell(0, CellType.STRING);
        cellHeader1.setCellValue("ID");

        Cell cellHeader2 = rowHeader.createCell(1, CellType.STRING);
        cellHeader2.setCellValue("Name");

        for(int i = 1; i < 6; i++) {
            Row rowData = sheet.createRow(i);

            Cell cellData1 = rowData.createCell(0, CellType.NUMERIC);
            cellData1.setCellValue(i);

            Cell cellData2 = rowData.createCell(1, CellType.STRING);
            cellData2.setCellValue("Student " + i);
        }

        try (FileOutputStream fos = new FileOutputStream("StudentsData.xlsx")) {
            workbook.write(fos);
        } catch (IOException e) {
            e.printStackTrace();
        }

        System.out.println("Excel file has been generated!");

        try {
            workbook.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

This code creates an Excel file named StudentsData.xlsx with a single worksheet. The worksheet contains a listing of student names and their respective IDs. The first row is a header with “ID” and “Name” labels. The next rows contain the IDs (numeric) and names (string “Student X”) for five different students.

How do I split large excel file into multiple smaller files?

A friend of mine told me that he has a large Excel file, and he asked me if I could split the file into multiple smaller Excel files. So I write this little program using Apache POI to do it.

The code snippet below basically contains the following steps:

  1. Load the Excel as an InputStream from the classpath, if the file was not found the program will exit.
  2. Create XSSFWorkbook from the input stream, and get the first XSSFSheet from the workbook.
  3. Iterate the rows of data from the source worksheet.
  4. On the first rownum for each split file we create a new workbook using SXSSFWorkbook and also create the SXSSFSheet.
  5. Read the first row from the source worksheet, store it in headerRow to be used for creating header row in each new sheet.
  6. If we are at the first row, write the header.
  7. Write each row from the source sheet to the destination sheet untuk the max rows is reached.
  8. Write the workbook into a new file.

And here is the complete code that you can try.

package org.kodejava.poi;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.streaming.SXSSFCell;
import org.apache.poi.xssf.streaming.SXSSFRow;
import org.apache.poi.xssf.streaming.SXSSFSheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

public class SplitExcelDemo {
    public static final int MAX_ROWS_PER_FILE = 1000 - 1;
    public static final boolean WITH_HEADER = true;

    public static void main(String[] args) {
        LocalDateTime startTime = LocalDateTime.now();
        String filename = "/stock.xlsx";
        try (InputStream is = SplitExcelDemo.class.getResourceAsStream(filename)) {
            if (is == null) {
                System.out.println("Source file was not found!");
                System.exit(1);
            }

            XSSFWorkbook srcWorkbook = new XSSFWorkbook(is);
            XSSFSheet srcSheet = srcWorkbook.getSheetAt(0);
            int physicalNumberOfRows = srcSheet.getPhysicalNumberOfRows();

            int rownum = 0;
            int splitCounter = 0;

            SXSSFWorkbook destWorkbook = null;
            SXSSFSheet destSheet = null;

            XSSFRow headerRow = null;

            for (Row srcRow : srcSheet) {
                if (rownum == 0) {
                    // At the beginning let's create a new workbook and worksheet
                    destWorkbook = new SXSSFWorkbook();
                    destSheet = destWorkbook.createSheet();
                }

                if (srcRow.getRowNum() == 0 && WITH_HEADER) {
                    // Copy header row to be use in each split file
                    headerRow = (XSSFRow) srcRow;
                }

                if (rownum == 0 && WITH_HEADER) {
                    // Add row header to each split file
                    if (headerRow != null) {
                        SXSSFRow firstRow = destSheet.createRow(rownum);
                        int index = 0;
                        for (Cell cell : headerRow) {
                            SXSSFCell headerCell = firstRow.createCell(index++, cell.getCellType());
                            if (cell.getCellType() == CellType.STRING) {
                                headerCell.setCellValue(cell.getStringCellValue());
                            }
                        }
                    }
                } else {
                    // Copy rows from source worksheet into destination worksheet
                    SXSSFRow descRow = destSheet.createRow(rownum);
                    int index = 0;
                    for (Cell cell : srcRow) {
                        SXSSFCell destCell = descRow.createCell(index++, cell.getCellType());
                        switch (cell.getCellType()) {
                            case NUMERIC -> destCell.setCellValue(cell.getNumericCellValue());
                            case STRING -> destCell.setCellValue(cell.getStringCellValue());
                        }
                    }
                }

                // When a max number of rows copied are reached, or when we are at the end of worksheet, 
                // write data into a new file 
                if (rownum == MAX_ROWS_PER_FILE || srcRow.getRowNum() == physicalNumberOfRows - 1) {
                    rownum = -1;
                    String output = String.format("split-%03d.xlsx", splitCounter++);
                    System.out.println("Writing " + output);
                    try (OutputStream os = new FileOutputStream(output)) {
                        destWorkbook.write(os);
                    } catch (IOException e){
                        e.printStackTrace();
                    }
                }

                rownum = rownum + 1;
            }

            // Display processing time
            LocalDateTime endTime = LocalDateTime.now();
            long minutes = startTime.until(endTime, ChronoUnit.MINUTES);
            startTime = startTime.plusMinutes(minutes);
            long seconds = startTime.until(endTime, ChronoUnit.SECONDS);
            System.out.printf("Splitting finished in %d minutes and %d seconds %n", minutes, seconds);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

The output of running this program will look like this:

Writing split-000.xlsx
Writing split-001.xlsx
Writing split-002.xlsx
Writing split-003.xlsx
Writing split-004.xlsx
Writing split-005.xlsx
Writing split-006.xlsx
Writing split-007.xlsx
Writing split-008.xlsx
Writing split-009.xlsx
Splitting finished in 0 minutes and 8 seconds 

Maven Dependencies

<dependencies>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi</artifactId>
        <version>5.2.3</version>
    </dependency>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <version>5.2.3</version>
    </dependency>
</dependencies>

Maven Central Maven Central

How do I check if a cell contains a date value?

The code below check if the first cell of an Excel file contains a date value. In Excel the cell type for date is returned as HSSFCell.CELL_TYPE_NUMERIC, to make sure if it contains a date we can use a utility method DateUtil.isCellDateFormatted(Cell cell), this method will check if the cell value is a valid date.

package org.kodejava.poi;

import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.DateUtil;

import java.io.FileInputStream;
import java.io.FileNotFoundException;

public class DateCellType {
    public static void main(String[] args) throws Exception {
        String filename = "datecelltype.xls";

        try (FileInputStream fis = new FileInputStream(filename)) {
            HSSFWorkbook workbook = new HSSFWorkbook(fis);
            HSSFSheet sheet = workbook.getSheetAt(0);

            // Read a cell the first cell on the sheet.
            HSSFCell cell = sheet.getRow(0).getCell(0);
            if (cell.getCellType() == CellType.NUMERIC) {
                System.out.println("Cell type for date data type is numeric.");
            }

            // Using HSSFDateUtil to check if a cell contains a date.
            if (DateUtil.isCellDateFormatted(cell)) {
                System.out.println("The cell contains a date value: "
                        + cell.getDateCellValue());
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}

The example result of the code snippet:

Cell type for date data type is numeric.
The cell contains a date value: Fri Oct 08 00:00:00 CST 2021

Maven Dependencies

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>5.2.3</version>
</dependency>

Maven Central

How do I obtain Excel’s cell data type?

In this example we try to obtain the Excel’s cell data type so that we can read the value using the right method. The data to be read is in a file named celltype.xls. The matrix below depict how the file is.

    |   COL
ROW |   0       1   2   3   4
----|-------------------------
0   |   1       2   A   B   TRUE
1   |   FALSE   X   Y   Z   10
package org.kodejava.poi;

import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.Row;

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.Iterator;

public class ObtainingCellType {

    public static void main(String[] args) throws Exception {
        String filename = "celltype.xls";

        try (FileInputStream fis = new FileInputStream(filename)) {
            HSSFWorkbook workbook = new HSSFWorkbook(fis);
            HSSFSheet sheet = workbook.getSheetAt(0);

            Iterator<Row> rows = sheet.rowIterator();
            while (rows.hasNext()) {
                HSSFRow row = (HSSFRow) rows.next();

                Iterator<Cell> cells = row.cellIterator();
                while (cells.hasNext()) {
                    HSSFCell cell = (HSSFCell) cells.next();

                    CellType type = cell.getCellType();
                    if (type == CellType.STRING) {
                        System.out.printf("[%d, %d] = STRING; Value = %s%n",
                                cell.getRowIndex(), cell.getColumnIndex(),
                                cell.getRichStringCellValue().toString());
                    } else if (type == CellType.NUMERIC) {
                        System.out.printf("[%d, %d] = NUMERIC; Value = %f%n",
                                cell.getRowIndex(), cell.getColumnIndex(),
                                cell.getNumericCellValue());
                    } else if (type == CellType.BOOLEAN) {
                        System.out.printf("[%d, %d] = BOOLEAN; Value = %b%n",
                                cell.getRowIndex(), cell.getColumnIndex(),
                                cell.getBooleanCellValue());
                    } else if (type == CellType.BLANK) {
                        System.out.printf("[%d, %d] = BLANK CELL%n",
                                cell.getRowIndex(), cell.getColumnIndex());
                    }
                }
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}

Our program iterates the Excel file rows and cells and produce the following
output:

[0, 0] = NUMERIC; Value = 1.000000
[0, 1] = NUMERIC; Value = 2.000000
[0, 2] = STRING; Value = A
[0, 3] = STRING; Value = B
[0, 4] = BOOLEAN; Value = true
[1, 0] = BOOLEAN; Value = false
[1, 1] = STRING; Value = X
[1, 2] = STRING; Value = Y
[1, 3] = STRING; Value = Z
[1, 4] = NUMERIC; Value = 10.000000

Maven Dependencies

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>5.2.3</version>
</dependency>

Maven Central