How do I replace text in a Microsoft Word document using Apache POI?

The code snippet below shows how to replace a string in a Microsoft Word document using the Apache POI library. The class has three methods: openDocument(), saveDocument() and replaceText().

The text replacement itself is implemented in the replaceText() method. This method takes the HWPFDocument, the String to find and the replacement String as parameters, then walks through every section, paragraph and character run of the document and replaces the text inside each run. The openDocument() method opens the Word document. When the replacement is done, the saveDocument() method saves the document.

Here is the complete code snippet. It replaces every o character with a 0 character in the source document, lipsum.doc, and saves the result in a new document called new-lipsum.doc.

package org.kodejava.example.poi;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.CharacterRun;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.hwpf.usermodel.Section;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.URL;

public class WordReplaceText {
    private static final String SOURCE_FILE = "lipsum.doc";
    private static final String OUTPUT_FILE = "new-lipsum.doc";

    public static void main(String[] args) throws Exception {
        WordReplaceText instance = new WordReplaceText();
        HWPFDocument doc = instance.openDocument(SOURCE_FILE);
        if (doc != null) {
            doc = instance.replaceText(doc, "o", "0");
            instance.saveDocument(doc, OUTPUT_FILE);
        }
    }

    private HWPFDocument replaceText(HWPFDocument doc, String findText, String replaceText) {
        // The Range covers the whole document: sections -> paragraphs -> character runs.
        Range r = doc.getRange();
        for (int i = 0; i < r.numSections(); i++) {
            Section s = r.getSection(i);
            for (int j = 0; j < s.numParagraphs(); j++) {
                Paragraph p = s.getParagraph(j);
                for (int k = 0; k < p.numCharacterRuns(); k++) {
                    CharacterRun run = p.getCharacterRun(k);
                    String text = run.text();
                    // Only text that lies entirely inside a single run is matched here.
                    if (text.contains(findText)) {
                        run.replaceText(findText, replaceText);
                    }
                }
            }
        }
        return doc;
    }

    private HWPFDocument openDocument(String file) throws Exception {
        // The source document is looked up on the classpath.
        URL res = getClass().getClassLoader().getResource(file);
        HWPFDocument document = null;
        if (res != null) {
            // toURI() decodes escaped characters (such as %20 for a space) in the path.
            document = new HWPFDocument(new POIFSFileSystem(
                new File(res.toURI())));
        }
        return document;
    }

    private void saveDocument(HWPFDocument doc, String file) {
        try (FileOutputStream out = new FileOutputStream(file)) {
            doc.write(out);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
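
One caveat about the per-run loop in replaceText(): a word can be split across several character runs, so a search string longer than one character may never appear inside any single run. The sketch below illustrates this with plain strings only. It does not use Apache POI, and the class RunSplitDemo, its two methods and the sample run boundaries are hypothetical names and data made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RunSplitDemo {

    // Mimics the per-run loop: each run is searched and replaced on its own.
    static List<String> replacePerRun(List<String> runs, String findText, String replacement) {
        List<String> result = new ArrayList<>();
        for (String run : runs) {
            result.add(run.replace(findText, replacement));
        }
        return result;
    }

    // Joins the runs first, so a match may span a run boundary.
    static String replaceAcrossRuns(List<String> runs, String findText, String replacement) {
        return String.join("", runs).replace(findText, replacement);
    }

    public static void main(String[] args) {
        // Assumed run boundaries: the word "ipsum" is split between two runs.
        List<String> runs = Arrays.asList("Lorem ip", "sum dolor");

        // Single characters are always found, because they cannot span two runs.
        System.out.println(replacePerRun(runs, "o", "0"));         // [L0rem ip, sum d0l0r]

        // The split word is missed when each run is handled separately...
        System.out.println(replacePerRun(runs, "ipsum", "IPSUM")); // [Lorem ip, sum dolor]

        // ...but found when the text is searched as a whole.
        System.out.println(replaceAcrossRuns(runs, "ipsum", "IPSUM")); // Lorem IPSUM dolor
    }
}
```

In the POI code, one possible way around this is to call replaceText(findText, replaceText) on a wider Range, such as the paragraph or the Range returned by doc.getRange(), instead of on each CharacterRun. That variant was not tested for this article, so treat it as a starting point rather than a verified fix.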

Maven Dependencies

<!-- https://search.maven.org/remotecontent?filepath=org/apache/poi/poi/4.1.0/poi-4.1.0.jar -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>4.1.0</version>
</dependency>
<!-- https://search.maven.org/remotecontent?filepath=org/apache/poi/poi-scratchpad/4.1.0/poi-scratchpad-4.1.0.jar -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <version>4.1.0</version>
</dependency>

5 Comments

  1. Your method replaceText(...) will not fulfill its contract. It will work on char by char replacements, but not words (String), since a word can and will be stored in different runs. So you should change the signature of your method to replaceText(HWPFDocument doc, char a, char b).
