How do I find text between two strings?

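In this example we’ll use the StringUtils.substringBetween() method from Apache Commons Lang to extract the title and body of an HTML document. The method returns the text between the first occurrence of the opening string and the next occurrence of the closing string, or null if there is no match. Let’s see the code.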

package org.kodejava.commons.lang;

import java.util.Date;

import org.apache.commons.lang3.StringUtils;

public class NestedString {
    public static void main(String[] args) {
        // Sample HTML document; the <head> element is closed before <body> starts.
        String helloHtml = "<html>" +
                "<head>" +
                "   <title>Hello World from Java</title>" +
                "</head>" +
                "<body>" +
                "Hello, today is: " + new Date() +
                "</body>" +
                "</html>";

        String title = StringUtils.substringBetween(helloHtml, "<title>", "</title>");
        String content = StringUtils.substringBetween(helloHtml, "<body>", "</body>");

        System.out.println("title = " + title);
        System.out.println("content = " + content);
    }
}

By printing out the title and content, we’ll see something similar to:

title = Hello World from Java
content = Hello, today is: Thu Sep 30 06:32:32 CST 2021
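
To make the lookup logic easier to follow, here is a minimal plain-JDK sketch of the same idea, written with String.indexOf(). It is not the library's source code: the class name SubstringBetweenSketch and the method between() are hypothetical names made up for this illustration, and the sketch only assumes the behavior described above (first opening string, next closing string, null when nothing matches).

```java
public class SubstringBetweenSketch {

    // Hypothetical helper: returns the text between the first `open`
    // and the next `close` after it, or null if either one is missing.
    public static String between(String text, String open, String close) {
        if (text == null || open == null || close == null) {
            return null;
        }
        int start = text.indexOf(open);
        if (start < 0) {
            return null; // opening string not found
        }
        start += open.length(); // skip past the opening string itself
        int end = text.indexOf(close, start);
        if (end < 0) {
            return null; // closing string not found after the opening one
        }
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Hello World from Java</title></head>"
                + "<body>Hello</body></html>";

        System.out.println("title = " + between(html, "<title>", "</title>"));
        System.out.println("content = " + between(html, "<body>", "</body>"));
        System.out.println("missing = " + between(html, "<h1>", "</h1>"));
    }
}
```

Running the sketch prints the title, the body text, and "missing = null" for the tag that is not in the document, so callers should check for null before using the result.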

Maven Dependencies

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.14.0</version>
</dependency>


Wayan

2 Comments

  1. Hi, how can I get the text between a tag that has attributes, like <a href rel="nofollow">some text here</a>?

    String content = StringUtils.substringBetween(helloHtml, "<a>", "</a>");
    

    This won't work because the <a> tag has attributes inside it.

    • Hi Richa,

      You can try a regex, as in the following example:

      package tacos;
      
      import java.util.regex.Matcher;
      import java.util.regex.Pattern;
      
      public class TextA {
          public static void main(String[] args) {
              String input = "<a href=\"https://kodejava.org\" target=\"blank\">" +
                      "Kode Java Hello World" +
                      "</a>";
      
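              // [^>]* keeps the attribute match inside the opening tag, so several anchors on one line are matched separately.
              Pattern pattern = Pattern.compile("<a\\b[^>]*>(.+?)</a>");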
              Matcher matcher = pattern.matcher(input);
      
              while (matcher.find()) {
                  String anchorText = matcher.group(1);
                  System.out.println("anchorText = " + anchorText);
              }
          }
      }
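
For input that contains several anchors, a variation of the regex approach can collect every anchor text into a list. This is only a sketch under stated assumptions: the class name AnchorTextSketch and the method anchorTexts() are hypothetical, and the pattern assumes that attribute values never contain a ">" character and that anchors are not nested. For real-world HTML a proper HTML parser is likely the more robust choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchorTextSketch {

    // <a\b     : an opening anchor tag (the word boundary rejects tags such as <abbr>)
    // [^>]*    : any attributes, stopping at the end of the opening tag
    // (.*?)    : the anchor text, matched reluctantly up to the nearest </a>
    private static final Pattern ANCHOR = Pattern.compile(
            "<a\\b[^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Hypothetical helper: returns the text of every anchor found in the input.
    public static List<String> anchorTexts(String html) {
        List<String> texts = new ArrayList<>();
        Matcher matcher = ANCHOR.matcher(html);
        while (matcher.find()) {
            texts.add(matcher.group(1));
        }
        return texts;
    }

    public static void main(String[] args) {
        String input = "<a href=\"https://kodejava.org\" target=\"blank\">Kode Java</a>"
                + " and <a rel=\"nofollow\">some text here</a>";

        for (String text : anchorTexts(input)) {
            System.out.println("anchorText = " + text);
        }
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call, and DOTALL lets the anchor text span line breaks.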
      