Regex - Learn Java by Examples

How do I do pattern matching using regular expression in Spring EL?

By Wayan in Spring Core Last modified: June 30, 2023 0 Comment

In this Spring Expression Language example we are going to learn how to use regular expression or regex to check if a text matches a certain pattern. Spring EL support regular expression using the matches operator.

The matches operator will check if the value has a pattern defined by the regex string, and it returns the evaluation result as a boolean value true if the text matches the regex or false if otherwise.

For example, we can use the matches operator to check if the given email address is a valid email address. As can be seen in the following example:

<property name="emailValid" 
          value="#{user.email matches '^[_A-Za-z0-9-]+(.[_A-Za-z0-9-]+)*@[A-Za-z0-9]+(.[A-Za-z0-9]+)*(.[A-Za-z]{2,})$'}"/>

This configuration will evaluate the user.email property to check if the email pattern matches with the given regular expression. If matches then the emailValid property will be set to true otherwise it will be false.

Let’s see the complete example. Here are the spring configuration file, the User bean and a simple class for running the configuration file.

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd">

    <bean id="user" class="org.kodejava.spring.core.el.User">
        <constructor-arg name="email" value="[email protected]" />
        <property name="emailValid"
                  value="#{user.email matches '^[_A-Za-z0-9-]+(.[_A-Za-z0-9-]+)*@[A-Za-z0-9]+(.[A-Za-z0-9]+)*(.[A-Za-z]{2,})$'}" />
    </bean>

    <bean id="user2" class="org.kodejava.spring.core.el.User">
        <constructor-arg name="email" value="kodejava.at.gmail.dot.com" />
        <property name="emailValid"
                  value="#{user2.email matches '^[_A-Za-z0-9-]+(.[_A-Za-z0-9-]+)*@[A-Za-z0-9]+(.[A-Za-z0-9]+)*(.[A-Za-z]{2,})$'}" />
    </bean>

</beans>

The User bean is a simple pojo with two properties, a string email property and a boolean validEmail property.

package org.kodejava.spring.core.el;

public class User {
    private String email;
    private boolean emailValid;

    public User() {
    }

    public User(String email) {
        this.email = email;
    }

    public String getEmail() {
        return email;
    }

    public void setEmail(String email) {
        this.email = email;
    }

    public boolean isEmailValid() {
        return emailValid;
    }

    public void setEmailValid(boolean emailValid) {
        this.emailValid = emailValid;
    }
}

And finally the application class.

package org.kodejava.spring.core.el;

import org.springframework.context.support.ClassPathXmlApplicationContext;

public class SpELRegexExample {
    public static void main(String[] args) {
        try (ClassPathXmlApplicationContext context =
                     new ClassPathXmlApplicationContext("spel-regex.xml")) {

            User user = (User) context.getBean("user");
            System.out.println("user.getEmail()     = " + user.getEmail());
            System.out.println("user.isEmailValid() = " + user.isEmailValid());

            User user2 = (User) context.getBean("user2");
            System.out.println("user.getEmail()     = " + user2.getEmail());
            System.out.println("user.isEmailValid() = " + user2.isEmailValid());
        }
    }
}

When we run the code we will obtain the following result:

user.getEmail()     = [email protected]
user.isEmailValid() = true
user.getEmail()     = kodejava.at.gmail.dot.com
user.isEmailValid() = false

Maven Dependencies

<dependencies>
    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-core</artifactId>
        <version>5.3.23</version>
    </dependency>
    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-beans</artifactId>
        <version>5.3.23</version>
    </dependency>
    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-context-support</artifactId>
        <version>5.3.23</version>
    </dependency>
</dependencies>

How to truncate a string after n number of words?

By Wayan in Regex Last modified: June 8, 2023 1 Comment

package org.kodejava.regex;

public class GetNumberOfWordsFromString {
    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog.";

        String one = truncateAfterWords(1, text);
        System.out.println("1 = " + one);

        String two = truncateAfterWords(2, text);
        System.out.println("2 = " + two);

        String four = truncateAfterWords(4, text);
        System.out.println("4 = " + four);

        String six = truncateAfterWords(6, text);
        System.out.println("6 = " + six);
    }

    /**
     * Truncate a string after n number of words.
     *
     * @param words number of words to truncate after.
     * @param text  the text.
     * @return truncated text.
     */
    public static String truncateAfterWords(int words, String text) {
        String regex = String.format("^((?:\\W*\\w+){%s}).*$", words);
        return text.replaceAll(regex, "$1");
    }
}

The result of the snippet:

1 = The
2 = The quick
4 = The quick brown fox
6 = The quick brown fox jumps over

How to remove non ASCII characters from a string?

By Wayan in Regex Last modified: June 8, 2023 0 Comment

The code snippet below remove the characters from a string that is not inside the range of x20 and x7E ASCII code. The regex below strips non-printable and control characters. But it also keeps the linefeed character n (x0A) and the carriage return r (x0D) characters.

package org.kodejava.regex;

public class ReplaceNonAscii {
    public static void main(String[] args) {
        String str = "ThÃ¨ quÃ¯ck brÃ¸wn fÃ¸x jumps over the lÃ£zy dÃ´g.";
        System.out.println("str = " + str);

        // Replace all non ascii chars in the string.
        str = str.replaceAll("[^\\x0A\\x0D\\x20-\\x7E]", "");
        System.out.println("str = " + str);
    }
}

Snippet output:

str = ThÃ¨ quÃ¯ck brÃ¸wn fÃ¸x jumps over the lÃ£zy dÃ´g.
str = Th quck brwn fx jumps over the lzy dg.

How do I count the number of capturing groups?

By Wayan in Regex Last modified: May 31, 2023 0 Comment

Capturing groups are numbered by counting the opening parentheses from left to right. To find out how many groups are present in the expression, call the groupCount() method on a matcher object. The groupCount() method returns an int showing the number of capturing groups present in the matcher’s pattern.

package org.kodejava.example.regex;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CountingGroupDemo {
    public static void main(String[] args) {
        // Define regex to find the word 'quick' or 'lazy' or 'dog'
        String regex = "(quick)|(lazy)|(dog)";
        String text = "the quick brown fox jumps over the lazy dog";

        // Obtain the required matcher
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(text);

        int groupCount = matcher.groupCount();
        System.out.println("Number of group = " + groupCount);

        // Find every match and print it
        while (matcher.find()) {
            for (int i = 0; i <= groupCount; i++) {
                // Group i substring
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}

The result of the program:

Number of group = 3
Group 0: quick
Group 1: quick
Group 2: null
Group 3: null
Group 0: lazy
Group 1: null
Group 2: lazy
Group 3: null
Group 0: dog
Group 1: null
Group 2: null
Group 3: dog

How do I compile character classes with quantifier?

By Wayan in Regex Last modified: May 31, 2023 0 Comment

This example show you how to attach quantifier to character classes or capturing group in regular expressions.

package org.kodejava.example.regex;

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class CombineWithQuantifier {
    public static void main(String[] args) {
        // [abc]{3} --> apply quantifier in character class.
        // Find 'a' or 'b' or 'c', three times in a row.
        //
        // (abc){3} --> apply quantifier in capturing group.
        // Find 'abc', three times in a row.
        //
        // abc{3} --> apply quantifier in character class.
        // Find character 'c', three times in a row.
        String[] regexs = {"[abc]{3}", "(abc){3}", "abc{3}"};
        String text = "abcabcabcabcaba";

        for (String regex : regexs) {
            Pattern pattern = Pattern.compile(regex);
            Matcher matcher = pattern.matcher(text);

            // Find every match and print it
            System.out.format("Regex:  %s %n", regex);
            while (matcher.find()) {
                System.out.format("Text \"%s\" found at %d to %d.%n",
                    matcher.group(), matcher.start(),
                    matcher.end());
            }
            System.out.println("------------------------------");
        }
    }
}

This program will print the following output:

Regex:  [abc]{3} 
Text "abc" found at 0 to 3.
Text "abc" found at 3 to 6.
Text "abc" found at 6 to 9.
Text "abc" found at 9 to 12.
Text "aba" found at 12 to 15.
------------------------------
Regex:  (abc){3} 
Text "abcabcabc" found at 0 to 9.
------------------------------
Regex:  abc{3} 
------------------------------