How to remove non ASCII characters from a string?

The code snippet below remove the characters from a string that is not inside the range of x20 and x7E ASCII code. The regex below strips non-printable and control characters. But it also keeps the linefeed character n (x0A) and the carriage return r (x0D) characters.

package org.kodejava.example.regex;

public class ReplaceNonAscii {
    public static void main(String[] args) {
        String str = "Thè quïck brøwn føx jumps over the lãzy dôg.";
        System.out.println("str = " + str);

        // Replace all non ascii chars in the string.
        str = str.replaceAll("[^\x0A\x0D\x20-\x7E]", "");
        System.out.println("str = " + str);

Snippet output:

str = Thè quïck brøwn føx jumps over the lãzy dôg.
str = Th quck brwn fx jumps over the lzy dg.

Wayan Saryada

A programmer, runner, recreational diver, currently living in the island of Bali, Indonesia. Mostly programming in Java, creating web based application with Spring Framework, JPA, etc. If you need help on Java programming you can hire me on Fiverr.

Leave a Reply