How do I detect non-ASCII characters in a string?

The code below detects whether a given string contains non-ASCII characters. We use the CharsetDecoder class from the java.nio.charset package to decode the string's bytes as US-ASCII. If any byte is not a valid US-ASCII character, the decoder throws a CharacterCodingException.


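import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class NonAsciiValidation {
    public static void main(String[] args) {
        // The first string contains a non-ASCII character (©), so decoding it
        // throws an exception. The second string contains only ASCII characters.
        // The bytes are encoded explicitly as UTF-8 so the result does not
        // depend on the platform default charset.
        byte[] invalidBytes = "Copyright © 2017 Kode Java Org".getBytes(StandardCharsets.UTF_8);
        byte[] validBytes = "Copyright (c) 2017 Kode Java Org".getBytes(StandardCharsets.UTF_8);

        // Create a decoder for the US-ASCII charset. By default the decoder
        // reports malformed input by throwing a CharacterCodingException.
        CharsetDecoder decoder = StandardCharsets.US_ASCII.newDecoder();
        try {
            // Succeeds: every byte is a valid US-ASCII character.
            CharBuffer buffer = decoder.decode(ByteBuffer.wrap(validBytes));
            System.out.println(Arrays.toString(buffer.array()));

            // Fails: the bytes that encode © are outside the US-ASCII range.
            buffer = decoder.decode(ByteBuffer.wrap(invalidBytes));
            System.out.println(Arrays.toString(buffer.array()));
        } catch (CharacterCodingException e) {
            System.err.println("The information contains a non ASCII character(s).");
            e.printStackTrace();
        }
    }
}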

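Below is the result of the program. The error message and stack trace are written to standard error, so their position relative to the other output may vary: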

The information contains a non ASCII character(s).
[C, o, p, y, r, i, g, h, t,  , (, c, ),  , 2, 0, 1, 7,  , K, o, d, e,  , J, a, v, a,  , O, r, g]
java.nio.charset.MalformedInputException: Input length = 1
    at java.base/java.nio.charset.CoderResult.throwException(
    at java.base/java.nio.charset.CharsetDecoder.decode(
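
If you already have a String rather than raw bytes, a check that works directly on the characters may be simpler. The sketch below is one possible alternative, not part of the original program: the class name AsciiCheck and the helper methods isPureAscii and firstNonAsciiIndex are hypothetical names chosen for illustration. It relies on CharsetEncoder.canEncode and on the fact that US-ASCII covers the code points 0 to 127.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of the original example.
class AsciiCheck {

    // Returns true when every character of the text can be encoded as US-ASCII.
    static boolean isPureAscii(String text) {
        // Encoders are not safe for concurrent use, so create a fresh one per call.
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        return encoder.canEncode(text);
    }

    // Returns the index of the first character above U+007F, or -1 if there is none.
    static int firstNonAsciiIndex(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) > 0x7F) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // \u00A9 is the copyright sign; the escape avoids source-encoding issues.
        String invalid = "Copyright \u00A9 2017 Kode Java Org";
        String valid = "Copyright (c) 2017 Kode Java Org";

        System.out.println(isPureAscii(valid));          // prints true
        System.out.println(isPureAscii(invalid));        // prints false
        System.out.println(firstNonAsciiIndex(invalid)); // prints 10
    }
}
```

Compared with the decoder approach, this avoids converting the string to bytes first and does not use exceptions for control flow. The index returned by firstNonAsciiIndex can be useful for pointing the user to the offending character.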
