The code below detects whether a given string contains non-ASCII characters. We use the CharsetDecoder
class from the java.nio.charset
package to decode the string's bytes as US-ASCII. If any byte is not valid US-ASCII, the decoder throws a CharacterCodingException.
package org.kodejava.io;

import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class NonAsciiValidation {
    public static void main(String[] args) {
        // The first string contains a non-ASCII character (©) and makes the decoder
        // throw an exception; the second contains only ASCII characters. Both are
        // encoded as UTF-8 explicitly so the bytes don't depend on the platform default.
        byte[] invalidBytes = "Copyright © 2021 Kode Java Org".getBytes(StandardCharsets.UTF_8);
        byte[] validBytes = "Copyright (c) 2021 Kode Java Org".getBytes(StandardCharsets.UTF_8);

        // Create a US-ASCII decoder; by default it reports malformed input by throwing.
        CharsetDecoder decoder = StandardCharsets.US_ASCII.newDecoder();
        try {
            CharBuffer buffer = decoder.decode(ByteBuffer.wrap(validBytes));
            System.out.println(Arrays.toString(buffer.array()));
            buffer = decoder.decode(ByteBuffer.wrap(invalidBytes));
            System.out.println(Arrays.toString(buffer.array()));
        } catch (CharacterCodingException e) {
            System.err.println("The input contains one or more non-ASCII characters.");
            e.printStackTrace();
        }
    }
}
Below is the result of the program:
[C, o, p, y, r, i, g, h, t, , (, c, ), , 2, 0, 2, 1, , K, o, d, e, , J, a, v, a, , O, r, g]
The input contains one or more non-ASCII characters.
java.nio.charset.MalformedInputException: Input length = 1
    at java.base/java.nio.charset.CoderResult.throwException(CoderResult.java:274)
    at java.base/java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:820)
    at org.kodejava.io.NonAsciiValidation.main(NonAsciiValidation.java:23)
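
If you only need a yes/no answer rather than the decoded characters, the same decoder check can be wrapped in a small helper. The following is a minimal sketch, not part of the original program: the class name AsciiCheck and both isAscii methods are hypothetical names chosen for illustration. The first variant decodes a byte array as shown above; the second works on a String directly, using CharsetEncoder.canEncode, which is a different technique from decoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named for illustration only.
final class AsciiCheck {

    // Returns true when every byte in the array decodes as US-ASCII.
    static boolean isAscii(byte[] bytes) {
        // REPORT is already the default action; it is set explicitly here
        // to make the "throw on invalid input" behavior visible.
        CharsetDecoder decoder = StandardCharsets.US_ASCII.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Returns true when every character of the string can be encoded
    // as US-ASCII. No byte array or exception handling is involved.
    static boolean isAscii(String text) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // \u00A9 is the copyright sign, written as an escape so the
        // source file itself stays pure ASCII.
        String valid = "Copyright (c) 2021 Kode Java Org";
        String invalid = "Copyright \u00A9 2021 Kode Java Org";

        System.out.println(isAscii(valid));
        System.out.println(isAscii(invalid));
        System.out.println(isAscii(valid.getBytes(StandardCharsets.UTF_8)));
        System.out.println(isAscii(invalid.getBytes(StandardCharsets.UTF_8)));
    }
}
```

A new decoder or encoder is created on each call because these objects are not safe for concurrent use. If the check runs in a hot loop on a single thread, reusing one instance may be worth considering.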