In Java (and in programming generally), whitespace is not limited to the space character ' '. It covers any character that creates a gap or break in the text. The most common ones are:
- space ' '
- tab '\t'
- newline '\n'
- carriage return '\r'
- form feed '\f'
All of these are whitespace characters.
To check whether a character is one of these whitespace characters, use the built-in method Character.isWhitespace(char ch). Character is a class that provides a number of useful static methods for working with characters, and isWhitespace() is the one that checks whether the given character is whitespace.
Here is a simple code snippet:
package org.kodejava.lang;

public class CharacterIsWhitespace {
    public static void main(String[] args) {
        char ch = ' ';
        if (Character.isWhitespace(ch)) {
            System.out.println(ch + " is a whitespace character.");
        } else {
            System.out.println(ch + " is not a whitespace character.");
        }
    }
}
This code defines a character ch and passes it to Character.isWhitespace(ch). The method returns true if the character is a space, newline, tab, or another whitespace character, and false otherwise.
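
The same check extends naturally from one character to a whole string. The following is a minimal sketch, not part of the original example: the class name WhitespaceCounter and the helper countWhitespace are hypothetical names chosen for illustration.

```java
class WhitespaceCounter {
    // Hypothetical helper: counts the chars in text for which
    // Character.isWhitespace(char) returns true.
    static int countWhitespace(String text) {
        int count = 0;
        for (char ch : text.toCharArray()) {
            if (Character.isWhitespace(ch)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "Hello, World!\tJava\n";
        // Prints 3: one space, one tab, and one newline.
        System.out.println(countWhitespace(text));
    }
}
```

Iterating over toCharArray() keeps the sketch close to the char-based method discussed here.
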
Here’s a little more expansive example:
package org.kodejava.lang;

import java.util.Arrays;
import java.util.List;

public class CharacterIsWhitespaceDemo {
    public static void main(String[] args) {
        List<Character> characters = Arrays.asList(' ', '\t', '\n', '\r', '\f', 'a', '1');
        for (char ch : characters) {
            if (Character.isWhitespace(ch)) {
                System.out.println("'" + ch + "' is a whitespace character.");
            } else {
                System.out.println("'" + ch + "' is not a whitespace character.");
            }
        }
    }
}
Output:
' ' is a whitespace character.
' ' is a whitespace character.
'
' is a whitespace character.
' is a whitespace character.
'' is a whitespace character.
'a' is not a whitespace character.
'1' is not a whitespace character.
This snippet checks each character in a list and reports whether it is whitespace. The list contains a space, a tab, a newline, a carriage return, a form feed, a letter, and a digit, and isWhitespace() classifies each one correctly. The output looks uneven because the control characters are printed literally: the newline splits its line in two, and how the tab, carriage return, and form feed appear depends on the console.
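
Since control characters are hard to read when printed literally, one option is to print each character's code point instead. This is a sketch of that idea under the same set of test characters; the class name and the helper codePointLabel are hypothetical and not part of the original example.

```java
class CharacterIsWhitespaceReadable {
    // Hypothetical helper: formats a char as its code point in U+XXXX notation.
    static String codePointLabel(char ch) {
        return String.format("U+%04X", (int) ch);
    }

    public static void main(String[] args) {
        char[] characters = {' ', '\t', '\n', '\r', '\f', 'a', '1'};
        for (char ch : characters) {
            String verdict = Character.isWhitespace(ch) ? "is" : "is not";
            // Prints the label rather than the raw character, so every
            // result stays on a single, readable line.
            System.out.println(codePointLabel(ch) + " " + verdict + " a whitespace character.");
        }
    }
}
```

With this variant the tab is reported as U+0009, the newline as U+000A, and so on, while the letter and digit appear as U+0061 and U+0031.
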
Character.isWhitespace(char ch) is also aware of Unicode. According to its Javadoc, a character is Java whitespace if and only if it is a Unicode space character (category “Zs”, “Zl”, or “Zp”, i.e. a space, line, or paragraph separator) that is not a non-breaking space (U+00A0, U+2007, U+202F), or it is one of the following characters:
- U+0009, HORIZONTAL TABULATION (‘\t’)
- U+000A, LINE FEED (‘\n’)
- U+000B, VERTICAL TABULATION
- U+000C, FORM FEED (‘\f’)
- U+000D, CARRIAGE RETURN (‘\r’)
- U+001C, FILE SEPARATOR
- U+001D, GROUP SEPARATOR
- U+001E, RECORD SEPARATOR
- U+001F, UNIT SEPARATOR
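
The rules can be probed directly with a few boundary cases. The sketch below relies on the documented behavior of Character.isWhitespace in the Javadoc, which excludes the non-breaking spaces (U+00A0, U+2007, U+202F) and includes the line and paragraph separators as well as U+001C through U+001F. The class name and the helper isJavaWhitespace are hypothetical.

```java
class WhitespaceRulesDemo {
    // Hypothetical helper: applies Character.isWhitespace(char) to a
    // code point in the Basic Multilingual Plane given as an int.
    static boolean isJavaWhitespace(int codePoint) {
        return Character.isWhitespace((char) codePoint);
    }

    public static void main(String[] args) {
        // Code points are written as hex ints so that invisible
        // characters stay readable in the source file.
        int[] samples = {
            0x0020, // SPACE (Zs): true
            0x2003, // EM SPACE (Zs): true
            0x2028, // LINE SEPARATOR (Zl): true
            0x2029, // PARAGRAPH SEPARATOR (Zp): true
            0x000B, // VERTICAL TABULATION: true
            0x001C, // FILE SEPARATOR: true
            0x00A0, // NO-BREAK SPACE: false (non-breaking)
            0x2007, // FIGURE SPACE: false (non-breaking)
            0x202F  // NARROW NO-BREAK SPACE: false (non-breaking)
        };
        for (int codePoint : samples) {
            System.out.println(String.format("U+%04X -> %b", codePoint, isJavaWhitespace(codePoint)));
        }
    }
}
```

The non-breaking spaces are the cases most likely to surprise: they belong to category Zs, yet isWhitespace() returns false for them.
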
Here is an example of checking Unicode whitespace:
package org.kodejava.lang;

public class CharacterIsWhitespaceUnicode {
    public static void main(String[] args) {
        char ch = '\u2003'; // EM SPACE
        if (Character.isWhitespace(ch)) {
            System.out.println("Character '" + ch + "' (\\u2003) is a whitespace character.");
        } else {
            System.out.println("Character '" + ch + "' (\\u2003) is not a whitespace character.");
        }
    }
}
Output:
Character ' ' (\u2003) is a whitespace character.
In this example, \u2003 is the Unicode escape for the EM SPACE character, a space separator (category “Zs”) in the Unicode standard, so isWhitespace() returns true for it.