Base64 encoding scheme is designed to encode binary data, especially when that data needs to be stored and transferred over media that designed to deal with text. Base64 encoding helps to ensure that the data remains intact without modification during transport.
Base64 is a group of similar binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation. Base64 is used commonly in a number of applications including email via MIME (Multipurpose Internet Mail Extensions), and storing complex data in XML or JSON.
Radix-64 refers to a numeral system that uses 64 unique characters to represent data. In the case of Base64 encoding, the 64 characters are typically the uppercase letters A
–Z
, the lowercase letters a
–z
, the numerals 0
–9
, and an additional two characters, usually +
and /
. These characters are used to encode data into a text format that can be safely transferred over various systems designed to handle text data.
Here’s a brief overview of how the Base64 encoding process works:
- Take the input data (which is binary data). For example, we have the text “Hello”. Each character is converted into their ASCII code.
- Split the input data into chunks of 24 bits (3 bytes). If the input is not divisible by 24, it is padded with zeros on the right to make up a full chunk.
ASCII | H = 72 | e = 101 | l = 108 | l = 108 | o = 111 |
---|---|---|---|---|---|
Binary | 01001000 | 01100101 | 01101100 | 01101100 | 01101111 |
- Each chunk of 24 bits is then split into four chunks of 6 bits.
6-bits | 010010 | 000110 | 010101 | 101100 |
---|---|---|---|---|
Value | 18 | 6 | 21 | 44 |
6-bits | 011011 | 000110 | 111100 [00 = PAD] |
---|---|---|---|
Value | 27 | 6 | 60 |
- Each 6-bit chunk is then mapped to an encoded character using an index table (below), which is a set of 64 distinct characters—these are A–Z, a–z, 0–9, + and / in the Base64 alphabet.
Value | 18 | 6 | 21 | 44 | 27 | 6 | 60 |
---|---|---|---|---|---|---|---|
Encoding | S | G | V | s | b | G | 8 |
- If the last 8-bit block (third block for normal Base64 encoding) had the padding zero bits, the output of the corresponding 6-bit block (fourth block for normal Base64 encoding) is replaced with one
=
symbol. If the second 8-bit block also had padding zeros, then fourth, and third blocks of Base64 encoding will have=
symbols.
Value | 18 | 6 | 21 | 44 | 27 | 6 | 60 | PAD |
---|---|---|---|---|---|---|---|---|
Encoding | S | G | V | s | b | G | 8 | = |
- The output is the encoded string.
SGVsbG8=
This process can be reversed to decode a Base64 string back into the original binary data.
Do note, though, while Base64 can encode any binary data, it is not encryption nor should it be used for encryption purposes – it does not hide or secure information, it’s merely an encoding scheme.
Base64 Index Table
Value | Encoding | Value | Encoding | Value | Encoding | Value | Encoding |
---|---|---|---|---|---|---|---|
0 | A | 16 | Q | 32 | g | 48 | w |
1 | B | 17 | R | 33 | h | 49 | x |
2 | C | 18 | S | 34 | i | 50 | y |
3 | D | 19 | T | 35 | j | 51 | z |
4 | E | 20 | U | 36 | k | 52 | 0 |
5 | F | 21 | V | 37 | l | 53 | 1 |
6 | G | 22 | W | 38 | m | 54 | 2 |
7 | H | 23 | X | 39 | n | 55 | 3 |
8 | I | 24 | Y | 40 | o | 56 | 4 |
9 | J | 25 | Z | 41 | p | 57 | 5 |
10 | K | 26 | a | 42 | q | 58 | 6 |
11 | L | 27 | b | 43 | r | 59 | 7 |
12 | M | 28 | c | 44 | s | 60 | 8 |
13 | N | 29 | d | 45 | t | 61 | 9 |
14 | O | 30 | e | 46 | u | 62 | + |
15 | P | 31 | f | 47 | v | 63 | / |
- How do I get number of each day for a certain month in Java? - September 8, 2024
- How do I get operating system process information using ProcessHandle? - July 22, 2024
- How do I sum a BigDecimal property of a list of objects using Java Stream API? - July 22, 2024