How does Base64 encoding work?

The Base64 encoding scheme is designed to encode binary data, especially when that data needs to be stored in or transferred over media designed to handle text. Base64 encoding helps ensure that the data arrives intact, without being modified in transit.

Base64 is a group of similar binary-to-text encoding schemes that represent binary data as an ASCII string by translating it into a radix-64 representation. It is commonly used in a number of applications, including email via MIME (Multipurpose Internet Mail Extensions) and storing complex data in XML or JSON.

Radix-64 refers to a numeral system that uses 64 unique characters to represent data. In the case of Base64 encoding, the 64 characters are typically the uppercase letters A–Z, the lowercase letters a–z, the numerals 0–9, and two additional characters, usually + and /. These characters are used to encode data into a text format that can be safely transferred over systems designed to handle text data.
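As a small illustration of the alphabet described above, here is a sketch in Python that builds the 64-character set and looks up a few entries. The names ALPHABET and value_for are made up for this example and are not part of any library.

```python
import string

# The usual Base64 alphabet: A-Z, a-z, 0-9, then '+' and '/'.
# ALPHABET is an illustrative name chosen for this sketch.
ALPHABET = (
    string.ascii_uppercase
    + string.ascii_lowercase
    + string.digits
    + "+/"
)


def value_for(char: str) -> int:
    """Return the 6-bit value (0-63) that a Base64 character stands for."""
    return ALPHABET.index(char)


# 64 characters means each one can carry exactly 6 bits (2 ** 6 == 64).
assert len(ALPHABET) == 64
assert ALPHABET[18] == "S"
assert value_for("G") == 6
```

The position of a character in the string is its value, so one string serves as the lookup table in both directions.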

Here’s a brief overview of how the Base64 encoding process works:

  • Take the input data (which is binary data). For example, we have the text “Hello”. Each character is converted into its ASCII code, and each code is written as an 8-bit binary number.
Character  H         e         l         l         o
ASCII      72        101       108       108       111
Binary     01001000  01100101  01101100  01101100  01101111
  • Split the input data into chunks of 24 bits (3 bytes). If the last chunk is shorter than 24 bits, it is padded with zero bits on the right to make up a full chunk. “Hello” is 40 bits long, so it yields one full chunk (“Hel”) and one 16-bit chunk (“lo”) that needs padding.
  • Each chunk of 24 bits is then split into four chunks of 6 bits.
6-bits  010010  000110  010101  101100
Value   18      6       21      44
6-bits  011011  000110  111100  (the final 00 is zero padding)
Value   27      6       60
  • Each 6-bit chunk is then mapped to an encoded character using the index table below, a set of 64 distinct characters: A–Z, a–z, 0–9, + and /.
Value     18  6   21  44  27  6   60
Encoding  S   G   V   s   b   G   8
  • If the final 24-bit chunk held only two input bytes (its third 8-bit block consisted entirely of padding zeros), the fourth 6-bit block of the output is replaced with one = symbol. If it held only one input byte (the second 8-bit block was also padding), both the third and fourth blocks of the output are replaced with = symbols.
Value     18  6   21  44  27  6   60  PAD
Encoding  S   G   V   s   b   G   8   =
  • The output is the encoded string.
SGVsbG8=
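The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a production encoder: the function name encode_base64 and the constant ALPHABET are made up for this example, and the standard library's base64 module is used only to cross-check the result.

```python
import base64
import string

# Standard Base64 alphabet; ALPHABET is an illustrative name for this sketch.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_base64(data: bytes) -> str:
    """Encode bytes by following the chunk-split-map-pad steps by hand."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1 or 2 bytes short of a 24-bit chunk

        # Pad the chunk with zero bytes on the right and read it as one
        # 24-bit integer.
        n = int.from_bytes(chunk + b"\x00" * missing, "big")

        # Split the 24 bits into four 6-bit values, most significant first.
        values = [(n >> shift) & 0x3F for shift in (18, 12, 6, 0)]
        chars = [ALPHABET[v] for v in values]

        # Output blocks that came purely from padding become '='.
        if missing:
            chars[4 - missing:] = ["="] * missing

        out.append("".join(chars))
    return "".join(out)


assert encode_base64(b"Hello") == "SGVsbG8="
# Cross-check against the standard library implementation.
assert encode_base64(b"Hello") == base64.b64encode(b"Hello").decode("ascii")
```

In everyday code you would normally call base64.b64encode directly; the hand-written version is only meant to mirror the walkthrough.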

This process can be reversed to decode a Base64 string back into the original binary data.
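A rough sketch of the reverse direction, again in Python. The name decode_base64 is hypothetical, and the function is deliberately simplified: it assumes well-formed input and does not check that = appears only at the very end.

```python
import string

# Standard Base64 alphabet; ALPHABET is an illustrative name for this sketch.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def decode_base64(text: str) -> bytes:
    """Decode a padded Base64 string back into the original bytes."""
    if len(text) % 4 != 0:
        raise ValueError("padded Base64 input must be a multiple of 4 characters")

    out = bytearray()
    for i in range(0, len(text), 4):
        group = text[i:i + 4]
        padding = group.count("=")  # 0, 1 or 2 '=' symbols in this group

        # Rebuild the 24-bit integer from four 6-bit values; '=' counts as 0.
        n = 0
        for ch in group:
            n = (n << 6) | (0 if ch == "=" else ALPHABET.index(ch))

        # Convert to 3 bytes, then drop the bytes that were only padding.
        out += n.to_bytes(3, "big")[:3 - padding]
    return bytes(out)


assert decode_base64("SGVsbG8=") == b"Hello"
```

Each group of four characters yields three bytes, minus one byte for every = symbol in the group.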

Do note, though, that while Base64 can encode any binary data, it is not encryption and should not be used as such: it does not hide or secure information; it is merely an encoding scheme.
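To make that point concrete, here is a short sketch using Python's standard base64 module. The sample value is invented for the example. Decoding requires no key or secret of any kind, so anyone who sees the encoded text can recover the original.

```python
import base64

secret = b"my password"  # made-up sample data for this illustration

encoded = base64.b64encode(secret)

# No key is involved: anyone holding the encoded text can reverse it.
recovered = base64.b64decode(encoded)

assert recovered == secret
```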

Base64 Index Table

Value  Encoding    Value  Encoding    Value  Encoding    Value  Encoding
0      A           16     Q           32     g           48     w
1      B           17     R           33     h           49     x
2      C           18     S           34     i           50     y
3      D           19     T           35     j           51     z
4      E           20     U           36     k           52     0
5      F           21     V           37     l           53     1
6      G           22     W           38     m           54     2
7      H           23     X           39     n           55     3
8      I           24     Y           40     o           56     4
9      J           25     Z           41     p           57     5
10     K           26     a           42     q           58     6
11     L           27     b           43     r           59     7
12     M           28     c           44     s           60     8
13     N           29     d           45     t           61     9
14     O           30     e           46     u           62     +
15     P           31     f           47     v           63     /
Wayan
