BASE32 Encoder vs. Base64: When to Use Each and Why

How to Use a BASE32 Encoder: Step-by-Step Guide with Examples

What BASE32 is

BASE32 is an encoding scheme that represents binary data using a 32-character alphabet (A–Z and 2–7). It maps each 5-bit group of the input to one ASCII character, so every 5 bytes become 8 characters and the output is about 60% larger than the input. The result is plain text suitable for URLs, filenames, and systems that are case-insensitive or limited to a restricted character set.

When to use it

  • Store or transmit binary data where case-insensitivity or filename-safety matters.
  • Represent keys, tokens, or small binary blobs in human-readable form.
  • Use in systems that require a limited character set (e.g., DNS labels, some QR code scenarios).

Step-by-step: encode text (conceptual)

  1. Convert input text to bytes using a character encoding (usually UTF-8).
  2. Treat the bytes as one continuous bit stream and split it into 5-bit groups.
  3. Map each 5-bit value (0–31) to the BASE32 alphabet: A–Z for 0–25, 2–7 for 26–31.
  4. If the final group has fewer than 5 bits, pad it with zero bits on the right, then append ‘=’ characters until the output length is a multiple of 8 (standard RFC 4648 behavior).
  5. Output the resulting BASE32 string.
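The encoding steps above can be sketched in a few lines of Python. This is a minimal illustration that works on a string of '0' and '1' characters for readability, not a tuned implementation; the names `ALPHABET` and `b32_encode` are made up for this example.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # values 0-31

def b32_encode(data: bytes) -> str:
    """Illustrative Base32 encoder (hypothetical helper, RFC 4648 alphabet)."""
    # Step 2: view the bytes as one long bit string.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Step 4, first half: right-pad with zero bits to a multiple of 5.
    bits += "0" * (-len(bits) % 5)
    # Step 3: map each 5-bit group to its alphabet character.
    chars = [ALPHABET[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5)]
    # Step 4, second half: append '=' until the length is a multiple of 8.
    chars.append("=" * (-len(chars) % 8))
    return "".join(chars)

print(b32_encode("hello".encode("utf-8")))  # NBSWY3DP
```

In real code, prefer a library routine such as Python's `base64.b32encode`; the sketch only makes the bit grouping visible.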

Step-by-step: decode BASE32 (conceptual)

  1. Strip whitespace and ‘=’ padding. Strict RFC 4648 decoders reject any other character outside the alphabet rather than silently dropping it.
  2. Map each BASE32 character back to its 5-bit value.
  3. Concatenate bits and split into 8-bit bytes.
  4. Discard any extra padding bits added during encoding.
  5. Convert bytes back to text using the original character encoding (UTF-8).
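The decoding steps can be sketched the same way. This version is deliberately strict: it ignores whitespace and trailing ‘=’, but raises an error on any other character outside the alphabet. The names `ALPHABET` and `b32_decode` are hypothetical, chosen for this example only.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # values 0-31

def b32_decode(text: str) -> bytes:
    """Illustrative Base32 decoder (hypothetical helper, RFC 4648 alphabet)."""
    # Step 1: drop whitespace and trailing '=' padding.
    stripped = "".join(text.split()).rstrip("=")
    # Step 2: map each character back to its 5-bit value.
    try:
        bits = "".join(f"{ALPHABET.index(ch):05b}" for ch in stripped)
    except ValueError as exc:
        raise ValueError("character outside the Base32 alphabet") from exc
    # Steps 3-4: keep whole bytes only; the leftover bits are encoder padding.
    usable = len(bits) - len(bits) % 8
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))

print(b32_decode("NBSWY3DP").decode("utf-8"))  # hello
```

Because the lookup is case-sensitive, lowercase input is rejected here; a more lenient decoder could upper-case the text first.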

Examples

Example 1 — Encode the string “hello”
  • Bytes (UTF-8, hex): 68 65 6C 6C 6F
  • BASE32 output (RFC 4648): NBSWY3DP
Example 2 — Decode “NBSWY3DP”
  • BASE32 input: N B S W Y 3 D P
  • Decodes to bytes (hex): 68 65 6C 6C 6F
  • Text: “hello”
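To see where each character of NBSWY3DP comes from, the following sketch prints the 5-bit groups behind the “hello” example. The helper name `trace_groups` is invented for this illustration.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # values 0-31

def trace_groups(data: bytes) -> list[tuple[str, int, str]]:
    """Return (5-bit group, value, character) for each output character."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 5)  # zero-pad the final group if needed
    rows = []
    for i in range(0, len(bits), 5):
        group = bits[i:i + 5]
        value = int(group, 2)
        rows.append((group, value, ALPHABET[value]))
    return rows

for group, value, char in trace_groups(b"hello"):
    print(group, f"{value:2d}", char)
```

For “hello” the eight groups have the values 13, 1, 18, 22, 24, 27, 3, 15, which map to N, B, S, W, Y, 3, D, P. The input is exactly 40 bits, so no padding is needed.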
Example 3 — Command-line (Linux/macOS)
  • Encode a file:

      base32 input.bin > output.txt

  • Decode:

      base32 --decode output.txt > recovered.bin
Example 4 — Python (built-in library)

    import base64

    # Text must be converted to bytes before encoding
    data = "hello".encode("utf-8")
    encoded = base64.b32encode(data).decode("ascii")
    decoded = base64.b32decode(encoded).decode("utf-8")

    print(encoded)  # NBSWY3DP
    print(decoded)  # hello

Padding variants and URL-safe forms

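  • Standard RFC 4648 output uses ‘=’ padding to make the output length a multiple of 8.
  • Some implementations omit padding; a decoder should either accept unpadded input or restore the padding before decoding.
  • The standard alphabet is already URL-safe apart from ‘=’, so URL-friendly forms usually just omit padding. Other variants change the alphabet instead, such as base32hex (RFC 4648: 0–9 and A–V) and Crockford's Base32; confirm the expected alphabet with the system you’re interoperating with.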
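Python's `base64.b32decode` expects padded input, so unpadded strings need their ‘=’ restored first. The sketch below shows one way to handle both directions; the helper names `b32encode_unpadded` and `b32decode_unpadded` are hypothetical, not part of the standard library.

```python
import base64

def b32encode_unpadded(data: bytes) -> str:
    """Encode, then strip the trailing '=' padding."""
    return base64.b32encode(data).decode("ascii").rstrip("=")

def b32decode_unpadded(text: str) -> bytes:
    """Restore '=' so the length is a multiple of 8, then decode."""
    padded = text + "=" * (-len(text) % 8)
    return base64.b32decode(padded)

token = b32encode_unpadded(b"foo")
print(token)                      # MZXW6
print(b32decode_unpadded(token))  # b'foo'
```

`b32decode_unpadded` also accepts input that is already padded, since a length that is a multiple of 8 gets no extra ‘=’.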

Common pitfalls

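  • Confusing BASE32 with Base64: they use different alphabets and pack a different number of bits per character (5 vs. 6), so output from one will not decode correctly with the other.
  • Forgetting UTF-8 when converting text to bytes (can corrupt non-ASCII characters).
  • Handling padding inconsistently between encoder and decoder, for example when one side omits ‘=’ and the other requires it.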
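The first two pitfalls are easy to reproduce with Python's standard `base64` module. This short sketch shows that a BASE32 string fed to a Base64 decoder can decode without any error yet yield the wrong bytes, and that the choice of character encoding changes the encoded output. The variable names are illustrative only.

```python
import base64

data = "hello".encode("utf-8")
as_b32 = base64.b32encode(data).decode("ascii")  # NBSWY3DP
as_b64 = base64.b64encode(data).decode("ascii")  # aGVsbG8=

# Pitfall 1: every BASE32 character is also a valid Base64 character,
# so the wrong decoder raises no error here and returns different bytes.
wrong = base64.b64decode(as_b32)
print(wrong == data)  # False

# Pitfall 2: the same text under two character encodings gives
# different bytes, and therefore different BASE32 output.
utf8_form = base64.b32encode("é".encode("utf-8")).decode("ascii")
latin1_form = base64.b32encode("é".encode("latin-1")).decode("ascii")
print(utf8_form == latin1_form)  # False
```

Recording which scheme and which character encoding produced a string, alongside the string itself, avoids both problems.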

Quick reference (BASE32 alphabet)

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 2 3 4 5 6 7
