Encoding Primer

Computers think in bits, which are just 0s and 1s. Humans read text. Encoding is the bridge between those two worlds. It is what lets your avatar's 64-byte post-quantum public key become a string of 87 or 88 characters that you can share with the world, and that anyone, anywhere, can decode back into exactly the same bytes.

Every .avtr domain name, every signed engram, and every verification you earn traces back to keys and hashes encoded this way. Before we can talk about what those keys do, we need to understand how they are written.

The number of bits you group together determines how many distinct values you can represent.

1 bit   = 0 or 1                            (2 possibilities)
4 bits  = 0000 to 1111                      (16 possibilities)
6 bits  = 000000 to 111111                  (64 possibilities)
8 bits  = 1 byte = 00000000 to 11111111     (256 possibilities)

That last row is the one that matters most. Eight bits grouped together form a byte, and the byte is the standard unit that computers use to store and transmit data. Cryptographic keys and hashes are just long sequences of these bytes.
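As a quick check of the table above, here is a minimal Python sketch. The function name `distinct_values` and the three sample bytes are illustrative choices, not part of any Avatarnet API.

```python
def distinct_values(bits: int) -> int:
    """Number of different patterns that `bits` binary digits can form."""
    return 2 ** bits

for bits in (1, 4, 6, 8):
    print(f"{bits} bit(s): {distinct_values(bits)} possibilities")

# A bytes object is a sequence of 8-bit values, each between 0 and 255.
sample = bytes([0x7A, 0xB2, 0xC3])
print(list(sample))
```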

To display those bytes as readable text, whether in a URL, a config file, or on screen, we encode them into printable characters. Each encoding scheme does this by slicing the bytes into smaller bit groups and mapping each group to a character. The only real question is how many bits each character carries.

Base16 (Hexadecimal)

4 bits per character. 16 possible values.

Hexadecimal is the simplest encoding because each character maps to exactly 4 bits, which is half a byte, so two hex characters together always form one full byte.

Alphabet (16 characters):
0 1 2 3 4 5 6 7 8 9 a b c d e f

Each value maps to a specific 4-bit pattern:

Bits   Value   Char      Bits   Value   Char
0000   0       0         1000   8       8
0001   1       1         1001   9       9
0010   2       2         1010   10      a
0011   3       3         1011   11      b
0100   4       4         1100   12      c
0101   5       5         1101   13      d
0110   6       6         1110   14      e
0111   7       7         1111   15      f

How 1 byte becomes 2 hex characters:

Byte: 01111010  (decimal 122)
      ├──┤├──┤
       ↓    ↓
       7    a    →  "7a"

If you encoded a 64-byte public key in hex, it would become 128 characters, because every byte turns into two characters. Base16 is the simplest encoding to implement, since you just slice the bytes into groups of four bits, and it is fast and universally understood by every tool that touches binary data. The tradeoff is that hex doubles the length of whatever you encode, which becomes noticeable once keys and signatures grow large.
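The nibble slicing described above can be sketched in a few lines of Python. The names `HEX_ALPHABET` and `hex_encode` are hypothetical. In practice Python's built-in `bytes.hex()` does the same job, and the sketch compares itself against it.

```python
HEX_ALPHABET = "0123456789abcdef"

def hex_encode(data: bytes) -> str:
    """Encode bytes as hex by splitting each byte into two 4-bit halves."""
    chars = []
    for byte in data:
        high = byte >> 4      # top four bits
        low = byte & 0x0F     # bottom four bits
        chars.append(HEX_ALPHABET[high])
        chars.append(HEX_ALPHABET[low])
    return "".join(chars)

print(hex_encode(bytes([0b01111010])))   # the byte from the diagram above
key = bytes(range(64))                   # placeholder 64-byte value, not a real key
print(len(hex_encode(key)))              # two characters per byte
assert hex_encode(key) == key.hex()      # agrees with the built-in
```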

Base64

6 bits per character. 64 possible values.

Base64 packs 50 percent more information into each character than hex by using 6-bit groups instead of 4-bit groups, drawing from all uppercase and lowercase letters, all ten digits, plus + and / to reach exactly 64.

Alphabet (64 characters):
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z   (26)
a b c d e f g h i j k l m n o p q r s t u v w x y z   (26)
0 1 2 3 4 5 6 7 8 9                                   (10)
+ /                                                   (2)
                                               Total: (64)

Each value maps to a specific 6-bit pattern:

Bits     Value   Char      Bits     Value   Char
000000   0       A         100000   32      g
000001   1       B         100001   33      h
000010   2       C         100010   34      i
000011   3       D         100011   35      j
000100   4       E         100100   36      k
000101   5       F         100101   37      l
000110   6       G         100110   38      m
000111   7       H         100111   39      n
001000   8       I         101000   40      o
001001   9       J         101001   41      p
001010   10      K         101010   42      q
001011   11      L         101011   43      r
001100   12      M         101100   44      s
001101   13      N         101101   45      t
001110   14      O         101110   46      u
001111   15      P         101111   47      v
010000   16      Q         110000   48      w
010001   17      R         110001   49      x
010010   18      S         110010   50      y
010011   19      T         110011   51      z
010100   20      U         110100   52      0
010101   21      V         110101   53      1
010110   22      W         110110   54      2
010111   23      X         110111   55      3
011000   24      Y         111000   56      4
011001   25      Z         111001   57      5
011010   26      a         111010   58      6
011011   27      b         111011   59      7
011100   28      c         111100   60      8
011101   29      d         111101   61      9
011110   30      e         111110   62      +
011111   31      f         111111   63      /

How 3 bytes become 4 base64 characters:

3 bytes = 24 bits
24 bits ÷ 6 bits per char = exactly 4 characters

Bytes:  01001101  01100001  01101110
        ├────┤├──────┤├──────┤├────┤
        010011 010110  000101 101110
          ↓       ↓       ↓      ↓
          T       W       F      u     →  "TWFu"
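The same slicing can be written out by hand. This is a minimal sketch that handles exactly one 3-byte group; `B64_ALPHABET` and `encode_triplet` are illustrative names. Real code would normally call the standard library's `base64.b64encode`, which the sketch uses as a cross-check.

```python
import base64

B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def encode_triplet(three_bytes: bytes) -> str:
    """Turn exactly 3 bytes (24 bits) into 4 base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly 3 bytes")
    n = int.from_bytes(three_bytes, "big")   # one 24-bit integer
    # Take the four 6-bit groups from left to right.
    return "".join(B64_ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0))

group = bytes([0b01001101, 0b01100001, 0b01101110])   # the bytes from the diagram
print(encode_triplet(group))
assert encode_triplet(group) == base64.b64encode(group).decode()
```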

If you encoded the same 64-byte key in base64, it would become 88 characters once padding is added. That is roughly 33 percent overhead (37.5 percent here, because of the padding) compared to hex's 100 percent, which is why base64 became the standard for embedding binary data in HTTP headers, JSON, and email attachments. It does have two rough edges that matter for identity systems. The + and / characters break when dropped into URLs without escaping, and some characters are easy to confuse in many fonts: uppercase I with lowercase l, and uppercase O with the digit 0. That makes base64 risky when a human has to read a key aloud or copy it by hand.
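The padded length and the URL problem can both be checked with Python's standard `base64` module. This is a small sketch: `padded_base64_length` is a hypothetical helper, the all-placeholder key is not a real key, and the two-byte value is chosen only because it happens to produce both + and /.

```python
import base64
import math

def padded_base64_length(n_bytes: int) -> int:
    """Every started group of 3 bytes costs 4 characters, padding included."""
    return 4 * math.ceil(n_bytes / 3)

key = bytes(range(64))                        # placeholder 64-byte value
standard = base64.b64encode(key).decode()
print(len(standard), padded_base64_length(64), standard[-2:])

# The URL-safe variant swaps + and / for - and _ so nothing needs escaping.
tricky = b"\xfb\xff"
print(base64.b64encode(tricky).decode())
print(base64.urlsafe_b64encode(tricky).decode())
```

The URL-safe alphabet fixes the escaping problem but not the look-alike characters, which is the gap Base58 addresses.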

Base58

Around 5.86 bits per character. 58 possible values.

Base58 was introduced with Bitcoin to solve exactly the readability problem base64 suffers from. The idea is straightforward: start with base64's 64 characters and remove the 6 that cause the most trouble.

Base64 alphabet:  A-Z  a-z  0-9  + /      (64 characters)

Remove:  0  (looks like O)
         O  (looks like 0)
         I  (looks like l)
         l  (looks like I)
         +  (breaks URLs)
         /  (breaks URLs)

64 - 6 = 58 characters remain

The same grid with the removed characters taken out:

Alphabet (58 characters remaining):
A B C D E F G H J K L M N P Q R S T U V W X Y Z       (24, removed I and O)
a b c d e f g h i j k m n o p q r s t u v w x y z     (25, removed l)
1 2 3 4 5 6 7 8 9                                     (9, removed 0)
                                                      (0, removed + and /)
                                               Total: (58)

The remaining 58 characters are assigned values 0 through 57:

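Value  Char     Value  Char     Value  Char     Value  Char
0      1        15     G        30     X        45     n
1      2        16     H        31     Y        46     o
2      3        17     J        32     Z        47     p
3      4        18     K        33     a        48     q
4      5        19     L        34     b        49     r
5      6        20     M        35     c        50     s
6      7        21     N        36     d        51     t
7      8        22     P        37     e        52     u
8      9        23     Q        38     f        53     v
9      A        24     R        39     g        54     w
10     B        25     S        40     h        55     x
11     C        26     T        41     i        56     y
12     D        27     U        42     j        57     z
13     E        28     V        43     k
14     F        29     W        44     m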

Note: I is skipped after H, O after N, l after k.
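The removal step and the value assignment can be reproduced in a short Python sketch. The variable names are illustrative. Sorting by character code happens to give the digits-then-uppercase-then-lowercase order used in the table above, which is also the ordering Bitcoin uses.

```python
BASE64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)
REMOVED = "0OIl+/"   # the six troublemakers

remaining = [ch for ch in BASE64_ALPHABET if ch not in REMOVED]

# Digits sort before uppercase, and uppercase before lowercase.
BASE58_ALPHABET = "".join(sorted(remaining))

print(len(BASE58_ALPHABET))
print(BASE58_ALPHABET)
print(BASE58_ALPHABET.index("Q"))   # value assigned to 'Q'
```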

Why 5.86 bits and no clean bit mapping

58 is NOT a power of 2.

2^5 = 32   (too few: 5 bits can only address 32 of the 58 characters)
2^6 = 64   (too many: 6 of the 64 bit patterns would have no character)

log₂(58) = 5.858 bits per character  (not a whole number)

Because there is no clean bit boundary, Base58 cannot slice bytes into fixed groups the way hex and base64 can. Instead it treats the entire input as one enormous number and divides by 58 repeatedly, collecting the remainders as it goes.

Input (32 bytes as one giant number):
  N = 0x7ab2c3d4e5f6...  (very large number)

Encoding loop (remainders shown are illustrative):
  N  ÷ 58 = q1, remainder 23 → alphabet[23] = 'Q'
  q1 ÷ 58 = q2, remainder  7 → alphabet[7]  = '8'
  q2 ÷ 58 = q3, remainder 41 → alphabet[41] = 'i'
  ...repeat until the quotient is 0, then reverse the collected characters

A 64-byte Avatarnet public key becomes 87 or 88 Base58 characters (88 for most keys), putting it on par with padded base64 for compactness while offering the payoff of readability: no visually ambiguous characters, no symbols that break URLs, and no embarrassing moments when someone misreads an identifier over the phone. The only real cost is that big-number division is slower than the simple bit slicing that hex and base64 rely on.
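The divide-and-collect loop can be sketched in Python, whose integers grow as large as needed. This is an unoptimized illustration, not Avatarnet's implementation. The names `b58encode` and `b58decode` are hypothetical, and writing each leading zero byte as '1' is an assumption borrowed from Bitcoin's convention.

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def b58encode(data: bytes) -> str:
    """Encode bytes by repeated division by 58."""
    n = int.from_bytes(data, "big")        # treat the whole input as one number
    chars = []
    while n > 0:
        n, remainder = divmod(n, 58)       # quotient carries on, remainder picks a char
        chars.append(B58_ALPHABET[remainder])
    # Assumption: Bitcoin's convention, each leading zero byte becomes '1'.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + "".join(reversed(chars))

def b58decode(text: str) -> bytes:
    """Reverse the process: multiply by 58 and add each character's value."""
    n = 0
    for ch in text:
        n = n * 58 + B58_ALPHABET.index(ch)   # ValueError on 0, O, I, l, + or /
    leading_ones = len(text) - len(text.lstrip("1"))
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return b"\x00" * leading_ones + body

# The sample key used in this page's comparison section.
key = bytes.fromhex(
    "7ab2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2"
    "c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b27ab2"
)
encoded = b58encode(key)
print(len(encoded), encoded)
assert b58decode(encoded) == key           # round trip returns the same bytes
```

Decoding rejects any character outside the alphabet, which is what lets a mistyped 0 or l be caught instead of silently changing the key.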

Comparison

The same 64-byte Avatarnet public key in three encodings:

Hex:    7ab2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2   (64)
        c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b27ab2   (64)
        └─────────────────────── 128 characters ───────────────────────┘

Base64: erLD1OX2p7jJ0OHyo7TF1uf4qbDB0uP0pbbH2OnwobLD1OX2p7jJ0OHyo7TF1uf4   (64)
        qbDB0uP0pbbH2OnwobJ6sg==                                           (24)
        └──────────────────────── 88 characters ───────────────────────┘

Base58: (88 characters from the alphabet 1-9, A-Z, a-z without 0, O, I, l;
        the exact string comes from the big-number division described above)
        └──────────────────────── 88 characters ───────────────────────┘

The pattern is clear: more bits per character means fewer characters for the same data.

Encoding   Bits/Char   64-byte key       Clean slicing?
Base16     4.00        128 chars         Yes (4-bit groups)
Base64     6.00        88 chars          Yes (6-bit groups)
Base58     5.86        87 or 88 chars    No (division math)

64 Bytes in Every Base

An Avatarnet public key and a SHA-512 content hash are both 64 bytes, which is 512 bits. The three encodings above are the ones Avatarnet actually uses, but they sit on a spectrum that includes every common base. The table below shows how the same 512 bits expand or compress depending on how many bits each character carries.

The math

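Base   Values/Char   Bits/Char   Formula                Chars for 64 bytes
2      2             1.00        512 ÷ 1 = 512          512 (binary)
8      8             3.00        512 ÷ 3 ≈ 170.7        171 (octal)
10     10            3.32        512 ÷ 3.32 ≈ 154.2     155 (decimal)
16     16            4.00        512 ÷ 4 = 128          128 (hex)
32     32            5.00        512 ÷ 5 = 102.4        103 (base32)
58     58            5.86        512 ÷ 5.86 ≈ 87.4      87 or 88 (base58)
64     64            6.00        512 ÷ 6 ≈ 85.3         88 (base64 + pad)

Fractional results round up to the next whole character. Base64 then pads its 86 characters up to 88, the next multiple of 4.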
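The character counts can be reproduced without any floating-point rounding by counting how many digits the largest 512-bit number needs in each base. This is a minimal sketch. `max_chars` is a hypothetical helper, and it reports the longest possible output before any padding, so base64 shows its unpadded length here.

```python
import math

BITS = 512   # 64 bytes

def max_chars(base: int, bits: int = BITS) -> int:
    """Digits needed to write the largest `bits`-bit number in `base`."""
    largest = (1 << bits) - 1
    count = 0
    while largest > 0:
        largest //= base
        count += 1
    return count

for base in (2, 8, 10, 16, 32, 58, 64):
    print(f"base {base:>2}: {math.log2(base):.2f} bits/char, "
          f"up to {max_chars(base)} chars")
```

Because this counts the worst case, a value with small leading bytes can come out shorter in bases that do not pad to a fixed width, which is why a Base58 key is sometimes 87 characters rather than 88.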

Why more bits per character means shorter output

512 bits to encode:

Base2:   ████████████████████████████████████████████████████  512 chars
Base8:   █████████████████                                     171 chars
Base10:  ████████████████                                      155 chars
Base16:  █████████████                                         128 chars
Base32:  ██████████                                            103 chars
Base58:  █████████                                             87 or 88 chars
Base64:  █████████                                             88 chars

← fewer characters = more efficient

Each step up in base squeezes more information into each character. Base58 and Base64 land at nearly the same length because 5.86 and 6.00 bits per character are close, but Base58 trades that last fraction of efficiency for the readability gains described above.

The tradeoff

More compact ──────────────────────────────────────► More readable
Base64/58          Base32          Base16            Base2
(~88 chars)        (103 chars)     (128 chars)       (512 chars)
hard to read       Tor .onion      easy to read      impractical
hard to debug      addresses       easy to debug     but obvious

For Avatarnet:

  • Hex (Base16) for keys and hashes in logs, config files, and debugging output, because every byte is exactly two characters and the mapping is trivial to read
  • Base58 for Peer IDs, following the libp2p and Bitcoin convention, because humans may need to read, copy, or compare them
  • Base64 for signatures in wire formats and JSON transport, because compactness matters when a single signature is 49,856 bytes
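To see why compactness matters at that size, here is a rough Python sketch comparing hex and base64 lengths for a 49,856-byte signature. The all-zero placeholder stands in for a real signature, which this page does not define. Only the lengths are meaningful.

```python
import base64

SIGNATURE_BYTES = 49_856
signature = bytes(SIGNATURE_BYTES)          # placeholder: all-zero bytes

hex_len = len(signature.hex())              # 2 characters per byte
b64_len = len(base64.b64encode(signature))  # 4 characters per 3 bytes, padded

print(f"hex:    {hex_len:,} characters")
print(f"base64: {b64_len:,} characters")
print(f"saved:  {hex_len - b64_len:,} characters per signature")
```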

The keys and signatures these encodings carry are dramatically larger than what Bitcoin or Signal use, and that raises an obvious question: why accept the extra weight? The answer starts with quantum computers and the algorithms that survive them. That is the subject of the next page, Post-Quantum Cryptography.