Encoding

The Bitcoin Protocol uses several different encoding schemes to represent data to users. Each scheme is chosen based on the type of data, to optimise for human readability and conserve space. The larger an encoding's alphabet, the more information each character carries, so the same data yields a shorter output. However, some characters, such as zero and the uppercase letter O, are easily confused and make an encoding more error prone for human readers. Below is a list of the most commonly used encodings in Bitcoin:

Hexadecimal is an encoding scheme which uses 16 characters: the digits 0-9 and the letters A-F. Upper and lowercase letters are interchangeable. Hexadecimal values are sometimes written with a leading ‘0x’, although Bitcoin software usually displays raw hexadecimal without this prefix. Public keys, hashes, scripts, and transactions are usually encoded in hexadecimal format.
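As a small illustration, the sketch below converts between hexadecimal text and raw bytes using Python's built-in `bytes.fromhex` and `bytes.hex`. The helper name `hex_to_bytes` is hypothetical, and the input is an arbitrary example value rather than real Bitcoin data.

```python
def hex_to_bytes(text: str) -> bytes:
    """Decode hexadecimal text, tolerating an optional '0x' prefix."""
    text = text.strip()
    if text[:2].lower() == "0x":
        text = text[2:]
    # bytes.fromhex accepts uppercase and lowercase digits alike.
    return bytes.fromhex(text)


raw = hex_to_bytes("0xDEADBEEF")
print(raw.hex())                        # deadbeef
print(hex_to_bytes("deadbeef") == raw)  # True: case does not matter
```

Each byte always maps to exactly two hexadecimal characters, which is why a 32-byte hash is displayed as 64 characters.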

Base58 has an alphabet consisting of 58 characters, including upper and lowercase letters and the digits 1-9. Base58 excludes the zero, uppercase O, uppercase I, and lowercase l, to avoid reader confusion. Legacy Bitcoin addresses and private keys are represented in Base58.
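The following is a minimal sketch of plain Base58 encoding and decoding, assuming the alphabet ordering used by Bitcoin. The function names `b58encode` and `b58decode` are hypothetical. Addresses and private keys additionally carry a version byte and a checksum (a scheme known as Base58Check), which this sketch does not implement.

```python
# Bitcoin's Base58 alphabet: no 0, O, I, or l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as Base58 by repeated division by 58."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    # Each leading zero byte is represented by a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def b58decode(text: str) -> bytes:
    """Decode Base58 text; raises ValueError on characters outside the alphabet."""
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)
    pad = len(text) - len(text.lstrip("1"))
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return b"\x00" * pad + body


sample = bytes.fromhex("0000287fb4cd")
encoded = b58encode(sample)
print(encoded)                       # 11233QC4
print(b58decode(encoded) == sample)  # True
```

Because 58 is not a power of two, the data is treated as one large number rather than being split into fixed-size bit groups, which is why the encoder relies on integer division.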

Base64 has an alphabet consisting of 64 characters, including all upper and lowercase letters and the digits 0-9. The ‘+’ and ‘/’ characters complete the set of 64. Partially Signed Bitcoin Transactions (PSBTs) are encoded in Base64.
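As a rough illustration, the sketch below uses Python's standard `base64` module to check whether a string decodes to data that starts with the PSBT magic bytes (`psbt` followed by `0xff`). The helper name `looks_like_psbt` is hypothetical, and the check only inspects the prefix; it does not parse or validate the transaction itself.

```python
import base64

# A serialized PSBT begins with these five magic bytes.
PSBT_MAGIC = b"psbt\xff"


def looks_like_psbt(text: str) -> bool:
    """Return True if text is valid Base64 whose decoded bytes start with the PSBT magic."""
    try:
        raw = base64.b64decode(text, validate=True)
    except ValueError:
        # Raised for characters outside the Base64 alphabet or bad padding.
        return False
    return raw.startswith(PSBT_MAGIC)


encoded = base64.b64encode(PSBT_MAGIC + b"\x00").decode("ascii")
print(encoded)                      # cHNidP8A
print(looks_like_psbt(encoded))     # True
print(looks_like_psbt("aGVsbG8="))  # False: decodes to b"hello"
```

Since the magic bytes are fixed, Base64-encoded PSBTs can be recognised at a glance by their leading `cHNidP8` characters.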

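Bech32 uses an alphabet of 32 characters made up of lowercase letters and digits, excluding the number 1 and the letters ‘b’, ‘i’, and ‘o’. These exclusions apply to the data part only: the human-readable prefix (such as ‘bc’) and the ‘1’ that separates it from the data are not drawn from this alphabet. Bech32 is used to encode SegWit addresses and Lightning invoices, and includes a checksum for error detection.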
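The error detection can be sketched as follows, based on the checksum routine published in BIP 173. The wrapper `is_valid_bech32` is a hypothetical helper that omits some of the specification's checks (such as the overall length limit), and it does not cover the later Bech32m variant, which uses a different checksum constant.

```python
# The 32-character Bech32 data alphabet (no '1', 'b', 'i', or 'o').
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]


def bech32_polymod(values):
    """Compute the Bech32 checksum polynomial over a list of 5-bit values."""
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def bech32_hrp_expand(hrp):
    """Expand the human-readable part into values that feed the checksum."""
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def is_valid_bech32(text: str) -> bool:
    """Simplified check: consistent case, valid characters, valid checksum."""
    if text != text.lower() and text != text.upper():
        return False  # mixed case is not allowed
    text = text.lower()
    pos = text.rfind("1")  # the last '1' separates prefix and data
    if pos < 1 or pos + 7 > len(text):
        return False
    hrp, payload = text[:pos], text[pos + 1:]
    if any(ch not in CHARSET for ch in payload):
        return False
    data = [CHARSET.index(ch) for ch in payload]
    return bech32_polymod(bech32_hrp_expand(hrp) + data) == 1


good = "bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4"  # example address from BIP 173
print(is_valid_bech32(good))               # True
print(is_valid_bech32(good[:-1] + "5"))    # False: a single typo is detected
```

A wallet can run a check like this before sending funds, so that a mistyped address is rejected instead of silently accepted.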

QR codes are visual representations of data, recognizable as grids of black and white squares. QR codes are useful for sharing pieces of data that are too long to read aloud or type by hand. QR codes can be scanned and decoded by most mobile cameras. Lightning invoices and Bitcoin addresses are often displayed as QR codes.
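To make the earlier point about alphabet size concrete, the sketch below estimates how many characters each text encoding in this list needs to represent the same 32 bytes of data. The helper `chars_needed` is hypothetical and counts payload characters only; real encodings add prefixes, checksums, or padding, so actual strings are somewhat longer.

```python
def chars_needed(num_bytes: int, alphabet_size: int) -> int:
    """Smallest number of characters whose combinations cover every value of num_bytes bytes."""
    target = 256 ** num_bytes  # number of distinct byte strings of this length
    count = 0
    capacity = 1
    while capacity < target:
        capacity *= alphabet_size
        count += 1
    return count


for name, size in [("Hexadecimal", 16), ("Bech32", 32), ("Base58", 58), ("Base64", 64)]:
    print(f"{name:<12} {chars_needed(32, size)} characters")
# Hexadecimal  64 characters
# Bech32       52 characters
# Base58       44 characters
# Base64       43 characters
```

The output shrinks as the alphabet grows, while the encodings with smaller alphabets are the ones that avoid easily confused characters.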

Learn more about reading Bitcoin data.