Overview#
Base58Check is a modified Base58 encoding used for Bitcoin addresses.
More generically, Base58Check encoding is used for encoding byte arrays in Bitcoin into human-typable strings.
Background#
The original Bitcoin client source code explains the reasoning behind base58 encoding:
base58.h:
// Why base-58 instead of standard base-64 encoding?
// - Don't want 0OIl characters that look the same in some fonts and
// could be used to create visually identical looking account numbers.
// - A string with non-alphanumeric characters is not as easily accepted as an account number.
// - E-mail usually won't line-break if there's no punctuation to break at.
// - Doubleclicking selects the whole number as one word if it's all alphanumeric.
Features of Base58Check#
Base58Check has the following features:
- An arbitrarily sized payload.
- A set of 58 alphanumeric symbols consisting of digits and easily distinguished uppercase and lowercase letters (0, O, I, and l are not used).
- One byte of version/application information. Bitcoin addresses use 0x00 for this byte (future ones may use 0x05).
- Four bytes (32 bits) of SHA256-based error checking code. This code can be used to automatically detect and possibly correct typographical errors.
- An extra step for preservation of leading zeroes in the data.
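As a rough illustration of the symbol set and the error checking code listed above, the following Python sketch derives the 58 symbols by dropping 0, O, I, and l from the ASCII alphanumerics, and computes a four-byte check code. The names `ALPHABET` and `checksum` are made up for this example, and the double SHA256 is taken from the creation steps on this page; treat it as a sketch under those assumptions rather than reference code.

```python
import hashlib
import string

# Start from all 62 ASCII alphanumerics and drop the four look-alike
# characters. The resulting ordering (digits, uppercase, lowercase) is
# assumed to match the alphabet used by Bitcoin.
candidates = string.digits + string.ascii_uppercase + string.ascii_lowercase
ALPHABET = "".join(c for c in candidates if c not in "0OIl")


def checksum(data: bytes) -> bytes:
    """Return the first four bytes of SHA256(SHA256(data))."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]


if __name__ == "__main__":
    print(len(ALPHABET))                       # number of symbols
    print(checksum(b"\x00" + bytes(20)).hex()) # 4-byte check code, in hex
```

Because the check code depends on every byte of the input, changing any byte of the version or payload will, with very high probability, produce a different code.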
Creating a Base58Check string#
A Base58Check string is created from a version/application byte and payload as follows.
1. Take the version byte and payload bytes, and concatenate them (bytewise).
2. Take the first four bytes of SHA256(SHA256(result of step 1)).
3. Concatenate the result of step 1 and the result of step 2 (bytewise).
4. Treating the result of step 3, a series of bytes, as a single big-endian bignumber, convert it to base-58 using normal mathematical steps (bignumber division) and the base-58 alphabet described below. The result should be normalized so that it has no leading base-58 zeroes (character '1').
5. The character '1', which has a value of zero in base-58, carries no numeric value in a leading position, so a leading '1' is reserved to represent an entire leading zero byte. There can be one or more leading '1's when necessary to represent one or more leading zero bytes. Count the leading zero bytes in the result of step 3 (for old Bitcoin addresses, there will always be at least one, for the version/application byte; for new addresses, there will never be any). Each leading zero byte shall be represented by its own character '1' in the final result.
6. Concatenate the '1's from step 5 with the result of step 4. This is the Base58Check result.
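The creation steps can be sketched in Python roughly as follows. The function names `base58check_encode` and `base58check_decode` are hypothetical, the alphabet string is assumed to be the standard Bitcoin base-58 alphabet, and the decoder is not described on this page; it is included only to show how the check code lets a reader of the string detect a typo. This is a minimal sketch, not the original client's implementation.

```python
import hashlib
from typing import Tuple

# Assumed standard Bitcoin base-58 alphabet (no 0, O, I, or l).
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def checksum(data: bytes) -> bytes:
    """Step 2: first four bytes of SHA256(SHA256(data))."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]


def base58check_encode(version: int, payload: bytes) -> str:
    step1 = bytes([version]) + payload          # step 1: version + payload
    step3 = step1 + checksum(step1)             # steps 2 and 3: append check code
    number = int.from_bytes(step3, "big")       # step 4: big-endian bignumber
    digits = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        digits = ALPHABET[remainder] + digits   # most significant digit first
    # Step 5: one '1' per leading zero byte of the step 3 result.
    leading_zeros = len(step3) - len(step3.lstrip(b"\x00"))
    return "1" * leading_zeros + digits         # step 6


def base58check_decode(text: str) -> Tuple[int, bytes]:
    """Illustrative inverse: returns (version, payload) or raises ValueError."""
    number = 0
    for ch in text:
        number = number * 58 + ALPHABET.index(ch)  # ValueError on a bad symbol
    leading_ones = len(text) - len(text.lstrip("1"))
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    raw = b"\x00" * leading_ones + body
    if len(raw) < 5 or checksum(raw[:-4]) != raw[-4:]:
        raise ValueError("Base58Check checksum mismatch")
    return raw[0], raw[1:-4]


if __name__ == "__main__":
    encoded = base58check_encode(0x00, bytes(20))  # version 0x00, 20 zero bytes
    print(encoded)
    print(base58check_decode(encoded))
```

Python's arbitrary-precision integers stand in for the bignumber division of step 4. Since the version byte and all twenty payload bytes in the usage example are zero, the encoded string begins with twenty-one '1' characters, one for each leading zero byte.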