Prefix-free codes and ordinals

Consider the problem of representing a number in computer memory, which is idealized as a sequence of zeros and ones. The binary number system is a well-known solution to this problem — for example, the sequence “01101” represents 0 \cdot 2^4 + 1 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 = 11. But there’s a problem: You don’t expect the entire computer to just be used to represent one number; you expect it to have other things stored afterwards. So how do you tell where the number ends? If the sequence begins 01101001\dots does this represent the number 011_2 or 01101_2 or 0110100_2?

The solution to this problem most commonly used in practice is to declare in advance a fixed number of bits that will be used to represent the number, usually 32 bits or 64 bits. For example, if we fix a 5-bit representation the 01101001\dots always means that the number is 01101_2 = 11 and 001\dots represents other stuff in memory. This works well enough in practice, but it has a problem: The number of bits you set aside for storing the number forces an upper limit on how big the number can be. For example, you cannot store any number bigger than 2^64 in 64 bits. If the computer ever needs to store a bigger number than can be represented with space set aside then the computer fails.

I’ll introduce some terminology. The condition that a system of representing numbers is unambiguous can be phrased formally by saying the this method is a prefix-free code. A prefix-free code consists of a set of codewords, which are sequences of bits, such that no codeword is a prefix of another. A continuing stream of bits can be interpreted as a codeword by taking an initial segment that is a codeword, and by prefix-free property at most one such interpretation is possible. Since we want this code to represent numbers we also want a map from the codewords to the natural numbers (by which I mean including zero, naturally)

The fixed-width codes I mentioned earlier are examples of prefix-free codes whose code words map to a finite subset of the natural numbers. The simplest prefix-free code that can represent every natural number is the unary number system: A number is represented by that number of repeating ones followed by a zero. So “0” is 0, “10” is 1, “110” is 2, et cetera.

This system is extremely inefficient: It takes n+1 bits to represent the number n. In contrast, the simple (non-prefix-free) binary representation takes \log_2 n bits. A more efficient system is what I’ll call the level 1 binary system: A number is represented by a number k in unary concatenated by a k-bit representation of the number. This takes 2 \log_2 n bits to represent the number n. That’s much more efficient than the unary system, but is still a factor of two less efficient than the simple binary system. A more efficient system is the level 2 binary system: First represent a number k_0 in unary, then a k_0-bit number k_1, and then the number n is represented as a k_1-bit number.

Asymptotically this should be efficient enough for any purpose, but the way it is constructed naturally inspires the generalization to a level k binary system for all k. The level 3 system shaves off \log \log n bits, the level 4 system another \log \log \log n bits, and so on. After all of these comes the level \omega binary system: A number k is given in unary, followed by a representation of a number in the level k binary system. Here’s a particularly elegant reformulation of this system: “0” is a codeword representing 0. If x is a codeword representing n and y is a n-bit string, then 1 x y is a codeword representing the number which has 1y as its binary representation. Within this formulation the system has the property that every number has a unique code for it. This trick where the bit-string y represents the binary number 1y can also be applied to the level k binary systems for finite k, which results in a modified system where number has a unique code.

All of the prefix-free codes I described have the following additional property: If we restrict only to codewords that represent a number in the shortest possible way than one codeword is lexicographically less than another codeword if and only if the corresponding number is smaller. It follows that for any bit-string a, the set of numbers whose shortest codeword starts with a is either an interval or a ray. For example, in the unary system the string 1^n of n consecutive ones is a prefix for all numbers in the ray of numbers greater or equal to n. In the level 1 binary system, the prefix 1^9 0110 is used to represent any 9-bit number whose first three bits are 110, i.e., numbers in the range from 1.10_2 \times 2^8 to 1.11_2 \times 2^8 - 1. In other words, the prefix 1^9 0110 gives a scientific-notation-like approximation for the corresponding number as 1.10_2 \times 2^8.

Similarly, in the level $\omega$ binary system, if the sequence x denotes the number n and b_0, \dots, b_{k-1} are bits, then the prefix 1 x b_0 \dots b_{k-1} begins the representation of any number that is represented as (1.b_0 \dots b_{k-1})_2 \times 2^n in binary scientific notation. What about the prefix 11x b_0 \dots b_{k-1}? If the representation of a number m begins with this sequence, then the fragment 1xb_0 \dots b_{k-1} must denote a truncation of a representation of the number k of bits of m. Then k is approximately (1.b_0 \dots b_{k-1})_2 \times 2^n, and so m is approximately 2^{1.b_0 \dots b_{k-1} \times 2^n}. Similarly the prefix 1 \dots 11x b_0 \dots b_{k-1} represents a number approximately equal to 2^{\cdot^{\cdot^{\cdot^{2^{ 1.b_0 \dots b_{k-1} \times 2^n}}}}}. Every bit-sequence that is a prefix of a valid level \omega codeword takes one of these forms.

In conclusion, truncating the level \omega binary system representation of a number provides information on the approximate value of the number in a manner generalizing scientific notation. Moreover, this information remains efficient and useful as long as the number is no longer than a small tetration. In contrast, plain scientific notation ceases to be useful for numbers larger than an exponentiall whose exponent is too large to be written in full.

The suggestive terms “level k” and “level \omega” should inspire you to wonder whether there are level \alpha systems for larger ordinals \alpha. For example, the level \omega+1 system would represent a number using a unary representation of a number k, followed by a k-bit number $latex $n$, followed by the number represented in the level n binary number system. Truncations of such representations could describe the approximate value of a number like truncations of the level \omega system except with bigger ordinals there will be an even larger range that these truncations can usefully describe. Hopefully this can combine the power of ordinal-based function hierarchies for describing large numbers with the precision of scientific notation. Indeed such a system can be defined. I will describe a particularly elegant formulation:

Let \alpha be a countable ordinal. Let X = \{\beta \leq \alpha| \beta \text{ is a limit ordinal}\}. Define a cofinal sequence system on \alpha to be a sequence (p_n)_{n \in \omega} of functions p_n : X \to \alpha such that \beta = \sup_n p_n (\beta) for all \beta \in X. The name for this concept is my own invention; I’m sure this concept has been studied before but I have not bothered to look up the standard terminology. Given such a system define p : \alpha \setminus \{0\} \to \alpha by p (\beta) = \beta' if \beta = \beta' + 1 is a successor ordinal and p (\beta) = p_0 (\beta) if \beta is a limit ordinal. Define a partial order \lesssim by \alpha \lesssim \beta is \alpha can be obtained from \beta by iterating p a finite number of times. Then define the cofinal sequence system (p_n)_{n \in \omega} to be progressive if p_n (\beta) \lesssim p_{n+1} (\beta) for all n \in omega, \beta \in X.

Let (p_n)_{n \in \omega} be a progressive cofinal sequence system, with associated order \lesssim. A function f : \alpha+1 \to \omega is said to have good support if \mathrm{supp} f := \{x \leq \alpha | f (x) \neq 0\} is finite and linearly ordered by \lesssim. For example, the Kronecker delta function \delta_\beta (\beta') = 1 if \beta = \beta' and =0 if \beta \neq \beta' has good support. For each function f with good support I will define an associated prefix-free code.

To define this prefix-free code, first decompose f as f = g + n \delta_\beta where \beta is the largest element of \mathrm{supp} f and n = f (\beta) (so that every element of \mathrm{supp} g is strictly \lesssim-less than \beta). If f = 0 set \beta = n = g = 0. The definition of the prefix-free code associated to f depends on the cofinality of \beta:

  • If \beta = 0 (and so g = 0) the empty string e is the only valid codeword and is associated to the number n.
  • If \beta = \beta' + 1 is a successor ordinal then the codewords are bx where b is a bit and x is a codeword for the prefix-free code of g + (2n-1+b) \delta_{\beta'} with the same associated number.
  • If \beta is a limit ordinal then the codewords of f are the codewords of g+\sum_{k<n} \delta_{p_k (\beta)}, with the same associated numbers.

These codes satisfy the following properties:

  • The set of numbers that can be represented in each code is a closed interval.
  • Each number is represented by at most one codeword in each code.
  • Each code maps the lexicographic ordering of the codewords into the standard order of the corresponding numbers.
  • If f \not\equiv 0 has good support and \beta is the largest element of \mathrm{supp} f then the smallest element that can be represented with the code associated to f is one greater than the largest number that can be represented with the code associated to f - \delta_{\beta}.

 

These properties can be proven by induction on the largest element of \mathrm{supp} f.

Finally define the level \alpha prefix-free code to have codewords 1^n 0 x where x is a codeword for n \delta_\alpha. Every number can be represented uniquely by this code, and lexicographically greater codewords map to bigger numbers.

In particular, take \alpha = \omega^2. Define a progressive cofinal sequence system on \omega^2 by p_n ((m+1) \omega) = m \omega + n, p_n (\omega^2) = n \omega. Then the level \omega^2 prefix-free code is exactly identical to the level \omega binary number system I defined earlier.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s