A string with an arbitrary length (
Pascal string) takes
Z + log(
Z) space, and is thus succinct. If there is a maximum length – which is the case in practice, since 232 = 4 GiB of data is a very long string, and 264 = 16 EiB of data is larger than any string in practice – then a string with a length is also implicit, taking
Z +
k space, where
k is the number of data to represent the maximum length (e.g., 64 bits). When a sequence of variable-length items (such as strings) needs to be encoded, there are various possibilities. A direct approach is to store a length and an item in each record – these can then be placed one after another. This allows efficient next, but not finding the
kth item. An alternative is to place the items in order with a delimiter (e.g.,
null-terminated string). This uses a delimiter instead of a length, and is substantially slower, since the entire sequence must be scanned for delimiters. Both of these are space-efficient. An alternative approach is out-of-band separation: the items can simply be placed one after another, with no delimiters. Item bounds can then be stored as a sequence of length, or better, offsets into this sequence. Alternatively, a separate binary string consisting of 1s in the positions where an item begins, and 0s everywhere else is encoded along with it. Given this string, the select function can quickly determine where each item begins, given its index. This is
compact but not
succinct, as it takes 2
Z space, which is O(
Z). Another example is the representation of a
binary tree: an arbitrary binary tree on n nodes can be represented in 2n + o(n) bits while supporting a variety of operations on any node, which includes finding its parent, its left and right child, and returning the size of its subtree, each in constant time. The number of different binary trees on n nodes is {\tbinom{2n}{n}}/(n+1). For large n, this is about 4^n; thus we need at least about \log_2(4^n)=2n bits to encode it. A succinct binary tree therefore would occupy only 2 bits per node. ==See also==