A formal description of Geohash, for computational and mathematical purposes.
=== Textual representation ===
For exact latitude and longitude translations, Geohash is a spatial index of base 4, because it transforms the continuous latitude and longitude coordinate space into a hierarchical discrete grid by recursively partitioning the space into four. To keep the code compact, it is written in base 32, with values represented by the following alphabet, which is the "standard textual representation".

The "Geohash alphabet" (32ghs) uses all digits 0-9 and all lowercase letters except "a", "i", "l" and "o". In order of value, from 0 to 31, its symbols are 0123456789bcdefghjkmnpqrstuvwxyz.

For example, using the table above and the constant B = 32, the Geohash ezs42 can be converted to a decimal representation by ordinary positional notation:
: [ezs42]_{32ghs} = [(e \times B^4) + (z \times B^3) + (s \times B^2) + (4 \times B^1) + (2 \times B^0)]_{32ghs}
: = [e]_{32ghs} \times B^4 + [z]_{32ghs} \times B^3 + [s]_{32ghs} \times B^2 + [4]_{32ghs} \times B^1 + [2]_{32ghs} \times B^0
: = [13]_{10} \times B^4 + [31]_{10} \times B^3 + [24]_{10} \times B^2 + [4]_{10} \times B^1 + [2]_{10} \times B^0
: = 13 \times 1048576 + 31 \times 32768 + 24 \times 1024 + 4 \times 32 + 2 \times 1
: = 13631488 + 1015808 + 24576 + 128 + 2
: = 14672002
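The positional-notation conversion above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `GEOHASH_ALPHABET` and `geohash_to_int` are hypothetical and chosen only for this example.

```python
# The Geohash alphabet in order of value (0 to 31):
# digits 0-9, then lowercase letters except a, i, l and o.
GEOHASH_ALPHABET = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_to_int(code: str) -> int:
    """Interpret a Geohash string as a base-32 number (hypothetical helper)."""
    value = 0
    for ch in code:
        # str.index raises ValueError for characters outside the alphabet.
        value = value * 32 + GEOHASH_ALPHABET.index(ch)
    return value


print(geohash_to_int("ezs42"))  # 14672002
```

Accumulating `value * 32 + digit` from left to right is equivalent to summing each digit times the matching power of B, as in the derivation above.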
=== Geometrical representation ===
The geometry of the Geohash has a mixed spatial representation:
• Geohashes with 2, 4, 6, ... e digits (an even number of digits) are represented by a Z-order curve in a "regular grid", where the decoded pair (latitude, longitude) has uniform uncertainty and is valid as a Geo URI.
• Geohashes with 1, 3, 5, ... d digits (an odd number of digits) are represented by an "И-order curve". The latitude and longitude of the decoded pair have different uncertainties (longitude is truncated).

It is possible to build the "И-order curve" from the Z-order curve by merging neighboring cells and indexing the resulting rectangular grid by the function j = \left\lfloor\frac{i}{2}\right\rfloor. The illustration shows how to obtain the grid of 32 rectangular cells from the grid of 64 square cells.

The most important property of Geohash for humans is that it preserves spatial hierarchy in the code prefixes. For example, in the "1 Geohash digit grid" illustration of 32 rectangles, above, the spatial region of the code e (the rectangle with the greyish blue circle at position 4,3) is preserved with prefix e in the "2 digit grid" of 1024 rectangles (scale showing em and greyish green to blue circles at grid).
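The cell counts quoted above (32 rectangles for one digit, 1024 for two) follow from the bit budget of the code. The sketch below assumes the standard Geohash layout of 5 bits per digit, with bits alternating between longitude and latitude and longitude taking the first bit. The function names `grid_shape` and `cell_size_degrees` are hypothetical.

```python
def grid_shape(digits: int) -> tuple[int, int]:
    """Return (columns, rows) of the grid for a Geohash of the given length."""
    total_bits = 5 * digits           # each base-32 digit carries 5 bits
    lon_bits = (total_bits + 1) // 2  # longitude takes the first bit, so it gets the extra one
    lat_bits = total_bits // 2
    return 2 ** lon_bits, 2 ** lat_bits


def cell_size_degrees(digits: int) -> tuple[float, float]:
    """Return (width, height) of one cell in degrees of longitude and latitude."""
    columns, rows = grid_shape(digits)
    return 360.0 / columns, 180.0 / rows


print(grid_shape(1))         # (8, 4)   -> 32 cells
print(grid_shape(2))         # (32, 32) -> 1024 cells
print(cell_size_degrees(2))  # (11.25, 5.625)
```

With an even number of digits both coordinates receive the same number of bits. With an odd number, longitude receives one more bit than latitude. Each additional digit splits every cell into 32 sub-cells, which is why a cell's code is a prefix of the codes of all cells inside it.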
=== Algorithm and example ===
Using the hash ezs42 as an example, here is how it is decoded into a decimal latitude and longitude. The first step is decoding it from textual "base 32ghs", as shown above, to obtain the binary representation:
: [e]_{32ghs} = [13]_{10} = [01101]_2
: [z]_{32ghs} = [31]_{10} = [11111]_2
: [s]_{32ghs} = [24]_{10} = [11000]_2
: [4]_{32ghs} = [4]_{10} = [00100]_2
: [2]_{32ghs} = [2]_{10} = [00010]_2

This operation results in the bits 01101 11111 11000 00100 00010. Counting from the left, with the first bit at position 0, the bits in the even positions form the longitude code (0111110000000), while the bits in the odd positions form the latitude code (101111001001).

Each binary code is then used in a series of divisions, considering one bit at a time, again from left to right. For the latitude value, the interval −90 to +90 is divided by 2, producing two intervals: −90 to 0, and 0 to +90. Since the first bit is 1, the higher interval is chosen, and becomes the current interval. The procedure is repeated for all bits in the code. Finally, the latitude value is the center of the resulting interval. Longitudes are processed in an equivalent way, keeping in mind that the initial interval is −180 to +180.

For example, in the latitude code 101111001001, the first bit is 1, so we know our latitude is somewhere between 0 and 90. Without any more bits, we would guess the latitude was 45, giving us an error of ±45. Since more bits are available, we can continue with the next bit, and each subsequent bit halves this error.

This table shows the effect of each bit. At each stage, the relevant half of the range is highlighted in green; a low bit selects the lower range, a high bit selects the upper range. The column "mean value" shows the latitude, simply the mean value of the range. Each subsequent bit makes this value more precise.

(The numbers in the above table have been rounded to 3 decimal places for clarity.)

Final rounding should be done carefully, in a way that
: \min \le \mathrm{round}(value) \le \max
So while rounding 42.605 to 42.61 or 42.6 is correct, rounding to 43 is not.
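The decoding procedure described above can be sketched in Python as follows. This is a minimal illustration under the stated rules (5 bits per character, even positions for longitude, odd positions for latitude, repeated interval halving). The function names `geohash_bits`, `refine` and `decode` are hypothetical, and the sketch assumes a non-empty, valid Geohash.

```python
GEOHASH_ALPHABET = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_bits(code):
    """Return the bit string of a Geohash, 5 bits per character."""
    return "".join(format(GEOHASH_ALPHABET.index(ch), "05b") for ch in code)


def refine(bits, low, high):
    """Halve [low, high] once per bit and return every intermediate step."""
    steps = []
    for bit in bits:
        mid = (low + high) / 2
        if bit == "1":
            low = mid    # a high bit selects the upper half
        else:
            high = mid   # a low bit selects the lower half
        steps.append((bit, low, high))
    return steps


def decode(code):
    """Return (lat, lon, lat_error, lon_error); assumes a non-empty code."""
    bits = geohash_bits(code)
    lon_bits = bits[0::2]  # even positions, counting from 0
    lat_bits = bits[1::2]  # odd positions
    _, lat_lo, lat_hi = refine(lat_bits, -90.0, 90.0)[-1]
    _, lon_lo, lon_hi = refine(lon_bits, -180.0, 180.0)[-1]
    return ((lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2,
            (lat_hi - lat_lo) / 2, (lon_hi - lon_lo) / 2)


bits = geohash_bits("ezs42")
print(bits[0::2], bits[1::2])  # 0111110000000 101111001001

# One row per latitude bit: bit, min, max, mean value, maximum error.
for bit, low, high in refine(bits[1::2], -90.0, 90.0):
    print(bit, low, high, (low + high) / 2, (high - low) / 2)

lat, lon, lat_err, lon_err = decode("ezs42")
print(round(lat, 3), round(lon, 3))  # 42.605 -5.603
```

The loop prints the successive latitude intervals, starting with the range 0 to 90 (mean value 45, error ±45) for the first bit. The returned error values give the half-width of the final interval, which is the bound that any rounded result must stay within.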
=== Digits and precision in km ===
== Limitations when used for deciding proximity ==