This project provides wordlists for conversion between numbers and pronounceable words, and scripts with sample implementations of the necessary methods.

The wordlists were created with several principles in mind:

- The words should be as distinct as possible from each other; in particular, they should not be homophones and should not have identical or very similar beginnings or endings.
- They do not contain any words from the ICAO/ITU spelling alphabet: alfa, bravo, charlie, delta, echo, ..., x-ray, yankee, zulu.
- They do not begin with any of the letters `x`, `y`, or `z`, allowing this to be an additional check or flag for special purposes.
- Any first four letters shall occur only once in all lists.
- Similar-sounding words are grouped on one line and should be treated as equivalent.

In addition, the encoding of geographical coordinates is presented as an application of such wordlists.

Each wordlist file consists of:

- a header line beginning with '# ' and containing the number of following lines (referred to as "MAX" further down), optionally followed by a space and a comment
- word lines of the format `NUMBER WORD [WORD ..]`

The NUMBERs must be consecutive integers starting with 0, i.e. the last one must be MAX-1.

This permits a program using these files to convert between words and integers in the range 0..(MAX-1) or alternatively in the range 1..MAX. (In the latter case, the indices of course must be increased by 1.)

A program using these lists shall convert an integer (first column) to the first word (second column) of the equivalent words of the corresponding line, but should accept any of the words of this line when converting words to integers.

```
# 1001 lines of four-letter words
0 able abel
1 ache
2 acid
3 acre
4 aeon
...
```
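The file format and the conversion rule above can be sketched in a few lines of Python. This is only an illustration, not the project's implementation: the function name `parse_wordlist` is hypothetical, and the sample text is the excerpt above cut down to five lines, with the header count adjusted to match.

```python
# Truncated sample in the wordlist format (header count adjusted to 5).
SAMPLE = """\
# 5 lines of four-letter words
0 able abel
1 ache
2 acid
3 acre
4 aeon
"""

def parse_wordlist(text):
    """Return (number -> first word, any listed word -> number)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0]
    if not header.startswith("# "):
        raise ValueError("missing '# ' header line")
    max_count = int(header.split()[1])  # MAX: number of word lines
    to_word, to_number = {}, {}
    for expected, line in enumerate(lines[1:]):
        number, *words = line.split()
        if int(number) != expected or not words:
            raise ValueError(f"bad line: {line!r}")
        to_word[expected] = words[0]    # integers map to the first word only
        for word in words:              # every equivalent word is accepted
            to_number[word] = expected
    if len(to_word) != max_count:
        raise ValueError("header count does not match number of lines")
    return to_word, to_number

to_word, to_number = parse_wordlist(SAMPLE)
print(to_word[0], to_number["abel"])  # able 0
```

Here `abel` is accepted on input but never produced on output, as required for equivalent words on one line.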

This repository contains the following files:

- `Dict/four` is a list of (mostly decent...) four-letter words
- `Dict/five` is a list of five-and-more-letter words, which do not begin with the same letters as those from `four`
- `Dict/fourplus` is the combination of `four` and `five`, merged and sorted
- `numwords.sh` is a shell/dc-script for conversion of fractional numbers
- `coconv.sh` is a shell/dc-script for coordinate conversion, which uses `numwords.sh`

The scripts are explained further below.

The lists `four` and `five` can be used in several ways:

1. only one of them;
2. both alternately;
3. combined as `fourplus`.

If the shortest possible words should be used, case 1 with only `four` might be best suited.

The combination (case 3) is useful when the number of words must be as small as possible.

Alternating between both lists is safest from the viewpoint of word recognition, as the lists are mutually exclusive and provide a small implicit verification check. In addition, the change in word length might make the sequence more comfortable to pronounce and memorize.

If a defined number of digits (or long integers) should be encoded, it might be best to use only the `four` list, possibly pruned to exactly 1000 or 100 entries. If chunks of two digits are to be encoded (00..99), an additional verification could be implemented by selecting the third digit based on a checksum; this would effectively reduce the number of possible words by a factor of ten.
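As a rough illustration of the check-digit idea, the sketch below maps a two-digit chunk to an index into a list pruned to 1000 entries. The checksum itself is not specified in this README; the digit sum modulo 10 used here, and the function names, are assumptions made only for the example.

```python
def chunk_to_index(chunk):
    """Map a two-digit chunk (0..99) to an index 0..999 with a check digit."""
    if not 0 <= chunk <= 99:
        raise ValueError("chunk must be in 0..99")
    tens, ones = divmod(chunk, 10)
    check = (tens + ones) % 10      # assumed checksum: digit sum modulo 10
    return chunk * 10 + check       # check digit becomes the third digit

def index_to_chunk(index):
    """Recover the chunk, rejecting indices whose check digit is wrong."""
    chunk = index // 10
    if chunk_to_index(chunk) != index:
        raise ValueError("check digit mismatch")
    return chunk

print(chunk_to_index(42))   # 426
print(index_to_chunk(426))  # 42
```

Only 100 of the 1000 indices are valid, so a misheard word has a good chance of being detected.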

If real numbers with an absolute value smaller than a certain maximum must be encoded, it might be best to first divide them by this maximum value, thereby converting them to the range from -0.9999... to 0.9999..., and then to encode just the fractional part: multiply by the MAX value of the list, take the integer part (a value in 0..MAX-1) as the next word index, and continue with the remaining fractional part until the desired precision is reached. (In other words, represent the number in the base given by the length of the wordlist.)
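The procedure can be sketched in Python with exact rational arithmetic. This is not the project's `dc` code; the function names are hypothetical, and the list lengths 1930 and 1001 are taken from the precision figure `1/(1930*1001)` quoted further down. The sketch yields word indices rather than words, since the lists themselves are not reproduced here.

```python
from fractions import Fraction

def encode_fraction(value, bases):
    """Return one index per wordlist length for a value in [0, 1)."""
    value = Fraction(value)
    if not 0 <= value < 1:
        raise ValueError("value must be in [0, 1)")
    indices = []
    for base in bases:
        value *= base
        digit = int(value)      # integer part is the next "digit"
        indices.append(digit)
        value -= digit          # continue with the fractional remainder
    return indices

def decode_fraction(indices, bases):
    """Inverse of encode_fraction, up to the truncated remainder."""
    value, scale = Fraction(0), Fraction(1)
    for digit, base in zip(indices, bases):
        scale /= base
        value += digit * scale
    return value

bases = [1930, 1001]  # assumed lengths of Dict/five and Dict/four
indices = encode_fraction("0.1234567", bases)
print(indices)                                   # [238, 271]
print(float(decode_fraction(indices, bases)))    # about 0.1234563
```

The decoded value is never larger than the input, because the remainder after the last word is simply dropped.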

A negative sign can be indicated by prepending `minus`, which is not present in the word lists.

In case additional verification is needed, a checksum might be calculated and e.g. its value modulo 23 be added to the encoding words, with the first 23 words of the ICAO/ITU alphabet (alfa..whiskey) corresponding to the values 0..22.
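A minimal sketch of such a check word follows. The README does not define the checksum, so the sum of the word indices is used here purely as an assumption, and `check_word` is a hypothetical name. The 23 words alfa..whiskey correspond to the values 0 to 22.

```python
# First 23 words of the ICAO/ITU spelling alphabet (alfa..whiskey).
ICAO = (
    "alfa bravo charlie delta echo foxtrot golf hotel india juliett "
    "kilo lima mike november oscar papa quebec romeo sierra tango "
    "uniform victor whiskey"
).split()

def check_word(indices):
    """Return the ICAO word for an assumed checksum: sum of indices mod 23."""
    return ICAO[sum(indices) % len(ICAO)]

print(check_word([238, 271]))  # delta, since 509 % 23 == 3
```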

Excluding the words x-ray, yankee, zulu keeps the sequence of encoding words free from `x`, `y`, `z` at the beginning. This may be helpful for specific applications.

The script `numwords.sh` takes as arguments either a list of words or a fractional number between 0 and 1, and converts them into the other type. The wordlists can be defined in the environment variable `NUMBERWORDS`.

The script uses the tool `dc` for arbitrary-precision arithmetic, and might therefore be difficult to understand. However, it simply implements conversion between fractional numbers in base 10 and in a mixed base defined by the sequence of lengths of the wordlists used.

With two words, one can encode about seven digits:

```
$ export NUMBERWORDS="Dict/five Dict/four"
$ ./numwords.sh
usage: ./numwords.sh [words|fractional number]
will convert between words out of wordlists,
and a fractional number (pattern /[.0-9]*/ i.e between 0 and 1)
wordlists: Dict/five Dict/four
(may be set with NUMBERWORDS from the environment)
allowing for 7 encoded digits or less
$ ./numwords.sh 0.1234567
bidder foam
$ ./numwords.sh bidder foam
.1234563
```

In this case, only six digits are reliably encoded.
*The value given in the usage information is only a rough estimate:
the effective precision depends on the lengths of the word lists
and their combination.*
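One way to estimate the number of reliably encoded digits is to count how many decimal digits fit into the number of distinct word sequences: n digits always get distinct sequences as long as 10^n does not exceed the product of the list lengths. The sketch below assumes the lengths 1930 and 1001 implied by the precision figure `1/(1930*1001)` quoted further down; `reliable_digits` is a hypothetical helper, not part of `numwords.sh`.

```python
import math

def reliable_digits(list_lengths):
    """Largest n such that 10**n <= product of the wordlist lengths."""
    combinations = math.prod(list_lengths)
    return math.floor(math.log10(combinations))

print(1930 * 1001)                    # 1931930 distinct two-word sequences
print(reliable_digits([1930, 1001]))  # 6
```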

The initial idea for the "numberwords" project is stolen from the project what3words, which assigns to every 3 m by 3 m patch of Earth's surface a unique combination of three words, its "address".

Unfortunately, the conversion algorithm is proprietary. The company promises that there will always be a free way to use the conversion facility, and that the algorithm will be transferred to some other entity in case the company ceases its operations, but this is not satisfying.

A free and open alternative is preferable, because that is the only future-proof way.

Different applications might require different resolutions of geographical coordinates; therefore we propose a slightly different way of encoding them, instead of cutting Earth's surface into pieces of 9 square meters.

For the discussion below, angles will be expressed in degrees, with a full circle corresponding to 360 degrees.

Geographical latitude (90 degrees south to 90 degrees north) can be trivially projected onto a flat surface, because there is a linear correspondence between its value and the distance on Earth's surface from the equator or one of its poles to the point corresponding to the latitude: the arc length given by the latitude angle and the (mean) radius of the Earth.

Geographical longitude (180 degrees west to 180 degrees east) on the other hand is not linearly dependent on the distance between meridian zero (whatever that reference may be) and the meridian passing through the corresponding point on Earth's surface: a difference in longitude of one degree corresponds to about 111 km at the equator, but reduces towards zero with increasing northern or southern latitude.

If simple linearisation is acceptable, the following conversion formula will be sufficient:

```
cLat = (90+Lat)/180
cLong = (180+Long)/360
```

where positive latitude is for the northern hemisphere, negative for the southern, and positive longitude is for the eastern hemisphere, negative for the western. This results in encoded values between 0 and 1 for both coordinates.

The inverse functions are trivial in this case:

```
Lat = 180*cLat-90
Long = 360*cLong-180
```
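Written out as Python functions, the two pairs of formulas look as follows. The function names are hypothetical; only the arithmetic is taken from the formulas above.

```python
def encode_coords(lat, lon):
    """Map latitude/longitude in degrees to fractions between 0 and 1."""
    c_lat = (90 + lat) / 180     # south pole -> 0, equator -> 0.5
    c_lon = (180 + lon) / 360    # 180 W -> 0, meridian zero -> 0.5
    return c_lat, c_lon

def decode_coords(c_lat, c_lon):
    """Inverse mapping back to degrees."""
    return 180 * c_lat - 90, 360 * c_lon - 180

c_lat, c_lon = encode_coords(-17.9246, 25.8567)
print(c_lat, c_lon)
print(decode_coords(c_lat, c_lon))
```

Note that the north pole (latitude 90) maps to exactly 1, which lies just outside the range a fractional encoder for values below 1 would accept; how to treat that edge case is left open here.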

Although this linearisation may be acceptable for most cases, it wastes precision at higher latitudes. One could therefore convert the two angle values of a coordinate pair jointly, making the encoding of the longitude depend on the latitude. However, the gain is rarely worth the complexity of the calculations needed, so this has not been implemented here.

The resulting fractional numbers now can be converted to a sequence of words, as described above, with the number of words chosen freely according to the resolution needed.

In the subdirectory `coordinates` of this repository is a shell script `coconv.sh` with a reference implementation of the coordinate conversion.

The script uses `dc` for arbitrary-precision arithmetic, and might therefore not be too easy to understand. It is however just the `dc` implementation of the functions described above.

The word lists used for word conversion are chosen at the beginning of the script; they can also be specified with the environment variable `NUMBERWORDS` as noted in the script source and usage information. Currently, four words will be generated for a coordinate pair, from the wordlists `five`, `four`, `five`, `four`.

The combination of wordlists `five` and `four` results in a precision of `1/(1930*1001)` or approximately 5E-7. For latitude (with Earth's half circle of about 20,000 km), the absolute precision is close to 10 m. For longitude, it varies from about 21 m at the equator to 15 m at 45 degrees north or south and 10 m at 60 degrees, and tends to 0 m at the poles.
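These figures can be reproduced with a short calculation. The sketch assumes a spherical Earth with a half circle of exactly 20,000 km, as in the text above, and scales the longitude step by the cosine of the latitude; `resolution_m` is a hypothetical helper name.

```python
import math

HALF_CIRCLE_M = 20_000_000   # pole to pole along a meridian, in metres
COMBINATIONS = 1930 * 1001   # distinct sequences of one five + one four word

def resolution_m(lat_deg):
    """Return (latitude step, longitude step) in metres at a given latitude."""
    lat_step = HALF_CIRCLE_M / COMBINATIONS
    # The full circle is twice the half circle and shrinks with cos(latitude).
    lon_step = 2 * HALF_CIRCLE_M / COMBINATIONS * math.cos(math.radians(lat_deg))
    return lat_step, lon_step

for lat in (0, 45, 60):
    lat_step, lon_step = resolution_m(lat)
    print(f"{lat:2d} deg: lat {lat_step:.1f} m, lon {lon_step:.1f} m")
```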

Convert the coordinates of the Victoria Falls according to OpenStreetMap (OSM) into a four-word sequence:

```
$ ./coconv.sh -17.9246 25.8567
fulcrum sole motive pest
```

Convert the coordinates of the post office at Livingstone Way in Victoria Falls:

```
$ ./coconv.sh -17.9270 25.8406
fulcrum skip motive mock
```

Convert the last result back into a URL for direct OSM display:

```
$ ./coconv.sh :osm fulcrum skip motive mock
http://openstreetmap.org/?mlat=-17.9270460&mlon=25.8404400
```

(about 5 m south and 17 m west of the initial coordinates)

*(2015 Y.C.Bonetti)*