The coordinates in the original UN/LOCODE list have major problems:
1. Only 80% of locations have coordinates
The missing entries aren't just tiny villages; they include major transport hubs like Shanghai Port (CNSHG), Port of Shenzhen (CNSZP), Hong Kong (HKHKG) and Los Angeles (USLAX).
2. Many coordinates are just wrong
Problems include typos (AUMID), coordinates pointing to the wrong country (CKPZK) and coordinates that are just flat out wrong (EGSCN).
This project aims to solve most of these cases by combining the data with data from OpenStreetMap's Nominatim and Wikidata.
3. Multiple coordinate formats
Most UN/LOCODE entries use a specific coordinate format in degrees and minutes, like USNYC: 4042N 07400W. However, this is not always true. Entries in Bhutan like BTPDL have decimal coordinates: 26.8128N 89.1903E. This project solves this with two columns: the Coordinates column has the UN/LOCODE style degrees, while the CoordinatesDecimal column has a decimal representation.
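
As a rough illustration of the two formats described above, the sketch below converts either style to signed decimal degrees. The function name `to_decimal` and the regular expressions are assumptions made for this example, not the project's actual code.

```python
import re
from typing import Tuple

# UN/LOCODE style: DDMM[N|S] DDDMM[E|W], e.g. "4042N 07400W"
_DDMM = re.compile(r"^(\d{2})(\d{2})([NS])\s+(\d{3})(\d{2})([EW])$")
# Decimal style seen in some entries, e.g. "26.8128N 89.1903E"
_DEC = re.compile(r"^(\d+\.\d+)([NS])\s+(\d+\.\d+)([EW])$")


def to_decimal(text: str) -> Tuple[float, float]:
    """Return (latitude, longitude) in signed decimal degrees (hypothetical helper)."""
    text = text.strip()
    m = _DDMM.match(text)
    if m:
        lat_d, lat_m, ns, lon_d, lon_m, ew = m.groups()
        # Minutes are sixtieths of a degree.
        lat = int(lat_d) + int(lat_m) / 60
        lon = int(lon_d) + int(lon_m) / 60
    else:
        m = _DEC.match(text)
        if m is None:
            raise ValueError(f"unrecognised coordinate format: {text!r}")
        lat_s, ns, lon_s, ew = m.groups()
        lat, lon = float(lat_s), float(lon_s)
    # South and west are negative in the decimal representation.
    if ns == "S":
        lat = -lat
    if ew == "W":
        lon = -lon
    return lat, lon
```

With this sketch, `to_decimal("4042N 07400W")` yields roughly 40.7 degrees north and 74.0 degrees west, and the Bhutan-style input is passed through as is.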
You can find the improved list as code-list-improved.csv. It has corrected coordinates, and far more of them (98.3%).
3 extra columns are created:
- CoordinatesDecimal: the coordinates in decimal format.
- A distance column: either the distance between the UN/LOCODE and Nominatim coordinates, or "N/A (no UN/LOCODE)", "N/A (no Nominatim)" or "N/A" (when no result is found).
- Source: either "N/A" when there are no coordinates, "UN/LOCODE", a link to the entry in OpenStreetMap, or a link to the entry in Wikidata.

These columns can be used to determine whether you'd want to have a human double-check the coordinates or not.
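
One possible way to use the distance and source values for that decision is sketched below. It assumes the distance cell is either one of the "N/A" strings or a number in kilometres (optionally followed by a unit); the exact cell format, the function name `needs_review` and the 100 km default are assumptions for illustration only.

```python
def needs_review(distance: str, source: str, max_km: float = 100.0) -> bool:
    """Flag a row for a human check (hypothetical helper, assumed cell formats)."""
    # No coordinates at all: someone has to look the location up.
    if source.strip() == "N/A":
        return True
    cell = distance.strip()
    # "N/A", "N/A (no UN/LOCODE)", "N/A (no Nominatim)": nothing to cross-check against.
    if cell.startswith("N/A"):
        return True
    try:
        # Assumption: the first token is the distance in kilometres.
        km = float(cell.split()[0])
    except (ValueError, IndexError):
        # Unparseable cell: err on the side of a manual check.
        return True
    return km > max_km
```

Flagging every "N/A" distance is a deliberately cautious choice; a stricter or looser policy may suit your data better.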
The coordinates are chosen as follows:
- When the coordinates can't be found with Nominatim, choose UN/LOCODE
- When no coordinates are specified in UN/LOCODE, choose Nominatim, but only if the region matches or UN/LOCODE hasn't specified a working region
- When the UN/LOCODE coordinates match the ones from Nominatim within 100km (even when the region doesn't match!), choose UN/LOCODE
- Choose the first hit from Nominatim
- When that doesn't exist, choose the result from Wikidata
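
The selection rules above can be sketched roughly as follows. This is one reading of the list, not the project's actual implementation: the names `choose` and `haversine_km`, the `region_ok` flag, the use of the haversine formula, and the order in which Wikidata is consulted are all assumptions.

```python
from math import asin, cos, radians, sin, sqrt
from typing import Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance in km on a sphere of radius 6371 km."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def choose(
    unlocode: Optional[Coord],
    nominatim: Optional[Coord],  # first Nominatim hit, if any
    wikidata: Optional[Coord],
    region_ok: bool,  # region matches, or UN/LOCODE has no working region
    max_km: float = 100.0,
) -> Tuple[str, Optional[Coord]]:
    """Return (source label, coordinates) following one reading of the rules."""
    if nominatim is None:
        if unlocode is not None:
            return "UN/LOCODE", unlocode
        # Assumption: Wikidata is the fallback when neither source has a result.
        return ("Wikidata", wikidata) if wikidata is not None else ("N/A", None)
    if unlocode is None:
        if region_ok:
            return "Nominatim", nominatim
        # Assumption: a region mismatch falls through to Wikidata.
        return ("Wikidata", wikidata) if wikidata is not None else ("N/A", None)
    if haversine_km(unlocode, nominatim) <= max_km:
        return "UN/LOCODE", unlocode  # close enough, regardless of region
    return "Nominatim", nominatim
```

For example, two points one degree of longitude apart on the equator are about 111 km apart, so `choose((0.0, 0.0), (0.0, 1.0), None, True)` picks the Nominatim hit under this sketch.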
Other than that, all differences between UN/LOCODE and Nominatim have been manually (if quickly) checked, and the correct coordinates are manually specified. Differences between Wikidata and this list have also been checked and the correct ones manually specified, making this list as reliable as you can reasonably expect.
This project also contains extra scripts to automatically detect problems with the UN/LOCODE dataset, like incorrect regions.
The United Nations Code for Trade and Transport Locations is a code list maintained by UNECE, a United Nations agency, to facilitate trade. The list comes from the UNECE page and is released twice a year.
All unlocode data is licensed under the ODC Public Domain Dedication and Licence (PDDL).
ODbL 1.0. http://osm.org/copyright
CC-0 (No rights reserved)
CC-0 (No rights reserved)