Geocoding using the OSM datasets with trigrams
In this recipe, you will use OpenStreetMap street datasets imported into PostGIS to implement a very basic Python class that provides geocoding features to its consumers. The geocoding engine will be based on the PostgreSQL trigram implementation provided by the pg_trgm contrib module.
A trigram is a group of three consecutive characters contained in a string. Counting the number of trigrams two strings have in common is a very effective way to measure their similarity.
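To make the idea concrete, here is a minimal pure-Python sketch of trigram similarity. It is not the pg_trgm implementation: it only approximates the documented behavior (lowercasing, splitting into alphanumeric words, padding each word with two leading spaces and one trailing space, then dividing the number of shared trigrams by the number of distinct trigrams overall). The function names `trigrams` and `similarity` are chosen here for illustration, and the ASCII-only word pattern is a simplifying assumption.

```python
import re


def trigrams(text):
    """Return the set of trigrams in text, loosely mimicking pg_trgm.

    Assumption: words are runs of ASCII letters and digits; each word is
    padded with two leading spaces and one trailing space.
    """
    result = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        # Slide a three-character window over the padded word.
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    print(sorted(trigrams("cat")))  # ['  c', ' ca', 'at ', 'cat']
    print(similarity("via roma", "via rome"))
```

Because the score depends only on shared character groups, a misspelled or partially typed street name can still score close to the stored name, which is what makes trigrams useful for matching user input in a geocoder.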
This recipe is meant as a very basic example of geocoding functionality (it just returns one or more points for a street name), but it could be extended to support more advanced features.
Getting ready
- For this recipe, make sure you have the latest GDAL, at least version 1.10, as you will use ogr2ogr with the OGR OSM driver (http://www.gdal.org/drv_osm.html):
$ ogrinfo --version
GDAL 2.1.2, released...