Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Mastering PostGIS

You're reading from   Mastering PostGIS Modern ways to create, analyze, and implement spatial data

Arrow left icon
Product type Paperback
Published in May 2017
Publisher Packt
ISBN-13 9781784391645
Length 328 pages
Edition 1st Edition
Languages
Tools
Arrow right icon
Authors (4):
Arrow left icon
Dominik Mikiewicz Dominik Mikiewicz
Author Profile Icon Dominik Mikiewicz
Dominik Mikiewicz
Michal Mackiewicz Michal Mackiewicz
Author Profile Icon Michal Mackiewicz
Michal Mackiewicz
Tomasz Nycz Tomasz Nycz
Author Profile Icon Tomasz Nycz
Tomasz Nycz
George Silva George Silva
Author Profile Icon George Silva
George Silva
Arrow right icon
View More author details
Toc

Table of Contents (10) Chapters Close

Preface 1. Importing Spatial Data FREE CHAPTER 2. Spatial Data Analysis 3. Data Processing - Vector Ops 4. Data Processing - Raster Ops 5. Exporting Spatial Data 6. ETL Using Node.js 7. PostGIS – Creating Simple WebGIS Applications 8. PostGIS Topology 9. pgRouting

Importing flat data

Loading flat data may seem to be a bit dull initially but it is important to understand that many popular and interesting datasets often contain the spatial information in very different formats, such as:

  • Coordinates expressed in Lon/Lat or projected coordinates
  • Encoded geometry, for example WKT, TopoJSON, GeoJSON
  • Location in the form of an address
  • Location in non-cartesian coordinates, for example start point, angle and direction
  • While the earlier examples indicate the data would require further processing in order to extract the spatial content into a usable form, clearly ability to import flat datasets should not be underestimated
Flat data in our scenario is data with no explicitly expressed geometry - non-spatial format, text-based files

Importing data using psql

Psql is the pgsql's command-line tool. While one can achieve quite a lot with GUI based database management utilities, psql is very useful when one needs to handle database backups, management and alike via scripting. When there is no GUI installed on the server, psql becomes pretty much the only option so it is worth being familiar with it even if you're not a fan.

In order to import the data in psql we will use a \COPY command. This requires us to define the data model for the incoming data first.

Defining the table data model from a text file may be prone to errors that will prevent data from being imported. If for of some reason you are not sure what data types are stored in the particular columns of your source file you can import all the data as text and then re-cast it as required at a later time.

Importing data interactively

In this example we will import the earthquakes data from USGS. So let's fire up psql and connect to the database server:

F:\mastering_postgis\chapter02>psql -h localhost -p 5434 -U postgres

You should see a similar output:

    psql (9.5.0)
Type "help" for help.
postgres=#

Then we need to connect to the mastering_postgis database:

    postgres=# \c mastering_postgis

The following output should be displayed:

    You are now connected to database "mastering_postgis" as user 
"postgres".
mastering_postgis=#
In the psql examples I am using postgres user. As you may guess, it is a superuser account. This is not the thing you would normally do, but it will keep the examples simple.
In a production environment, using a db user with credentials allowing access to specific resources is a sensible approach.

If you have not had a chance to create our data_import schema, let's take care of it now by typing the following command:

    mastering_postgis=# create schema if not exists data_import;

You should see a similar output:

    NOTICE:  schema "data_import" already exists, skipping
CREATE SCHEMA

Once the schema is there, we create the table that will store the data. In order to do so just type or paste the following into psql:

  create table data_import.earthquakes_csv ( 
"time" timestamp with time zone,
latitude numeric,
longitude numeric,
depth numeric,
mag numeric,
magType varchar,
nst numeric,
gap numeric,
dmin numeric,
rms numeric,
net varchar,
id varchar,
updated timestamp with time zone,
place varchar,
type varchar,
horizontalError numeric,
depthError numeric,
magError numeric,
magNst numeric,
status varchar,
locationSource varchar,
magSource varchar
);

You should see the following output:

    mastering_postgis=# create table data_import.earthquakes_csv (
mastering_postgis(# "time" timestamp with time zone,
mastering_postgis(# latitude numeric,
mastering_postgis(# longitude numeric,
mastering_postgis(# depth numeric,
mastering_postgis(# mag numeric,
mastering_postgis(# magType varchar,
mastering_postgis(# nst numeric,
mastering_postgis(# gap numeric,
mastering_postgis(# dmin numeric,
mastering_postgis(# rms numeric,
mastering_postgis(# net varchar,
mastering_postgis(# id varchar,
mastering_postgis(# updated timestamp with time zone,
mastering_postgis(# place varchar,
mastering_postgis(# type varchar,
mastering_postgis(# horizontalError numeric,
mastering_postgis(# depthError numeric,
mastering_postgis(# magError numeric,
mastering_postgis(# magNst numeric,
mastering_postgis(# status varchar,
mastering_postgis(# locationSource varchar,
mastering_postgis(# magSource varchar
mastering_postgis(# );
CREATE TABLE

Now, as we have our data table ready, we can finally get to the import part. The following command should handle importing the data into our newly created table:

\copy data_import.earthquakes_csv from data\2.5_day.csv with DELIMITER ',' CSV HEADER

You should see a similar output:

mastering_postgis=# \copy data_import.earthquakes_csv from data\2.5_day.csv with DELIMITER ',' CSV HEADER
COPY 25
If you require a complete reference on the \COPY command, simply type in: \h COPY into the cmd.

While you can customize your data after importing, you may wish to only import a subset of columns in the first place. Unfortunately \COPY command imports all the columns (although you may specify where to put them) so the solution to this would be:

  • Create a table that models the input CSV
  • Import all the data
  • Create a table with a subset of columns
  • Copy data over
  • Delete the input table

Even though everything said earlier is possible in psql, it requires quite a lot of typing. Because of that we will take care of this scenario in the next stage. Demonstrating the non-interactive psql mode.

Importing data non-interactively

For the non-interactive psql data import example we'll do a bit more than in the interactive mode. We'll:

  • Import the full earthquakes dataset
  • Select a subset of earthquakes data mentioned in the previous example and insert it into its own table
  • Import another dataset - in this case the Ordnance Survey's POIs

Basically the non-interactive usage of psql means we simply provide it with an SQL to execute. This way we can put together many statements without having to execute them one by one.

Once again we will need the data model prior to loading the data, and then a \COPY command will be used.

If you're still in psql, you can execute a script by simply typing:

\i path\to\the\script.sql

For example:

\i F:/mastering_postgis/chapter02/code/data_import_earthquakes.sql

You should see a similar output:

mastering_postgis-# \i F:/mastering_postgis/chapter02/code/data_import_earthquakes.sql
CREATE SCHEMA
psql:F:/mastering_postgis/chapter02/code/data_import_earthquakes.sql:5: NOTICE: table "earthquakes_csv" does not exist, skipping
DROP TABLE
CREATE TABLE
COPY 25
psql:F:/mastering_postgis/chapter02/code/data_import_earthquakes.sql:58: NOTICE: table "earthquakes_csv_subset" does not exist, skipping
DROP TABLE
SELECT 25
mastering_postgis-#

If you quit psql already, type the following command into cmd:

psql -h host -p port -U user -d database -f path\to\the\script.sql

For example:

psql -h localhost -p 5434 -U postgres -d mastering_postgis -f F:\mastering_postgis\chapter02\code\data_import_earthquakes.sql

You should see a similar output:

F:\mastering_postgis\chapter02>psql -h localhost -p 5434 -U postgres -d mastering_postgis -f F:\mastering_postgis\chapter02\code\data_import_earthquakes.sql
psql:F:/mastering_postgis/chapter02/code/data_import_earthquakes.sql:2: NOTICE: schema "data_import" already exists, skipping
CREATE SCHEMA
DROP TABLE
CREATE TABLE
COPY 25
DROP TABLE
SELECT 25

The script executed earlier is in the book's code repository under Chapter02/code/ data_import_earthquakes.sql.

Loading OS POI data is now a piece of cake. This dataset is in a bit of a different format though, so it requires slight adjustments. You can review the code in Chapter02/code/ data_import_gb_poi.sql.

Importing data using pgAdmin

In this section we'll import some new data we have not interacted with before - this time we'll have a look at the Ordnance Survey's address data we obtained in the CSV format.

Depending on the pgAdmin version, the UI may differ a bit. The described functionality should always be present though.
For the examples involving pgAdmin, screenshots were taken using pgAdmin III (1.22.2).

PgAdmin's import functionality is basically a wrapper around the \COPY so it does require a data model in order to work. Because of that, let's quickly create a table that will be populated with the imported data. You can do it with the GUI by simply right-clicking a schema node you want to create the table in and choosing New Object | New Table and then providing all the necessary model definitions in the displayed window:

You can also type some SQL which in many cases is a bit quicker:

drop table if exists data_import.osgb_addresses; 
create table data_import.osgb_addresses(
uprn bigint,
os_address_toid varchar,
udprn integer,
organisation_name varchar,
department_name varchar,
po_box varchar,
sub_building_name varchar,
building_name varchar,
building_number varchar,
dependent_thoroughfare varchar,
thoroughfare varchar,
post_town varchar,
dbl_dependent_locality varchar,
dependent_locality varchar,
postcode varchar,
postcode_type varchar,
x numeric,
y numeric,
lat numeric,
lon numeric,
rpc numeric,
country varchar,
change_type varchar,
la_start_date date,
rm_start_date date,
last_update_date date,
class varchar
);

Once our table is ready, importing data is just a matter of right clicking the table node in PgAdmin and choosing Import. An import wizard that assists with the import process will be displayed:

All the earlier could obviously be achieved with pure SQL and in fact we have done this already in the previous section on importing data in psql in non-interactive mode. You can review the SQL code available in Chapter02/code for details.

Extracting spatial information from flat data

As we have some flat data already in our database, it's time to extract the spatial information. So far all the datasets, used Cartesian coordinate systems so our job is really straightforward:

drop table if exists data_import.earthquakes_subset_with_geom; 
select
id,
"time",
depth,
mag,
magtype,
place,Points of Interest in TXT format
ST_SetSRID(ST_Point(longitude, latitude), 4326) as geom
into data_import.earthquakes_subset_with_geom
from data_import.earthquakes_csv;

This example extracts a subset of data and puts data into a new table with coordinates being expressed as a geometry type, rather than two columns with numeric data appropriate for Lon and Lat.

In order to quickly preview the data, we dump the table's content to KML using ogr2ogr (this is a little spoiler on the next chapter on exporting the data from PostGIS indeed):

ogr2ogr -f "KML" earthquakes_from_postgis.kml PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" data_import.earthquakes_subset_with_geom -t_srs EPSG:4326

Such KML can be viewed for example in Google Earth (you can use the original KML downloaded from USGS just as a cross check for the output data):

More examples of extracting the spatial data from different formats are addressed in the ETL chapter.
You have been reading a chapter from
Mastering PostGIS
Published in: May 2017
Publisher: Packt
ISBN-13: 9781784391645
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime