PostGIS Cookbook

You're reading from PostGIS Cookbook: Store, organize, manipulate, and analyze spatial data

Product type: Paperback
Published in: Mar 2018
ISBN-13: 9781788299329
Length: 584 pages
Edition: 2nd Edition
Authors (6): Pedro Wightman, Bborie Park, Paolo Corti, Stephen Vincent Mather, Thomas Kraft, Mayra Zurbarán
Table of Contents (14 chapters)

Preface
1. Moving Data In and Out of PostGIS (free chapter)
2. Structures That Work
3. Working with Vector Data – The Basics
4. Working with Vector Data – Advanced Recipes
5. Working with Raster Data
6. Working with pgRouting
7. Into the Nth Dimension
8. PostGIS Programming
9. PostGIS and the Web
10. Maintenance, Optimization, and Performance Tuning
11. Using Desktop Clients
12. Introduction to Location Privacy Protection Mechanisms
13. Other Books You May Enjoy

Handling batch importing and exporting of datasets

A typical scenario in many GIS workflows is that subsets of a PostGIS table must be delivered to external users in a file-based format (most commonly, shapefiles or a SpatiaLite database). The reverse process is just as common: datasets received from different users have to be uploaded to the PostGIS database.

In this recipe, we will simulate both of these data flows. You will first create the data flow that processes the shapefiles out of PostGIS, and then the reverse data flow that uploads the shapefiles.

You will do both using the power of bash scripting and the ogr2ogr command.

Getting ready

If you haven't followed the other recipes, be sure to import the hotspots (Global_24h.csv) and countries (countries.shp) datasets into PostGIS. The following steps show how to do it with ogr2ogr (import both datasets in their original SRID, 4326, to make spatial operations faster):

  1. Import the Global_24h.csv file into PostGIS, using the global_24h.vrt virtual driver you created in a previous recipe:

      $ ogr2ogr -f PostgreSQL \
          PG:"dbname='postgis_cookbook' user='me' password='mypassword'" \
          -lco SCHEMA=chp01 global_24h.vrt \
          -lco OVERWRITE=YES -lco GEOMETRY_NAME=the_geom -nln hotspots
  2. Import the countries shapefile using ogr2ogr:

      $ ogr2ogr -f PostgreSQL \
          -sql "SELECT ISO2, NAME AS country_name FROM wborders" \
          -nlt MULTIPOLYGON \
          PG:"dbname='postgis_cookbook' user='me' password='mypassword'" \
          -nln countries -lco SCHEMA=chp01 -lco OVERWRITE=YES \
          -lco GEOMETRY_NAME=the_geom wborders.shp
If you have already imported the hotspots dataset using the 3857 SRID, you can use the PostGIS 2.0 method that allows you to modify the geometry type column of an existing spatial table. Thanks to typmod support on geometry objects, you can update the SRID definition of the hotspots table in this way:

postgis_cookbook=# ALTER TABLE chp01.hotspots
  ALTER COLUMN the_geom
  SET DATA TYPE geometry(Point, 4326)
  USING ST_Transform(the_geom, 4326);

How to do it...

The steps you need to follow to complete this recipe are as follows:

  1. Check how many hotspots there are for each distinct country by using the following query:

      postgis_cookbook=> SELECT c.country_name, MIN(c.iso2) AS iso2,
        count(*) AS hs_count
      FROM chp01.hotspots AS hs
      JOIN chp01.countries AS c ON ST_Contains(c.the_geom, hs.the_geom)
      GROUP BY c.country_name ORDER BY c.country_name;

The output of the preceding command is as follows:

  2. Using the same query, generate a CSV file with either the ogr2ogr command or the PostgreSQL COPY command (in the latter case, make sure that the PostgreSQL service user has full write permission on the output directory; if you are using Windows, be sure to replace /tmp/hs_countries.csv with a different path):

      $ ogr2ogr -f CSV hs_countries.csv \
          PG:"dbname='postgis_cookbook' user='me' password='mypassword'" \
          -lco SCHEMA=chp01 \
          -sql "SELECT c.country_name, MIN(c.iso2) as iso2, count(*) as hs_count FROM chp01.hotspots as hs JOIN chp01.countries as c ON ST_Contains(c.the_geom, hs.the_geom) GROUP BY c.country_name ORDER BY c.country_name"

      postgis_cookbook=> COPY (SELECT c.country_name, MIN(c.iso2) AS iso2,
        count(*) AS hs_count
      FROM chp01.hotspots AS hs
      JOIN chp01.countries AS c ON ST_Contains(c.the_geom, hs.the_geom)
      GROUP BY c.country_name
      ORDER BY c.country_name)
      TO '/tmp/hs_countries.csv' WITH CSV HEADER;
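Either way, the result is a plain CSV file with one row per country. As a quick, database-free sanity check of the expected structure (the sample file below is hypothetical, standing in for the real hs_countries.csv):

```shell
# Hypothetical sample standing in for the exported hs_countries.csv
printf 'country_name,iso2,hs_count\nAlbania,AL,66\nAlgeria,DZ,361\n' > /tmp/hs_countries_sample.csv

# Count the data rows (skipping the header); each must have three fields
awk -F',' 'NR > 1 && NF == 3 { n++ } END { print n " countries exported" }' /tmp/hs_countries_sample.csv
# prints "2 countries exported"
```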
  3. If you are using Windows, go to step 5. On Linux, create a bash script named export_shapefiles.sh that iterates over each record (country) in the hs_countries.csv file and generates a shapefile with the corresponding hotspots exported from PostGIS for that country:

        #!/bin/bash
        while IFS="," read -r country iso2 hs_count
        do
          echo "Generating shapefile $iso2.shp for country $country ($iso2) containing $hs_count features."
          ogr2ogr out_shapefiles/$iso2.shp \
            PG:"dbname='postgis_cookbook' user='me' password='mypassword'" \
            -lco SCHEMA=chp01 \
            -sql "SELECT ST_Transform(hs.the_geom, 4326), hs.acq_date, hs.acq_time, hs.bright_t31 FROM chp01.hotspots as hs JOIN chp01.countries as c ON ST_Contains(c.the_geom, ST_Transform(hs.the_geom, 4326)) WHERE c.iso2 = '$iso2'"
        done < hs_countries.csv
  4. Give execution permission to the bash file, and then run it after creating an output directory (out_shapefiles) for the shapefiles that will be generated by the script. Then, go to step 7:

      $ chmod 775 export_shapefiles.sh
      $ mkdir out_shapefiles
      $ ./export_shapefiles.sh
      Generating shapefile AL.shp for country Albania (AL) containing 66 features.
      Generating shapefile DZ.shp for country Algeria (DZ) containing 361 features.
      ...
      Generating shapefile ZM.shp for country Zambia (ZM) containing 1575 features.
      Generating shapefile ZW.shp for country Zimbabwe (ZW) containing 179 features.
If you get the output ERROR: function getsrid(geometry) does not exist LINE 1: SELECT getsrid("the_geom") FROM (SELECT,..., you will need to load legacy support into PostGIS, for example, on a Debian Linux box:

      $ psql -d postgis_cookbook -f /usr/share/postgresql/9.1/contrib/postgis-2.1/legacy.sql
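The heart of export_shapefiles.sh is the while IFS="," read loop: with IFS set to a comma, read splits each CSV line into the three shell variables. A standalone sketch of the same pattern (the sample rows are made up, and echo stands in for the ogr2ogr export):

```shell
# Tiny stand-in for hs_countries.csv (hypothetical rows, no header)
printf 'Albania,AL,66\nAlgeria,DZ,361\n' > /tmp/sample_countries.csv

# Same parsing pattern as export_shapefiles.sh: IFS="," splits each
# line into three variables; echo stands in for the ogr2ogr call
while IFS="," read -r country iso2 hs_count
do
  echo "would export $hs_count features for $country into $iso2.shp"
done < /tmp/sample_countries.csv
# prints:
#   would export 66 features for Albania into AL.shp
#   would export 361 features for Algeria into DZ.shp
```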
  5. If you are using Windows, create a batch file named export_shapefiles.bat that iterates over each record (country) in the hs_countries.csv file and generates a shapefile with the corresponding hotspots exported from PostGIS for that country:

        @echo off
        for /f "tokens=1-3 delims=, skip=1" %%a in (hs_countries.csv) do (
          echo "Generating shapefile %%b.shp for country %%a (%%b) containing %%c features"
          ogr2ogr .\out_shapefiles\%%b.shp ^
            PG:"dbname='postgis_cookbook' user='me' password='mypassword'" ^
            -lco SCHEMA=chp01 ^
            -sql "SELECT ST_Transform(hs.the_geom, 4326), hs.acq_date, hs.acq_time, hs.bright_t31 FROM chp01.hotspots as hs JOIN chp01.countries as c ON ST_Contains(c.the_geom, ST_Transform(hs.the_geom, 4326)) WHERE c.iso2 = '%%b'"
        )
  6. Run the batch file after creating an output directory (out_shapefiles) for the shapefiles that will be generated by the script:

      >mkdir out_shapefiles
      >export_shapefiles.bat
      "Generating shapefile AL.shp for country Albania (AL) containing 66 features"
      "Generating shapefile DZ.shp for country Algeria (DZ) containing 361 features"
      ...
      "Generating shapefile ZW.shp for country Zimbabwe (ZW) containing 179 features"
  7. Try to open a couple of these output shapefiles in your favorite desktop GIS. The following screenshot shows you how they look in QGIS:
  8. Now, you will do the return trip, uploading all of the generated shapefiles to PostGIS. You will upload all of the features of each shapefile, including the upload datetime and the original shapefile name. First, create the following PostgreSQL table, into which you will upload the shapefiles:

      postgis_cookbook=# CREATE TABLE chp01.hs_uploaded
      (
        ogc_fid serial NOT NULL,
        acq_date character varying(80),
        acq_time character varying(80),
        bright_t31 character varying(80),
        iso2 character varying,
        upload_datetime character varying,
        shapefile character varying,
        the_geom geometry(POINT, 4326),
        CONSTRAINT hs_uploaded_pk PRIMARY KEY (ogc_fid)
      );
  9. If you are using Windows, go to step 12. On OS X, you will need to install findutils with Homebrew and then run the Linux script:

      $ brew install findutils
  10. On Linux, create another bash script named import_shapefiles.sh:

        #!/bin/bash
        for f in `find out_shapefiles -name \*.shp -printf "%f\n"`
        do
          echo "Importing shapefile $f to chp01.hs_uploaded PostGIS table..."
          ogr2ogr -append -update -f PostgreSQL \
            PG:"dbname='postgis_cookbook' user='me' password='mypassword'" \
            out_shapefiles/$f -nln chp01.hs_uploaded \
            -sql "SELECT acq_date, acq_time, bright_t31, '${f%.*}' AS iso2, '`date`' AS upload_datetime, 'out_shapefiles/$f' as shapefile FROM ${f%.*}"
        done
  11. Assign execution permission to the bash script and execute it:

      $ chmod 775 import_shapefiles.sh
      $ ./import_shapefiles.sh
      Importing shapefile DO.shp to chp01.hs_uploaded PostGIS table...
      Importing shapefile ID.shp to chp01.hs_uploaded PostGIS table...
      Importing shapefile AR.shp to chp01.hs_uploaded PostGIS table...
      ...

Now, go to step 14.

  12. If you are using Windows, create a batch script named import_shapefiles.bat:

        @echo off
        for %%I in (out_shapefiles\*.shp*) do (
          echo Importing shapefile %%~nxI to chp01.hs_uploaded PostGIS table...
          ogr2ogr -append -update -f PostgreSQL ^
            PG:"dbname='postgis_cookbook' user='me' password='mypassword'" ^
            out_shapefiles/%%~nxI -nln chp01.hs_uploaded ^
            -sql "SELECT acq_date, acq_time, bright_t31, '%%~nI' AS iso2, '%date%' AS upload_datetime, 'out_shapefiles/%%~nxI' as shapefile FROM %%~nI"
        )
  13. Run the batch script:

      >import_shapefiles.bat
      Importing shapefile AL.shp to chp01.hs_uploaded PostGIS table...
      Importing shapefile AO.shp to chp01.hs_uploaded PostGIS table...
      Importing shapefile AR.shp to chp01.hs_uploaded PostGIS table...
      ...
  14. Check some of the records that have been uploaded to the PostGIS table by using SQL:

      postgis_cookbook=# SELECT upload_datetime, shapefile, ST_AsText(the_geom)
      FROM chp01.hs_uploaded WHERE iso2 = 'AT';

The output of the preceding command is as follows:

  15. Check the same query with ogrinfo as well:

      $ ogrinfo PG:"dbname='postgis_cookbook' user='me' password='mypassword'" \
          chp01.hs_uploaded -where "iso2='AT'"

The output of the preceding command is as follows:

How it works...

You implemented both data flows (processing shapefiles out of PostGIS, and then back into it) thanks to the power of the ogr2ogr GDAL command.

You have used this command in different forms, and with its most important input parameters, in other recipes, so you should now have a good understanding of it.

Here, it is worth mentioning how OGR lets you export the current datetime and the original shapefile name to the PostGIS table. Inside the import_shapefiles.sh (Linux, OS X) or import_shapefiles.bat (Windows) scripts, the core is the line with the ogr2ogr command (here is the Linux version):

ogr2ogr -append -update  -f PostgreSQL PG:"dbname='postgis_cookbook' user='me' password='mypassword'" out_shapefiles/$f -nln chp01.hs_uploaded -sql "SELECT acq_date, acq_time, bright_t31, '${f%.*}' AS iso2, '`date`' AS upload_datetime, 'out_shapefiles/$f' as shapefile FROM ${f%.*}" 

Thanks to the -sql option, you can specify the two additional fields, taking their values from the system date command and from the filename being iterated over by the script.
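The two shell constructs doing the work in that command are the backticked date (command substitution, evaluated once per loop iteration) and ${f%.*} (parameter expansion that strips the shortest suffix matching .*, turning a filename such as AT.shp into the bare layer name AT). A minimal sketch of both, independent of any database:

```shell
f="AT.shp"

# ${f%.*} strips the shortest trailing match of ".*",
# leaving the bare layer/country code used in the -sql clause
layer=${f%.*}
echo "layer name: $layer"          # prints "layer name: AT"

# Command substitution captures the system date, as used
# for the upload_datetime field
upload_datetime=$(date)
echo "upload time: $upload_datetime"
```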
