In this example we will import the earthquakes data from USGS. So let's fire up psql and connect to the database server:
F:\mastering_postgis\chapter02>psql -h localhost -p 5434 -U postgres
You should see a similar output:
psql (9.5.0)
Type "help" for help.
postgres=#
Then we need to connect to the mastering_postgis database:
postgres=# \c mastering_postgis
The following output should be displayed:
You are now connected to database "mastering_postgis" as user
"postgres".
mastering_postgis=#
In the psql examples I am using the postgres user. As you may guess, this is a superuser account. This is not something you would normally do, but it keeps the examples simple.
In a production environment, using a database user whose credentials grant access only to the specific resources it needs is a sensible approach.
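For instance, a restricted role suitable for these examples could be sketched as follows. This is a minimal illustration only; the role name and password are made up, and you would tailor the grants to your own schemas:

```sql
-- hypothetical restricted role; name and password are illustrative
create role data_importer with login password 'change_me';

-- allow the role to use the data_import schema and create tables in it
grant usage, create on schema data_import to data_importer;

-- allow reading and writing the existing tables in that schema
grant select, insert on all tables in schema data_import to data_importer;
```

With such a role in place you would connect with `psql -U data_importer` instead of the superuser account.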
If you have not yet created the data_import schema, let's take care of it now by typing the following command:
mastering_postgis=# create schema if not exists data_import;
You should see a similar output:
NOTICE: schema "data_import" already exists, skipping
CREATE SCHEMA
Once the schema is there, we create the table that will store the data. In order to do so just type or paste the following into psql:
create table data_import.earthquakes_csv (
"time" timestamp with time zone,
latitude numeric,
longitude numeric,
depth numeric,
mag numeric,
magType varchar,
nst numeric,
gap numeric,
dmin numeric,
rms numeric,
net varchar,
id varchar,
updated timestamp with time zone,
place varchar,
type varchar,
horizontalError numeric,
depthError numeric,
magError numeric,
magNst numeric,
status varchar,
locationSource varchar,
magSource varchar
);
You should see the following output:
mastering_postgis=# create table data_import.earthquakes_csv (
mastering_postgis(# "time" timestamp with time zone,
mastering_postgis(# latitude numeric,
mastering_postgis(# longitude numeric,
mastering_postgis(# depth numeric,
mastering_postgis(# mag numeric,
mastering_postgis(# magType varchar,
mastering_postgis(# nst numeric,
mastering_postgis(# gap numeric,
mastering_postgis(# dmin numeric,
mastering_postgis(# rms numeric,
mastering_postgis(# net varchar,
mastering_postgis(# id varchar,
mastering_postgis(# updated timestamp with time zone,
mastering_postgis(# place varchar,
mastering_postgis(# type varchar,
mastering_postgis(# horizontalError numeric,
mastering_postgis(# depthError numeric,
mastering_postgis(# magError numeric,
mastering_postgis(# magNst numeric,
mastering_postgis(# status varchar,
mastering_postgis(# locationSource varchar,
mastering_postgis(# magSource varchar
mastering_postgis(# );
CREATE TABLE
Now, as we have our data table ready, we can finally get to the import part. The following command should handle importing the data into our newly created table:
\copy data_import.earthquakes_csv from data\2.5_day.csv with DELIMITER ',' CSV HEADER
You should see a similar output:
mastering_postgis=# \copy data_import.earthquakes_csv from data\2.5_day.csv with DELIMITER ',' CSV HEADER
COPY 25
If you require a complete reference on the COPY command, simply type \h COPY at the psql prompt (the \copy meta-command accepts the same options as the server-side COPY command).
While you can customize your data after importing, you may wish to import only a subset of columns in the first place. Unfortunately, the \copy command imports all the columns of the input file (although you may specify which table columns they go into), so the solution is to:
- Create a table that models the input CSV
- Import all the data
- Create a table with a subset of columns
- Copy data over
- Delete the input table
Even though everything said earlier is possible in psql, it requires quite a lot of typing. Because of that, we will take care of this scenario in the next stage, which demonstrates the non-interactive psql mode.
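The steps above can be sketched in SQL as follows. Note this is just an illustration: the subset table name and the chosen columns are arbitrary, and steps 1 and 2 reuse the staging table we have already created and populated:

```sql
-- steps 1 & 2: the staging table data_import.earthquakes_csv
-- has already been created and populated with \copy above

-- steps 3 & 4: create a table with a subset of columns and
-- copy the data over in a single statement
create table data_import.earthquakes_subset as
    select "time", latitude, longitude, depth, mag, place
    from data_import.earthquakes_csv;

-- step 5: delete the staging table
drop table data_import.earthquakes_csv;
```

Combining steps 3 and 4 into a single CREATE TABLE AS statement saves a separate INSERT and keeps the column types inferred from the staging table.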