Notes on Postgis
These are some of my notes in attempting to create a geodatabase using Postgis, for Yolo county, California.
1. Platform
- Windows XP
- PostgreSQL8.3
- Postgis1.5
2. Create a new database
- Used the GUI to create a new database called ‘yolocounty’
- encoding UTF8
- could not use template_postgis (another database was using it)
- so ran sql scripts ‘postgis.sql’, ‘spatial_ref_system.sql’ and ‘comments.sql’
3. Creating database tables for GIS layers
Tasks:
- Load several GIS layers for California from shapefiles, each in different projections.
- Re-project individual files to one consistent projection (EPSG: 3488, California Albers)
- Create specific overlays and spatial joins
1. counties
loading counties boundary shapefile to table ‘counties’ :
- used shp2pgsql piped to psql to create table called counties
- its in Lat Long WGS 84 (EPSG:4326)
shp2pgsql -s 4326 -I -D -g the_geom “E:/xxx/yyy/zzz.shp” counties | psql -U user -h host - p port -d yolocounty
- the many flags are useful – check online for their meaning
- example: http://casoilresource.lawr.ucdavis.edu/drupal/node/266
- succesfully connected to database and visualized in UDig.
2. Groundwater basins
- used shp2pgsql piped to psql to create table called pa
- its in UTM 10N NAD 27 (EPSG:26710)
shp2pgsql -s 26710 -I -D -g the_geom E:/xxx/yyy/zzz.shp gwbasins | psql -U user -h host -d yolocounty
3. DWR Planning Areas
- used shp2pgsql piped to psql to create table called pa
- its in UTM 11N WGS 84(EPSG:32611)
shp2pgsql -s 32611 -I -D -g the_geom E:/xxx/yyy/zzz.shp PA | psql -U user -h host -d yolocounty
4. DWR Hydrologic Regions
- Same procedure as for Planning Areas above
- saved as table HR
5. California Dams
- Same procedure as above, original data in EPSG 4269 (NAD 83, Latlong)
- saved as table dams
6. major rivers
- Same procedure as above, original data in EPSG 4269
- saved as table rivers
7. non rivers
- Same procedure as above, original data in EPSG 4269
- saved as table non rivers
Now I’ll perform some common geoprocessing tasks using these layers. Its best to pose some questions we want to ask. Here’s a list of plausible qns one might want to ask of this database.
Questions
a) How many counties are within or intersect the Sacramento Valley Hydrologic Region? What are their names?
b) What are the dams within Yolo, Solano and Lake counties?
c) How many Planning Areas are within the Sacramento Valley Hydrologic Region? What are their names?
d) How many Planning Areas and Hydrologic Regions does Yolo county span?
Solutions
First, we have to reproject all the data to a common projections. I chose California Albers, which is EPSG 3488:
Reprojecting each layer to EPSG 3488
We can add a transformed geometry column to existing spatial tables, using ST_TRANSFORM :
ALTER TABLE counties ADD column geom_albers geometry;
UPDATE counties SET geom_albers = ST_TRANSFORM(the_geom,3488);
Now check for the transformed SRID :
SELECT ST_SRID(geom_albers) from counties;
returns 3488 so ok.
Can also check for the original geometry column:
SELECT ST_SRID(the_geom) from counties;
returns 4326. this shows that 2 projections can exist for the same geometry.
Now repeat the transformation for the other layers.
a) How many counties are within or intersect the Sacramento Valley Hydrologic Region? What are their names?
A table of Sacramento River HR and intersecting counties can be generated :
SELECT hr_name, name
FROM hr, counties
WHERE ST_Intersects(hr.geom_albers,counties.geom_albers)AND hr.hr_name = ‘Sacramento River’;
However, it seems to include counties like Mendocino that merely touch the Sacramento River HR on the outside…
27 counties touch the Sac R. HR
The same result is produced with an INNER JOIN :
SELECT hr_name, name
FROM counties inner join hr on
ST_Intersects(hr.geom_albers,counties.geom_albers)
Where hr.hr_name = ‘Sacramento River’;
Lets take this a little further, namely
1. create a new spatial table out of this intersection (the above is just a tabulation)
CREATE TABLE AS hru_counties
SELECT hr.hr_name as hr_name,counties.name as countyname,
ST_Intersection(hr.geom_albers,counties.geom_albers) as geom_albers, ROUND(ST_AREA(ST_Intersection(hr.geom_albers,counties.geom_albers))/1000000,0) as area
FROM hr, counties
WHERE ST_Intersects(hr.geom_albers,counties.geom_albers)
AND hr_hrname = ‘Sacramento River’;
2. add indices so that it can be viewed in conventional GIS viewere (that need an OID)
– creating a serial index needed for viewing in regular GIS viewers
ALTER table sac_counties ADD Column gid serial not null;
CREATE UNIQUE INDEX uidx_sac_counties_gid on sac_counties using btree(gid);
–creating a primary key on gid, note that this is deprecated (but works for now)
ALTER table sac_counties ADD Constraint pkey_sac_counties_gid PRIMARY KEY(gid);
–creating a spatial index
CREATE Index idx_sac_counties_geom ON sac_counties using gist(geom_albers);
VACUUM ANALYZE;
3. calculate intersecting areas of these polygons.
done in 1 above.
4. update the layer to remove sliver polygons (TO DO)
b) What are the dams within Yolo, Solano and Lake counties?
SELECT dams.dam_name, dams.county, dams.own_name, counties.name, dams.river
FROM counties, dams
WHERE
ST_CONTAINS(counties.geom_albers,dams.geom_albers)
AND
(counties.name = ‘Lake’ OR counties.name = ‘Yolo’
OR counties.name = ‘Solano’) order by dams.county;
This gives a list of24 dams. Interestingly (i) there are a few dams in the dams database table that seem to incorrectly be listed in Colusa instead of Lake counties; (ii) there are 3 dykes sitting on top of each other; (iii) Monticello dam does not show up. Perhaps its in Napa county (at least as seen by the geodatabase) – check if this is actually the case. Yes, it is:
http://en.wikipedia.org/wiki/Monticello_Dam
Now, can we extract this as a new spatial table?
I found this link useful
And here’s what what worked:
- create spatial table of dams within 4 counties
create table select_dams as
select st_intersection(counties.geom_albers,dams.geom_albers) as geom_albers,
dams.dam_name as dam_name, counties.name as county, dams.own_name as owner_name, dams.river as river
from dams,counties
where st_contains(counties.geom_albers,dams.geom_albers)
and counties.name in (‘Solano’,'Yolo’,'Lake’,'Colusa’,'Napa’)
order by county;
vacuum analyze;
–indices are not there, so create them
–first, spatial index on geometry
create index idx_gist_select_dams on select_dams using gist(geom_albers);
–then adding
ALTER table select_dams ADD Column gid serial not null;
ALTER table select_dams ADD Constraint pkey_select_dams_gid PRIMARY KEY(gid);
To give you a visual idea, here’s what the entire California dams layer (red triangles) looks like (zoomed to the extent of the layer, and exported as an image file from QGIS.)
..And here’s what it looks like after selecting only the dams (pink triangles) from 4 counties (also zoomed to the extent of the newly created selection):
of course in both cases i’ve added background layers like county boundaries and drainage and watersheds of interest.

