Experiments with Postgis

Notes on Postgis

These are some of my notes in attempting to create a geodatabase using Postgis, for Yolo county, California.

1. Platform

  • Windows XP
  • PostgreSQL8.3
  • Postgis1.5

2. Create a new database

  • Used the GUI to create a new database called ‘yolocounty’
    • encoding UTF8
    • could not use  template_postgis (another database was using it)
    • so ran sql scripts ‘postgis.sql’, ‘spatial_ref_system.sql’ and ‘comments.sql’

3. Creating database tables for GIS layers

Tasks:

  1. Load several GIS layers for California from shapefiles, each in different projections.
  2. Re-project individual files to one consistent projection (EPSG: 3488, California Albers)
  3. Create specific overlays and spatial joins

1. counties

loading counties boundary shapefile to table ‘counties’ :

  • used shp2pgsql piped to psql to create table called counties
  • its in Lat Long WGS 84 (EPSG:4326)

shp2pgsql -s 4326 -I -D -g the_geom “E:/xxx/yyy/zzz.shp” counties | psql -U user -h host - p port -d yolocounty

2.  Groundwater basins

  • used shp2pgsql piped to psql to create table called pa
  • its in UTM 10N NAD 27 (EPSG:26710)

shp2pgsql -s 26710 -I -D -g the_geom E:/xxx/yyy/zzz.shp gwbasins | psql -U user -h host -d yolocounty

3. DWR Planning Areas

  • used shp2pgsql piped to psql to create table called pa
  • its in UTM 11N WGS 84(EPSG:32611)

shp2pgsql -s 32611 -I -D -g the_geom E:/xxx/yyy/zzz.shp PA | psql -U user -h host -d yolocounty

4. DWR Hydrologic Regions

  • Same procedure as for Planning Areas above
  • saved as table HR

5.  California Dams

  • Same procedure as above, original data in EPSG 4269 (NAD 83, Latlong)
  • saved as table dams

6. major rivers

  • Same procedure as above, original data in EPSG 4269
  • saved as table rivers
  • 7. non rivers

    • Same procedure as above, original data in EPSG 4269
    • saved as table non rivers
  • Now I’ll perform some common geoprocessing tasks using these layers. Its best to pose some questions we want to ask. Here’s a list of plausible qns one might want to ask of this database.

    Questions

    a) How many counties are within or intersect the Sacramento Valley Hydrologic Region? What are their names?

    b) What are the dams within Yolo, Solano and Lake counties?

    c) How many Planning Areas are within the Sacramento Valley Hydrologic Region? What are their names?

    d) How many Planning Areas and Hydrologic Regions does Yolo county span?

    Solutions

    First, we have to reproject all the data to a common projections. I chose California Albers, which is EPSG 3488:

    Reprojecting each layer to EPSG 3488

    We can add a transformed geometry column to existing spatial tables, using ST_TRANSFORM :

    ALTER TABLE counties ADD column geom_albers geometry;

    UPDATE counties SET geom_albers = ST_TRANSFORM(the_geom,3488);

    Now check for the transformed SRID :

    SELECT ST_SRID(geom_albers) from counties;

    returns 3488 so ok.

    Can also check for the original geometry column:

    SELECT ST_SRID(the_geom) from counties;

    returns 4326. this shows that 2 projections can exist for the same geometry.

    Now repeat the transformation for the other layers.

    a) How many counties are within or intersect the Sacramento Valley Hydrologic Region? What are their names?

    A table of Sacramento River HR and intersecting counties can be generated :

    SELECT hr_name, name

    FROM hr, counties
    WHERE ST_Intersects(hr.geom_albers,counties.geom_albers)

    AND hr.hr_name = ‘Sacramento River’;

    However, it seems to include counties like Mendocino that merely touch the Sacramento River HR on the outside…

    27 counties touch the Sac R. HR

    The same result is produced with an INNER JOIN :

    SELECT hr_name, name
    FROM counties inner join hr on
    ST_Intersects(hr.geom_albers,counties.geom_albers)
    Where hr.hr_name = ‘Sacramento River’;

    Lets take this a little further, namely

    1. create a new spatial table out of this intersection (the above is just a tabulation)

    CREATE TABLE AS hru_counties
    SELECT hr.hr_name as hr_name,counties.name as countyname,
    ST_Intersection(hr.geom_albers,counties.geom_albers) as geom_albers, ROUND(ST_AREA(ST_Intersection(hr.geom_albers,counties.geom_albers))/1000000,0) as area
    FROM hr, counties
    WHERE ST_Intersects(hr.geom_albers,counties.geom_albers)
    AND hr_hrname = ‘Sacramento River’;

    2. add indices so that it can be viewed in conventional GIS viewere (that need an OID)

    – creating a serial index needed for viewing in regular GIS viewers

    ALTER table sac_counties ADD Column gid serial not null;
    CREATE UNIQUE INDEX uidx_sac_counties_gid on sac_counties using btree(gid);

    –creating a primary key on gid, note that this is deprecated (but works for now)

    ALTER table sac_counties ADD Constraint pkey_sac_counties_gid PRIMARY KEY(gid);

    –creating a spatial index

    CREATE Index idx_sac_counties_geom ON sac_counties using gist(geom_albers);

    VACUUM ANALYZE;

    3. calculate intersecting areas of these polygons.

    done in 1 above.

    4. update the layer to remove sliver polygons (TO DO)

    b) What are the dams within Yolo, Solano and Lake counties?

    SELECT  dams.dam_name, dams.county, dams.own_name, counties.name, dams.river
    FROM counties, dams
    WHERE
    ST_CONTAINS(counties.geom_albers,dams.geom_albers)
    AND
    (counties.name = ‘Lake’ OR counties.name = ‘Yolo’
    OR counties.name = ‘Solano’) order by dams.county;

    This gives a list of24 dams. Interestingly (i) there are a few dams in the dams database table that seem to incorrectly be listed in Colusa instead of Lake counties; (ii) there are 3 dykes sitting on top of each other; (iii) Monticello dam does not show up. Perhaps its in Napa county (at least as seen by the geodatabase) – check if this is actually the case. Yes, it is:

    http://en.wikipedia.org/wiki/Monticello_Dam

    Now, can we extract this as a new spatial table?

    I found this link useful

    And here’s what what worked:

    - create spatial table of dams within 4 counties
    create table select_dams as
    select st_intersection(counties.geom_albers,dams.geom_albers) as geom_albers,
    dams.dam_name as dam_name, counties.name as county, dams.own_name as owner_name, dams.river as river
    from dams,counties
    where st_contains(counties.geom_albers,dams.geom_albers)
    and counties.name in (‘Solano’,'Yolo’,'Lake’,'Colusa’,'Napa’)
    order by county;
    vacuum analyze;

    –indices are not there, so create them
    –first, spatial index on geometry

    create index idx_gist_select_dams on select_dams using gist(geom_albers);
    –then adding
    ALTER table select_dams ADD Column gid serial not null;
    ALTER table select_dams ADD Constraint pkey_select_dams_gid PRIMARY KEY(gid);

    To give you a visual idea, here’s what the entire California dams layer (red triangles) looks like (zoomed to the extent of the layer, and exported as an image file from QGIS.)

    Dams in CaliforniaDams in California

    ..And here’s what it looks like after selecting only the dams (pink triangles) from 4 counties (also zoomed to the extent of the newly created selection):

    Selected dams within 4 counties

    of course in both cases i’ve added background layers like county boundaries and drainage and watersheds of interest.

    Leave a Reply

    Fill in your details below or click an icon to log in:

    WordPress.com Logo

    You are commenting using your WordPress.com account. Log Out / Change )

    Twitter picture

    You are commenting using your Twitter account. Log Out / Change )

    Facebook photo

    You are commenting using your Facebook account. Log Out / Change )

    Connecting to %s

    Follow

    Get every new post delivered to your Inbox.