An Introduction to Geoprocessing with Python¶

This is a very brief overview of how Python can be used to carry out common GIS tasks. We will begin with a quick outline of the advantages and disadvantages of using Python for GIS and then see Python in action with a small demonstration.

Link: https://ccchris-allen.github.io/

Table of Contents¶

Python vs. Desktop GIS
1. Advantages of Python
2. Disadvantages of Python
Is Bigfoot a Cheesehead? Or, Mapping Bigfoot Sightings using Python
Appendix

1. Python vs. Desktop GIS ¶

Traditional desktop GIS applications are powerful tools for reading, writing, and manipulating geospatial data. However, there are many compelling reasons for using programming languages such as Python for working with spatial data.

Advantages of Python ¶

Repeatability: Using Python can help you automate tedious geoprocessing tasks. Python scripts can create a repeatable workflow for loading, manipulating, and writing data.
Extensibility: Python has many great general purpose analysis modules, which can allow you to easily combine geoprocessing and non-spatial forms analysis (eg: data mining, statistical anlysis, text mining) into a single procedure. Setting up a similar workflow is often difficult (or impossible) when peforming geoprocessing in desktop GIS.
Affordability: Python is free!
Cross-Platform: Unlike some desktop GIS applications, Python can be run on macOS, Windows, Linux, etc.

Disadvantages of Python ¶

Less Graphical: Desktop GIS software tends to prioritize the graphical view of geospatial data; by default, you see the geometry of your features. As we'll see below, you're able to view the geometry of GIS features in Python, but it's not as automatic as in desktop GIS.
Less Familiar: In general, desktop software is more familiar to most of us, since we carry out most of our daily computing tasks in desktop programs that have user-friendly interfaces. Because of this, there is a learning curve when using Python, since programming involves using text commands to carry out tasks (which we'll see below).

2. Is Bigfoot a Cheesehead? Or, Mapping Bigfoot Sightings using Python ¶

Bigfoot

In this example, we'll pretend you're interested in learning about the distribution of Bigfoot sightings in mainland United States. We'll use the geopandas Python library to read and manipulate relevant geospatial data. We'll also try to answer a very important research question: Is Bigfoot a cheesehead?

Note: Don't worry about trying to understand every detail of the code shown here. This is merely intended to give you an overview of what's possible in Python.

Importing necessary Python modules ¶

Loading relevant modules is often the first thing done when working in Python. Modules are collections of code that provide specialized functions. In this case, we will rely on the geopandas module to perform geospatial functions.

Note: geopandas is a powerful tool for reading and writing data, while also having capabilities for performing the geometric operations that are essential for geospatial work. However, see the Additional Links section for other great Python geospatial modules.

# Load the geopandas library, which will let us work with geospatial data
import geopandas

# This code helps us configure the appearance of our outputs (not important to understand right now)
%matplotlib inline
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (12, 9)

Reading the GIS files ¶

Here, we load two shapefies by calling the read_file function:

bigfoot_sightings.shp: This is a point file where each point represents a Bigfoot sighting.
cb_2016_us_state_5m.shp: This is a polygon file of all states in the United States.

sightings = geopandas.read_file('data/bigfoot_sightings.shp')
usa = geopandas.read_file('data/cb_2016_us_state_5m.shp')

Viewing the data ¶

Similar to desktop GIS, geopandas allows us to view geospatial data using a table view as well as a graphical view.

The table view allows us to see the attributes that are associated with our features.

usa

The graphical view is helpful when we are particularly interested in the geometry of our features. We show the graphical view by using the plot function.

usa.plot(color='white')

<matplotlib.axes._subplots.AxesSubplot at 0x106dc0f0>

So far, we've only been looking at the usa dataset, but we can also plot the sightings layer on top of usa.

base_layer = usa.plot(color='white')
sightings.plot(ax=base_layer, color='red')

<matplotlib.axes._subplots.AxesSubplot at 0x127402e8>

Filtering our dataset by attribute ¶

From the plots above we can see that our usa layer contains many states and territories that are not part of mainland US, but luckily geopandas gives us the ability to filter our dataset by attribute. Since we only want to work with features within mainland US, we first create a list of features to filter out by inspecting the NAME attribute shown in the table above. Then we use the query function in geopandas to keep only the features that have a NAME attribute value that is not in our list.

# NAME values that are not mainland features
not_mainland_names = ['American Samoa', 'Guam', 'Puerto Rico', 'United States Virgin Islands', 
                       'Commonwealth of the Northern Mariana Islands', 'Alaska', 'Hawaii']

# Use `query` to keep only features where the NAME values is NOT in our list
mainland_usa = usa.query('NAME not in @not_mainland_names')

Now, we can again use the plot function to make sure our new dataset only contains features from mainland US. Looks good!

mainland_usa.plot(color='white')

<matplotlib.axes._subplots.AxesSubplot at 0x1d133160>

How many sightings are there per state? ¶

Now we'll use a spatial join to determine how many Bigfoot sightings have occurred in each state.

In geopandas we use the sjoin function to perform a spatial join. (Note: Ignore the "CRS does not match!" warning for now.)

points_in_polygon = geopandas.sjoin(sightings, mainland_usa, how='inner', op='within')

Next we count the points in each state and create a new attribute called SIGHTINGS_COUNT to hold this count for each state.

# These lines of code might look scary, but all you need to know is that this code aggregates
# all points within each state and creates a new attribute called SIGHTINGS_COUNT that reflects 
# the number of sightings in that state.
counts = points_in_polygon.groupby('NAME').size()
counts = counts.to_frame('SIGHTINGS_COUNT').reset_index()
mainland_with_counts = mainland_usa.merge(counts)

We can now visialize our data as a choropleth map by telling the plot function that we would like to use the SIGHTINGS_COUNT attribute as our thematic variable.

mainland_with_counts.plot(column='SIGHTINGS_COUNT', cmap='Reds', scheme='quantiles', k=7, legend=True)

<matplotlib.axes._subplots.AxesSubplot at 0x20b85f98>

What percentage of sightings are in Wisconsin? ¶

Finally, we can find out if Sasquatch is a cheesehead! This is a rather silly example, so we'll take an equally silly (and simplistic) approach to answering this question by merely finding out what percentage of the total mainland US sightings are in Wisconsin.

First we select the Wisconsin feature and get the SIGHTINGS_COUNT value.

wi_with_count = mainland_with_counts.query('NAME == "Wisconsin"')
wi_sightings = wi_with_count.SIGHTINGS_COUNT.iloc[0]

Next we find the total number of sightings in mainland US and then use this value to find the percentage of sightings that occurred in Wisconsin.

mainland_total_sightings = mainland_with_counts.SIGHTINGS_COUNT.sum()
wi_percentage = 100 * wi_sightings / float(mainland_total_sightings)

We now simply print out the percentage.

print "{0:.2f} percent of sightings are in WI.".format(wi_percentage)

1.81 percent of sightings are in WI.

Only about 2% of total sightings occurred in Wisconsin, so we must conclude that Bigfoot is NOT a cheesehead!¶

3. Appendix ¶

Credits ¶

The idea for this notebook was inspired by work from Joshua Stevens and Timothy Renner.

Data Sources ¶

Data on bigfoot sightings (bigfoot_sightings.shp) came from Timothy Renner.
US state boundaries (cb_2016_us_state_5m.shp) came from the Census Bureau.

Additional Links ¶

This is a short list of other useful Python geospatial modules.

GDAL/OGR: A Python implementation of GDAL/OGR
arcpy: ESRI's Python interface to ArcGIS.
fiona: A user friendly interface for reading and writing geospatial data.
shapely: A module for performing geometric operations.
pyproj: For working with map projections.

	AFFGEOID	ALAND	AWATER	GEOID	NAME	STATEFP	STATENS	STUSPS	geometry
0	0400000US01	1.311737e+11	4.593686e+09	01	Alabama	01	01779775	AL	(POLYGON ((-88.04374299999999 30.517423, -88.0...
1	0400000US02	1.477946e+12	2.453905e+11	02	Alaska	02	01785533	AK	(POLYGON ((-133.655819 55.625617, -133.624921 ...
2	0400000US04	2.941986e+11	1.027346e+09	04	Arizona	04	01779777	AZ	POLYGON ((-114.799683 32.593621, -114.809393 3...
3	0400000US08	2.684293e+11	1.175113e+09	08	Colorado	08	01779779	CO	POLYGON ((-109.060253 38.599328, -109.059541 3...
4	0400000US09	1.254264e+10	1.815476e+09	09	Connecticut	09	01779780	CT	POLYGON ((-73.72777499999999 41.100696, -73.69...
5	0400000US12	1.389242e+11	3.138604e+10	12	Florida	12	00294478	FL	(POLYGON ((-80.751643 24.857254, -80.729063 24...
6	0400000US13	1.491698e+11	4.741101e+09	13	Georgia	13	01705317	GA	POLYGON ((-85.605165 34.984678, -85.4741053158...
7	0400000US16	2.140429e+11	2.398670e+09	16	Idaho	16	01779783	ID	POLYGON ((-117.242675 44.396548, -117.234835 4...
8	0400000US18	9.279055e+10	1.536767e+09	18	Indiana	18	00448508	IN	POLYGON ((-88.09566199999999 37.905808, -88.08...
9	0400000US20	2.117533e+11	1.346236e+09	20	Kansas	20	00481813	KS	POLYGON ((-102.051744 40.003078, -101.916696 4...
10	0400000US22	1.119049e+11	2.374630e+10	22	Louisiana	22	01629543	LA	(POLYGON ((-88.88145399999999 30.053202, -88.8...
11	0400000US25	2.020444e+10	7.130621e+09	25	Massachusetts	25	00606926	MA	(POLYGON ((-70.275526 41.310464, -70.260632 41...
12	0400000US27	2.062323e+11	1.892918e+10	27	Minnesota	27	00662849	MN	POLYGON ((-97.239209 48.968684, -97.2380249999...
13	0400000US29	1.780527e+11	2.487575e+09	29	Missouri	29	01779791	MO	POLYGON ((-95.773549 40.578205, -95.7685270000...
14	0400000US30	3.769650e+11	3.866987e+09	30	Montana	30	00767982	MT	POLYGON ((-116.049155 48.481247, -116.04915652...
15	0400000US32	2.843293e+11	2.047351e+09	32	Nevada	32	01779793	NV	POLYGON ((-120.005743 39.228664, -120.005142 3...
16	0400000US34	1.905076e+10	3.541928e+09	34	New Jersey	34	01779795	NJ	POLYGON ((-75.559102 39.629056, -75.5594459999...
17	0400000US36	1.220530e+11	1.924310e+10	36	New York	36	01779796	NY	(POLYGON ((-72.034958 41.255458, -72.029438 41...
18	0400000US38	1.787118e+11	4.399094e+09	38	North Dakota	38	01779797	ND	POLYGON ((-104.048652 48.865734, -104.048824 4...
19	0400000US40	1.776637e+11	3.373836e+09	40	Oklahoma	40	01102857	OK	POLYGON ((-103.002565 36.526588, -103.002188 3...
20	0400000US42	1.158822e+11	3.396806e+09	42	Pennsylvania	42	01779798	PA	POLYGON ((-80.519891 40.906661, -80.5190910000...
21	0400000US45	7.786172e+10	5.070944e+09	45	South Carolina	45	01779799	SC	POLYGON ((-83.35323800000001 34.728648, -83.34...
22	0400000US46	1.963484e+11	3.380783e+09	46	South Dakota	46	01785534	SD	POLYGON ((-104.057698 44.997431, -104.05021 44...
23	0400000US48	6.766335e+11	1.902599e+10	48	Texas	48	01779801	TX	POLYGON ((-106.645479 31.89867, -106.64084 31....
24	0400000US50	2.387347e+10	1.031125e+09	50	Vermont	50	01779802	VT	POLYGON ((-73.43773999999999 44.045006, -73.43...
25	0400000US54	6.226560e+10	4.899028e+08	54	West Virginia	54	01779805	WV	POLYGON ((-82.64299699999999 38.16956, -82.639...
26	0400000US72	8.868100e+09	4.923178e+09	72	Puerto Rico	72	01779808	PR	(POLYGON ((-65.335701 18.349535, -65.329334 18...
27	0400000US05	1.347715e+11	2.960192e+09	05	Arkansas	05	00068085	AR	POLYGON ((-94.61791900000002 36.49941400000001...
28	0400000US06	4.035011e+11	2.046672e+10	06	California	06	01779778	CA	(POLYGON ((-118.603375 33.478098, -118.598783 ...
29	0400000US10	5.047195e+09	1.398721e+09	10	Delaware	10	01779781	DE	(POLYGON ((-75.570798 39.626768, -75.559445999...
30	0400000US11	1.583650e+08	1.863340e+07	11	District of Columbia	11	01702382	DC	POLYGON ((-77.119759 38.934343, -77.1045 38.94...
31	0400000US15	1.663410e+10	1.177770e+10	15	Hawaii	15	01779782	HI	(POLYGON ((-156.05722 19.742536, -156.052315 1...
32	0400000US17	1.437887e+11	6.206694e+09	17	Illinois	17	01779784	IL	POLYGON ((-91.512974 40.181062, -91.511073 40....
33	0400000US19	1.446676e+11	1.077808e+09	19	Iowa	19	01779785	IA	POLYGON ((-96.639704 42.737071, -96.635886 42....
34	0400000US21	1.022663e+11	2.388771e+09	21	Kentucky	21	01779786	KY	(POLYGON ((-89.405654 36.528165, -89.398685 36...
35	0400000US23	7.988522e+10	1.174876e+10	23	Maine	23	01779787	ME	(POLYGON ((-68.37659099999999 44.113762, -68.3...
36	0400000US24	2.514775e+10	6.983312e+09	24	Maryland	24	01714934	MD	(POLYGON ((-76.04861699999999 38.014843, -76.0...
37	0400000US26	1.464553e+11	1.040316e+11	26	Michigan	26	01779789	MI	(POLYGON ((-84.009664 46.101332, -83.987684 46...
38	0400000US28	1.215299e+11	3.930506e+09	28	Mississippi	28	01779790	MS	(POLYGON ((-88.510674 30.217021, -88.492378 30...
39	0400000US31	1.989727e+11	1.356291e+09	31	Nebraska	31	01779792	NE	POLYGON ((-104.053514 41.157257, -104.052666 4...
40	0400000US33	2.318740e+10	1.028679e+09	33	New Hampshire	33	01779794	NH	POLYGON ((-72.554232 42.860038, -72.555132 42....
41	0400000US35	3.141694e+11	7.557230e+08	35	New Mexico	35	00897535	NM	POLYGON ((-109.050173 31.480004, -109.049843 3...
42	0400000US37	1.259213e+11	1.347006e+10	37	North Carolina	37	01027616	NC	(POLYGON ((-75.72680699999999 35.935844, -75.7...
43	0400000US39	1.058331e+11	1.026460e+10	39	Ohio	39	01085497	OH	(POLYGON ((-82.73570699999999 41.603361, -82.7...
44	0400000US41	2.486043e+11	6.195106e+09	41	Oregon	41	01155107	OR	POLYGON ((-124.552441 42.840568, -124.500141 4...
45	0400000US44	2.677899e+09	1.323552e+09	44	Rhode Island	44	01219835	RI	(POLYGON ((-71.36152 41.464831, -71.34707 41.4...
46	0400000US47	1.067977e+11	2.355189e+09	47	Tennessee	47	01325873	TN	POLYGON ((-90.309877 35.00975, -90.30689699999...
47	0400000US49	2.128855e+11	6.999529e+09	49	Utah	49	01455989	UT	POLYGON ((-114.052962 37.592783, -114.052472 3...
48	0400000US51	1.022573e+11	8.528606e+09	51	Virginia	51	01779803	VA	(POLYGON ((-75.999658 37.848198, -75.996859 37...
49	0400000US53	1.721134e+11	1.255845e+10	53	Washington	53	01779804	WA	(POLYGON ((-122.328343 48.021335, -122.321721 ...
50	0400000US55	1.402736e+11	2.936139e+10	55	Wisconsin	55	01779806	WI	(POLYGON ((-86.956198 45.352006, -86.953389 45...
51	0400000US60	1.977591e+08	1.307244e+09	60	American Samoa	60	01802701	AS	(POLYGON ((-168.163532 -14.536407, -168.159438...
52	0400000US66	5.435583e+08	9.343350e+08	66	Guam	66	01802705	GU	(POLYGON ((144.642331 13.241898, 144.649492 13...
53	0400000US69	4.722767e+08	4.644268e+09	69	Commonwealth of the Northern Mariana Islands	69	01779809	MP	(POLYGON ((146.033197 16.004517, 146.05429 16....
54	0400000US78	3.480131e+08	1.550245e+09	78	United States Virgin Islands	78	01802710	VI	(POLYGON ((-64.634323 17.795891, -64.622142 17...
55	0400000US56	2.514649e+11	1.861273e+09	56	Wyoming	56	01779807	WY	POLYGON ((-111.056888 44.866658, -111.055629 4...