Measuring diversity

As important diversity is for cities, as complicated is to capture it. momepy offers several options on how to do that using urban morphometrics. Generally, we can distinguish three types of diversity characters, based on:

1. Absolute values
2. Relative values
3. Categorization (binning)

This notebook provides examples from each of them.

import momepy
import geopandas as gpd
import matplotlib.pyplot as plt


We will again use osmnx to get the data for our example and after preprocessing of building layer will generate tessellation.

import osmnx as ox

gdf = ox.footprints.footprints_from_place(place='Kahla, Germany')
gdf_projected = ox.project_gdf(gdf)

buildings = momepy.preprocess(gdf_projected, size=30,
compactness=True, islands=True)
buildings['uID'] = momepy.unique_id(buildings)
limit = momepy.buffered_limit(buildings)
tessellation = momepy.Tessellation(buildings, unique_id='uID', limit=limit).tessellation

Loop 1 out of 2.

Identifying changes: 100%|██████████| 2932/2932 [00:00<00:00, 4911.85it/s]
Changing geometry: 100%|██████████| 31/31 [00:00<00:00, 60.16it/s]

Loop 2 out of 2.

Identifying changes: 100%|██████████| 2520/2520 [00:00<00:00, 3314.04it/s]
Changing geometry: 100%|██████████| 2/2 [00:00<00:00, 37.00it/s]

Inward offset...
Discretization...

  2%|▏         | 43/2521 [00:00<00:05, 426.19it/s]
Generating input point array...

100%|██████████| 2521/2521 [00:03<00:00, 647.70it/s]

Generating Voronoi diagram...
Generating GeoDataFrame...

Vertices to Polygons: 100%|██████████| 267595/267595 [00:06<00:00, 39429.12it/s]

Dissolving Voronoi polygons...
Preparing limit for edge resolving...
Building R-tree...

 18%|█▊        | 65/371 [00:00<00:00, 315.92it/s]
Identifying edge cells...

100%|██████████| 371/371 [00:01<00:00, 334.28it/s]
24%|██▎       | 56/238 [00:00<00:00, 546.20it/s]
Cutting...

100%|██████████| 238/238 [00:00<00:00, 513.26it/s]


### Queen contiguity

Morphological tessellation allows using contiguity-based weights matrix. While libpysal.weights.contiguity.Queen will do a classic Queen contiguity matrix; it might not be enough to capture proper context. For that reason, we can use momepy.sw_high to capture all neighbours within set topological distance k. More in Generating spatial weights.

sw3 = momepy.sw_high(k=3, gdf=tessellation, ids='uID')


To have a character whose diversity can be measured, we can use the area of tessellation.

tessellation['area'] = momepy.Area(tessellation).series


### Range

The range is as simple as it sounds; it measures the range of the values withing all neighbours as captured by spatial_weights.

area_rng = momepy.Range(tessellation, values='area',
spatial_weights=sw3, unique_id='uID')
tessellation['area_rng'] = area_rng.series

100%|██████████| 2518/2518 [00:03<00:00, 768.12it/s]

f, ax = plt.subplots(figsize=(10, 10))
tessellation.plot(ax=ax, column='area_rng', legend=True, cmap='Spectral_r')
buildings.plot(ax=ax, color="white", alpha=0.4)
ax.set_axis_off()
plt.show()


However, as we can see from the plot above, there is a massive effect of large-scale buildings, which can be seen as outliers. For that reason, we can define rng keyword argument to limit the range taken into account. To get the interquartile range:

area_iqr = momepy.Range(tessellation, values='area',
spatial_weights=sw3, unique_id='uID',
rng=(25, 75))
tessellation['area_IQR'] = area_iqr.series

100%|██████████| 2518/2518 [00:03<00:00, 685.81it/s]

f, ax = plt.subplots(figsize=(10, 10))
tessellation.plot(ax=ax, column='area_IQR', legend=True, cmap='Spectral_r')
buildings.plot(ax=ax, color="white", alpha=0.4)
ax.set_axis_off()
plt.show()


The effect of outliers has been successfully eliminated.

### Theil index

Theil index is a measure of inequality (as Gini index is). momepy is using pysal's implementation of the Theil index to do the calculation.

area_theil = momepy.Theil(tessellation, values='area',
spatial_weights=sw3,
unique_id='uID')
tessellation['area_Theil'] = area_theil.series

100%|██████████| 2518/2518 [00:05<00:00, 434.18it/s]

f, ax = plt.subplots(figsize=(10, 10))
tessellation.plot(ax=ax, column='area_Theil', scheme='fisherjenkssampled', k=10, legend=True, cmap='plasma')
buildings.plot(ax=ax, color="white", alpha=0.4)
ax.set_axis_off()
plt.show()

/Users/martin/anaconda3/envs/momepy_guide/lib/python3.7/site-packages/mapclassify/classifiers.py:482: UserWarning: Deprecated (2.1.0): Fisher_Jenks is being renamed to FisherJenks. Fisher_Jenks will be removed on 2020-01-31.
warn(self.message)


Again, the outlier effect is present. We can use the same keyword as above to limit it and measure the Theil index on the inter-decile range.

area_id_theil = momepy.Theil(tessellation, values='area',
spatial_weights=sw3,
unique_id='uID',
rng=(10, 90))
tessellation['area_Theil_ID'] = area_id_theil.series

100%|██████████| 2518/2518 [00:04<00:00, 594.41it/s]

f, ax = plt.subplots(figsize=(10, 10))
tessellation.plot(ax=ax, column='area_Theil_ID', scheme='fisherjenkssampled', k=10, legend=True, cmap='plasma')
buildings.plot(ax=ax, color="white", alpha=0.4)
ax.set_axis_off()
plt.show()

/Users/martin/anaconda3/envs/momepy_guide/lib/python3.7/site-packages/mapclassify/classifiers.py:482: UserWarning: Deprecated (2.1.0): Fisher_Jenks is being renamed to FisherJenks. Fisher_Jenks will be removed on 2020-01-31.
warn(self.message)


### Simpson's diversity index

Simpson's diversity index is one of the most used indices capturing diversity. However, we need to be careful using it for continuous values, as it depends on the binning of these values to categories. The effect of different binning could be significant. momepy uses Head/tail Breaks as a large number of morphometric characters follows power-law distribution (for which Head/tail Breaks are designed). However, you can use any binning provided by mapclassify (including user-defined). The default Head/tail Breaks:

area_simpson = momepy.Simpson(tessellation, values='area',
spatial_weights=sw3,
unique_id='uID')
tessellation['area_simpson'] = area_simpson.series

/Users/martin/anaconda3/envs/momepy_guide/lib/python3.7/site-packages/mapclassify/classifiers.py:86: DeprecationWarning: Call to deprecated function (or staticmethod) headTail_breaks. (use head_tail_breaks)
11%|█         | 278/2518 [00:00<00:04, 553.75it/s]/Users/martin/anaconda3/envs/momepy_guide/lib/python3.7/site-packages/mapclassify/classifiers.py:804: RuntimeWarning: invalid value encountered in double_scalars
100%|██████████| 2518/2518 [00:05<00:00, 488.10it/s]

f, ax = plt.subplots(figsize=(10, 10))
tessellation.plot(ax=ax, column='area_simpson', legend=True, cmap='viridis')
buildings.plot(ax=ax, color="white", alpha=0.4)
ax.set_axis_off()
plt.show()


And binning based on quantiles (into 7 bins of equal size):

tessellation['area_simpson_q7'] = momepy.Simpson(tessellation, values='area',
spatial_weights=sw3,
unique_id='uID',
binning='quantiles', k=7).series

 12%|█▏        | 308/2518 [00:00<00:04, 456.02it/s]/Users/martin/anaconda3/envs/momepy_guide/lib/python3.7/site-packages/mapclassify/classifiers.py:804: RuntimeWarning: invalid value encountered in double_scalars

f, ax = plt.subplots(figsize=(10, 10))