forked from MajidBenam/seshat
Commit
Merge pull request #167 from edwardchalstrey1/clio-notebook
Add Cliopatria viewer notebook
Showing 6 changed files with 343 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -219,5 +219,4 @@ seshat/staticfiles | |
pulumi/logs | ||
pulumi/Pulumi.seshat.yaml | ||
scripts | ||
.DS_Store | ||
*.ipynb | ||
.DS_Store |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
# Visualise Cliopatria shape dataset | ||
|
||
Cliopatria is the shape dataset used by the Seshat Global History Databank website. It can also be explored in a Jupyter notebook running on your local machine by following these instructions. | ||
|
||
1. Ensure you have a working installation of Python 3 and Conda. If not, [download Anaconda](https://docs.anaconda.com/free/anaconda/install/index.html), which should give you both | ||
    - Note: you can use a different tool for creating a Python virtual environment than conda (e.g. venv) if you prefer | ||
|
||
2. Set up the required virtual environment, install packages into it and create a jupyter kernel. | ||
    - Conda example: | ||
        ``` | ||
        conda create --name cliopatria python=3.11 | ||
        conda activate cliopatria | ||
        pip install -r requirements.txt | ||
        python -m ipykernel install --user --name=cliopatria --display-name="Python (cliopatria)" | ||
        ``` | ||
    - Note: This will install Geopandas 0.13.2, but if you [install from source](https://geopandas.org/en/stable/getting_started/install.html#installing-from-source) it's much faster with version 1.0.0 (unreleased on pip as of 18th June 2024) | ||
3. Open the `cliopatria.ipynb` notebook with Jupyter (or another application that can run notebooks such as VSCode). | ||
    - `jupyter lab` (or `jupyter notebook`) | ||
    - Note: make sure the notebook Python kernel is using the virtual environment you created (click top right) | ||
4. Follow the instructions in the notebook. |
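Once the environment from the steps above is active, a quick check run from the notebook kernel can confirm that the packages listed in `requirements.txt` are visible to it. This is a minimal sketch and not part of the commit: the helper name `installed_versions` is hypothetical, and it uses only the standard-library `importlib.metadata` module.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if it is missing."""
    found = {}
    for name in package_names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Package names taken from requirements.txt in this commit
    required = ["jupyter", "ipykernel", "geopandas", "contextily", "folium"]
    for name, ver in installed_versions(required).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

If any package is reported as missing, the notebook is probably running on a different kernel from the one created in step 2.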
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,144 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Cliopatria viewer\n", | ||
"\n", | ||
"1. To get started, download a copy of the Cliopatria dataset from here: `[INSERT LINK]`\n", | ||
"2. Move the downloaded dataset to an appropriate location on your machine, set the two paths in the code cell below, and run that cell\n", | ||
"3. Run the subsequent cells of the notebook\n", | ||
"4. Play around with both the GeoDataFrame (gdf) and the rendered map\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 1, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"cliopatria_geojson_path = \"../data/cliopatria_composite_unique_nonsimplified.geojson_06052024/cliopatria_composite_unique_nonsimplified.geojson\"\n", | ||
"cliopatria_json_path = \"../data/cliopatria_composite_unique_nonsimplified.geojson_06052024/cliopatria_composite_unique_nonsimplified_name_years.json\"" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 3, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from map_functions import cliopatria_gdf, display_map" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 4, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Load the Cliopatria data to a GeoDataFrame including end years for each shape\n", | ||
"gdf = cliopatria_gdf(cliopatria_geojson_path, cliopatria_json_path)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Play with the data on the map\n", | ||
"\n", | ||
"**Notes**\n", | ||
"- The slider is a bit buggy; the best way to change the year is to enter it in the box and press Enter. Use negative numbers for BCE.\n", | ||
"- The map is also displayed three times, for reasons that are not yet clear.\n", | ||
"- Initial attempts to implement a play button similar to the website code failed, but that may not be needed here.\n", | ||
"- Click the shapes to reveal the polity display names, which use the same logic as the website code (see `map_functions.py`)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 6, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"application/vnd.jupyter.widget-view+json": { | ||
"model_id": "a95aced3593446ceb228a171178f978b", | ||
"version_major": 2, | ||
"version_minor": 0 | ||
}, | ||
"text/plain": [ | ||
"IntText(value=0, description='Year:')" | ||
] | ||
}, | ||
"metadata": {}, | ||
"output_type": "display_data" | ||
}, | ||
{ | ||
"data": { | ||
"application/vnd.jupyter.widget-view+json": { | ||
"model_id": "80c96982f4a34628b3026e9f853a6af9", | ||
"version_major": 2, | ||
"version_minor": 0 | ||
}, | ||
"text/plain": [ | ||
"IntSlider(value=0, description='Year:', max=2024, min=-3400)" | ||
] | ||
}, | ||
"metadata": {}, | ||
"output_type": "display_data" | ||
}, | ||
{ | ||
"data": { | ||
"application/vnd.jupyter.widget-view+json": { | ||
"model_id": "44078fdd8e91499bad99d7fd38b76a65", | ||
"version_major": 2, | ||
"version_minor": 0 | ||
}, | ||
"text/plain": [ | ||
"Output()" | ||
] | ||
}, | ||
"metadata": {}, | ||
"output_type": "display_data" | ||
}, | ||
{ | ||
"name": "stderr", | ||
"output_type": "stream", | ||
"text": [ | ||
"/Users/echalstrey/.pyenv/versions/3.11.4/lib/python3.11/site-packages/geopandas/geodataframe.py:1538: SettingWithCopyWarning: \n", | ||
"A value is trying to be set on a copy of a slice from a DataFrame.\n", | ||
"Try using .loc[row_indexer,col_indexer] = value instead\n", | ||
"\n", | ||
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", | ||
" super().__setitem__(key, value)\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"display_year = 0\n", | ||
"display_map(gdf, display_year)" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python (cliopatria1)", | ||
"language": "python", | ||
"name": "cliopatria1" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.11.4" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 4 | ||
} |
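Because the notebook's first code cell hard-codes two dataset paths, it can help to confirm that both files exist before calling `cliopatria_gdf`. The following is a minimal sketch under that assumption, not code from the commit; the helper name `missing_paths` is hypothetical and the example paths are placeholders.

```python
from pathlib import Path


def missing_paths(*paths):
    """Return the given paths (as strings) that do not point to an existing file."""
    return [str(p) for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    # Placeholder paths; in the notebook these would be
    # cliopatria_geojson_path and cliopatria_json_path
    problems = missing_paths("data/example.geojson", "data/example_name_years.json")
    if problems:
        print("Missing dataset files:", ", ".join(problems))
```

Running a check like this immediately after the path cell gives a clear message about a wrong path, rather than an error raised later inside the loading code.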
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,171 @@ | ||
import geopandas as gpd | ||
import json | ||
import folium | ||
import ipywidgets as widgets | ||
from IPython.display import display, clear_output | ||
|
||
|
||
def convert_name(gdf, i): | ||
    """ | ||
    Convert the polity name of a shape in the Cliopatria dataset to what we want to display on the Seshat world map. | ||
    Where gdf is the geodataframe, i is the index of the row/shape of interest. | ||
    Returns the name to display on the map. | ||
    Returns None if we don't want to display the shape (see comments below for details). | ||
    """ | ||
    polity_name = gdf.loc[i, 'Name'].replace('(', '').replace(')', '')  # Remove brackets from name | ||
    # If a shape has components (is a composite) we'll load the components instead | ||
    # ... unless the components have their own components, then load the top level shape | ||
    # ... or the shape is in a personal union, then load the personal union shape instead | ||
    try: | ||
        if gdf.loc[i, 'Components']:  # If the shape has components | ||
            if ';' not in gdf.loc[i, 'SeshatID']:  # If the shape is not a personal union | ||
                if len(gdf.loc[i, 'Components']) > 0 and '(' not in gdf.loc[i, 'Components']:  # If the components don't have components | ||
                    polity_name = None | ||
    except KeyError:  # If the shape has no components, don't modify the name | ||
        pass | ||
    return polity_name | ||
|
||
|
||
def cliopatria_gdf(cliopatria_geojson_path, cliopatria_json_path): | ||
    """ | ||
    Load the Cliopatria shape dataset with GeoPandas and add the EndYear and DisplayName columns to the geodataframe. | ||
    """ | ||
    # Load the geojson and json files | ||
    gdf = gpd.read_file(cliopatria_geojson_path) | ||
    with open(cliopatria_json_path, 'r') as f: | ||
        name_years = json.load(f) | ||
|
||
    # Create new columns in the geodataframe | ||
    gdf['EndYear'] = None | ||
    gdf['DisplayName'] = None | ||
|
||
    # Loop through the geodataframe | ||
    for i in range(len(gdf)): | ||
|
||
        # Get the raw name of the current row and the name to display | ||
        polity_name_raw = gdf.loc[i, 'Name'] | ||
        polity_name = convert_name(gdf, i) | ||
|
||
        if polity_name:  # convert_name returns None if we don't want to display the shape | ||
            if gdf.loc[i, 'Type'] != 'POLITY':  # Add the type to the name if it's not a polity | ||
                polity_name = gdf.loc[i, 'Type'] + ': ' + polity_name | ||
|
||
            # Get the start year of the current row | ||
            start_year = gdf.loc[i, 'Year'] | ||
|
||
            # Get a sorted list of the years for that name from the geodataframe | ||
            this_polity_years = sorted(gdf[gdf['Name'] == polity_name_raw]['Year'].unique()) | ||
|
||
            # Get the end year for a shape | ||
            # Most of the time, the shape end year is the year of the next shape | ||
            # Some polities have a gap in their active years | ||
            # For a shape year at the start of a gap, set the end year to be the shape year, so it doesn't cover the inactive period | ||
            start_end_years = name_years[polity_name_raw] | ||
            end_years = [x[1] for x in start_end_years] | ||
|
||
            polity_start_year = start_end_years[0][0] | ||
            polity_end_year = end_years[-1] | ||
|
||
            # Raise an error if the first shape year is not the start year of the polity | ||
            if this_polity_years[0] != polity_start_year: | ||
                raise ValueError(f'First shape year for {polity_name} is not the start year of the polity') | ||
|
||
            # Find the closest higher value from end_years to the shape year | ||
            next_end_year = min(end_years, key=lambda x: x if x >= start_year else float('inf')) | ||
|
||
            if start_year in end_years:  # If the shape year is in the list of polity end years, the start year is the end year | ||
                end_year = start_year | ||
            else: | ||
                this_year_index = this_polity_years.index(start_year) | ||
                try:  # Try to use the next shape year minus one as the end year if possible, unless it's higher than the next_end_year | ||
                    next_shape_year_minus_one = this_polity_years[this_year_index + 1] - 1 | ||
                    end_year = next_shape_year_minus_one if next_shape_year_minus_one < next_end_year else next_end_year | ||
                except IndexError:  # Otherwise assume the end year of the shape is the end year of the polity | ||
                    end_year = polity_end_year | ||
|
||
            # Set the EndYear column to the end year | ||
            gdf.loc[i, 'EndYear'] = end_year | ||
|
||
            # Set the DisplayName column to the name to display | ||
            gdf.loc[i, 'DisplayName'] = polity_name | ||
|
||
    return gdf | ||
|
||
|
||
def create_map(selected_year, gdf, map_output): | ||
    global m | ||
    m = folium.Map(location=[0, 0], zoom_start=2, tiles='https://a.basemaps.cartocdn.com/rastertiles/voyager_nolabels/{z}/{x}/{y}.png', attr='CartoDB') | ||
|
||
    # Filter the gdf for shapes that overlap with the selected_year | ||
    # Take a copy so the Color column can be modified without a SettingWithCopyWarning | ||
    filtered_gdf = gdf[(gdf['Year'] <= selected_year) & (gdf['EndYear'] >= selected_year)].copy() | ||
|
||
    # Remove '0x' and add '#' to the start of the color strings | ||
    filtered_gdf['Color'] = '#' + filtered_gdf['Color'].str.replace('0x', '') | ||
|
||
    # Transform the CRS of the GeoDataFrame to WGS84 (EPSG:4326) | ||
    filtered_gdf = filtered_gdf.to_crs(epsg=4326) | ||
|
||
    # Define a function for the style_function parameter | ||
    def style_function(feature, color): | ||
        return { | ||
            'fillColor': color, | ||
            'color': color, | ||
            'weight': 2, | ||
            'fillOpacity': 0.5 | ||
        } | ||
|
||
    # Add the polygons to the map | ||
    for _, row in filtered_gdf.iterrows(): | ||
        # Convert the geometry to GeoJSON | ||
        geojson = folium.GeoJson( | ||
            row.geometry, | ||
            style_function=lambda feature, color=row['Color']: style_function(feature, color) | ||
        ) | ||
|
||
        # Add a popup to the GeoJSON | ||
        folium.Popup(row['DisplayName']).add_to(geojson) | ||
|
||
        # Add the GeoJSON to the map | ||
        geojson.add_to(m) | ||
|
||
    # Display the map | ||
    with map_output: | ||
        clear_output(wait=True) | ||
        display(m) | ||
|
||
|
||
def display_map(gdf, display_year): | ||
|
||
    # Create a text box for input | ||
    year_input = widgets.IntText( | ||
        value=display_year, | ||
        description='Year:', | ||
    ) | ||
|
||
    # Define a function to be called when the value of the text box changes | ||
    def on_value_change(change): | ||
        create_map(change['new'], gdf, map_output) | ||
|
||
    # Create a slider for input | ||
    year_slider = widgets.IntSlider( | ||
        value=display_year, | ||
        min=gdf['Year'].min(), | ||
        max=gdf['EndYear'].max(), | ||
        description='Year:', | ||
    ) | ||
|
||
    # Link the text box and the slider | ||
    widgets.jslink((year_input, 'value'), (year_slider, 'value')) | ||
|
||
    # Create an output widget | ||
    map_output = widgets.Output() | ||
|
||
    # Attach the function to the text box | ||
    year_input.observe(on_value_change, names='value') | ||
|
||
    # Display the widgets | ||
    display(year_input, year_slider, map_output) | ||
|
||
    # Call create_map initially to display the map | ||
    create_map(display_year, gdf, map_output) |
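The end-year rule inside `cliopatria_gdf` is easier to follow when isolated from the GeoDataFrame loop. The sketch below restates it as a pure function for illustration only; it is not code from the commit. The function name `shape_end_year` and the sample years are invented, and it assumes each polity's entry in the name/years JSON is a list of `[start, end]` pairs, as the indexing in `cliopatria_gdf` suggests.

```python
def shape_end_year(start_year, shape_years, active_periods):
    """
    Compute the end year of one shape.

    start_year: the year of the shape of interest.
    shape_years: sorted list of all shape years for the same polity.
    active_periods: list of [start, end] pairs for the polity (assumed structure).
    """
    end_years = [end for _, end in active_periods]

    # Smallest period end year that is >= the shape year
    # (falls back to the first end year if none qualifies)
    next_end_year = min(end_years, key=lambda x: x if x >= start_year else float('inf'))

    # A shape dated exactly at a period end lasts for that year only
    if start_year in end_years:
        return start_year

    index = shape_years.index(start_year)
    if index + 1 < len(shape_years):
        # Run until the year before the next shape, but never past the period end
        return min(shape_years[index + 1] - 1, next_end_year)

    # Last shape of the polity: run until the polity's final end year
    return end_years[-1]


if __name__ == "__main__":
    # Invented example: a polity active 100-200 and 300-400, with shapes at 100, 150 and 300
    years = [100, 150, 300]
    periods = [[100, 200], [300, 400]]
    print(shape_end_year(100, years, periods))  # 149: the year before the next shape
    print(shape_end_year(150, years, periods))  # 200: capped at the start of the gap
    print(shape_end_year(300, years, periods))  # 400: last shape, polity end year
```

The second call shows the gap handling: the shape from year 150 stops at 200 rather than extending through the inactive years up to 299.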
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
jupyter==1.0.0 | ||
ipykernel==6.29.3 | ||
geopandas==0.13.2 | ||
contextily==1.6.0 | ||
folium==0.16.0 |