Refactor Cliopatria loading and visualiser notebook #22

Merged
merged 77 commits into from
Jul 12, 2024
Changes from 62 commits
Commits
5c9631e
dont need to calculate end year
edwardchalstrey1 Jul 9, 2024
b58fde4
rename Year to FromYear and EndYear to ToYear
edwardchalstrey1 Jul 9, 2024
a74360e
change Year to FromYear
edwardchalstrey1 Jul 9, 2024
b90b8dd
remove end year generation
edwardchalstrey1 Jul 9, 2024
2e1aa1c
remove polity years dict
edwardchalstrey1 Jul 9, 2024
3d180a0
remove loading of name_years
edwardchalstrey1 Jul 9, 2024
469d752
get polity_colour_key as well as polity_name for each polity
edwardchalstrey1 Jul 9, 2024
365fa67
update convert_name docstring
edwardchalstrey1 Jul 9, 2024
ec5db36
add distinctipy to requirements
edwardchalstrey1 Jul 9, 2024
10150ee
add colours to gdf
edwardchalstrey1 Jul 9, 2024
6049800
remove needless hex conversion
edwardchalstrey1 Jul 9, 2024
1829e20
update Cliopatria notebook
edwardchalstrey1 Jul 9, 2024
ecfbd73
refactor map functions
edwardchalstrey1 Jul 9, 2024
c3b0afc
Ignore rows where the DisplayName is None
edwardchalstrey1 Jul 9, 2024
9093da3
filter unions in the same way done in the world map template
edwardchalstrey1 Jul 9, 2024
fc9db8c
comment
edwardchalstrey1 Jul 9, 2024
b366e07
use geopandas
edwardchalstrey1 Jul 9, 2024
977ab7a
add 'PolityStartYear' and 'PolityEndYear'
edwardchalstrey1 Jul 9, 2024
4c08e11
same as prev commit
edwardchalstrey1 Jul 9, 2024
065b818
update notebook
edwardchalstrey1 Jul 9, 2024
cc624d5
add stdout messages
edwardchalstrey1 Jul 9, 2024
a01dbb5
refactor cliopatria_gdf as a single func
edwardchalstrey1 Jul 9, 2024
81dd3b2
double quotes for f strings
edwardchalstrey1 Jul 9, 2024
430af5c
use geojson directly
edwardchalstrey1 Jul 9, 2024
ee16480
Merge branch 'dev' into cliopatria-end-years
edwardchalstrey1 Jul 10, 2024
6243748
add logger
edwardchalstrey1 Jul 10, 2024
4c7e6d6
use stdout not log
edwardchalstrey1 Jul 10, 2024
571bfc9
remove try except
edwardchalstrey1 Jul 10, 2024
83599f7
add cliopatria script
edwardchalstrey1 Jul 10, 2024
83c786a
update doc
edwardchalstrey1 Jul 10, 2024
f8114f0
use print statements
edwardchalstrey1 Jul 10, 2024
db0c7fe
remove notebook functions moved to cliopatria folder
edwardchalstrey1 Jul 10, 2024
c014a40
update notebook instructions
edwardchalstrey1 Jul 10, 2024
8751d1f
update notebook instructions
edwardchalstrey1 Jul 10, 2024
6b91264
notebook works
edwardchalstrey1 Jul 10, 2024
f93cf11
use properties of the loaded json
edwardchalstrey1 Jul 10, 2024
ed32c81
add word to error msg typo
edwardchalstrey1 Jul 10, 2024
1f3c2fd
fix how to get geometry
edwardchalstrey1 Jul 10, 2024
3b8704b
Ignore rows with None DisplayName
edwardchalstrey1 Jul 10, 2024
6523c54
update VideoShapefile model to remove polity as separate field from n…
edwardchalstrey1 Jul 11, 2024
62b33eb
simplify cliopatria_gdf
edwardchalstrey1 Jul 11, 2024
5619443
filter based on MemberOf
edwardchalstrey1 Jul 11, 2024
6fc6653
separate polities and components view
edwardchalstrey1 Jul 11, 2024
dcc7b91
fix display_map toggle to set year
edwardchalstrey1 Jul 11, 2024
08a4cc7
ignore personal unions
edwardchalstrey1 Jul 11, 2024
a9fab1d
Revert "update VideoShapefile model to remove polity as separate fiel…
edwardchalstrey1 Jul 11, 2024
59b5d7a
load all polities to db and deprecate polity field
edwardchalstrey1 Jul 11, 2024
139400d
no personal unions
edwardchalstrey1 Jul 11, 2024
abb3803
add components and member_of fields to the VideoShapefile table
edwardchalstrey1 Jul 11, 2024
5e44241
update model docstring
edwardchalstrey1 Jul 11, 2024
41bc573
update command script with new fields
edwardchalstrey1 Jul 11, 2024
0daf2ca
increase charfield maxlength
edwardchalstrey1 Jul 11, 2024
a933a9b
increase charfield to 500 and make it as a single new migration
edwardchalstrey1 Jul 11, 2024
cb2f9df
include new field in views
edwardchalstrey1 Jul 11, 2024
e6a0929
filter by not a component
edwardchalstrey1 Jul 11, 2024
f5b66f2
add toggle for colour by polities or components
edwardchalstrey1 Jul 11, 2024
250db64
removed deprecated union logic
edwardchalstrey1 Jul 11, 2024
4854b81
legend works like world map
edwardchalstrey1 Jul 11, 2024
f12d1bd
move opacity below "colour by"
edwardchalstrey1 Jul 11, 2024
70b9378
hide colour by when polity not selected
edwardchalstrey1 Jul 11, 2024
1c0c603
fix indent
edwardchalstrey1 Jul 11, 2024
7883344
add components to polity map
edwardchalstrey1 Jul 11, 2024
d832611
use name not polity field of videoshapefile
edwardchalstrey1 Jul 12, 2024
ee7463a
remove polity field from view and map_functions
edwardchalstrey1 Jul 12, 2024
c147204
remove polity field which was deprecated
edwardchalstrey1 Jul 12, 2024
e19566a
rename dropdown
edwardchalstrey1 Jul 12, 2024
430e133
show components by default for polity map
edwardchalstrey1 Jul 12, 2024
53bdfd2
add space
edwardchalstrey1 Jul 12, 2024
1f9e692
add docstrings for notebook functions
edwardchalstrey1 Jul 12, 2024
89f8b5e
reduce indent
edwardchalstrey1 Jul 12, 2024
a5d592a
unindent
edwardchalstrey1 Jul 12, 2024
4a5df21
add docstrings
edwardchalstrey1 Jul 12, 2024
6ef49a0
add docstring
edwardchalstrey1 Jul 12, 2024
c45ea48
remove polity field from videoshapefile in tests
edwardchalstrey1 Jul 12, 2024
74f523b
update test shape name
edwardchalstrey1 Jul 12, 2024
454af3b
add components and member_of fields to test db setup and expected res…
edwardchalstrey1 Jul 12, 2024
6ffacae
remove docs note on shape simplification
edwardchalstrey1 Jul 12, 2024
64 changes: 64 additions & 0 deletions cliopatria/convert_data.py
@@ -0,0 +1,64 @@
+import geopandas as gpd
+from distinctipy import get_colors, get_hex
+import sys
+
+def cliopatria_gdf(gdf):
+    """
+    Load the Cliopatria shape dataset with GeoPandas, process names and colors efficiently.
+    """
+
+    # Generate DisplayName for each shape based on the 'Name' field
+    gdf['DisplayName'] = gdf['Name'].str.replace('[()]', '', regex=True)
+
+    # Add type prefix to DisplayName where type is not 'POLITY'
+    gdf.loc[gdf['Type'] != 'POLITY', 'DisplayName'] = gdf['Type'].str.capitalize() + ': ' + gdf['DisplayName']
+
+    print(f"Generated shape names for {len(gdf)} shapes.")
+    print("Assigning colours to shapes...")
+
+    # Use DistinctiPy package to assign a colour based on the DisplayName field
+    colour_keys = gdf['DisplayName'].unique()
+    colours = [get_hex(col) for col in get_colors(len(colour_keys))]
+    colour_mapping = dict(zip(colour_keys, colours))
+
+    # Map colors to a new column
+    gdf['Color'] = gdf['DisplayName'].map(colour_mapping)
+
+    print(f"Assigned colours to {len(gdf)} shapes.")
+    print("Determining polity start and end years...")
+
+    # Add a column called 'PolityStartYear' to the GeoDataFrame which is the minimum 'FromYear' of all shapes with the same 'Name'
+    gdf['PolityStartYear'] = gdf.groupby('Name')['FromYear'].transform('min')
+
+    # Add a column called 'PolityEndYear' to the GeoDataFrame which is the maximum 'ToYear' of all shapes with the same 'Name'
+    gdf['PolityEndYear'] = gdf.groupby('Name')['ToYear'].transform('max')
+
+    print(f"Determined polity start and end years for {len(gdf)} shapes.")
+
+    return gdf
+
+
+# Check if a GeoJSON file path was provided as a command line argument
+if len(sys.argv) < 2:
+    print("Please provide the path to the GeoJSON file as a command line argument.")
+    sys.exit(1)
+
+geojson_path = sys.argv[1]
+
+try:
+    gdf = gpd.read_file(geojson_path)
+except Exception as e:
+    print(f"Error loading GeoJSON file: {str(e)}")
+    sys.exit(1)
+
+# Call the cliopatria_gdf function to process the GeoDataFrame
+processed_gdf = cliopatria_gdf(gdf)
+
+# Save the processed GeoDataFrame as a new GeoJSON file
+output_path = geojson_path.replace('.geojson', '_seshat_processed.geojson')
+try:
+    processed_gdf.to_file(output_path, driver='GeoJSON')
+    print(f"Processed GeoDataFrame saved to: {output_path}")
+except Exception as e:
+    print(f"Error saving processed GeoDataFrame: {str(e)}")
+    sys.exit(1)
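
For readers who want to see what the name and year columns computed by `cliopatria_gdf` look like, here is a minimal plain-Python sketch of the same transformations. It is not part of the PR: the rows, the `LEAGUE` type value, and the `display_name` helper are hypothetical and chosen only for illustration, and the colour assignment is left out because it depends on distinctipy.

```python
import re

# Hypothetical feature properties; the values are illustrative, not taken from the dataset.
rows = [
    {"Name": "(Roman Empire)", "Type": "POLITY", "FromYear": -27, "ToYear": 99},
    {"Name": "(Roman Empire)", "Type": "POLITY", "FromYear": 100, "ToYear": 395},
    {"Name": "Hanseatic League", "Type": "LEAGUE", "FromYear": 1356, "ToYear": 1669},
]


def display_name(name, shape_type):
    """Strip parentheses; prefix the capitalised type for non-polity shapes."""
    cleaned = re.sub(r"[()]", "", name)
    if shape_type != "POLITY":
        return f"{shape_type.capitalize()}: {cleaned}"
    return cleaned


# Earliest FromYear and latest ToYear per Name, mirroring the groupby/transform calls.
spans = {}
for row in rows:
    start, end = spans.get(row["Name"], (row["FromYear"], row["ToYear"]))
    spans[row["Name"]] = (min(start, row["FromYear"]), max(end, row["ToYear"]))

# Write the derived values back onto every row that shares the Name.
for row in rows:
    row["DisplayName"] = display_name(row["Name"], row["Type"])
    row["PolityStartYear"], row["PolityEndYear"] = spans[row["Name"]]
```

Every row sharing a `Name` ends up with the same `PolityStartYear` and `PolityEndYear`, which is the effect `transform('min')` and `transform('max')` have in the script.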
12 changes: 9 additions & 3 deletions docs/source/getting-started/setup/spatialdb.rst
@@ -25,14 +25,20 @@ Cliopatria shape dataset
 -------------------------
 
 ..
-    TODO: Add a link here to the published Clipatria dataset
+    TODO: Add a link here to the published Cliopatria dataset
 
 1. Download and unzip the Cliopatria dataset.
-2. Populate ``core_videoshapefile`` table using the following command:
+2. Update the Cliopatria GeoJSON file with colours and other properties required by Seshat:
+
+    .. code-block:: bash
+
+        $ python cliopatria/convert_data.py /path/to/cliopatria.geojson
+    Note: this will create a new file with the same name but with the suffix "_seshat_processed.geojson"
+3. Populate ``core_videoshapefile`` table using the following command:
 
     .. code-block:: bash
 
-        $ python manage.py populate_videodata /path/to/data
+        $ python manage.py populate_videodata /path/to/cliopatria_seshat_processed.geojson
 
     Note: if you wish to further simplify the Cliopatria shape resolution used by the world map after loading it into the database, open ``seshat/apps/core/management/commands/populate_videodata.py`` and modify the SQL query under the comment: "Adjust the tolerance param of ST_Simplify as needed"

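The docs note that the conversion step writes a new file with the suffix "_seshat_processed.geojson". The script derives that name with `str.replace`, which returns the input path unchanged when it does not contain `.geojson`. The sketch below shows one way the same naming rule could be expressed with `pathlib`; `processed_path` is a hypothetical helper, not something in the PR.

```python
from pathlib import Path


def processed_path(geojson_path):
    """Return a sibling path with '_seshat_processed' inserted before the extension."""
    path = Path(geojson_path)
    # Hypothetical helper: keeps whatever extension the input had.
    return path.with_name(f"{path.stem}_seshat_processed{path.suffix}")


if __name__ == "__main__":
    print(processed_path("/path/to/cliopatria.geojson"))
```

For an input ending in `.geojson` this yields the same file name as the script, and for any other extension it still produces a distinct output name.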