-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add updated script #1155
base: master
Are you sure you want to change the base?
Add updated script #1155
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked through the code and I made a few comments but didn't try to run because the main thing I wanted to test was the update_point_data
but that doesn't seem to be finished.
from dotenv import load_dotenv | ||
load_dotenv() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are mostly not including this in the scripts, though it would be nice to come up with a more standard way of handling this.
DATASET_ID = 'ea208a8b-4559-434b-82ee-95e041596a3a' | ||
|
||
#DATASET_ID = 'ea208a8b-4559-434b-82ee-95e041596a3a' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason this is commented out?
@@ -173,16 +177,20 @@ def fetch_data(src_url): | |||
# initialize at 0 so that we can start pulling from page 1 in the loop | |||
page = 0 | |||
|
|||
# number of records to return per page | |||
limit = 5000 | |||
|
|||
# length (number of rows) of new_data | |||
# initialize at 1 so that the while loop works during first step | |||
new_count = 1 | |||
|
|||
# create an empty list to store ids of data | |||
new_ids = [] | |||
# create an empty dataframe to store data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this comment be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or updated to say we are creating an empty json array?
data_df = pd.json_normalize(accumulated_json, record_path =['data']) | ||
|
||
|
||
#TODO: cols = ['event_id_cnty','event_type', 'latitude', 'longitude'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What needs to be done here?
|
||
#TODO: New update_point_data function to update exsiting point data in in Carto table by adding new records and deleting deleted ones. Run if REFRESH_ALL_POINT_DATA is FALSE and return all records from carto as data_gdf. | ||
|
||
r = cartosql.getFields('event_id_cnty', CARTO_GEO, f='csv', post=True, user=CARTO_USER, key=CARTO_KEY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed here?
def update_point_data(data_gdf): | ||
|
||
''' | ||
Update point data in Carto table by adding new records and deleting deleted ones | ||
INPUT data_gdf: geopandas dataframe storing the point ACLED data (geopandas dataframe) | ||
RETURN data_gdf: geopandas dataframe storing the point ACLED data (geopandas dataframe) | ||
''' | ||
|
||
# get the ids of polygons from the carto table storing administrative areas | ||
r = cartosql.getFields('event_id_cnty', CARTO_GEO, f='csv', post=True, user=CARTO_USER, key=CARTO_KEY) | ||
|
||
# turn the response into a list of ids | ||
geo_id = r.text.split('\r\n')[1:-1] | ||
|
||
# create an empty list to store the ids of rows uploaded to Carto | ||
uploaded_ids = [] | ||
|
||
|
||
|
||
|
||
return data_gdf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like this was finished maybe?
Updated ACLED script such that the data we store in Carto is kept up to date with the API (i.e. deleted and updated).