171 world bank projects database #172
Codecov Report
@@ Coverage Diff @@
## main #172 +/- ##
==========================================
- Coverage 75.82% 74.26% -1.56%
==========================================
Files 25 26 +1
Lines 1369 1531 +162
==========================================
+ Hits 1038 1137 +99
- Misses 331 394 +63
Including https for api call
@jm-rivera see updated script
Thanks Luca!
A couple of minor changes but otherwise looks really good.
Three things to also do in relation to this PR:
- Open an issue about documentation. Based on the example I showed you for IDS, this should have very detailed documentation and examples. That can happen later, though.
- Open an issue about the 'additional_fields' discussion we had.
- Add the changes to the changelog to get things ready for a minor release
@jm-rivera for your review
# check if there are missing sectors from the dict
if (len(sectors_dict) == len(sectors) - 1) and (sum(sectors_dict.values()) < 100):
    # loop through all the available sectors
    for s in sector_names:
        # if a sectors has not been picked up it must be the missing sector
        if s not in sectors_dict:
            sectors_dict[s] = 100 - sum(sectors_dict.values())

if sum(sectors_dict.values()) != 100:
    raise ValueError("Sector percentages don't add up to 100%")
@jm-rivera we need to re-evaluate the strategy here. In some instances the sectors that exist in the list sectors do exist in the dictionary sectors_dict. However, the missing sector does not appear on the website or in the Excel download from the main projects page. See for example project P178202: "Waste Management" is not included on the project page, and in the API response we don't see a percentage. This is the response value: 'sector5': 'Waste Management!$!11!$!WB'.
My original assumption was that if sectors are specified, their percentage allocations should add up to 100%. However, I have noticed this is not always the case. Project P073479 has one sector allocated, for ICT technologies at 24%. If sector data does not have to add up to 100%, there is no way to determine the percentage that should be allocated to the missing sectors.
I'd suggest then that we remove this calculation and just parse the sectors which have a percent allocation.
Open to suggestions.
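To make the suggestion concrete, here is a minimal sketch of what "just parse the sectors which have a percent allocation" could look like. It assumes the sectors have already been pulled out of the response into a mapping of sector name to percentage, with None where no percentage was found. The function name and the input shape are hypothetical and not taken from the PR, and the example values are illustrative only.

```python
from typing import Dict, Optional


def parse_allocated_sectors(raw: Dict[str, Optional[float]]) -> Dict[str, float]:
    """Keep only sectors that carry an explicit percent allocation.

    No missing percentage is inferred and no 100% total is enforced,
    since allocations are not guaranteed to add up to 100%.
    """
    return {name: pct for name, pct in raw.items() if pct is not None}


# Hypothetical input: one sector with a percentage, one without.
example = {"ICT": 24.0, "Waste Management": None}
allocated = parse_allocated_sectors(example)
```

With this approach a project with a single 24% allocation is returned as-is, and a sector without a percentage is simply dropped instead of being assigned the remainder.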
@lpicci96 as we discussed, date filtering is not working. I think we should change to start_year: int and end_year: int arguments, which make an API call for those values in the "fiscalyear" field (e.g. fiscalyear=2019^2020^2021).
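A rough sketch of how the proposed start_year and end_year arguments could be turned into the "fiscalyear" value. The helper name is hypothetical, and treating the range as inclusive on both ends is an assumption based on the 2019^2020^2021 example above. The sketch only builds the parameter string and does not call the API.

```python
def fiscal_year_param(start_year: int, end_year: int) -> str:
    """Join every year from start_year to end_year (inclusive) with '^'."""
    if start_year > end_year:
        raise ValueError("start_year must not be after end_year")
    return "^".join(str(year) for year in range(start_year, end_year + 1))


# Query parameters as they might be passed to the request (assumed shape).
params = {"fiscalyear": fiscal_year_param(2019, 2021)}
```

Here params["fiscalyear"] is "2019^2020^2021", matching the format in the comment.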
New import module: world_bank_projects
Queries the World Bank API and formats projects data from the response JSON
Closes #171