feat(A16): add pipeline #47
I had trouble running the pipeline due to API limits.
Otherwise it looks fine to me.
I have just two comments about what I would leave in, but we can also merge and discuss this with @nooralahzadeh, so I approve the PR.
scripts/A16.R
Outdated
janitor::clean_names() %>%
  dplyr::filter(
    sex == "Sex - total" &
      citizenship_category == "Citizenship (category) - total"
I think I would leave the citizenship category in, since it relates to the other columns, whereas gender does not.
For example, for out-migration and in-migration it is interesting whether it was Swiss citizens or other nationalities who left the country or came into it.
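A minimal sketch of what this suggestion could look like, written in base R so it runs without the pipeline's packages. The `immigration` column and all numbers are invented for illustration; only the `sex` and `citizenship_category` labels come from the diff above.

```r
# Toy data: the `immigration` column and its values are hypothetical.
demo <- data.frame(
  sex = c("Sex - total", "Sex - total", "Sex - total", "Male"),
  citizenship_category = c(
    "Citizenship (category) - total", "Switzerland", "Foreign country",
    "Switzerland"
  ),
  immigration = c(100, 30, 70, 16),
  stringsAsFactors = FALSE
)

# Filter on the sex total only, so every citizenship category is retained.
kept <- demo[demo$sex == "Sex - total", ]
print(kept)
```

With this filter the per-citizenship rows stay in the table next to the overall total, so a question about Swiss versus foreign migrants can still be answered.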
True, I will add it! thanks
scripts/A16.R
Outdated
-change_of_population_type,
-population_on_31_december,
-natural_change,
-acquisition_of_swiss_citizenship
I would leave in acquisition_of_swiss_citizenship, since this is interesting.
Redundancy is not an issue here, where you want to translate natural-language questions into SQL. Whatever term might resemble a natural-language question should stay. Also, you need to consider that the data in these tables is never complete.
This is interesting, but the way this is coded is not intuitive.
I'll make sure to write queries that explicitly use it.
  year  canton                 acquisition_of_swiss_citi…¹ citizenship_category
  <chr> <chr>                                        <dbl> <chr>
1 1971  Aargau                                           0 Citizenship (catego…
2 1971  Aargau                                         746 Switzerland
3 1971  Aargau                                        -746 Foreign country
4 1971  Appenzell Ausserrhoden                           0 Citizenship (catego…
5 1971  Appenzell Ausserrhoden                          73 Switzerland
6 1971  Appenzell Ausserrhoden                         -73 Foreign country
7 1971  Appenzell Innerrhoden                            0 Citizenship (catego…
8 1971  Appenzell Innerrhoden                           24 Switzerland
9 1971  Appenzell Innerrhoden                          -24 Foreign country
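To make the coding concrete, here is a rough base R sketch that rebuilds the nine rows shown above and reads the number of acquisitions from the "Switzerland" rows. It assumes the truncated label is "Citizenship (category) - total", as in the filter from the diff, and that the positive count sits on the "Switzerland" row, which is what the printed rows suggest.

```r
# Rebuild the nine rows printed above (label of the total row is assumed).
cantons <- c("Aargau", "Appenzell Ausserrhoden", "Appenzell Innerrhoden")
a16 <- data.frame(
  year = rep("1971", 9),
  canton = rep(cantons, each = 3),
  acquisition_of_swiss_citizenship = c(0, 746, -746, 0, 73, -73, 0, 24, -24),
  citizenship_category = rep(
    c("Citizenship (category) - total", "Switzerland", "Foreign country"),
    times = 3
  ),
  stringsAsFactors = FALSE
)

# Acquisitions per canton: take the rows labelled "Switzerland".
acquired <- a16[
  a16$citizenship_category == "Switzerland",
  c("year", "canton", "acquisition_of_swiss_citizenship")
]
print(acquired)

# Summing over all categories cancels out, which is why the total row is 0.
net <- tapply(a16$acquisition_of_swiss_citizenship, a16$canton, sum)
print(net)
```

A query that filters on the total row would therefore always return 0 for this column, so queries using it need to select a specific citizenship category.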
Hello Cyril, I just looked at your PR again. Can you please rebase this before you merge? It has 26 file changes for this pipeline, which seems too many from my perspective.
…p_category. Use assignment pipe
Adds the pipeline for dataset A16 (#39): Demographic balance by canton
Some columns were dropped as they were uninformative:
Observations prior to 1981 were discarded as they only contained a subset of variables, with others set to 0.
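
The year cut-off could be sketched as follows in base R. This is an illustration rather than the code in scripts/A16.R: the `value` column and its numbers are hypothetical, and `year` is assumed to be stored as character, as in the console output earlier in the conversation.

```r
# Toy data: `value` is a hypothetical placeholder column.
obs <- data.frame(
  year = c("1971", "1980", "1981", "2020"),
  value = c(0, 0, 12, 34),
  stringsAsFactors = FALSE
)

# Convert the character year to integer before comparing, then drop
# everything prior to 1981.
kept <- obs[as.integer(obs$year) >= 1981, ]
print(kept)
```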