feat(A16): add pipeline #47

cmdoret · 2023-08-02T15:20:08Z

Adds the pipeline for dataset A16 (#39): Demographic balance by canton

Some columns were dropped as they were uninformative:

Acquisition of Swiss citizenship: always had a value of 0.
Change of population type: already accounted for in "immigration" and "emigration".
Natural change: can easily be obtained as births - deaths.
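
A minimal sketch of the redundancy argument for `natural_change`, assuming a toy data frame (the name `toy` is hypothetical) filled with the first three values of each column from the glimpse output in this description. It recomputes the dropped column and checks, for these three rows only, that it is consistent with `population_change`:

```r
library(dplyr)

# Toy rows: first three values per column, copied from the glimpse output.
toy <- data.frame(
  births                 = c(73747, 12325, 10599),
  deaths                 = c(59763, 10283, 8862),
  net_migration          = c(23677, 983, 2000),
  statistical_adjustment = c(0, 0, 0),
  population_change      = c(37661, 3025, 3737)
)

toy <- toy %>%
  dplyr::mutate(
    # The dropped column, recomputed as births minus deaths.
    natural_change = births - deaths,
    # Consistency check on these rows only, not a claim about the full dataset.
    balance_ok = population_change ==
      natural_change + net_migration + statistical_adjustment
  )
```

Anyone who needs the column back can derive it this way at query time instead of storing it.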

Observations prior to 1981 were discarded as they only contained a subset of variables, with others set to 0.
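
The year cut-off can be sketched as follows. This assumes `year` is stored as character, as the glimpse output suggests; the toy values and the names `toy` and `recent` are hypothetical and not taken from the actual script:

```r
library(dplyr)

# Hypothetical toy input; `year` is character, so it is converted before comparing.
toy <- data.frame(
  year   = c("1979", "1980", "1981", "1982"),
  births = c(0, 0, 10, 12),
  stringsAsFactors = FALSE
)

# Keep only observations from 1981 onwards.
recent <- toy %>%
  dplyr::filter(as.integer(year) >= 1981)
```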

> dplyr::glimpse(ds$data)
Rows: 1,107
Columns: 12
$ year                             <chr> "1981", "1981", "1981", "1981", "1981…
$ total_population                 <dbl> 6335243, 1120815, 911016, 294421, 335…
$ births                           <dbl> 73747, 12325, 10599, 3747, 438, 1358,…
$ deaths                           <dbl> 59763, 10283, 8862, 2693, 291, 846, 2…
$ immigration                      <dbl> 121420, 23883, 11544, 4025, 421, 1265…
$ in_migration_from_another_canton <dbl> 134359, 17791, 13809, 5965, 477, 2719…
$ emigration                       <dbl> 97743, 19791, 10205, 2980, 361, 911, …
$ out_migration_to_another_canton  <dbl> 134359, 20900, 13148, 6107, 736, 2663…
$ net_migration                    <dbl> 23677, 983, 2000, 903, -199, 410, 293…
$ statistical_adjustment           <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ population_change                <dbl> 37661, 3025, 3737, 1957, -52, 922, 46…
$ spatialunit_uid                  <chr> "0_CH", "1_A.ADM1", "2_A.ADM1", "3_A.…

sabinem

I had trouble running the pipeline due to API limits.

Otherwise it looks fine to me.

I have just two comments about what I would leave in, but we can also merge and discuss this with @nooralahzadeh. Therefore I approve the PR.

sabinem · 2023-08-07T11:06:11Z

scripts/A16.R

+  janitor::clean_names() %>%
+  dplyr::filter(
+    sex == "Sex - total" &
+      citizenship_category == "Citizenship (category) - total"


I think I would leave the citizenship category in, since it relates to the other columns, whereas gender does not.

For example, for out-migration and in-migration, it is interesting whether it was Swiss citizens or other nationalities who left the country or came into it from another country.

True, I will add it! Thanks.
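
A rough sketch of what the agreed change could look like: filter on the sex total only, so that every `citizenship_category` level is retained. The category labels come from this thread; the `"Male"` level, the `emigration` values and the names `toy` and `kept` are hypothetical:

```r
library(dplyr)

# Hypothetical toy input after janitor::clean_names().
toy <- data.frame(
  sex = c("Sex - total", "Sex - total", "Sex - total", "Male"),
  citizenship_category = c(
    "Citizenship (category) - total", "Switzerland",
    "Foreign country", "Switzerland"
  ),
  emigration = c(100, 30, 70, 15),
  stringsAsFactors = FALSE
)

# Drop the breakdown by sex, but keep all citizenship categories.
kept <- toy %>%
  dplyr::filter(sex == "Sex - total") %>%
  dplyr::select(-sex)
```

With this shape, a question such as "how many Swiss citizens emigrated?" maps onto a simple condition on `citizenship_category`.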

sabinem · 2023-08-07T11:10:04Z

scripts/A16.R

+    -change_of_population_type,
+    -population_on_31_december,
+    -natural_change,
+    -acquisition_of_swiss_citizenship


I would leave in acquisition_of_swiss_citizenship, since this is interesting. Redundancy is not an issue here, where you want to translate natural-language questions into SQL. Whatever term might resemble a natural-language question should stay. Also, you need to consider that the data in these tables is never complete.

This is interesting, but the way this is coded is not intuitive.

I'll make sure to write queries that explicitly use it.

  year  canton                 acquisition_of_swiss_citi…¹ citizenship_category
  <chr> <chr>                                        <dbl> <chr>
1 1971  Aargau                                           0 Citizenship (catego…
2 1971  Aargau                                         746 Switzerland
3 1971  Aargau                                        -746 Foreign country
4 1971  Appenzell Ausserrhoden                           0 Citizenship (catego…
5 1971  Appenzell Ausserrhoden                          73 Switzerland
6 1971  Appenzell Ausserrhoden                         -73 Foreign country
7 1971  Appenzell Innerrhoden                            0 Citizenship (catego…
8 1971  Appenzell Innerrhoden                           24 Switzerland
9 1971  Appenzell Innerrhoden                          -24 Foreign country
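
A sketch of a query that uses this coding explicitly. It assumes, as the sample rows suggest, that the positive count sits on the "Switzerland" rows and the mirrored negative count on the "Foreign country" rows; the total rows are left out of the toy data because they are 0 in the sample. The names `toy` and `naturalisations` are hypothetical:

```r
library(dplyr)

# Toy rows copied from the sample above, without the per-canton total rows.
toy <- data.frame(
  year   = "1971",
  canton = rep(
    c("Aargau", "Appenzell Ausserrhoden", "Appenzell Innerrhoden"),
    each = 2
  ),
  citizenship_category = rep(c("Switzerland", "Foreign country"), times = 3),
  acquisition_of_swiss_citizenship = c(746, -746, 73, -73, 24, -24),
  stringsAsFactors = FALSE
)

# Acquisitions per canton: read the count from the "Switzerland" rows,
# since summing over both categories cancels out to 0.
naturalisations <- toy %>%
  dplyr::filter(citizenship_category == "Switzerland") %>%
  dplyr::select(year, canton, acquisition_of_swiss_citizenship)

# Sum over both categories per canton, to show the cancellation.
net_by_canton <- toy %>%
  dplyr::group_by(canton) %>%
  dplyr::summarise(net = sum(acquisition_of_swiss_citizenship))
```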

sabinem

Hello Cyril, I just looked at your PR again. Can you please rebase this before you merge? It has 26 file changes for this pipeline, which seems too much from my perspective.


cmdoret changed the title ~~More feat(A16): add pipeline~~ feat(A16): add pipeline Aug 2, 2023

cmdoret linked an issue Aug 2, 2023 that may be closed by this pull request

New dataset: A16 Demographic balance by canton #39

Open

cmdoret self-assigned this Aug 2, 2023

cmdoret added the dataset Proposal for a new dataset label Aug 2, 2023

cmdoret requested a review from sabinem August 2, 2023 15:21

sabinem approved these changes Aug 7, 2023


cmdoret added 3 commits August 7, 2023 15:32

feat(A16): add pipeline

c00d1f6

refactor(A16): rename net migration col

b99eec4

fix(A16): simplify column name

0f14d4c

sabinem suggested changes Aug 7, 2023


cmdoret added 2 commits August 7, 2023 16:41

refactor(A16): move to pipelines dir

71f718d

refactor(A16): use postgres_export attribute

dae425a

cmdoret force-pushed the ds_a16 branch from a211fd7 to dae425a Compare August 7, 2023 14:43

sabinem approved these changes Aug 7, 2023


feat(A16): add acquisition_of_swiss_citizenship and retain citizenshi…

84d6a34

…p_category. Use assignment pipe

cmdoret merged commit 0b8715c into main Aug 7, 2023

cmdoret deleted the ds_a16 branch August 30, 2023 13:09
