📊 Marriages and Divorces: OECD Family Database #3773

Merged
merged 7 commits into from
Dec 31, 2024
10 changes: 10 additions & 0 deletions dag/families.yml
@@ -0,0 +1,10 @@
steps:
#
# OECD Family Database
#
data://meadow/oecd/2024-12-30/family_database:
- snapshot://oecd/2024-12-30/family_database.csv
data://garden/oecd/2024-12-30/family_database:
- data://meadow/oecd/2024-12-30/family_database
data://grapher/oecd/2024-12-30/family_database:
- data://garden/oecd/2024-12-30/family_database
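The dependency chain declared in dag/families.yml can be illustrated with a small sketch. This is not how the ETL itself resolves steps; it only assumes that each key depends on the entries listed under it and uses the standard library's `graphlib` to derive a valid run order. The `execution_order` helper is hypothetical.

```python
from graphlib import TopologicalSorter

# Dependencies copied from dag/families.yml: each step maps to the steps it needs.
dag = {
    "data://meadow/oecd/2024-12-30/family_database": {
        "snapshot://oecd/2024-12-30/family_database.csv"
    },
    "data://garden/oecd/2024-12-30/family_database": {
        "data://meadow/oecd/2024-12-30/family_database"
    },
    "data://grapher/oecd/2024-12-30/family_database": {
        "data://garden/oecd/2024-12-30/family_database"
    },
}


def execution_order(graph):
    """Return the steps so that every dependency comes before its dependents."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dag)
for step in order:
    print(step)
```

Because the chain is linear, the order is snapshot, meadow, garden, grapher.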
1 change: 1 addition & 0 deletions dag/main.yml
@@ -784,3 +784,4 @@ include:
- dag/tourism.yml
- dag/migration.yml
- dag/equality.yml
- dag/families.yml
@@ -0,0 +1,52 @@
{
"Argentina": "Argentina",
"Australia": "Australia",
"Austria": "Austria",
"Belgium": "Belgium",
"Brazil": "Brazil",
"Bulgaria": "Bulgaria",
"Canada": "Canada",
"Chile": "Chile",
"Colombia": "Colombia",
"Costa Rica": "Costa Rica",
"Croatia": "Croatia",
"Cyprus": "Cyprus",
"Czechia": "Czechia",
"Denmark": "Denmark",
"Estonia": "Estonia",
"Finland": "Finland",
"France": "France",
"Germany": "Germany",
"Greece": "Greece",
"Hungary": "Hungary",
"Iceland": "Iceland",
"India": "India",
"Indonesia": "Indonesia",
"Ireland": "Ireland",
"Israel": "Israel",
"Italy": "Italy",
"Japan": "Japan",
"Latvia": "Latvia",
"Lithuania": "Lithuania",
"Luxembourg": "Luxembourg",
"Malta": "Malta",
"Mexico": "Mexico",
"Netherlands": "Netherlands",
"New Zealand": "New Zealand",
"Norway": "Norway",
"Poland": "Poland",
"Portugal": "Portugal",
"Romania": "Romania",
"Russia": "Russia",
"Slovak Republic": "Slovakia",
"Slovenia": "Slovenia",
"South Africa": "South Africa",
"Spain": "Spain",
"Sweden": "Sweden",
"Switzerland": "Switzerland",
"United Kingdom": "United Kingdom",
"United States": "United States",
"China (People's Republic of)": "China",
"Korea": "South Korea",
"T\u00fcrkiye": "Turkey"
}
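For reviewers unfamiliar with country mapping files, here is a rough sketch of what applying such a mapping amounts to. It is not the implementation of `geo.harmonize_countries`; the `harmonize` helper is hypothetical, and raising on unmapped names is an assumption made for the illustration. Only a few entries from the file above are repeated.

```python
# A subset of the mapping above: producer's country name -> harmonized name.
COUNTRY_MAPPING = {
    "Slovak Republic": "Slovakia",
    "China (People's Republic of)": "China",
    "Korea": "South Korea",
    "Türkiye": "Turkey",
    "France": "France",
}


def harmonize(names, mapping):
    """Translate producer names to harmonized names, failing loudly on gaps."""
    unknown = sorted(set(names) - set(mapping))
    if unknown:
        raise ValueError(f"Unmapped country names: {unknown}")
    return [mapping[name] for name in names]


harmonized = harmonize(["Korea", "Türkiye", "France"], COUNTRY_MAPPING)
print(harmonized)
```

Failing on unknown names, rather than passing them through, makes a new or renamed country in a future snapshot visible at build time.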
162 changes: 162 additions & 0 deletions etl/steps/data/garden/oecd/2024-12-30/family_database.meta.yml
@@ -0,0 +1,162 @@
# NOTE: To learn more about the fields, hover over their names.
definitions:
common:
presentation:
topic_tags:
- Marriages & Divorces
display:
numDecimalPlaces: 1

description_producer_empl: &description_producer_empl
Employment rates for women (15–64-year-olds) with at least one child aged 0-14, with ‘children’ defined as any children aged 0-14 inclusive who live in the same household as the woman and who are reported as the child of the woman (including both biological children and step or adoptive children). Women with children who do not live in the same household are generally not included, nor are women with children aged 15 and over regardless of whether or not the child lives in the same household and/or is dependent on the mother. Exceptions to this definition are Canada, Korea and the United States, where children aged 0-17 are included. For Australia and Japan, data cover all women aged 15 and over, and for Korea married women aged 15-54.

# Learn more about the available fields:
# http://docs.owid.io/projects/etl/architecture/metadata/reference/
dataset:
update_period_days: 365


tables:
family_database:
variables:
child_poverty_rate:
title: Child poverty rate
unit: "%"
short_unit: "%"
description_short: The percentage of children under 18 living in households with incomes below the poverty line.
description_from_producer: The child relative income poverty rate, defined as the percentage of children (0-17 year-olds) with an equivalised household disposable income (i.e. an income after taxes and transfers adjusted for household size) below the poverty threshold. The poverty threshold is set here at 50% of the median disposable income in each country.

crude_divorce_rate__divorces_per_1000_people:
title: Crude divorce rate (divorces per 1,000 people)
unit: "per 1,000 people"
short_unit: "per 1,000 people"
description_short: Number of divorces during a given year per 1,000 people.
description_from_producer: The crude divorce rate is defined as the number of divorces during a given year per 1,000 people.

crude_marriage_rate__marriages_per_1000_people:
title: Crude marriage rate (marriages per 1,000 people)
unit: "per 1,000 people"
short_unit: ""
description_short: Number of marriages during a given year per 1,000 people.
description_from_producer: The crude marriage rate is defined as the number of marriages during a given year per 1,000 people.

employment_rates__pct__for_all_mothers__15_64_year_olds__with_at_least_one_child_under_15:
title: Employment rates (%), for all mothers (15-64 year-olds) with at least one child under 15
unit: "%"
short_unit: "%"
description_short: The percentage of mothers aged 15-64 with at least one child under 15 who are employed.

employment_rates__pct__for_partnered_mothers__15_64_year_olds__with_at_least_one_child_under_15:
title: Employment rates (%), for partnered mothers (15-64 year-olds) with at least one child under 15
unit: "%"
short_unit: "%"
description_short: The percentage of partnered mothers aged 15-64 with at least one child under 15 who are employed.
description_from_producer: *description_producer_empl

employment_rates__pct__for_sole_parent_mothers__15_64_year_olds__with_at_least_one_child_under_15:
title: Employment rates (%), for sole parent mothers (15-64 year-olds) with at least one child under 15
unit: "%"
short_unit: "%"
description_short: The percentage of sole parent mothers aged 15-64 with at least one child under 15 who are employed.
description_from_producer: *description_producer_empl

length_of_paid_maternity__parental_and_home_care_leave_available_to_mothers_in_weeks:
title: Length of paid maternity, parental and home care leave available to mothers
unit: "weeks"
short_unit: ""
description_short: The number of weeks of paid maternity, parental and home care leave available to mothers.


length_of_paid_paternity_and_parental_leave_reserved_for_fathers_in_weeks:
title: Length of paid paternity and parental leave reserved for fathers
unit: "weeks"
short_unit: ""
description_short: The number of weeks of paid paternity and parental leave reserved for fathers.

proportion__pct__of_children__aged_0_14__that_live_in_households_where_all_adults_are_in_employment__working:
title: Proportion of children aged 0-14 that live in households where all adults are in employment
unit: "%"
short_unit: "%"
description_short: The percentage of children aged 0-14 that live in households where all adults are in employment.

proportion__pct__of_children__aged_0_17__living_in_other_types_of_household:
title: Proportion of children aged 0-17 living in other types of household
unit: "%"
short_unit: "%"
description_short: Refers to a situation where the child lives in a household where no adult is considered a parent.
presentation:
title_public: Share of children living in households where no adult is a parent
display:
name: No adult is considered a parent

proportion__pct__of_children__aged_0_17__living_with_a_single_parent:
title: Proportion of children aged 0-17 living with a single parent
unit: "%"
short_unit: "%"
description_short: The percentage of children aged 0-17 living with a single parent.
presentation:
title_public: Share of children living with a single parent
display:
name: Single parent

proportion__pct__of_children__aged_0_17__living_with_two_parents:
title: Proportion of children aged 0-17 living with two parents
unit: "%"
short_unit: "%"
description_short: The percentage of children aged 0-17 living with two parents.
presentation:
title_public: Share of children living with two parents
display:
name: Two parents

proportion__pct__of_children_aged_0_2_enrolled_in_formal_childcare_and_pre_school:
title: Proportion of children aged 0-2 enrolled in formal childcare and pre-school
unit: "%"
short_unit: "%"
description_short: The proportion of children under 3 enrolled in formal childcare or pre-school.


share_of_births_outside_of_marriage__pct_of_all_births:
title: Share of births outside of marriage (% of all births)
unit: "%"
short_unit: "%"
description_short: The percentage of births that occur outside of marriage.

total_public_social_expenditure_on_families_as_a_pct_of_gdp:
title: Total public social expenditure on families as a % of GDP
unit: "%"
short_unit: "%"
description_short: Public spending on family benefits, measured as a percentage of GDP, covers cash transfers, childcare and family support services, and tax-based benefits, all designed to support families and children.

public_social_expenditure_on_services_and_in_kind_benefits_for_families_as_a_pct_of_gdp:
title: Public social expenditure on services and in-kind benefits for families as a % of GDP
unit: "%"
short_unit: "%"
description_short: Includes funding for childcare, early education, youth assistance, residential facilities, and family services like center-based care and home help.
description_from_producer: Includes the direct financing or subsidisation of childcare and early childhood education facilities, public childcare support through earmarked payments to parents, public spending on assistance for young people and residential facilities, and public spending on family services, including centre-based facilities and home help services for families in need.
presentation:
title_public: Services and in-kind benefits for families as a % of GDP
display:
name: Service-focused benefits

public_social_expenditure_on_cash_benefits_for_families_as_a_pct_of_gdp:
title: Public social expenditure on cash benefits for families as a % of GDP
unit: "%"
short_unit: "%"
description_short: Includes child allowances, parental leave income support, and single-parent family assistance.
description_from_producer: Includes child allowances (which are sometimes income-tested, and with payment levels that in some countries vary with the age or number of children), public income support payments during periods of parental leave, and, in some countries, income support for single-parent families.
presentation:
title_public: Cash benefits for families as a % of GDP
display:
name: Cash-based benefits

public_social_expenditure_on_tax_breaks_for_families_as_a_pct_of_gdp:
title: Financial support for families provided through the tax system as a % of GDP
unit: "%"
short_unit: "%"
description_short: Includes tax exemptions, child tax allowances, and child tax credits, with any excess credits refunded in cash classified as cash transfers.
description_from_producer: This includes tax exemptions (e.g. income from child benefits that is not included in the tax base); child tax allowances (amounts for children that are deducted from gross income and are not included in taxable income), and child tax credits (amounts that are deducted from the tax liability). If any excess of the child tax credit over the liability is returned to the taxpayer in cash, then the resulting cash payment is recorded under cash transfers above (the same applies to child tax credits that are paid out in cash to recipients as a general rule).
presentation:
title_public: Tax breaks for families as a % of GDP
display:
name: Tax-based benefits
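The crude divorce and marriage rates defined in the metadata above reduce to a single formula: events in a year divided by population, times 1,000. A minimal sketch follows; the `crude_rate` helper and the input numbers are made up for illustration and are not taken from the OECD data.

```python
def crude_rate(events: int, population: int, per: int = 1000) -> float:
    """Number of events (e.g. divorces) in a year per `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events * per / population


# Hypothetical country: 45,000 divorces in a year, population of 10 million.
rate = crude_rate(events=45_000, population=10_000_000)
print(rate)
```

With these invented inputs the rate is 4.5 per 1,000 people, which fits the `numDecimalPlaces: 1` display setting.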
60 changes: 60 additions & 0 deletions etl/steps/data/garden/oecd/2024-12-30/family_database.py
@@ -0,0 +1,60 @@
"""Load a meadow dataset and create a garden dataset."""

from etl.data_helpers import geo
from etl.helpers import PathFinder, create_dataset

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)


def run(dest_dir: str) -> None:
#
# Load inputs.
#
# Load meadow dataset.
ds_meadow = paths.load_dataset("family_database")

# Read table from meadow dataset.
tb = ds_meadow.read("family_database")

#
# Process data.
#
tb = geo.harmonize_countries(df=tb, countries_file=paths.country_mapping_path)

tb = tb.pivot(index=["country", "year"], columns="indicator", values="value").reset_index()

columns_of_interest = [
"Child poverty rate",
"Crude divorce rate (divorces per 1000 people)",
"Crude marriage rate (marriages per 1000 people)",
"Employment rates (%) for all mothers (15-64 year olds) with at least one child under 15",
"Employment rates (%) for partnered mothers (15-64 year olds) with at least one child under 15",
"Employment rates (%) for sole-parent mothers (15-64 year olds) with at least one child under 15",
"Length of paid maternity, parental and home care leave available to mothers in weeks",
"Length of paid paternity and parental leave reserved for fathers in weeks",
"Proportion (%) of children (aged 0-14) that live in households where all adults are in employment (working)",
"Proportion (%) of children (aged 0-17) living in 'other' types of household",
"Proportion (%) of children (aged 0-17) living with a single parent",
"Proportion (%) of children (aged 0-17) living with two parents",
"Proportion (%) of children aged 0-2 enrolled in formal childcare and pre-school",
"Share of births outside of marriage (% of all births)",
"Public social expenditure on services and in-kind benefits for families as a % of GDP",
"Public social expenditure on cash benefits for families as a % of GDP",
"Public social expenditure on tax breaks for families as a % of GDP",
"Total public social expenditure on families as a % of GDP",
]

tb = tb[["country", "year"] + columns_of_interest]
tb = tb.format(["country", "year"])

#
# Save outputs.
#
# Create a new garden dataset with the same metadata as the meadow dataset.
ds_garden = create_dataset(
dest_dir, tables=[tb], check_variables_metadata=True, default_metadata=ds_meadow.metadata
)

# Save changes in the new garden dataset.
ds_garden.save()
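The pivot in the garden step turns one row per (country, year, indicator) into one row per (country, year) with a column for each indicator. The sketch below shows the same reshaping with plain dictionaries. It is not the ETL's `Table.pivot`; the `pivot_long_to_wide` helper is hypothetical and the values are invented.

```python
from collections import defaultdict

MARRIAGE = "Crude marriage rate (marriages per 1000 people)"
DIVORCE = "Crude divorce rate (divorces per 1000 people)"

# Long format, as produced by the meadow step. Values are made up.
long_rows = [
    {"country": "France", "year": 2020, "indicator": MARRIAGE, "value": 2.3},
    {"country": "France", "year": 2020, "indicator": DIVORCE, "value": 1.9},
    {"country": "Japan", "year": 2020, "indicator": MARRIAGE, "value": 4.3},
]


def pivot_long_to_wide(rows):
    """Group rows by (country, year) and spread indicators into columns."""
    wide = defaultdict(dict)
    for row in rows:
        key = (row["country"], row["year"])
        if row["indicator"] in wide[key]:
            # A repeated (country, year, indicator) cannot be reshaped safely.
            raise ValueError(f"Duplicate entry for {key}: {row['indicator']}")
        wide[key][row["indicator"]] = row["value"]
    return [
        {"country": country, "year": year, **values}
        for (country, year), values in sorted(wide.items())
    ]


wide_rows = pivot_long_to_wide(long_rows)
for row in wide_rows:
    print(row)
```

In this sketch a missing indicator is simply absent from the row; a pandas-style pivot would fill that cell with NaN instead.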
28 changes: 28 additions & 0 deletions etl/steps/data/grapher/oecd/2024-12-30/family_database.py
@@ -0,0 +1,28 @@
"""Load a garden dataset and create a grapher dataset."""

from etl.helpers import PathFinder, create_dataset

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)


def run(dest_dir: str) -> None:
#
# Load inputs.
#
# Load garden dataset.
ds_garden = paths.load_dataset("family_database")

# Read table from garden dataset.
tb = ds_garden.read("family_database", reset_index=False)

#
# Save outputs.
#
# Create a new grapher dataset with the same metadata as the garden dataset.
ds_grapher = create_dataset(
dest_dir, tables=[tb], check_variables_metadata=True, default_metadata=ds_garden.metadata
)

# Save changes in the new grapher dataset.
ds_grapher.save()
36 changes: 36 additions & 0 deletions etl/steps/data/meadow/oecd/2024-12-30/family_database.py
@@ -0,0 +1,36 @@
"""Load a snapshot and create a meadow dataset."""

from etl.helpers import PathFinder, create_dataset

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)


def run(dest_dir: str) -> None:
#
# Load inputs.
#
# Retrieve snapshot.
snap = paths.load_snapshot("family_database.csv")

# Load data from snapshot.
tb = snap.read(safe_types=False)
columns_to_use = ["Country", "Indicator", "TIME_PERIOD", "OBS_VALUE"]

tb = tb[columns_to_use]
tb = tb.rename(columns={"TIME_PERIOD": "year", "OBS_VALUE": "value"})

#
# Process data.
#
# Ensure all columns are snake-case, set an appropriate index, and sort conveniently.
tb = tb.format(["country", "year", "indicator"])

#
# Save outputs.
#
# Create a new meadow dataset with the same metadata as the snapshot.
ds_meadow = create_dataset(dest_dir, tables=[tb], check_variables_metadata=True, default_metadata=snap.metadata)

# Save changes in the new meadow dataset.
ds_meadow.save()
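The column selection and renaming in the meadow step can be pictured with the standard library alone. This sketch assumes a CSV that has the four columns named in `columns_to_use` plus other columns to discard; the sample rows, the `EXTRA_COLUMN` name, and the `load_long_rows` helper are all invented for illustration.

```python
import csv
import io

# Made-up sample resembling the snapshot layout; EXTRA_COLUMN is hypothetical.
SAMPLE_CSV = """Country,Indicator,TIME_PERIOD,OBS_VALUE,EXTRA_COLUMN
France,Child poverty rate,2019,11.7,x
Korea,Child poverty rate,2019,9.8,y
"""

# Source column -> name used downstream (mirrors the rename in the meadow step).
RENAMES = {
    "Country": "country",
    "Indicator": "indicator",
    "TIME_PERIOD": "year",
    "OBS_VALUE": "value",
}


def load_long_rows(text):
    """Keep only the four columns of interest and give them tidy names."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row = {new: record[old] for old, new in RENAMES.items()}
        row["year"] = int(row["year"])
        row["value"] = float(row["value"])
        rows.append(row)
    return rows


rows = load_long_rows(SAMPLE_CSV)
print(rows[0])
```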
27 changes: 27 additions & 0 deletions snapshots/oecd/2024-12-30/family_database.csv.dvc
@@ -0,0 +1,27 @@
# http://docs.owid.io/projects/etl/architecture/metadata/reference/
meta:
origin:
# Data product / Snapshot
title: OECD Family Database
description: |-
The OECD Family Database provides cross-national indicators on family outcomes and family policies across the OECD countries, its enhanced engagement partners and EU member states. It includes 70 indicators under four main dimensions: (i) structure of families, (ii) labour market position of families, (iii) public policies for families and children and (iv) child outcomes.
date_published: 2024-03-27

# Citation
producer: OECD
citation_full: |-
OECD (2024). OECD Family Database.

# Files
url_main: https://data-explorer.oecd.org/vis?tenant=archive&df[ds]=DisseminateArchiveDMZ&df[id]=DF_FAMILY&df[ag]=OECD&dq=..FAM14%2BFAM13%2BFAM15A%2BFAM15B%2BFAM10C%2BFAM10B%2BFAM10A%2BFAM9C%2BFAM9A%2BFAM9B%2BFAM7%2BFAM17%2BFAM8C%2BFAM8B%2BFAM11D%2BFAM11C%2BFAM11B%2BFAM11A%2BFAM8A%2BFAM5C%2BFAM5B%2BFAM5A%2BFAM4B%2BFAM12A%2BFAM12B%2BFAM4A%2BFAM3&pd=1960%2C2022&to[TIME_PERIOD]=false&vw=tb
Member


I opened this link and got the message:

Whoops, something went wrong on our side. We are working to solve this. Please try again later.

I guess that's a temporary thing?

Contributor Author


Thanks for the quick review :) I think this happens sometimes with the OECD website, but the link works for me now!

url_download: https://sdmx.oecd.org/archive/rest/data/OECD,DF_FAMILY,/..FAM14+FAM13+FAM15A+FAM15B+FAM10C+FAM10B+FAM10A+FAM9C+FAM9A+FAM9B+FAM7+FAM17+FAM8C+FAM8B+FAM11D+FAM11C+FAM11B+FAM11A+FAM8A+FAM5C+FAM5B+FAM5A+FAM4B+FAM12A+FAM12B+FAM4A+FAM3?startPeriod=1960&endPeriod=2022&dimensionAtObservation=AllDimensions&format=csvfilewithlabels
date_accessed: 2024-12-30

# License
license:
name: OECD Terms and Conditions
url: https://www.oecd.org/en/about/terms-conditions.html
outs:
- md5: 13a5061e7b4794ce209b2c139c7346f9
size: 4314783
path: family_database.csv
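The `url_download` value above follows a visible pattern: a base path, a key made of indicator codes joined with `+` after two leading dots, and a query string. The sketch below rebuilds a URL of that shape for two of the codes. It only mirrors the structure seen in this file; the `build_download_url` helper is hypothetical and nothing here is checked against the SDMX specification or the OECD API documentation.

```python
from urllib.parse import urlencode

# Base path copied from url_download in the .dvc file above.
BASE = "https://sdmx.oecd.org/archive/rest/data/OECD,DF_FAMILY,/"


def build_download_url(indicator_codes, start_year, end_year):
    """Assemble a download URL shaped like the one in the snapshot metadata."""
    key = ".." + "+".join(indicator_codes)
    query = urlencode(
        {
            "startPeriod": start_year,
            "endPeriod": end_year,
            "dimensionAtObservation": "AllDimensions",
            "format": "csvfilewithlabels",
        }
    )
    return f"{BASE}{key}?{query}"


url = build_download_url(["FAM14", "FAM13"], 1960, 2022)
print(url)
```

Keeping the code list separate from the URL string makes it easier to see which indicators a snapshot requests when the list changes in a later update.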