This repository has been archived by the owner on May 8, 2024. It is now read-only.

Adding a metadata file: start-end dates of riksdag sessions #356

Merged: 4 commits into dev on Dec 15, 2023

BobBorges
Collaborator

The file was curated from the Wikipedia article Lista över svenska riksdagar.

Columns:

  • name: the designation (benämning) of a session
  • parliament_year: the parliament year (corresponds to subdirectories under corpus/protocols/)
  • start: start date
  • end: end date
  • protocol_specifier: corresponds to the third element in the protocol names in the bicameral period, e.g. höst or urtima

We know this list from Wikipedia isn't perfect, so it's open here as a PR for full scrutiny.

For example, @Lottabrorsson wrote in email:

"""
The change for the Riksmöte to being the whole year, autumn to autumn without a break during the summer, took place already in 1995. Attached is a comment regarding Riksdagsordningen, that this concerns.

And for example, year 1998 – the last meeting before the summer was June 25, not June 10, although now it is wrong to say that it ended before the summer.

And I think that is because the last protocol on the web before the summer is from June 10. (And that means that the protocol 1997/98:123 is missing.)
"""
RO_kap1_kommentar.pdf

I also compared this list to the actual dates indicated in protocol metadata, which reveals, e.g., that the 201718 parliament year probably has an incorrect end date. I'm attaching a JSON file with the results of this comparison (again, GitHub doesn't like JSON files, so just remove the .txt extension). It is sorted by the number of protocols that have a docDate in the metadata outside the range of dates in the proposed metadata list.

oor.json.txt
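A comparison of this kind could be sketched roughly as follows. This is a minimal illustration, not the script used for the attached file: the session ranges, protocol ids, and dates are invented, and the function name `out_of_range_protocols` is hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical session ranges keyed by parliament year (invented values).
sessions = {
    "1900": (date(1900, 1, 15), date(1900, 5, 20)),
    "1901": (date(1901, 1, 15), date(1901, 5, 25)),
}

# Hypothetical (parliament_year, protocol_id, docDate) triples.
# A protocol can carry more than one date in its metadata.
protocol_dates = [
    ("1900", "prot-a", date(1900, 2, 3)),
    ("1900", "prot-b", date(1899, 12, 20)),  # before the session start
    ("1901", "prot-c", date(1901, 7, 4)),    # after the session end
    ("1901", "prot-c", date(1901, 3, 1)),    # in range, same protocol
    ("1901", "prot-d", date(1901, 8, 1)),    # after the session end
]


def out_of_range_protocols(sessions, protocol_dates):
    """Count, per parliament year, the protocols with at least one date
    outside the session range, sorted by that count (largest first)."""
    flagged = defaultdict(set)
    for year, prot_id, doc_date in protocol_dates:
        start, end = sessions[year]
        if not (start <= doc_date <= end):
            flagged[year].add(prot_id)
    counts = [(year, len(prots)) for year, prots in flagged.items()]
    return sorted(counts, key=lambda item: item[1], reverse=True)


result = out_of_range_protocols(sessions, protocol_dates)
print(result)
```

Counting distinct protocols rather than individual dates keeps a protocol with several stray dates from dominating the ranking.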

Note that a date out of range isn't necessarily a problem with the list. For example, in prot-1914-b-ak--27.xml, our find_dates.py script picked up, in addition to the actual date of the document, an out-of-range date that was quoted in the protocol, and added it to the protocol metadata. In this case, the out-of-range date indicates a potential problem with the protocol rather than with the list of dates. (Do we want to remove such dates?)

riksdag_start-end_quoted-date

riksdag_start-end_quoted-date-prot

@fredrik1984
Collaborator

Ok, great! @Lottabrorsson, if you have any questions about this work, I suggest you ask Bob.

@BobBorges
Collaborator Author

#342

@MansMeg
Collaborator

MansMeg commented Sep 18, 2023

Nice. I would order the file by date, i.e. with the latest Riksdag as the first row. Then it will be easier to check the file as well.
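One way to get that ordering, as a rough sketch: it assumes the column names from the PR description and ISO-formatted start dates, and the rows below are invented placeholders rather than contents of the real file.

```python
import csv
import io
from datetime import date

# Invented sample in the column layout described in the PR description.
SAMPLE = """\
name,parliament_year,start,end,protocol_specifier
Example session A,1900,1900-01-15,1900-05-20,
Example session B,1901,1901-01-15,1901-05-25,
Example session C,1901,1901-09-01,1901-09-20,urtima
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Sort on the parsed start date, newest session first.
ordered = sorted(rows, key=lambda r: date.fromisoformat(r["start"]), reverse=True)

for row in ordered:
    print(row["parliament_year"], row["start"], row["name"])
```

Sorting on the parsed date rather than on parliament_year keeps several sessions within the same year in a well-defined order.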

@MansMeg
Collaborator

MansMeg commented Sep 28, 2023

We are waiting here for @Lottabrorsson to check that the dates in the file are correct.

@fredrik1984
Collaborator

@Lottabrorsson I found this table in Stjernquist 1966 vol. 4 – all start/end dates for riksdag meeting years between 1933 and 1965

Unknown

@fredrik1984
Collaborator

@Lottabrorsson has made a great list of start/end dates of all riksmöten since 1867. See the list here:

Riksmöten_def.xlsx

@BobBorges
Collaborator Author

Excellent, and thanks for posting the file here @fredrik1984. I'll compare Lotta's work with what we came up with from the protocols and give some signal here when it's time to merge.

@fredrik1984
Collaborator

fredrik1984 commented Oct 24, 2023

#288 and probably related to #254 too

@BobBorges
Collaborator Author

I've been looking carefully at @Lottabrorsson's file, and there are some points to discuss:

  • There are differences between the two lists. Most of them are a matter of a day or two, but others are more severe, e.g. the 2006--2009 dates are months off, and 1947 is off by half a year. Should we double-check these?
    lotta-wiki_discrepancies.csv

  • In our protocols and the wiki list we have a 1980 urtima session and 1919 lagtima/urtima sessions, but they are not on Lotta's list.

  • Just eyeballing it, Lotta's dates seem more restrictive (start later, end earlier), which has consequences if we test all protocols against the lists: of 18,000 dates, 820 are out of range according to Lotta and 683 according to Wikipedia. (N.b.: there are valid reasons for a date to be out of range, but we might want to look into some of them if that roughly 4--5% is concerning.)
    prot-oor_cf-lotta.json
    prot-oor_cf-wiki.json
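A discrepancy check between two such lists could look roughly like the sketch below. The session keys and dates are invented, the helper name `discrepancies` and the two-day tolerance are assumptions, and this is not how lotta-wiki_discrepancies.csv was actually produced.

```python
from datetime import date

# Invented (start, end) ranges for two hypothetical lists, keyed by session.
list_a = {
    "s1": (date(2000, 1, 10), date(2000, 6, 1)),
    "s2": (date(2001, 1, 10), date(2001, 6, 1)),
    "s3": (date(2002, 1, 10), date(2002, 6, 1)),  # only present in list_a
}
list_b = {
    "s1": (date(2000, 1, 11), date(2000, 6, 1)),   # one day off: tolerated
    "s2": (date(2001, 1, 10), date(2001, 12, 1)),  # end is half a year off
}


def discrepancies(a, b, tolerance_days=2):
    """Return sessions whose start or end differ by more than the tolerance,
    plus the sessions that appear in only one of the two lists."""
    differing = []
    for key in sorted(a.keys() & b.keys()):
        (a_start, a_end), (b_start, b_end) = a[key], b[key]
        d_start = abs((a_start - b_start).days)
        d_end = abs((a_end - b_end).days)
        if max(d_start, d_end) > tolerance_days:
            differing.append((key, d_start, d_end))
    only_in_one = sorted(a.keys() ^ b.keys())
    return differing, only_in_one


differing, only_in_one = discrepancies(list_a, list_b)
print(differing)
print(only_in_one)

# Share of out-of-range dates, using the figures quoted in the comment above.
print(f"{820 / 18000:.1%}", f"{683 / 18000:.1%}")
```

Reporting the sessions missing from one list separately covers cases like the urtima sessions mentioned above, which would otherwise drop out of a pairwise comparison.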

@Lottabrorsson
Collaborator

Sorry @BobBorges! I am just now double-checking some of the dates. I'll get back to you with some changes. Sometimes there are different dates for the two chambers. And sometimes Stjernquist has a different day compared to the protocols. It's not easy :)

@BobBorges
Collaborator Author

Sorry @BobBorges!

No worries! I know it's tricky :)

@Lottabrorsson
Collaborator

I'll soon send you an email about this and a new list.

@fredrik1984
Collaborator

This also relates to #416

Collaborator

@ninpnin ninpnin left a comment

Change all rows "fk-ak" into two rows, one with "fk" and one with "ak"
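The requested change could be sketched as below. It assumes a `chamber` column holding values such as "fk-ak" (the column name is taken from the CSV header quoted later in this thread); the rows are invented placeholders and the helper name is hypothetical, so this is not the code from the actual fix.

```python
def split_joint_chamber_rows(rows):
    """Replace each row whose chamber is 'fk-ak' with two otherwise
    identical rows, one for 'fk' and one for 'ak'."""
    result = []
    for row in rows:
        if row["chamber"] == "fk-ak":
            for chamber in ("fk", "ak"):
                result.append({**row, "chamber": chamber})
        else:
            result.append(dict(row))
    return result


# Invented example rows.
rows = [
    {"parliament_year": "1900", "specifier": "", "chamber": "fk-ak",
     "start": "1900-01-15", "end": "1900-05-20"},
    {"parliament_year": "1901", "specifier": "urtima", "chamber": "fk",
     "start": "1901-09-01", "end": "1901-09-20"},
]

split_rows = split_joint_chamber_rows(rows)
for row in split_rows:
    print(row["parliament_year"], row["chamber"])
```

One row per chamber also leaves room for cases where the two chambers have different dates.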

@BobBorges
Collaborator Author

Change all rows "fk-ak" into two rows, one with "fk" and one with "ak"

fixed in c0f7a4b

@BobBorges BobBorges mentioned this pull request Dec 14, 2023
@MansMeg
Collaborator

MansMeg commented Dec 14, 2023

Looks like unix dates:

parliament_year,specifier,chamber,start,end
1867,,fk,-12037,-11916
1867,,ak,-12037,-11916
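The thread does not say how these integers were produced, so the following is only a guess at how one might decode them. Read as day counts from the Unix epoch (1970-01-01) they fall in 1937, whereas read against the common spreadsheet epoch of 1899-12-30 they fall in 1867, which matches parliament_year. The choice of epoch here is an assumption, not something confirmed by the authors.

```python
from datetime import date, timedelta

UNIX_EPOCH = date(1970, 1, 1)
SPREADSHEET_EPOCH = date(1899, 12, 30)  # assumption: spreadsheet-style day serials


def from_day_offset(days, epoch):
    """Interpret an integer as a number of days relative to the given epoch."""
    return epoch + timedelta(days=days)


start_serial, end_serial = -12037, -11916  # values from the rows above

as_unix = from_day_offset(start_serial, UNIX_EPOCH)
as_spreadsheet_start = from_day_offset(start_serial, SPREADSHEET_EPOCH)
as_spreadsheet_end = from_day_offset(end_serial, SPREADSHEET_EPOCH)

print("unix epoch:       ", as_unix.isoformat())
print("spreadsheet epoch:", as_spreadsheet_start.isoformat(),
      "to", as_spreadsheet_end.isoformat())
```

Writing the dates back as ISO strings (YYYY-MM-DD) would avoid this ambiguity in the CSV altogether.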

@BobBorges
Collaborator Author

Looks like unix dates:

parliament_year,specifier,chamber,start,end
1867,,fk,-12037,-11916
1867,,ak,-12037,-11916

c725272

@MansMeg MansMeg requested a review from ninpnin December 14, 2023 12:42
@MansMeg
Collaborator

MansMeg commented Dec 14, 2023

Is there anywhere we can state that this dataset has been created by experts (Lotta)?

@BobBorges
Collaborator Author

readme?

@MansMeg
Collaborator

MansMeg commented Dec 14, 2023

Maybe add a source column with a reference and then add a ref in the bibtex?

@MansMeg
Collaborator

MansMeg commented Dec 15, 2023

I'll add this as a separate issue, so I'm happy now.

Collaborator

@ninpnin ninpnin left a comment

LGTM

@MansMeg MansMeg merged commit af48fbb into dev Dec 15, 2023
3 checks passed
5 participants