Testing - Authority record samples #1431

Open
4 tasks
ahafele opened this issue Nov 12, 2024 · 2 comments

ahafele commented Nov 12, 2024

  • Set up ebsco tools/airflow
  • Test load authority record samples
    • new - see comment below
    • updates

Some records have been loaded to stage already

Sample files here

ahafele added the Authorities and data export labels and removed the data export label on Nov 12, 2024

ahafele commented Nov 15, 2024

Documenting here the differences between authority records loaded with the EBSCO tools and records loaded through data import.

Jeremy posted a record using the EBSCO tools to both /authority-storage/authorities and SRS.
GET authority-storage/authorities/71b2cacf-4f96-5154-ae35-6e6a30986683 returns:

{
    "id": "71b2cacf-4f96-5154-ae35-6e6a30986683",
    "_version": 0,
    "source": "MARC",
    "personalName": "Grimm, Wilhelm, 1786-1859",
    "sftPersonalName": [
        "Grim, Vilkhelm, 1786-1859",
        "Grimm, Guglielmo, 1786-1859",
        "Grimm, Vilʹgelʹm Karl, 1786-1859",
        "Grimm, Wilhelm Karl, 1786-1859",
        "Grimm Brothers",
        "Brothers Grimm",
        "Brüder Grimm",
        "Bratʹi͡a Grimm",
        "Braty Grimm",
        "Krim eghbayrner",
        "Гримм, Вильгельм, 1786-1859",
        "ברודער גרים",
        "גרים וילהלם",
        "גרים, ווילהלם",
        "גרים, וילהלם",
        "גרים, וילהלם, 1786־1859",
        "גרים, וילהלם, 1859־1786"
    ],
    "sftPersonalNameTitle": [
        "Grim, Vilkhelm, 1786-1859",
        "Grimm, Guglielmo, 1786-1859",
        "Grimm, Vilʹgelʹm Karl, 1786-1859",
        "Grimm, Wilhelm Karl, 1786-1859",
        "Grimm Brothers",
        "Brothers Grimm",
        "Brüder Grimm",
        "Bratʹi͡a Grimm",
        "Braty Grimm",
        "Krim eghbayrner",
        "Гримм, Вильгельм, 1786-1859",
        "ברודער גרים",
        "גרים וילהלם",
        "גרים, ווילהלם",
        "גרים, וילהלם",
        "גרים, וילהלם, 1786־1859",
        "גרים, וילהלם, 1859־1786"
    ],
    "identifiers": [
        {
            "value": "n  78095679",
            "identifierTypeId": "c858e4f2-2b6b-4385-842b-60732ee14abb"
        },
        {
            "value": "(OCoLC)oca00230755",
            "identifierTypeId": "7e591197-f335-4afb-bc6d-a6d76ca3bace"
        },
        {
            "value": "(DLC)n  78095679",
            "identifierTypeId": "7e591197-f335-4afb-bc6d-a6d76ca3bace"
        }
    ],
    "notes": [
        {
            "noteTypeId": "76c74801-afec-45a0-aad7-3ff23591e147",
            "note": "Machine-derived non-Latin script reference project."
        },
        {
            "noteTypeId": "76c74801-afec-45a0-aad7-3ff23591e147",
            "note": "Non-Latin script references not evaluated."
        },
        {
            "noteTypeId": "76c74801-afec-45a0-aad7-3ff23591e147",
            "note": "Brothers Grimm/Brüder Grimm not considered joint pseudonym of Wilhelm Grimm and Jacob Grimm; evidence indicates that original works were not issued under this name."
        }
    ],
    "sourceFileId": "af045f2f-e851-4613-984c-4bc13430454a",
    "naturalId": "n78095679",
    "metadata": {
        "createdDate": "2024-11-14T17:55:50.85306Z",
        "createdByUserId": "58d0aaf6-dcda-4d5e-92da-012e6b7dd766",
        "updatedDate": "2024-11-14T17:55:50.85306Z",
        "updatedByUserId": "58d0aaf6-dcda-4d5e-92da-012e6b7dd766"
    }
}

I loaded the same record through data import with the default authorities profile.
GET authority-storage/authorities/fa83bb1b-23af-4b44-b5ed-33a918e04594 returns:

{
    "id": "fa83bb1b-23af-4b44-b5ed-33a918e04594",
    "_version": 0,
    "source": "MARC",
    "personalName": "Grimm, Wilhelm, 1786-1859",
    "sftPersonalName": [
        "Grim, Vilkhelm, 1786-1859",
        "Grimm, Guglielmo, 1786-1859",
        "Grimm, Vilʹgelʹm Karl, 1786-1859",
        "Grimm, Wilhelm Karl, 1786-1859",
        "Grimm Brothers",
        "Brothers Grimm",
        "Brüder Grimm",
        "Bratʹi͡a Grimm",
        "Braty Grimm",
        "Krim eghbayrner",
        "Гримм, Вильгельм, 1786-1859",
        "ברודער גרים",
        "גרים וילהלם",
        "גרים, ווילהלם",
        "גרים, וילהלם",
        "גרים, וילהלם, 1786־1859",
        "גרים, וילהלם, 1859־1786"
    ],
    "subjectHeadings": "a",
    "identifiers": [
        {
            "value": "n  78095679",
            "identifierTypeId": "5d164f4b-0b15-4e42-ae75-cfcf85318ad9"
        },
        {
            "value": "n  78095679",
            "identifierTypeId": "c858e4f2-2b6b-4385-842b-60732ee14abb"
        },
        {
            "value": "(OCoLC)oca00230755",
            "identifierTypeId": "fe19bae4-da28-472b-be90-d442e2428ead"
        }
    ],
    "notes": [
        {
            "noteTypeId": "76c74801-afec-45a0-aad7-3ff23591e147",
            "note": "Machine-derived non-Latin script reference project."
        },
        {
            "noteTypeId": "76c74801-afec-45a0-aad7-3ff23591e147",
            "note": "Non-Latin script references not evaluated."
        },
        {
            "noteTypeId": "76c74801-afec-45a0-aad7-3ff23591e147",
            "note": "Brothers Grimm/Brüder Grimm not considered joint pseudonym of Wilhelm Grimm and Jacob Grimm; evidence indicates that original works were not issued under this name."
        }
    ],
    "sourceFileId": "af045f2f-e851-4613-984c-4bc13430454a",
    "naturalId": "n78095679",
    "metadata": {
        "createdDate": "2024-11-14T18:12:52.88211Z",
        "createdByUserId": "ffba9979-3f5d-4aac-a74f-18218dd2573f",
        "updatedDate": "2024-11-14T18:12:52.88211Z",
        "updatedByUserId": "ffba9979-3f5d-4aac-a74f-18218dd2573f"
    }
}

Linking worked for both the data import record and the EBSCO-loaded record.
Interestingly, the EBSCO-loaded record isn't reflected as LCSH in the thesaurus facet, but I think this is unrelated, since that facet uses the 008.
Other differences: the identifierTypeId for the OCLC number differs (7e591197-f335-4afb-bc6d-a6d76ca3bace from EBSCO vs. fe19bae4-da28-472b-be90-d442e2428ead from data import), and the EBSCO-loaded record includes sftPersonalNameTitle in addition to sftPersonalName, while the data import record has only sftPersonalName.
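
To make this kind of comparison repeatable across the other sample records, a small diff helper can flag which top-level properties differ. This is only a sketch: `diff_authorities` is a hypothetical helper, not part of the EBSCO tools or FOLIO, and the two dictionaries are trimmed copies of the responses above.

```python
def diff_authorities(first: dict, second: dict,
                     ignore=("id", "_version", "metadata")) -> dict:
    """Report top-level properties that differ between two authority records."""
    result = {}
    for key in sorted(set(first) | set(second)):
        if key in ignore:
            continue  # ids and timestamps always differ between loads
        if key not in second:
            result[key] = "only in first"
        elif key not in first:
            result[key] = "only in second"
        elif first[key] != second[key]:
            result[key] = "values differ"
    return result


# Trimmed copies of the two GET responses (most array entries omitted).
ebsco_record = {
    "id": "71b2cacf-4f96-5154-ae35-6e6a30986683",
    "personalName": "Grimm, Wilhelm, 1786-1859",
    "sftPersonalName": ["Grimm Brothers"],
    "sftPersonalNameTitle": ["Grimm Brothers"],
    "identifiers": [
        {"value": "(OCoLC)oca00230755",
         "identifierTypeId": "7e591197-f335-4afb-bc6d-a6d76ca3bace"},
    ],
    "naturalId": "n78095679",
}
data_import_record = {
    "id": "fa83bb1b-23af-4b44-b5ed-33a918e04594",
    "personalName": "Grimm, Wilhelm, 1786-1859",
    "sftPersonalName": ["Grimm Brothers"],
    "subjectHeadings": "a",
    "identifiers": [
        {"value": "(OCoLC)oca00230755",
         "identifierTypeId": "fe19bae4-da28-472b-be90-d442e2428ead"},
    ],
    "naturalId": "n78095679",
}

differences = diff_authorities(ebsco_record, data_import_record)
for name, status in differences.items():
    print(f"{name}: {status}")
```

On the trimmed records this reports identifiers, sftPersonalNameTitle, and subjectHeadings, which matches the differences noted by hand.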


ahafele commented Nov 18, 2024

The $t is being ignored in the EBSCO handling of 4xx fields, which is why the same headings land in both sftPersonalName and sftPersonalNameTitle. There is an open ticket from 2022 in folio-migration-tools to support requiredSubfield.
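
For reference, here is a rough sketch of what honoring requiredSubfield could look like. It assumes a rule entry may carry a `requiredSubfield` list (the name comes from the ticket) and, as a further assumption, an `exclusiveSubfield` list for the plain-name target. The rule shapes, `RULES_400`, and `targets_for` are illustrative only and are not the actual folio-migration-tools code.

```python
# Hypothetical, simplified mapping rules for a 400 field.
RULES_400 = [
    {"target": "sftPersonalName", "exclusiveSubfield": ["t"]},
    {"target": "sftPersonalNameTitle", "requiredSubfield": ["t"]},
]


def targets_for(subfields, rules):
    """Return the target properties whose subfield conditions the field meets.

    `subfields` is a list of (code, value) pairs from one MARC field.
    """
    codes = {code for code, _ in subfields}
    matched = []
    for rule in rules:
        required = set(rule.get("requiredSubfield", []))
        exclusive = set(rule.get("exclusiveSubfield", []))
        # Apply the rule only if every required code is present
        # and no exclusive code is present.
        if required <= codes and not (exclusive & codes):
            matched.append(rule["target"])
    return matched


# A 400 without $t, like the ones in the sample record.
plain_400 = [("a", "Grimm, Wilhelm Karl,"), ("d", "1786-1859")]
# A made-up 400 with $t, for illustration only.
title_400 = [("a", "Grimm, Wilhelm,"), ("d", "1786-1859."), ("t", "Example title")]

print(targets_for(plain_400, RULES_400))
print(targets_for(title_400, RULES_400))
```

With the subfield checks in place, a 4xx without $t maps only to sftPersonalName. Skipping the checks would apply both rules to every field, which is consistent with the duplicated arrays in the EBSCO-loaded record.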

I haven't figured out exactly what is going on with the ID differences, but one cause is that the migration tools concatenate the 001 and 003:
https://github.com/FOLIO-FSE/folio_migration_tools/blob/719e0c0a4175a1716d58eb768c[…]folio_migration_tools/marc_rules_transformation/hrid_handler.py
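
The `(DLC)n  78095679` identifier in the EBSCO-loaded record looks consistent with that concatenation. As a sketch only, assuming the 003 is `DLC` and the 001 is `n  78095679` (inferred from the identifier value, not checked against the MARC), and using a hypothetical helper name rather than the real function in hrid_handler.py:

```python
def identifier_from_control_fields(field_001: str, field_003: str) -> str:
    """Build an 035-style value by prefixing the 001 with the 003 in parentheses."""
    return f"({field_003}){field_001}"


# Assumed control field values for the sample record.
generated = identifier_from_control_fields("n  78095679", "DLC")
print(generated)
```

If that is what is happening, it would explain why the EBSCO-loaded record carries a `(DLC)` identifier that the data import record does not.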

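Something is missing in the 008 handling as well: the EBSCO-loaded record has no subjectHeadings value, while the data import record has "subjectHeadings": "a".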
