Supervised or Semi-Supervised machine learning for deduplication #251

Open
8 tasks done
maxkadel opened this issue Nov 14, 2024 · 0 comments
maxkadel commented Nov 14, 2024

Introduction

The team has been asked by leadership to look into possible applications of AI (writ large) in the library.

Problem Statement

Right now, the catalog includes Princeton records from Alma and partner records from SCSB. The same item can appear in both collections, which can be confusing for catalog users.

Can we identify when a SCSB record is a duplicate of a record in Alma using supervised or semi-supervised machine learning?
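To make the question concrete, here is a minimal sketch of how the supervised version could be framed: turn each candidate (Alma, SCSB) pair into similarity features, then train a binary classifier on pairs that someone has labeled as duplicate or not. Everything in it is an assumption for illustration: the field names (`title`, `author`, `year`), the toy records, the hand-assigned labels, and the small hand-rolled logistic regression, which stands in for whatever library the research ends up evaluating. Real Alma and SCSB records are MARC and would need their own field extraction.

```python
from difflib import SequenceMatcher
import math


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
    return " ".join(cleaned.split())


def pair_features(alma, scsb):
    """Similarity features for one candidate (Alma, SCSB) pair."""
    title = SequenceMatcher(
        None, normalize(alma["title"]), normalize(scsb["title"])
    ).ratio()
    author = SequenceMatcher(
        None, normalize(alma["author"]), normalize(scsb["author"])
    ).ratio()
    same_year = 1.0 if alma["year"] == scsb["year"] else 0.0
    return [title, author, same_year]


def predict(weights, bias, features):
    """Estimated probability that the pair is a duplicate."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, labels, epochs=500, lr=0.5):
    """Logistic regression fitted with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = predict(weights, bias, x) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Hypothetical toy records; not taken from the real catalog.
ALMA_1 = {"title": "Moby-Dick; or, The Whale", "author": "Melville, Herman", "year": 1851}
SCSB_1 = {"title": "Moby Dick, or the whale", "author": "Herman Melville", "year": 1851}
ALMA_2 = {"title": "Walden, or, Life in the woods", "author": "Thoreau, Henry David", "year": 1854}
SCSB_2 = {"title": "Walden; or, Life in the Woods", "author": "Thoreau, Henry David, 1817-1862", "year": 1854}

# Hand-labeled pairs: 1 = duplicate, 0 = not a duplicate.
LABELED = [
    (ALMA_1, SCSB_1, 1),
    (ALMA_2, SCSB_2, 1),
    (ALMA_1, SCSB_2, 0),
    (ALMA_2, SCSB_1, 0),
]

rows = [pair_features(a, s) for a, s, _ in LABELED]
labels = [y for _, _, y in LABELED]
weights, bias = train(rows, labels)

for a, s, y in LABELED:
    score = predict(weights, bias, pair_features(a, s))
    print(f"{a['title']!r} vs {s['title']!r}: label={y} score={score:.2f}")
```

A semi-supervised variant would start from a small labeled set like `LABELED` and use the model's most confident predictions on unlabeled pairs as additional training data. Whether that is worthwhile for this catalog is part of what the research would need to establish.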

Acceptance criteria

  • If there is no research directory, create one.

  • Update an existing markdown document or add a new one in the research directory.

  • It has an introduction: Explains the goals and purpose of this research work.

  • It lists methods: Describes what you did to research the question.

  • It has a conclusion: Includes a summary of what was discovered in the research process.

  • It has a step-by-step list of potential next steps that build upon the research.

  • It includes references: References any related resources that have assisted in the research process (links to other tickets, online articles etc.).

  • It includes any artifacts (charts, notes, code samples etc.) that were produced during the work.
