Match authors across papers with different forms of name #24

Open
anjackson opened this issue Oct 22, 2024 · 0 comments

Some authors are treated as separate people because of small differences in the form of their name.

  • e.g. I turn up as 'Andrew Jackson' and 'Andrew N. Jackson'
  • Jack O'Sullivan reports three different forms

This is quite difficult to fix in general, but can be fixed manually, given a slightly richer data model and some clarity over where the 'master' copy of this data should reside.

One alternative measure would be to have a simple 'authority file' that maps specific names to a canonical form. This doesn't scale very well with the number of authors (it can't distinguish different people who share the same name) and, unless the data model is modified, it would also force the name itself into canonical form and away from what is recorded as being on the publication. The advantage is that it can be deployed as a 'patch' over the source data, and so chained into the analysis process as an overlay rather than a fork.
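As a rough illustration of the overlay idea, here is a minimal Python sketch. Everything in it is an assumption made for illustration rather than part of the project: the CSV layout (`variant,canonical`), the record fields (`paper`, `author`), the extra `canonical_author` field, and the choice of which name form counts as canonical. Adding `canonical_author` is itself a small data-model change, which is what lets the recorded name stay untouched.

```python
import csv
import io

# Hypothetical authority file: each row maps one variant form to a canonical form.
AUTHORITY_CSV = """variant,canonical
Andrew N. Jackson,Andrew Jackson
"""


def load_authority(fileobj):
    """Read the authority file into a dict of variant -> canonical name."""
    return {
        row["variant"].strip(): row["canonical"].strip()
        for row in csv.DictReader(fileobj)
    }


def apply_overlay(records, authority):
    """Return patched copies of the records; the source records are not modified.

    The name as recorded on the publication stays in 'author'. The canonical
    form goes into a separate (assumed) 'canonical_author' field, falling back
    to the recorded name when the authority file has no entry for it.
    """
    patched = []
    for record in records:
        name = record["author"]
        patched.append({**record, "canonical_author": authority.get(name, name)})
    return patched


authority = load_authority(io.StringIO(AUTHORITY_CSV))

# Hypothetical source records, standing in for the real publication data.
records = [
    {"paper": "paper-1", "author": "Andrew Jackson"},
    {"paper": "paper-2", "author": "Andrew N. Jackson"},
]

patched = apply_overlay(records, authority)
```

Grouping on `canonical_author` would then treat both records as the same person, while `author` still shows what each publication recorded. A plain name-to-name lookup like this still cannot separate two different people who share a name.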
