Re-organize Sources for Clarity #174

Open
kkdavis14 opened this issue Dec 9, 2024 · 6 comments
Assignees
Labels
enhancement (New feature to add to the code), Low (Low priority task)

Comments

@kkdavis14
Contributor

kkdavis14 commented Dec 9, 2024

Problem:

  • It's unclear what the difference between "authorities" and "general" is.
  • In a sense, all external sources are "authorities".
  • "Libraries" is only non-US libraries and doesn't include e.g. Library of Congress.
  • "Museums" is unclear, when there are multiple "museum" sources in LUX.
  • The primary difference between sources is internal and external.
/ 
|---sources/
|    |---archives/
|    |---authorities/
|    |---general/
|    |---libraries/
|    |---lux/
|    |---museums/
|    |---yale/
...

Solution

In sources, either add two top-level folders, internal and external, with a folder for each source underneath (e.g. lc), or simply place all source folders at the top level of sources.

/ 
|---sources/
|    |---internal/
|    |    |---yuag/
|    |    |---ycba/
|    |    |---ypm/
|    |    |---ipch/
|    |    |---yul/
|    |    |---pmc/
|    |    |---lux/
|    |---external/
|    |    |---getty/
|    |    |---lc/
|    |    |---wikidata/
|    |    |---bnf/
|    |    |---bne/
|    |    |---gbif/
|    |    |---viaf/
...
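
As a rough illustration of the proposed layout, here is a minimal Python sketch that creates the folders. The source names are copied from the tree above; the `LAYOUT` constant and the `create_layout` function are hypothetical names, not existing code in the repository.

```python
from pathlib import Path

# Group and source names come from the proposed tree; everything else
# (constant name, function name, signature) is an assumption for illustration.
LAYOUT = {
    "internal": ["yuag", "ycba", "ypm", "ipch", "yul", "pmc", "lux"],
    "external": ["getty", "lc", "wikidata", "bnf", "bne", "gbif", "viaf"],
}


def create_layout(root, layout=LAYOUT):
    """Create <root>/sources/<group>/<source>/ and return the paths in order."""
    created = []
    for group, sources in layout.items():
        for source in sources:
            target = Path(root) / "sources" / group / source
            # exist_ok=True makes the call safe to repeat on a partial tree.
            target.mkdir(parents=True, exist_ok=True)
            created.append(target)
    return created
```

Calling `create_layout(".")` from the repository root would only add empty folders; moving the scripts into them is a separate step.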

Implementation Steps

  1. Create new directory structure
  2. Move scripts to appropriate subdirectories
  3. Update all references to script paths in:
       • Documentation
       • Other scripts that may call these scripts
       • Deployment configurations
  4. Test that all scripts work from their new locations
  5. Update README.md with new project structure
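
One way to find the references that need updating is to scan the repository for the old folder paths. The sketch below assumes references appear as literal `sources/<old folder>/` strings; it would miss dotted imports or paths built at runtime. `find_stale_references`, `OLD_DIRS`, and the file suffix list are hypothetical, and lux is left out of `OLD_DIRS` only because the name appears in both layouts.

```python
from pathlib import Path

# Old folder names as listed in the current tree in this issue.
OLD_DIRS = ["archives", "authorities", "general", "libraries", "museums", "yale"]


def find_stale_references(root, old_dirs=OLD_DIRS,
                          suffixes=(".py", ".md", ".sh", ".yml", ".yaml")):
    """Return (path, line number, line) for each line mentioning an old path."""
    needles = ["sources/{}/".format(name) for name in old_dirs]
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if any(needle in line for needle in needles):
                hits.append((path, lineno, line.strip()))
    return hits
```

An empty result after the move is a useful, though not sufficient, check before testing the scripts from their new locations.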

Migration Strategy

  1. Create feature branch for reorganization
  2. Move files in logical groups
  3. Test thoroughly after each group move
  4. Update relative path references
  5. Run full test suite
  6. Deploy to staging environment for verification
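
Moving files in logical groups could be planned before anything is touched. The sketch below only builds `git mv` command lists, grouped by label, so each group can be reviewed, run, and tested on its own. `plan_moves` is a hypothetical helper, and the old-to-new mapping has to be supplied by hand, since this issue does not say which old folder each source currently lives in.

```python
from pathlib import PurePosixPath


def plan_moves(groups):
    """Turn {label: {old_path: new_path}} into [(label, [git mv commands])].

    Nothing is executed here; the caller decides when to run each group.
    """
    plan = []
    for label, moves in groups.items():
        commands = []
        for old, new in moves.items():
            # PurePosixPath normalizes details such as a trailing slash.
            commands.append(
                ["git", "mv", str(PurePosixPath(old)), str(PurePosixPath(new))]
            )
        plan.append((label, commands))
    return plan
```

Each command list can be passed to `subprocess.run(cmd, check=True)` on the feature branch. `git mv` expects the destination's parent directory to exist already, so the new structure should be created first.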

Breaking Changes

  • All paths referencing these scripts will need to be updated
  • Development workflows may need adjustment

Type

🏗️ Enhancement

Priority

Low

kkdavis14 added the enhancement label Dec 9, 2024
@wjbmattingly

I think we should handle this together with #173 and #172 in a single branch and PR after we finalize the downloading of external data. They share a common thread: each changes the directory structure of both the pipeline and scripts to improve readability.

@wjbmattingly

I had a thought here. Should we separate it out not as internal and external, but rather as internal, external, and Yale?

@kkdavis14
Contributor Author

@wjbmattingly internal is Yale (the unit mappers/loaders). Then there's LUX, which is still internal but isn't unit-related: it's post-processing pipeline stuff (MarkLogic mapper, final mapper for final cleanups and name selections).

@wjbmattingly

Ah, @kkdavis14 I misspoke. I meant internal, external, lux.

@kkdavis14
Contributor Author

@wjbmattingly this makes sense--LUX is not internal in the same sense the units are, but sources is probably the best place for those files (for now anyway, we could review when we get to mappers).

@wjbmattingly

Makes sense to me!


2 participants