Use context manager for file handling in codebase #177

Open
1 of 4 tasks
wjbmattingly opened this issue Dec 9, 2024 · 0 comments
Assignees
Labels
enhancement New feature to add to the code Low Low priority task

wjbmattingly commented Dec 9, 2024

Current Behavior

Many Python files currently call open() directly, which requires each file to be closed manually.

Proposed Change

Replace direct open() calls with Python's context manager (with statement) to ensure proper file handling and automatic cleanup.

Rationale

  1. Performance remains the same
  2. The with statement closes the file automatically when the block exits, even if an exception occurs
  3. Prevents potential resource leaks from unclosed file handles
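
To illustrate the second point, here is a minimal sketch showing that a file opened in a with block is closed even when the block raises. The helper name read_and_fail and the temporary file are hypothetical and exist only for this demonstration; they are not part of the pipeline.

```python
import os
import tempfile


def read_and_fail(path):
    """Hypothetical helper: open path in a with block, raise mid-read,
    and return the file object so its state can be inspected."""
    try:
        with open(path) as fh:
            fh.readline()
            raise ValueError("simulated failure while processing")
    except ValueError:
        # The with block has already closed fh by the time we get here.
        pass
    return fh


# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("line 1\nline 2\n")
    tmp_path = tmp.name

handle = read_and_fail(tmp_path)
closed_after_error = handle.closed
os.remove(tmp_path)
print(closed_after_error)  # True
```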

Code Changes

# Before (no fh.close() appears in this snippet)
fh = open('nohup.out')
while not finished:
    lines = fh.readlines()
    # ... rest of the while loop ...

# After
with open('nohup.out') as fh:
    while not finished:
        lines = fh.readlines()
        # ... rest of the while loop ...
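
For comparison, the with form is roughly equivalent to wrapping the body in try/finally and calling close() by hand. The sketch below puts the two side by side; the function names and the temporary file are hypothetical and only serve the illustration.

```python
import os
import tempfile


def read_lines_try_finally(path):
    """Manual cleanup: close() runs in finally, whether or not reading fails."""
    fh = open(path)
    try:
        return fh.readlines()
    finally:
        fh.close()


def read_lines_with(path):
    """Same behavior, with the cleanup handled by the with statement."""
    with open(path) as fh:
        return fh.readlines()


# Throwaway input file so the example runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".out", delete=False) as tmp:
    tmp.write("first\nsecond\n")
    sample_path = tmp.name

manual_lines = read_lines_try_finally(sample_path)
with_lines = read_lines_with(sample_path)
os.remove(sample_path)
print(manual_lines == with_lines)  # True
```

The with version is shorter and leaves less room to forget the finally clause, which is the main argument for the change.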

Implementation Notes

  • This is a best-practice change that relies on the context manager protocol that file objects already implement
  • No functional changes to the core logic
  • Improves resource management and exception handling
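
Some of the affected code may read one file while writing another. Whether any listed file does this is an assumption, but if so, a single with statement can manage both handles. A minimal sketch with a hypothetical helper, copy_nonblank_lines:

```python
import os
import tempfile


def copy_nonblank_lines(src_path, dst_path):
    """Hypothetical helper: copy non-blank lines from src_path to dst_path.
    Both files are closed when the with block exits."""
    count = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if line.strip():
                dst.write(line)
                count += 1
    return count


# Self-contained demonstration using a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    src_path = os.path.join(tmp_dir, "in.txt")
    dst_path = os.path.join(tmp_dir, "out.txt")
    with open(src_path, "w") as fh:
        fh.write("a\n\nb\n")
    copied = copy_nonblank_lines(src_path, dst_path)
    with open(dst_path) as fh:
        result = fh.read()

print(copied)  # 2
```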

Files Affected

  1. ./pipeline/storage/cache/filesystem.py
  2. ./pipeline/storage/marklogic/ml_lexer.py
  3. ./pipeline/storage/idmap/filesystem.py
  4. ./pipeline/storage/idmap/redis.py
  5. ./pipeline/sources/internal/library/mapper.py
  6. ./pipeline/sources/internal/library/loader.py
  7. ./pipeline/sources/internal/ipch/loader.py
  8. ./pipeline/sources/internal/final/mapper.py
  9. ./pipeline/sources/external/ror/loader.py
  10. ./pipeline/sources/external/geonames/loader.py
  11. ./pipeline/sources/external/viaf/loader.py
  12. ./pipeline/sources/external/viaf/index_loader.py
  13. ./pipeline/sources/external/lc/mapper.py
  14. ./pipeline/sources/external/lc/loader.py
  15. ./pipeline/sources/external/dnb/mapper.py
  16. ./pipeline/sources/external/dnb/loader.py
  17. ./pipeline/sources/external/wikidata/loader.py
  18. ./pipeline/sources/external/wikidata/index_loader.py
  19. ./pipeline/sources/external/getty/harvester.py
  20. ./pipeline/sources/external/getty/fetcher.py
  21. ./pipeline/process/validator.py
  22. ./pipeline/process/update_manager.py
  23. ./pipeline/process/reference_manager.py
  24. ./pipeline/process/base/mapper.py
  25. ./experiments/test-gemini-results.py
  26. ./experiments/gemini-places.py
  27. ./scripts/storage/make_test_dataset.py
  28. ./scripts/storage/populate-timestamps.py
  29. ./scripts/storage/watch-mlcp.py
  30. ./scripts/storage/manage-data.py
  31. ./scripts/storage/google-sames-diffs.py
  32. ./scripts/storage/make-wd-differents.py
  33. ./scripts/storage/merge-metatypes.py
  34. ./scripts/runs/run-reconcile.py
  35. ./scripts/runs/run-merge.py
  36. ./scripts/runs/run-export.py
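
A list like the one above could be produced or re-checked with a small script built on the standard ast module. The sketch below is one possible approach, not the method actually used for this issue. It only detects calls to the bare name open (not io.open, gzip.open or Path.open), and it would also flag an open() wrapped in another context manager such as contextlib.closing. The function name find_bare_open_calls is hypothetical.

```python
import ast


def find_bare_open_calls(source):
    """Return line numbers of open() calls that are not used directly
    as the context expression of a with statement."""
    tree = ast.parse(source)

    # Collect every expression that appears as "with <expr> [as name]".
    managed = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.With, ast.AsyncWith)):
            for item in node.items:
                managed.add(id(item.context_expr))

    # Any other call to the plain name "open" is a candidate for rewriting.
    bare = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "open"
            and id(node) not in managed
        ):
            bare.append(node.lineno)
    return sorted(bare)


sample = '''fh = open("nohup.out")
lines = fh.readlines()
with open("nohup.out") as fh2:
    more = fh2.readlines()
'''
print(find_bare_open_calls(sample))  # [1]
```

Running the function over the text of each .py file in the repository would give a per-file list of lines to review by hand.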

Tasks

  • Identify files that need updating
  • Update each file in a separate branch
  • Use this opportunity to remove unnecessary files
  • Test the pipeline to make sure it continues to run as expected
@wjbmattingly wjbmattingly added the enhancement New feature to add to the code label Dec 9, 2024
@wjbmattingly wjbmattingly added the Low Low priority task label Dec 10, 2024

2 participants