-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check registry file hash before downloading #93
Merged
Merged
Changes from 14 commits
Commits
Show all changes
15 commits
Select commit
Hold shift + click to select a range
36de2b3
Add registry hash checking
OliviaLynn dcd6095
Add registry hash checking
OliviaLynn d94bfd6
Merge branch 'issue/54/registry-hash-file' of https://github.com/linc…
OliviaLynn ced64a5
Add unit test for registry hash match; fix reg download fn a bit
OliviaLynn 9646d42
Simplify branching; add unit test
OliviaLynn e6ebbb4
Expand on http error message
OliviaLynn 8fb5d8c
Break unit tests into sep file; cover all logic branches; organize be…
OliviaLynn 9e70639
Merge branch 'issue/54/registry-hash-file' of https://github.com/linc…
OliviaLynn c288748
Add mocked file content to one of the unit tests
OliviaLynn cbb7e0a
Expand unit test mocking
OliviaLynn 9bc25ed
Add last unit test (local reg exists/fail to dl hash)
OliviaLynn c354fe6
Change patches from dectorators to context managers, which imo does i…
OliviaLynn 692d3e5
Modify error handling in data retrieval code
OliviaLynn e46196e
Simplify _check_registry_is_latest_version return
OliviaLynn 49f7936
Update var name
OliviaLynn File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nice, I like the way you've documented the logic and the tests that cover it. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,188 @@ | ||
"""There's a lot of branching logic in download_registry_from_github. | ||
|
||
I did my best to keep it simple, but it's still a little tricky. | ||
|
||
Here's how it works: | ||
|
||
Call download_registry_from_github | ||
1. Local registry file does not exist → Download remote registry file | ||
1. Successfully downloaded registry file → Exit (now have local registry) | ||
2. Fail to download registry file → raise Exception | ||
2. Local registry file exists | ||
→ Need to check if local registry is up to date | ||
→ Call _check_registry_is_latest_version | ||
1. True → Exit (confirmed local registry is up to date) | ||
2. False → Download updated version | ||
1. Successfully downloaded registry file → Exit (local registry updated) | ||
2. Fail to download registry file → raise Exception | ||
3. Failed to download hash file → raise Exception | ||
""" | ||
|
||
import os | ||
import tempfile | ||
from unittest.mock import Mock, patch | ||
|
||
import pytest | ||
import requests | ||
from lephare.data_retrieval import ( | ||
download_registry_from_github, | ||
) | ||
|
||
|
||
def test_download_registry_success(): | ||
# 1. Local registry does not exist (so no mocking needed here) | ||
# 1. Successfully downloaded registry file | ||
|
||
# Mock remote registry file | ||
mock_response = Mock() | ||
mock_response.status_code = 200 | ||
mock_response.text = "file1\nfile2\nfile3" | ||
with patch("requests.get", return_value=mock_response) as mock_get_remote_registry: # noqa: F841 | ||
with tempfile.TemporaryDirectory() as tmp_dir: | ||
registry_outfile = os.path.join(tmp_dir, "registry.txt") | ||
download_registry_from_github(outfile=registry_outfile) | ||
# Check that we can open it (and it contains expected content) | ||
with open(registry_outfile, "r") as file: | ||
assert file.read() == "file1\nfile2\nfile3" | ||
|
||
|
||
def test_download_registry_failure(): | ||
# 1. Local registry does not exist (no mocking needed) | ||
# 2. Fail to download registry file | ||
|
||
# Mock failed registry file download | ||
mock_response = Mock() | ||
mock_response.raise_for_status.side_effect = requests.exceptions.HTTPError( | ||
"404 Client Error: Not Found for url" | ||
) | ||
with patch("requests.get", return_value=mock_response) as mock_get_remote_registry: # noqa: F841 | ||
with pytest.raises(requests.exceptions.HTTPError): | ||
download_registry_from_github() | ||
|
||
|
||
def test_update_registry_hash_matches(): | ||
# 2. Local registry exists | ||
# 1. _check_registry_is_latest_version == True | ||
|
||
# Mock the local registry file existing | ||
with patch("os.path.isfile", return_value=True) as mock_local_registry_existing: # noqa: F841 | ||
# Mock local registry having a certain pooch hash | ||
with patch("pooch.file_hash", return_value="registryhash123") as mock_local_registry_hash: # noqa: F841 | ||
# Mock getting the remote registry hash file | ||
mock_response = Mock() | ||
mock_response.status_code = 200 | ||
mock_response.text = "registryhash123" | ||
with patch("requests.get", return_value=mock_response) as mock_get_remote_hash_file: | ||
# Call the function | ||
with tempfile.TemporaryDirectory() as tmp_dir: | ||
registry_outfile = os.path.join(tmp_dir, "registry.txt") | ||
download_registry_from_github( | ||
url="http://example.com/data_registry.txt", outfile=registry_outfile | ||
) | ||
# Assert that we only make 1 request for the hash file, and that we used the right url: | ||
assert mock_get_remote_hash_file.call_count == 1 | ||
assert ( | ||
mock_get_remote_hash_file.call_args[0][0] | ||
== "http://example.com/data_registry_hash.sha256" | ||
) | ||
|
||
|
||
def test_update_registry_hash_mismatches(): | ||
# 2. Local registry exists | ||
# 2. _check_registry_is_latest_version == False | ||
# 1. Successfully downloaded registry file | ||
|
||
# Mock the local registry file existing | ||
with patch("os.path.isfile", return_value=True) as mock_local_registry_existing: # noqa: F841 | ||
# Mock local registry having a certain pooch hash | ||
with patch("pooch.file_hash", return_value="registryhash123") as mock_local_registry_hash: # noqa: F841 | ||
# Mock getting the remote hash/registry files | ||
mock_hash_response = Mock() | ||
mock_hash_response.status_code = 200 | ||
mock_hash_response.text = "hash_doesn't_match123" | ||
|
||
mock_registry_response = Mock() | ||
mock_registry_response.status_code = 200 | ||
mock_registry_response.text = "file1\nfile2\nfile3" | ||
|
||
def which_mock_get(*args, **kwargs): | ||
url = args[0] | ||
if "hash.sha256" in url: | ||
return mock_hash_response | ||
else: | ||
return mock_registry_response | ||
|
||
with patch("requests.get", side_effect=which_mock_get) as mock_remote_files_get: | ||
# Call the function | ||
with tempfile.TemporaryDirectory() as tmp_dir: | ||
registry_outfile = os.path.join(tmp_dir, "registry.txt") | ||
download_registry_from_github( | ||
url="http://example.com/data_registry.txt", outfile=registry_outfile | ||
) | ||
# Checks: | ||
# One call to download the hash file, one call to download the full registry file: | ||
assert mock_remote_files_get.call_count == 2 | ||
# The following set of [0][0][0] and such is because the call args list is | ||
# [call('http://example.com/data_registry_hash.sha256', timeout=60), | ||
# call('http://example.com/data_registry.txt', timeout=60)] | ||
# and we're only interested in checking the urls: | ||
assert ( | ||
mock_remote_files_get.call_args_list[0][0][0] | ||
== "http://example.com/data_registry_hash.sha256" | ||
) | ||
assert ( | ||
mock_remote_files_get.call_args_list[1][0][0] | ||
== "http://example.com/data_registry.txt" | ||
) | ||
|
||
|
||
def test_update_registry_hash_mismatches_and_download_fails(): | ||
# 2. Local registry exists | ||
# 2. _check_registry_is_latest_version == False | ||
# 2. Fail to download registry file | ||
|
||
# Mock the local registry file existing | ||
with patch("os.path.isfile", return_value=True) as mock_local_registry_existing: # noqa: F841 | ||
# Mock local registry having a certain pooch hash | ||
with patch("pooch.file_hash", return_value="registryhash123") as mock_local_registry_hash: # noqa: F841 | ||
# Mock getting the remote hash/registry files | ||
mock_hash_response = Mock() | ||
mock_hash_response.status_code = 200 | ||
mock_hash_response.text = "hash_doesn't_match123" | ||
|
||
mock_registry_response = Mock() | ||
mock_registry_response.raise_for_status.side_effect = requests.exceptions.HTTPError( | ||
"404 Client Error: Not Found for url" | ||
) | ||
|
||
def which_mock_get(*args, **kwargs): | ||
url = args[0] | ||
if "hash.sha256" in url: | ||
return mock_hash_response | ||
else: | ||
return mock_registry_response | ||
|
||
with patch("requests.get", side_effect=which_mock_get) as mock_remote_files_get: # noqa: F841 | ||
# Check that we raise HTTPError as expected | ||
with pytest.raises(requests.exceptions.HTTPError): | ||
download_registry_from_github() | ||
|
||
|
||
def test_update_registry_hash_download_fails(): | ||
# 2. Local registry exists | ||
# 3. Fail to download registry hash file | ||
|
||
# Mock the local registry file existing | ||
with patch("os.path.isfile", return_value=True) as mock_local_registry_existing: # noqa: F841 | ||
# Mock local registry having a certain pooch hash | ||
with patch("pooch.file_hash", return_value="registryhash123") as mock_local_registry_hash: # noqa: F841 | ||
# Mock getting the remote hash/registry files | ||
mock_hash_response = Mock() | ||
mock_hash_response.raise_for_status.side_effect = requests.exceptions.HTTPError( | ||
"404 Client Error: Not Found for url" | ||
) | ||
|
||
with patch("requests.get", return_value=mock_hash_response) as mock_get_remote_hash_file: # noqa: F841 | ||
# Check that we get the expected exception | ||
with pytest.raises(requests.exceptions.HTTPError): | ||
download_registry_from_github() |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice