-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove token #528
Remove token #528
Conversation
Thanks for catching and redacting the token! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Doug! My bad on the token issue, and for failing to coordinate. It slipped my mind that you had mentioned this, but now I recall discussing with @mfisher87 the need to handle this very situation.
However, instead of simply hand-modifying the cassette, I'd like to see proper request and response filtering. Without using the vcrpy capabilities for scrubbing sensitive information, if someone were to re-record this cassette, they would run the risk of committing sensitive information again, if they didn't realize (like me) that sensitive information was being recorded.
See:
Yup. I looked into using |
Ok. I'll take a look. If I can't sort it out in short order, I'll let you know and we can approve this redaction in the meantime. If that's the case, then I'll open a new issue to later address this via vcrpy functionality. |
Sorry, I should have been more specific. I can keep looking at it. Just not immediately. |
Great catch, Doug! |
One way that might help here is to put PRs into, or back into, draft status if there's planned future work but the PR appears to be in ready-for-review or ready-to-merge state. |
This is a 'preliminary pull request' because I wanted to get your eyes on the |
Thanks Doug! I'll take an initial pass over your updates. |
tests/unit/test_results.py
Outdated
("client_id", "foo"), | ||
] | ||
|
||
def redact_key_values(keys_to_redact): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice use of a higher-order function here! I'd suggest just making this a top level function. No need to nest this within the _get_vcr
method, nor within the TestResults
class at all.
tests/unit/test_results.py
Outdated
def before_record_response(response): | ||
# Only do this if the response has not been recorded | ||
string_body = response["body"]["string"].decode("utf8") | ||
if REDACTED_STRING not in string_body: | ||
# Only do this is if the body contains one or more | ||
# of the keys to redact. | ||
if any(key in string_body for key in keys_to_redact): | ||
try: | ||
is_list = False | ||
# Marshall into json object, if it is a JSON object. | ||
payload = json.loads(string_body) | ||
if isinstance(payload, list): | ||
payload = payload[0] | ||
is_list = True | ||
for key in keys_to_redact: | ||
if key in payload: | ||
# Redact the key value | ||
payload[key] = REDACTED_STRING | ||
# Write out the updated json object to the response | ||
# body string. | ||
if is_list: | ||
payload = [payload] | ||
response["body"]["string"] = json.dumps(payload).encode() | ||
except ValueError: | ||
# If it is not a json object, return | ||
return response | ||
|
||
return response |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To simplify this logic a bit, I suggest not bothering to avoid redacting an already redacted payload. This redaction is idempotent, so redacting an already redacted payload yields the same result, and in the context of running unit tests, the performance impact is likely unnoticeable.
def before_record_response(response): | |
# Only do this if the response has not been recorded | |
string_body = response["body"]["string"].decode("utf8") | |
if REDACTED_STRING not in string_body: | |
# Only do this is if the body contains one or more | |
# of the keys to redact. | |
if any(key in string_body for key in keys_to_redact): | |
try: | |
is_list = False | |
# Marshall into json object, if it is a JSON object. | |
payload = json.loads(string_body) | |
if isinstance(payload, list): | |
payload = payload[0] | |
is_list = True | |
for key in keys_to_redact: | |
if key in payload: | |
# Redact the key value | |
payload[key] = REDACTED_STRING | |
# Write out the updated json object to the response | |
# body string. | |
if is_list: | |
payload = [payload] | |
response["body"]["string"] = json.dumps(payload).encode() | |
except ValueError: | |
# If it is not a json object, return | |
return response | |
return response | |
def redact(payload): | |
for key in keys_to_redact: | |
if key in payload: | |
payload[key] = "REDACTED" | |
return payload | |
def before_record_response(response): | |
body = response["body"]["string"].decode("utf8") | |
with contextlib.suppress(json.JSONDecodeError): | |
payload = json.loads(body) | |
redacted_payload = ( | |
list(map(redact, payload)) | |
if isinstance(payload, list) | |
else redact(payload) | |
) | |
response["body"]["string"] = json.dumps(redacted_payload).encode() | |
return response |
Also, please move the entirety of redact_key_values
to the top level. It does not need to be nested with the _get_vcr
method. The nesting just adds unnecessary clutter.
Same goes for the nested redact_login_request
below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oddly, I move this method out of TestCase it no longer gets called. Investigating.
I've made all the suggestions changes with the exception of moving |
@doug-newman-nasa, I put these at the top level in def redact_login_request(request):
if "/api/users/" in request.path and "/api/users/tokens" not in request.path:
_, user_name = os.path.split(request.path)
request.uri = request.uri.replace(user_name, REDACTED_STRING)
return request
def redact_key_values(keys_to_redact):
def redact(payload):
for key in keys_to_redact:
if key in payload:
payload[key] = REDACTED_STRING
return payload
def before_record_response(response):
body = response["body"]["string"].decode("utf8")
with contextlib.suppress(json.JSONDecodeError):
payload = json.loads(body)
redacted_payload = (
list(map(redact, payload))
if isinstance(payload, list)
else redact(payload)
)
response["body"]["string"] = json.dumps(redacted_payload).encode()
return response
return before_record_response Give it a shot and let me know what you see. Also, I'd like to request a minor tweak to an assertion in the function So instead of this assertion: self.assertEqual(len(granules), 2520) let's do this, which will be less strict, but still validate the results correctly: self.assertIn(len(granules), range(2001, 3001)) |
Regarding |
Perfect! Even better! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic! Thanks Doug!
I've approved this, but does anybody else want to review this? BTW, I'd like to suggest that we squash merge PRs to have a tidier, more readable change history. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry Doug. One last tweak, please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you sir! Nice work.
Last call for someone else to review. If I don't get any takers today, I'll proceed with a squash merge near COB US Eastern. |
The new cassette for the test_data_links has some information in it pertaining to the user who executed the test. This PR redacts it.
BTW - at the hackathon I stated that I was going to add a VCR cassette for test_data_links which I attempted to PR today only to find the same work had been merged three days ago. We need to coordinate better.
📚 Documentation preview 📚: https://earthaccess--528.org.readthedocs.build/en/528/