-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bug-1827506: Implement the storage interface for GCS. #2957
Merged
Merged
Changes from all commits
Commits
Show all changes
19 commits
Select commit
Hold shift + click to select a range
97c609b
bug-1564452: Simplify StorageError class.
smarnach ce4b234
bug-1827506: Add GCS implementation of StorageInterface.
smarnach 1f6fa08
bug-1827506: Add basic GCSStorage test.
smarnach 03f783f
bug-1827506: Make SymbolStorage backend-agnostic.
smarnach 6e90872
bug-1827506: Refactor storage backend tests.
smarnach 190f926
bug-1827506: Make SymbolStorage tests run with both GCS and S3 backends.
smarnach 1301d70
bug-1827506: Make download tests run with both GCS and S3 backends.
smarnach 67bad26
bug-1827506: Inline and remove create_storage_backend fixture.
smarnach a699f44
bug-1827506: Make upload tests run with both GCS and S3 backends.
smarnach 655585a
bug-1908868: Use structured configuration for backends.
smarnach 1dddeeb
fixup! bug-1908868: Use structured configuration for backends.
smarnach 8a82839
bug-1827506: Prevent decompressive transcoding for GCS downloads.
smarnach 2ec435d
fixup! bug-1908868: Use structured configuration for backends.
smarnach b34a5be
bug-1827506: de-obscure StorageError
willkg f94f899
bug-1827506: move test data into subdirectory
willkg e90ff03
bug-1827506: move sym header extraction, rework tests
willkg 08c8c25
bug-1827506: Include the try_symbols flag in the storage backend repr.
smarnach 669f9fc
bug-1827506: Log backend errors in the Dockerflow health check.
smarnach ea0ff03
bug-1827506: Rename storage timeout settings.
smarnach File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# This Source Code Form is subject to the terms of the Mozilla Public | ||
# License, v. 2.0. If a copy of the MPL was not distributed with this | ||
# file, You can obtain one at https://mozilla.org/MPL/2.0/. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,148 @@ | ||
# This Source Code Form is subject to the terms of the Mozilla Public | ||
# License, v. 2.0. If a copy of the MPL was not distributed with this | ||
# file, You can obtain one at https://mozilla.org/MPL/2.0/. | ||
|
||
from io import BufferedReader | ||
import threading | ||
from typing import Optional | ||
from urllib.parse import quote | ||
|
||
from django.conf import settings | ||
|
||
from google.api_core.client_options import ClientOptions | ||
from google.api_core.exceptions import ClientError, NotFound | ||
from google.cloud import storage | ||
|
||
from tecken.libstorage import ObjectMetadata, StorageBackend, StorageError | ||
|
||
|
||
class GCSStorage(StorageBackend): | ||
""" | ||
An implementation of the StorageBackend interface for Google Cloud Storage. | ||
""" | ||
|
||
def __init__( | ||
self, | ||
bucket: str, | ||
prefix: str, | ||
try_symbols: bool = False, | ||
endpoint_url: Optional[str] = None, | ||
): | ||
self.bucket = bucket | ||
self.prefix = prefix | ||
self.try_symbols = try_symbols | ||
self.endpoint_url = endpoint_url | ||
self.clients = threading.local() | ||
# The Cloud Storage client doesn't support setting global timeouts for all requests, so we | ||
# need to pass the timeout for every single request. the default timeout is 60 seconds for | ||
# both connecting and reading from the socket. | ||
self.timeout = (settings.STORAGE_CONNECT_TIMEOUT, settings.STORAGE_READ_TIMEOUT) | ||
|
||
def __repr__(self): | ||
return f"<{self.__class__.__name__} gs://{self.bucket}/{self.prefix}> try:{self.try_symbols}" | ||
|
||
def _get_client(self) -> storage.Client: | ||
"""Return a thread-local low-level storage client.""" | ||
if not hasattr(self.clients, "client"): | ||
options = ClientOptions(api_endpoint=self.endpoint_url) | ||
self.clients.client = storage.Client(client_options=options) | ||
return self.clients.client | ||
|
||
def _get_bucket(self) -> storage.Bucket: | ||
"""Return a thread-local low-level storage bucket client.""" | ||
if not hasattr(self.clients, "bucket"): | ||
client = self._get_client() | ||
try: | ||
self.clients.bucket = client.get_bucket( | ||
self.bucket, timeout=self.timeout | ||
) | ||
except NotFound as exc: | ||
raise StorageError(str(exc), backend=self) from exc | ||
return self.clients.bucket | ||
|
||
def exists(self) -> bool: | ||
"""Check that this storage exists. | ||
|
||
:returns: True if the storage exists and False if not | ||
|
||
:raises StorageError: an unexpected backend-specific error was raised | ||
""" | ||
try: | ||
self._get_bucket() | ||
except StorageError: | ||
return False | ||
except ClientError as exc: | ||
raise StorageError(str(exc), backend=self) from exc | ||
return True | ||
|
||
def get_download_url(self, key: str) -> str: | ||
"""Return the download URL for the given key.""" | ||
endpoint_url = self.endpoint_url or self._get_client().meta.endpoint_url | ||
endpoint_url = endpoint_url.removesuffix("/") | ||
return f"{endpoint_url}/{self.bucket}/{self.prefix}/{quote(key)}" | ||
|
||
def get_object_metadata(self, key: str) -> Optional[ObjectMetadata]: | ||
"""Return object metadata for the object with the given key. | ||
|
||
:arg key: the key of the symbol file not including the prefix, i.e. the key in the format | ||
``<debug-file>/<debug-id>/<symbols-file>``. | ||
|
||
:returns: An OjbectMetadata instance if the object exist, None otherwise. | ||
|
||
:raises StorageError: an unexpected backend-specific error was raised | ||
""" | ||
bucket = self._get_bucket() | ||
gcs_key = f"{self.prefix}/{key}" | ||
try: | ||
blob = bucket.get_blob(gcs_key, timeout=self.timeout) | ||
if not blob: | ||
return None | ||
except ClientError as exc: | ||
raise StorageError(str(exc), backend=self) from exc | ||
gcs_metadata = blob.metadata or {} | ||
original_content_length = gcs_metadata.get("original_size") | ||
if original_content_length is not None: | ||
try: | ||
original_content_length = int(original_content_length) | ||
except ValueError: | ||
original_content_length = None | ||
metadata = ObjectMetadata( | ||
download_url=blob.public_url, | ||
content_type=blob.content_type, | ||
content_length=blob.size, | ||
content_encoding=blob.content_encoding, | ||
original_content_length=original_content_length, | ||
original_md5_sum=gcs_metadata.get("original_md5_hash"), | ||
last_modified=blob.updated, | ||
) | ||
return metadata | ||
|
||
def upload(self, key: str, body: BufferedReader, metadata: ObjectMetadata): | ||
"""Upload the object with the given key and body to the storage backend. | ||
|
||
:arg key: the key of the symbol file not including the prefix, i.e. the key in the format | ||
``<debug-file>/<debug-id>/<symbols-file>``. | ||
:arg body: A stream yielding the symbols file contents. | ||
:arg metadata: An ObjectMetadata instance with the metadata. | ||
|
||
:raises StorageError: an unexpected backend-specific error was raised | ||
""" | ||
bucket = self._get_bucket() | ||
blob = bucket.blob(f"{self.prefix}/{key}") | ||
gcs_metadata = {} | ||
if metadata.original_content_length: | ||
# All metadata values must be strings. | ||
gcs_metadata["original_size"] = str(metadata.original_content_length) | ||
if metadata.original_md5_sum: | ||
gcs_metadata["original_md5_hash"] = metadata.original_md5_sum | ||
blob.metadata = gcs_metadata | ||
blob.content_type = metadata.content_type | ||
blob.content_encoding = metadata.content_encoding | ||
# Prevent "decompressive transcoding" on download. See bug 1827506. | ||
blob.cache_control = "no-transform" | ||
try: | ||
blob.upload_from_file( | ||
body, size=metadata.content_length, timeout=self.timeout | ||
) | ||
except ClientError as exc: | ||
raise StorageError(str(exc), backend=self) from exc |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This gets used by the "upload now" ui view. I bet we could remove this ui view and the API endpoints it uses. I wrote up a bug to consider doing that.
https://bugzilla.mozilla.org/show_bug.cgi?id=1909013