Package name and version: google-cloud-php-storage:v1.33.1
Steps to reproduce
Try to fetch files from a storage bucket while storage.googleapis.com is having intermittent issues and returns a server error (500 or 503) with a response body like "We encountered an internal error. Please try again." for the first request, followed by the actual response for the next request.
\Google\Cloud\Storage\Connection\Rest->downloadObject then returns mangled file content in which the beginning of the actual file data is replaced with the text "We encountered an internal error. Please try again."
The problematic code in the downloadObject function is the following:
    $requestOptions['restRetryListener'] = function (
        \Exception $e,
        $retryAttempt,
        &$arguments
    ) use (
        $resultStream,
        $requestedBytes,
        $invocationId
    ) {
        // if the exception has a response for us to use
        if ($e instanceof RequestException && $e->hasResponse()) {
            $msg = (string) $e->getResponse()->getBody();
            $fetchedStream = Utils::streamFor($msg);

            // add the partial response to our stream that we will return
            Utils::copyToStream($fetchedStream, $resultStream);

            // Start from the byte that was last fetched
            $startByte = intval($requestedBytes['startByte']) + $resultStream->getSize();
            $endByte = $requestedBytes['endByte'];

            // modify the range headers to fetch the remaining data
            $arguments[1]['headers']['Range'] = sprintf('bytes=%s-%s', $startByte, $endByte);
            $arguments[0] = $this->modifyRequestForRetry($arguments[0], $retryAttempt, $invocationId);
        }
    };
It handles every RequestException the same way: it stores $e->getResponse()->getBody(), assumes those bytes are valid object data, and then tries to resume the download by requesting only the remaining data (skipping the number of bytes already stored).
As a result, whenever an internal server error occurs and the response carries a body (not just headers), that body is injected into the file.
We discovered this through intermittent image format exceptions when fetching files from a storage bucket during a partial outage: random images had their first 51 bytes replaced with "We encountered an internal error. Please try again.".
We were quite surprised that, instead of throwing an exception during the storage.googleapis.com partial outage, the retry/resume logic started replacing data like this at random.
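To make the failure mode concrete, here is a minimal, self-contained sketch of the behaviour described above. It does not call the library; `simulateResumedDownload` is a hypothetical stand-in that assumes the first attempt fails with an error body and that the retry requests the rest of the object starting at the number of bytes already buffered.

```php
<?php
// Hypothetical stand-in for the resume logic (not library code): keep whatever
// body the failed attempt returned, then fetch the rest from that offset.
function simulateResumedDownload(string $object, string $failedAttemptBody): string
{
    $result = '';

    // First attempt fails with a 5xx; its body is treated as object data.
    $result .= $failedAttemptBody;

    // The retry asks for "bytes=<startByte>-", where startByte is the buffered size.
    $startByte = strlen($result);
    $result .= substr($object, $startByte);

    return $result;
}

$errorBody = 'We encountered an internal error. Please try again.';
$object = "\x89PNG\r\n\x1a\n" . str_repeat('A', 200); // stand-in for image bytes

$downloaded = simulateResumedDownload($object, $errorBody);

var_dump(strlen($errorBody));                      // int(51)
var_dump(strlen($downloaded) === strlen($object)); // bool(true): the size looks right
var_dump($downloaded === $object);                 // bool(false): the content is not
```

The total size still matches because the retry skips exactly as many real bytes as the error message occupies, which is why the corruption only shows up when the file is parsed.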
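The same problematic code exists in the current latest version, 1.44:
https://github.com/googleapis/google-cloud-php-storage/blob/ebdec855364c1df9e81755e9626e3ff4687263f4/src/Connection/Rest.php#L321
I understand that the point of that code is to resume a file download (useful when a network error occurs after 990 MB of a 1000 MB file has been downloaded), but any non-network errors should be ignored rather than silently injected into the file content.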
Is there any additional information I could provide? Or how to proceed from here?
I looked at the code and can see there is a test class, RetryConformanceTest, with some setup in retry_tests.json, but I am unable to run these tests locally.
What I can see is that there is no test case that handles an error 500 response containing "We encountered an internal error. Please try again.".
testDownloadAsStreamForStartBytesGiven seems to test only the happy case (return a broken stream after 256K) and does not check error 500 or anything else. It also verifies only the content size, not the actual content, which in our experience lets data corruption go unnoticed.
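The difference between the two kinds of assertion can be sketched as follows. `sizeMatches` and `contentMatches` are hypothetical helpers, not part of the library's test suite, and the data is made up for illustration.

```php
<?php
// Hypothetical test helpers; not part of the library's test suite.
function sizeMatches(string $expected, string $actual): bool
{
    return strlen($expected) === strlen($actual);
}

function contentMatches(string $expected, string $actual): bool
{
    // Comparing digests keeps failure output small for large objects.
    return hash_equals(hash('sha256', $expected), hash('sha256', $actual));
}

$expected = str_repeat('0123456789', 10); // 100 bytes of known content
$errorBody = 'We encountered an internal error. Please try again.';

// Same length as $expected, but the first 51 bytes are the error message.
$corrupted = $errorBody . substr($expected, strlen($errorBody));

var_dump(sizeMatches($expected, $corrupted));    // bool(true)
var_dump(contentMatches($expected, $corrupted)); // bool(false)
```

A size-only assertion passes on the corrupted download, so a test covering this case would need to compare content (or a digest of it) as well.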
I can also see that in RequestProcessorTrait.php there is mapping:
    case Code::UNKNOWN:
        $exception = Exception\ServerException::class;
        break;
    case Code::INTERNAL:
        $exception = Exception\ServerException::class;
        break;
One possible solution is to handle the downloadObject code like this (cannot confirm with conformance tests):
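One way such a guard could look, as a sketch only: keep the body of a failed attempt only when the response status indicates object data, and compute the next Range header from what was actually kept. The helper names `bytesToKeep` and `nextRangeHeader` are hypothetical, and the assumption that only 200 and 206 responses carry object bytes has not been checked against the conformance tests.

```php
<?php
// Hypothetical helper, not part of google-cloud-php-storage: decide which bytes
// of a failed attempt's response body may be kept as object data.
function bytesToKeep(int $statusCode, string $body): string
{
    // Assumption: only 200 (full) and 206 (range) responses carry object bytes;
    // anything else (for example 500 or 503) carries an error message.
    return in_array($statusCode, [200, 206], true) ? $body : '';
}

// Hypothetical helper: build the Range header for the next attempt.
function nextRangeHeader(int $startByte, string $endByte, int $bufferedBytes): string
{
    return sprintf('bytes=%s-%s', $startByte + $bufferedBytes, $endByte);
}

$buffer = '';

// A 500 response contributes nothing, so the retry starts from the same offset.
$buffer .= bytesToKeep(500, 'We encountered an internal error. Please try again.');
echo nextRangeHeader(0, '', strlen($buffer)), PHP_EOL; // bytes=0-

// A partial 206 body is kept, so the retry resumes after it.
$buffer .= bytesToKeep(206, 'first-chunk');
echo nextRangeHeader(0, '', strlen($buffer)), PHP_EOL; // bytes=11-
```

In the listener quoted earlier, the equivalent check would sit before the body is copied into $resultStream, so that error bodies never reach the stream returned to the caller.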