Small(?) multipart upload has errors when determining mime type. #3658

Closed
chrisvanrun opened this issue Oct 25, 2024 · 3 comments · Fixed by #3661

Comments

@chrisvanrun
Contributor

chrisvanrun commented Oct 25, 2024

Sentry issue: An error occurred (InvalidRange) when calling the GetObject operation: The requested range is not satisfiable.

mimetype_from_file() uses the boto client to fetch the initial byte range; however, it seems the upload may be less than 2048 bytes long. Not sure why it is a multipart upload; I'd expect multipart uploads to have a cutoff at a certain size. Need to investigate.

A quick fix would be wrapping it in a try-except as follows:

try:
    header = self._client.get_object(
        Bucket=self.bucket,
        Key=self.key,
        Range="bytes=0-2047",
    )["Body"].read()
except self._client.exceptions.InvalidRange as e:
    # Fallback if range is out of bounds
    header = self._client.get_object(Bucket=self.bucket, Key=self.key)["Body"].read()
@jmsmkn
Member

jmsmkn commented Oct 28, 2024

The suggested error handling there is not correct; you would need to catch botocore.exceptions.ClientError. It would be something like:

try:
    header = self._client.get_object(
        Bucket=self.bucket, Key=self.key, Range="bytes=0-2047"
    )["Body"].read()
except botocore.exceptions.ClientError as error:
    if error.response["Error"]["Code"] == "InvalidRange":
        header = self._client.get_object(Bucket=self.bucket, Key=self.key)["Body"].read()
    else:
        raise error

However, I wonder if that could lead to a DoS if, in some odd partial-upload situation, the first bytes are missing. No idea here, but maybe? Either way, we can be a bit more defensive:

object_head = self._client.head_object(Bucket=self.bucket, Key=self.key)
object_size = int(object_head["ContentLength"])
max_bytes = min(2047, object_size)
header = self._client.get_object(
    Bucket=self.bucket,
    Key=self.key,
    Range=f"bytes=0-{max_bytes}",
)["Body"].read()

@jmsmkn
Member

jmsmkn commented Oct 28, 2024

Actually, I don't think this explanation makes sense. We have a test with a small file introduced in #3350.

@jmsmkn
Member

jmsmkn commented Oct 28, 2024

It is a problem with a zero bytes file.
