support putifabsent #8428

Open
wants to merge 36 commits into
base: master

ItamarYuran
Contributor

Closes #8380

Change Description

Added support for the If-None-Match header.
The header can be used with putObject or complete-multipart-upload operations.

Changes were tested manually. If the object already exists, the request fails with HTTP status 304, as with the AWS CLI.

@ItamarYuran ItamarYuran added the include-changelog PR description should be included in next release changelog label Dec 17, 2024
@ItamarYuran ItamarYuran requested a review from a team December 17, 2024 18:03
@ItamarYuran ItamarYuran linked an issue Dec 17, 2024 that may be closed by this pull request

E2E Test Results - DynamoDB Local - Local Block Adapter

13 passed


github-actions bot commented Dec 17, 2024

E2E Test Results - Quickstart

11 passed

Contributor

@Isan-Rivkin Isan-Rivkin left a comment

Thanks, added 2 important comments.

ErrObjectExists: {
Code: "ErrObjectExists",
Description: "Object path already exists in DB",
HTTPStatusCode: http.StatusNotModified,
Contributor

The correct error is PreconditionFailed with status code 412 (StatusPreconditionFailed).
As for the description, the exact AWS message is:

At least one of the pre-conditions you specified did not hold

Let's stick to what AWS returns.
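A minimal sketch of what the corrected entry could look like. The `apiError` type below is a hypothetical stand-in for the gateway's real error table entry type, which is not shown in this thread; only the code, description, and status come from the comment above.

```go
package main

import (
	"fmt"
	"net/http"
)

// apiError is a hypothetical stand-in for the gateway's error entry type.
type apiError struct {
	Code           string
	Description    string
	HTTPStatusCode int
}

// errPreconditionFailed uses the code, message, and status quoted in the review.
var errPreconditionFailed = apiError{
	Code:           "PreconditionFailed",
	Description:    "At least one of the pre-conditions you specified did not hold",
	HTTPStatusCode: http.StatusPreconditionFailed, // 412
}

func main() {
	fmt.Printf("%d %s: %s\n",
		errPreconditionFailed.HTTPStatusCode,
		errPreconditionFailed.Code,
		errPreconditionFailed.Description)
}
```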

@@ -298,6 +298,11 @@ func handlePut(w http.ResponseWriter, req *http.Request, o *PathOperation) {
o.Incr("put_object", o.Principal, o.Repository.Name, o.Reference)
storageClass := StorageClassFromHeader(req.Header)
opts := block.PutOpts{StorageClass: storageClass}
err := o.checkIfAbsent(req)
Contributor

@Isan-Rivkin Isan-Rivkin Dec 19, 2024

The intention is right, but the approach is not.
In the current version, checkIfAbsent implements the check by doing a Get on the object and then a Put. This will not work because the Get->Put sequence is not atomic: another writer can create the object between the two calls.
The point of the feature is that AWS implements the precondition and provides atomicity out of the box.

Instead, our S3 Gateway should only forward the request to AWS and let AWS handle the precondition check.
Example implementation:

  1. Add a field to the options, opts := block.PutOpts{StorageClass: storageClass, IfNoneMatch} (one line above), that is set when the header is present.
  2. Let the opts propagate down to the blockstore adapter, then pass the value as an option when calling PutObject. There is a challenge here, see below 👇.

Challenge with PutObject call:

Our AWS SDK version is outdated, so it does not include the built-in IfNoneMatch option available in newer versions.

We have 2 options:

a. Upgrade the AWS SDK (preferable) after verifying there are no breaking changes (separate PR)
b. Use the request middleware options that the current AWS SDK version provides to propagate the header.

NOTE: The same should apply to multipart uploads.

@ItamarYuran ItamarYuran mentioned this pull request Dec 19, 2024
@@ -325,3 +338,20 @@ func handlePut(w http.ResponseWriter, req *http.Request, o *PathOperation) {
o.SetHeader(w, "ETag", httputil.ETag(blob.Checksum))
w.WriteHeader(http.StatusOK)
}

func (o *PathOperation) checkIfAbsent(req *http.Request) (bool, error) {
Header := req.Header.Get(IfNoneMatchHeader)
Contributor

  • Use lowercase for variable names
  • Be explicit in variable name
Suggested change
- Header := req.Header.Get(IfNoneMatchHeader)
+ headerValue := req.Header.Get(IfNoneMatchHeader)

@@ -298,6 +298,15 @@ func handlePut(w http.ResponseWriter, req *http.Request, o *PathOperation) {
o.Incr("put_object", o.Principal, o.Repository.Name, o.Reference)
storageClass := StorageClassFromHeader(req.Header)
opts := block.PutOpts{StorageClass: storageClass}
allowOverWrite, err := o.checkIfAbsent(req)
Contributor

For future readers, please add a comment explaining this, saying something like (or exactly) the following; otherwise it's confusing: this call does not actually perform the core verification, it is more of an optimization (there is no atomicity here).

@@ -325,3 +338,20 @@ func handlePut(w http.ResponseWriter, req *http.Request, o *PathOperation) {
o.SetHeader(w, "ETag", httputil.ETag(blob.Checksum))
w.WriteHeader(http.StatusOK)
}

func (o *PathOperation) checkIfAbsent(req *http.Request) (bool, error) {
Contributor

I find this function confusing because it does 3 different things:

  • It checks and validates the header value.
  • It gets the entry from the catalog.
  • It returns an allowOverwrite indicator for the real code elsewhere that checks whether the object is absent.

This is confusing because the name is a "lie": the function only partially checks ifAbsent, as an optimization, since the real check is performed later, and it also does input validation.

I would prefer something much more explicit, such as a function that extracts and validates the header, followed by an inline check of whether the object exists:

// checkIfAbsent validates the header value, if set, and returns allowOverwrite
allowOverwrite, err := o.checkIfAbsent(req)
if err != nil {
    // ...
}
if !allowOverwrite {
    // first check whether the object exists, as an optimization to save resources
    _, err := o.Catalog.GetEntry(req.Context(), o.Repository.Name, o.Reference, o.Path, catalog.GetEntryParams{})
    // handle if err != nil ...
}

@@ -298,6 +298,15 @@ func handlePut(w http.ResponseWriter, req *http.Request, o *PathOperation) {
o.Incr("put_object", o.Principal, o.Repository.Name, o.Reference)
storageClass := StorageClassFromHeader(req.Header)
opts := block.PutOpts{StorageClass: storageClass}
allowOverWrite, err := o.checkIfAbsent(req)
Contributor

Suggested change
- allowOverWrite, err := o.checkIfAbsent(req)
+ allowOverwrite, err := o.checkIfAbsent(req)

Labels
include-changelog PR description should be included in next release changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support putIfAbsent
2 participants