-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(sync): allow remote storage #2007
Conversation
@elee1766 can you also outline how one would setup a test harness for this? |
hmm, i will play around with it. @rchincha are there any rules/guidelines around your guys testing process i should be aware of ? i think a proper test harness should importantly
I think the easiest would be to use three instances of zot, one setup to be the destination, one setup to be the origin, and one setup to be the syncer. then everything can be tested from there? I saw you guys use localstack for s3, so i can use that for the storage. |
@elee1766 we have a similar harness already. Pls take a look at that also. |
code looks good to me, add the bats tests and it should be enough. Also sign your commit with gpg: https://docs.github.com/en/authentication/managing-commit-signature-verification/signing-commits commit name should look like PR name for the commit style check to pass. run |
ce8cd09
to
88839b0
Compare
I am very interested in this feature. Did you already build a publicly available container image I can test in dev ? |
eaecae6
to
d700667
Compare
you need to update makeDownstreamServer() in sync_test.go to use sync config's TmpDir add this line: destConfig.Extensions.Sync.TmpDir = destDir, so that tests, where we use chmod to trigger errors, passes. |
Just curious about this requirement. The most common use case for sync is to do a local mirror so that you are decoupled from the cloud wrt costs and connectivity. So could you enlighten us on this use case more where you want S3 as the backend. Why not just use the registries offered by the cloud providers directly? |
yeah sure, let me explain. so we use the gitlab ci + the container registry to produce tagged images there, ala v0.1.2, to create releases. we then use harbor to synchronize from the self hosted gitlab -> the registries offered by the cloud providers directly. we don't see the gitlab container registry as viable for production deployments. zot can replace both harbor and the registry offered by the cloud provider for this use case. at digitalocean, it's the same price. (a sub 20 dollar app engine app is enough to run it, and they use the same storage backend anyways) |
I see, and you will deploy apps on DO via images hosted on your private zot (synced with images from gitlab CI) |
This is a pretty edge case scenario: |
@peusebiu Pls add comments for next steps. @elee1766 do you want to make this PR ready for review (currently in draft state)? |
not yet, I still need to work on tests. have been travelling recently
…On Wed, Nov 15, 2023, 12:03 PM Ramkumar Chinchani ***@***.***> wrote:
@peusebiu <https://github.com/peusebiu> Pls add comments for next steps.
@elee1766 <https://github.com/elee1766> do you want to make this PR ready
for review (currently in draft state)?
—
Reply to this email directly, view it on GitHub
<#2007 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AARH66INA6ZDA2VVXYQQHWLYET7V7AVCNFSM6AAAAAA65DG7ASVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMJTGAYTAMBRGU>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
sch.SubmitGenerator(gen, registryConfig.PollInterval, scheduler.MediumPriority) | ||
} | ||
tmpDir := config.Extensions.Sync.TmpDir | ||
if tmpDir == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say, that by default, it should still sync temporarily under each repo, like today(main). if a TmpDir is given in config, then use that, this way all the tests should pass.
right @rchincha ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we do not want to step outside of our "storageRootDir" path. That's all the space we will own and manage, and nothing outside of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elee1766 My suggestion is, if there is a cloud backend defined in storage, then instead, do a second step of writing to that backend after the local sync is done
reiterating this :D, that's why sync_test.go fails. but I would say that by default sync should temp copy under each "repo/.sync", and not /tmp, and this way tests will pass. |
ok, changed the sync_test.go as u said, and it seems to be happier now. I able to make the bats test pass locally, although it seems it failing on ci. the joys of integration testing. ill probably have to do some witchcraft later and force push to fix the commit requirements, for now i'll mark it as not a draft though, i am feeling better about it. |
@elee1766 fyi, https://github.com/project-zot/zot/releases/tag/v2.0.0-rc7 Would definitely like to put this PR in the next/final release. |
go.mod
Outdated
@@ -1,6 +1,8 @@ | |||
module zotregistry.io/zot | |||
|
|||
go 1.20 | |||
go 1.21 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think i need it. i'll push with 1.20
f3cded9
to
069fd22
Compare
Signed-off-by: a <[email protected]>
069fd22
to
f91c8af
Compare
Signed-off-by: a <[email protected]>
ac499c8
to
e30edbd
Compare
Signed-off-by: a <[email protected]>
seems some ci tests are failing, but i can't seem to replicate this locally. what is the correct way to replicate the ci test failures locally without waiting for actual github ci ? |
@@ -15,6 +15,7 @@ type Credentials struct { | |||
type Config struct { | |||
Enable *bool | |||
CredentialsFile string | |||
TmpDir string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would remove this.
sch.SubmitGenerator(gen, registryConfig.PollInterval, scheduler.MediumPriority) | ||
} | ||
tmpDir := config.Extensions.Sync.TmpDir | ||
if tmpDir == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elee1766 My suggestion is, if there is a cloud backend defined in storage, then instead, do a second step of writing to that backend after the local sync is done
You can locally run I would say the default for this feature would be to save under zot's rootDir like today, and if an imageStore is remote and we have sync config TmpDir then we will temp sync there, then move everything on remote(s3). Otherwise zot will just throw an error, that storage is remote and for sync to work it needs a TmpDir. I would say to not start zot with remote storage+sync without TmpDir set by the user. @elee1766 @Brice187 Would this ^ meet your needs? I will take this PR locally and will try to figure out what is missing for this. |
yeah it's more than enough. |
Closing this PR. @elee1766 thanks for getting this started. |
hi, i am trying my hand at this issue. I am not sure if this is a valid solution, but it seems to be working for me. some confirmation from someone maybe more familiar with the logic would be appreciated.
What type of PR is this?
feature
Which issue does this PR fix:
#1297
What does this PR do / Why do we need it:
it uses a configurable temporary local directory to store the local oci blobs always for sync. I also rename "local" to "destination", since that is more semantically accurate now.
so apparently,
s3 + sync is currently not supported because we are syncing images in a temporary local oci layout before committing them in the storage using storage APIs.
but i think the true reason that it not supported is because they are syncing images to a temporary local oci in a path based off off the storage drivers' root directory.
I think downloading to the local filesystem always is fine, so I just initialize a new local image store at the tmpdir configured, or os.TempDir() if not. the rest of the logic all remains the same.
since
skipImage, err := service.destination.CanSkipImage(destinationRepo, tag, manifestDigest)
actually works properly with remote s3 storageso i think this should work, without the need to
implement our own copy.Image()
Testing done on this change:
so far i've just tested this manually with one image, and it seems to be working with simple docker pulls. more needs to be done.
the config i used for testing
i probably need to write some tests?
Automation added to e2e:
not any yet
Will this break upgrades or downgrades?
it shouldn't
Does this PR introduce any user-facing change?:
yes, to be written
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.