feat(startup): update logic for metadb update on startup, skip unmodi… #2024

Merged
merged 1 commit into from
Nov 16, 2023

laurentiuNiculae
Contributor

…fied repos

What type of PR is this?

Which issue does this PR fix:

What does this PR do / Why do we need it:

If an issue # is not available please add repro steps and logs showing the issue:

Testing done on this change:

Automation added to e2e:

Will this break upgrades or downgrades?

Does this PR introduce any user-facing change?:


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@laurentiuNiculae laurentiuNiculae force-pushed the metadb-startup-update branch 7 times, most recently from ef00ada to 6244d31 Compare November 10, 2023 15:16

codecov bot commented Nov 10, 2023

Codecov Report

Attention: 59 lines in your changes are missing coverage. Please review.

Comparison is base (60eaf7b) 92.03% compared to head (9f32c47) 91.81%.

| Files | Patch % | Lines |
|---|---|---|
| pkg/meta/boltdb/boltdb.go | 68.86% | 22 Missing and 11 partials ⚠️ |
| pkg/meta/dynamodb/dynamodb.go | 84.02% | 15 Missing and 8 partials ⚠️ |
| pkg/storage/imagestore/imagestore.go | 78.57% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2024      +/-   ##
==========================================
- Coverage   92.03%   91.81%   -0.22%     
==========================================
  Files         164      164              
  Lines       27717    27992     +275     
==========================================
+ Hits        25509    25702     +193     
- Misses       1635     1691      +56     
- Partials      573      599      +26     


@laurentiuNiculae laurentiuNiculae force-pushed the metadb-startup-update branch 5 times, most recently from 5253414 to 7c9ff33 Compare November 13, 2023 16:11
@andaaron andaaron linked an issue Nov 14, 2023 that may be closed by this pull request
@laurentiuNiculae laurentiuNiculae marked this pull request as ready for review November 14, 2023 08:21
@laurentiuNiculae laurentiuNiculae force-pushed the metadb-startup-update branch 2 times, most recently from b6a7c74 to 186d6c7 Compare November 14, 2023 10:36
Collaborator

lgtm

Contributor

@andaaron andaaron left a comment

Overall the approach looks good, but what happens if the contents of the pre-existing buckets do not match the current data?

For example, what if you upgrade zot to a binary built from this PR and the repos in the DB do not have the last-updated data/buckets?

pkg/test/oci-utils/store.go (outdated, resolved)
pkg/storage/imagestore/imagestore.go (resolved)
pkg/meta/parse.go (outdated, resolved)
pkg/meta/boltdb/boltdb_test.go (resolved)
@rchincha
Contributor

@laurentiuNiculae can you summarize your changes/thoughts in this PR ... ideally in the commit msg itself.

@laurentiuNiculae
Contributor Author

> @laurentiuNiculae can you summarize your changes/thoughts in this PR ... ideally in the commit msg itself.

Now we parse only the repos that were updated after the server was shut down.

To do this we:

  • Store in MetaDB the timestamp of the repo's last modification; I'll call it DBLastUpdated
  • Compare that timestamp with the current "LastModified" time given by the storage (1); I'll call it StorageLastUpdated
  • If StorageLastUpdated is after DBLastUpdated, we must parse this repo and update MetaDB; otherwise we can skip it.

(1) LastModified refers to the last modified value of the index.json file for the repo
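
The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `repoTimes` type, the helper names, and the sample data are all hypothetical, and `storageLastUpdated` assumes a plain local filesystem layout rather than zot's storage abstraction. It also assumes that a repo with no recorded DB timestamp (zero time) should always be parsed.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"time"
)

// repoTimes pairs the two timestamps named in the comment above.
// The type itself is hypothetical and exists only for this sketch.
type repoTimes struct {
	DBLastUpdated      time.Time // last update recorded in MetaDB
	StorageLastUpdated time.Time // last modified time of the repo's index.json
}

// storageLastUpdated reads the modification time of index.json for a repo
// stored on the local filesystem (assumption: real storage may differ).
func storageLastUpdated(repoDir string) (time.Time, error) {
	info, err := os.Stat(filepath.Join(repoDir, "index.json"))
	if err != nil {
		return time.Time{}, err
	}
	return info.ModTime(), nil
}

// needsParse reports whether storage changed after MetaDB's last record.
// With a zero DBLastUpdated, any non-zero storage time yields true.
func needsParse(t repoTimes) bool {
	return t.StorageLastUpdated.After(t.DBLastUpdated)
}

// reposToParse returns, in sorted order, the repos that must be re-parsed.
func reposToParse(repos map[string]repoTimes) []string {
	var out []string
	for name, t := range repos {
		if needsParse(t) {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

// exampleRepos builds made-up sample data for the demonstration.
func exampleRepos() map[string]repoTimes {
	recorded := time.Date(2023, 11, 10, 12, 0, 0, 0, time.UTC)
	return map[string]repoTimes{
		// Storage is older than the DB record: skipped.
		"alpine": {DBLastUpdated: recorded, StorageLastUpdated: recorded.Add(-time.Hour)},
		// Storage changed after the DB record: parsed.
		"busybox": {DBLastUpdated: recorded, StorageLastUpdated: recorded.Add(time.Hour)},
		// No DB record at all (zero time): parsed.
		"new-repo": {StorageLastUpdated: recorded},
	}
}

func main() {
	fmt.Println(reposToParse(exampleRepos())) // prints [busybox new-repo]
}
```

Note that in this sketch equal timestamps count as unmodified, since `After` is a strict comparison.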

@rchincha
Contributor

rchincha commented Nov 15, 2023

Does boltdb do internal checksums to detect data corruption?
If the data is corrupted, we should repair it.

https://github.com/boltdb/bolt/blob/master/errors.go#L23C47-L25C1

…fied repos

- MetaDB stores the time of the last update of a repo
- During startup we check if the layout has been updated after the last recorded change in the db
- If this is the case, the repo is parsed and updated in the DB otherwise it's skipped

Signed-off-by: Laurentiu Niculae <[email protected]>
Contributor

@rchincha rchincha left a comment

lgtm

@rchincha rchincha merged commit 4fb1e75 into project-zot:main Nov 16, 2023
31 of 33 checks passed
Development

Successfully merging this pull request may close these issues.

[Feat]: metadb reconciling takes a long time at startup
4 participants