-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify the cache #817
Open
darkowlzz
wants to merge
3
commits into
main
Choose a base branch
from
refactor-cache
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Simplify the cache #817
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
darkowlzz
force-pushed
the
refactor-cache
branch
3 times, most recently
from
November 7, 2024 16:54
543974d
to
02d44f7
Compare
- Remove dependency on kubernetes. - Simplify the Store interface to only have Get, Set and Delete. - Add new RecordCacheEvent() method in the cache implementations for recording the cache events, hit or miss, along with labels of the associated flux object being reconciled. The labels name, namespace and kind for this metric are not required to be configured when creating the cache. They are predefined in the created metrics recorder, similar to the runtime metrics recorder for the rest of flux. RecordCacheEvent() has to be called by the caller of cache operations explicitly along with the label information of the associated object being reconciled. The cache no longer has the knowledge of the object being reconciled, decoupled. - With the decoupled cache and reconciling object for metrics, the storeOptions no longer need to have the generic type defined. Simplifying the usage. - Add new DeleteCacheEvent() method in the cache implementations for deleting the cache events, hit or miss, which are associated with the object being reconciled. When the reconciling object is deleted, these metrics associated with the object must also be deleted. - Updated all the tests to use simple string cache. - Get() now returns a pointer instead of a separate exists boolean. If the pointer is nil, the item is not found in the cache. - Get() takes a key and returns a cached item for it. GetByKey() is removed as Get() does the same. Signed-off-by: Sunny <[email protected]>
Custom prefix allows using multiple instances of the cache in an application with unique metrics. Signed-off-by: Sunny <[email protected]>
OCI auth token caching will be done in the OCI client. Remove the cache from auth package and its use in the integration tests. Since the cache related changes introduced the auth token expiry time to be returned from the auth provider logins, a new function LoginWithExpiry() is introduced which returns the expiry time. The original Login() remains as it is for backwards compatibility. A higher level client can use LoginWithExpiry() to obtain the TTL of the auth token to use with the cache. Signed-off-by: Sunny <[email protected]>
darkowlzz
force-pushed
the
refactor-cache
branch
from
November 8, 2024 14:06
02d44f7
to
65b1039
Compare
stefanprodan
approved these changes
Nov 9, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks @darkowlzz 🥇
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Based on the cache implementation improvement/simplification plan discussed
before, this change refactors it along the same lines with additional
improvements. See
https://gist.github.com/darkowlzz/34bc8b37d016b8aeb1de6660173652eb for the
initial improvement idea. It also updates oci/auth package which used the cache,
to remove the auth token caching. Auth token caching will be done at a higher
level by an OCI client.
At the time of writing the initial improvement plan, I was focused on the cache
interface and the exposed API. But while implementing the changes, I realized
how tightly the cache and their metrics were coupled, which may have contributed
to the verbose API. I'll explain some of it below for the record along with some
other explanations related to the changes.
dependency on kubernetes
The initial implementation was based on client-go's cache.Store interface and
was implemented around kubernetes style objects with object metadata. Due to
this, the implementation used kubernetes related go packages. Any user of the
cache, for example the git package for caching the git auth tokens, when
importing the cache, also imports all the kubernetes dependencies. This is
undesirable to keep simple packages independent of kubernetes dependencies. The
simplification changes the design around objects and makes it based on generic
value.
verbose interface
The initial Store interface, which the cache implementations implement, had too
many behaviors defined. It seemed to leak the implementation detail on the
interface. A cache interface needs ways to perform Get, Set and Delete. All the
other operations like resize, listing the keys, etc can be done per
implementation or build on top of the simple base interface if needed. It's
possible that we may have use case for some of the extra behaviors in the
future, but at present, we don't seem to have them. To simplify the interface,
the Store interface now has Get(), Set() and Delete() which work on string key
and generic type value.
complexity due to StoreObject
Because the initial implementation was based on kubernetes like objects,
any/most operations require building a StoreObject which is passed to the
cache for setting Get/Set/Delete. This made the cache usage code complex for
simple use. Most of the cache use cases in flux will store string or byte slice.
There's no known use case for caching kubernetes style object. Which makes the
need to create StoreObject for every use unnecessary. The new implementation
explicitly requires a string key and generic type value.
cache metrics
The initial cache was implemented to be self-instrumented. For all the
operations, it records metrics on its own. But cache event metrics which
recorded cache hit or miss were defined to be associated with object being
reconciled, similar to the metrics in the runtime package. It is desirable to
provide metrics for cache usage by flux object reconciliation. Since the initial
implementation required a StoreObject for every operation, it extracted the
associated reconciling object details from the StoreObject being passed. This
creates a strong coupling between the cache and the metrics, which is
reflected in the API. It's a nice property but at the cost of complexity on the
API. In addition, the implementation wasn't consistent in regards to the
associated object information on the metric. Some operations like GetByKey()
didn't consider the need for the reconciled object.
The new implementation decouple the cache and the metrics, making the metrics
that require the associated reconciling object to be explicitly called by the
caller who may have the information about the reconciling object.
The new implementation also introduced methods to delete the metrics when the
associated objects are deleted, like we do for any metric that's associated with
flux objects.
cache metrics prefix
The cache
Options
now support setting a metrics prefix to be added to all themetrics registered by a cache. This is to prevent metrics name conflict when
using multiple instances of the cache in a program with the same metrics
registerer. Refer fluxcd/source-controller#1644 for more details.
remove cache from OCI auth
Initially, the cache was used in OCI auth package to cache the obtained auth
token. But later on, it was decided to not cache at the login manager level and
let the caller of such login functions to perform the cache as they have more
context about the scenario. #776 introduced
the cache in oci/auth package along with expiry time from the auth providers.
The cache is now removed and to return the expiry time to the caller, a new
function
LoginWithExpiry()
is introduced. A auth token caching client can callLoginWithExpiry()
to obtain the TTL of the obtained token and use it in thecache.
Following is a list of overall changes in brief:
exists
boolean. If the pointer is nil, the item is not found in the cache.OCI integration test run with the above changes: https://github.com/fluxcd/pkg/actions/runs/11743637711