
refactoring loading cache and metrics #127

Open · wants to merge 1 commit into master

Conversation

@mjburghoffer (Author) commented Jun 21, 2023

I am introducing some refactorings which accomplish the following:

Loading Cache

  • Changed all caches to have a `get(key, loader)` method on them, with the loader being required (see the sketch after this list).
  • Loading caches allow the loader to be optional, as they have a default loader on the cache itself.
  • Loaders can return null or undefined in order to not cache their result.
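
Roughly, the shape this gives every cache (an illustrative sketch, not the exact types from the change):

// Illustrative only; not the exact types from this PR
type Loader<K, V> = (key: K) => Promise<V | null | undefined>;

interface Cache<K, V> {
	// Every cache takes a loader; the result is only cached when the
	// loader returns an actual value
	get(key: K, loader: Loader<K, V>): Promise<V | undefined>;
}

interface LoadingCache<K, V> {
	// A loading cache has a default loader, so the argument is optional
	get(key: K, loader?: Loader<K, V>): Promise<V | undefined>;
}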

Rationale

Firstly, this allows caches to more effectively implement the `get` interface, such as in the case of a bounded cache, which can avoid a lot of work by interfacing with the `CacheNode` directly rather than wastefully calling `getIfPresent`, which does a lot of extra work.

Secondly, this makes sense from the perspective that there was nothing special about `builder.loading()`, since it didn't actually add any additional state or context onto the cache.

This also has the benefit of flattening the delegation hierarchy, removing the `DefaultLoadingCache` class altogether.

The change to `get` not caching null results means that developers no longer need to throw an Error in order to abort caching the loaded value.
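
For example (`cache` and `lookupUser` here are just placeholders):

// Hypothetical usage; lookupUser stands in for the caller's own lookup
const value = await cache.get('user:123', async key => {
	const result = await lookupUser(key);
	// Returning undefined skips caching; no need to throw to abort
	return result ?? undefined;
});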

Metrics

  • Metrics are no longer controlled by a delegating cache wrapper.

Rationale

This reduces the complexity of the delegation hierarchy, which should make things easier from a coding perspective, but also from a performance one: the more delegate method hops there are, the slower the cache.
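
To illustrate the difference (a simplified sketch, not the library's actual MetricsCache):

// Simplified sketch; not the actual implementation
type Metrics = { hits: number; misses: number };

// Before: a delegating wrapper adds a method hop to every call
class MetricsWrapper<K, V> {
	public metrics: Metrics = { hits: 0, misses: 0 };

	constructor(private parent: { getIfPresent(key: K): V | undefined }) {}

	getIfPresent(key: K): V | undefined {
		const value = this.parent.getIfPresent(key);
		this.metrics[value !== undefined ? 'hits' : 'misses']++;
		return value;
	}
}

// After: the cache records its own metrics directly, no extra hop
class SelfMeteredCache<K, V> {
	public metrics: Metrics = { hits: 0, misses: 0 };
	private values = new Map<K, V>();

	getIfPresent(key: K): V | undefined {
		const value = this.values.get(key);
		this.metrics[value !== undefined ? 'hits' : 'misses']++;
		return value;
	}
}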

@mjburghoffer (Author) commented:

@aholstenson do you maintain this project anymore?

@aholstenson (Owner) commented:

Thanks for the PR! Can you give me more information on what your use case is here? The PR being a large commit makes this hard to review quickly, but I'll give my initial thoughts.

  1. At first glance it looks like the same amount of work is being done when `get(key, loader)` is called, as `getNodeIfPresent` does that work instead of `getIfPresent`.
  2. Composition is important to me; I think every cache having a `get(key, loader?)` seems reasonable, but I'd prefer the automatic loading to be in a `LoadingCache` class with how 2.x is currently put together, mostly so it's possible to do: `new LoadingCache({ parent: parentCache, loader: ... })`
  3. The complexity of `ExpirationCache` seems to have increased a tiny bit here as well, which worries me a bit from a maintenance point of view.
  4. Changing metrics may make sense, but for how the current 2.x works `MetricsCache` has to still exist.

Looking at it I do have some thoughts about how the library is constructed, and that it might be time to think about a better design for a 3.x version. One of the things I regret in the current version is the builder: it's convenient, but if you are targeting a browser and want smaller bundles, you quickly end up having to construct caches yourself, and the experience there isn't great.

One of my thoughts here would be to target an API closer to this for creating unbounded vs bounded caches:

// Unbounded cache
const cache = new Cache({
	storage: new UnboundedStorage(),
});

// A cache using LFU eviction
const cache = new Cache({
	storage: new LFUStorage({ 
		maxSize: 100,
	}),
});

As loading caches need that get with an optional loader they'd be a separate class:

const cache = new LoadingCache({
	storage: new LFUStorage({
		maxSize: 100,
	}),
	loader: ...
});

// So it's possible to use get without a loader
await cache.get('key');

This would make expiration and metrics options instead, something similar to this for both Cache and LoadingCache:

new Cache({
	storage: new LFUStorage({ 
		maxSize: 100,
	}),
	expiration: expireAfterWrite(1000),
	metrics: new StandardMetrics(),
})

I'm very interested to hear about your thoughts and use case here and if such a design would work for you.

@mjburghoffer (Author) commented Jul 7, 2023

> Thanks for the PR! Can you give me more information on what your use case is here? The PR being a large commit makes this hard to review quickly, but I'll give my initial thoughts.

> 1. At first glance it looks like the same amount of work is being done when `get(key, loader)` is called, as `getNodeIfPresent` does that work instead of `getIfPresent`.

Yeah, this ended up not being as fruitful as I had hoped. I still do think it is a valuable abstraction though, especially if you look to Caffeine cache for inspiration. Having access to the node could be useful for the following:

  • For `set`, once we have the node we can mutate it in place rather than allocating a new one, removing the current one, and overwriting it with the new one. That would be considerably less work.
  • If we add methods like `setIfPresent`, `computeIfPresent`, `compute`, etc., they would have similar benefits.
  • For `get`, we could create the node and leave it empty, mutating it to set the value after the loader finishes its work. This could remove the need for a separate map that manages all of the active loader promises (see the sketch below).

Basically, it is particularly useful for mutations.
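
A rough sketch of that third point, assuming the node itself can hold the in-flight promise (names hypothetical, error handling omitted):

// Hypothetical sketch: the node tracks both the value and the in-flight load
class CacheNode<K, V> {
	public value?: V;
	public loading?: Promise<V | undefined>;

	constructor(public key: K) {}
}

async function get<K, V>(
	nodes: Map<K, CacheNode<K, V>>,
	key: K,
	loader: (key: K) => Promise<V | null | undefined>
): Promise<V | undefined> {
	const existing = nodes.get(key);
	if(existing) {
		// Either a cached value or an in-flight load; no separate promise map needed
		return existing.loading ?? existing.value;
	}

	const node = new CacheNode<K, V>(key);
	nodes.set(key, node);
	node.loading = loader(key).then(value => {
		node.loading = undefined;
		if(value === null || value === undefined) {
			// The loader opted out of caching this result
			nodes.delete(key);
			return undefined;
		}
		// Mutate the existing node instead of allocating a new one
		node.value = value;
		return value;
	});
	return node.loading;
}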

> 2. Composition is important to me; I think every cache having a `get(key, loader?)` seems reasonable, but I'd prefer the automatic loading to be in a `LoadingCache` class with how 2.x is currently put together, mostly so it's possible to do: `new LoadingCache({ parent: parentCache, loader: ... })`

Correction here: every cache has a `get(key, loader)`. Only the LoadingCache has `get(key, loader?)`, as it already has a default loader.

As it stands now, I have separate interfaces for LoadingCache and Cache. It just so happens that the implementation of both of them is the same class, as the logic was so similar. The public API, though, forces the developer to think of what they are working with as either a LoadingCache or a Cache.
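
Roughly, the arrangement looks like this (an illustrative sketch, not the actual code from this PR):

// Illustrative sketch of one implementation behind two public interfaces
type Loader<K, V> = (key: K) => Promise<V | null | undefined>;

interface Cache<K, V> {
	get(key: K, loader: Loader<K, V>): Promise<V | undefined>;
}

interface LoadingCache<K, V> {
	get(key: K, loader?: Loader<K, V>): Promise<V | undefined>;
}

// The same class backs both; only the exposed interface differs
class SharedCacheImpl<K, V> implements Cache<K, V>, LoadingCache<K, V> {
	private values = new Map<K, V>();

	constructor(private defaultLoader?: Loader<K, V>) {}

	async get(key: K, loader?: Loader<K, V>): Promise<V | undefined> {
		const cached = this.values.get(key);
		if(cached !== undefined) {
			return cached;
		}

		const activeLoader = loader ?? this.defaultLoader;
		if(! activeLoader) {
			throw new Error('No loader provided and no default loader configured');
		}

		const value = await activeLoader(key);
		// null/undefined results are not cached
		if(value !== null && value !== undefined) {
			this.values.set(key, value);
		}
		return value ?? undefined;
	}
}

// Exposed as a Cache the types require a loader; exposed as a
// LoadingCache the loader argument becomes optional
const plainCache: Cache<string, string> = new SharedCacheImpl<string, string>();
const loadingCache: LoadingCache<string, string> = new SharedCacheImpl<string, string>(async key => key.toUpperCase());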

> 3. The complexity of `ExpirationCache` seems to have increased a tiny bit here as well, which worries me a bit from a maintenance point of view.

I didn't actually increase complexity here; I moved lines of code around within the same file, and also from the old `cacheBuilder`. So the complexity of this file is up slightly, but `cacheBuilder` has decreased significantly. Overall this means that the project as a whole is less complex.

> 4. Changing metrics may make sense, but for how the current 2.x works `MetricsCache` has to still exist.

It seems like a lot of delegation and complexity for not a lot of value. A difference of opinion, I think.

> Looking at it I do have some thoughts about how the library is constructed, and that it might be time to think about a better design for a 3.x version. One of the things I regret in the current version is the builder: it's convenient, but if you are targeting a browser and want smaller bundles, you quickly end up having to construct caches yourself, and the experience there isn't great.

I think the builder should actually just be a nicely typed interface, so you can pass in a CacheConfig object or something rather than using a fluent API.
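
Something along these lines (a hypothetical shape, just to illustrate the idea):

// Hypothetical: a plain, typed config object instead of a fluent builder
interface CacheConfig<K, V> {
	maxSize?: number;
	expireAfterWrite?: number;
	loader?: (key: K) => Promise<V | null | undefined>;
}

// Usage reads much like the builder, but it is just data that a
// factory or the Cache constructor can interpret
const config: CacheConfig<string, string> = {
	maxSize: 100,
	expireAfterWrite: 60_000,
	loader: async key => key.toUpperCase()
};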

Speaking of a wishlist for 3.x, I have made some extensions that I use in my company's code base which do the following, and they may be useful (if you want me to contribute them to the project):

  • Transform any value to a `KeyType` on write, so a cache can have keys of any type. The original keys are still kept so you can iterate over them, etc.
  • Serialize/deserialize a value when it is written to/read from the cache.
  • Support for bigint as a `KeyType`.
  • EmptyCache, which implements Cache but doesn't actually store any values (sketched below).
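
The EmptyCache one, for instance, boils down to something like this (a rough sketch, not the exact code I use):

// Rough sketch: satisfies the cache interface but never stores anything
type Loader<K, V> = (key: K) => Promise<V | null | undefined>;

class EmptyCache<K, V> {
	async get(key: K, loader: Loader<K, V>): Promise<V | undefined> {
		// Always a miss: run the loader but never keep the result
		const value = await loader(key);
		return value ?? undefined;
	}

	getIfPresent(_key: K): V | undefined {
		return undefined;
	}

	set(_key: K, _value: V): void {
		// Intentionally a no-op
	}

	delete(_key: K): boolean {
		// Nothing to delete
		return false;
	}
}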

> One of my thoughts here would be to target an API closer to this for creating unbounded vs bounded caches:
>
> // Unbounded cache
> const cache = new Cache({
> 	storage: new UnboundedStorage(),
> });
>
> // A cache using LFU eviction
> const cache = new Cache({
> 	storage: new LFUStorage({
> 		maxSize: 100,
> 	}),
> });
>
> As loading caches need that get with an optional loader they'd be a separate class:
>
> const cache = new LoadingCache({
> 	storage: new LFUStorage({
> 		maxSize: 100,
> 	}),
> 	loader: ...
> });
>
> // So it's possible to use get without a loader
> await cache.get('key');
>
> This would make expiration and metrics options instead, something similar to this for both Cache and LoadingCache:
>
> new Cache({
> 	storage: new LFUStorage({
> 		maxSize: 100,
> 	}),
> 	expiration: expireAfterWrite(1000),
> 	metrics: new StandardMetrics(),
> })
>
> I'm very interested to hear about your thoughts and use case here and if such a design would work for you.

I think LoadingCache and Cache are pretty similar. In practice a LoadingCache is just a Cache with a default loader (which is incidentally how I implemented both of them, with the same class). I'm not sure why the distinction really needs to be made, since the publicly exposed interfaces will prevent the user from calling `get` with no loader when they are using a Cache. It's even more obvious that they are the same when you consider that the builder previously had a `loading()` method that did nothing except cause a different class to be constructed.

Personally, I don't use LoadingCache; I just always pass in the loader when I call `get`.

I think there are probably some optimization tweaks you could make if you made it so `LoadingCache.get` doesn't accept an optional loader, in which case having different classes makes a ton of sense.
