-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[distributed store] use a single map Vec<u8> -> Counters map #340
Conversation
By keying using Vec<u8> we reduce how often we need to encode/decode the the counter keys. Signed-off-by: Hiram Chirino <[email protected]>
4ecdf0d
to
3895c8a
Compare
if let Some(limited) = process_counter(counter, value.read(), delta) { | ||
if !load_counters { | ||
return Ok(limited); | ||
// we need to take the slow path since we need to mutate the limits map. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there is an interesting property of non qualified counters (i.e. one where there is no variable qualifying the counter), where for all limits we know, there is a fine set of counters. Given that, we chose to always have a counter for these limits present in our in memory storage to avoid ever having to mutate the datastructure that contains these counters for these (i.e. it's "cheaper" to keep an expired counter around than. updating the datastructure to remove it, to only add it back later).
I understand this isn't this here, but I'm just sharing and thinking what it would mean for the distributed case, i.e. if any instance within a cluster only cares about counters it knows about from its limits.yaml
... 🤔 Should that be shared with other instances? So they can filter on what they share with these other instances? Should we make it the concern of the receiver to drop things it doesn't care about (trading off bandwidth to other instances...)?
Anyways, a bit of a random thought, but one I thought I'd share...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like an optimization thing that would be hard to get the balance right.
@@ -356,29 +268,47 @@ impl CrInMemoryStorage { | |||
struct CounterKey { | |||
namespace: Namespace, | |||
seconds: u64, | |||
max_value: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a somewhat radical change... I think you "only" do this to be able to implement From<CounterKey> for Counter
that requires building the corresponding Limit
, right? Now, it also means that someone changing the max_value
of one of their Limit
would reset all their counters for that Limit
, which we learned is not what users expect.
Now, back to my point above about who gets to decide what counter values an instance gets to see or stores locally? What would it mean if cluster A has a max_value of 10, while cluster B has one of 100? Should we consider the counter to be the same, but reach the threshold at different values in each cluster? Should they now be different counters? But how would someone alter a limit's threshold, locally or across all sharing instances?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we push the max_value into the value portion of the map entry? Maybe we have to make it a (max_value, changed_at) tuple type thing so that the last distributed update of it wins?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe a changed_at is not needed, as that's something that's a limits configuration value, so that change should be rolled to all peers at the same time.
self.broker.publish(CounterUpdate { | ||
key, | ||
key: store_key, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
neat pick: this seems we use the same "name" for two things... or I guess in this case two representations of the same thing, is that right? I'd just either call it key
(and value
), or if there are places where two "things" name clash, first try to get rid of it, or just call these things differently, but consistently.
This might be a one off and a real minor thing, but I like keeping an eye out for these when I write such code patterns and avoid it starts leaking all over
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry, I had done a refactor of renaming CrCounterValue to StoreValue to make it more clear that is the value type that the distributed store works with, and then I reverted it to avoid the PR diff from being too large. I forgot to revert the variable names too.
Signed-off-by: Hiram Chirino <[email protected]>
By keying using Vec we reduce how often we need to encode/decode the the counter keys.