-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(metrics): Add ability for indexer cache to write new schema #58414
Conversation
5839070
to
6ac3b93
Compare
🚨 Warning: This pull request contains Frontend and Backend changes! It's discouraged to make changes to Sentry's Frontend and Backend in a single pull request. The Frontend and Backend are not atomically deployed. If the changes are interdependent of each other, they must be separated into two pull requests and be made forward or backwards compatible, such that the Backend or Frontend can be safely deployed independently. Have questions? Please ask in the |
🏷 The following changes will be made to the repository labels
|
d73c1b9
to
5e76277
Compare
6ac3b93
to
79b4f7b
Compare
79b4f7b
to
fb9d162
Compare
5e76277
to
57e4ecf
Compare
fb9d162
to
05d76b1
Compare
Codecov Report
@@ Coverage Diff @@
## master #58414 +/- ##
==========================================
- Coverage 80.60% 80.59% -0.01%
==========================================
Files 5151 5151
Lines 225476 225419 -57
Branches 37986 37987 +1
==========================================
- Hits 181736 181670 -66
- Misses 38187 38189 +2
- Partials 5553 5560 +7
|
57e4ecf
to
90c62ae
Compare
05d76b1
to
987ef58
Compare
987ef58
to
9d0dc10
Compare
90c62ae
to
3caa6a1
Compare
# This PR - Add a `namespace` parameters to the `make_cache_key` function, and depending on the feature flag (set to false right now) will create cache keys from the new namespaces that are separate for the read path and write path. - Functionally does not change anything unless the flag is set to True **Next change in this stack PR** #58414
22c9eac
to
e34b2da
Compare
@@ -150,14 +160,28 @@ def get_many(self, namespace: str, keys: Iterable[str]) -> MutableMapping[str, O | |||
def set_many(self, namespace: str, key_values: Mapping[str, int]) -> None: | |||
cache_key_values = {self._make_cache_key(k): v for k, v in key_values.items()} | |||
self.cache.set_many(cache_key_values, timeout=self.randomized_ttl, version=self.version) | |||
if options.get("sentry-metrics.indexer.write-new-cache-namespace"): | |||
timestamp = int(datetime.utcnow().timestamp()) | |||
name_sapced_cache_key_values = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name_sapced_cache_key_values = { | |
namespaced_cache_key_values = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks!
cache_keys = [self._make_cache_key(key) for key in keys] | ||
self.cache.delete_many(cache_keys, version=self.version) | ||
self.cache.delete_many([self._make_cache_key(key) for key in keys], version=self.version) | ||
if options.get("sentry-metrics.indexer.write-new-cache-namespace"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This string option name is being used at a lot of places in the code. Same for read new cache namespace. Lets define string constants for these to avoid any possibility of copy/paste mishaps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, fixed, thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comments are not blockers. Feel free to merge after making the change
e34b2da
to
2733f96
Compare
This PR
sentry-metrics.indexer.write-new-cache-namespace
(set to False default), and if it's True, write to the new cache schema along with the old cache schema.Previous change in this stack PR
#58178
Motivation
Click to expand
The indexer is making the following assumptions with Memcached:
bulk_record
(the indexer’s “write path”)It’s very easy to break 1 or 2 inadvertently. For example, we are inserting key/value pairs back into the cache during a
resolve
cache miss. Doing so is justified as some strings could be more read heavy than write heavy, such as users with automated scripts that are scanning for a specific key to show up.Background
Click to expand
How accessing the cache works today in the Indexer
- Emit
CACHE_READ
in header- Emit DB_READ in header
- Write {string} → {id} to cache
- Write {string} → {id} to cache
where
bulk_record
= the code path for writing indexed stringsand
resolve
= the code path for querying stringsGoal
Click to expand
We store a timestamp of when the cache is stored in the value, and also partition the key between the write and read path.(writer + string) → (integer + timestamp)
How this works
- Emit CACHE_READ in header
- Emit DB_READ in header
- Write br:{string} → {id}:{timestamp} to cache
- Increment metric
- Emit DB_READ in header
- Write br:{string} → {id}:{timestamp} to cache
- Write res:{string} → {id}:{timestamp} to cache
- Increment metric
- Write res:{string} → {id}:{timestamp} to cache
Deployment Plan
Click to expand
(string) → (integer)
and(writer + string) → (integer + timestamp)
via a sentry optionbulk_record
changes to write to both types of key/value pairs ((string) → (integer)
and(writer + string) → (integer + timestamp)
) on eachbulk_record
lookup(writer + string) → (integer + timestamp)
resolve
path cache replenishmentLocal Testing
Click to expand
Started sentry with memcache and kcat on
snuba-generic-metrics
Send first batch of 3 messages with the send_metrics script
Confirm that all 3 values have fetch type of FIRST_SEEN ("f")
Send the same batch of 3 messages with the send_metrics script
Confirm that the new messages all have fetch type of CACHE_HIT ("c")