Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proper handling of incompatible zedtokens #1723

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

josephschorr
Copy link
Member

@josephschorr josephschorr commented Jan 29, 2024

NOTE: ZedTokens are a bit longer now as a result of this change, but should still be well within the 1024 limit previously defined

Fixes #1541

@github-actions github-actions bot added area/api v1 Affects the v1 API area/CLI Affects the command line area/datastore Affects the storage system area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools) labels Jan 29, 2024
@josephschorr josephschorr force-pushed the incompatible-zedtokens branch 3 times, most recently from d3b259f to 64dda15 Compare January 29, 2024 22:30
@josephschorr josephschorr marked this pull request as ready for review January 29, 2024 22:38
@josephschorr josephschorr requested a review from a team January 29, 2024 22:38
@josephschorr
Copy link
Member Author

Rebased

@josephschorr josephschorr force-pushed the incompatible-zedtokens branch 2 times, most recently from 52895fe to 9bb170d Compare March 13, 2024 21:43
@josephschorr
Copy link
Member Author

Rebased

@josephschorr josephschorr force-pushed the incompatible-zedtokens branch 2 times, most recently from ba65181 to 160e514 Compare April 22, 2024 20:20
@josephschorr
Copy link
Member Author

Updated

Copy link
Contributor

@vroldanbet vroldanbet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some early feedback, I'll continue tomorrow. Please describe in the PR body what problems are you trying to solve and design choices you took to come up with this solution. It helps folks reviewing the PR with the right context 🙏🏻

internal/datastore/crdb/stats.go Show resolved Hide resolved
Comment on lines +546 to +545
// UniqueID returns a unique identifier for the datastore. This identifier
// must be stable across restarts of the datastore if the datastore is
// persistent.
UniqueID(context.Context) (string, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider alternatives using the UniqueID as a stable identifier for zedtokens?

I assume (not present in the PR body) that the goal is making sure zedtokens from one datastore type are not used in another datastore type. But what happens if you evolve the zedtoken implementation from one version to another for the same datastore, and want to force them to be dropped? the uniqueID wouldn't help, would it?
Shouldn't we store a zedtoken versioning parameter like datastoreType+zedtokenVersion?

Copy link
Member Author

@josephschorr josephschorr Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but then it doesn't meet the goal which is to prevent zedtokens not just across types but instances of the datastore as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what happens if you evolve the zedtoken implementation from one version to another for the same datastore, and want to force them to be dropped? the uniqueID wouldn't help, would it?
Shouldn't we store a zedtoken versioning parameter like datastoreType+zedtokenVersion?

Can you address this ☝️?

@josephschorr
Copy link
Member Author

Some early feedback, I'll continue tomorrow. Please describe in the PR body what problems are you trying to solve and design choices you took to come up with this solution. It helps folks reviewing the PR with the right context 🙏🏻

I added the fixes; it was on the commit but not the PR

Copy link
Contributor

@vroldanbet vroldanbet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are transferring UUIDs in every API call for no reason other than to prevent an eventual migration within the same datastore type. That's extremely wasteful.

It's still within the zedtoken spec limit, but it is a very high price to pay for an incredible rare event (a legit one, don't get me wrong). We are storing UUIDs in the datastore as strings, which is the least compact way of storing them and transferring them over the wire. And we are forcing everybody to see the data transferred increase significantly (and latency!) to prevent a scenario that the system will not be subjected to in, I'll dare to say, practically 100% of its lifespan.

We need to rethink this. I get the zedtoken is a stateless token and that it needs to self-contain this information, but I think we can get 99% there by defining the datastore type as a small attribute in the token. I understand migrating from different instances of the same datastore can be problematic, but no one asked for this, and we could find alternative ways for it (e.g. a flag that forces SpiceDB to ignore requested consistency and fallback to full_consistency all the time, or minimize_latency if the customer can take the hit of ignoring the new enemy problem during the migration). I'd argue that by adding such a flag we could completely avoid having to add the datastore type too.

@@ -55,27 +56,47 @@ func RevisionFromContext(ctx context.Context) (datastore.Revision, *v1.ZedToken,
handle := c.(*revisionHandle)
rev := handle.revision
if rev != nil {
return rev, zedtoken.MustNewFromRevision(rev), nil
ds := datastoremw.FromContext(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This middleware already had a dependency on the datastore middleware, but we never declared it via the middleware framework. Please update the middleware chain to state this middleware has such dependency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you asking me to add a comment stating that or add something somewhere? (sorry for forgetting)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. The middleware framework has API to define dependencies between middleware instances, which makes it fail in dev if the order is not respected.

internal/middleware/consistency/consistency.go Outdated Show resolved Hide resolved
internal/middleware/consistency/consistency.go Outdated Show resolved Hide resolved
internal/middleware/consistency/consistency.go Outdated Show resolved Hide resolved
internal/middleware/consistency/consistency.go Outdated Show resolved Hide resolved
internal/middleware/consistency/consistency.go Outdated Show resolved Hide resolved
internal/services/v1/relationships.go Show resolved Hide resolved
internal/services/v1/schema.go Show resolved Hide resolved
internal/services/v1/watch.go Show resolved Hide resolved
@josephschorr
Copy link
Member Author

We are transferring UUIDs in every API call for no reason other than to prevent an eventual migration within the same datastore type. That's extremely wasteful.

Is it really? We could reduce the size of the ID to just a small prefix if we wish to save some bytes.

It's still within the zedtoken spec limit, but it is a very high price to pay for an incredible rare event (a legit one, don't get me wrong). We are storing UUIDs in the datastore as strings, which is the least compact way of storing them and transferring them over the wire. And we are forcing everybody to see the data transferred increase significantly (and latency!) to prevent a scenario that the system will not be subjected to in, I'll dare to say, practically 100% of its lifespan.

Except users have already been subject to it - this was spurred by two, different reports: one where a user moved between datastore types, but another where they moved between datastores of the same type (but accidentally)

We need to rethink this. I get the zedtoken is a stateless token and that it needs to self-contain this information, but I think we can get 99% there by defining the datastore type as a small attribute in the token. I understand migrating from different instances of the same datastore can be problematic, but no one asked for this,

It has happened before, sadly.

and we could find alternative ways for it (e.g. a flag that forces SpiceDB to ignore requested consistency and fallback to full_consistency all the time, or minimize_latency if the customer can take the hit of ignoring the new enemy problem during the migration). I'd argue that by adding such a flag we could completely avoid having to add the datastore type too.

This PR does add that flag, but we need a means of tracking to ensure that zedtokens are not used across instances of a datastore. The fact that we allow it now is, technically speaking, a small but real risk.

@josephschorr
Copy link
Member Author

Updated to only store 8 bytes of prefix of the datastore ID. Since datastores are rarely updated, this should be fine for detecting changes.

@josephschorr
Copy link
Member Author

Rebased

an older datastore is used

All ZedTokens are now minted with the datastore's unique ID included
in the ZedToken and that ID is checked when the ZedToken is decoded.

In scenarios where the datastore ID does not match, either an error is
raised (watch, at_exact_snapshot) or configurable behavior is used
(at_least_as_fresh)

Fixes authzed#1541
…token

Results in smaller tokens but given that datastore IDs are generated, should still minimize the chances of a conflict
@josephschorr
Copy link
Member Author

Rebased

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api v1 Affects the v1 API area/CLI Affects the command line area/datastore Affects the storage system area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improvement: Prevent ZedToken's from being used cross-datastore
2 participants