Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[nexus-db-queries] remove uses of filter_target #5052

Merged

Conversation

sunshowers
Copy link
Contributor

@sunshowers sunshowers commented Feb 13, 2024

filter_target is required for partial indexes but is easy to misuse for full indexes. To work around that, define an as_partial_index method, switch to it, and ban general uses of filter_target via clippy.

as_partial_index adds a WHERE false clause. See the comments included in the PR for why that works.

The rest of the PR:

  • Makes sure clippy's disallowed_ lints are active. -A clippy::style disables them.
  • Adds some missing tests (thanks @jmpesp)
  • Fixes a RunnableQuery that often did not return anything, replacing it with RunnableQueryNoReturn.

Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
@sunshowers
Copy link
Contributor Author

Thanks, looking through the comments now.

clippy.toml Outdated
Comment on lines 5 to 8
# Instead, import diesel::query_dsl::methods::FilterDsl, then use
# `do_update().set(...).filter(...)`. (This is not part of the
# documentation as of Diesel 2.1.4, but it works.)
"diesel::upsert::DecoratableTarget::filter_target",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's possible to write a simple test which would break if, when updating to a new Diesel version, FilterDsl changed to no longer generate SQL that works correctly with crdb? Is that worth the effort?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think existing tests would catch this, right?

nexus/db-queries/src/db/datastore/snapshot.rs Outdated Show resolved Hide resolved
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
@sunshowers
Copy link
Contributor Author

As discussed today, we do actually need filter_target here because this is a partial index. Replace our uses of it with this thing that's not not a hack, but I think is significantly harder to misuse.

I've verified via CockroachDB EXPLAIN that this compiles down to the same query.

I'm happy to write extensive comments explaining this, but would love to get a thumbs up on the approach.

@sunshowers sunshowers changed the title [nexus-db-queries] remove uses of filter_target [WIP] [nexus-db-queries] remove uses of filter_target Feb 14, 2024
@sunshowers sunshowers requested a review from smklein February 14, 2024 04:05
@sunshowers
Copy link
Contributor Author

sunshowers commented Feb 14, 2024

I spent the evening thinking about this.

For reference, the approaches we've discussed so far are:

  1. Require that users of filter_target use allow(clippy::disallowed_methods).
  2. Largely abandon this PR, and trust that (now that we've flagged it in such memorable fashion) we'll catch it during review.
  3. Use ON CONFLICT ON CONSTRAINT (this does not work for partial indexes)
  4. Use this WHERE false hack.

1, 2 and 3 were discussed in the meeting today. 4 came later.

3 would be amazing if it worked, but it sadly does not. Out of the valid approaches discussed in the meeting today (1 and 2), there seemed to be general consensus about 1.


The best argument I can come up with for this is that it is the closest we can get to achieving 3. While we can't name the index itself, we can name the columns constrained by it. Then WHERE false basically tells the database to figure out the index constraints.

One argument against it is that basically that we're relying on a 2000-year old bug in classical logic, as Eliza put it. I verified that crdb 23.2.0 has the same behavior, but it is possible that Cockroach errors on this in the future. Though in that case:

  • It's not a hugely difficult revert.
  • There will hopefully be a better way to express this as part of that effort. (Why can't ON CONFLICT ON CONSTRAINT just work...)

nexus/tests/integration_tests/snapshots.rs Show resolved Hide resolved
nexus/db-queries/src/db/on_conflict_ext.rs Outdated Show resolved Hide resolved
fn as_partial_index(self) -> Self::AsPartialIndexOutput {
#[allow(clippy::disallowed_methods)]
{
self.filter_target(sql::<Bool>("false"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I do understand (somewhat) based on the discussion yesterday that "contradiction implies YOLO", so false implies all partial indexes.

Is there ever a valid reason to select a subset of partial indices? Would this be desirable where someone wants conflict on one index, but not another?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I do understand (somewhat) based on the discussion yesterday that "contradiction implies YOLO", so false implies all partial indexes.

Specifically, false implies all the constraints on the partial indexes for the given columns. For example, in this case we have a partial index:

CREATE UNIQUE INDEX IF NOT EXISTS lookup_silo_user_by_silo ON omicron.public.silo_user (
    silo_id,
    external_id
) WHERE
    time_deleted IS NULL;

If you specify silo_id and external_id, and pass in WHERE false, the database will be able to infer that you mean the constraint time_deleted IS NULL. If you specify a set of columns that doesn't have any kind of unique constraint on it, the query will error out.

Is there ever a valid reason to select a subset of partial indices? Would this be desirable where someone wants conflict on one index, but not another?

From my testing against crdb, the set of columns has to match one or more UNIQUE INDEXes exactly. If there are multiple UNIQUE INDEX instances on a set of columns, then WHERE false will match all of them.

I think there are theoretical cases where it's possible that being able to specify some narrower constraint is useful, but I can't imagine we'll be needing it in omicron any time soon.

Created using spr 1.3.6-beta.1
@sunshowers sunshowers changed the title [WIP] [nexus-db-queries] remove uses of filter_target [nexus-db-queries] remove uses of filter_target Feb 15, 2024
@sunshowers
Copy link
Contributor Author

sunshowers commented Feb 15, 2024

Rendered documentation: click to expand

Screenshot 2024-02-14 at 19-10-10 IncompleteOnConflictExt in nexus_db_queries db on_conflict_ext - Rust

@sunshowers sunshowers enabled auto-merge (squash) February 15, 2024 03:17
@sunshowers sunshowers disabled auto-merge February 15, 2024 03:36
@sunshowers sunshowers merged commit 97dfaf0 into main Feb 15, 2024
20 checks passed
@sunshowers sunshowers deleted the sunshowers/spr/nexus-db-queries-remove-uses-of-filter_target branch February 15, 2024 06:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants