Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow for further filtering of subquery #511

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

m5rk
Copy link

@m5rk m5rk commented Apr 2, 2023

Source:

dwillett@c6418b5

From: #292

The problem seems to be that the notion of additional_attributes is not really elaborated enough. I suspect in most cases, it's to facilitate segregating tenants, like here:

https://www.donnfelker.com/multitenant-pgsearch-how-to/

But if they wonder why they have terrible performance and look at the explain plan, they'll see why: The subquery ignores the opportunity to restrict the search to the tenant.

@Zinggi
Copy link

Zinggi commented Apr 6, 2023

Wow, using you're fork + using the same filtering condition that is used outside the inner query, I get a speedup of ca. 10x for a troublesome search!

Amazing ✨

@nertzy nertzy added the needs spec Adds functionality but isn't yet tested in the specs label Apr 7, 2023
@nertzy
Copy link
Collaborator

nertzy commented Apr 7, 2023

This is interesting! We need to get some spec coverage on this to make sure it works as intended. I'm curious to dig in more.

@m5rk
Copy link
Author

m5rk commented Apr 8, 2023

This is interesting! We need to get some spec coverage on this to make sure it works as intended. I'm curious to dig in more.

I'm happy to dig in as well. The only thing I can think of is something along these lines:

context 'when a block is given' do
  it 'has access to the subquery' do
    # Examine the content of the explain plan or .to_sql
  end
end

Update: @nertzy See my latest commit, which adds coverage that illustrates how the block is used.

@m5rk
Copy link
Author

m5rk commented Apr 8, 2023

Assuming the code and specs are good, it seems like we may also want to update the docs to mention this capability. But I didn't want to spend time doing that until I get some confirmation that this approach (all credit to @dwillett) is approved by the maintainers.

@LuukvH
Copy link

LuukvH commented Jun 28, 2023

@m5rk, do you have any updates on this? We also use the pg_search gem in a multi-tenant environment. However, with the growing number of records across multiple tenants, we are experiencing performance issues due to the inability to scope the subquery.

@m5rk
Copy link
Author

m5rk commented Jun 28, 2023

@m5rk, do you have any updates on this? We also use the pg_search gem in a multi-tenant environment. However, with the growing number of records across multiple tenants, we are experiencing performance issues due to the inability to scope the subquery.

No update on my end, sorry. I'm just waiting for the maintainers to give feedback or guidance on what's missing or to merge and then publish a new gem version. @nertzy, thoughts?

@LuukvH
Copy link

LuukvH commented Jun 28, 2023

@nertzy Another option could be to remove the unscoped on the subquery which was added in version 1.1 it seems to remove unnecessary joins. Which in my opinion is a concern to be handled by the developer and not this gem. This would also allow this gem to work with tenant gems that uses default scopes out of the box.

@Zinggi
Copy link

Zinggi commented Oct 2, 2023

Sorry for another "bumb" here, but is there something that's missing here for this to be merged?
Maybe I could help out somehow?

The PR looks pretty good to me, and we've been using this branch in production since about half a year without issue.
Without the ability to filter on the sub query, we couldn't use this gem at all for performance reasons...

@m5rk
Copy link
Author

m5rk commented Oct 2, 2023

@nertzy Any thoughts? I added coverage as you requested. As I noted earlier, maybe the only remaining item is to update the documentation?

@jsuchal
Copy link

jsuchal commented Dec 21, 2023

If anyone wants a drop-in fix. This worked for us.

Drop this monkey patch somewhere (e.g. initializers in rails)

It basically works by scoping the inner query by the scope that goes in before you call pg_search_all.

# frozen_string_literal: true

# Monkey patch that adds scope pushing down conditions to subselect
module PgSearch
  class ScopeOptions
    def apply(scope)
      scope = include_table_aliasing_for_rank(scope)
      rank_table_alias = scope.pg_search_rank_table_alias(include_counter: true)

      scope
        .joins(rank_join(rank_table_alias, scope))
        .order(Arel.sql("#{rank_table_alias}.rank DESC, #{order_within_rank}"))
        .extend(WithPgSearchRank)
        .extend(WithPgSearchHighlight[feature_for(:tsearch)])
    end

    def rank_join(rank_table_alias, scope)
      "INNER JOIN (#{subquery(scope).to_sql}) AS #{rank_table_alias} ON #{primary_key} = #{rank_table_alias}.pg_search_id"
    end

    def subquery(scope)
      scope
        .select("#{primary_key} AS pg_search_id")
        .select("#{rank} AS rank")
        .joins(subquery_join)
        .where(conditions)
        .limit(nil)
        .offset(nil)
    end
  end
end

@m5rk
Copy link
Author

m5rk commented Apr 13, 2024

@nertzy Anything else I can do to help get this reviewed and merged?

@Joe-Degler
Copy link

Joe-Degler commented May 28, 2024

First up - thanks for this PR! I was able to slightly modify it and implement a monkey patch into our project.

I'm questioning the usefulness of having the block on the invocation, as opposed to the definition.

In our multi-tenancy system, the active tenant is a globally accessible variable. So what we want is actually the ability to pass a block to the initial definition, not to the method that ends up calling filter_bar. That's actually how I expected this PR to work until I took a more detailed look. Having to pass it on every filter_bar call (unless you specifically wrap that method) seems like an easy way to forget it.

In this case, this is how I imagined it'd look:

  pg_search_scope :filter_bar,
                  against: :bar
                  } do |foos|
    foos.partitioned
  end

Maybe there's value in being able to pass a block in either the definition, as well as the actual caller of the filter_bar method?

@m5rk
Copy link
Author

m5rk commented Aug 21, 2024

@nertzy or other maintainers, can you please clarify what else is needed in order to merge this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs spec Adds functionality but isn't yet tested in the specs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants