Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

associated_against with tsvector_column #359

Open
cireficc opened this issue May 20, 2017 · 8 comments
Open

associated_against with tsvector_column #359

cireficc opened this issue May 20, 2017 · 8 comments

Comments

@cireficc
Copy link

cireficc commented May 20, 2017

I have tried every which way and cannot get this pg_search scope to work. The following does work:

Translation

class Translation < ActiveRecord::Base
  belongs_to  :source_phrase,
              class_name: "Phrase",
              foreign_key: "source_id"
  belongs_to  :destination_phrase,
              class_name: "Phrase",
              foreign_key: "destination_id"

  include PgSearch
  pg_search_scope :search_full_text, :associated_against =>
      {:source_phrase => :text}
end

Phrase

class Phrase < ActiveRecord::Base
  
  self.table_name = "new_phrases"

  has_many    :translations,
              class_name: "Translation",
              foreign_key: "source_id"
  has_many    :source_phrases,
              through: :translations
  has_many    :inverse_translations,
              class_name: "Translation",
              foreign_key: "destination_id"
  has_many    :destination_phrases,
              through: :inverse_translations
end

In the above I am not trying to use a ts_vector column on the associated table - I am just using the text field (which is obviously incredibly slow). If I try the following (in translation.rb):

pg_search_scope :search_full_text,
                  :associated_against => {
                      :source_phrase => :text,
                      :using => {
                          :tsearch => {
                              :tsvector_column => "tsv_text"
                           }
                       }
                  }

I get the error: NoMethodError: undefined method 'table_name' for nil:NilClass. Well that makes sense because using isn't part of Association. If I change it up slightly:

pg_search_scope :search_full_text,
                  :associated_against => {
                      :source_phrase => :text
                  },
                  :using => {
                      :tsearch => {
                          :tsvector_column => "tsv_text"
                      }
                  }

I get the error column translations.tsv_text does not exist, which makes sense, because the column I want is phrases.tsv_text. I can't specify the table name in the tsearch option, so I don't know what to do.

How do I use the :tsvector_column option for an associated table? I.e. I want to query phrases.tsv_text. Is this even possible with the current state of the library?

@nertzy
Copy link
Collaborator

nertzy commented Aug 1, 2017

This is currently a missing feature. The associated_against option unfortunately doesn't yet do anything special for a tsvector column. Presumably it would be possible to detect that the joined column is a tsvector and join the multiple rows together in a different way from how the current string aggregation works.

@arcreative
Copy link

@nertzy Any more thoughts about this? I would like to improve performance when searching against other models (also adding tsv_extended column so I can give users the option of whether they want simple or extended search), as currently it's taking around 800-1000ms to complete a search against related models. A SQL-only solution would be fine with me, but tsvectors are definitely not my strong suit.

@arcreative
Copy link

And to clarify, by SQL-only, I mean we can drop the against option and just have the SQL trigger do the heavy lifting.

@arcreative
Copy link

arcreative commented Apr 22, 2019

I did end up going this route and just using against: [] in the search scope. Works really well in my case. In case others don't want to bang their head against a wall trying to figure out how to craft the trigger, this is what ended up working for me:

CREATE OR REPLACE FUNCTION update_users_tsv() RETURNS trigger AS $$
BEGIN
  NEW.tsv := (
    SELECT 
      setweight(to_tsvector('simple', CONCAT(NEW.first_name, ' ', NEW.last_name)), 'A') || 
      setweight(to_tsvector('english', array_to_string(array_agg(DISTINCT skills.name), ' ')), 'B') ||
      setweight(to_tsvector('simple',  array_to_string(array_agg(DISTINCT cv_entries.client_name), ' ')), 'B') ||
      setweight(to_tsvector('english', array_to_string(array_agg(DISTINCT CONCAT(cv_entries.short_description, ' ', cv_entries.description)), ' ')), 'C')
    FROM users u
    LEFT JOIN skill_assignments sa ON sa.skillable_id = u.id AND sa.skillable_type = 'User'
    LEFT JOIN skills ON skills.id = sa.skill_id
    LEFT JOIN cv_entries ON cv_entries.user_id = u.id
    WHERE u.id = NEW.id
    GROUP BY u.id
  );
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE
ON users FOR EACH ROW EXECUTE PROCEDURE update_users_tsv();

Then the old UPDATE users SET id = id to update them all the first time.

@vpiau
Copy link

vpiau commented Aug 3, 2020

Did you tried to use merge query ?

# translation.rb

scope :search_full_text, -> (text) {
	joins(:source_phrase).merge(Phrase.search_full_text(text))
}
# phrase.rb

pg_search_scope :search_full_text,
   against: [:text],
   using: {
      tsearch: {
	tsvector_column: 'tsv_text'
      }
   }

@jmarsh24
Copy link

Would it be possible to update the readme with instructions on how to create a trigger to index associated records?

@mhenrixon
Copy link

mhenrixon commented Aug 25, 2023

You can use my branch #504; we are running that branch in production while I wait for feedback on the PR.

@arcreative
Copy link

@mhenrixon Thanks, I ended up writing my own gem: https://rubygems.org/gems/pg_fulltext. It supports native websearch_to_tsquery, so it actually does a lot of things that this gem doesn't, and I've opted to manage associations via the record's tsv column for efficiency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants