Better searching - Elasticsearch support #834

Draft
wants to merge 8 commits into
base: develop
4 changes: 4 additions & 0 deletions Gemfile
@@ -72,6 +72,10 @@ gem 'stripe', '~> 5.28'
 # EeeMAILS!
 gem 'premailer-rails', '~> 1.11'

+# Better searching
+gem 'elasticsearch-model', '~> 7.2'
+gem 'elasticsearch-rails', '~> 7.2'
+
 group :test do
   gem 'minitest', '~> 5.10.3'
   gem 'minitest-ci', '~> 3.4.0'
41 changes: 41 additions & 0 deletions Gemfile.lock
@@ -105,8 +105,44 @@ GEM
     diffy (3.4.0)
     docile (1.4.0)
     e2mmap (0.1.0)
+    elasticsearch (7.17.1)
+      elasticsearch-api (= 7.17.1)
+      elasticsearch-transport (= 7.17.1)
+    elasticsearch-api (7.17.1)
+      multi_json
+    elasticsearch-model (7.2.1)
+      activesupport (> 3)
+      elasticsearch (~> 7)
+      hashie
+    elasticsearch-rails (7.2.1)
+    elasticsearch-transport (7.17.1)
+      faraday (~> 1)
+      multi_json
     erubi (1.10.0)
     execjs (2.8.1)
+    faraday (1.10.1)
+      faraday-em_http (~> 1.0)
+      faraday-em_synchrony (~> 1.0)
+      faraday-excon (~> 1.1)
+      faraday-httpclient (~> 1.0)
+      faraday-multipart (~> 1.0)
+      faraday-net_http (~> 1.0)
+      faraday-net_http_persistent (~> 1.0)
+      faraday-patron (~> 1.0)
+      faraday-rack (~> 1.0)
+      faraday-retry (~> 1.0)
+      ruby2_keywords (>= 0.0.4)
+    faraday-em_http (1.0.0)
+    faraday-em_synchrony (1.0.0)
+    faraday-excon (1.1.0)
+    faraday-httpclient (1.0.1)
+    faraday-multipart (1.0.4)
+      multipart-post (~> 2)
+    faraday-net_http (1.0.1)
+    faraday-net_http_persistent (1.2.0)
+    faraday-patron (1.0.0)
+    faraday-rack (1.0.0)
+    faraday-retry (1.0.3)
     fastimage (2.2.4)
     ffi (1.15.5)
     flamegraph (0.9.5)
@@ -153,6 +189,8 @@ GEM
     minitest (5.10.3)
     minitest-ci (3.4.0)
       minitest (>= 5.0.6)
+    multi_json (1.15.0)
+    multipart-post (2.2.3)
     mysql2 (0.5.3)
     nio4r (2.5.8)
     nokogiri (1.13.3-x86_64-linux)
@@ -245,6 +283,7 @@ GEM
     ruby-progressbar (1.11.0)
     ruby-vips (2.1.4)
       ffi (~> 1.12)
+    ruby2_keywords (0.0.5)
     sass (3.7.4)
       sass-listen (~> 4.0.0)
     sass-listen (4.0.0)
@@ -322,6 +361,8 @@ DEPENDENCIES
   devise (~> 4.7)
   diffy (~> 3.3)
   e2mmap (~> 0.1)
+  elasticsearch-model (~> 7.2)
+  elasticsearch-rails (~> 7.2)
   fastimage (~> 2.1)
   flamegraph (~> 0.9)
   groupdate (~> 4.3)
15 changes: 9 additions & 6 deletions app/controllers/search_controller.rb
@@ -3,17 +3,20 @@ def search
     @posts = if params[:search].present?
                search_data = helpers.parse_search(params[:search])
                posts = (current_user&.is_moderator || current_user&.is_admin ? Post : Post.undeleted)
-                       .qa_only.list_includes
+                       .qa_only
                posts = helpers.qualifiers_to_sql(search_data[:qualifiers], posts)
-               posts = posts.paginate(page: params[:page], per_page: 25)

                if search_data[:search].present?
-                 posts.search(search_data[:search]).user_sort({ term: params[:sort], default: :search_score },
-                                                              relevance: :search_score, score: :score, age: :created_at)
+                 search_score_key = SiteSetting['ElasticsearchEnabled'] ? :es_search_score : :search_score
+                 posts = posts.search(search_data[:search])
+                              .user_sort({ term: params[:sort], default: search_score_key },
+                                         relevance: search_score_key, score: :score, age: :created_at)
                else
-                 posts.user_sort({ term: params[:sort], default: :score },
-                                 score: :score, age: :created_at)
+                 posts = posts.user_sort({ term: params[:sort], default: :score },
+                                         score: :score, age: :created_at)
                end
+
+               posts.list_includes.paginate(page: params[:page], per_page: 25)
              end
     @count = begin
       @posts&.count
14 changes: 9 additions & 5 deletions app/models/application_record.rb
@@ -97,12 +97,16 @@ def user_sort(term_opts, **field_mappings)
     requested = term_opts[:term]
     direction = term_opts[:direction] || :desc
     if requested.nil? || field_mappings.exclude?(requested.to_sym)
-      $active_search_param = default
-      default.is_a?(Symbol) ? order(default => direction) : order(default)
+      sort_key = default
     else
-      requested_val = field_mappings[requested.to_sym]
-      $active_search_param = requested_val
-      requested_val.is_a?(Symbol) ? order(requested_val => direction) : order(requested_val)
+      sort_key = field_mappings[requested.to_sym]
     end
+
+    $active_search_param = sort_key
+    if sort_key == :es_search_score
+      self
+    else
+      sort_key.is_a?(Symbol) ? order(sort_key => direction) : order(sort_key)
+    end
   end
 end
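The key selection in the reworked `user_sort` can be sketched in plain Ruby, outside ActiveRecord. This is only an illustration under assumptions: `pick_sort_key` and `needs_sql_order?` are hypothetical names that do not appear in the diff, and `Hash#key?` stands in for ActiveSupport's `exclude?`.

```ruby
# Hypothetical helper: choose the sort key the way user_sort does.
# Falls back to the default when no term was requested or the term is unknown.
def pick_sort_key(requested, default, mappings)
  return default if requested.nil? || !mappings.key?(requested.to_sym)

  mappings[requested.to_sym]
end

# Hypothetical helper: :es_search_score means "keep the order Elasticsearch
# returned", so no SQL ORDER BY should be applied for it.
def needs_sql_order?(sort_key)
  sort_key != :es_search_score
end

mappings = { relevance: :es_search_score, score: :score, age: :created_at }

pick_sort_key('age', :es_search_score, mappings)   # => :created_at
pick_sort_key('bogus', :es_search_score, mappings) # => :es_search_score
needs_sql_order?(:es_search_score)                 # => false
```

Returning the relation untouched for `:es_search_score` relies on the search step having already produced records in relevance order.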
45 changes: 45 additions & 0 deletions app/models/concerns/elasticsearchable.rb
@@ -0,0 +1,45 @@
+# Adds Elasticsearch support to the given model.
+#
+# We use a mocking approach to allow Elasticsearch to be enabled and disabled without a server restart.
+module Elasticsearchable
+  extend ActiveSupport::Concern
+
+  # Mock for Elasticsearch when it is not enabled.
+  class ElasticsearchMock
+    def client
+      self
+    end
+
+    def method_missing(_name, *_args, &_block); end
+  end
+
+  included do
+    include Elasticsearch::Model
+    include Elasticsearch::Model::Callbacks
+
+    # Use the Rails env in the index name to prevent test indices from overriding development/production indices
+    index_name "#{Rails.env}_#{model_name.collection.gsub(%r{/}, '-')}"
+
+    # Override the Elasticsearch class method so that we can mock it when Elasticsearch is disabled
+    def self.__elasticsearch__(&block)
+      if SiteSetting['ElasticsearchEnabled']
+        @__elasticsearch__ ||= Elasticsearch::Model::Proxy::ClassMethodsProxy.new(self)
+        @__elasticsearch__.instance_eval(&block) if block_given?
+        @__elasticsearch__
+      else
+        ElasticsearchMock.new
+      end
+    end
+
+    # Override the Elasticsearch instance method so that we can mock it when Elasticsearch is disabled
+    def __elasticsearch__(&block)
+      if SiteSetting['ElasticsearchEnabled']
+        @__elasticsearch__ ||= Elasticsearch::Model::Proxy::InstanceMethodsProxy.new(self)
+        @__elasticsearch__.instance_eval(&block) if block_given?
+        @__elasticsearch__
+      else
+        ElasticsearchMock.new
+      end
+    end
+  end
+end
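The mock in this concern follows the null-object pattern: it accepts whatever the real proxy would be asked to do and does nothing. Below is a minimal standalone sketch of that idea, assuming a hypothetical class name `NullSearchProxy` (it is not part of the diff). The point it illustrates is that `method_missing` has to accept arbitrary arguments and a block, otherwise any call that passes arguments raises `ArgumentError`.

```ruby
# Hypothetical null object modelled on ElasticsearchMock; not part of the diff.
class NullSearchProxy
  # The real proxy exposes a client; returning self keeps `proxy.client.foo` working.
  def client
    self
  end

  # Swallow any call, whatever its arguments or block, and return nil.
  def method_missing(_name, *_args, &_block)
    nil
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

proxy = NullSearchProxy.new
proxy.index_document(refresh: true) # => nil, no ArgumentError
proxy.client.equal?(proxy)          # => true
proxy.respond_to?(:delete_document) # => true
```

Because every swallowed call returns `nil`, chains longer than `proxy.client.some_call` would still raise `NoMethodError` on `nil`; callers that chain further need a guard or a mock that returns itself.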
32 changes: 30 additions & 2 deletions app/models/post.rb
@@ -1,5 +1,6 @@
 class Post < ApplicationRecord
   include CommunityRelated
+  include Elasticsearchable

   belongs_to :user
   belongs_to :post_type
@@ -58,8 +59,12 @@ class Post < ApplicationRecord
   after_save :update_category_activity, if: -> { post_type.has_category }
   after_save :recalc_score

-  def self.search(term)
-    match_search term, posts: :body_markdown
+  scope :search, ->(term) do
+    if SiteSetting['ElasticsearchEnabled']
+      __elasticsearch__.search(create_elasticsearch_query(term)).records.merge(self)
+    else
+      match_search term, posts: :body_markdown
+    end
   end

   # Double-define: initial definitions are less efficient, so if we have a record of the post type we'll
@@ -172,6 +177,15 @@ def reaction_list
              .map { |_k, v| [v.first.reaction_type, v] }.to_h
   end

+  # Defines how Elasticsearch should index the data.
+  settings do
+    mappings dynamic: false do
+      indexes :id, type: :integer
+      indexes :title, type: :text, analyzer: :english
+      indexes :body_markdown, type: :text, analyzer: :english
+    end
+  end
+
   private

   def update_tag_associations
@@ -338,4 +352,18 @@ def update_category_activity
       category.update_activity(last_activity)
     end
   end
+
+  def self.create_elasticsearch_query(term)
+    {
+      query: {
+        multi_match: {
+          query: term,
+          fields: %W[
+            title^#{SiteSetting['ElasticsearchTitleWeight'] || 2}
+            body_markdown^#{SiteSetting['ElasticsearchBodyWeight'] || 1}
+          ]
+        }
+      }
+    }
+  end
 end
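To see what `create_elasticsearch_query` produces, the same hash can be built in plain Ruby. This is a sketch under assumptions: `build_search_query` is a hypothetical name, and a plain Hash stands in for the `SiteSetting` lookups used in the diff. The `field^weight` strings use the boost syntax of the Elasticsearch `multi_match` query.

```ruby
# Hypothetical stand-in for Post.create_elasticsearch_query.
# `settings` is a plain Hash replacing SiteSetting; defaults mirror the diff (2 and 1).
def build_search_query(term, settings = {})
  title_weight = settings['ElasticsearchTitleWeight'] || 2
  body_weight  = settings['ElasticsearchBodyWeight'] || 1

  {
    query: {
      multi_match: {
        query: term,
        # "field^n" boosts matches in that field by a factor of n.
        fields: ["title^#{title_weight}", "body_markdown^#{body_weight}"]
      }
    }
  }
end

query = build_search_query('ruby gems', { 'ElasticsearchTitleWeight' => 3 })
query[:query][:multi_match][:fields] # => ["title^3", "body_markdown^1"]
```

With the default weights a title match counts twice as much as a body match; raising the title weight pushes posts whose titles contain the term further up the results.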
11 changes: 11 additions & 0 deletions config/initializers/elasticsearch.rb
@@ -0,0 +1,11 @@
+# When not using default elasticsearch settings, add them here.
+#
+# Elasticsearch::Model.client = Elasticsearch::Client.new hosts: [
+#   {
+#     host: 'localhost',
+#     port: '9200',
+#     user: 'elastic',
+#     password: 'password',
+#     scheme: 'https'
+#   }
+# ]
26 changes: 26 additions & 0 deletions db/seeds/site_settings.yml
@@ -416,3 +416,29 @@
   category: Display
   description: >
     Automatically expand vote summary entries for the last X days, X being the value of this setting. Set to 0 to expand all entries.
+
+- name: ElasticsearchEnabled
+  value: false
+  value_type: boolean
+  community_id: ~
+  category: Search
+  description: >
+    Enable better searching with Elasticsearch. Requires Elasticsearch to be installed and configured.
+    WARNING: When enabling this, you must also synchronize Elasticsearch with the database.
+    If Elasticsearch is out of sync with the database, server errors will occur.
+
+- name: ElasticsearchTitleWeight
+  value: 2
+  value_type: integer
+  community_id: ~
+  category: Search
+  description: >
+    The relative weight given to matches in the title (Elasticsearch only).
+
+- name: ElasticsearchBodyWeight
+  value: 1
+  value_type: integer
+  community_id: ~
+  category: Search
+  description: >
+    The relative weight given to matches in the body (Elasticsearch only).