Move string similarity algorithm into constant
gregorbg authored Jul 18, 2024
1 parent b97d416 commit be65ffb
Showing 1 changed file with 5 additions and 7 deletions.
12 changes: 5 additions & 7 deletions lib/finish_unfinished_persons.rb
@@ -30,7 +30,7 @@ def self.search_persons(competition_ids = nil)
 unfinished_persons = []
 available_id_spots = {} # to make sure that all of the newcomer IDs that we're creating in one batch are unique among each other

-persons_cache = Person.select(:id, :wca_id, :name, :dob, :countryId).to_a
+persons_cache = Person.select(:id, :wca_id, :name, :dob, :countryId)

 unfinished_person_results.each do |res|
   next if unfinished_persons.length >= MAX_PER_BATCH
@@ -99,14 +99,12 @@ def self.compute_similar_persons(result, persons_cache, n = 5)
     .take(n)
 end

-def self.string_similarity_algorithm
-  # Original PHP implementation uses PHP stdlib `string_similarity` function, which is custom built
-  # and "kinda like" Jaro-Winkler. I felt that the rewrite warrants a standardised matching algorithm.
-  FuzzyStringMatch::JaroWinkler.create(:native)
-end
+# Original PHP implementation uses PHP stdlib `string_similarity` function, which is custom built
+# and "kinda like" Jaro-Winkler. I felt that the rewrite warrants a standardised matching algorithm.
+JARO_WINKLER_ALGO = FuzzyStringMatch::JaroWinkler.create(:native)

 def self.string_similarity(a, b)
-  self.string_similarity_algorithm.getDistance(a, b)
+  JARO_WINKLER_ALGO.getDistance(a, b)
 end

 def self.compute_semi_id(competition_year, person_name, available_per_semi = {})
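
Note on the second hunk: before this commit, every call to `string_similarity` went through `string_similarity_algorithm`, which built a fresh matcher each time. After it, the matcher is built once when the constant is assigned. The sketch below illustrates only that difference. `CountingMatcher`, `PerCallLookup`, and `ConstantLookup` are hypothetical stand-ins invented for this illustration, not part of the repository or of the fuzzy-string-match gem, and the toy score is not Jaro-Winkler.

```ruby
# Hypothetical stand-in for a matcher that is costly to construct.
# It only counts how many times `create` has been called.
class CountingMatcher
  @instances = 0

  class << self
    attr_reader :instances

    def create
      @instances += 1
      new
    end
  end

  # Toy score, NOT Jaro-Winkler: 1.0 for identical strings, 0.0 otherwise.
  def getDistance(a, b)
    a == b ? 1.0 : 0.0
  end
end

# Shape of the code before the commit: a new matcher per call.
module PerCallLookup
  def self.algorithm
    CountingMatcher.create
  end

  def self.string_similarity(a, b)
    algorithm.getDistance(a, b)
  end
end

# Shape of the code after the commit: one matcher, built at load time.
module ConstantLookup
  ALGO = CountingMatcher.create

  def self.string_similarity(a, b)
    ALGO.getDistance(a, b)
  end
end
```

In this sketch, three calls to `PerCallLookup.string_similarity` raise `CountingMatcher.instances` by three, while any number of calls to `ConstantLookup.string_similarity` leave it unchanged.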
