Comparison of string pairs #19

rieck · 2015-05-04T08:46:58Z

Some analysis tasks only require the similarity between specific pairs of strings. In this setting, computing a similarity matrix over all strings is clearly overkill, and it would be great if Harry could support this case, e.g. through a dedicated command-line option.
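To illustrate the difference, here is a minimal Python sketch that compares only the requested pairs instead of filling a full matrix. It is not Harry's code or interface: `levenshtein` and `compare_pairs` are hypothetical helper names, and the edit distance merely stands in for whichever measure is actually needed.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def compare_pairs(strings, pairs):
    """Compute the distance only for the requested (i, j) index pairs.

    Hypothetical helper: does len(pairs) comparisons rather than the
    n * (n - 1) / 2 needed for a full symmetric matrix.
    """
    return {(i, j): levenshtein(strings[i], strings[j]) for i, j in pairs}


strings = ["kitten", "sitting", "flaw", "lawn"]
pairs = [(0, 1), (2, 3)]
result = compare_pairs(strings, pairs)
```

With four strings, a full matrix would need six comparisons; the pair list above needs two.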

gsever · 2015-11-15T19:45:55Z

Hello, it would also be good to output only the similarity scores selected by a threshold, rather than all results.

rieck · 2015-11-16T10:47:31Z

That's a very good idea. However, we would need to introduce a new representation and output format. Currently, Harry stores computed similarity values in a matrix. The benefit of a threshold would be that many of the matrix entries could be omitted, and we would end up with a sparse representation. I'll put this on my TODO list.
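As a rough sketch of what such a sparse representation could look like (an assumption for illustration, not Harry's actual format), the Python snippet below keeps only the pairs whose score reaches a threshold. `sparse_similarities` is a hypothetical name, and `difflib.SequenceMatcher.ratio()` from the standard library stands in for a real similarity measure.

```python
from difflib import SequenceMatcher
from itertools import combinations


def sparse_similarities(strings, threshold):
    """Return {(i, j): score} for pairs with score >= threshold.

    Hypothetical helper: entries below the threshold are never stored,
    so the result is a sparse stand-in for the upper triangle of the
    full similarity matrix.
    """
    sparse = {}
    for i, j in combinations(range(len(strings)), 2):
        score = SequenceMatcher(None, strings[i], strings[j]).ratio()
        if score >= threshold:
            sparse[(i, j)] = score
    return sparse


strings = ["abcd", "abcf", "wxyz"]
sparse = sparse_similarities(strings, threshold=0.5)
```

Note that this sketch still computes every pairwise score; the threshold only reduces what is stored and written out, not the work done.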

rieck added the enhancement label May 4, 2015
