If a hit begins in one sentence and ends in another, for example, BLS' /docs/DOCID/contents will highlight it in two parts, and the frontend will treat those as two separate hits.
For example, if we highlight a hit spanning from pancakes to They're (inclusive), BLS will return:
<s>I like <hl>pancakes.</hl></s>
<s><hl>They're</hl> delicious.</s>
So the individual hits cannot always be recovered from BLS' highlighted document contents. But the frontend's pagination treats each <hl> tag as an individual hit, which breaks in this case: the two parts of one hit are shown as if they were two separate hits.
Another problem is overlapping hits.
Both BLS and frontend need to be changed to address these problems.
BLS should add a hit index attribute to each <hl> tag, so the frontend knows which tags belong together, e.g.:
<s>I like <hl n="1">pancakes.</hl></s>
<s><hl n="1">They're</hl> delicious.</s>
For two overlapping hits "The fox jumps" and "jumps over the dog":
<s><hl n="1">The fox <hl n="2">jumps</hl></hl><hl n="2"> over the dog.</hl></s>
The frontend would then use these indexes to identify and highlight whole hits at a time.
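As a rough illustration of how the frontend could group highlight fragments by hit index, here is a minimal TypeScript sketch. It assumes the proposed format in which every <hl> tag carries an n attribute with the hit index. The function name groupHighlightsByHit is hypothetical and not part of BLS or the frontend, and a real implementation would more likely walk the rendered DOM than tokenize a string.

```typescript
// Hypothetical helper (not an existing BLS or frontend API).
// Assumes the proposed markup: <hl n="INDEX"> ... </hl>, possibly nested.
// Returns, per hit index, the highlighted text fragments in document order.
function groupHighlightsByHit(xml: string): Record<number, string[]> {
  const hits: Record<number, string[]> = {};
  const open: number[] = []; // stack of hit indexes whose <hl> is currently open
  const tokenPattern = /<[^>]*>|[^<]+/g; // a single tag, or a run of text
  let match: RegExpExecArray | null;

  while ((match = tokenPattern.exec(xml)) !== null) {
    const token = match[0];
    const opening = /^<hl\s+n="(\d+)"\s*>$/.exec(token);
    if (opening) {
      open.push(Number(opening[1]));
    } else if (token === "</hl>") {
      open.pop();
    } else if (token.charAt(0) !== "<") {
      // Text belongs to every hit that is open at this point,
      // so overlapping hits each receive the shared fragment.
      for (const n of open) {
        if (!hits[n]) {
          hits[n] = [];
        }
        hits[n].push(token);
      }
    }
    // Any other tag (e.g. <s>) is ignored.
  }
  return hits;
}
```

For the overlapping example above, this groups "The fox " and "jumps" under hit 1, and "jumps" and " over the dog." under hit 2, so pagination could count two hits rather than three highlighted fragments.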
Adding the hit start as an attribute would also be useful. There are cases where we need to jump to a specific hit without necessarily knowing its index, for example when opening the document through the expanded snippet.