capture all terminators and quotes in the sentence #360
View Playwright Report (note: open the "playwright-report" artifact)
Thanks! Could you add an extra comment or two, and/or make the variables a little more descriptive? (This isn't quite your fault, but I think the code is a bit hard to understand here...)
Would be great to get some test coverage for this case, but also as I was looking into this, I realized that document-util tests are broken. I'll fix those.
#363 fixes this; once that's merged, rebase onto the latest master, and tests can be added to test\data\html\test-document1.html. Here is a new example test which I think will be affected by this update:
<div
    class="test"
    data-test-type="scan"
    data-element-from-point-selector="span"
    data-caret-range-from-point-selector="span"
    data-start-node-selector="span"
    data-start-offset="4"
    data-end-node-selector="span"
    data-end-offset="4"
    data-result-type="TextSourceRange"
    data-sentence-scan-extent="100"
    data-sentence="ありがとございます。"
>
    <span>ありがとございます。!?ありがとございます。!?</span>
</div>
Thanks for making the fixes! LGTM, but let's also wait for @toasted-nutbread's review.
This PR resolves #116. It should capture all terminator characters or quotes when they appear multiple times in a row at the beginning or end of a sentence. When tested on sentences, it behaves the same as the original code, except that it now grabs every consecutive terminator instead of only the first.
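
To illustrate the behavior described above, here is a minimal sketch of a sentence scan that keeps consuming terminators once the first one is found. It is not the code from this PR: the function name `extractSentence` and the `TERMINATORS` set are hypothetical, and quote handling is left out for brevity.

```javascript
// Hypothetical terminator set; the project's actual configuration may differ.
const TERMINATORS = new Set(['。', '.', '．', '?', '？', '!', '！', '…']);

/**
 * Returns the sentence surrounding `position` in `text`, including every
 * consecutive terminator character that immediately follows it.
 * This is an illustrative sketch, not the implementation from the PR.
 */
function extractSentence(text, position) {
    // Scan backward until the character before `start` is a terminator
    // (or the beginning of the text is reached).
    let start = position;
    while (start > 0 && !TERMINATORS.has(text[start - 1])) {
        --start;
    }

    // Scan forward to the first terminator (or the end of the text).
    let end = position;
    while (end < text.length && !TERMINATORS.has(text[end])) {
        ++end;
    }

    // Instead of stopping after one terminator, consume the whole run.
    while (end < text.length && TERMINATORS.has(text[end])) {
        ++end;
    }

    return text.slice(start, end);
}

const sample = 'ありがとございます。!?ありがとございます。!?';
console.log(extractSentence(sample, 4));
```

With the sample text from the test fixture and an offset of 4, this sketch yields the first sentence together with the trailing `。!?` run, rather than stopping after `。`.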