-
-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ICU-21757 Bring in CollectionUtilities, depend only on icu4j #810
ICU-21757 Bring in CollectionUtilities, depend only on icu4j #810
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mihai and I are working on the TestCodeInvariants test failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: Consider a follow-up PR to remove things from here which we don't need in the Unicode Tools.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: Consider a follow-up PR to remove things from here which we don't need in the Unicode Tools.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK.
There are 90+ instances of CollectionUtilities.join
, I didn't want to clutter this CL with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I suspect that this test code also uses an ICU normalizer object, and needs adjustments for going from Unicode 15 to Unicode 16. |
The missing regexp is this (escaped): [\u{897}\u{10D69}-\u{10D6D}\u{1E5EE}] 0897 => ARABIC PEPET All added in UTF-16. |
TestUnicodeSet.TestNF
|
Wow, we are checking normalization properties that are typically from the previous version (via ICU) in TestCodeInvariants, of all places? This is terrifying. |
This is not really a test of properties per se. It's a test that checks that you can use a text regex in a JSP “Unicode Utilities: UnicodeSet” pattern with a partial-value-string match. Concretely, it checks that It creates each of these UnicodeSets via a simple pattern string, which means that it uses the Unicode ccc data built into ICU. What I find baffling is that it is failing now. It seems like it should have failed when the tools depended on ICU 72 and its Unicode 15.0, but the tools themselves had moved on to draft 16.0. In principle, this test should get the ccc=integer set from the Unicode Tools, not from ICU, but that would also use draft 16.0 data, so roughly the same as what draft ICU 76 has. @macchiati What's going on? |
I am not @macchiati, but I think I know what is going on.
Now it should, yes.
No, because by default the JSPs use the If you change line 290 to |
🤦 And we didn't notice this version skew when we moved to Unicode 15.1 (while ICU remained stuck on 72/15.0) because there happened to not be any new characters with interesting normalization properties.
Yeah, but that would change the intent of the test. For now, I propose that @mihnita disable this test case, as well as TestUnicodeSet.TestNF (which also fails for Unicode 16 characters), via a test method annotation like |
It might also have been ICU on both sides at the time; I moved some things to the tools by default recentlyish.
Not meaningfully; we are not testing the version match (which cannot work), we are testing that the JSPs support regexes in Unicode sets. |
This test method should compare a set pattern processed like the JSPs do with more basic properties fetching for the same version of Unicode. I would leave the version out of the pattern and (in a future fix) fetch the properties for Settings.LAST_VERSION. I am also fine with changing this test method to use However, I don't see such an easy, quick fix for TestUnicodeSet.TestNF -- I didn't try to figure out which part of the code calls ICU vs. which part uses tools properties. In order not to block this PR here, I suggest we do disable that one for now. |
Yes. Dependencies on ICU data in here are madness. But I am OK with fixing this one hackily now and cleanly later.
Oh this one is a mess, because it will involve the much nastier question « which tool properties ? ». |
Thank you both! Implemented the Mihai |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah!
@mihnita do you want me to squash-and-merge for you? |
Yes please. I don't have the right. |
No description provided.