Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLDR-17844 Modify the date report #3920

Merged

Conversation

macchiati
Copy link
Member

@macchiati macchiati commented Jul 31, 2024

CLDR-17844

Modify the date report to help figure out what is happening with the date formats, and present the user with more and clearer information.

  • Added BidiUtils with utilities for display of examples/reports

  • Small changes to CodePointEscaper, plus support for printing a table of escape IDs in a report.

  • DateTypeFormats

    • General cleanup
    • Add a paragraph of text underneath Patterns to explain the format. Clarify that this is for the default number system, and explain more about the format
    • In an RTL language there may be two examples, the top one is the dir=, and second is dir=auto with a distinct background. This allows vetters to see the contrast, and correct if important.
    • Add a line showing the escaped text
    • Add a warning when the reordering causes numbers or words to collide (all items currently pass)
    • If there are any escaped characters in any cell, print a Key at the end for those characters.
    • In the options for running manually, if there is a filter, then it is used; otherwise the locales are filtered to just the availableLanguages (for the charts).
  • This PR completes the ticket.

ALLOW_MANY_COMMITS=true

@macchiati macchiati requested review from pedberg-icu and srl295 July 31, 2024 03:59
@macchiati macchiati requested a review from AEApple July 31, 2024 04:00
@macchiati macchiati marked this pull request as draft July 31, 2024 04:22
@macchiati
Copy link
Member Author

@macchiati
Copy link
Member Author

I changed to rtl vs auto, which resolved a bunch of problems. Also added for comparison:

https://unicode.org/cldr/charts/46/verify/dates/fa.html
https://unicode.org/cldr/charts/46/verify/dates/de.html

Will update with screenshots.

@macchiati
Copy link
Member Author

macchiati commented Jul 31, 2024

Arabic
Screenshot 2024-07-31 at 08 25 09
Screenshot 2024-07-31 at 08 25 00

The one locale (out of those I checked) that shows an alert is Farsi. But I have to check that over because it doesn't seem like the alert test is quite right yet.

Farsi
Screenshot 2024-07-31 at 08 23 57

pedberg-icu
pedberg-icu previously approved these changes Jul 31, 2024
Copy link
Contributor

@pedberg-icu pedberg-icu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

AEApple
AEApple previously approved these changes Jul 31, 2024
Copy link
Contributor

@AEApple AEApple left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

srl295
srl295 previously approved these changes Jul 31, 2024
@macchiati macchiati dismissed stale reviews from srl295, AEApple, and pedberg-icu via 595dde3 August 1, 2024 13:43
@macchiati macchiati marked this pull request as ready for review August 1, 2024 13:45
@macchiati
Copy link
Member Author

Final cleanup for this PR. Am getting now no "alerts" indicating letter or number collisions. Also regenerated the charts.

@macchiati
Copy link
Member Author

Just updated the description, so would appreciate review.

Copy link
Member

@btangmu btangmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great except minor nitpicking comments

}

/**
* Return a list of the , where each span is a sequence of:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

word missing before the comma? Actually this whole comment looks unattached to anything, obsoleted by the following comment Gets the fields

* @return a set of fields, in the same order as found in the text but duplicates removed (ike
* LinkedHashSeet).
*/
public static Set<String> getFields(String reordred, Set<String> result) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reordred should be reordered and the comment above should have @param reordered and @param result

return "";
}
// there can still be differences within a field of OTHERS, that we ignore.
// EG ⚠️ 20,28,2B; 2B,28,20 " (+" vs " (+"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2B,28,20 should be " ( "?

.collect(Collectors.joining(LRM + ", " + LRM, LRM, LRM));
}

public static String alphagram(String string) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alphagram seems unused?

UnicodeSet escapesToShow, String tableOptions, String cellOptions) {
if (!escapesToShow.strings().isEmpty()) {
throw new IllegalArgumentException("No strings allowed in the unicode set.");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe also confirm tableOptions and cellOptions start with spaces if not empty -- either throw IllegalArgumentException or insert the spaces if missing

final String id = cpe.name();
final String shortName = cpe.getShortName();
final String description = cpe.getDescription();
addREsult(result, tdPlus, id, shortName, description);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is capital E in addREsult a typo?

: "However, two examples are provided if the locale is right-to-left, like Arabic or Hebrew, "
+ "<i>and</i> the paragraph direction can cause a different display. "
+ "The first has a <b>RTL</b> paragraph direction, "
+ "while the second has a <b>auto</b> paragraph direction (LTR unless the first 'strong' character is RTL) "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a --> an (before auto)

+ spanEnd
+ " shows those characters with short IDs ❰…❱: see the <b>Key</b> below the table. "
+ "So that the ordering of the characters in memory is clear, they are presented left-to-right one at a time. "
+ "so that the placement is clear. "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"so that the placement is clear. " looks like leftover variant, delete?

@macchiati macchiati merged commit e302d06 into unicode-org:main Aug 2, 2024
12 checks passed
@macchiati macchiati deleted the CLDR-17844-Modify-the-date-report branch August 2, 2024 00:42
@macchiati
Copy link
Member Author

Thanks!
I went ahead and merged this, since I'm going to do a followup PR anyway. (Maybe more than one.) So I'll add this to the TODOs in the ticket.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants