CLDR-17736 Update approximate widths #3808
code changes LGTM
- The data is very verbose. Is it really necessary to include the code point name? If it were just runs of widths, I'd think filling in entire blocks might even be smaller. I see:
xxxx5-xxxxB ; 0 ; # LONG NAME OF SOMETHING
xxxxD-xxxxF ; 0 ; # LONG NAME OF SOMETHING
but probably xxxxC is also 0
- The console check is currently run with -z BUILD.
It makes it easier to debug. The name is not needed in production, but it only affects startup, and I don't think it is a material performance issue. If we really think it is an issue, we could measure, and if important could generate two files (that is what is done with GenerateBirths), one for debugging with names, and the other without (or even binary).
I don't think so. The items are sorted by width, so a gap typically means that the missing code point is in another group. I can check this if you give me an example.
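The question of whether a one-code-point gap between two runs really has a different width can be checked mechanically. Below is a minimal sketch, assuming the data lines have the shape quoted in the review comment (`start-end ; width ; # name`, with hexadecimal code points and an integer width). The class `WidthRuns`, its methods, and the sample values in the usage note are hypothetical and are not part of the CLDR tooling.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of CLDR: parses "start-end ; width ; # name" lines. */
final class WidthRuns {
    /** One run of code points that share a width. */
    record Run(int start, int end, int width) {}

    /** Parses one data line; returns null for blank or comment-only lines. */
    static Run parseLine(String line) {
        int hash = line.indexOf('#');
        // The name after '#' is only for debugging, so drop it before parsing.
        String data = (hash >= 0 ? line.substring(0, hash) : line).trim();
        if (data.isEmpty()) {
            return null;
        }
        String[] fields = data.split(";");
        String[] range = fields[0].trim().split("-");
        int start = Integer.parseInt(range[0].trim(), 16);
        int end = range.length > 1 ? Integer.parseInt(range[1].trim(), 16) : start;
        int width = Integer.parseInt(fields[1].trim());
        return new Run(start, end, width);
    }

    /**
     * Lists code points that sit alone between two consecutive runs of equal width.
     * Assumes the runs are given in ascending code point order.
     */
    static List<Integer> singleGaps(List<Run> runs) {
        List<Integer> gaps = new ArrayList<>();
        for (int i = 1; i < runs.size(); i++) {
            Run prev = runs.get(i - 1);
            Run next = runs.get(i);
            if (prev.width() == next.width() && next.start() - prev.end() == 2) {
                gaps.add(prev.end() + 1);
            }
        }
        return gaps;
    }
}
```

Feeding it two placeholder lines such as `1005-100B ; 0 ; # NAME` and `100D-100F ; 0 ; # NAME` would report the code point between them as a gap to inspect. Whether that code point belongs to another width group still has to be confirmed against the full data.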
@srl295
P.S. This is not a blocker for Vetting Phase, though I'd like to get it in very soon, since I think it solves the Fulah errors.
OK, I will work on it while waiting for the results of the sanity check.
@macchiati I pulled down this PR and was able to repro the message:
$ cldr-check -z BUILD -f mr
[-e, -z, BUILD, -f, mr]
#-f file_filter ≔ mr
#-t test_filter ≝ .*
#-o organization ≝ .*
#-p path_filter ≝ .*
#-e errors_only ≔ null
#-s source_directory ≝ /Users/srl295/src/cldr2/common/main/,/Users/srl295/src/cldr2/common/annotations/,/Users/srl295/src/cldr2/seed/main/
#-z phase ≔ BUILD
#-g generate_html ≝ /Users/srl295/src/cldr-staging/docs/charts/46//errors/
#-y subtype_filter ≝ .*
#-S source_all ≝ common,seed,exemplars
Source directories:
/Users/srl295/src/cldr2/common/main (/Users/srl295/src/cldr2/common/main)
/Users/srl295/src/cldr2/common/annotations (/Users/srl295/src/cldr2/common/annotations)
/Users/srl295/src/cldr2/seed/main (/Users/srl295/src/cldr2/seed/main)
filtered tests: [CheckAnnotations, CheckChildren, CheckCoverage, CheckDates, CheckForCopy, CheckDisplayCollisions, CheckExemplars, CheckForExemplars, CheckForInheritanceMarkers, CheckNames, CheckNumbers, CheckMetazones, CheckLogicalGroupings, CheckAlt, CheckAltOnly, CheckCurrencies, CheckCasing, CheckConsistentCasing, CheckQuotes, CheckUnits, CheckWidths, CheckPlaceHolders, CheckPersonNames, CheckNew]
Locale Status ▸PPath◂ 〈Eng.Value〉 【Eng.Ex.】 〈Loc.Value〉 «fill-in» 【Loc.Ex】 ⁅error/warning type⁆ ❮Error/Warning Msg❯ Full Path AliasedSource/Path?
/Users/srl295/src/cldr2/common/main/mr.xml:2043:54: error
mr [Marathi] error ▸Date_&_Time|Gregorian|Eras_-_abbreviated|0-variant◂ 〈BCE〉 【】 〈ई. स. पू. युग〉 «=» 【】 ⁅abbreviated date field too wide⁆ ❮Error: Abbreviated value "ई. स. पू. युग" can't be longer than the corresponding wide value "ईसापूर्व युग"❯ https://st.unicode.org/cldr-apps/v#/mr//efe32222ec06f22
# mr [Marathi] Summary modern Items: 10969 Raw Missing: 144 Raw Provisional: 0
# mr [Marathi] Summary modern Total missing from general exemplars: 20 [\u200B ॲ a b c d e f g i l m n o p r s u x z]
# mr [Marathi] Summary modern Subtotal error: 1
# mr Elapsed time: : 5.732 s
# Total error: 1
# Total elapsed time: : 9.784 s
<< FAILURE - Error count is 1 . >>
@macchiati Digging in, the error is coming from this code:
https://github.com/macchiati/cldr/blob/d53cd8d713b86ef939a8faab47dbd82e2f403bd8/tools/cldr-code/src/main/java/org/unicode/cldr/test/CheckDates.java#L491-L504
which seems to have nothing to do with the -z BUILD switch, but is unconditionally errorType.
Excellent, I was stuck but can now fix it.
(Sent by email in reply to the comment above, which Steven R. Loomis wrote on Mon, Jun 17, 2024, 08:22.)
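The finding above is that the "abbreviated date field too wide" check reports an error regardless of phase. One possible shape of a fix is to let the phase choose the severity. The sketch below only illustrates that idea: the `Phase` and `Severity` enums and the `tooWideSeverity` method are hypothetical stand-ins, not the types used in CheckDates.java, and the change actually made in this PR may differ.

```java
/** Hypothetical sketch, not CLDR code: pick the severity of a check from the phase. */
final class PhaseSeverity {
    /** Stand-in for the phases selectable with the -z switch. */
    enum Phase { BUILD, SUBMISSION, VETTING, FINAL_TESTING }

    /** Stand-in for the status type attached to a check result. */
    enum Severity { ERROR, WARNING }

    /**
     * Assumption: in BUILD the too-wide condition is downgraded to a warning so it
     * does not block merging; in every other phase it stays an error so that it
     * still shows up for vetters.
     */
    static Severity tooWideSeverity(Phase phase) {
        return phase == Phase.BUILD ? Severity.WARNING : Severity.ERROR;
    }
}
```

Keeping the decision in one small function makes it easy to see which phases block on the condition, and to change that policy later without touching the width comparison itself.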
May eventually want to make this a common function.
I was thinking that too.
CLDR-17736
May fix some of the Fulah items
FYI, this should be a yearly BRS item
NOTE
The check-console gets two errors, because the abbreviated forms are longer than the wide values. Looking at a variety of fonts, these are in fact errors that should show up in the survey tool.
However, if the console-check is run with BUILD mode (as it should be), then these two errors should not block merging.
"ई. स. पू. युग"
"ईसापूर्व युग"
"جماد ۱"
"جماعه"
mr [Marathi] error ▸Date_&_Time|Gregorian|Eras_-_abbreviated|0-variant◂ 〈BCE〉 【】 〈ई. स. पू. युग〉 «=» 【】 ⁅abbreviated date field too wide⁆ ❮Error: Abbreviated value "ई. स. पू. युग" can't be longer than the corresponding wide value "ईसापूर्व युग"❯ https://st.unicode.org/cldr-apps/v#/mr//efe32222ec06f22
ps [Pashto] error ▸Date_&_Time|Islamic|Months_-_abbreviated_-_Formatting|May◂ 〈Jum. I〉 【】 〈جماد ۱〉 «=» 【】 ⁅abbreviated date field too wide⁆ ❮Error: Abbreviated value "جماد ۱" can't be longer than the corresponding wide value "جماعه"❯ https://st.unicode.org/cldr-apps/v#/ps//6f92de7116b2180f
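To make the two errors above concrete, here is a rough sketch of an abbreviated-versus-wide comparison. It is not the CLDR implementation: the real check relies on the approximate-width data this PR updates, whereas this stand-in simply counts nonspacing and enclosing marks as width 0 and every other code point as width 1. The class and method names are hypothetical.

```java
/** Hypothetical sketch, not CLDR code: a crude approximate-width comparison. */
final class ApproxWidthSketch {
    /**
     * Crude width estimate: nonspacing and enclosing marks count as 0,
     * every other code point counts as 1. A real table would assign
     * finer-grained widths per code point.
     */
    static int approxWidth(String s) {
        return (int) s.codePoints()
                .filter(cp -> {
                    int type = Character.getType(cp);
                    return type != Character.NON_SPACING_MARK
                            && type != Character.ENCLOSING_MARK;
                })
                .count();
    }

    /** True when the abbreviated value measures wider than the wide value. */
    static boolean abbreviatedTooWide(String abbreviated, String wide) {
        return approxWidth(abbreviated) > approxWidth(wide);
    }
}
```

Even under this crude measure, the Marathi abbreviated era "ई. स. पू. युग" comes out wider than the wide form "ईसापूर्व युग", because the periods and spaces add width while several of the vowel signs and the virama in the wide form add none.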
ALLOW_MANY_COMMITS=true