CLDR-17566 text diffs
chpy04 committed Sep 2, 2024
1 parent 40f9f1a commit 62dabc1
Showing 3 changed files with 36 additions and 31 deletions.
16 changes: 8 additions & 8 deletions docs/site/TEMP-TEXT-FILES/creating-the-archive.txt
@@ -12,7 +12,7 @@ Some other tools (typically when given a version argument on the command line)
FindPluralDifferences
...
Here's how to do that.
Create an archive directory named cldr-archive. The simplest setup is to place it at the same level as your local CLDR repository. In other words, if your CLDR_DIR is .../workspace/cldr, then create the directory …/workspace/cldr-archive
(Note: The Java property ARCHIVE can be used to override the path to cldr-archive).
Open up ToolConstants.java and look at ToolConstants.CLDR_VERSIONS. You'll see something like:
public static final List<String> CLDR_VERSIONS = ImmutableList.of(
@@ -29,18 +29,18 @@ public static final List<String> CLDR_VERSIONS = ImmutableList.of(
// add to this once the release is final!
);
NOTE: this should also match CldrVersion.java (those two need to be merged together)
Add the just-released version, such as "42.0", to the list above.
Also update DEV_VERSION to "43" (the next development version)
Finally, update CldrVersion.java and make similar changes.
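The relationship between the release list and the development version can be sanity-checked with a small sketch like the one below. It is only an illustration: the class name VersionListCheck, the shortened version list, and the nextDevVersion helper are hypothetical, it uses java.util.List rather than the ImmutableList in ToolConstants.java, and it assumes the newest release is the last entry in the list.

```java
import java.util.List;

// Hypothetical stand-in for the constants in ToolConstants.java; not the real class.
final class VersionListCheck {
    // Shortened, illustrative list; the real CLDR_VERSIONS has many more entries.
    static final List<String> CLDR_VERSIONS = List.of("41.0", "42.0");
    static final String DEV_VERSION = "43";

    // Returns the major version following the last released one, e.g. "42.0" -> "43".
    // Assumes the newest release is the last element of the list.
    static String nextDevVersion(List<String> released) {
        String last = released.get(released.size() - 1);
        int dot = last.indexOf('.');
        String majorText = dot < 0 ? last : last.substring(0, dot);
        return Integer.toString(Integer.parseInt(majorText) + 1);
    }

    public static void main(String[] args) {
        boolean consistent = DEV_VERSION.equals(nextDevVersion(CLDR_VERSIONS));
        System.out.println("DEV_VERSION consistent with release list: " + consistent);
    }
}
```

A check of this kind could catch the case where the release list was extended but DEV_VERSION was left at its old value.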
Now, run the tool org.unicode.cldr.tool.CheckoutArchive
Or from the command line:
mvn -DCLDR_DIR=path_to/cldr --file=tools/pom.xml -pl cldr-code compile -DskipTests=true exec:java -Dexec.mainClass=org.unicode.cldr.tool.CheckoutArchive -Dexec.args=""
Note other options for this tool:
--help will give help
--prune will run a 'git worktree prune' before proceeding
--echo will just show the commands that would be run, without running anything
(For example, -Dexec.args="--prune" in the above command line)
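For scripting, the Maven invocation above can be assembled programmatically. The sketch below only builds the argument list and prints it; the class and method names are hypothetical, and the flags are copied from the command line shown earlier rather than taken from any CLDR API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the mvn argument list for running CheckoutArchive.
final class CheckoutArchiveCommand {
    // cldrDir is the local CLDR checkout; toolArgs is passed through as exec.args,
    // for example "--prune" or "--echo" (or an empty string for no options).
    static List<String> build(String cldrDir, String toolArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("mvn");
        cmd.add("-DCLDR_DIR=" + cldrDir); // no space after '=': it must stay one argument
        cmd.add("--file=tools/pom.xml");
        cmd.add("-pl");
        cmd.add("cldr-code");
        cmd.add("compile");
        cmd.add("-DskipTests=true");
        cmd.add("exec:java");
        cmd.add("-Dexec.mainClass=org.unicode.cldr.tool.CheckoutArchive");
        cmd.add("-Dexec.args=" + toolArgs);
        return cmd;
    }

    public static void main(String[] args) {
        // Prints the command only; nothing is executed here.
        System.out.println(String.join(" ", build("path_to/cldr", "--echo")));
    }
}
```

Because each element is a separate argument, a list like this could be handed to java.lang.ProcessBuilder without shell quoting.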
The end result (where you need all of the releases) looks something like the following:
Advanced Configuration
You can set the property -DCLDR_ARCHIVE to point to a different parent directory for the archive
You can set -DCLDR_HAS_ARCHIVE=false to tell unit tests and tools not to look for the archive
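How these two properties might interact with the default location can be sketched as follows. This is an assumption about the behavior, not the actual CLDR implementation: the class ArchiveLocator is hypothetical, properties are passed in as a map for easy testing, and CLDR_ARCHIVE is interpreted as the directory that contains cldr-archive.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of resolving the archive directory; not CLDR's own code.
final class ArchiveLocator {
    static final String ARCHIVE_NAME = "cldr-archive";

    // Returns the archive directory, or empty if the archive is disabled
    // (CLDR_HAS_ARCHIVE=false) or no parent directory can be determined.
    static Optional<Path> locate(Path cldrDir, Map<String, String> props) {
        if ("false".equalsIgnoreCase(props.get("CLDR_HAS_ARCHIVE"))) {
            return Optional.empty();
        }
        String override = props.get("CLDR_ARCHIVE");
        // Assumption: CLDR_ARCHIVE names the parent directory of cldr-archive.
        Path parent = (override != null)
                ? Paths.get(override)
                : cldrDir.toAbsolutePath().normalize().getParent();
        if (parent == null) {
            return Optional.empty();
        }
        return Optional.of(parent.resolve(ARCHIVE_NAME));
    }

    public static void main(String[] args) {
        Path cldrDir = Paths.get("workspace", "cldr");
        System.out.println(locate(cldrDir, Map.of()));
        System.out.println(locate(cldrDir, Map.of("CLDR_HAS_ARCHIVE", "false")));
    }
}
```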
18 changes: 9 additions & 9 deletions docs/site/TEMP-TEXT-FILES/documenting-cldr-tools.txt
@@ -4,19 +4,19 @@ You can add the @CLDRTool annotation to any class in cldr-code that has a main()
See CLDR Tools for general information about obtaining and using CLDR tools.
Coding it
An example from ConsoleCheckCLDR.java will start us out here
  @CLDRTool(alias = "check",
  description = "Run CheckCLDR against CLDR data")
  public class ConsoleCheckCLDR { …
Then, calling java -jar cldr-tools.jar -l produces:
  check - Run CheckCLDR against CLDR data
  <http://cldr.unicode.org/tools/check>
  = org.unicode.cldr.test.ConsoleCheckCLDR
And then java -jar cldr-tools.jar check can be used to run this tool. All additional arguments after "check" are passed to ConsoleCheckCLDR.main() as arguments.
Note these annotation parameters. Only "alias" is required.
alias - used from the command line instead of the full class name. Also forms part of the default URL for documentation.
description - a short description of the tool.
Additional parameters:
url - you can specify a custom URL for the tool. This is displayed with the listing.
hidden - if non-empty, this specifies a reason to not show the tool when running "java -jar" without "-l". For example, the main() function may be a less-useful internal tool, or a test.
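To make the parameters concrete, here is a minimal sketch of how an annotation of this shape could be declared and read back by reflection. It is a stand-in, not the real @CLDRTool: the names ToolInfo, DemoTool, HiddenDemoTool, and ToolLister are hypothetical, and the listing format only imitates the output shown above.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Stand-in declaration for illustration only; the real @CLDRTool lives in cldr-code.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ToolInfo {
    String alias();                  // required
    String description() default "";
    String url() default "";         // empty means "use the default URL"
    String hidden() default "";      // non-empty gives the reason for hiding
}

// A real tool class would also have a main() method.
@ToolInfo(alias = "check", description = "Run CheckCLDR against CLDR data")
final class DemoTool {}

@ToolInfo(alias = "internal", hidden = "test only")
final class HiddenDemoTool {}

final class ToolLister {
    static final String DEFAULT_URL_PREFIX = "http://cldr.unicode.org/tools/";

    // Returns a listing entry, or null if the class is not annotated
    // or is hidden and listAll is false.
    static String describe(Class<?> toolClass, boolean listAll) {
        ToolInfo info = toolClass.getAnnotation(ToolInfo.class);
        if (info == null || (!info.hidden().isEmpty() && !listAll)) {
            return null;
        }
        // The alias forms part of the default documentation URL.
        String url = info.url().isEmpty() ? DEFAULT_URL_PREFIX + info.alias() : info.url();
        return info.alias() + " - " + info.description()
                + "\n  <" + url + ">"
                + "\n  = " + toolClass.getName();
    }

    public static void main(String[] args) {
        System.out.println(describe(DemoTool.class, false));
    }
}
```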
Documenting it
Assuming your tool's alias is myalias, create a new subpage with the URL http://cldr.unicode.org/tools/myalias (a subpage of CLDR Tools). Fill this page out with information about how to use your tool.
33 changes: 19 additions & 14 deletions docs/site/TEMP-TEXT-FILES/updating-englishroot.txt
@@ -17,26 +17,19 @@ outdatedEnglish.data
Replacing the previous versions in /cldr/tools/java/org/unicode/cldr/util/data/births/. These files are used to support OutdatedPaths.java, which is used in CheckNew.
Readable data is found in https://github.com/unicode-org/cldr-staging/tree/master/births/* That should also be checked in, for comparison over time. Easiest to read if you paste into a spreadsheet!
Binary File Format
outdatedEnglish.data            outdated.data
int:size                        str:locale
long:pathId str:oldValue        int:size
long:pathId str:oldValue        long:pathId
...                             long:pathId
                                ...
                                str:locale
                                int:size
                                long:pathId
                                long:pathId
                                ...
$END$                           $END$
~50KB                           ~100KB
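The outdatedEnglish.data layout described above (a count, then pathId/oldValue pairs, then an end marker) can be illustrated with a round-trip sketch. The encoding details are assumptions: it uses java.io.DataOutputStream with writeInt, writeLong, and writeUTF for the int, long, and str fields, which may differ from what the real generator does, and the class name is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical reader/writer for the outdatedEnglish.data layout; not CLDR's own code.
final class OutdatedEnglishSketch {
    static final String END = "$END$";

    // Writes int:size, then (long:pathId, str:oldValue) pairs, then the end marker.
    static byte[] write(Map<Long, String> entries) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(entries.size());
            for (Map.Entry<Long, String> e : entries.entrySet()) {
                out.writeLong(e.getKey());
                out.writeUTF(e.getValue()); // assumption: strings use writeUTF encoding
            }
            out.writeUTF(END);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the same layout back and verifies the end marker.
    static Map<Long, String> read(byte[] data) {
        Map<Long, String> result = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int size = in.readInt();
            for (int i = 0; i < size; i++) {
                long pathId = in.readLong();
                result.put(pathId, in.readUTF());
            }
            String marker = in.readUTF();
            if (!END.equals(marker)) {
                throw new IOException("Missing end marker, found: " + marker);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Long, String> sample = new LinkedHashMap<>();
        sample.put(1L, "Bengali");
        sample.put(2L, "Gallegan");
        System.out.println(read(write(sample)));
    }
}
```

The outdated.data layout would follow the same pattern, with a locale string and a count preceding each run of path IDs.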
In a limited release, the file SubmissionLocales.java is set up to allow just certain locales and paths in those locales.
Testing
Make sure TestOutdatedPaths.java passes. It may take some modifications, since it depends on the exact data.
@@ -50,4 +43,16 @@ Their format is the following (TSV = tab-delimited-values) — to view, it is pr
English doesn't have the E... values, but is a complete record.
Other languages only have lines where the English value is more recently changed (younger) than the native’s.
So what the first line below says is that French has "bengali" dating back to version 1.1.1, while English has "Bangla" dating back to version 30.
Loc	Version	Value	PrevValue	EVersion	EValue	EPrevValue	Path
fr	1.1.1	bengali	�	30	Bangla	Bengali	//ldml/localeDisplayNames/languages/language[@type="bn"]
fr	1.1.1	galicien	�	1.4.1	Galician	Gallegan	//ldml/localeDisplayNames/languages/language[@type="gl"]
fr	1.1.1	kirghize	�	24	Kyrgyz	Kirghiz	//ldml/localeDisplayNames/languages/language[@type="ky"]
fr	1.1.1	ndébélé du Nord	�	1.3	North Ndebele	Ndebele, North	//ldml/localeDisplayNames/languages/language[@type="nd"]
fr	1.1.1	ndébélé du Sud	�	1.3	South Ndebele	Ndebele, South	//ldml/localeDisplayNames/languages/language[@type="nr"]
...
fr	34	exclamation | point d’exclamation blanc | ponctuation	exclamation | point d’exclamation blanc	trunk	! | exclamation | mark | outlined | punctuation | white exclamation mark	exclamation | mark | outlined | punctuation | white exclamation mark	//ldml/annotations/annotation[@cp="❕"]
fr	34	exclamation | point d’exclamation | ponctuation	exclamation | point d’exclamation	trunk	! | exclamation | mark | punctuation	exclamation | mark | punctuation	//ldml/annotations/annotation[@cp="❗"]
fr	34	cœur | cœur point d’exclamation | exclamation | ponctuation	cœur | cœur point d’exclamation	trunk	exclamation | heart exclamation | mark | punctuation	exclamation | heavy heart exclamation | mark | punctuation	//ldml/annotations/annotation[@cp="❣"]
fr	34	couple | deux hommes se tenant la main | hommes | jumeaux	couple | deux hommes se tenant la main | jumeaux	trunk	couple | Gemini | man | twins | men | holding hands | zodiac	couple | Gemini | man | twins | two men holding hands | zodiac	//ldml/annotations/annotation[@cp="👬"]
fr	34	couple | deux femmes se tenant la main | femmes | jumelles	couple | deux femmes se tenant la main | jumelles	trunk	couple | hand | holding hands | women	couple | hand | two women holding hands | woman	//ldml/annotations/annotation[@cp="👭"]
A value of � indicates that there is no value for that version.
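A row in this format can be split into its eight named columns with a short parser. The sketch below is illustrative only: the class BirthRow is hypothetical, it assumes the columns are separated by tabs as the TSV description states, it handles only the eight-column rows of non-English locales, and it treats the placeholder as the character U+FFFD, which is how it appears in this text.

```java
// Hypothetical parser for one non-English row of the births TSV; not CLDR's own code.
final class BirthRow {
    // Placeholder meaning "no value for that version" (assumed to be U+FFFD).
    static final String NO_VALUE = "\uFFFD";

    final String locale;
    final String version;
    final String value;
    final String prevValue;
    final String eVersion;
    final String eValue;
    final String ePrevValue;
    final String path;

    private BirthRow(String[] f) {
        locale = f[0];
        version = f[1];
        value = f[2];
        prevValue = f[3];
        eVersion = f[4];
        eValue = f[5];
        ePrevValue = f[6];
        path = f[7];
    }

    // Splits on tabs; the -1 limit keeps trailing empty fields.
    static BirthRow parse(String line) {
        String[] fields = line.split("\t", -1);
        if (fields.length != 8) {
            throw new IllegalArgumentException("Expected 8 columns, found " + fields.length);
        }
        return new BirthRow(fields);
    }

    boolean hasPrevValue() {
        return !NO_VALUE.equals(prevValue);
    }

    public static void main(String[] args) {
        BirthRow row = parse("fr\t1.1.1\tbengali\t\uFFFD\t30\tBangla\tBengali\t"
                + "//ldml/localeDisplayNames/languages/language[@type=\"bn\"]");
        System.out.println(row.locale + " value dates from " + row.version
                + ", English value dates from " + row.eVersion);
    }
}
```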
