[CRRZG5311W / CRRZG5340I] Improve UTF-8 encoding support #463

FALLAI-Denis · 2024-12-11T17:32:31Z

Description of the enhancement requested

Hi,

We use IBM Z Open Editor with sources kept in Git repositories.
Git repositories typically store source files in UTF-8.
The ZOE extension checks for line overflow and issues a warning when a line exceeds the line length limit, which defaults to 80 characters (configurable).
But when a line contains multi-byte characters, as can happen in UTF-8, the length is calculated incorrectly: the number of bytes is used instead of the number of characters, so the warning is issued wrongly.

The ZOE message itself notes that the warning may be issued wrongly, but it would be better not to issue it at all and to calculate the length correctly, in characters rather than in bytes.

Whether the check is done by the Language Server (probably written in Java) or by the VS Code extension's own code (probably written in TypeScript), there must be routines available to calculate the length of a string according to the encoding used for the file (UTF-8 in our case).

For example, the Euro character (€) is encoded as 3 bytes in UTF-8: 0xE2, 0x82 and 0xAC.
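
To make the byte/character difference concrete, here is a minimal TypeScript sketch. The function names are hypothetical and are not taken from Z Open Editor; the code only illustrates the two ways of measuring a line, using plain string operations so that it does not depend on any particular runtime API.

```typescript
// Hypothetical helpers for illustration only, not Z Open Editor code.

/** True when the UTF-16 units at i and i + 1 form a surrogate pair. */
function isSurrogatePair(line: string, i: number): boolean {
  if (i + 1 >= line.length) return false;
  const high = line.charCodeAt(i);
  const low = line.charCodeAt(i + 1);
  return high >= 0xd800 && high <= 0xdbff && low >= 0xdc00 && low <= 0xdfff;
}

/** Number of bytes the line occupies once encoded as UTF-8. */
function utf8ByteLength(line: string): number {
  let bytes = 0;
  for (let i = 0; i < line.length; i++) {
    const unit = line.charCodeAt(i);
    if (unit < 0x80) {
      bytes += 1; // ASCII
    } else if (unit < 0x800) {
      bytes += 2;
    } else if (isSurrogatePair(line, i)) {
      bytes += 4; // code point above U+FFFF
      i++; // skip the low surrogate
    } else {
      bytes += 3; // rest of the Basic Multilingual Plane, e.g. U+20AC
    }
  }
  return bytes;
}

/** Number of Unicode code points (characters) in the line. */
function codePointLength(line: string): number {
  let count = 0;
  for (let i = 0; i < line.length; i++) {
    if (isSurrogatePair(line, i)) i++; // a pair counts as one character
    count++;
  }
  return count;
}

const sample = "PRICE 100\u20AC"; // "PRICE 100" followed by the Euro character
console.log(utf8ByteLength(sample)); // 12
console.log(codePointLength(sample)); // 10
```

A check based on the first function sees 12 columns where the editor shows 10, which is the discrepancy described above.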

COBOL can calculate a length in characters on a UTF-8 string... COBOL better than Java or TypeScript? 💪 🤣

FYI, the following setting does not solve the problem:

"zopeneditor.encodings.filePatterns": {
  "**/*.rexx": "IBM-1147",
  "**/*.cbl": "IBM-1147",
  "**/*.cpy": "IBM-1147"
}

Thanks.

phaumer · 2024-12-12T23:32:26Z

Agreed. We should calculate the correct length the line would have in EBCDIC.
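
As a rough illustration of what such a check could look like, the sketch below assumes a single-byte EBCDIC code page (such as IBM-1147) in which every character of the line is representable, so that each character becomes exactly one byte on the host. Under that assumption the EBCDIC length equals the number of characters. The function names and the default limit parameter are hypothetical, not Z Open Editor's actual implementation, and a real check would also need to handle unmappable characters and double-byte code pages.

```typescript
// Hypothetical sketch, not Z Open Editor code.
// Assumption: single-byte EBCDIC code page, all characters representable,
// so one Unicode code point maps to one EBCDIC byte.

/** Length the line would have in a single-byte EBCDIC code page. */
function ebcdicSbcsLength(line: string): number {
  let length = 0;
  for (let i = 0; i < line.length; i++) {
    const unit = line.charCodeAt(i);
    const next = i + 1 < line.length ? line.charCodeAt(i + 1) : 0;
    // Count a UTF-16 surrogate pair as a single character.
    if (unit >= 0xd800 && unit <= 0xdbff && next >= 0xdc00 && next <= 0xdfff) {
      i++;
    }
    length++;
  }
  return length;
}

/** True when the line would overflow the limit (80 by default). */
function exceedsLineLimit(line: string, limit: number = 80): boolean {
  return ebcdicSbcsLength(line) > limit;
}

// 79 ASCII characters plus the Euro character: 80 characters, 82 UTF-8 bytes.
const fullLine = new Array(80).join("A") + "\u20AC";
console.log(exceedsLineLimit(fullLine)); // false
```

With this measure, a line of exactly 80 characters that contains a Euro character no longer triggers the warning, while an 81-character line still does.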

phaumer added the "bug" label (Something isn't working) on Dec 12, 2024

phaumer mentioned this issue Dec 28, 2024

Truncation settings for more types than COBOL, HLASM, JCL, REXX and PL1 #464

Open
