Handling of accentuated characters #681

Open
minirop opened this issue Oct 23, 2024 · 4 comments

Comments

@minirop
Contributor

minirop commented Oct 23, 2024

I'm working on the European release of FE8 and wanted to know the best way to handle äll thôsè characters (so that both ROMs can eventually be built from the same source code).

1/ Do as was done for é and add a tag like [AccentedE] for each letter.

2/ Since FE8 uses Windows-1252, which is almost compatible with Unicode, use the Unicode characters (directly or via their \x00 code point) and only add a [specialCase] for œ and Œ.
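
To make "almost compatible" concrete, here is a minimal Python sketch that lists the bytes whose Windows-1252 character does not sit at the same Unicode code point. It only uses Python's built-in cp1252 codec; the function name cp1252_mismatches is made up for illustration, and whether the ROM's font follows Windows-1252 exactly is an assumption, not something checked here.

```python
def cp1252_mismatches():
    """Return {byte: char} for bytes whose Windows-1252 character
    is NOT at the same Unicode code point as the byte value."""
    mismatches = {}
    for byte in range(0x100):
        try:
            char = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            # Python's cp1252 codec leaves a few byte values undefined.
            continue
        if ord(char) != byte:
            mismatches[byte] = char
    return mismatches


mismatches = cp1252_mismatches()

# Every mismatch falls in 0x80-0x9F, so 0xA0-0xFF can be passed to chr() as-is.
print(all(0x80 <= b <= 0x9F for b in mismatches))

# The two ligatures are among the mismatches, hence the special case.
print(hex(ord(mismatches[0x8C])), hex(ord(mismatches[0x9C])))
```

In other words, option 2 only needs special handling for characters in the 0x80-0x9F block, which is where œ and Œ live.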

@MokhaLeee
Contributor

If you are talking about hardcoded const data, you can try digging into the Makefile.

In FE6/FE7J, the hardcoded strings are Shift-JIS encoded, so you need to convert the UTF-8 source (the default on modern Linux) to Shift-JIS via iconv -f UTF-8 -t CP932:

https://github.com/FireEmblemUniverse/fireemblem6j/blob/main/Makefile#L265
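
For anyone who wants to check what that conversion produces without running the build, here is a rough Python equivalent of the iconv call. It is only an illustration using Python's built-in utf-8 and cp932 codecs; the Makefile itself calls iconv, and the helper name utf8_to_cp932 is made up.

```python
def utf8_to_cp932(utf8_bytes: bytes) -> bytes:
    """Re-encode UTF-8 bytes as CP932 (Microsoft's Shift-JIS variant)."""
    return utf8_bytes.decode("utf-8").encode("cp932")


source = "あ".encode("utf-8")      # 3 bytes in UTF-8
converted = utf8_to_cp932(source)  # 2 bytes in CP932
print(source.hex(), "->", converted.hex())
```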

@minirop
Contributor Author

minirop commented Oct 24, 2024

@MokhaLeee
Contributor

After Huffman decompression, each string may be stored as a u16 array. To map this u16 array back to text, we need to study its encoding. The JP version directly uses 16-bit Shift-JIS characters along with some control characters, and the US version uses ASCII.

@minirop
Contributor Author

minirop commented Oct 24, 2024

The EU version uses the same encoding as the US version; it just has more accented characters. So either I add more tags:

elif u16_data == 0xC8:
    output = "[UppercaseGraveAccentE]"
elif u16_data == 0xC9:
    output = "[UppercaseAcuteAccentE]"
elif u16_data == 0xE8:
    output = "[GraveAccentE]"
elif u16_data == 0xF4:
    output = "[CircumflexO]"
# ...

or simply:

elif 0xA1 <= u16_data <= 0xFF:
    output = chr(u16_data)

(For my tests I'm using the latter, and it seems to work as expected.)
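
Put together as a standalone function, the range-based approach could look something like the sketch below. Everything here is illustrative: decode_char, SPECIAL_TAGS and the tag names are made up, the œ/Œ values are simply where Windows-1252 puts them (whether the EU ROM uses the same values is an assumption), and the real decoder has control codes that this ignores.

```python
# Hypothetical tag table. 0x8C / 0x9C are the Windows-1252 positions of
# Œ / œ; that the EU ROM uses these same values is an assumption.
SPECIAL_TAGS = {
    0x8C: "[UppercaseOE]",
    0x9C: "[LowercaseOE]",
}


def decode_char(u16_data: int) -> str:
    """Map one decompressed u16 value to its text representation."""
    if u16_data in SPECIAL_TAGS:
        return SPECIAL_TAGS[u16_data]
    if 0x20 <= u16_data <= 0x7E:
        # Printable ASCII, same as the US version.
        return chr(u16_data)
    if 0xA1 <= u16_data <= 0xFF:
        # Windows-1252 and Unicode agree on this whole range.
        return chr(u16_data)
    # Placeholder for control codes and anything not handled in this sketch.
    return f"[0x{u16_data:04X}]"


decoded = "".join(decode_char(c) for c in [0x46, 0xE9, 0xF4, 0x9C])
print(decoded)
```

The upside over one tag per letter is that the text files stay readable and the tag table only grows for the handful of characters that do not line up with Unicode.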
