Fix Unicode output in cabal-testsuite
`System.Process.createPipe` calls (through many intermediaries)
`GHC.IO.Handle.FD.fdToHandle`, whose documentation says:

> Makes a binary Handle. This is for historical reasons; it should
> probably be a text Handle with the default encoding and newline
> translation instead.

The documentation for `System.IO.hSetBinaryMode` says:

> This has the same effect as calling `hSetEncoding` with `char8`, together
> with `hSetNewlineMode` with `noNewlineTranslation`.

But this is a lie, and Unicode written to or read from binary handles is
always encoded or decoded as Latin-1, which is always the wrong choice.

Therefore, we explicitly set the output to UTF-8 to keep it consistent
between platforms and correct on all modern computers.

See: https://gitlab.haskell.org/ghc/ghc/-/issues/25307
9999years committed Oct 3, 2024
1 parent 504c7bc commit 5e4fa58
Showing 1 changed file with 23 additions and 0 deletions.
cabal-testsuite/src/Test/Cabal/Run.hs (23 additions, 0 deletions)
@@ -54,6 +54,29 @@ runAction _verbosity mb_cwd env_overrides path0 args input action = do
   mb_env <- getEffectiveEnvironment env_overrides
   putStrLn $ "+ " ++ showCommandForUser path args
   (readh, writeh) <- createPipe
+
+  -- `System.Process.createPipe` calls (through many intermediaries)
+  -- `GHC.IO.Handle.FD.fdToHandle`, whose documentation says:
+  --
+  -- > Makes a binary Handle. This is for historical reasons; it should
+  -- > probably be a text Handle with the default encoding and newline
+  -- > translation instead.
+  --
+  -- The documentation for `System.IO.hSetBinaryMode` says:
+  --
+  -- > This has the same effect as calling `hSetEncoding` with `char8`, together
+  -- > with `hSetNewlineMode` with `noNewlineTranslation`.
+  --
+  -- But this is a lie, and Unicode written to or read from binary handles is
+  -- always encoded or decoded as Latin-1, which is always the wrong choice.
+  --
+  -- Therefore, we explicitly set the output to UTF-8 to keep it consistent
+  -- between platforms and correct on all modern computers.
+  --
+  -- See: https://gitlab.haskell.org/ghc/ghc/-/issues/25307
+  hSetEncoding readh utf8
+  hSetEncoding writeh utf8
+
   hSetBuffering readh LineBuffering
   hSetBuffering writeh LineBuffering
   let drain = do
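
The effect of the two added `hSetEncoding` calls can be checked in isolation. The following is a minimal sketch, not part of the commit: `roundTripUtf8` and the sample string are hypothetical names chosen for illustration, and it assumes the `process` package is available for `createPipe`. It writes a non-ASCII string into a fresh pipe with both ends set to UTF-8 and compares what comes back.

```haskell
import Control.Exception (evaluate)
import System.IO (hClose, hGetContents, hPutStr, hSetEncoding, utf8)
import System.Process (createPipe)

-- | Hypothetical helper: push a string through a fresh pipe whose two
-- ends are both explicitly set to UTF-8, as the commit does, and return
-- whatever is read back.
roundTripUtf8 :: String -> IO String
roundTripUtf8 s = do
  (readh, writeh) <- createPipe
  -- Without these two calls the handles stay in the binary mode that
  -- createPipe hands back.
  hSetEncoding readh utf8
  hSetEncoding writeh utf8
  hPutStr writeh s
  -- Closing the write end flushes it and lets the reader see end-of-file.
  hClose writeh
  out <- hGetContents readh
  -- hGetContents is lazy; force the whole string before returning.
  _ <- evaluate (length out)
  pure out

main :: IO ()
main = do
  let sample = "λ → ✓"
  out <- roundTripUtf8 sample
  -- Print only a Bool so the result does not depend on stdout's encoding.
  print (out == sample)
```

The sketch only works for inputs small enough to fit in the pipe buffer, since it finishes writing before it starts reading; the test suite instead drains the read end concurrently.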