Commit
Fix wiki-links with non-ascii chars being broken sometimes
Normalize unicode filenames to NFC

Ref:
- #611
- #419

Resolves #611
srid committed May 6, 2021
1 parent 4a327f5 commit 39597ae
Showing 4 changed files with 15 additions and 4 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -38,6 +38,7 @@
- Fix a bug where folgezettel relationship is not established if a note also has non-folgezettel links to the same target
- Clean HTML output when zettels are deleted (#141)
- Added '§' character in whitelist (#595)
- Normalize unicode filenames to NFC, fixing broken wiki links.

## Unreleased (v1 + v2)

4 changes: 3 additions & 1 deletion doc/Guide/Zettel ID.md
@@ -2,7 +2,9 @@
slug: id
---

-A Zettel ID is a [[Zettel Markdown]] file's filename without the extension. Zettel IDs must be unique across the Zettelkasten.
+A Zettel ID is a [[Zettel Markdown]] file's filename[^unicode] without the extension. Zettel IDs must be unique across the Zettelkasten.
+
+[^unicode]: Neuron will [NFC normalize](https://www.unicode.org/faq/normalization.html) the Zettel ID derived from filename or link so that they work reliably when using non-ascii characters in filename or links (see [[Linking]]).

By default, `neuron new`[^new] will use random alphanumeric IDs of length 8, called a "random ID". But you may use arbitrary text as ID as well, called a "title ID".

3 changes: 2 additions & 1 deletion neuron.cabal
@@ -1,6 +1,6 @@
cabal-version: 2.4
name: neuron
-version: 1.9.27.3
+version: 1.9.28.0
license: AGPL-3.0-only
copyright: 2020 Sridhar Ratnakumar
maintainer: [email protected]
@@ -84,6 +84,7 @@ common library-common
text,
time,
timeit,
unicode-transforms,
unix,
uri-encode,
uuid,
11 changes: 9 additions & 2 deletions src/Neuron/Zettelkasten/ID.hs
@@ -5,6 +5,7 @@
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE ViewPatterns #-}
{-# LANGUAGE NoImplicitPrelude #-}

module Neuron.Zettelkasten.ID
@@ -29,6 +30,7 @@ import Data.Aeson
ToJSONKey (toJSONKey),
)
import Data.Aeson.Types (toJSONKeyText)
import qualified Data.Text.Normalize as UT
import Relude hiding (traceShowId)
import System.FilePath (splitExtension, takeFileName)
import qualified Text.Megaparsec as M
@@ -110,6 +112,11 @@ idParser' cs = do
-- | Parse the ZettelID if the given filepath is a Markdown zettel.
getZettelID :: FilePath -> Maybe ZettelID
getZettelID fp = do
-  let (fileName, ext) = splitExtension $ takeFileName fp
+  let ( -- Apply unicode normalization per https://github.com/srid/neuron/issues/611
+        UT.normalize UT.NFC . toText ->
+          fileName,
+        ext
+        ) =
+          splitExtension $ takeFileName fp
   guard $ ".md" == toText ext
-  rightToMaybe $ parseZettelID (toText fileName)
+  rightToMaybe $ parseZettelID fileName
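
For readers unfamiliar with the underlying problem, here is a minimal standalone sketch (not part of this commit) of why normalization matters. It assumes the `text` and `unicode-transforms` packages are available; the names `precomposed` and `decomposed` are hypothetical and exist only for illustration. The same visible name can be encoded in two ways, for example when a filesystem stores a decomposed form while a wiki-link is typed in precomposed form, and the two compare as unequal until both are normalized to NFC.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text as T
import qualified Data.Text.Normalize as UT

-- Two encodings of the same visible text "café" (illustrative names):
-- precomposed uses the single code point U+00E9,
-- decomposed uses 'e' followed by the combining acute accent U+0301.
precomposed, decomposed :: T.Text
precomposed = "caf\x00E9"
decomposed = "cafe\x0301"

main :: IO ()
main = do
  -- The raw texts differ, so an ID lookup keyed on them would miss.
  print (precomposed == decomposed) -- False
  -- After NFC normalization both sides are identical.
  print (UT.normalize UT.NFC precomposed == UT.normalize UT.NFC decomposed) -- True
  -- NFC composes 'e' + U+0301 into one code point.
  print (T.length (UT.normalize UT.NFC decomposed)) -- 4
```

Normalizing at the point where the ID is derived, as `getZettelID` does above via the view pattern, means that downstream comparisons see a single canonical form, provided link text is normalized the same way.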
