Removes id when used with remarkStringify #11

Open
dmca-glasgow opened this issue Jun 5, 2024 · 4 comments

@dmca-glasgow

Hello,

I've been trying your package in my project, but I've found that when I stringify my mdast tree back into Markdown, the heading id is lost.

I've tried to hack together a solution that works for basic text titles, but I think it oversimplifies the problem and probably breaks a lot of things:

import { unified, Processor } from 'unified';
import { Heading, Root, Text } from 'mdast';
import remarkParse from 'remark-parse';
import remarkStringify from 'remark-stringify';
import { visit } from 'unist-util-visit';

function remarkHeadingId() {
  // @ts-expect-error
  const self = this as Processor;
  const data = self.data();
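  // Store the array on the processor data so remark-stringify can find it.
  const toMarkdownExtensions =
    data.toMarkdownExtensions || (data.toMarkdownExtensions = []);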

  toMarkdownExtensions.push({
    handlers: {
      heading(node: Heading) {
        const text = node.children[0] as Text;
        const idValue = String(node.data?.hProperties?.id || '');
        const id = idValue === '' ? '' : `{#${idValue}}`;
        return `${'#'.repeat(node.depth)} ${text.value} ${id}`;
      },
    },
  });

  return (tree: Root) => {
    visit(tree, 'heading', (node) => {
      const text = node.children[0] as Text;
      const match = text.value.match(/ {#([^]+?)}$/);

      if (match !== null) {
        node.data = {
          hProperties: {
            id: match[1],
          },
        };
        text.value = text.value.slice(0, match.index);
      }
    });
  };
}

const processor = unified()
  .use(remarkParse)
  .use(remarkHeadingId)
  .use(remarkStringify)

const markdown = `### Hello {#hi}`;
const mdast = processor.parse(markdown);
const transformed = await processor.run(mdast);

console.dir(transformed, { depth: null });

console.log(processor.stringify(transformed as Root))

Transformed (position data removed):

{
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 3,
      children: [
        {
          type: 'text',
          value: 'Hello'
        }
      ],
      data: { hProperties: { id: 'hi' } }
    }
  ]
}

Stringify Result:

### Hello {#hi}

With your plugin, the result is:

### Hello

I had a look at the way micromark parses ATX headings to see if I could replicate the logic, but it looks pretty complex!

Thanks.

@dmca-glasgow
Author

Oh, d'oh... Reading back through my question, I came up with a much more straightforward solution:

function remarkHeadingId() {
  return (tree: Root) => {
    visit(tree, 'heading', (node) => {
      const text = node.children[0] as Text;
      const match = text.value.match(/ {#([^]+?)}$/);

      if (match !== null) {
        node.data = {
          hProperties: {
            id: match[1],
          },
          hChildren: [
            {
              type: 'text',
              value: text.value.slice(0, match.index),
            },
          ],
        };
      }
    });
  };
}

Instead of mutating the value in the mdast node, just put the new value in the hChildren data property, which remarkRehype reads.

If this works for you I can create a PR?
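
A minimal, dependency-free sketch of the idea in this comment: record the id and the trimmed text under data and leave the mdast children alone. The types TextNode and HeadingNode, the helper annotateHeadingId, and the restriction to headings with a single text child are assumptions made for illustration, not part of the package. The regex is a slightly stricter variant of the one above (escaped braces, no } inside the id).

```typescript
// Minimal stand-ins for the node shapes used in this thread (assumption:
// only text children matter for this illustration).
interface TextNode {
  type: 'text';
  value: string;
}

interface HeadingData {
  hProperties?: Record<string, string>;
  hChildren?: TextNode[];
}

interface HeadingNode {
  type: 'heading';
  depth: number;
  children: TextNode[];
  data?: HeadingData;
}

// Trailing " {#some-id}" at the end of the heading text.
const HEADING_ID = / \{#([^}]+)\}$/;

// Hypothetical helper: stores the id and the trimmed text under `data`
// and leaves `node.children` untouched.
function annotateHeadingId(node: HeadingNode): void {
  const only = node.children[0];
  // Assumption: skip headings that are not a single text node.
  if (node.children.length !== 1 || only === undefined) return;

  const match = only.value.match(HEADING_ID);
  if (match === null || match.index === undefined) return;

  const id = match[1];
  if (!id) return;

  // Merge into any existing data instead of overwriting it.
  node.data = {
    ...node.data,
    hProperties: { ...node.data?.hProperties, id },
    hChildren: [{ type: 'text', value: only.value.slice(0, match.index) }],
  };
}

const heading: HeadingNode = {
  type: 'heading',
  depth: 3,
  children: [{ type: 'text', value: 'Hello {#hi}' }],
};

annotateHeadingId(heading);

console.log(heading.children[0]?.value); // Hello {#hi}
console.log(heading.data);
```

Because hChildren replaces all rendered children of the heading, a heading with inline formatting would need its hast children built explicitly; this sketch simply skips such headings.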

@imcuttle
Owner

I don't understand the cause of the issue.
Your code looks consistent with the logic in the source:

const setNodeId = (node, id) => {
  if (!node.data) node.data = {}
  if (!node.data.hProperties) node.data.hProperties = {}
  node.data.id = node.data.hProperties.id = id
}

if (matched) {
  let id = matched[1]
  if (!!id.length) {
    setNodeId(node, id)
    string = string.substring(0, matched.index)
    lastChild.value = string
    return
  }
}

@dmca-glasgow
Author

Hi there,

The difference is that lastChild is an mdast node:

visit(node, 'heading', node => {
  let lastChild = node.children[node.children.length - 1]

Whereas I propose using hChildren:

hChildren: [
  {
    type: 'text',
    value: text.value.slice(0, match.index),
  },
]

If you are converting Markdown to HTML, you can transform the AST in two ways. You can change the mdast node directly, as you have done, and remarkRehype will convert it to a hast node.

Alternatively, you can use the special data properties hName, hProperties and hChildren, where you can add hast nodes. The advantage is that the mdast node stays unchanged, so the plugin is more flexible: the processor can still do other things with the tree. For example, I am using remarkStringify to turn the AST back into Markdown.
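
The two consumers can be modeled with a toy example. The functions toyToHtml and toyToMarkdown below are hypothetical stand-ins for the HTML path (remarkRehype) and the Markdown path (remarkStringify). They are not the real implementations and they skip escaping, but they show which fields each path reads: the HTML side prefers the h* data fields, while the Markdown side only looks at the mdast children.

```typescript
// Simplified node shapes (assumption: text-only headings).
interface TextLike {
  type: 'text';
  value: string;
}

interface ToyHeading {
  type: 'heading';
  depth: number;
  children: TextLike[]; // mdast children
  data?: {
    hName?: string;
    hProperties?: Record<string, string>;
    hChildren?: TextLike[]; // hast children
  };
}

// Toy HTML path: the h* data fields win over the mdast fields.
function toyToHtml(node: ToyHeading): string {
  const tag = node.data?.hName ?? `h${node.depth}`;
  const attrs = Object.entries(node.data?.hProperties ?? {})
    .map(([name, value]) => ` ${name}="${value}"`)
    .join('');
  const children = node.data?.hChildren ?? node.children;
  const text = children.map((child) => child.value).join('');
  return `<${tag}${attrs}>${text}</${tag}>`;
}

// Toy Markdown path: ignores `data` entirely.
function toyToMarkdown(node: ToyHeading): string {
  const text = node.children.map((child) => child.value).join('');
  return `${'#'.repeat(node.depth)} ${text}`;
}

// A heading annotated through data only; the mdast text is unchanged.
const toyHeading: ToyHeading = {
  type: 'heading',
  depth: 3,
  children: [{ type: 'text', value: 'Hello {#hi}' }],
  data: {
    hProperties: { id: 'hi' },
    hChildren: [{ type: 'text', value: 'Hello' }],
  },
};

const html = toyToHtml(toyHeading);
const markdown = toyToMarkdown(toyHeading);

console.log(html); // <h3 id="hi">Hello</h3>
console.log(markdown); // ### Hello {#hi}
```

Had the plugin trimmed the mdast text instead, the Markdown path would have nothing left to print the id from, which is the behavior reported in this issue.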

If you experiment with this example you should see that it currently removes the id attribute:

import { unified } from 'unified';
import remarkParse from 'remark-parse';
import remarkStringify from 'remark-stringify';
// remarkHeadingId is this package's plugin; import it as described in its README.

const processor = unified()
  .use(remarkParse)
  .use(remarkHeadingId)
  .use(remarkStringify)

const markdown = `### Hello {#hi}`;

console.log(String(await processor.process(markdown)))

More info here.

Thanks.

@imcuttle
Owner

@dmca-glasgow PRs welcome! Please add some test cases as well.
