Removes id when used with remarkStringify #11

dmca-glasgow · 2024-06-05T14:17:51Z

Hello,

I've been trying your package in my project, but I've found when I stringify my mdast tree back into markdown, the heading id is now lost.

I've tried to hack together a solution which works for basic text titles, but I think it's an over-simplification of the problem and probably breaks a lot of stuff:

import { unified, Processor } from 'unified';
import { Heading, Root, Text } from 'mdast';
import remarkParse from 'remark-parse';
import remarkStringify from 'remark-stringify';
import { visit } from 'unist-util-visit';

function remarkHeadingId() {
  // @ts-expect-error
  const self = this as Processor;
  const data = self.data();
  const toMarkdownExtensions = data.toMarkdownExtensions || [];

  toMarkdownExtensions.push({
    handlers: {
      heading(node: Heading) {
        const text = node.children[0] as Text;
        const idValue = String(node.data?.hProperties?.id || '');
        // Keep the separating space inside the suffix so headings without an id get no trailing space.
        const id = idValue === '' ? '' : ` {#${idValue}}`;
        return `${'#'.repeat(node.depth)} ${text.value}${id}`;
      },
    },
  });

  return (tree: Root) => {
    visit(tree, 'heading', (node) => {
      const text = node.children[0] as Text;
      const match = text.value.match(/ {#([^]+?)}$/);

      if (match !== null) {
        node.data = {
          hProperties: {
            id: match[1],
          },
        };
        text.value = text.value.slice(0, match.index);
      }
    });
  };
}

const processor = unified()
  .use(remarkParse)
  .use(remarkHeadingId)
  .use(remarkStringify)

const markdown = `### Hello {#hi}`;
const mdast = processor.parse(markdown);
const transformed = await processor.run(mdast);

console.dir(transformed, { depth: null });

console.log(processor.stringify(transformed as Root))

Transformed (position data removed):

{
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 3,
      children: [
        {
          type: 'text',
          value: 'Hello'
        }
      ],
      data: { hProperties: { id: 'hi' } }
    }
  ]
}

Stringify Result:

### Hello {#hi}

With your plugin, the result is:

### Hello

I had a look at the way Micromark parses ATX headings to see if I could replicate the logic but it looks pretty complex!

Thanks.
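One way to avoid re-implementing heading serialization by hand (the part that only works for basic text titles above) might be to wrap whatever serializer already exists and append the id suffix to its output. The sketch below is dependency-free: `HeadingLike`, `HeadingSerializer`, `withIdSuffix` and `fakeBase` are hypothetical names made up for illustration, and `fakeBase` merely stands in for a real heading serializer. It also assumes ATX-style output on a single line.

```typescript
// Minimal stand-in for an mdast heading node (assumption for this sketch).
interface HeadingLike {
  type: 'heading';
  depth: number;
  data?: { hProperties?: { id?: unknown } };
}

// Any function that turns a heading (plus extra arguments) into one Markdown line.
type HeadingSerializer<Args extends unknown[]> = (
  node: HeadingLike,
  ...rest: Args
) => string;

// Hypothetical helper: delegate to `base`, then append ` {#id}` if an id is set.
function withIdSuffix<Args extends unknown[]>(
  base: HeadingSerializer<Args>,
): HeadingSerializer<Args> {
  return (node, ...rest) => {
    const line = base(node, ...rest);
    const id = node.data?.hProperties?.id;
    return typeof id === 'string' && id !== '' ? `${line} {#${id}}` : line;
  };
}

// Fake base serializer, standing in for a real one that handles inline content.
const fakeBase = (node: HeadingLike): string =>
  `${'#'.repeat(node.depth)} Hello *world*`;

const serializeHeading = withIdSuffix(fakeBase);

const withId = serializeHeading({
  type: 'heading',
  depth: 3,
  data: { hProperties: { id: 'hi' } },
});
const withoutId = serializeHeading({ type: 'heading', depth: 2 });

console.log(withId); // ### Hello *world* {#hi}
console.log(withoutId); // ## Hello *world*
```

Because the wrapper never looks at the heading's children, inline content such as emphasis or links stays the responsibility of the wrapped serializer.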


dmca-glasgow · 2024-06-05T14:24:29Z

Oh, d'oh.. just reading back through my question and I came up with the (much more straightforward!) solution:

function remarkHeadingId() {
  return (tree: Root) => {
    visit(tree, 'heading', (node) => {
      const text = node.children[0] as Text;
      const match = text.value.match(/ {#([^]+?)}$/);

      if (match !== null) {
        node.data = {
          hProperties: {
            id: match[1],
          },
          hChildren: [
            {
              type: 'text',
              value: text.value.slice(0, match.index),
            },
          ],
        };
      }
    });
  };
}

Instead of mutating the value in the mdast node, just put the new value in the hChildren data property that remarkRehype reads.

If this works for you I can create a PR?
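A slightly more defensive variant of the idea is sketched below, with no dependencies. It assumes, as far as I understand the remarkRehype behaviour, that hChildren replaces all rendered children of the node, so the sketch only touches headings whose sole child is a text node and leaves everything else alone. The types, the `ID_SUFFIX` pattern and the `annotateHeadingId` name are hypothetical stand-ins, not the plugin's actual code.

```typescript
// Minimal local stand-ins for the mdast/hast shapes (assumptions for this sketch).
interface SketchText {
  type: 'text';
  value: string;
}
interface SketchHeading {
  type: 'heading';
  depth: number;
  children: Array<{ type: string; value?: string }>;
  data?: {
    hProperties?: { id?: string };
    hChildren?: SketchText[];
  };
}

// Trailing ` {#some-id}` marker; the pattern is illustrative only.
const ID_SUFFIX = / \{#([^}]+)\}$/;

// Records the id and the cleaned text under `data`, leaving `children` untouched.
// Returns true when the heading was annotated.
function annotateHeadingId(heading: SketchHeading): boolean {
  // hChildren would override every child, so only handle the single-text case.
  if (heading.children.length !== 1) return false;
  const only = heading.children[0];
  if (only.type !== 'text' || typeof only.value !== 'string') return false;

  const match = ID_SUFFIX.exec(only.value);
  if (match === null) return false;

  heading.data = {
    ...heading.data,
    hProperties: { ...heading.data?.hProperties, id: match[1] },
    hChildren: [{ type: 'text', value: only.value.slice(0, match.index) }],
  };
  return true;
}

const sample: SketchHeading = {
  type: 'heading',
  depth: 3,
  children: [{ type: 'text', value: 'Hello {#hi}' }],
};
const annotated = annotateHeadingId(sample);
console.log(annotated, sample.children[0].value, sample.data);
```

Headings with inline formatting would need their children converted to hast nodes before they could go into hChildren, which is why this sketch skips them rather than guessing.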

imcuttle · 2024-06-17T02:39:29Z

I can't understand the cause of the issue.
Your code looks consistent with the source logic.

remark-heading-id/util.js

Lines 25 to 29 in 5f6272e

    
const setNodeId = (node, id) => {
  if (!node.data) node.data = {}
  if (!node.data.hProperties) node.data.hProperties = {}
  node.data.id = node.data.hProperties.id = id
}

remark-heading-id/index.js

Lines 20 to 29 in 5f6272e

    
if (matched) {
  let id = matched[1]
  if (!!id.length) {
    setNodeId(node, id)
    string = string.substring(0, matched.index)
    lastChild.value = string
    return
  }
}

dmca-glasgow · 2024-06-17T15:53:39Z

Hi there,

The difference is lastChild is an mdast node:

remark-heading-id/index.js

Lines 14 to 15 in 5f6272e

    
visit(node, 'heading', node => {
  let lastChild = node.children[node.children.length - 1]

Whereas I propose using hChildren:

hChildren: [
  {
    type: 'text',
    value: text.value.slice(0, match.index),
  },
]

If you are converting Markdown to HTML, you can transform the AST in 2 ways: you can directly change the mdast node, as you have done, and it will be converted to a hast node by remarkRehype.

Or, the other way, you can use the special data properties hName, hProperties and hChildren, where you can add hast nodes. The advantage here is that you don't change the mdast node, which makes your plugin more flexible because the processor can still do other things with the tree. For example, I am using remarkStringify to turn the AST back into Markdown.
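To make the difference concrete, here is a small dependency-free comparison. The two node shapes mirror the ones discussed in this thread; `toyStringify` and `toyHtml` are hypothetical toy functions that only imitate the relevant behaviour (a Markdown serializer reading the mdast fields, an HTML converter honouring hProperties and hChildren). They are not the real remarkStringify or remarkRehype.

```typescript
interface ToyText {
  type: 'text';
  value: string;
}
interface ToyHeading {
  type: 'heading';
  depth: number;
  children: ToyText[];
  data?: { hProperties?: { id?: string }; hChildren?: ToyText[] };
}

// Way 1: the mdast text itself was changed, the id lives only in data.
const mutated: ToyHeading = {
  type: 'heading',
  depth: 3,
  children: [{ type: 'text', value: 'Hello' }],
  data: { hProperties: { id: 'hi' } },
};

// Way 2: the mdast text is intact, the HTML view is described in data.
const viaData: ToyHeading = {
  type: 'heading',
  depth: 3,
  children: [{ type: 'text', value: 'Hello {#hi}' }],
  data: {
    hProperties: { id: 'hi' },
    hChildren: [{ type: 'text', value: 'Hello' }],
  },
};

// Toy Markdown serializer: looks only at depth and children, ignores h* fields.
function toyStringify(node: ToyHeading): string {
  const text = node.children.map((child) => child.value).join('');
  return `${'#'.repeat(node.depth)} ${text}`;
}

// Toy HTML converter: prefers hChildren and applies hProperties.id when present.
function toyHtml(node: ToyHeading): string {
  const id = node.data?.hProperties?.id;
  const children = node.data?.hChildren ?? node.children;
  const attr = id ? ` id="${id}"` : '';
  const text = children.map((child) => child.value).join('');
  return `<h${node.depth}${attr}>${text}</h${node.depth}>`;
}

console.log(toyStringify(mutated)); // ### Hello
console.log(toyStringify(viaData)); // ### Hello {#hi}
console.log(toyHtml(mutated)); // <h3 id="hi">Hello</h3>
console.log(toyHtml(viaData)); // <h3 id="hi">Hello</h3>
```

Both shapes produce the same HTML in this toy model, but only the second one keeps the `{#hi}` marker available when the tree goes back to Markdown.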

If you experiment with this example you should see that it currently removes the id attribute:

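import { unified } from 'unified';
// Assumption: the plugin is available as a named export of this package; adjust if your version differs.
import { remarkHeadingId } from 'remark-heading-id';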
import remarkParse from 'remark-parse';
import remarkStringify from 'remark-stringify';

const processor = unified()
  .use(remarkParse)
  .use(remarkHeadingId)
  .use(remarkStringify)

const markdown = `### Hello {#hi}`;

console.log(String(await processor.process(markdown)))

More info here.

Thanks.

imcuttle · 2024-06-24T05:55:26Z

@dmca-glasgow PRs welcome! Please add some test cases as well.
