Skip to content

Roundtrip loss: trailing break node in paragraph serializes but doesn't parse back #1477

@gkubisa

Description

@gkubisa

Subject

remark-stringify serializes a trailing break node in a paragraph as \ + newline, but remark-parse does not parse it back as a break — it produces a literal \ in the text instead. This means MDAST trees with trailing breaks in paragraphs don't survive a serialize → parse roundtrip.

Steps to reproduce

import remarkParse from 'remark-parse';
import remarkStringify from 'remark-stringify';
import { unified } from 'unified';

// An MDAST with a trailing break in a paragraph
const tree = {
  type: 'root',
  children: [{
    type: 'paragraph',
    children: [
      { type: 'text', value: 'test' },
      { type: 'break' },
    ],
  }],
};

// Serialize: produces "test\\\n" (backslash + newline)
const md = unified().use(remarkStringify).stringify(tree);
console.log('Serialized:', JSON.stringify(md)); // "test\\\n"

// Parse back: the trailing \ becomes a literal backslash, not a break
const parsed = unified().use(remarkParse).parse(md);
const children = parsed.children[0].children;
console.log('Parsed children:', JSON.stringify(children.map(c => ({ type: c.type, value: c.value }))));
// [{ type: "text", value: "test\\" }] — literal backslash, no break node

Expected behavior

A break node at the end of a paragraph should survive a remark-stringifyremark-parse roundtrip. Either:

  1. remark-stringify should not emit \ + newline for a trailing break (since it won't be parsed back), or
  2. remark-parse should recognize \ at end-of-paragraph as a hard break

Actual behavior

remark-stringify emits \ + newline, but remark-parse treats \ at end of a block as a literal backslash (per CommonMark spec). The break node is lost and replaced with a literal \ character in the text.

Context

This issue surfaces when converting HTML → Markdown → HTML. A trailing <br> inside <p> (e.g. from a WYSIWYG editor's Shift+Enter) gets converted to a break node in MDAST, serialized as \ + newline, and then parsed back as a literal \ character — showing a visible backslash to the user instead of preserving the line break.

Metadata

Metadata

Assignees

No one assigned

    Labels

    🤞 phase/openPost is being triaged manually

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions