List items wrapped in <p> tags due to trailing space #114

scottf3 · 2022-07-19T17:20:51Z

Initial checklist

I read the support docs
I read the contributing guide
I agree to follow the code of conduct
I searched issues and couldn’t find anything (or linked relevant results below)

Affected packages and versions

3

Link to runnable example

No response

Steps to reproduce

In Chrome's console, run:

const mm = await import('https://esm.sh/micromark@3?bundle');
console.log(mm.micromark('List1\n* item1\n* item2\n\n\n\n'));
console.log('------');
console.log(mm.micromark('List1\n* item1\n* item2\n\n\n \n'));

Note that the only difference between the two examples is a single space, a few blank lines after the list. The two examples return different HTML: the latter has the list items wrapped in <p> tags:

<p>List1</p>
<ul>
<li>item1</li>
<li>item2</li>
</ul>
------
<p>List1</p>
<ul>
<li>
<p>item1</p>
</li>
<li>
<p>item2</p>
</li>
</ul>

Expected behavior

I'm not clear enough on the markdown spec to say which case is actually correct. Certainly other markdown parsers I've tried (though that is not a long list) render it like the first example.

Regardless I'd expect it to be the same between the two. In most markdown editors the trailing space is impossible to see and it can take a long time to track down why some list elements render with increased padding.
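
Until the parser treats both inputs alike, one possible workaround is to normalize the end of the input before parsing. The sketch below is a minimal illustration, not part of micromark: `stripTrailingBlankLines` is a hypothetical helper name, and it assumes the document does not end inside an unclosed fenced code block, where trailing whitespace would count as content.

```javascript
// Hypothetical helper (not a micromark API): drop every space, tab, and
// line ending at the very end of the input, then add back one line ending.
function stripTrailingBlankLines(markdown) {
  return markdown.replace(/[ \t\r\n]+$/, '') + '\n';
}

// The two inputs from the repro steps differ only by one trailing space.
const withoutSpace = 'List1\n* item1\n* item2\n\n\n\n';
const withSpace = 'List1\n* item1\n* item2\n\n\n \n';

// After normalization both inputs are the same string.
console.log(stripTrailingBlankLines(withoutSpace) === stripTrailingBlankLines(withSpace));
```

Passing the normalized string to the parser, for example `mm.micromark(stripTrailingBlankLines(input))`, gives it identical input for both examples, so the output cannot differ between them.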

Actual behavior

See repro steps. Two examples output visually different HTML whereas I feel they should render the same.

Runtime

No response

Package manager

No response

OS

No response

Build and bundle tools

No response


wooorm · 2022-07-19T18:46:18Z

Thanks for raising this issue.
I am working on a port of micromark in another language, which improves a couple of architectural things, likely also including this problem.
It's going to take a bit but I am planning to port those back.
In the meantime, perhaps you can find out how to fix this here now.

github-actions · 2022-07-19T18:46:47Z

Hi! This was marked as ready to be worked on! Note that while this is ready to be worked on, nothing is said about priority: it may take a while for this to be solved.

Is this something you can and want to work on?

Team: please use the area/* (to describe the scope of the change), platform/* (if this is related to a specific one), and semver/* and type/* labels to annotate this. If this is first-timers friendly, add good first issue and if this could use help, add help wanted.

sisp · 2022-11-24T11:53:06Z

I think there are two possible places where this problem could be fixed:

1. micromark/dev/lib/compile.js::prepareList: When compiling to basic HTML, the events related to a list are preprocessed such that the items of a list without blank lines between items are compiled as <li>...</li> and the items of a list with blank lines in at least one item are compiled as <li><p>...</p></li> (tight vs. loose lists). This preprocessing logic could be extended to ignore trailing blank lines that are part of the tokenized list.
2. micromark-core-commonmark/dev/lib/list.js::tokenizeListContinuation: The list continuation tokenizer could be modified to not only check whether the current line is a blank line, but to keep checking subsequent lines until a non-blank line is found that terminates the list, and only then abort, so that the blank lines (including non-empty blank lines, which are treated as list item prefixes + line endings) are excluded from the last list item.

I tend to prefer option 2, but I'm not sure whether that's easily done with a document parser as lines are parsed independently and content is parsed implicitly. Any comments and/or suggestions?
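
The idea behind option 1 can be sketched on a simplified model. This is an assumption-laden illustration, not micromark's code: `isLooseList` is a hypothetical function, and it works on plain line strings rather than on the events that prepareList actually receives. It also ignores nested containers and code blocks.

```javascript
// Simplified model, not micromark's real event structure: `lines` holds the
// raw lines that the tokenizer attached to one list.
function isLooseList(lines) {
  // Find the last line that carries content; blank lines after it are trailing.
  let lastContent = lines.length - 1;
  while (lastContent >= 0 && lines[lastContent].trim() === '') {
    lastContent--;
  }

  // A blank line only makes the list loose if list content follows it.
  for (let i = 0; i < lastContent; i++) {
    if (lines[i].trim() === '') {
      return true;
    }
  }
  return false;
}

// Trailing blank lines, with or without a space, no longer affect the result.
console.log(isLooseList(['* item1', '* item2', '', '', '']));
console.log(isLooseList(['* item1', '* item2', '', '', ' ']));
// A blank line between two items still makes the list loose.
console.log(isLooseList(['* item1', '', '* item2']));
```

In this model, both inputs from the issue are classified the same way, because only blank lines followed by further list content are considered.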

github-actions bot added 👋 phase/new Post is being triaged automatically 🤞 phase/open Post is being triaged manually and removed 👋 phase/new Post is being triaged automatically labels Jul 19, 2022

wooorm added the 🙆 yes/confirmed This is confirmed and ready to be worked on label Jul 19, 2022

github-actions bot added 👍 phase/yes Post is accepted and can be worked on and removed 🤞 phase/open Post is being triaged manually labels Jul 19, 2022

zhyd1997 mentioned this issue Aug 16, 2022

MDX2: newline character is added visually, and extra <p> tag is getting inserted in the DOM storybookjs/storybook#18921

Closed
