Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving performance by reducing useless parsing #24

Open
wooorm opened this issue Sep 29, 2020 · 6 comments
Open

Improving performance by reducing useless parsing #24

wooorm opened this issue Sep 29, 2020 · 6 comments
Labels
🏁 area/perf This affects performance 👍 phase/yes Post is accepted and can be worked on 💬 type/discussion This is a request for comments 🦋 type/enhancement This is great to have 🙆 yes/confirmed This is confirmed and ready to be worked on

Comments

@wooorm
Copy link
Member

wooorm commented Sep 29, 2020

Subject of the feature

There are two main places where parsing is done that is (potentially) useless.

  • content: At the end of a line in content, we parse ahead, to figure out if the paragraph should be closed
    This is double, because when the paragraph is closed, we will actually do the parsing.
    This was done in remark too. And similar to there, it is a bit optimized.
    This point in parsing markdown is rather complex, because of interplay with definitions, setext headings, paragraphs, but also lazy lines.
    Removing lookaheadConstruct improves performance by 13%. The alternative should be possible and hopefully is not too big.
  • document: to figure out whether containers continue, close flow, start new flow, or have lazy lines, another throwaway inspection is done.
    Removing document completely improves performance by 28% (although lists are complex so it some time spent there is unavoidable)
    (SOLVED IN 939e90d)
@wooorm wooorm added 🦋 type/enhancement This is great to have 🏁 area/perf This affects performance 🙉 open/needs-info This needs some more info 💬 type/discussion This is a request for comments labels Sep 29, 2020
@fazouane-marouane
Copy link
Member

It could be helpful to have rough idea of bottlenecks. I tried chrome tools for nodejs but it doesn't provide useful reports for micromark. Maybe there's a flag to include or maybe it works better for browser code.

@wooorm
Copy link
Member Author

wooorm commented Oct 5, 2020

I’ve essentially run micromark on 1mb of markdown, and then toggled a bunch of code on and off to figure out where potential bottle necks are. Hence the 13% and 28% mentioned above.
I did see that V8 in Chrome performs fast, and in Node it’s a bit slower. I’ve looked a bit at Chrome’s profiler, but I’m not too great at it 😅

@fazouane-marouane
Copy link
Member

Oh sorry I overlooked the performance speedups. 13% and 28% look good indeed !

@wooorm
Copy link
Member Author

wooorm commented Oct 5, 2020

this repo might be useful! https://github.com/wooorm/a-dump-of-markdown.

And for these two numbers: well, block quotes and lists do need to get parsed. So 28% improvement won’t be reached. But, 1/3rd of the time is spent with ’em.

@wooorm wooorm added 👍 phase/yes Post is accepted and can be worked on 🙆 yes/confirmed This is confirmed and ready to be worked on and removed 🙉 open/needs-info This needs some more info labels Apr 12, 2021
wooorm added a commit that referenced this issue Jun 16, 2021
*   Remove unneeded double parse, previously needed to investigate
    whether flow was allowed to be lazy
*   Remove `lazy` field on constructs: whether a line in a construct
    is allowed to be lazy must now be defined by construct’s
    tokenizers.
*   Remove `.lazy()` method on effects
*   Fix delayed lazyness, as used by GFM tables

Related to GH-24.
Closes remarkjs/remark-gfm#1.
Closes remarkjs/remark-gfm#3.
@github-actions

This comment has been minimized.

@wooorm
Copy link
Member Author

wooorm commented Jun 16, 2021

The document part is now fixed!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏁 area/perf This affects performance 👍 phase/yes Post is accepted and can be worked on 💬 type/discussion This is a request for comments 🦋 type/enhancement This is great to have 🙆 yes/confirmed This is confirmed and ready to be worked on
Development

No branches or pull requests

2 participants