Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a mechanism to allow a process plugin to declare it as single-threaded so dprint can scale it #761

Open
bradzacher opened this issue Oct 27, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@bradzacher
Copy link

Working with nodejs in a process plugin I've found that there's a lot of overhead introduced by the IPC. In a nutshell a format request requires 4 cloned IPC messages to complete - 2 of which contain a file's contents:

  1. dprint asks the plugin to format a file with text
  2. the plugin asks its worker thread to format the file with text
  3. the worker thread responds with either the formatted text or null
  4. the plugin responds with either the formatted text or null

This cloned IPC has a lot of overhead, sadly.
I've been running some tests on Canva's repo and I'm currently averaging ~10ms per file for an unchanged format - which isn't bad, but adds up across ~60k files.

In contrast, I've written a pure-nodejs script for the formatter which spawns 4 child_processes, sends each 1/4 of the entire list of files, and lets each process do its own read-format-write cycle and that runs at ~1ms per file.

I think that if I could remove the nodejs-based worker_thread parallelisation layer and instead have dprint just spawn n instances of the process plugin then things would be much faster and closer to that "reference" implementation perf-wise.

What do you think? Do you think that this feature might make sense - giving a plugin the ability to say "hey I'm single threaded - spin up as many of me as you please" and letting dprint do its own thing as it sees fit based on resource utilisation and availability.

@dsherret dsherret added the enhancement New feature or request label Oct 27, 2023
@dsherret
Copy link
Member

@bradzacher yes, it should be faster for sure.

That said, what do you need a JavaScript-based formatter for? I've found it's way faster to use deno's Rust crates for anything JavaScript based similar to what dprint-plugin-prettier does (https://github.com/dprint/dprint-plugin-prettier). I'm planning to overhaul that plugin to be better soon (going to release a dprint roadmap sometime in the next few days that will include the major changes to that and also will try to prioritize some of this work).

Btw, sorry for not responding to your email. I was in Japan the past two weeks and wasn't really checking it (back now). I think most questions in it have been answered now. Also, thanks for the details on worker_threads! That was intresting.

@bradzacher
Copy link
Author

bradzacher commented Oct 27, 2023

We want to avoid introducing deno to the repo if we can. Mostly because if we introduce it then we have to maintain it and our team is already stretched kinda thin in regards to ownership and maintenance!

A process plugin seemed like a good middle-ground - a smallish piece of JS code that allows us to use the standard NodeJS version for our environment to improve the perf compared to dprint-plugin-exec. It certainly did improve it a lot - from about ~300ms per file down to ~10ms per file! But yeah still shy of what I'd really want at our (growing) scale.

I actually did want to use dprint-plugin-prettier! However we need to use our forked version of prettier (@canva/prettier) instead of the open source version - but the plugin currently has a hard-coded version bundled (AFAICT?).
The deno usage also put us off just a little as mentioned above - though that's something we could swallow and ignore as a black-box implementation detail.


Btw, sorry for not responding to your email

No problem at all 😃 You gave me more than enough to work with! With your examples I was able to build a fully-working JS process plugin! I can probably share it if you're interested.
The protocol side of things was simple to do - the hardest part has been stably reading the input stream. It took a lot of trial and error to get that working fully - turns out that reading binary data from stdin is a bitch to get right in NodeJS.

@dsherret
Copy link
Member

However we need to use our forked version of prettier (@canva/prettier) instead of the open source version - but the plugin currently has a hard-coded version bundled (AFAICT?).

Ah, ok makes sense! That said, there is a dprint-plugin-deno-base crate that makes it slighly eaiser to use for a scenario like that: https://github.com/dprint/dprint-plugin-prettier/blob/main/base/Cargo.toml (I extracted this out of dprint-plugin-prettier a few months ago... it is very undocumented though)

It uses the Deno crates, but it's not really Deno. It uses Deno's helper crate wrappers around v8 and then some of the other crates to get some APIs that are used in Prettier. I had to polyfill a bunch of stuff still though, which was a bit of a pain.

Yeah it requires a bundled version atm (I'm planning to support prettier plugins that are configurable soon).

With your examples I was able to build a fully-working JS process plugin! I can probably share it if you're interested.
The protocol side of things was simple to do - the hardest part has been stably reading the input stream. It took a lot of trial and error to get that working fully - turns out that reading binary data from stdin is a bitch to get right in NodeJS.

dprint-plugin-prettier used to work this way and use JS for the communication (and when I exprimented with https://github.com/dprint/node-plugin-base/). It was indeed surprisingly extremely annoying to read from stdin.

@dsherret dsherret mentioned this issue Oct 28, 2023
24 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants