Use all unsolved pieces on the offending line as expand pieces. #1494

munificent · 2024-05-18T05:23:10Z

The solver works by incrementally building a up a solution by binding pieces to states one at a time. To avoid wasting time exploring solutions that are pointless, it only looks at unbound pieces in play when the first line containing overflow characters or an invalid newline as written.

Before this PR, it only looked at the first unbound piece on that line. Often, the first piece on the line is not the one that actually needs to split. For example, in:

//                    |
variable = function(argument);

Here, the first piece on the overflowing line is the AssignPiece for the =, but the piece we actually want to split at is the ListPiece for the argument list.

To handle that, the solver currently tries binding the first piece to all values, including State.unsplit, even though that's effectively the value the current solution used, since unbound pieces behave like they have State.unsplit. The only reason it makes a new solution and binds the piece to State.unsplit is that if that piece turns out to not be the one that needs to split, we can now find the next piece on that same line. Now that the first piece is bound to State.unsplit, the second piece will be the first unbound one.

But the end result is that we end up generating a lot of more or less redundant solutions that just bind a bunch of pieces to State.unsplit and then produce the exact same formatted result.

Instead, this PR collects all of the unbound pieces on the first overflowing line. When it expands, it expands them all, but doesn't bind any of them to State.unsplit. The old formatter works the same way, but it wasn't clear to me that doing so was important for perf. It is!

Benchmark (tall)                fastest   median  slowest  average  baseline
-----------------------------  --------  -------  -------  -------  --------
block                             0.065    0.067    0.131    0.070     96.3%
chain                             1.515    1.540    1.629    1.547    172.2%
collection                        0.169    0.173    0.189    0.175     98.6%
collection_large                  0.896    0.926    0.959    0.925     96.5%
conditional                       0.088    0.089    0.104    0.091    179.9%
curry                             1.651    1.667    1.689    1.668    147.9%
flutter_popup_menu_test           0.409    0.422    0.448    0.423    116.8%
flutter_scrollbar_test            0.154    0.159    0.176    0.159     94.2%
function_call                     1.428    1.457    1.625    1.463     98.1%
infix_large                       0.680    0.702    0.776    0.708    144.4%
infix_small                       0.163    0.167    0.193    0.169     97.5%
interpolation                     0.090    0.093    0.120    0.094    107.0%
large                             4.552    4.631    4.987    4.650     90.6%
top_level                         0.180    0.183    0.214    0.185    135.0%

My goal is to get the new formatter as fast as the old one on a real-world corpus. Here's the results for formatting the Flutter repo:

Current formatter    15.890 ========================================
This PR              14.309 ====================================
Old formatter         7.131 =================
The current formatter is 55.12% slower than the old formatter.
This PR is 11.05% faster than the current formatter.
This PR is 50.16% slower than the old formatter.
This PR gets the formatter 18.05% of the way to the old one.

So not a huge improvement, but a big step in the right direction.

The solver works by incrementally building a up a solution by binding pieces to states one at a time. To avoid wasting time exploring solutions that are pointless, it only looks at unbound pieces in play when the first line containing overflow characters or an invalid newline as written. Before this PR, it only looked at the *first* unbound piece on that line. Often, the first piece on the line is not the one that actually needs to split. For example, in: ```dart // | variable = function(argument); ``` Here, the first piece on the overflowing line is the AssignPiece for the `=`, but the piece we actually want to split at is the ListPiece for the argument list. To handle that, the solver currently tries binding the first piece to all values, including State.unsplit, even though that's effectively the value the current solution used, since unbound pieces behave like they have State.unsplit. The only reason it makes a new solution and binds the piece to State.unsplit is that if that piece turns out to *not* be the one that needs to split, we can now find the *next* piece on that same line. Now that the first piece is *bound* to State.unsplit, the second piece will be the first *unbound* one. But the end result is that we end up generating a lot of more or less redundant solutions that just bind a bunch of pieces to State.unsplit and then produce the exact same formatted result. Instead, this PR collects *all* of the unbound pieces on the first overflowing line. When it expands, it expands them all, but *doesn't* bind any of them to State.unsplit. The old formatter works the same way, but it wasn't clear to me that doing so was important for perf. It is! ``` Benchmark (tall) fastest median slowest average baseline ----------------------------- -------- ------- ------- ------- -------- block 0.065 0.067 0.131 0.070 96.3% chain 1.515 1.540 1.629 1.547 172.2% collection 0.169 0.173 0.189 0.175 98.6% collection_large 0.896 0.926 0.959 0.925 96.5% conditional 0.088 0.089 0.104 0.091 179.9% curry 1.651 1.667 1.689 1.668 147.9% flutter_popup_menu_test 0.409 0.422 0.448 0.423 116.8% flutter_scrollbar_test 0.154 0.159 0.176 0.159 94.2% function_call 1.428 1.457 1.625 1.463 98.1% infix_large 0.680 0.702 0.776 0.708 144.4% infix_small 0.163 0.167 0.193 0.169 97.5% interpolation 0.090 0.093 0.120 0.094 107.0% large 4.552 4.631 4.987 4.650 90.6% top_level 0.180 0.183 0.214 0.185 135.0% ``` My goal is to get the new formatter as fast as the old one on a real-world corpus. Here's the results for formatting the Flutter repo: ``` Current formatter 15.890 ======================================== Optimized 14.309 ==================================== Old formatter 7.131 ================= The current formatter is 55.12% slower than the old formatter. The optimized is 11.05% faster than the current formatter. The optimized is 50.16% slower than the old formatter. The optimization gets the formatter 18.05% of the way to the old one. ``` So not a huge improvement, but a big step in the right direction.

munificent requested review from natebosch and kallentu May 18, 2024 05:23

kallentu approved these changes May 20, 2024

View reviewed changes

natebosch approved these changes May 20, 2024

View reviewed changes

munificent merged commit 0a78529 into main May 20, 2024
7 checks passed

munificent deleted the multiple-expand-pieces branch May 20, 2024 22:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use all unsolved pieces on the offending line as expand pieces. #1494

Use all unsolved pieces on the offending line as expand pieces. #1494

munificent commented May 18, 2024 •

edited

Use all unsolved pieces on the offending line as expand pieces. #1494

Use all unsolved pieces on the offending line as expand pieces. #1494

Conversation

munificent commented May 18, 2024 • edited

munificent commented May 18, 2024 •

edited