Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use all unsolved pieces on the offending line as expand pieces. #1494

Merged
merged 1 commit into from
May 20, 2024

Conversation

munificent
Copy link
Member

@munificent munificent commented May 18, 2024

The solver works by incrementally building a up a solution by binding pieces to states one at a time. To avoid wasting time exploring solutions that are pointless, it only looks at unbound pieces in play when the first line containing overflow characters or an invalid newline as written.

Before this PR, it only looked at the first unbound piece on that line. Often, the first piece on the line is not the one that actually needs to split. For example, in:

//                    |
variable = function(argument);

Here, the first piece on the overflowing line is the AssignPiece for the =, but the piece we actually want to split at is the ListPiece for the argument list.

To handle that, the solver currently tries binding the first piece to all values, including State.unsplit, even though that's effectively the value the current solution used, since unbound pieces behave like they have State.unsplit. The only reason it makes a new solution and binds the piece to State.unsplit is that if that piece turns out to not be the one that needs to split, we can now find the next piece on that same line. Now that the first piece is bound to State.unsplit, the second piece will be the first unbound one.

But the end result is that we end up generating a lot of more or less redundant solutions that just bind a bunch of pieces to State.unsplit and then produce the exact same formatted result.

Instead, this PR collects all of the unbound pieces on the first overflowing line. When it expands, it expands them all, but doesn't bind any of them to State.unsplit. The old formatter works the same way, but it wasn't clear to me that doing so was important for perf. It is!

Benchmark (tall)                fastest   median  slowest  average  baseline
-----------------------------  --------  -------  -------  -------  --------
block                             0.065    0.067    0.131    0.070     96.3%
chain                             1.515    1.540    1.629    1.547    172.2%
collection                        0.169    0.173    0.189    0.175     98.6%
collection_large                  0.896    0.926    0.959    0.925     96.5%
conditional                       0.088    0.089    0.104    0.091    179.9%
curry                             1.651    1.667    1.689    1.668    147.9%
flutter_popup_menu_test           0.409    0.422    0.448    0.423    116.8%
flutter_scrollbar_test            0.154    0.159    0.176    0.159     94.2%
function_call                     1.428    1.457    1.625    1.463     98.1%
infix_large                       0.680    0.702    0.776    0.708    144.4%
infix_small                       0.163    0.167    0.193    0.169     97.5%
interpolation                     0.090    0.093    0.120    0.094    107.0%
large                             4.552    4.631    4.987    4.650     90.6%
top_level                         0.180    0.183    0.214    0.185    135.0%

My goal is to get the new formatter as fast as the old one on a real-world corpus. Here's the results for formatting the Flutter repo:

Current formatter    15.890 ========================================
This PR              14.309 ====================================
Old formatter         7.131 =================
The current formatter is 55.12% slower than the old formatter.
This PR is 11.05% faster than the current formatter.
This PR is 50.16% slower than the old formatter.
This PR gets the formatter 18.05% of the way to the old one.

So not a huge improvement, but a big step in the right direction.

The solver works by incrementally building a up a solution by binding pieces to states one at a time. To avoid wasting time exploring solutions that are pointless, it only looks at unbound pieces in play when the first line containing overflow characters or an invalid newline as written.

Before this PR, it only looked at the *first* unbound piece on that line. Often, the first piece on the line is not the one that actually needs to split. For example, in:

```dart
//                    |
variable = function(argument);
```

Here, the first piece on the overflowing line is the AssignPiece for the `=`, but the piece we actually want to split at is the ListPiece for the argument list.

To handle that, the solver currently tries binding the first piece to all values, including State.unsplit, even though that's effectively the value the current solution used, since unbound pieces behave like they have State.unsplit. The only reason it makes a new solution and binds the piece to State.unsplit is that if that piece turns out to *not* be the one that needs to split, we can now find the *next* piece on that same line. Now that the first piece is *bound* to State.unsplit, the second piece will be the first *unbound* one.

But the end result is that we end up generating a lot of more or less redundant solutions that just bind a bunch of pieces to State.unsplit and then produce the exact same formatted result.

Instead, this PR collects *all* of the unbound pieces on the first overflowing line. When it expands, it expands them all, but *doesn't* bind any of them to State.unsplit. The old formatter works the same way, but it wasn't clear to me that doing so was important for perf. It is!

```

Benchmark (tall)                fastest   median  slowest  average  baseline
-----------------------------  --------  -------  -------  -------  --------
block                             0.065    0.067    0.131    0.070     96.3%
chain                             1.515    1.540    1.629    1.547    172.2%
collection                        0.169    0.173    0.189    0.175     98.6%
collection_large                  0.896    0.926    0.959    0.925     96.5%
conditional                       0.088    0.089    0.104    0.091    179.9%
curry                             1.651    1.667    1.689    1.668    147.9%
flutter_popup_menu_test           0.409    0.422    0.448    0.423    116.8%
flutter_scrollbar_test            0.154    0.159    0.176    0.159     94.2%
function_call                     1.428    1.457    1.625    1.463     98.1%
infix_large                       0.680    0.702    0.776    0.708    144.4%
infix_small                       0.163    0.167    0.193    0.169     97.5%
interpolation                     0.090    0.093    0.120    0.094    107.0%
large                             4.552    4.631    4.987    4.650     90.6%
top_level                         0.180    0.183    0.214    0.185    135.0%
```

My goal is to get the new formatter as fast as the old one on a real-world corpus. Here's the results for formatting the Flutter repo:

```
Current formatter    15.890 ========================================
Optimized            14.309 ====================================
Old formatter         7.131 =================
The current formatter is 55.12% slower than the old formatter.
The optimized is 11.05% faster than the current formatter.
The optimized is 50.16% slower than the old formatter.
The optimization gets the formatter 18.05% of the way to the old one.
```

So not a huge improvement, but a big step in the right direction.
@munificent munificent merged commit 0a78529 into main May 20, 2024
7 checks passed
@munificent munificent deleted the multiple-expand-pieces branch May 20, 2024 22:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants