Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[grammar] comma-related Inconsistency between the spec and implementation. #54538

Open
modulovalue opened this issue Jan 8, 2024 · 5 comments
Assignees
Labels
area-language Dart language related items (some items might be better tracked at github.com/dart-lang/language). language-spec-formal language-spec-parser

Comments

@modulovalue
Copy link
Contributor

Consider:

void main() {
  for (;; print("foo"), ) {
    break;
  }
}

Notice the trailing comma, which, according to the spec, doesn't appear to be valid Dart syntax. However, that program is accepted as a valid dart program by the implementation.

The analyzer and parsers based on Dart.g produce the following parse trees:

 === Analyzer ===
Scan errors: 0
Parse errors: 0
<CompilationUnitImpl> [0-59]
┗━ <FunctionDeclarationImpl> [0-58]
  ┣━ <NamedTypeImpl> [0-4]
  ┃  ┗━ 'void' [0-4]
  ┣━ 'main' [5-9]
  ┗━ <FunctionExpressionImpl> [9-58]
    ┣━ <FormalParameterListImpl> [9-11]
    ┃  ┣━ '(' [9-10]
    ┃  ┗━ ')' [10-11]
    ┗━ <BlockFunctionBodyImpl> [12-58]
      ┗━ <BlockImpl> [12-58]
        ┣━ '{' [12-13]
        ┣━ <ForStatementImpl> [16-56]
        ┃  ┣━ 'for' [16-19]
        ┃  ┣━ '(' [20-21]
        ┃  ┣━ <ForPartsWithExpressionImpl> [21-36]
        ┃  ┃  ┣━ ';' [21-22]
        ┃  ┃  ┣━ ';' [22-23]
        ┃  ┃  ┗━ <MethodInvocationImpl> [24-36]
        ┃  ┃    ┣━ <SimpleIdentifierImpl> [24-29]
        ┃  ┃    ┃  ┗━ 'print' [24-29]
        ┃  ┃    ┗━ <ArgumentListImpl> [29-36]
        ┃  ┃      ┣━ '(' [29-30]
        ┃  ┃      ┣━ <SimpleStringLiteralImpl> [30-35]
        ┃  ┃      ┃  ┗━ '"foo"' [30-35]
        ┃  ┃      ┗━ ')' [35-36]
        ┃  ┣━ ')' [38-39]
        ┃  ┗━ <BlockImpl> [40-56]
        ┃    ┣━ '{' [40-41]
        ┃    ┣━ <BreakStatementImpl> [46-52]
        ┃    ┃  ┣━ 'break' [46-51]
        ┃    ┃  ┗━ ';' [51-52]
        ┃    ┗━ '}' [55-56]
        ┗━ '}' [57-58]
--------------------------------------------------------------------------------
 === ANTLR ===
line 2:24 extraneous input ')' expecting {'(', '[', '~', '<', '-', '!', '++', '--', '#', 'const', 'false', 'new', 'null', 'super', 'switch', 'this', 'throw', 'true', 'abstract', 'as', 'covariant', 'deferred', 'dynamic', 'export', 'extension', 'external', 'factory', 'Function', 'get', 'implements', 'import', 'interface', 'late', 'library', 'operator', 'mixin', 'part', 'required', 'set', 'static', 'typedef', 'await', 'yield', 'async', 'base', 'hide', 'of', 'on', 'sealed', 'show', 'sync', 'when', NUMBER, HEX_NUMBER, RAW_SINGLE_LINE_STRING, RAW_MULTI_LINE_STRING, SINGLE_LINE_STRING_SQ_BEGIN_END, SINGLE_LINE_STRING_SQ_BEGIN_MID, SINGLE_LINE_STRING_DQ_BEGIN_END, SINGLE_LINE_STRING_DQ_BEGIN_MID, MULTI_LINE_STRING_SQ_BEGIN_END, MULTI_LINE_STRING_SQ_BEGIN_MID, MULTI_LINE_STRING_DQ_BEGIN_END, MULTI_LINE_STRING_DQ_BEGIN_MID, '{', IDENTIFIER}
line 2:26 missing ')' at '{'
Errors of type 1: [[@16,46:50='break',<53>,3:4] Bad state: ]
Errors of type 2: [), <missing ')'>]
line 2:24 extraneous input ')' expecting {'(', '[', '~', '<', '-', '!', '++', '--', '#', 'const', 'false', 'new', 'null', 'super', 'switch', 'this', 'throw', 'true', 'abstract', 'as', 'covariant', 'deferred', 'dynamic', 'export', 'extension', 'external', 'factory', 'Function', 'get', 'implements', 'import', 'interface', 'late', 'library', 'operator', 'mixin', 'part', 'required', 'set', 'static', 'typedef', 'await', 'yield', 'async', 'base', 'hide', 'of', 'on', 'sealed', 'show', 'sync', 'when', NUMBER, HEX_NUMBER, RAW_SINGLE_LINE_STRING, RAW_MULTI_LINE_STRING, SINGLE_LINE_STRING_SQ_BEGIN_END, SINGLE_LINE_STRING_SQ_BEGIN_MID, SINGLE_LINE_STRING_DQ_BEGIN_END, SINGLE_LINE_STRING_DQ_BEGIN_MID, MULTI_LINE_STRING_SQ_BEGIN_END, MULTI_LINE_STRING_SQ_BEGIN_MID, MULTI_LINE_STRING_DQ_BEGIN_END, MULTI_LINE_STRING_DQ_BEGIN_MID, '{', IDENTIFIER}
line 2:26 missing ')' at '{'
<startSymbol>
┗━ <libraryDefinition>
  ┣━ <metadata>
  ┣━ <topLevelDefinition>
  ┃  ┣━ <functionSignature>
  ┃  ┃  ┣━ <type>
  ┃  ┃  ┃  ┗━ <typeNotFunction>
  ┃  ┃  ┃    ┗━ 'void'
  ┃  ┃  ┣━ <identifier>
  ┃  ┃  ┃  ┗━ 'main'
  ┃  ┃  ┗━ <formalParameterPart>
  ┃  ┃    ┗━ <formalParameterList>
  ┃  ┃      ┣━ '('
  ┃  ┃      ┗━ ')'
  ┃  ┗━ <functionBody>
  ┃    ┗━ <block>
  ┃      ┣━ '{'
  ┃      ┣━ <statements>
  ┃      ┃  ┗━ <statement>
  ┃      ┃    ┗━ <nonLabelledStatement>
  ┃      ┃      ┗━ <forStatement>
  ┃      ┃        ┣━ 'for'
  ┃      ┃        ┣━ '('
  ┃      ┃        ┣━ <forLoopParts>
  ┃      ┃        ┃  ┣━ <forInitializerStatement>
  ┃      ┃        ┃  ┃  ┗━ ';'
  ┃      ┃        ┃  ┣━ ';'
  ┃      ┃        ┃  ┗━ <expressionList>
  ┃      ┃        ┃    ┣━ <expression>
  ┃      ┃        ┃    ┃  ┗━ <conditionalExpression>
  ┃      ┃        ┃    ┃    ┗━ <ifNullExpression>
  ┃      ┃        ┃    ┃      ┗━ <logicalOrExpression>
  ┃      ┃        ┃    ┃        ┗━ <logicalAndExpression>
  ┃      ┃        ┃    ┃          ┗━ <equalityExpression>
  ┃      ┃        ┃    ┃            ┗━ <relationalExpression>
  ┃      ┃        ┃    ┃              ┗━ <bitwiseOrExpression>
  ┃      ┃        ┃    ┃                ┗━ <bitwiseXorExpression>
  ┃      ┃        ┃    ┃                  ┗━ <bitwiseAndExpression>
  ┃      ┃        ┃    ┃                    ┗━ <shiftExpression>
  ┃      ┃        ┃    ┃                      ┗━ <additiveExpression>
  ┃      ┃        ┃    ┃                        ┗━ <multiplicativeExpression>
  ┃      ┃        ┃    ┃                          ┗━ <unaryExpression>
  ┃      ┃        ┃    ┃                            ┗━ <postfixExpression>
  ┃      ┃        ┃    ┃                              ┣━ <primary>
  ┃      ┃        ┃    ┃                              ┃  ┗━ <identifier>
  ┃      ┃        ┃    ┃                              ┃    ┗━ 'print'
  ┃      ┃        ┃    ┃                              ┗━ <selector>
  ┃      ┃        ┃    ┃                                ┗━ <argumentPart>
  ┃      ┃        ┃    ┃                                  ┗━ <arguments>
  ┃      ┃        ┃    ┃                                    ┣━ '('
  ┃      ┃        ┃    ┃                                    ┣━ <argumentList>
  ┃      ┃        ┃    ┃                                    ┃  ┗━ <argument>
  ┃      ┃        ┃    ┃                                    ┃    ┗━ <expression>
  ┃      ┃        ┃    ┃                                    ┃      ┗━ <conditionalExpression>
  ┃      ┃        ┃    ┃                                    ┃        ┗━ <ifNullExpression>
  ┃      ┃        ┃    ┃                                    ┃          ┗━ <logicalOrExpression>
  ┃      ┃        ┃    ┃                                    ┃            ┗━ <logicalAndExpression>
  ┃      ┃        ┃    ┃                                    ┃              ┗━ <equalityExpression>
  ┃      ┃        ┃    ┃                                    ┃                ┗━ <relationalExpression>
  ┃      ┃        ┃    ┃                                    ┃                  ┗━ <bitwiseOrExpression>
  ┃      ┃        ┃    ┃                                    ┃                    ┗━ <bitwiseXorExpression>
  ┃      ┃        ┃    ┃                                    ┃                      ┗━ <bitwiseAndExpression>
  ┃      ┃        ┃    ┃                                    ┃                        ┗━ <shiftExpression>
  ┃      ┃        ┃    ┃                                    ┃                          ┗━ <additiveExpression>
  ┃      ┃        ┃    ┃                                    ┃                            ┗━ <multiplicativeExpression>
  ┃      ┃        ┃    ┃                                    ┃                              ┗━ <unaryExpression>
  ┃      ┃        ┃    ┃                                    ┃                                ┗━ <postfixExpression>
  ┃      ┃        ┃    ┃                                    ┃                                  ┗━ <primary>
  ┃      ┃        ┃    ┃                                    ┃                                    ┗━ <literal>
  ┃      ┃        ┃    ┃                                    ┃                                      ┗━ <stringLiteral>
  ┃      ┃        ┃    ┃                                    ┃                                        ┗━ <singleLineString>
  ┃      ┃        ┃    ┃                                    ┃                                          ┗━ '"foo"'
  ┃      ┃        ┃    ┃                                    ┗━ ')'
  ┃      ┃        ┃    ┣━ ','
  ┃      ┃        ┃    ┗━ <expression>
  ┃      ┃        ┃      ┗━ ')'
  ┃      ┃        ┣━ '<missing ')'>'
  ┃      ┃        ┗━ <statement>
  ┃      ┃          ┗━ <nonLabelledStatement>
  ┃      ┃            ┗━ <block>
  ┃      ┃              ┣━ '{'
  ┃      ┃              ┣━ <statements>
  ┃      ┃              ┃  ┗━ <statement>
  ┃      ┃              ┃    ┗━ <nonLabelledStatement>
  ┃      ┃              ┃      ┗━ <breakStatement>
  ┃      ┃              ┃        ┣━ 'break'
  ┃      ┃              ┃        ┗━ ';'
  ┃      ┃              ┗━ '}'
  ┃      ┗━ '}'
  ┗━ '<EOF>'

Notice how the ANTLR based parser rejects that program, but the analyzer does not.

@lrhn
Copy link
Member

lrhn commented Jan 8, 2024

Seems like the easiest fix is to change the spec.
We generally allow trailing commas on places like this.

@lrhn lrhn added area-language Dart language related items (some items might be better tracked at github.com/dart-lang/language). language-spec-parser language-spec-formal labels Jan 8, 2024
@eernstg
Copy link
Member

eernstg commented Jan 8, 2024

Right, I don't think it's reasonable to introduce a breaking change in order to remove support for trailing commas in in particular situation where it is actually already implemented.

One thing, though, does dart format know about these trailing commas? I would have expected the following to have more line breaks, but it is actually formatted as shown here:

void main() {
  for (int i; i < 10; print("foo"), ++i, print("bar"),) {
    break;
  }
}

@munificent, what do you think about trailing commas in <forLoopParts>?

@munificent
Copy link
Member

Wow, I'm a little surprised the formatter doesn't end up inadvertently dropping the trailing comma on the floor.

Filed this to track how we want to handle it: dart-lang/dart_style#1354

As far as the language is concerned, it's hard to say. The language doesn't seem to have any consistent principle for where trailing commas are allowed:

  • Has closing delimiter and trailing comma is allowed: collection literal, argument list, parameter list, assert statement arguments.
  • Has closing delimiter but disallows trailing comma: type parameter list, type argument list.
  • Has no closing delimiter and disallows trailing comma: show clause, hide clause, implements clause, with clause.
  • Not sure how to categorize whether there is closing delimiter or not but allows trailing comma: enum values.

I guess for loop updaters are most similar to enum values where there is a closing delimiter, but the stuff inside the delimiter is more than just the comma-separated list of things? In an enum, there may be ; followed by members. In a for loop, there are the preceding for parts clauses.

@lrhn
Copy link
Member

lrhn commented Jan 9, 2024

For enums, I'd say there is a trailing delimiter, which is either } or ;.

I agree that the leading and trailing delimiters are not necessarily matched, and for loop increments are like that too, started by ; and ended by ).

Inside that, for-loop increments are more like function arguments, they are comma separated completely general expressions.

Currently implements also has a recognizable end, terminated by either { or ; (mixin application classes). I'd avoid exploiting that, since it's really just an accident that nothing else can come after.

@modulovalue
Copy link
Contributor Author

I hope this issue isn't viewed as some evidence that there's "people who depend on this feature". I've discovered this one by accident (it seems that @lrhn discovered it first: dart-lang/language#2430 (comment)!) and I don't plan on ever using it. I wouldn't mind a breaking change that removed this feature to prevent any potential formatter issues.


@munificent wrote

As far as the language is concerned, it's hard to say. The language doesn't seem to have any consistent principle for where trailing commas are allowed:

  • Has closing delimiter and trailing comma is allowed: collection literal, argument list, parameter list, assert statement arguments.
  • Has closing delimiter but disallows trailing comma: type parameter list, type argument list.

dart-lang/language#2430

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-language Dart language related items (some items might be better tracked at github.com/dart-lang/language). language-spec-formal language-spec-parser
Projects
None yet
Development

No branches or pull requests

4 participants