Lexical scoping for augmentations #3738
We have talked about "re-binding errors" for a while (and I just added a description of this kind of error to the sketch of a proposal in #3741). I think this is exactly what we need in order to address the concerns that you're expressing. In the example you're describing, we would incur a re-binding error because the reference resolves differently after merging. I think it's a fair claim that re-binding errors will filter out the programs whose bindings are confusing in this sense.

It would be very interesting to scrutinize the concept of proto bindings and see if there are any compelling examples where it does not work in this manner. One thing to consider is, of course, whether the re-binding errors are too constraining, e.g., in the sense that they make it impossibly hard to write certain macros that ought to be within reach.
I worry a lot about the phrase "re-binding errors", because there should never be any re-binding. A program is a static entity. It means what it means; each identifier refers to what it resolves to at the point where it is resolved. If we need the phrase "re-binding" to describe our semantics, we have a problem, and possibly not a well-defined semantics.

What can be confusing is if a name doesn't refer to what it looks like it refers to. That's a readability problem (readability == code does what it looks like it's doing). If we want to make a program transformation as part of compilation, it's up to us to ensure that it's a semantics-preserving transformation. (Any desugaring which moves code into a new context has to be extremely clear about when it happens in the compilation pipeline. If it happens before type inference, it cannot use static types. If it happens after type inference, we must specify the static type of every expression, usually by making every type explicit and not using any syntax that would normally trigger type inference.)

The guarantee that nothing changes does not hold for macros, but macros are fundamentally different, because the pre-macro input is not a Dart program. Macros start with something which is (possibly, going on likely) not a Dart program, because there are unresolved identifiers or incomplete class declarations. We then provide some macro-specific introspection on that partial program, allow it to add further syntactic declarations to the program in specific ways, and afterwards the result should be a Dart program using completely normal Dart semantics. We can talk about proto-bindings there, but those are different from the semantics of a Dart program; they are just our "best guess" about what the binding will be when the program is complete. And if we guess incorrectly, and the macro-generated code invalidates those guesses, then ... well, we get to decide what to do.

The macro tool does whatever the macro tool chooses to do, and the only requirement is that the resulting output is valid Dart. Everything else is a choice we can make, because we never promised correct Dart semantics for a non-Dart program.
The notion of "re-binding errors" that I describe in #3741 does not involve two equally valid results from name resolution. It is built on the notion of proto bindings (which is a new concept that applies to incomplete Dart source code artifacts) and normal Dart name binding. So we're not "deciding what a name means" and then later "changing our minds" (I agree that we should never do that). It is the normal Dart name binding step, performed on the final, merged library, that provides the binding of a name and hence contributes so greatly to the determination of the dynamic semantics of the program.

The proto bindings are a well-defined property of a special kind of incomplete software artifact: a library that has parts with augmenting declarations, along with its tree of parts, or a part that has augmenting declarations, along with its parents and its children. The proto bindings serve as a sanity check on the merged library. They are only intended to improve code comprehensibility for a human reader; they are never taken into account when it comes to the final meaning of the program.

The point is that a human reader who is looking at a library with augmentations, or a part with augmentations, can use all the normal habits of reading the code and understanding what it does, disregarding the declarations that may be added to any scope during merging, and conclude that a certain identifier refers to a certain declaration (and, hence, that this identifier has a specific meaning which is associated with that declaration). In the case where a name appears to be undefined (because it is introduced into some scope during merging), the human reader can simply skip the parts that rely on the meaning of this name, still concluding a large number of things about the code, ignoring augmentations. The re-binding errors occur in the case where this approach is inconsistent with the result of merging: when looking at a part (as if augmentations do not exist) we conclude one thing about a name, but the actual meaning of the name after merging is different.

I believe the re-binding errors are helpful for a human reader because they justify the kind of proto-binding based reasoning described above. A counterpoint could be that re-binding errors are too constraining, because they turn too many useful programs into compile-time errors. This is something that we'll need to explore in practice, and we would also need to think about useful idioms or programming patterns that work when we encounter that kind of failure. I think the alternative is impractical: there is no way we can modify the static analysis of Dart such that each library with parts with augmentations, and each part whose library has augmentations, can have their names resolved without taking any augmentations into account.
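To make the merging scenario concrete, here is a minimal sketch, assuming the part/augmentation syntax of the current experimental proposal; the file names, the class, and the identifier `max` are illustrative assumptions, not taken from the proposal:

```dart
// main.dart (hypothetical)
part 'box.dart';

int max(int a, int b) => a > b ? a : b; // top-level declaration

// box.dart, a part file (hypothetical)
part of 'main.dart';

class Box {
  int a = 1, b = 2;
  // Proto binding: reading just this file and its parents, `max`
  // resolves to the top-level `max` above.
  int biggest() => max(a, b);
}

augment class Box {
  // This declaration enters Box's scope during merging, so in the
  // merged library `max(a, b)` in `biggest()` resolves here instead.
  // The proto binding and the final binding disagree, which is the
  // situation reported as a "re-binding error".
  int max(int x, int y) => x > y ? x : y;
}
```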
That sounds more complicated than necessary, and more error prone. And it still has the underlying assumption that something changes relative to the source code that the user provided. If the proto binding is more like a guess at "what a reasonable reader would assume this variable means from reading just this file", and we make it a problem if that doesn't match the actual binding, then I can see a value. But I'd also suggest that we just make the code mean what the reasonable reader would assume, instead of shooting them in the foot and then telling them afterwards why we shot them in the foot.
Introducing a "proto semantics" which is intended to match user expectations, then doing a transformation that isn't proto-semantics preserving, and making it an error if the transformation didn't preserve the proto semantics for this concrete program, prompts the question: why did we not just preserve the user's expectations instead?
I don't want to consider such a program as incomplete. It's the complete source that the user has provided. The totality of the files defines the contents of the library. It's all there. The word "incomplete" is inaccurate.

If we have a problem specifying a "merging" which desugars augmentations into non-augmentations, and parts with imports into parts without imports, maybe the solution is not to add more complexity on top, but to not try to do that merging to begin with. Because it isn't necessary, and it seems to just be adding complications.
The augmentation specification contains the following paragraph:
I think the "implicit ones" is a mistake. It changes a lexical scope to an implicit context scope.
Currently an unqualified identifier is resolved in the lexical scope, which consists of the textually surrounding declarations, bottoming out at the library's declaration scope and import scope.
If an identifier `id` is not declared in those scopes, it becomes an implicit `this.id`, which is then an error if that would not be valid.

As currently specified in the augmentation spec, the name of a static or instance member can come from a different augmentation, which is not in the textual scope, but is still introduced into the scope that identifiers are resolved in.
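The implicit-`this` fallback can be seen in plain, current Dart; the class and member names here are illustrative:

```dart
class Counter {
  int count = 0;

  void bump() {
    // `count` is not declared in any enclosing lexical scope of this
    // method body, so the unqualified reference is treated as the
    // implicit `this.count`.
    count += 1;
  }
}

void main() {
  final c = Counter()..bump();
  print(c.count); // prints 1
}
```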
It's not technically a problem to define it that way, and it has the "advantage" that textually merging augmentations into the base declaration preserves resolution, at least if names introduced by later augmentations are also in scope, which they are.
What it loses is readability. Consider two classes, `C` and `D`, together with an augmentation of each.
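A minimal sketch of declarations matching the description that follows (the member bodies, and the exact shape of `D`'s augmentation, are assumptions; the key point is that `D`'s augmentation does not add `_isOdd` to the namespace that a read of `_isOdd` resolves against):

```dart
bool _isOdd(int n) => n.isOdd; // the "global" _isOdd

class C {
  bool check(int n) => _isOdd(n);
}

augment class C {
  // A new instance method declaration named _isOdd, which is added
  // to the scope of every `class C` declaration.
  bool _isOdd(int n) => n % 2 != 0;
}

class D {
  bool check(int n) => _isOdd(n);
}

augment class D {
  // Supplies an implementation involving the name _isOdd, but as a
  // setter it does not add a declaration in the namespace that the
  // read `_isOdd(n)` resolves against.
  set _isOdd(bool Function(int) f) {}
}
```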
These two classes, `C` and `D`, look identical. They both refer to `_isOdd`. To the casual reader, it looks like they are both referencing the global `_isOdd`.

However, with the currently specified behavior, the `_isOdd` reference in `C` actually refers to the instance method added in `augment class C`, because that declaration is added to the lexical scope of every `class C` declaration. It becomes lexically in scope, even if it's not textually declared in the surrounding code.

The `D` declaration keeps referencing the global `_isOdd`, because while the `augment class D` adds an implementation of `_isOdd`, it doesn't add a declaration in the same namespace.

Having two textually identical declarations that mean different things in seemingly the same textual scope is bad for readability.
I recommend that we do not combine the lexical scopes of declarations and augments.
That would mean that:

- The `_isOdd` references in `C` and `D` are resolved the same way, to the global `_isOdd`, which is the only one in the textual scope.
- If authors want to call the instance method, as in `this._isOdd(n-1)`, they have to write it out, just as if they want to reference a static member. (We should still allow static and instance members with the same name, even in the same textual scope, but especially with augmentations, where they can then be in different scopes.)
- References that end up shadowed by merged-in declarations would have to be written as `this.id` or `Type.id`, or possibly something else if they refer to a further-out declaration than is now shadowed.
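Under this recommendation, calling the augmentation-added instance method requires explicit qualification; a sketch (names and bodies are assumptions):

```dart
bool _isOdd(int n) => n.isOdd;

class C {
  // Resolves to the global _isOdd, the only _isOdd in the textual scope.
  bool check(int n) => _isOdd(n);
}

augment class C {
  bool _isOdd(int n) => n % 2 != 0;

  // To call the instance method above, the reference is written out.
  bool checkPredecessor(int n) => this._isOdd(n - 1);
}
```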