Update feature-specification.md #3472

@@ -1326,7 +1326,7 @@ fresh, non-late, final variable `v` is created. An initializing formal
argument passed to this formal. An initializer list element of the
form `id = e` or `this.id = e` is evaluated by evaluating `e` to an
object `o` and binding `v` to `o`. During the execution of the
-constructor body, `this` and `id` are bound to the value of `v`. The
+constructor body, `this` is bound to the value of `v`. The
Member Author

@lrhn lrhn Nov 16, 2023

The `id` is referenced normally, either through the lexical scope or as an implicit `this.id`.
That makes a difference for something like:

extension type E._(Object? id) {
  E(Object? base, Object? id) : id = base {
    assert(identical(this.id, id));
  }
}

Member

Sure, but it is still not working as intended if `id` evaluates to an arbitrary object; I'd expect `identical(this, id)` to be true.

Member Author

Would you assume `identical(this.id, id)` to be true in the example above, even if the two arguments are not the same object?
(I'm using `assert`, which may be confusing. I'm not claiming the condition must be true; I'm just performing some operation that can be false. I should have used `!identical`, I guess.)

So more precisely:

extension type E._(Object? id) {
  /// The second argument must not be the same as the first.
  E(Object? base, Object? id) : id = base {
    assert(!identical(this.id, id));
  }
}

I expect this extension type to work such that `E(2, 3)` succeeds: the assert does not fail, because it compares 2 and 3, and the result has 2 as its representation value.

I expect that because I expect the same for a similar class:

class C {
  final Object? id;
  C._(this.id);

  /// The second argument must not be the same as the first.
  C(Object? base, Object? id) : id = base {
    assert(!identical(this.id, id));
  }
}

That works as I described above.
And I want the extension type to work the same.
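
A minimal runnable sketch of that comparison, assuming Dart 3.3 or later (where extension types are available) and assertions enabled. The `main` function and the printed values are added here for illustration only; the two declarations are the ones from this comment.

```dart
extension type E._(Object? id) {
  /// The second argument must not be the same as the first.
  E(Object? base, Object? id) : id = base {
    // Inside the body, the bare name `id` is the second parameter,
    // while `this.id` is the representation variable (set from `base`).
    assert(!identical(this.id, id));
  }
}

class C {
  final Object? id;
  C._(this.id);

  /// The second argument must not be the same as the first.
  C(Object? base, Object? id) : id = base {
    // Same shape as above: parameter `id` shadows the instance variable.
    assert(!identical(this.id, id));
  }
}

void main() {
  final e = E(2, 3);
  final c = C(2, 3);
  print(e.id); // 2
  print(c.id); // 2
}
```

In both cases the body compares 2 (via `this.id`) with 3 (the parameter), so the bare identifier `id` is not bound to the value of `v` during the body.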

Which means that the claim that "... and `id` are bound to the value of `v`" during the execution of the constructor body is incorrect. In this case `id` is bound to the value of the second argument, not to `v`, which is the value of the first argument.

(If the claim should be interpreted in any other way than that the identifier id in the lexical scope available to the constructor body is bound to the value v, then it should be rephrased, because that's the only reading I can see.)

Every instance method and generative constructor body (anything which can access `this`) can access the representation variable as if it were an instance getter on `this`. It is not added to the scope of those members, as this text says; it is added to the member scope of the extension type declaration itself.
Then it's looked up as normal for a this reference.

So the "and `id` are" shouldn't be here, because the only thing I can read it as saying is not correct.

Member

Sorry, I missed the point about multiple declarations with the same name.

I think the following change that I mentioned would suffice:

> During the execution of the constructor body, `this` is bound to the value of `v`, and so is the representation variable.

Member Author

Probably.

But we'll also have to define "representation variable" then, because (unless my browser's search function is busted) the phrase "representation variable" only occurs once in this document, where we say that it can be promoted.

Member

The representation variable is implied in many locations (and we should probably adjust the text to make it explicit in many of those situations, because that's needed in order to avoid the imprecision about shadowing).

For example, we have a couple of occurrences of text like this:

this and the name of the representation are bound as with the getter invocation

which is used to specify the dynamic semantics of an extension type member invocation/tear-off, and we shouldn't say that "the name" is bound to anything: It's the representation variable which is bound to said receiver object, and the name of the representation is an identifier which may or may not resolve to the representation variable in any given expression. We don't need any special rules for that, name resolution is determined by the syntactic structure (nested scopes) and by implicit insertion of this., and I think it would be a significant source of complexity if we introduce a different set of rules about extension types than the ones that we're using for classes/mixins/enums/etc.
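
A small sketch of that resolution behavior, assuming Dart 3.3 or later. The extension type `V` and its member names are hypothetical, chosen only to show how the same identifier resolves differently depending on nested scopes and the implicit `this.`.

```dart
extension type V(int id) {
  // The parameter `id` shadows the representation variable,
  // so the bare name resolves to the parameter.
  int viaParameter(int id) => id;

  // An explicit `this.id` always reaches the representation variable.
  int viaThis(int id) => this.id;

  // With nothing shadowing it, the bare name resolves through the
  // member scope and is treated as `this.id`.
  int viaImplicitThis() => id;
}

void main() {
  final v = V(1);
  print(v.viaParameter(7)); // 7
  print(v.viaThis(7)); // 1
  print(v.viaImplicitThis()); // 1
}
```

This is the same lookup that would apply to an instance variable of a class, which is why no extension-type-specific rule is needed.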

Another example is commentary saying "the static analysis considers the representation as a final instance variable".

Another example is "The static type of the representation name is the representation type" which should again specify the static type of the representation variable because the name may resolve to other declarations.

By the way, I noticed at least one occurrence of 'representation\nvariable'.

I agree, we do have to make the representation variable more explicit in a number of locations in the feature specification.

Member

Why would it be an improvement to eliminate the rule that the representation variable is bound to the representation object? As far as I can see this means that a conforming implementation could bind the representation variable to an arbitrary object (OK, an arbitrary object whose run-time type is a subtype of the actual value of the representation type).

Member Author

It says `id` is bound to the value `v`. And `id` is not the representation variable; it is the name of the representation variable.
Binding the representation variable to v is fine, if that's what it wants to say.
Binding an identifier to a value during execution of code means something else.

Since there can be other things in scope with the same name, I read this as saying that the identifier is bound to the value `v` while executing the constructor body. (I really couldn't read it any other way.)
Which it isn't.

Member

> It says id is bound to the value v

No, it says that id is bound to the value of v. As is common in the language specification we introduce a fresh local variable (here: v) whose allocation is unspecified, but it occurs in the context of the semantics of the evaluation of an expression e, so it's presumably always possible to allocate it as an additional local variable in the scope which is the current scope for e.

This is slightly magic in cases like var x = e; because there is no syntactic support for allocating local variables in the initializing expression of a variable. On the other hand, the language specification does use let expressions in order to specify the semantics of certain constructs.

So let's say that we can in fact create a fresh variable like that. (Otherwise it's certainly a broader discussion if we insist that the language specification should be rewritten to avoid using this kind of local variables to specify the semantics of some expressions, and we should handle that separately.)

It is also common for our specification documents (including the language specification) to refer to a variable (of any kind) using its name. In some cases we say "the variable 'someName' is bound to ...", but in other cases we just say "'someName' is bound to ...".

We do not have the concept that an identifier is bound to anything.

I would not have a problem with "this and the variable id are bound to ...", but it does seem somewhat verbose, considering that we don't always do so.

> Binding the representation variable to v is fine

We never bind a variable to a variable. In this case I really don't think we have a habit of abbreviating "the value of v" to "v", and I wouldn't want to start doing that. (Well, why would that be worse? I guess it's because there is no way we could misunderstand "bind v to ..." to mean that we're binding the identifier to an object, but both the variable and the value of the variable are run-time entities, so that's more likely to be an actual source of confusion).

> Binding an identifier to a value during execution of code means something else.

We do have the notion of a run-time namespace (as well as a compile-time namespace). A run-time namespace could actually (if we squint) be considered to bind an identifier to a storage location, that is, it binds the identifier to a variable (and that storage location would in turn hold a pointer to an object, except that we don't specify the run-time semantics with that much detail, and it doesn't have to be a pointer, e.g., it could be a SmallInteger).

Nevertheless, the language specification and feature specifications don't usually include these details. I'd prefer to say that we (slightly magically) obtain a fresh variable, and that variable is bound to an object.

That's how we talk about objects that don't occur as the value of any variable when we specify the semantics of a construct.

> Since there can be other things in scope with the same name ...

That's a good point! We need to mention the scopes where the representation name resolves to the representation variable, because there could be other variables (parameters or locals) shadowing it. The specification of the semantics of an initializing formal is still fine, and so is the specification of the semantics of initializer list elements of the form id = e and this.id = e (they will both initialize the representation variable, even in the case where there is a formal parameter named id).
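
A hedged sketch of the three initialization forms mentioned above, assuming Dart 3.3 or later. The constructor names `formal`, `plain`, and `qualified` are hypothetical; each constructor has a formal parameter named `id`, and each one still initializes the representation variable.

```dart
extension type E._(Object? id) {
  // Initializing formal: the argument initializes the representation variable.
  E.formal(this.id);

  // Initializer list element of the form `id = e`: the left-hand side names
  // the representation variable, the right-hand side is the parameter.
  E.plain(Object? id) : id = id;

  // Initializer list element of the form `this.id = e`.
  E.qualified(Object? id) : this.id = id;
}

void main() {
  print(E.formal(5).id); // 5
  print(E.plain(5).id); // 5
  print(E.qualified(5).id); // 5
}
```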

But the part about the binding of the representation variable should not refer to its name.

So maybe:

> During the execution of the constructor body, `this` is bound to the value of `v`, and so is the representation variable.

value of the instance creation expression that gave rise to this
constructor execution is the value of `this`.

@@ -1337,8 +1337,8 @@ constructor, then consider the corresponding non-primary constructor _k_.
The execution of the representation declaration as a constructor has the
same semantics as an execution of _k_.

-At run time, for a given instance `o` typed as an extension type `V`, there
-is _no_ reification of `V` associated with `o`.
+At run time, for a given instance `o` that was statically typed as an
+extension type `V`, there is _no_ reification of `V` associated with `o`.
eernstg marked this conversation as resolved.

*This means that, at run time, an object never "knows" that it is being
viewed as having an extension type. By soundness, the run-time type of `o`