-
Notifications
You must be signed in to change notification settings - Fork 217
Rework expression inference spec draft based on review comments. #3850
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The following changes (mostly based on suggestions by Leaf and Erik) are included: - The text is now under a heading called "expression inference" rather than "body inference" (which was confusing). - Instead of trying to describe the output of type inference using a vague notion of "expression artifacts" whose runtime behavior is described in prose, I'm specifying it using a Dart-like syntax called "elaborated expressions". Elaborated expressions look generally like Dart expressions, except they have some limitations (types are fully specified, and complex constructs like null shorting are desugared, and type checks are explicit) and support some extra forms (such as "let" expressions). - I've simplified the terminology I use to describe the three things that can happen when an expression is evaluated (it can diverge, throw an exception, or evaluate to a value). My previous terminology was confusing ("fail to complete at all", "complete normally with a value", or "complete with an exception"). - I've re-worked the treatment coercions so that I'm no longer making subsumption explicit in the language of elaborated expressions. (I never meant to make subsumption explicit; it happened as a mistake due to unclear thinking on my part.) - I broke the "numbers" subsection into separate "integer literals" and "double literals" sections. (Previosuly I only specified how integer literals are handled.) - The sketches of soundness proofs have been simplified a bit. I'm no longer trying to crisply distinguish "expression soundness", "return value soundness", "tear-off soundness", and "future soundness"; rather, I'm speaking about soundness more informally, as befits an informal proof sketch. - On the other hand, I've clarified precisely what I mean by "is an instance of", since the soundness proof sketches don't make any sense if your definition of this concept doesn't agree with what's in my head. I've also made sure this concept correctly accounts for extension types. - And I've added text explaining why the interpretation of `await` expressions is indeed sound. (Previously I had a "TODO" because I thought maybe they weren't sound. But they're sound; it just took a little extra work to prove it to myself.)
resources/type-system/inference.md
Outdated
An invariant of expression inference, known as _soundness_, is that when the | ||
elaborated form of any expression in the program is executed, it is guaranteed | ||
either to diverge, throw an exception, or evaluate to a value that is an | ||
_instance_ of its static type. _Instance_ is defined as follows: a value `v` is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"instance...of ...type" - of usually read that as an object whose runtime type is exactly (the extension type erasure of) that type, not just a subtype of it.
I can see why this definition is practical, could we call it singing slightly less direct? "Instance satisfying type" perhaps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. @leafpetersen had a similar reaction. I pointed out to him that my interpretation was based on wanting to be consistent with the existing spec, which is what led to the non-normative comment below.
I would be totally open to switching to different terminology that is more consistent with common usage, but I would want to make the change both in this doc and in dartLangSpec.tex
😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked the specification. Most uses of "instance of" are about the types bool
, int
, double
, String
and Null
. Those technically can be subclasses, but we usually pretend they're not.
Other uses are:
A function object is always an instance of some class $C$ that implements the class \FUNCTION{} {\ref{functionType}),
If the transitive meaning was used, it could just say "an instance that implements the class \FUNCTION{}".
Or:
If $v$ is an instance of class \code{Error} or a subclass thereof,
which is about whether throwing sets stackTrace
on the object.
Here it the meaning is clearly not transitive, because then it would be wrong.
And:
but $e$ evaluates to an instance of a type which is not a subtype of \code{Iterable<\DYNAMIC>}.%
Again, this could just be "to an object which is not an instance of \code{Iterable<\DYNAMIC>}."
There are definitely also other places where "instance of" is used transitively, but I wouldn't say that the specification is consistently using it that way.
Using it transitively gets confusing, because then I don't have a good way to say that an object is an instance of a precise type. (I guess "the runtime type of the object is TheType" could work.)
I'd rather use "x is an instance of T" to mean that the runtime type of x
is the class of T
(and keep pretending there are no subtypes of int
and String
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked the specification. Most uses of "instance of" are about the types
bool
,int
,double
,String
andNull
. Those technically can be subclasses, but we usually pretend they're not.
Ok, I think I've come around to your way of thinking. (FWIW, I was initially skeptical because the SDK files define int
, double
, and String
as abstract classes, but after playing around a bunch this morning I've convinced myself that this is an implementation detail that's invisible to the user if they don't go digging into the SDK source files).
I've re-worked this PR to consistently use "instance satisfying type" instead of "instance of" in the cases where it's important to include subtypes, and I've changed this text to define "instance satisfying type" appropriately.
I might follow up with a PR to change the few places in the spec where I think we're accidentally using the "subtype" meaning of "instance of".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After digging through the dartLangSpec.tex
, I discovered that the terminology we use fairly consistently in the spec is "instance implementing type T
" rather than "instance satisfying type T
", so I've made a PR to switch this file to use the same convention: #3918
resources/type-system/inference.md
Outdated
_Some literature uses a different definition of "instance", saying that a value | ||
`v` is an instance of a type `T` only if the runtime type of `v` is | ||
__precisely__ `T`. We adopt the "subtype" definition for consistency with other | ||
parts of the language spec._ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... Well, shucks. 😭
resources/type-system/inference.md
Outdated
- `m_2` completes with the value `o_2`. _Expression soundness follows from | ||
the fact that `o_2` is an instance of `U` and `U <: T_2`._ | ||
- Otherwise, if `T_1` is an interface type that contains a method called `call` | ||
with type `U`, and `U <: T`, then let `m_2` be `m_1.call`. _Since `m_1.call` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I'd introduce a @CALL_TEAROFF(m_1)
too, so that any elaborated expression is either an expression of the source program, or it's a new form.
That avoids having to answer whether m_1.call
is elaborated or not.
The phrase "Since m_1.call
has static type ..." is undefined, since the term has neither been type inferred (out was not in the source program so it was never recursed on, and it's not an unelaborated expression to begin with) nor been assigned a static type explicitly, like we've done for the other @FOO
forms.
We'd have to explicitly say that the term has that static type.
Then I'd prefer to introduce a new form for it too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. I’m not sure how I feel about this.
On the one hand, using ordinary Dart syntax for elaborated expressions is nice because it allows readers familiar with Dart (who, let’s be honest, constitute my entire target audience) to understand what’s going on more quickly, and allows me to focus my specification efforts on the details of type inference rather than trying to rigorously define the semantics of everything (which I think was one of the drawbacks of my previous approach).
On the other hand, using new syntax for elaborated expressions has the advantage that it’s much more obvious to the reader when we are talking about an elaborated expression vs. an unelaborated expression.
As to your comment that "Since m_1.call
has static type ..." is undefined, my mental model in this PR is that every elaborated expression has a well-defined static type; when the elaborated expression uses standard Dart syntax, the static type of the expression is defined using the static typing rules in dartLangSpec.tex
; when it uses one of the new @
constructs I’ve introduced, the static type of the expression is defined in the section that defines all of those constructs ("New operations allowed in elaborated expressions"). To put it another way, since elaborated expressions are based on standard Dart, with the addition of some new constructs and the prohibition of some old ones, they inherit the static typing rules defined in dartLangSpec.tex
. Does that seem like a reasonable approach to you? If so, is there something I could do that would help clarify that’s the approach I’m taking?
Note that this is a change from my previous mental model (prior to this CL), which was to consider the type inference process for an expression to produce two outputs: (1) an “expression artifact”, which is defined only by a prose description of its runtime semantics and therefore has no static type per se, and (2) a static type. This is actually a far better match for what actually happens in the CFE (expression artifacts are kernel Expressions, and although there is an Expression.getStaticType
method in the kernel, Johnni has told me that it’s not always correct, and the CFE mostly disregards it in favor of its own independently-calculated static type). But it made the spec a lot more verbose, and I think it contributed to confusion about whether I was trying to make subsumption explicit in the language of elaborated expressions.
I’m going to leave this as is for now, but I’m working on a follow-up commit in which I add specifications for things like method invocations; I’ll experiment with your proposed invariant (any elaborated expression is either an expression of the source program, or it's a new form) in that PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm OK with keeping m_1.call
as the elaborated expression produced by type inference. I'd like to highlight the .call
as being synthetic in some way, but if we can do that typographically it's probably fine.
However, "every elaborated expression has a well-defined static type" is the end result of the type inference algorithm, which we are in the middle of.
It's (I think) a recursive/inductive algorithm on expressions, where type is inferred on subexpressions first, resulting in elaborated expressions with static types (for the elaborated expression and every elaborated subexpression), then those elaborated expressions and static types are used to infer the elaborated expression and static type of the parent expression, maybe filling in missing type arguments and wrapping it in coercions. Maybe even some desugaring using @...
, although I'd prefer to avoid it.
So "when the elaborated expression uses standard Dart syntax" does not apply to m_1.call
, because m_1
is an elaborated expression which may not use standard Dart syntax.
The m_1.call
is not the result of a recursive call to type inference, so it hasn't been assigned a static type by that, it's a completely new elaborated expression which we have willed into existence from nothing, and it has no inherent static type.
We definitely don't want to do type inference on m_1.call
, doing inference on m_1
recursively (now in receiver position, if it's even a valid unelaborated expression any more).
That is: a recursive invocation of type inference returns an elaborated expression where every subexpression has been assigned a static type. The current invocation needs to do the do the same, so when it returns an alaborated expression, that, and all subexpressions, must also have been assigned static types. And someone needs to do that assigning - a static type is not an inherent property of an expression, it's the explicit result of running inference on the expression, and here we are implementing the inference which assigns the static type.
The right thing to do is to say that:
- We introduce an new elaborated expression
m_1.call
. This follows the grammatical rules for elaborated expressions (which is the grammar of expressions where<primary>
has been extended with the new syntactic forms). - We immediately assign this
m_1.call
the static typeU
(the function type of thecall
method ofT_1
, the static type ofm_1
, which inductively has a static type since it's an existing elaborated expression). - Then
m_2
ism_1
.
That is, when we introduce a new elaborated expression, which is not the result of running type inference on an unelaborated expression, then we also have to assign its static type. (Then we can still argue that that static type is sound.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm OK with keeping
m_1.call
as the elaborated expression produced by type inference. I'd like to highlight the.call
as being synthetic in some way, but if we can do that typographically it's probably fine.
I'd like to push back on this. What would we be trying to communicate by highlighting that .call
is synthetic? The fact that .call
doesn't appear in the user's source code in this case is obvious. Are we trying to highlight that coercion has inserted an operation? Again, I think that's obvious.
However, "every elaborated expression has a well-defined static type" is the end result of the type inference algorithm, which we are in the middle of. It's (I think) a recursive/inductive algorithm on expressions, where type is inferred on subexpressions first, resulting in elaborated expressions with static types (for the elaborated expression and every elaborated subexpression), then those elaborated expressions and static types are used to infer the elaborated expression and static type of the parent expression, maybe filling in missing type arguments and wrapping it in coercions. Maybe even some desugaring using
@...
, although I'd prefer to avoid it.
Ok, you have a good point. I think I was trying to avoid scope creep in this document by letting the rules for static typing be largely defined elsewhere, but it's really not panning out that way, is it? Not only do I have to define static type for the new constructs I'm introducing, but I'm also breaking my own principle and introducing pointless differences between the spec and the implementations (because the implementations do compute static types as a by-product of the type inference process).
I've re-worked the PR so that every rule for expression inference explicitly specifies the static type of the resulting elaborated expression, and I've clarified that the rules in this document for determining the static type of an expression are meant to supersede the rules currently in dartLangSpec.tex
.
is there a coercion available, so it's a type error._ | ||
- Otherwise, there is a compile-time error. _We have an expression of type `T_1` | ||
in a situation that requires `T`, which isn't a supertype, nor is there a | ||
coercion available, so it's a type error._ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this work for assigning to a promoted variable?
That is: the error being part of the coercion, not something that is sometimes applied after coercion that didn't promote (or didn't promote enough).
Say we promote a variable from Function
to int Function(int)
.
Then we assign to it:
- an
String Function(String)
statically typed asdynamic
. Here I think we will try to cast toint Function(int)
, and fail at runtime, and that's OK. - A callable object with a
call
method with typeString Function(String)
. Here I think we currently specify that we do.call
insertion for any function context type, whether it's a supertype or not, then we should assign that function to the variable as a demoting assignment.
In the second case, it's not an error for the coercion to fail to reach the target type. The coercion happens anyway, then it may or may not be an error if the resulting static type can be used in the context.
(That's also a way to combine instantiation and call
-tearoff, by doing the tearoff first, and then coercing the result of that with instantiation.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this should work ok. I haven’t written the spec text for it yet, but what we do in the case of an assignment to a promoted local variable is:
- Infer the RHS of the assignment using the promoted type of the local variable as the context.
- Coerce the resulting elaborated expression to the (unpromoted) static type of the local variable.
- If the static type of the coerced elaborated expression is not a subtype of the variable’s promoted type, the variable is demoted.
So in your second example, where we promote a local variable from Function
to int Function(int)
, and then assign a callable object with a call
method with type String Function(String)
, the target type for the coercion is Function
, so the call tearoff happens, and the coerced expression has static type String Function(String)
, and the variable gets demoted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super. That also means that my idea of just putting a "try to coerce" step in between any two inference steps won't work, because the context type for inference isn't always the same as the context type for coercion.
(But I guess we could pass both to the coercion step, which then recurses with one on the inference, and then coerces the result to the other.)
expression soundness, `o_1` will be an instance of the type `bool`._ | ||
- If `e` is of the form `e_1 && e_2`, let `m` be `m_1 && m_2`. _It is valid to | ||
form this elaborated expression because the static type of `m_1` and `m_2` are | ||
guaranteed to be a subtype of `bool`._ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I'd say to coerce that to the context type, but we don't allow implicit tearoff of extension call
methods, right?)
(Complete side track: should we let the static type of the expression be UP of the static types of the m_1
and m_2
elaborated expressions? If both are the same extension type implementing bool
, we could let the result have that type too!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You’re right, we don’t do implicit tearoff of extension call methods, so there’s no coercion that would change bool
to any other type.
Also, since my goal here is to be descriptive of what the implementations currently do, I don’t want to specify that we coerce to the context "type" because the implementations don’t try to coerce logical boolean expressions to the context. (Indeed, they never really try to coerce to the context per se; it just looks like they do because coercions always happen at assignability check points, and when there's an assignability check, the context for the subexpression is usually the same as the type used in the assignability check).
As to your side-track, that’s a really interesting thought. I’m torn between a desire to implement an interesting language and a desire to write fewer crazy tests 😃
resources/type-system/inference.md
Outdated
- Let `T_2` be `flatten(T_1)`. | ||
|
||
- Let `m_2` be `@LET(T_1 v = m_1 in v is Future<T_2> ? v : | ||
Future<T_2>.value(@PROMOTED_TYPE<T_2>(v)))`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here too I'd introduce a @FUTURE_VALUE<T_2>(v)
, not just for the usual reasons, but also because I don't want the specification to say which exact constructor must be used, as long as it creates a future of the correct type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it’s also important that it creates a future with the correct semantics. Specifying this as Future<T_2>.value(@PROMOTED_TYPE<T_2>(v))
captures both the static type we want and the runtime semantics we want, so I think it would be a high bar to come up with something better. As to whether we want to specify the exact constructor that must be used, I think it goes without saying that the implementation is always welcome to replace one elaborated expression with another one that has equivalent semantics (that’s what optimization is all about, after all). I don't think we need to invent a new piece of syntax here to grant extra permission for optimization in this specific case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vaguely remember someone filing an issue about some backend not using Future.value
to create the future (maybe they use _Future.immediateValue
instead), because there was a timing difference between the two.
I'd be much happier if we specified this abstractly as introducing an object implementing Future<T_2>
which completes with the value v
at some later point. Saying that it has to call a specific constructor, one which has extra overhead because it can also accept a Future<T2>
or null
as argument, and has to check for those, even though we know we'll only call with a T2
instance.
Can we at least say that it calls Future<T_2>._value(v)
instead, which is a (hypothetical) constructor with signature Future<T2> Function(T2 value)
that creates a future completed with v
?
(When nobody can find that constructor, we'll just say it's platform specific.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, looking at the implementation again, the CFE doesn't actually insert an explicit call to Future.value
. It simply sets a flag which is documented as telling the back-end to insert an explicit call to Future.value
.
Silly me for trusting the documentation comment, though, because digging further into the implementations, they actually don't all do precisely the same thing:
- In dart2js, a call is made to a private method called
_wrapAwaitedExpression
(insdk/lib/_internal/js_runtime/lib/async_patch.dart
), which calls_Future.value
. - In dev_compiler, a call is made to a private method called
_awaitWithTypeCheck
(insdk/lib/_internal/js_dev_runtime/patch/async_patch.dart
), which also calls calls_Future.value
. - In wasm, a call is made to a private method called
_awaitHelperWithTypeCheck
(insdk/lib/_internal/wasm/lib/async_patch.dart
), which is implemented in terms of lower level primitives likescheduleMicrotask
. - In the vm, the necessary logic is inlined directly into the method using the lower level primitive
AwaitWithTypeCheck
.
I've reworked this section. Rather than refer to Future<T_2>.value
or create a new primitive @FUTURE_VALUE
, I'm introducing a new primitive called @AWAIT_WITH_TYPE_CHECK
, which encapsulates the behavior of _wrapAwaitedExpression
, _awaitWithTypeCheck
, etc.
resources/type-system/inference.md
Outdated
establish that `v is Future<T_2>` always evaluates to `true`, in which case | ||
`m_2` can be optimized to `@PROMOTED_TYPE<Future<T_2>>(m_1)`._ | ||
|
||
- _For soundness, we must prove that whenever `@PROMOTED_TYPE<T_2>(v)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should prove, once and for all, that T <: FutureOr<flatten(T)>
for all types T
.
Then here, on the is! T_2
branch, with T_2 = flatten(T)
, you'd know that T <: T_2
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I think I understand what you’re hinting at here. Fleshing out some details, you’re saying that if we know that T <: FutureOr<flatten(T)>
for all T
, we can substitute T_1
for T
and get that T_1 <: FutureOr<flatten(T_1)>
, or equivalently, T_1 <: FutureOr<T_2>
. In both branches, we know that v
is an instance of T_1
, therefore it must be an instance of FutureOr<T_2>
. Clearly, then, there are two cases: v
must either be an instance of Future<T_2>
or an instance of T_2
. The is Future<T_2>
handles the first case, so obviously in the second case, v
must be an instance of T_2
.
I like it. It makes the motivation for this elaboration much clearer to me.
Since this suggestion requires separately proving that T <: FutureOr<flatten(T)>
for all T
, I’m going to postpone it to a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current definition of flatten is ... not easy to work with (it's defined in terms of "derives a future type" which is a separate function, which means that the goals of flatten aren't obvious in "derives a future type", so it takes more proof to get from A to B. I've done it before, and I can probably do it again.
I've also written more useful, but equivalent, specifications. I prefer my recursive functions to do their own recursing, it's the only way I can convince myself that I'm covering all the cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be addressed by #3920.
I really like that an elaborated expression is a district concept from the unelaborated expression, and that every elaborated expression is either the same structure as an unelaborated source expression, a new fresh variable, or a new syntactic form. (So I want to make that completely true, and not introduce something that looks like source code, not which isn't. Maybe that's too much, but I think it would be nice. We could possibly just allow |
I've not needed to explicitly say that something diverges. It's usually been enough to say what happens if it completes (with a value) or it throws (an error object and a stack trace) - which is the current terminology. |
Yeah, like I said in response to one of your other comments, I’m going to play with this more in a follow-up PR. I’m happy to talk more about this as I continue fleshing out the spec. |
Haha, the terminology I'm using in this PR was suggested by @leafpetersen in PR 3803. I don’t have a strong preference; maybe the two of you should bikeshed about it 😃 |
The currently used terminology for expression evaluation in the spec is:
In practice, I think we always use "evaluates to an object" or "evalautes to a value", never "produces". The word "complete" is used for execution of statements which can complete in the following ways:
Using the same words gives you some benefits. It's written just once that while defining the evaluation Them's the official words :) |
We now consistently use "`v` is an instance of `T`" for the situation where the runtime type of `v` is _exactly_ `T`, and "`v` is an instance satisfying `T`" for the situation where the runtime type of `v` is a subtype of `T`. See discussion at #3850 (comment).
Previously I was leaning on the definition of static types in `dartLangSpec.tex`, but this was (a) leading to confusion, and (b) not in alignment with the way the implementations actually compute static types.
We now use a primitive `@AWAIT_WITH_TYPE_CHECK`, which captures the logic in dart2js's `_wrapAwaitedExpression`, dev_compiler's `_awaitWithTypeCheck`, wasm's `_awaitHelperWithTypeCheck`, and the VM's `AwaitWithTypeCheck`.
Ok, cool. Then I think the only place where I diverge (no pun intended) from this terminolgy is in my mentioning of the possibility that evaluation of an expression might diverge. I get what you're saying that it's indistinguishable from an expression that simply hasn't finished evaluating yet, but I would feel weird removing it, because if I remove it, it sounds like I'm saying that every elaborated expression will either evaluate to a value or throw an exception, and that's simply not true; expressions containing infinite loops definitely won't ever do either of those two things. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really great! I really like this presentation.
resources/type-system/inference.md
Outdated
|
||
Expression inference is a recursive process of elaborating an expression in Dart | ||
source code, transforming it into a form in which all types and type coercions | ||
are explicit. An expression that has not yet undergone type inference is known |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"known as" -> "referred to as" maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
resources/type-system/inference.md
Outdated
source code, transforming it into a form in which all types and type coercions | ||
are explicit. An expression that has not yet undergone type inference is known | ||
as an _unelaborated expression_, and an expression that has completed type | ||
inference is known as an _elaborated expression_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ibid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
resources/type-system/inference.md
Outdated
_Often, an expression's context can be understood as the static type the | ||
expression must have (or be coercible to) in order to avoid a compile-time | ||
error. For example, in the statement `num n = f();`, the result of type | ||
inferring `f()` needs to be either a subtype of `num` (or a type that's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"either X (or Y)" reads oddly to me, perhaps choose "either X or Y" or else "X (or Y)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, thanks. I went with "a subtype of num
(or a type that's coercible to num
)" to be parallel with "a static type the expression must have (or be coercible to)".
types in this document should be understood to supersede the specification of | ||
static types in the existing Dart language specification._ | ||
|
||
_Informally, we may sometimes speak of the static type of an unelaborated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This paragraph left me a little uneasy/confused, since it's saying "sometimes we informally talk about the static type of an unelaborated expressione
, and when we do, we mean the static type of the elaborated...." but also saying "but actually the static type of an expression e
can't be talked about independently of context".
I think maybe what you are saying here is something like the following: "For some expressions e
, elaboration does not depend on the context K
: that is, for any K
, elaborating e
with respect to K
will produce the same m
with the same static type T
. For such expressions, we sometimes informally refer to the static type of e
as a shorthand for the T
that results from elaborating e
to some m
in some context K
".
Is that right? Or do you mean to have a K
in the first part as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I get why this is confusing. Let me try to explain what I'm trying to achieve here, and then maybe you can help me decide how to achieve it (or whether it's even worth trying).
What's happening in my mind is that I've recently introduced a new concept to the reader: the idea of an elaborated expression. To distinguish this new concept from the old thing we used to mean by "expression" in the spec, I'm going to refer to the old thing as an "unelaborated" expression. So everywhere the existing Dart language spec talks about expressions, that should be understood to mean "unelaborated expression". Elaborated expressions are the new thing.
And now I've just told the reader that static types (which the existing Dart language spec talks about a lot) are a property of elaborated expressions. They're not a property of unelaborated expressions. I think that's a really useful thing to do, in this PR, because it keeps me honest: I never want to try to make a claim about the static type of an expression before the type inference process has finished with it, and since elaborated expressions are the outputs of the type inference process, restricting myself to talking about the static types of elaborated expressions achieves that.
So what should the reader make of all the other spec documents that pre-dated this PR? Those documents just talk about the static types of "expressions". I'm implicitly telling the reader that when those older documents say "expression" they mean "unelaborated expression". But I'm also explicitly telling them that "static type of an unelaborated expression" isn't a meaningful thing.
So, when I said:
Informally, we may sometimes speak of the static type of an unelaborated expression
e
; when we do, what is meant by this is the static type of the elaborated expressionm
that results from performing expression inference one
.
What I was trying to say was: hey reader, chill out. If you see an old spec document that says "static type of expression e
", just pretend it says "static type of whatever e
elaborates to", and things will work out.
But I don't really want to just leave it at that, because I also want to encourage us to try to start being more careful about our language with this sort of thing in the future.
So when I followed that with:
Note in particular that a given expression might have a different static type at different points in the code, since the behavior of expression inference depends on the context
K
, as well as the flow analysis state.
What I was trying to say was: but in the future let's try to be careful not to get too sloppy with this. Let's try not to say "the static type of the expression 5
is int
", because sometimes the context applied to 5
causes it to elaborate to @DOUBLE(5)
, which of course has a static type of double
. Similarly, if we say "the static type of v
is Object
", that can be misleading, because maybe the declared type of v
is indeed Object
, but a particular expression v
might get elaborated to @PROMOTED_TYPE<int>(v)
(because in the flow analysis state at the point where v
appears, it's promoted).
Does that make any more sense? Do you think any of this is worth mentioning? Do you have any suggestions about how to condense this to something spec-worthy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, this makes sense. I think I might just tack on something like the following "...that results from performing expression inference on e
in some context K
, where K
depends on where e
appears in the broader program.". And then continuing with your "Note in particular.." which expands on the bit added to the previous sentence. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it, thanks! So updated!
resources/type-system/inference.md
Outdated
type at different points in the code, since the behavior of expression inference | ||
depends on the context `K`, as well as the flow analysis state._ | ||
|
||
An invariant of expression inference, known as _soundness_, is that when an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
invariant -> property maybe?
Peeking ahead, I don't think the cases you've covered so far are broad enough for me to be sure which direction you're taking this, so let me just throw out a couple of thoughts for consideration as you build this out.
In general, you have some choices to make about how to structure this (I realize you're being informal, so in some sense it doesn't really matter, but at least I'll throw this out as food for thought).
- You could choose to define elaboration such that if
e -> m : T
, thenm
is guaranteed to be a thing which can be run and will produce a value which is a subtype ofT
.- If you make this choice, then elaboration must be a partial function or else extend it to explicitly return an "error case" (e.g. you have to say what happens when you elaborate
String x = 3
). - You also will in principle have to do a lot of error checking that isn't really a part of inference (e.g. "the subclass graph is not cyclical")
- If you make this choice, then elaboration must be a partial function or else extend it to explicitly return an "error case" (e.g. you have to say what happens when you elaborate
- You could choose to define elaboration such that if
e -> m : T
, anderrorFree(e, T)
, thenm
is etc- Given this choice, you may or may not be able to make elaboration a total function (maybe
String x = 3
elaborates fine, it just gets rejected later) - If you're taking this approach, then you probably want to say something fuzzy here about "and some other conditions hold"
- Given this choice, you may or may not be able to make elaboration a total function (maybe
To be completely formal, there's a lot of detail about what order various invariants need to be established. For example, inference relies on subtype matching, but it's not clear that subtype matching is well-defined if the types in question are not known to be well-formed (i.e. don't have undefined type variables, or cyclic class hierarchies, etc). Obviously, we're not trying to be completely formal here, but worth maybe nodding in this direction?
Given all this, I might be inclined to broaden what you're saying slightly to note that 1) in addition to filling in missing types, elaboration is intended to reject programs which cannot be guaranteed to be sound, and 2) there may be static checks required to ensure soundness that are not specified here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
invariant -> property maybe?
Good point. Done.
Peeking ahead, I don't think the cases you've covered so far are broad enough for me to be sure which direction you're taking this, so let me just throw out a couple of thoughts for consideration as you build this out.
In general, you have some choices to make about how to structure this (I realize you're being informal, so in some sense it doesn't really matter, but at least I'll throw this out as food for thought).
You could choose to define elaboration such that if
e -> m : T
, thenm
is guaranteed to be a thing which can be run and will produce a value which is a subtype ofT
.
- If you make this choice, then elaboration must be a partial function or else extend it to explicitly return an "error case" (e.g. you have to say what happens when you elaborate
String x = 3
).- You also will in principle have to do a lot of error checking that isn't really a part of inference (e.g. "the subclass graph is not cyclical")
You could choose to define elaboration such that if
e -> m : T
, anderrorFree(e, T)
, thenm
is etc
- Given this choice, you may or may not be able to make elaboration a total function (maybe
String x = 3
elaborates fine, it just gets rejected later)- If you're taking this approach, then you probably want to say something fuzzy here about "and some other conditions hold"
To be completely formal, there's a lot of detail about what order various invariants need to be established. For example, inference relies on subtype matching, but it's not clear that subtype matching is well-defined if the types in question are not known to be well-formed (i.e. don't have undefined type variables, or cyclic class hierarchies, etc). Obviously, we're not trying to be completely formal here, but worth maybe nodding in this direction?
Given all this, I might be inclined to broaden what you're saying slightly to note that 1) in addition to filling in missing types, elaboration is intended to reject programs which cannot be guaranteed to be sound, and 2) there may be static checks required to ensure soundness that are not specified here.
I like this idea. I need to spend some time thinking about where I want to add it, so I'm going to postpone it to a follow-up PR.
|
||
- Let `m_1` be the result of performing expression type inference on `e_1`, in | ||
- Let `m_1` be the result of performing expression inference on `e_1`, in | ||
context `_`, and then coercing the result to type `Object`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know whether it was something that we discussed, but it is definitely from my perspective the correct decision. In general the context is intended to capture "what this thing should be", and what this thing should be is a subtype of Object
, not a subtype of Object?
.
`Future<T_2>`, in which case `m_2` can be optimized to | ||
`@PROMOTED_TYPE<Future<T_2>>(m_1)`._ | ||
|
||
- _For soundness, we must prove that whenever |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for clarity, I would be slightly more explicit here. Something like:
For soundness, we must prove that @AWAIT_WITH_TYPE_CHECK<T_2>(m_1)
always evaluates to a value which is an instance satisfying Future<T_2
. By the definition of @AWAIT_WITH_TYPE_CHECK
, there are two cases to consider.
First, if m_1
evaluates to v_1
and v_1
is an instance of Future<T_2>
then we are done.
Second, if ....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable. As I mentioned earlier, this is going to get easier once we have proven that T <: FutureOr<flatten(T)>
for all types T
, which I intend to do in a follow-up PR. I'll address this comment in that follow-up PR.
For now I’ve added a short TODO as a reminder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#3920 deletes this text.
soundness by assuming that `v` is an instance satisfying type `T_1` and not | ||
an instance satisfying type `Future<T_2>`, and then considering two cases:_ | ||
|
||
- _If the runtime value of `v` is `null`, then by soundness, `T_1` must be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically no - it could be an extension type which erase to one of those things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I’ll address this in the follow-up PR as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#3920 deletes this text.
type `T_1`, but not an instance satisfying type `Future<T_2>`, then `v` is | ||
an instance satisfying type `T_2`._ | ||
|
||
- _Substituting in the definition of `T_2`, we need to show that if `v` is a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fishythefish this sort of thing is the kind of coherence property that I can imagine pulling out as a lemma to be validated. "Lemma: If v
has static type T
and runtime type S
where S
is not Null
then either S <: flatten(T)
or S <: Future<flatten<T>>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the "not Null
" restriction important? If S
is Null
(i.e. v
is null
), then T
must be nullable. Therefore, flatten(T)
is nullable (this lemma can be proven separately if needed), so S <: flatten(T)
.
- _If `T_1` is `S?`, then `flatten(T_1)` is `flatten(S)?`. We need to show | ||
that if `v` is a non-null instance satisfying type `S?`, but not an | ||
instance satisfying type `Future<flatten(S)?>`, then `v` is an instance | ||
satisfying type `flatten(S)?`. Assuming `v` is a non-null instance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fishythefish similarly here: if v
has runtime type T
and T <: S?
then either T
is Null
or T <: S
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The union type rules are that T <: S?
implies T <: S
and/or T <: Null
.
It happens that today T <: Null
implies T = Null
or T <: Never
, and T <: Never
is subsumed by T <: S
.
Or actuall, that T = Null
up to Norm. T
can also be Never?
or Null?
. The flatten function is defined on non-normalized type terms. (We haven't said otherwise, although I'm not sure it behaves differently for any different norm-equivalent types. I'm also not completely sure it doesn't.)
It's better to just not make the assumption unless it's needed. If T <: Null
is enough, then just use that.
In the code review discussion dart-lang#3850 (comment), we agreed to use the terminology "instance satisfying type T" to mean "instance whose runtime type is T or a subtype thereof", and I made the necessary changes to `resources/type-system/inference.md`. However, after digging through `dartLangSpec.tex`, it appears that the terminology we were already using in the spec for this concept is "instance implementing type T". This commit modifies `resources/type-system/inference.md` to match the spec terminology. It also fixes two places in the spec where we were incorrectly saying "instance of type T", but we meant "instance implementing type T". Furthermore, the spec consistently uses the term "dynamic type of `v`" for the concept that we informally refer to as the "runtime type of `v`", so this commit updates `resources/type-system/inference.md` to follow the spec convention.
The following changes (mostly based on suggestions by Leaf and Erik) are included: - The text is now under a heading called "expression inference" rather than "body inference" (which was confusing). - Instead of trying to describe the output of type inference using a vague notion of "expression artifacts" whose runtime behavior is described in prose, I'm specifying it using a Dart-like syntax called "elaborated expressions". Elaborated expressions look generally like Dart expressions, except they have some limitations (types are fully specified, and complex constructs like null shorting are desugared, and type checks are explicit) and support some extra forms (such as "let" expressions). - I've simplified the terminology I use to describe the three things that can happen when an expression is evaluated (it can diverge, throw an exception, or evaluate to a value). My previous terminology was confusing ("fail to complete at all", "complete normally with a value", or "complete with an exception"). - I've re-worked the treatment coercions so that I'm no longer making subsumption explicit in the language of elaborated expressions. (I never meant to make subsumption explicit; it happened as a mistake due to unclear thinking on my part.) - I broke the "numbers" subsection into separate "integer literals" and "double literals" sections. (Previosuly I only specified how integer literals are handled.) - The sketches of soundness proofs have been simplified a bit. I'm no longer trying to crisply distinguish "expression soundness", "return value soundness", "tear-off soundness", and "future soundness"; rather, I'm speaking about soundness more informally, as befits an informal proof sketch. - On the other hand, I've clarified precisely what I mean by "is an instance of", since the soundness proof sketches don't make any sense if your definition of this concept doesn't agree with what's in my head. I've also made sure this concept correctly accounts for extension types. - And I've added text explaining why the interpretation of `await` expressions is indeed sound. (Previously I had a "TODO" because I thought maybe they weren't sound. But they're sound; it just took a little extra work to prove it to myself.)
The following changes (mostly based on suggestions by Leaf and Erik) are included:
The text is now under a heading called "expression inference" rather than "body inference" (which was confusing).
Instead of trying to describe the output of type inference using a vague notion of "expression artifacts" whose runtime behavior is described in prose, I'm specifying it using a Dart-like syntax called "elaborated expressions". Elaborated expressions look generally like Dart expressions, except they have some limitations (types are fully specified, and complex constructs like null shorting are desugared, and type checks are explicit) and support some extra forms (such as "let" expressions).
I've simplified the terminology I use to describe the three things that can happen when an expression is evaluated (it can diverge, throw an exception, or evaluate to a value). My previous terminology was confusing ("fail to complete at all", "complete normally with a value", or "complete with an exception").
I've re-worked the treatment coercions so that I'm no longer making subsumption explicit in the language of elaborated expressions. (I never meant to make subsumption explicit; it happened as a mistake due to unclear thinking on my part.)
I broke the "numbers" subsection into separate "integer literals" and "double literals" sections. (Previosuly I only specified how integer literals are handled.)
The sketches of soundness proofs have been simplified a bit. I'm no longer trying to crisply distinguish "expression soundness", "return value soundness", "tear-off soundness", and "future soundness"; rather, I'm speaking about soundness more informally, as befits an informal proof sketch.
On the other hand, I've clarified precisely what I mean by "is an instance of", since the soundness proof sketches don't make any sense if your definition of this concept doesn't agree with what's in my head. I've also made sure this concept correctly accounts for extension types.
And I've added text explaining why the interpretation of
await
expressions is indeed sound. (Previously I had a "TODO" because I thought maybe they weren't sound. But they're sound; it just took a little extra work to prove it to myself.)Thanks for your contribution! Please replace this text with a description of what this PR is changing or adding and why, list any relevant issues, and review the contribution guidelines below.
Contribution guidelines:
dart format
.Note that many Dart repos have a weekly cadence for reviewing PRs - please allow for some latency before initial review feedback.