Rhombus expansion and enforestation on shrubberies #162
Big comments: The four syntactic categories are not really defended. You basically say, "Racket has three, Rhombus adds one more". If someone has never heard of Racket, how would we explain these things? What even is a "syntactic category"? I think something like, "Based on the surrounding context, this shrubbery blob is put into one of these categories and it is the choice of the context NOT of the blob". In a naive LISP, there's just two categories because If that answer is correct, then I think this document should say something about how we know what syntactic category a particular blob is in. I think your API is like this, because it says that I think that explanation should also justify why it is this particular set of four things.... maybe:
These feel quite natural and general. However, patterns feel very specific:
That feels very particular to one "language". In other words, all of these categories are specifying an "interface" --- what they receive and what they return --- where what they receive is syntax with a promise about where it occurs and what they return is the "influence" or "effect" they can have on their context. The categories are roughly defined by the effect they can have: declarations do module-effects, like imports and submodules; definitions do binding-effects; expressions have no effects. Your "patterns" have a constraint-effect (the matcher function) and a binding-effect, where the first effect is "outward" in that it communicates to the pattern match "Don't select me" and the second effect is "inward" in that it influences a "sibling" based on the particular syntax of the matcher. Perhaps these outward/inward effects could be expressed more generally:
A "match transformer" might be
This, of course, "knows" that it is patch of I am particularly concerned about how this idea of binding patterns could, for instance, be used for non-value work, like in type declarations; consider this Haskell:
Perhaps we could write
to use a bounded quantification style with a binding pattern. In this case the
Small comments: I believe that this sentence ---
Big shed --- I feel like the
I think that
@jeapostrophe - thanks for the comments. "Binding" is a better word than "pattern", so I've switched to using that word. Where "binding" was previously used for the You're right that the category of a shrubbery for expansion is determined by its context, and I've updated the description to say that. I've also updated to clarify that the four categories are just the ones directly supported by the expander, while a language built on the expander can have even more categories. The rationale now starts with a paragraph justifying the four categories (which is simple: experience with Racket). I'm not sure I understand your type-declaration example. I would expect a typed language to have an additional syntactic category for types, and the rationale now notes that possibility. I would hope that the new category is supported through a new kind of compile-time value, and not a compile-time function that an expander calls to determine the category where it's being used. I take your "If someone has never heard of Racket, how would we explain these things?" comment as being primarily about how to justify the four syntactic categories. The comment could also suggest that the proposal is gibberish to someone who has never heard of Racket, and I would agree. If and when a Rhombus language built on these concepts exists, then it will be possible to explain everything in those terms. Meanwhile, this proposal bootstraps by using Racket for general concepts and to make the API concrete.
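The point that a shrubbery's category is chosen by its context, not by the shrubbery itself, can be illustrated with a small sketch. This is not the proposal's API: the table layout, the `expand` function, and the two transformer names are all hypothetical, and Python is used only to keep the sketch language-neutral.

```python
# Hypothetical sketch: the same surface form is looked up in a different
# table depending on the context that is expanding it.

def expand_cons_expression(args):
    # As an expression, cons(x, y) constructs a pair.
    return ("make-pair", args[0], args[1])

def expand_cons_binding(args):
    # As a binding, cons(x, y) matches a pair and binds its two fields.
    return ("match-pair", {"binds": list(args)})

# One table per syntactic category; the names are illustrative only.
TRANSFORMERS = {
    "expression": {"cons": expand_cons_expression},
    "binding": {"cons": expand_cons_binding},
}

def expand(category, head, args):
    """Expand `head(args...)` in the category chosen by the surrounding form."""
    try:
        transformer = TRANSFORMERS[category][head]
    except KeyError:
        raise SyntaxError(f"{head!r} is not bound in the {category} category") from None
    return transformer(args)
```

Here the caller, standing in for the enclosing form, passes the category; nothing about `cons(x, y)` itself selects it. A language with more categories would simply add more tables.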
enforestation/0000-enforestation.md
Outdated
with the usual precedence. Unlike the other operators, the `.`
operator's right-hand side is not an expression; it must always be an
identifier.
meant to know about `::` specifically; the `::` meant to be a binding
is
enforestation/0000-enforestation.md
Outdated
remains between shrubbery notation and Racket's macro expander,
because shrubbery notation is intended to be used with less grouping
than is normally present in S-expressions.
syntax-object form, so it can include scopes to determine a mappin for
mapping
enforestation/0000-enforestation.md
Outdated
Rhombus expander will dispatch on operator binding only during the
The relevant syntactic category for a shrubbery is determined by its
surrounding forms, and not inherent to the shrubbery. For example,
`cons(x, y)` might mean one thing as an expression and aother as a
another
enforestation/0000-enforestation.md
Outdated
surrounding forms, and not inherent to the shrubbery. For example,
`cons(x, y)` might mean one thing as an expression and aother as a
binding. Exactly where the contexts reside in a module depends on a
specific Rhambus language that is built on the Rhombus expander, so
Rhombus
enforestation/0000-enforestation.md
Outdated
binding. Exactly where the contexts reside in a module depends on a
specific Rhambus language that is built on the Rhombus expander, so
it's difficult to say more here. Meanwhile, a full Rhambus language
may have more syntactic categories than the oes directly supported by
ones
I've made some copy-editing comments inline. I like your additions. "The four categories for the Rhombus expander are merely the ones that are directly supported by the expander and its API." --- You say this elsewhere, and below, but I think it is worth talking about, in some way, the idea that Rhombus will be "syntactic category heavy" while Racket is "syntactic category light". What I mean by that is that the mores of Racket macros are not to make new categories, in part because the language & standard library doesn't, and what we're trying to do in Rhombus is (a) demonstrate how that is useful and (b) make it easy to do. I think that your list of example new categories might be fruitfully expanded based on your imagination and common examples: database query contexts, Web route contexts, import specifications, export specifications, and so on.
I think that I mean that the proposal doesn't try to explain what a "declaration" vs "definition" vs "expression" is. I think that a casual observer of the LISP world would say that there are only two categories---expression and binding---and then if you pressed them, they'd probably admit that definitions are a thing, but my guess is that no one would come up with "declaration". I don't think that I could give a convincing explanation of what a "declaration" in this context or in the Racket context is. I think that writing that explanation would lead to something like "Of course it's obvious all languages have these four things: the top of a compilation unit, a definition context, an expression, and a binding position. That's why those four are built-in to this discussion of Rhombus expansion, because they'll always be there. We're not describing the ceiling of syntactic categories... we're describing the floor, and we're demonstrating how a Rhombus-based language designer should think about their job... just like a Racket-based language designer thinks in a different way, such as by using conventions (like a leading
My point with that example is that it binds |
Include a more substantial prototype language, which helps for writing examples. But at the same time, the expansion engine that's the subject of the proposal is smaller and cleanly separated (in the "enforest" directory/collection) from its use in the prototype.
New draft pushed. Experimenting with a Rhombus prototype helped clarify which pieces belong in this proposal and which details are "a language built with the Rhombus expander". The resulting proposal is more abstract, in the sense that it makes fewer assumptions. But it's more concrete in that the implementation part that belongs to this proposal is cleanly separated out, and it's explained with a lot more examples from the I'll write up more about the prototype soon, maybe as a new PR. |
"The invocation of a transformer for an implicit operator does not include a token for the implicit operator, unlike other transformer invocations." --- Am I correct that this is different than Racket? Why the change? I feel like it can be nice for I notice that you seem to be going towards Typo: "repersenting", "shribbery" |
@jeapostrophe You're right that Racket passes along a synthesized identifier to implicit implementations, and that has worked fine, so it's probably better to keep that behavior here. Changed. |
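The behavior agreed on here, where an implicit operator's transformer is handed a synthesized token just as an explicit operator's transformer is handed its own token, might be sketched as follows. This is only an illustration under assumptions: the `Token` class, the `IMPLICITS` table, the `apply_implicit` helper, and the implicit name used in it are hypothetical and are not taken from the proposal's implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    name: str
    srcloc: tuple  # (line, column) copied from the term that triggered the implicit

def juxtapose_transformer(op_token, left, right):
    # The transformer receives the synthesized token first, exactly as it
    # would receive an explicit operator token.
    return ("apply", op_token.name, left, right)

# Hypothetical implicit name, used for illustration only.
IMPLICITS = {"#%juxtapose": juxtapose_transformer}

def apply_implicit(name, left, right, srcloc):
    """Invoke the transformer for an implicit operator with a synthesized token."""
    op_token = Token(name, srcloc)
    return IMPLICITS[name](op_token, left, right)
```

Keeping the calling convention uniform means one transformer can serve both as an implicit and as an explicitly written operator, and it can use the token for error reporting.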
Also, sync the implementation with the current prototype.
One feature of several other languages that has never worked that well in Racket is a sort of combining of multiple top-level forms together. Haskell is a good example of this:

    not :: Boolean -> Boolean
    not True = False
    not False = True

Here we have three conceptually-separate forms but they all get grouped into the same definition. Another example, featuring a somewhat more regular syntax, is decorators in Python or annotations in Rust:

    @foo
    def f(x): return x

    #[derive(Serialize, Deserialize, Debug)]
    struct Point {
        x: i32,
        y: i32,
    }

Typed Racket implements something like this for type annotations:

    (: f (-> Integer Integer))
    (define (f x) x)

but that works via mutation and coordination using Is it possible to use Rhombus macros to combine forms this way? It feels like the following should definitely be possible:
Where the In the current implementation, at least as described in this document, that seems like it would involve passing all the remaining forms to definition and declaration macros in the |
You're right that this sort of grouping is not specifically handled and would need some cooperation from an enclosing form (such as the module top-level and block forms). There's a precedent for cooperation from enclosing definition forms in #163's support for A possibility for some things, which fits more directly into the expansion framework here, is to use different binding spaces. For example, in with something like
then The binding-space approach doesn't help with the things that look like annotations, though, where the way encouraged by shrubbery notation is to have a new form with the definition in a block:
If avoiding this kind of nesting is an important enough goal, then it might suggest a different surface-syntax approach instead of the shrubbery approach. |
Would it be possible to eliminate the apparent cooperation from an enclosing form by using interposition point form and/or reader macro? (this is what #85 is proposing). |
I think giving compound structure to declarations at the top level of a file has a lot in common with parsing. Instead of a sequence of tokens, it's a sequence of declarations. So, for a language with an enforestation pass to process custom infix operations, I wouldn't be at all surprised to see a similar kind of enforestation pass for declarations. I think there are some different design pressures for it, though. Function declarations, which would probably be the single most numerous kind of declaration, are capable of mutual recursion and can be freely rearranged with respect to each other. This makes it almost odd when some declarations have a more ordered relationship.
I think these can basically break down into phases:

- Delimiters and annotations cooperate with a delimiter-macroexpander to produce a structured document of sections made up of annotated section headings and annotated declarations. (Along the way, the delimiter-macroexpander can discover definitions of macros that extend its behavior.)
- Now that the delimiter-macroexpansion is out of the way, we section-macroexpand according to policies present on the annotated section headings. A section-macro in turn typically expands a section body by first parsing out contiguous "definitions by parts" groupings, then running several topic-specific macroexpanders that take turns processing the "ordered whole-program composition" declarations they're interested in.
- After that, each section has been broken down into an orderless collection of ordered collections of declarations (some individual declarations; some definitions by parts; some ordered whole-program composition systems that are independent of each other). I guess I'll call them subfeeds. Each one can then be subfeed-macroexpanded in its own way.

Breaking things down into hierarchical sections and independent subfeeds of declarations makes this syntax more concurrent; in fact, some of the passes can logically begin processing before the others have finished. I find this valuable because it should help with reporting multiple errors in independent parts of the file and should help with caching of compilation results.

This describes just the macroexpansion of the top level of a file, but I think most nested blocks could be macroexpanded in roughly the same way (with variations that have to do with specific applications, like some blocks being Scribble documentation and such). Having the outside edge of a file be macroexpanded before the inner parts would be great for letting IDE tooling help out even when the inner parts have errors in them.

That's just one idea, anyway.
:) I've been sitting on this combination of outside-in parsing and Markdown-like section headers for a while, and I thought about making it a proposal at some point, but I'm not sure I have the time. |
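The "definitions by parts" grouping step mentioned above can be sketched in a few lines. This is a rough illustration, not something from the proposal: the `(name, kind, body)` tuple shape and the `group_by_parts` function are assumptions, and the sample data mirrors the Haskell `not` example from earlier in the thread.

```python
from itertools import groupby

def group_by_parts(declarations):
    """Group contiguous declarations that share a name into one definition.

    Each declaration is a (name, kind, body) tuple (a hypothetical shape).
    """
    return [
        (name, [(kind, body) for _, kind, body in parts])
        for name, parts in groupby(declarations, key=lambda d: d[0])
    ]

decls = [
    ("not", "signature", "Boolean -> Boolean"),
    ("not", "clause", "True = False"),
    ("not", "clause", "False = True"),
    ("id", "clause", "x = x"),
]
grouped = group_by_parts(decls)
```

Because `groupby` only merges adjacent items, two runs of the same name separated by another declaration stay separate, which matches the "contiguous" requirement and keeps the pass independent of declaration order elsewhere in the section.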
@sorawee If I understand what you mean, @rocketnia That's an interesting line of thought. While I'd be wary of building in too much complexity, it does seem like there could be improved cooperation from the default |
One question I had while reading the description of the enforestation API/implementation was whether the building blocks have Rhombus-level API, or just Racket-level API. That is, the API described here is "three Racket structs and one macro". Is the intent that implementation on top of Racket is an exposed feature of the macro system (the same way that Typed Racket's design presumes that Racket is a part of what you know about) or is the eventual goal to have a pure-Rhombus explanation of the macro system (the way that Chez Scheme is basically just an implementation detail of Racket itself)? |
The intent is to have a Rhombus-level API, and everything would be presented and explained in those terms (so, like Chez Scheme relative to Racket). |
I've updated the "shrubbery-rhombus-0" package with a proof-of-concept Here's a dumb example, which is a
Looks to me like that's exactly what's needed for I think a lot of the rest of what I described above could be implemented as a library in terms of that. Section heading syntaxes could operate by being sequence macros that took more control over processing the rest of the file. They'd scan for other delimiter-macro definitions, delimiters, and annotations, then proceed to expand the hierarchical result according to a section-macroexpander implementation (and so on) defined within the library. They might leave an "unused portion" after some kind of section-ending delimiter, but I think once a syntax has its beginning and ending both marked, it might as well use brackets (or other block structure). Speaking of which, I think
But it serves its purpose as a toy example, at least. :) A more realistic simple example of leaving behind an unused portion might be an annotation that only consumes a single declaration, perhaps with some other arguments before that, like a documentation block. |
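The idea of a sequence macro that consumes part of what follows it and leaves behind an unused portion might look roughly like this. The sketch assumes a block is a flat list of forms and that a sequence macro returns a pair of its expansion and the forms it did not consume; `annotate`, `expand_block`, and the `@doc` form are hypothetical names, not part of the prototype.

```python
def annotate(forms):
    """Hypothetical sequence macro: consume one argument and the single
    declaration that follows it, and return the rest untouched."""
    if len(forms) < 2:
        raise SyntaxError("annotation needs an argument and a declaration")
    doc, decl, *rest = forms
    return ("documented", doc, decl), rest

def expand_block(forms, sequence_macros):
    """Expand a block left to right, letting sequence macros see the tail."""
    out = []
    while forms:
        head, *tail = forms
        if head in sequence_macros:
            # The macro decides how much of the tail it consumes.
            expanded, forms = sequence_macros[head](tail)
            out.append(expanded)
        else:
            out.append(head)
            forms = tail
    return out
```

For example, `expand_block(["@doc", "adds one", "def inc", "def dec"], {"@doc": annotate})` wraps only `def inc` and leaves `def dec` for ordinary expansion. The sketch deliberately ignores the case where the consumed declaration is itself annotated.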
On the other hand, an annotation probably wouldn't make a great example either. I think there's quite a bit of subtlety to take into account when trying to consume a specific number of declarations. The very fact that a declaration could be annotated means that it could be made up of more than one part at the s-expression/syntax object level, and a well-designed annotation macro should check for that in its input as well. This is why I have the parsing of delimiters and annotations intermixed with each other in the first part of the macroexpansion strategy I described (...and now I wonder if definitions by parts should join them). I think if a sequence macro doesn't go to a particular effort to parse out annotations and delimiters and such from the declarations that follow it, it should probably treat those declarations as a single, indivisible block. The Most of the CPS assistance only makes sense in a local block, where treating the rest of the block as a run time continuation can make sense. At the module level, in most cases, the indivisible block would be spliced into the module to facilitate mutual recursion, using To gather my thoughts, I think the most compelling examples of sequence macros break down as follows:
...The more I think about 1, the less I like it. I think this would be better handled at the The Number 4 is probably the foremost reason to write a |
I haven't read every discussion point, so perhaps already answered: Have you considered an alternative design in which operator declarations specify the contexts of the subexpressions? It seems like it would be nice if the more complex |
@michaelballantyne I think I don't understand the question. While I can see how specifying contexts on arguments would work, like using
The `define-enforest` macro is now parameterized over the implicit names that it uses.
Based on a suggestion from @willghatch, |
Rendered
This proposal builds on #122, defining a macro-expansion layer suitable for shrubberies.
In other words, it still doesn't define a language like `#lang rhombus`, but it defines an expansion and enforestation layer that works toward that goal. It's analogous to Racket's core expander, but also defined in terms of Racket's expander.

The implementation in the proposal is currently the same as the https://github.com/mflatt/shrubbery-rhombus-0 package that makes `#lang shrubbery` run in Racket and DrRacket (with just a few operators and a definition form).
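For readers unfamiliar with the term, enforestation here means turning a flat sequence of terms and operators into a tree by consulting operator bindings. The following is a minimal sketch of that idea using precedence climbing; it is not the proposal's algorithm or API. The numeric precedences, the `OPERATORS` table, and the `enforest` function are assumptions made for illustration, and only binary infix operators are handled.

```python
# Hypothetical operator table: name -> (precedence, associativity).
OPERATORS = {"+": (10, "left"), "*": (20, "left"), "^": (30, "right")}

def enforest(tokens, min_prec=0):
    """Turn a flat token list into a nested tree.

    Returns (tree, remaining_tokens).
    """
    if not tokens:
        raise SyntaxError("expected a term")
    left, rest = tokens[0], tokens[1:]
    while rest and rest[0] in OPERATORS:
        op = rest[0]
        prec, assoc = OPERATORS[op]
        if prec < min_prec:
            break  # a looser operator belongs to an enclosing call
        # A left-associative operator must not capture another operator
        # of the same precedence on its right-hand side.
        next_min = prec + 1 if assoc == "left" else prec
        right, rest = enforest(rest[1:], next_min)
        left = (op, left, right)
    return left, rest
```

In an expander, the table lookup would be replaced by a lookup of the operator's compile-time binding, so that macros can add operators and the same loop can serve more than one syntactic category.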