Switch to New Parser #55
base: main
@@ -1234,7 +1234,7 @@ module Compilation =

let ast, diagnostics =
let nodes, diagnostics =
    List.map Parse.parseFile sources
    List.map LegacyParse.parseFile sources
    |> List.fold (fun (nodes, diags) (n, d) -> (List.append nodes [ n ], List.append d diags)) ([], [])

{ Location = TextLocation.Missing
We probably want to come up with a better way of modelling this. The previous parser was run on each input, and then we pasted the results together into a single big SEQ. Instead I guess we should have a wrapper `Compilation` type, which contains a set of `BoundCompilationUnits`. We can bind them in sequence, and propagate some state (lib definitions etc.). The key is we don't want public items from one unit to just become 'magically' available in another. This will probably want to be different for script compilations, however. In that case we'd want multiple passes at compilation to share definitions from the previous pass.
We may want to handle this by just having the last compilation unit in the compilation be treated specially, allowing definitions from it to leak into the root binder scope. That way inherited compilations from scripting would have access to the previous definitions.
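To make the idea above concrete, here is a minimal sketch of how such a wrapper could look. Every name in it (`Scope`, `BoundCompilationUnit`, `Compilation`, `bindUnit`, `bindAll`) is hypothetical and not taken from this PR, and the "binder" is a stand-in that only records which names a unit defines. It only illustrates the scoping rule: units do not see each other's definitions, and only the last unit leaks into the root scope.

```fsharp
// Hypothetical sketch: none of these types or functions exist in the PR.
type Scope = Set<string>

type BoundCompilationUnit =
    { Name: string
      Definitions: Scope }

type Compilation =
    { Units: BoundCompilationUnit list
      RootScope: Scope }

/// Stand-in binder. A real one would resolve references against `root`.
let bindUnit (root: Scope) (name: string, defined: string list) : BoundCompilationUnit =
    { Name = name
      Definitions = Set.ofList defined }

/// Bind each unit against the inherited scope only, so definitions in one
/// unit are not visible to the others. The last unit is treated specially:
/// its definitions leak into the root scope for later compilations.
let bindAll (inherited: Scope) (sources: (string * string list) list) : Compilation =
    let units = sources |> List.map (bindUnit inherited)

    let leaked =
        match List.tryLast units with
        | Some last -> last.Definitions
        | None -> Set.empty

    { Units = units
      RootScope = Set.union inherited leaked }

// Scripting: the second compilation inherits the first one's root scope.
let first = bindAll Set.empty [ ("a.scm", [ "helper" ]); ("repl-1", [ "x" ]) ]
let second = bindAll first.RootScope [ ("repl-2", [ "y" ]) ]
// second.RootScope contains "x" and "y", but not "helper".
```

Propagating shared state such as library definitions between units is left out of the sketch; it would likely become an extra accumulator threaded through a fold rather than a plain `List.map`.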
|> Seq.choose (NodeOrToken.asToken)
|> Seq.tryFind (tokenOfKind AstKind.OPEN_PAREN)

member public _.Body = red.Children()
Need to do something about form bodies

member _.Text =
    red
    |> NodeOrToken.consolidate (fun n -> n.Range) (fun t -> t.Range)
Needs to actually get the text, not range.

static member TryCast(red: SyntaxNode) =
    if red.Kind = (AstKind.CONSTANT |> astToGreen) then
        new Constant(red) |> Some
    else
        None

type Expression =
    | Form of Form
Inheritance might be better than this, along with active patterns.
ReadLine.Read("[]> ")
|> Parse.readProgram "repl.scm"

let private print (result: ParseResult<Program>) =
Some kind of ast explorer which allows you to cd and ls around the tree might be nice.
test/Feersum.Tests/LexTests.fs
Outdated
@@ -16,16 +22,9 @@ let private getValue token =

[<Fact>]
let ``Empty input text always returns end of file`` () =
Test name needs updating to match expectations
@@ -6,6 +6,14 @@ open Feersum.CompilerServices.Diagnostics
open Feersum.CompilerServices.Syntax
open Feersum.CompilerServices.Utils

module private BinderDiagnostics =

    // TODO: remove this and replace with better binder errors
TODO: Distinct binder diagnostics.
ctx.Diagnostics.Emit
    BinderDiagnostics.bindError
    (List.last formals).Location
    "Saw dot but no ID in formals"
Dots should be handled by the parse, not the bind
match node.Kind with
| AstNodeKind.Ident i -> BoundDatum.Ident i
| AstNodeKind.Constant c -> BoundDatum.SelfEval(BoundLiteral.FromConstant c)
| AstNodeKind.Dot -> BoundDatum.Ident "." // FIXME: This is definitely not right
Fix me. Datum dots.
inherit Expression(red)

member public _.OpeningParen =
    red.ChildrenWithTokens()
Should / could lazies work here?
if s.Line = e.Line then
    sprintf
        "%s(%d,%d-%d): %s: %s"
        (s.Source |> normaliseName)
This breaks click through and problem matching in VS Code. Maybe Code needs a PR?
2ccc457 to 309f86c
f49cb8a to de9feb9
de9feb9 to 7c274b4
This new REPL mode allows debugging how the parser produces syntax trees.
Recognise simple forms. No support yet for dotted forms.
Start moving the new parser infra into primary place.
Introduce a collection of types to model out the node kinds in our tree. Fixup our parser to return a typed result instead of plain `SyntaxNode` from Firethorn.
Properly handle warning diagnostics in parse results.
Introduce base type for syntax items.
Update diagnostics to store a kind. This can then contain information shared amongst all instances of a given diagnostic. For now we use it to store our error code.
Ensure that each stage of the compiler is grouped in the `.fsproj`. This should also help ensure that each stage of the compilation pipeline only depends on those before it.
Simplify the API Layout of the Lex module. No need for a nested module to store the token sets any longer.
Move the parser from an object to a collection of functions. Each parser function takes a `state` as the final parameter, and returns a new state to allow for modifications. This is the same approach taken with `Firethorn` recently in the `Predciated` project.
Update the parser to allow `bump` to _optionally_ return a token. If no token is returned then we will just emit an empty token as the error rather than throwing. This fixes the test case which fails to provide a closing `)`, causing `expect` to be called when no tokens remain.
We don't need this any more.
We don't actually need a stopwatch for this, instead just use `Stopwatch.GetTimestamp`.
It seems that with the current debugger, and our current output format, we no longer need to be quite so aggressive with our debug settings. Roll back to the defaults in some places.
Update the VS Code configuration's indentation a little.
Add better names to most option values. This improves the help text a little. It would be _super nice_ to be able to use proper sub-commands to properly parse the disjoint operations supported by the compiler. It doesn't seem like that is quite possible right now with the default behaviour as we can't use `MainCommand`.
The old bound expressions used to emit AST nodes in quotes. Fix that by introducing a new type for bound datums.
Tree becomes a namespace that directly contains the types used in the syntax tree. There are still types missing from this.
This is less F#, but more uniform in the end.
Update the error output to emit a more compact representation for the position if both the start and end lie on the same line. Add some docs as a placeholder for a full error index.
This layer should allow switching parts of the compiler over to the new syntax tree in a piecewise manner. The idea is we can start accepting the new tree types, and filter them through the LegacySyntax shim when the old syntax is needed.
Replace a timeout in the tests with a bail count. This should help prevent the tests being susceptible to execution speed, as well as simplifying the test overall.
Remove the generic binder error and replace with a set of specific diagnostics for each case.
When the location of a point is missing we can use a hidden sequence point rather than omitting the sequence point entirely. This ensures that the debugger knows the sequence point exists but that the source location is hidden from it. Given that missing locations usually come from fabricated syntax as part of macro expansion this seems like the best solution.
Fixup some of the syntax tests by implementing cooking for identifier string values. This involves walking through the characters in the identifier's token and replacing any escape sequences with the appropriate values.
Re-enable the remaining new parser tests and properly handle the different escape sequences.
Update the parser specs to use the new parser. Change the serialisation for the tests to make things _a little_ more compact.
Add support for parsing dotted forms. The dotted tail is a node in the tree under the form. Add support for this in our syntax patterns.
TCE means that we can run programs like this that recurse forever without having to worry about stack consumption. This program runs using all of a single core indefinitely without any memory use change.
Rather than fabricating a full location for each token instead we can just store the offset into the document. The `TextDocument` API allows this to be converted into a line-col as and when required.
Some work to improve the handling of text locations by the `TextDocument`. This makes sure that locations are always at a consistent line and column offset. Previously the first line was 0-based and the remaining lines 1-based. Still not totally sure if we want the position objects to be 0-based like Firethorn or 1-based ready for output to the screen.
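As an illustration of the offset-based locations described in the last two commits, the following is a rough sketch of converting a stored offset into a line and column on demand. It is not the actual `TextDocument` API: `lineStarts` and `offsetToPosition` are hypothetical names, and the choice of 1-based lines and columns is an assumption, since the commit message leaves that question open.

```fsharp
// Hypothetical sketch, not the real TextDocument implementation.

/// Offsets at which each line begins. The first line always starts at 0.
let lineStarts (text: string) : int[] =
    [| yield 0
       for i in 0 .. text.Length - 1 do
           if text.[i] = '\n' then
               yield i + 1 |]

/// Convert a character offset into a (line, column) pair.
/// Assumption: both line and column are 1-based, ready for display.
let offsetToPosition (starts: int[]) (offset: int) : int * int =
    // The line containing `offset` is the last one starting at or before it.
    let lineIdx = Array.findIndexBack (fun start -> start <= offset) starts
    (lineIdx + 1, offset - starts.[lineIdx] + 1)

let starts = lineStarts "(define x 1)\n(display x)"
let position = offsetToPosition starts 13
// position is (2, 1): the opening paren of the second line.
```

Computing the line starts once per document keeps each token cheap to store (a single integer). A binary search over `starts` would be the natural replacement for the linear `Array.findIndexBack` on large inputs.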
7c274b4 to 9d1b6f8
9d1b6f8 to 9df82a1
Bringing the new parser up to speed. Replacing all uses of the old parser with the new one, and
binning the old parser.
TODO: