New command: purs codegen #4092

colinwahl · 2021-05-19T04:13:14Z

Description of the change

Implements a new command: purs codegen

purs codegen takes globs matching files that contain the JSON representation of a CoreFn Module (which purs compile can generate). It parses the core functional representation out of each file and passes it to the standard codegen function.

This command allows for CoreFn transformations to be written outside of the compiler (even in PureScript!) without having to worry about using PureScript as a library.

Example usage of this would be:

$ purs compile glob/to/files.purs -g corefn,js
$ <execute pass over generated corefn.json files>
$ purs codegen glob/to/all/corefn.json

This intends to close #3339

Checklist:

Added the change to the changelog's "Unreleased" section with a reference to this PR (e.g. "- Made a change (#0000)")
Added myself to CONTRIBUTORS.md (if this is my first contribution)
Linked any existing issues or proposals that this pull request should close
Updated or added relevant documentation
Added a test for the contribution (if applicable)

colinwahl · 2021-05-19T04:18:27Z

app/Command/Codegen.hs

+      <> Opts.showDefault
+      <> Opts.help "The output directory"
+
+globWarningOnMisses :: (String -> IO ()) -> [FilePath] -> IO [FilePath]


Command.Graph, Command.Compile, and Command.Codegen all contain the same definition here, but I wasn't sure of the most appropriate place to move it so it can be shared.

If there is an appropriate place to move this then I will update all 3 of those modules.

Sounds like a good idea. Create a Command.Common module?

colinwahl · 2021-05-19T04:20:07Z

src/Language/PureScript/Make/Actions.hs

@@ -97,7 +97,7 @@ data MakeActions m = MakeActions
  , readExterns :: ModuleName -> m (FilePath, Maybe ExternsFile)
  -- ^ Read the externs file for a module as a string and also return the actual
  -- path for the file.
-  , codegen :: CF.Module CF.Ann -> Docs.Module -> ExternsFile -> SupplyT m ()
+  , codegen :: CF.Module CF.Ann -> Docs.Module -> Maybe ExternsFile -> SupplyT m ()


When performing codegen via purs codegen, we can create a stub ExternsFile from the CoreFn.Module, but it isn't actually the ExternsFile we want to write. I modified this so that we don't write a bogus ExternsFile during purs codegen.

Maybe it would be better to factor codegen the function into a function for each output (including the externs file), and only use the JS-outputting function for purs codegen.

Good idea - I can go ahead and do that now
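To make the suggested split concrete, here is a minimal sketch of one action per output. Every name in it (CoreFnModule, Externs, Actions, writeJS, writeExterns, logActions) is a hypothetical stand-in and not the compiler's actual MakeActions API; the demo interpreter only logs which outputs would be written.

```haskell
-- Hypothetical, simplified stand-ins; none of these are the compiler's real types.
newtype CoreFnModule = CoreFnModule { cfName :: String }
newtype Externs = Externs String

-- One action per output, instead of a single 'codegen' taking 'Maybe Externs'.
data Actions m = Actions
  { writeJS      :: CoreFnModule -> m ()
  , writeExterns :: CoreFnModule -> Externs -> m ()
  }

-- What a full compile would run: every output.
compileOutputs :: Monad m => Actions m -> CoreFnModule -> Externs -> m ()
compileOutputs acts m ex = writeJS acts m >> writeExterns acts m ex

-- What a codegen-only command would run: JS only, so no stub externs is needed.
codegenOnly :: Actions m -> CoreFnModule -> m ()
codegenOnly = writeJS

-- Demo interpreter: the tuple monad ((,) [String]) just logs the outputs.
logActions :: Actions ((,) [String])
logActions = Actions
  { writeJS      = \m -> (["js:" ++ cfName m], ())
  , writeExterns = \m _ -> (["externs:" ++ cfName m], ())
  }
```

With this shape the codegen-only path never has to construct or pass an externs value at all, which is the point of the suggestion.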

rhendric · 2021-05-28T22:55:51Z

src/Language/PureScript/Make/Actions.hs

@@ -97,7 +97,7 @@ data MakeActions m = MakeActions
  , readExterns :: ModuleName -> m (FilePath, Maybe ExternsFile)
  -- ^ Read the externs file for a module as a string and also return the actual
  -- path for the file.
-  , codegen :: CF.Module CF.Ann -> Docs.Module -> ExternsFile -> SupplyT m ()
+  , codegen :: CF.Module CF.Ann -> Docs.Module -> Maybe ExternsFile -> SupplyT m ()


Maybe it would be better to factor codegen the function into a function for each output (including the externs file), and only use the JS-outputting function for purs codegen.

rhendric · 2021-05-28T22:56:13Z

app/Command/Codegen.hs

+      <> Opts.showDefault
+      <> Opts.help "The output directory"
+
+globWarningOnMisses :: (String -> IO ()) -> [FilePath] -> IO [FilePath]


Sounds like a good idea. Create a Command.Common module?

rhendric · 2021-05-28T23:36:57Z

app/Command/Codegen.hs

+
+  foreigns <- P.inferForeignModules filePathMap
+  (makeResult, makeWarnings) <-
+    liftIO


Is liftIO needed here?

rhendric · 2021-05-28T23:41:32Z

app/Command/Codegen.hs

+    return paths
+
+  concatMapM :: (a -> IO [b]) -> [a] -> IO [b]
+  concatMapM f = fmap concat . mapM f


While you're messing around with this, you can import concatMapM from Protolude. Don't know why we aren't already doing that.
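For reference, a small sketch of what the helper does, written against base only and generalized from IO to any Monad. The expand function is a hypothetical stand-in for glob expansion; the Protolude export mentioned above is not used here.

```haskell
-- Same shape as the local helper in the diff, generalized from IO to any Monad.
concatMapM :: Monad m => (a -> m [b]) -> [a] -> m [b]
concatMapM f = fmap concat . mapM f

-- Hypothetical stand-in for glob expansion: each pattern yields some paths,
-- and an empty pattern fails.
expand :: String -> Maybe [FilePath]
expand "" = Nothing
expand pat = Just [pat ++ "/A.json", pat ++ "/B.json"]
```

Here concatMapM expand ["x", "y"] collects the paths of both patterns into one list, while a single failing pattern makes the whole result Nothing.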

rhendric · 2021-05-28T23:44:22Z

app/Command/Codegen.hs

+  concatMapM f = fmap concat . mapM f
+
+-- | Arguments: use JSON, warnings, errors
+printWarningsAndErrors :: Bool -> P.MultipleErrors -> Either P.MultipleErrors a -> IO ()


This is just Command.Compile.printWarningsAndErrors True, right? Looks like another candidate for Command.Common.

Ah, it is, great call

colinwahl · 2021-06-01T16:22:21Z

Sorry, things got busy around here, I'm going to pick this back up soon to address the feedback.

colinwahl · 2021-06-02T17:16:43Z

@rhendric I've addressed your initial feedback.

While doing the codegen refactoring, I noticed that purs codegen doesn't allow opting-in to generating source maps - I should probably add an option to the command to allow the user to opt-in to that.

rhendric

Looking good! I have one outstanding question about the runSupplyT 0 here, and I suggest rebasing on master to clean up some conflicts and HLint nits, but I think this is in great shape already.

rhendric · 2021-06-02T18:20:04Z

app/Command/Codegen.hs

+  foreigns <- P.inferForeignModules filePathMap
+  (makeResult, makeWarnings) <-
+    P.runMake purescriptOptions
+      $ runSupplyT 0


In the normal compilation path, the codegen supply monad is initialized with the next unused number from previous parts of the compilation. Using 0 here raises the question of whether this reuse is necessary. If so, using 0 here might cause problems. If not (I suspect not), we should probably be consistent so that the produced code isn't different when generated just by purs compile versus purs compile; purs codegen.

So assuming it's safe to do so, I think we should remove the SupplyT from the signatures in MakeActions and push that detail into their implementations. But now would be a really good time for someone else to share why that wouldn't be safe!

This is a great point. I tried reading through the usages of the supply monad in the codegen code and it looks like it's just for generating fresh variable names - it doesn't seem to me that it'd require we start off from where we left off in say typechecking - but I don't have enough experience to say for sure.

At work we've been using zephyr for quite a while, which also starts from 0 for codegen, so I'd be really surprised if it causes errors.

If it is the case that it doesn't matter, then I'll remove the SupplyT requirement and start it from zero within codegenJS.

I spent some time going through the codegenJS implementation and I don't think that it is dangerous to always initialize that supply with 0. The fact that zephyr was doing that for so long also makes me pretty confident based on my personal experience.

I went ahead and made the change you suggested, and all tests are passing.

If anyone knows more than I do and thinks that we should undo the change, I can do that too!
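A toy illustration of the consistency concern, assuming a hand-rolled counter monad in place of the compiler's SupplyT. The Supply type, the "$" naming scheme, and codegenNames are all assumptions made for this sketch; it only shows how the starting counter affects the generated names and says nothing about whether collisions can occur in real output.

```haskell
-- A hand-rolled counter-state monad; a stand-in for SupplyT, not its real definition.
newtype Supply a = Supply { runSupply :: Integer -> (a, Integer) }

instance Functor Supply where
  fmap f (Supply g) = Supply $ \n -> let (a, n') = g n in (f a, n')

instance Applicative Supply where
  pure a = Supply $ \n -> (a, n)
  Supply f <*> Supply g = Supply $ \n ->
    let (h, n')  = f n
        (a, n'') = g n'
    in (h a, n'')

instance Monad Supply where
  Supply g >>= k = Supply $ \n -> let (a, n') = g n in runSupply (k a) n'

-- Hand out the next name and bump the counter. The "$" prefix is an assumed scheme.
freshName :: Supply String
freshName = Supply $ \n -> ("$" ++ show n, n + 1)

-- Hypothetical codegen step that needs two fresh names.
codegenNames :: Supply [String]
codegenNames = sequence [freshName, freshName]

-- Run with an explicit starting counter and keep only the result.
evalSupply :: Integer -> Supply a -> a
evalSupply start s = fst (runSupply s start)
```

In this sketch evalSupply 0 codegenNames gives the same names on every run, whereas a counter carried over from earlier phases (say 7) gives different names for the same module. That difference is why always starting codegen at 0 keeps purs compile and purs compile; purs codegen consistent.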

rhendric · 2021-06-02T18:22:42Z

app/Command/Codegen.hs

+      M.fromList $ map ((\m -> (CoreFn.moduleName m, Right $ CoreFn.modulePath m)) . snd) $ rights mods
+
+  unless (null (lefts mods)) $ do
+    _ <- traverse (hPutStr stderr . formatParseError) $ lefts mods


hlint will yell at you for this when you rebase on master. Use traverse_.
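A minimal sketch of the difference, assuming a hypothetical reporter that records messages in an IORef instead of writing to stderr so that the effect can be observed. The names report, reportAll, and collect are illustrative only.

```haskell
import Data.Foldable (traverse_)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical error reporter: appends to an IORef rather than printing.
report :: IORef [String] -> String -> IO ()
report ref msg = modifyIORef ref (++ [msg])

-- 'traverse_' runs each action for its effect and returns (), so there is
-- no list of units to throw away with '_ <- traverse ...'.
reportAll :: IORef [String] -> [String] -> IO ()
reportAll ref = traverse_ (report ref)

-- Collect the messages produced by reporting a list of parse errors.
collect :: [String] -> IO [String]
collect errs = do
  ref <- newIORef []
  reportAll ref errs
  readIORef ref
```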

rhendric · 2021-06-02T18:23:45Z

app/Command/Codegen.hs

+  runCodegen foreigns filePathMap m =
+    P.codegenJS (makeActions foreigns filePathMap) False m


hlint will yell at you for this too, but actually I think you should probably just inline this whole definition.

colinwahl · 2021-06-04T04:04:58Z

I'll spend some time thinking about how we could add a meaningful test for this soon.

Other than that, there is the outstanding question of initializing the codegen supply with 0, which I've gone ahead and done. Then this should be ready for a final review!

JordanMartinez · 2021-06-23T04:09:38Z

app/Command/Codegen.hs

+      M.fromList $ map ((\m -> (CoreFn.moduleName m, Right $ CoreFn.modulePath m)) . snd) $ rights mods
+
+  unless (null (lefts mods)) $ do
+    traverse_ (hPutStr stderr . formatParseError) $ lefts mods


Since lefts mods is used twice, perhaps this should be turned into a let above?

let errList = lefts mods

Also, since filePathMap isn't used until after the unless block, perhaps it should go below this block but above the foreigns <- P.inferForeignModules filePathMap line?

JordanMartinez · 2021-06-23T04:12:16Z

app/Command/Codegen.hs

+  (makeResult, makeWarnings) <-
+    P.runMake purescriptOptions
+      $ traverse (P.codegenJS (makeActions foreigns filePathMap) codegenSourceMaps . snd)
+      $ rights mods


Here's a second rights mods. Perhaps that should also be moved to a let binding so more things can reuse it?
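One possible way to get both lists from a single binding is partitionEithers from Data.Either. This is an alternative to the two let bindings suggested above, not what the PR itself does; ParseResult and the helper names are hypothetical stand-ins for the parsed CoreFn results.

```haskell
import Data.Either (partitionEithers)

-- Hypothetical stand-in: a parse result is either an error message or a
-- (path, module name) pair.
type ParseResult = Either String (FilePath, String)

-- Split once, then reuse both halves instead of calling 'lefts' and 'rights' repeatedly.
splitResults :: [ParseResult] -> ([String], [(FilePath, String)])
splitResults = partitionEithers

-- Module names of the successfully parsed inputs, mirroring the 'snd' projection in the diff.
parsedNames :: [ParseResult] -> [String]
parsedNames mods =
  let (_errs, oks) = splitResults mods
  in map snd oks
```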

MaybeJustJames · 2022-09-24T14:55:49Z

@colinwahl do you have bandwidth to finish this off? Can I help?

MaybeJustJames · 2022-09-24T16:24:53Z

I've rebased on current master here.

colinwahl · 2022-09-26T17:06:57Z

@MaybeJustJames would you like to take this over from me? My bandwidth for compiler work is pretty low these days.

The big open question is that I'm still not sure if #4092 (comment) could lead to any problems.

rhendric · 2022-09-26T17:32:27Z

As the question is over a year old, I'm inclined to say let's ship it and find out.

MaybeJustJames · 2022-09-26T17:41:01Z

Happy to take over. How would you like to do it?

colinwahl · 2022-09-27T18:02:52Z

> Happy to take over. How would you like to do it?

However you'd like to do it is fine with me - you could continue off this PR, or make a new branch and cherry-pick my changes, or just close it and start again from scratch. Let me know what you decide and if I should close this PR!

MaybeJustJames · 2022-09-28T13:18:17Z

> However you'd like to do it is fine with me - you could continue off this PR

IMHO it would be a shame to lose the context here. Could I get write permission to your branch so I can pick up from here?

f-f · 2022-09-28T15:43:45Z

I was wondering - is this work necessary at all now that we have the backend optimizer?

colinwahl · 2022-09-28T15:55:44Z

> I was wondering - is this work necessary at all now that we have the backend optimizer?

Supporting an optimizer was certainly my main goal for this - now that we've got purescript-backend-optimizer, I don't think I'd use the command (at least, I don't have anything in mind ATM). However, maybe someone's got other ideas :)

rhendric · 2022-09-28T16:06:45Z

It does strike me as a loss if the best general-purpose JavaScript backend for PureScript remains in a third-party project in the long term. I'm not sure how exactly this happened—I suspect the friction to contributing to PureScript is just too high for this level of innovation—but with enough time I would hope it can be mostly unforked. At that point, exposing the backend used by purs becomes a feature of interest again, unless the unforking includes some other mechanism for making the CoreFn-handling pipeline extensible.

f-f · 2022-09-28T18:45:44Z

@rhendric I agree with you - my point here is that the new backend has shuffled the landscape quite a bit: it shows not only that it's possible to aggressively optimise the CoreFn, but also that it's possible to emit more performant JS outside of the compiler, and all of this while the implementation is in PureScript.
Given this new perspective, I am suggesting that we should reconsider the premises for this command to exist at all - with the baseline being that everything that is exposed by the compiler is a public API that we can't deprecate easily, e.g. see how long it took to remove bundle - and if the new project offers a better way to achieve the goal of this work (that is: generate better JS).
I am sure that this work will be unforked in the long term, hopefully while lowering the barrier for contribution, for example by showing that we can implement chunks of the compiler (or even all of it) in PureScript itself.

rhendric · 2022-09-28T19:16:58Z

Okay yeah, I agree with looking at bundle as an example. We got rid of bundle when the ecosystem around ES modules matured enough and we did enough work on our codegen that we could recommend another no-regrets tool to replace it; waiting for those things to happen was what made deprecating bundle take so long, as far as I know.

Is purs-backend-es already that no-regrets tool for codegen? It's very impressive but also very young and possibly more aggressive than some of our users want. If it becomes that tool in the future, I don't see a significant barrier to ripping codegen back out, along with all the codegen internals. Just like with bundle, we'll paper over the switch in spago and basic users won't need to be aware of it.

In the meantime, as long as there's some value in having a built-in JS backend (regardless of the language in which the backend is written or the repo in which it lives), I think there's still a case for exposing it, so users can benefit from custom optimizations and rewrites without needing to also use a third-party backend.

natefaubion · 2022-09-29T00:17:49Z

> Is purs-backend-es already that no-regrets tool for codegen?

purs-backend-es does not subsume compiler functionality.

It is not incremental, so right now it's largely targeted at production builds. If we want to separate out the backend from the core compiler, then the compiler must be able to inform backends of incremental status, otherwise all backends have to duplicate the work the compiler has already done to sort out what needs to be built, which is a complete waste of non-trivial work and resources. I would like to make it incremental in the interim by just depending on cache-db.json, whether or not the compiler considers it a stable target because it's the only realistic way to get that information.
It does not emit source maps. I personally have no intention of ever implementing this without near unanimous support from the community that it's something that people use since I consider the power-to-weight-ratio to be extremely poor.

So, I do not see any near term future where the current JS backend is rendered obsolete, though I would like a near term future where something like purs-backend-es can be used in a first-class way. That being said, if there are currently no pending users of this feature, I'm not sure what the point is. I think the fresh name issue seems like it can clearly cause a problem, however unlikely, and I'm not sure how you'd fix it. I don't know how I feel about merging a feature with uncertain prospects and potentially buggy behavior.

I'm happy to talk about purs-backend-es background/motivation in general, but I don't think this is the place. If you have any thoughts or questions, I'd love to hear from you on discourse!

MaybeJustJames · 2022-09-29T07:29:36Z

> I think there's still a case for exposing it, so users can benefit from custom optimizations and rewrites without needing to also use a third-party backend.

I agree. I wanted to get this through for zephyr specifically. There is potential for other optimization tools to make use of this interface even if an improved backend is eventually merged.

JordanMartinez · 2022-10-30T14:51:44Z

I think this PR can be closed, right?

MaybeJustJames · 2022-10-31T08:25:56Z

I would still vote to merge for the zephyr | ${other_optimizer} use case.

MaybeJustJames · 2022-11-06T16:54:29Z

Is the vision for purescript to have multiple codegen backends? If non-javascript backends are always going to be separate projects then maybe it makes sense for JavaScript codegen to be separate too? In which case this PR should be closed. If the vision for the compiler is to include multiple backends then I think a codegen command will remain useful.

thomashoneyman reviewed May 19, 2021

rhendric reviewed May 28, 2021

TheMatten mentioned this pull request Jun 1, 2021

Implement Typeable #4097

Draft

5 tasks

rhendric reviewed Jun 2, 2021

rhendric reviewed Jun 3, 2021

JordanMartinez reviewed Jun 23, 2021

rhendric mentioned this pull request Sep 28, 2021

Uncurry optimization #479

Closed

rhendric mentioned this pull request Apr 12, 2022

Lazy initialization for recursive bindings #4283

Merged

5 tasks

MaybeJustJames mentioned this pull request Sep 24, 2022

Dead code elimination should happen before codegen #2970

Open

WIP - set up command

c93a9e0

colinwahl and others added 15 commits September 29, 2022 09:35

WIP - run codegen

47467b9

WIP - dont write externs.cbor on codegen pass

efe02b7

WIP - report errors and warnings

376bc16

WIP - Fail when an input file fails to parse as a CoreFn Module

79799dd

update changelog / contributors

072c73c

Woops, added to wrong section of contributors :)

2c47936

Add output path argument

3d4828d

hlint

1ab094a

Cleanup: Split out common utilities into Command.Common

9e42adc

Update CHANGELOG - add PR number and username

4f9b220

Cleanup: Split different parts of codegen action into functions

04b24ff

Also expose the new codegenJS action for use in purs codegen

hlint

d7468da

Add configuration option for generating source maps

01c7948

Remove SupplyT constraint on codegen & codegenJS

0ef5e39

Always initialize supply with 0 as codegenJS implementation detail in order to get deterministic variable naming while doing a normal purs compile vs purs compile; purs codegen

Update app/Command/Codegen.hs

231ed9e

Co-authored-by: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu>

MaybeJustJames force-pushed the codegen branch from cb5920a to 231ed9e Compare September 29, 2022 08:09

MaybeJustJames added 3 commits September 29, 2022 14:12

Small fix for variable renaming and update tests

acec49f

Let bindings for lefts and rights mods

e5a6370

Make weeder happy

5d91486
