Makes shake build and cache Shakefiles by using shake itself to rebui… #525

Open: wants to merge 8 commits into base: master

RowDaBoat

…ld when necessary, instead of just executing runhaskell on them

@ndmitchell
Owner

From what I can see, this only works if the Shakefile.hs doesn't import any local modules - which might lead to it not rebuilding when it should. Does that match your understanding?

@ndmitchell
Owner

Maybe the solution is to always run ghc --make unconditionally and let it control when rebuilding happens? It's not as fast as ghc-make, but it's usually not terrible.

@RowDaBoat
Author

True, I didn't consider local modules. It looks like calling ghc --make should do the job, although calling ghc-make when it's installed would be even better. I'll give it a try tomorrow and see what I can do.

@RowDaBoat
Author

Ok, I see the actual difficulty of this now, let's see if I got this straight:

- Just calling "ghc --make" implies linking every single time, which makes this approach quite slow and not really viable.

- On the other hand, getting the dependencies of Shakefile.hs beforehand and then using Shake itself to decide when compiling or linking is needed can get quite complex, since it involves parsing the .hs files.

- Or maybe it isn't that complex, since those dependencies are exactly what ghc-make gets when it parses the Makefile generated by ghc -M?

@ndmitchell
Owner

ghc --make doesn't link each time, so unconditionally calling ghc --make each time shouldn't be a huge overhead. It will spawn a separate process, and then spawn a further process for each file making use of CPP, but it's not usually terrible. ghc-make can short-circuit that using ghc -M, but that's quite a complex approach, and not something I'd like to merge into the core of Shake.
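
To make the "always call ghc --make, then run the result" idea concrete, here is a minimal sketch. The `.shake-build` directory, the `shakefile` executable name and the function names are assumptions made up for illustration; none of them come from Shake or from this PR. Only `--make`, `-outputdir` and `-o` are real GHC flags.

```haskell
import System.Directory (createDirectoryIfMissing)
import System.FilePath ((</>))
import System.Process (callProcess)

-- Hypothetical location for object files and the compiled build script.
buildDir :: FilePath
buildDir = ".shake-build"

-- Arguments for an unconditional `ghc --make`; GHC itself decides
-- which modules (if any) need recompiling.
ghcMakeArgs :: FilePath -> FilePath -> [String]
ghcMakeArgs out src =
  [ "--make", src
  , "-outputdir", out          -- keep .o/.hi files out of the source tree
  , "-o", out </> "shakefile"  -- hypothetical executable name (.exe is appended on Windows)
  ]

-- Compile the Shakefile if required, then run it with the user's arguments.
-- callProcess throws if either process exits with a failure code.
runShakefile :: FilePath -> [String] -> IO ()
runShakefile src args = do
  createDirectoryIfMissing True buildDir
  callProcess "ghc" (ghcMakeArgs buildDir src)
  callProcess (buildDir </> "shakefile") args
```

Calling `runShakefile "Shakefile.hs" ["--jobs=4"]` would then spawn GHC once per invocation and leave all rebuild decisions, including local modules, to GHC.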

@RowDaBoat
Author

Alright, that makes perfect sense to me. I'll be committing soon.

@RowDaBoat
Author

RowDaBoat commented Apr 8, 2017

After some investigation and coding I found out that, at least on version 7.10.3, ghc --make actually performs linking every time it's called, regardless of whether any dependencies changed, which takes significant time.

The best solution I could come up with is using ghc -M to generate a Makefile, then parsing it to obtain the list of .hs files that Shakefile.hs depends on, and then using Shake to decide when calling ghc --make is needed.

The parsing of the Makefile is a bit dependent on how ghc formats it, but I tried to make it as resilient as possible by dropping comments and consuming whitespace.
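
For illustration, the Makefile-parsing step described above could look roughly like the sketch below. It is not the code in this PR: `hsDependencies` is a hypothetical name, and the sketch assumes ghc -M writes one `target : dependency` rule per line, with whitespace around the colon and `#` used only for comments.

```haskell
import Data.List (isSuffixOf, nub)

-- | Collect the .hs files that appear as prerequisites in the output of
-- `ghc -M`. Comments are dropped and any amount of whitespace is accepted.
-- Assumes each rule has the shape "target : dep1 dep2 ...".
hsDependencies :: String -> [FilePath]
hsDependencies makefile = nub
  [ dep
  | line <- lines makefile
  , let rule = takeWhile (/= '#') line        -- strip trailing/full-line comments
  , (_target : ":" : deps) <- [words rule]    -- lines that are not rules are skipped
  , dep <- deps
  , ".hs" `isSuffixOf` dep                    -- ignore .hi and .o prerequisites
  ]
```

Given the rules `Shakefile.o : Shakefile.hs` and `Shakefile.o : Helper.hi`, only `Shakefile.hs` is returned. A local module such as `Helper` shows up through its own `Helper.o : Helper.hs` rule.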

@ndmitchell
Owner

Sorry for taking a long time to respond...

I'm unconvinced GHC relinks every time with ghc --make (this was a bug with GHC 6.4 if you omitted the .exe extension on Windows, but that was a long time ago). I think we should seek to understand that first. If the linking goes away then a lot of the complexity does too, as we can rely on GHC "as-is" for the most part.

@ndmitchell
Owner

It seems like there are a bunch of options:

  1. We can write our own ghc -M thing, but that's a lot of fragile code to deal with TemplateHaskell, header files etc. It's basically all of https://github.com/ndmitchell/ghc-make. I think it's too complex to be feasible.
  2. We can do what we do now, and just runhaskell it every time. Using the demo program (on the basis users with huge build scripts are going to do their own thing anyway) it takes 0.6s.
  3. We can runhaskell, but also ask it to compile to object code and store that code somewhere. That segfaults in GHC 8.6.3 on Windows due to https://ghc.haskell.org/trac/ghc/ticket/16057, but with 8.4 it takes 0.75s to build and 0.58s to run.
  4. We can run ghc --make, which takes 3.52s for the initial compile and 0.154s when there is nothing to do, followed by 0.03s to execute the result.

So only 2 and 4 make sense, and we're trading 0.6s vs 3.5s first and 0.15s subsequently. The other disadvantage of 4 is it writes to the file system. If those numbers are typical, I'd suggest we stick with 2. If those numbers change to be quite large quite rapidly for even moderately sized build systems, I'd go with 4. Do you have any measurements for the build system you wrote?
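
As a rough way to read those numbers, the sketch below computes how many consecutive runs it takes before option 4 becomes cheaper overall than option 2. It assumes the Shakefile does not change between runs, that the first ghc --make run costs the 3.52s compile plus the 0.03s execution, and that every later run costs the 0.154s no-op check plus 0.03s. The function names are made up for the example.

```haskell
-- Cumulative time in seconds for n runs with option 2 (runhaskell every time).
costRunhaskell :: Int -> Double
costRunhaskell n = 0.6 * fromIntegral n

-- Cumulative time in seconds for n runs with option 4 (ghc --make, then execute),
-- assuming the Shakefile is never edited after the first compile.
costGhcMake :: Int -> Double
costGhcMake n
  | n <= 0    = 0
  | otherwise = (3.52 + 0.03) + (0.154 + 0.03) * fromIntegral (n - 1)

-- First run count at which option 4 has the lower cumulative cost.
breakEven :: Int
breakEven = 1 + length (takeWhile (\n -> costGhcMake n >= costRunhaskell n) [1 ..])
```

Under these assumptions `breakEven` is 9: after eight runs option 2 has cost 4.8s against roughly 4.84s for option 4, and from the ninth run on option 4 is ahead. Every edit to the Shakefile restarts part of that cost, so the real figure depends on how often the build script changes.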
