Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Goto dependency definition #3749

Open
wants to merge 37 commits into
base: master
Choose a base branch
from

Conversation

nlander
Copy link
Contributor

@nlander nlander commented Aug 8, 2023

For users using cabal HEAD, enables indexing of dependency HIE files so that the source can be written out on calls to gotoDefinition.

Opening this instead of #3704 because rebasing over the ghcide test refactor ended up being more trouble than it was worth.

There are a few things still missing, but I think this is ready for more eyes.

What I know is still missing - feedback on all of these items is welcome:

  • What should we do if gotoDefinition is called while the module with the definition has not been indexed yet, because it is still in the indexQueue? @wz1000 has suggested that I look into what other language servers (rust, java, etc) do for this, so I plan to look into that over the next few days.
  • Say that for some reason a package's HIE files get indexed and then for some reason the cabal store gets corrupted or deleted. In this case, loadHieFile in lookupMod would fail. What should we do in this case? Should we send a message to the client? We should probably at least unindex the package.
  • There is one test of gotoDefinition for a dependency. I could add another that goes to a dependency definition and then from there goes to a transitive dependency definition. Do we think this would be a good test to add?
  • Documentation: I think instructions on how to enable -fwrite-ide-info would be helpful, along with some troubleshooting tips. For example, it is possible to have the same hash in the cabal store as what we would have with the extra-compilation-artifacts directory (where the HIE files should be found), but missing extra-compilation-artifacts. This can happen if the project is build with the -fwrite-ide-info flag, but with an older version of cabal that does not produce the HIE files.
  • I need to update CI to install cabal HEAD, or my dependency gotoDefinition test will fail.

@July541
Copy link
Collaborator

July541 commented Aug 8, 2023

Thanks for your work!

I'm glad to try this, If I compile cabal from master and compile HLS with this thread, everything should work then without any extra configruation?

@nlander
Copy link
Contributor Author

nlander commented Aug 8, 2023

@July541 You're welcome! You do need to add ghc-options: -fwrite-ide-info to some cabal config. It can be your global cabal config, the cabal.project for your project, or a .cabal file.

@michaelpj
Copy link
Collaborator

What should we do if gotoDefinition is called while the module with the definition has not been indexed yet, because it is still in the indexQueue?

I would say that the answer should be the same as "what do we do if gotoDefinition is called while a module has not yet been typechecked and it is in the shake build queue (or we're building dependencies, or whatever)?".

In this case, loadHieFile in lookupMod would fail. What should we do in this case?

What happens if the cabal store gets corrupted and the interface files for a dependency get deleted? Presumably we will die horribly in some way? If so, I think it's fairly reasonable to die horribly here also.

Do we think this would be a good test to add?

I definitely think we should have some more tests. The one you described sounds good. It would be good to think of some weird situations and try to have tests for them. Quick brainstorming:

  • In a direct dependency
    • Definition of a term
    • Definition of a type
    • Are there any entities that we know are not currently recorded in HIE files but are available for locally compiled modules? Might be worth adding some failing tests for those in order to show we still have stuff to fix
  • In a transitive dependency
  • In a boot library dependency (presumably will be broken until we get GHC shipped with HIE files, but worth recording)
  • Open two files in different components that both depend on X, but different versions of X. Try a goto definition request in both, check that each goes to the right version of X
  • In a direct dependency, but defined in a generated file like the Paths_ module

Any other weird cases people can think of?

I think instructions on how to enable -fwrite-ide-info would be helpful, along with some troubleshooting tips

So, if someone asks to go to a definition in a dependency and we don't have the IDE files, then we should definitely log something (or maybe even show a message?), but whatever we log and show could say that we're missing the IDE files and you can get them by doing X. Maybe that is a good enough UX?

@nlander nlander force-pushed the goto-dependency-definition-2 branch 3 times, most recently from e05e50d to 5cfb425 Compare August 11, 2023 03:02
@nlander
Copy link
Contributor Author

nlander commented Aug 11, 2023

@michaelpj I am taking a stab at this test idea:

Open two files in different components that both depend on X, but different versions of X. Try a goto definition request in both, check that each goes to the right version of X

and as far as I can tell, different components within the same project are not allowed to depend on different versions of the same package. Am I missing something, or misunderstanding what you meant?
You can see my attempt at the test here: nlander@da0ac64

@michaelpj
Copy link
Collaborator

and as far as I can tell, different components within the same project are not allowed to depend on different versions of the same package. Am I missing something, or misunderstanding what you meant?

They have to be independent, e.g. an executable component and a library component, something like:

library
  ...
  build-depends: foo == 1.0

executable
   build-depdnds: foo == 2.0 -- note, no dependency on the main library!

@michaelpj
Copy link
Collaborator

(executables get their own build plan that doesn't necessarily have to be the same as the main build plan for the libraries, hence how this can happen!)

@nlander
Copy link
Contributor Author

nlander commented Aug 14, 2023

@michaelpj Thank you for that clarification!

@nlander
Copy link
Contributor Author

nlander commented Aug 15, 2023

@michaelpj I have tried making one component an executable and the other a library and cabal still seems to think that they depend on each other :( even when I don't specify either one as a dependency of the other. I have tried this for one cabal file and two cabal files. Both attempts can be found here: nlander@3c458b2. Is there something else that I need to add to allow the executable to have its own build plan?

@michaelpj
Copy link
Collaborator

Okay, turns out I'm just wrong. I asked and the only things that get independent build plans are custom setups and build tools. And in practice it sounds like the only place this could happen is Setup.hs, and we don't support that anyway and it's not that important.

@nlander
Copy link
Contributor Author

nlander commented Aug 15, 2023

@michaelpj Thank you for double checking!

@@ -31,7 +31,7 @@ runs:
sudo chown -R $USER /usr/local/.ghcup
shell: bash

- uses: haskell/actions/setup@v2.4.4
- uses: nlander/actions/setup@install-cabal-head
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's try and get this upstreamed before we merge

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm waiting for a CI run to be approved for haskell/actions#290. If that passes I think it should be merged soon.

@@ -7,7 +7,7 @@ inputs:
cabal:
description: "Cabal version"
required: false
default: "3.8.1.0"
default: "head"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to leave this here? Will we want to turn it back to a released version at some point? Write it down!

cabal.project Outdated
source-repository-package
type:git
location: https://github.com/nlander/HieDb.git
tag: 4eebfcf8fab54f24808e6301227d77ae64d2509c
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the status of the hiedb changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wz1000 what time frame were you thinking for merging my PRs?

-- project. Right now, this is just a stub.
-- | Generates URIs for files in dependencies, but not in the
-- project. Dependency files are produced from an HIE file and
-- placed in the .hls/dependencies directory.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this in the project root? I guess maybe it needs to be so the LSP server thinks it's part of the project. It's a bit awkward to introduce a new directory just for this. Can we not put it in one of the existing places, like somewhere in dist-newstyle?

Copy link
Contributor Author

@nlander nlander Aug 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please elaborate a little about what you find awkward about generating a new directory? I don't think putting things in dist-newstyle would be ideal because it's used by another tool entirely (cabal). As it is, the code handles .hls/dependency being wiped out just fine, but would have trouble if just part of it were corrupted.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just One More Thing. At the moment users of HLS need to worry about:

  • dist-newstyle
  • ~/.cache/ghcide
  • and now .hls

Plus, it's in the project directory, so everyone will have to add a new thing to their .gitignores. It's not disastrous, but it is a real cost.

Does it need to be in the project root?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it needs to be in the project root to prevent vscode from spawning another HLS instance.

-- name and hash of the package the dependency module is
-- found in. The name and hash are both parts of the UnitId.
writeOutDir :: FilePath
writeOutDir = projectRoot </> ".hls" </> "dependencies" </> show uid
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should make the directory configurable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think making this directory configurable would be fine functionality to add to the project, but I don't think it should be in the same PR, as it's not strictly necessary for the functionality I'm trying to add here. I do think it's a good idea to get rid of the magic strings though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in some ways making it configurable is partly a way to force the definition of the directory in one place! We could do that and then just not expose it to the user?

setFileModified (cmapWithPrio LogFileStore recorder) (VFSModified vfs) ide False file
let foiStatus = case getSourceFileOrigin file of
FromProject -> Modified{firstOpen=True}
FromDependency -> ReadOnly
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be impossible, right? A change notification for a read only file? Something has gone wrong here, we should probably at least log

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that editors (VSCode included) have various mechanisms for getting around readonly file permissions. So it is entirely possible for the user to accidentally or unwittingly edit one of these dependency files.

setFileModified (cmapWithPrio LogFileStore recorder) (VFSModified vfs) ide False file
addFileOfInterest ide file foiStatus
unless (foiStatus == ReadOnly)
$ setFileModified (cmapWithPrio LogFileStore recorder) (VFSModified vfs) ide False file
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does something go wrong if we do this even for ReadOnly things? comment!

@@ -38,7 +38,9 @@ moduleOutline
moduleOutline ideState _ DocumentSymbolParams{ _textDocument = TextDocumentIdentifier uri }
= liftIO $ case uriToFilePath uri of
Just (toNormalizedFilePath' -> fp) -> do
mb_decls <- fmap fst <$> runAction "Outline" ideState (useWithStale GetParsedModule fp)
mb_decls <- case getSourceFileOrigin fp of
FromDependency -> pure Nothing
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, why can't we just let the handler fail because the rule fails?

-- ^ File extension of the files the plugin is responsible for.
-- The plugin is only allowed to handle files with these extensions.
-- When writing handlers, etc. for this plugin it can be assumed that all handled files are of this type.
-- The file extension must have a leading '.'.
}

-- A description of the types of files that the plugin
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-- A description of the types of files that the plugin
-- | A description of the types of files that the plugin

I realise I'm just repeating myself, but again I don't see why we can't let plugins just fail on dependency files if they need stuff that can't be provided.

-- project file.
getSourceFileOrigin :: NormalizedFilePath -> SourceFileOrigin
getSourceFileOrigin f =
case [".hls", "dependencies"] `isInfixOf` (splitDirectories file) of
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another use of the magic strings!

@July541
Copy link
Collaborator

July541 commented Aug 16, 2023

User feedback: I've using this for about three days, going to third-party package is really really fantastic!

I still found some packages like base and time(for my personal experience) don't work for the goto definition, I dont' know why and the log dont' gives a reason. And hls seems trying to work on .hls folder, so there are some diagnostic errors.

@michaelpj
Copy link
Collaborator

My big comments:

  • It would be great to have a note explaining how it all goes together. The flow of logic is pretty unclear to me, I have no idea what order things happen in or why.
  • It seems like there's a bunch of work to try and avoid running certain rules and plugins on dependency files. I'm not clear why we can't just let them fail?

@nlander
Copy link
Contributor Author

nlander commented Aug 16, 2023

It seems like there's a bunch of work to try and avoid running certain rules and plugins on dependency files. I'm not clear why we can't just let them fail?

I took this design choice at the request of @wz1000. It was his idea to create a whitelist of Rules that we think are appropriate for dependency files. It was also his idea to create a notion of which file types plugins are defined for, defaulting to not supporting dependency files.

@michaelpj I wonder what justification you would give for running code that we know will fail? I was under the impression that plugin failures were not something that were handled gracefully, but if they are I suppose it doesn't matter which way we go on this. It just seems wasteful to me and likely not very performant to run extra code.

@nlander
Copy link
Contributor Author

nlander commented Aug 17, 2023

It would be great to have a note explaining how it all goes together. The flow of logic is pretty unclear to me, I have no idea what order things happen in or why.

There are two main components of the functionality:

  • the changes to the lookupMod function in ghcide/src/Development/IDE/Core/Actions.hs, which are triggered on calls to gotoDefinition.
  • the code that indexes dependencies in the hiedb, which can be found in the new file ghcide/src/Development/IDE/Core/Dependencies.hs. This gets run asynchronously, triggering every time newHscEnvEqWithImportPaths gets called.

The gotoDefinition code was written in such a way that it was expecting that we would eventually be able to go to dependency definitions. Before my changes, lookupMod was a no-op stub intended to be where functionality would eventually go for dependencies. You can see the code that eventually ends up calling lookupMod in the function nameToLocation in ghcide/src/Development/IDE/Spans/AtPoint.hs. To summarize, gotoDefinition will look for a file in the project, and look in the hiedb if it can't find it. In this sense, the name lookupMod might be a little misleading, because by the time it gets called, the HIE file has already been looked up in the database and we have the FilePath of its location. A more appropriate name might be something like loadModule, since what it does is load the module source code from an HIE file and write it out to .hls/dependencies. The way nameToLocation works (which my code hasn't changed), if we have already opened a dependency file once, lookupMod won't get called. In addition to loading the dependency source and writing it out, lookupMod handles indexing the source file that we wrote out, which can't happen in the initial indexing since the source file doesn't exist at that point. To summarize, for gotoDefinition to work for a dependency we need to have already indexed the HIE file for that dependency module.

The indexing process gets the packages and modules for dependencies from the HscEnv. It filters them for packages we know are direct or transitive dependencies, using the function calculateTransitiveDependencies. indexDependencyHieFiles attempts to load an HIE file for each module, checking for it in the extra-compilation-artifacts directory, found in the package lib directory. This fails for the packages that ship with GHC, because it doesn't yet generate HIE files. @July541 that is why it fails for base and time. If it is able to load the HIE file, it indexes it in hiedb using indexHieFile, which is the same function used to index project HIE files.

Everything else is plumbing to make the two above functionalities work, code that disables functionality that won't work on dependency files (which may or may not be necessary as discussed above), and tests.

@michaelpj does this answer your questions about logic flow? Do you think this should be documented somewhere other than this comment, and if so where?

@nlander
Copy link
Contributor Author

nlander commented Aug 17, 2023

And hls seems trying to work on .hls folder, so there are some diagnostic errors.

@July541 I think this is a consequence of how hls handles a plugin being enabled or disabled. This may be the first instance of many files in a project having multiple plugins disabled (which may not be the right approach anyway as @michaelpj pointed out above). I think maybe there are some changes that should be made that aren't directly related to this functionality.

@michaelpj
Copy link
Collaborator

Do you think this should be documented somewhere other than this comment, and if so where?

Great comment! Yes, please do write a Note! I would probably put it close to where the indexing is happening. Something like Note [Going to definitions in dependencies].

A more appropriate name might be something like loadModule

👍

In addition to loading the dependency source and writing it out, lookupMod handles indexing the source file that we wrote out, which can't happen in the initial indexing since the source file doesn't exist at that point

I think I asked this already, but how bad would it be to eagerly write out the source files and index them during the initial process? It would make startup slower, since we'd be doing work for files that we might not need, but it seems like it would simplify the logic quite a bit.

Also, this process has the flavour of "do X, but only when it's requested". Would it make sense to handle this with a Rule? So we could have a Rule for GetHieSourceFile, which would either return the created source file, or create and index it.

I wonder what justification you would give for running code that we know will fail? I was under the impression that plugin failures were not something that were handled gracefully, but if they are I suppose it doesn't matter which way we go on this. It just seems wasteful to me and likely not very performant to run extra code.

Plugin failures and rule failures have to be handled somewhat gracefully, since failure is normal state. If the user has a Haskell file that has a parse error, then many things will fail! Rules that depend on parsing will fail, plugin handlers that depend on those rules will fail (some will work because they depend on stale information, but still). (I'm not too worried about the cost - the rule failures should cascade pretty quickly, I'd think)

Maybe this will all be too much if we just can't process the file at all, but in some sense it seems like it should be fine if, say, the Typecheck rule just always says "nope" when you run it on a dependency file. I could imagine that we might want to tweak it to make sure it doesn't emit diagnostics if it fails in such a way that we shouldn't expect it to succeed, but it would be nice if the cascading failures were handled gracefully.

Perhaps this is harder than I think - I know we had similar issues with the cabal plugin, which is what led to us having the filetype associations for plugins. So I guess it's not a hard and fast rule. But the changes here seem more invasive, and I wonder if we could just go for the "let it fail" approach.

@nlander nlander force-pushed the goto-dependency-definition-2 branch from cc84de6 to 3d098ba Compare August 17, 2023 09:17
@nlander nlander force-pushed the goto-dependency-definition-2 branch from 7e00314 to e9ea310 Compare October 10, 2023 15:27
@fendor fendor added the status: needs review This PR is ready for review label Nov 15, 2023
@soulomoon
Copy link
Collaborator

It is really great feature, do you still have time pushing it @nlander ?

@nlander
Copy link
Contributor Author

nlander commented Feb 10, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: needs review This PR is ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants