Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid printing/parsing numbered args #4962

Merged
merged 13 commits into from
May 29, 2024

Conversation

sellout
Copy link
Contributor

@sellout sellout commented May 16, 2024

Overview

The Haskell data structures for various types are preserved through the “numbered args” processing, meaning that numbered args don’t need to be re-parsed after being expanded into the command. This avoids work, makes handling more consistent between commands, and allows for better error messages (say, when you refer to a branch when you need a definition).

Test coverage

This is a refactoring that should be largely covered by existing transcript tests that exercise the use of numbered args. All transcripts pass without modification.

As the type is changing to be more structured, we can’t use `[String]` in its
place.
Copy link
Contributor Author

@sellout sellout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a bunch of cleanup to do in here (e.g., removing formatting done by wrong Ormolu version (but I think I have that sorted going forward)), but it’s feeling pretty solid with a few questions in the comments here.

I think there is some follow-up work in a separate PR to standardize some of the types expected by various commands, which would also reduce the handlers.

map (Text.unpack . Reference.toText . snd) types
<> map (Text.unpack . Reference.toText . Referent.toReference . snd) terms
map (SA.Type . snd) types
<> map (SA.Term . Referent.toReference . snd) terms
Copy link
Contributor Author

@sellout sellout May 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These cases (and other SA.Ref cases) had been displaying the Reference (hash), but we also have a HashQualified Name which seems like a much nicer thing to show to the user? Is there a reason that the HQ Name isn’t preferable here?

Copy link
Contributor

@aryairani aryairani May 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: I looked in the diff for these lines (e.g. old line 1553), but it doesn't match what's shown here, and I don't understand why it wouldn't? I see it on old line 1302. I see Github's marked the snippet as "Outdated", but I don't think that would matter.

I think the answer to your question is that until #4907, the numbered args weren't shown to the user at all unless they ran debug.numberedArgs, making the question moot(?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: I looked in the diff for these lines (e.g. old line 1553), but it doesn't match what's shown here, and I don't understand why it wouldn't? I see it on old line 1302. I see Github's marked the snippet as "Outdated", but I don't think that would matter.

Ah, I think it’s because I squashed another commit into this one when it was in draft, but after I wrote this comment.

I had originally had Ref, Term, and Type as three separate StructuredArgument cases, but that just led to three copies of identical logic, so I removed the Term and Type cases until we can reasonably preserve that information.

I think the answer to your question is that until #4907, the numbered args weren't shown to the user at all unless they ran debug.numberedArgs, making the question moot(?)

Ah, right. I conflated a few things here and got confused.

But the question wasn’t really moot – if References are put into numbered args state, a command expecting a Name couldn’t then use a numbered arg. Whereas, if HashQualified Names were put into the state, the same command could then use it.

I don’t think it’s a change for this PR, but it seems like using the HQ instead of the Reference would help fix some of the existing numbered args problems.

Also, I think the pretty-printed numbered results lists (e.g., the output of find) could eventually lean more on the numbered args handling, rather that being constructed completely independently, which is another reason using HQ instead of a Reference would be a good change.

But I don’t know what the counter-arguments might be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, makes sense.

helpFor p = I.help p
helpFor = I.help

handleProjectArg :: I.Argument -> Either Text ProjectName
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These handlers aren’t great, IMO. There is so much code here. I feel like we can probably shrink the size of the StructuredArgument sum type more, which would help a bit here and there are some other possibilities, too, but we probably need to discuss.

( \case
SA.Project project -> pure project
-- __FIXME__: Do we want to treat a project branch as a project?
SA.ProjectBranch (ProjectAndBranch (Just project) _) -> pure project
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt this was supported before, but when referencing numbered args, it seems like this kind of extraction is somewhat useful? E.g., you get back a list of all the ProjectAndBranches in your codebase, and then you run a command to rename a project, giving it some random ProjectAndBranch for that project – it makes sense that it would just ignore the branch name and rename the project … right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe? Sounds kind of weird to me, like if we said:

rename.project foo/main bar/branch

should that rename foo to bar?

I'd be more inclined to tell the user something like "hey I'm not sure we're on the same page here"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, @mitchellwrosen said the same. I think that’s right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 104 to 105
-- also: ShortHash.toText . Reference.toShortHash
Reference.toText termReference
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the formatting has been localized to this one function now, but some types had previously been formatted multiple ways, in which case I added a comment like this one.

All of the existing transcripts are working unmodified (although they definitely broke along the way), so at least as much as is covered is producing the same strings, but I’m not sure I always chose the best option.

--
-- __FIXME__: Don’t hardcode this
schLength :: Int
schLength = 10
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seemed to be the value used for short hashes in various places, but not exposed in a good way, AFAICT. A few of the handlers need this (for commands that expect short hashes where we have a full hash). But this feels like the wrong way.

Also, this might make sense for the formatter, but perhaps the commands that take ShortCausalHash or whatever should have it use the full length? (Even though the type has to be Short to allow users to type less.)

Copy link
Contributor

@aryairani aryairani May 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were initially planning to derive this number from the individual codebase, like git does; but we never did.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a commit at the end now that handles this a bit differently. It avoids hardcoding, and preserves the full hash as an SCH in a number of cases. When the length is provided to the formatter1 it takes advantage of it.

Footnotes

  1. DumpNumberedArgs gets the hash length “from the DB”, and hopefully the command history could take advantage of it soon.

This is the first step toward avoiding printing/parsing the values provided via
`NumberedArgs`. It simply adds a new sum type to hold all of the types that can
be in numbered args and stores it alongside the `Text` representation.

It currently gets discarded when we actually expand the arguments.
This forces each `InputPattern.parse` function to serialize any
`StructuredArgument` in its arguments. It’s a stop-gap that allows us to
incrementally handle the structured arguments command-by-command.
This converts the commands to accept structured numbered arguments, rather than
turning them all into strings.
Previously, the `Text` format had been preserved from the original code. This
extracts all to a separate function that is called as needed.

All transcripts still pass.
@sellout sellout marked this pull request as ready for review May 23, 2024 02:03
@sellout
Copy link
Contributor Author

sellout commented May 23, 2024

This still isn’t running any checks for me …

@aryairani
Copy link
Contributor

aryairani commented May 23, 2024

This still isn’t running any checks for me …

At here I see

Workflows will not run on pull_request activity if the pull request has a merge conflict. The merge conflict must be resolved first.

I was unsure as to why this would matter, but reading a bit more I suppose it's because we run CI on pull_request (which it looks like uses the merge node, which can't exist with a conflict, to run the workflow), rather than on push, which just uses exactly whatever you committed. I didn't realize this! Or I forgot.

I could see a case for running both TBH, but we don't currently, probably due to running time.

It's too bad it doesn't even mention that it would have run if not for the merge conflict.

helpFor p = I.help p
helpFor = I.help

handleProjectArg :: I.Argument -> Either Text ProjectName
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: use Either Pretty

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

This was referenced May 25, 2024
Copy link
Contributor

@aryairani aryairani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

unison-cli/src/Unison/CommandLine/InputPatterns.hs Outdated Show resolved Hide resolved
( \case
SA.Project project -> pure project
-- __FIXME__: Do we want to treat a project branch as a project?
SA.ProjectBranch (ProjectAndBranch (Just project) _) -> pure project
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe? Sounds kind of weird to me, like if we said:

rename.project foo/main bar/branch

should that rename foo to bar?

I'd be more inclined to tell the user something like "hey I'm not sure we're on the same page here"

Comment on lines 1301 to 1311
$ \case
p : args ->
Input.FindI False . mkfscope
<$> first P.text (handlePathArg p)
-- __FIXME__: This changes things a bit. Previously, `find` and
-- friends would just expand the numbered args and search
-- for them like any other string, but now it recognizes
-- that you’re trying to look up something you already
-- have, and refuses to. Is that the right thing to do? We
-- _could_ still serialize in this case.
<*> traverse (unsupportedStructuredArgument "text") args
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we see an example in a transcript for context?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's already a transcript that demonstrates find it could go there

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had already removed this change in other find cases, but overlooked this one. So I’ve changed this (and a couple other missed cases) to align with the others, which use unifyArgument to provide the string form of the structured argument.

# Conflicts:
#	unison-cli/src/Unison/Codebase/Editor/HandleInput.hs
#	unison-cli/src/Unison/CommandLine/InputPatterns.hs
#	unison-cli/src/Unison/CommandLine/OutputMessages.hs
@aryairani
Copy link
Contributor

I want to understand the question about find, but apart from that it seems ready to merge — or did I miss addressing any other question?

This also generally improves formatting:
- follows the longer line convention in Unison and
- removes unnecessary `( … )` and `$` before `LambdaCase` args.
When `StructuredArgument`s are used as an input, preserve the entire hash. When
printed, take the length as an optional argument (and show the full hash when
unavailable).
@sellout sellout added the ready-to-merge Apply this to a PR and it will get merged automatically once CI passes and 1 reviewer has approved label May 28, 2024
@aryairani
Copy link
Contributor

@sellout Go ahead and merge this when you're ready. (I was about to pull the trigger, but I had created a new merge conflict.)

@sellout sellout merged commit 6805135 into unisonweb:trunk May 29, 2024
19 checks passed
@sellout sellout deleted the structured-numbered-args branch May 29, 2024 23:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-merge Apply this to a PR and it will get merged automatically once CI passes and 1 reviewer has approved
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants