Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding pre-commit information to Code quality/Code Style and Formatting #3650

Closed
2 of 8 tasks
AoifeHughes opened this issue May 13, 2024 · 15 comments
Closed
2 of 8 tasks
Assignees
Labels
enhancement New feature or request tools Tools we use, can be combined with question, help etc.

Comments

@AoifeHughes
Copy link
Collaborator

AoifeHughes commented May 13, 2024

Summary

The "Code Style and Formatting" chapter provides an overview of coding styles, automatic formatting tools, and online services for code quality checks. However, it does not mention the use of pre-commit hooks, which are an important tool for maintaining code quality and consistency. I propose adding a new section to this chapter, titled "Pre-commit hooks," which will introduce the concept of pre-commit, its key features, and how to set it up in a project.

What needs to be done?

PR by me is in coming. For now see branch: https://github.com/the-turing-way/the-turing-way/tree/code-style-additions

See: #3651

  • Discuss if it's worth adding a pre-commit to the main repo itself. Though I fear this will mean a large amount of reformatting

Who can help?

  • @JimMadge Had some thoughts on this a few years ago
  • Anyone else in the community who's interested!

Updates

  • First set of edits to chapter/subchapter
  • Proofread
  • Request reviews
  • Address reviews
  • Merge to main branch
@AoifeHughes AoifeHughes added enhancement New feature or request tools Tools we use, can be combined with question, help etc. labels May 13, 2024
@AoifeHughes AoifeHughes self-assigned this May 13, 2024
@JimMadge
Copy link
Member

Sounds great to me 👍.

I've come around to pre-commit recently and strict tools like prettier and black. Avoids arguments about style and is usually quite friendly for newcomers as it automates linting and makes PRs smoother.

For adding a .pre-commit, I think you are right it would probably cause a big PR which touches a lot of files. It could be worth it if we could also make sure that CI runs the same linting tools.

I think in general the philosophy of the @the-turing-way/infrastructure-working-group is that we should remove as much burden from the contributors as we can and try to enable them to contribute how they are most comfortable.
On one hand, pre-commit would mean they would be unlikely to be asked to fix formatting by hand (just run pre-commit -a). However, it would also mean we would need to ask contributors to install more tools and make contributing through the GitHub web interface more difficult.

@AoifeHughes
Copy link
Collaborator Author

Completely agree, adding it in as some CI would be ideal. There’s a nice bot which runs it automatically on PRs

@sgibson91
Copy link
Member

I think we tried prettier before but disabled it because it couldn't enforce the one style guide point we wanted to (one full sentence per line, newline after a full stop) and can make tables really unwieldy in the raw Markdown, especially for complex tables with long strings in them. We can try it again and see how we go, or maybe disable it for Markdown files in the website folder. I also imagine we'll get a pretty huge PR diff once we turn it on too.

But generally, yeah pre-commit is good and running it on pre-commit.ci means no one has to install it locally 👍🏻

@AoifeHughes
Copy link
Collaborator Author

AoifeHughes commented May 13, 2024

Turned it on in c3c93db

Just a casual 461 changed files with 5,588 additions and 5,713 deletions

Used this config:

repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.0.1
    hooks:
    -   id: check-yaml
    -   id: end-of-file-fixer
    -   id: trailing-whitespace

So it mostly just cleared out the whitespace.

@JimMadge
Copy link
Member

JimMadge commented May 14, 2024

Just a casual 461 changed files with 5,588 additions and 5,713 deletions

🤯
We have a serious trailing whitespace problem 😄.

@JimMadge
Copy link
Member

I've never found a way to enforce semantic line breaks but would really like one. Reviewers would still need to look out for that, but at least prettier/pre-commit would handle (almost) everything else.

can make tables really unwieldy in the raw Markdown, especially for complex tables with long strings in them.

Markdown tables always feel a bit awkward with one row per line.
I know there is an html table here. There are spans so I don't think you could do it in Markdown anyway.

I think it is probably best to discourage raw html.

@da5nsy
Copy link
Collaborator

da5nsy commented May 14, 2024

I've never found a way to enforce semantic line breaks but would really like one.

I think there's no simple rule that could do that - since fullstops are used in decimals and web addresses (and probably other things I'm not thinking of). This might genuinely be the sort of thing that a LLM might be useful for (insert rant against LLMs - ethical/environmental etc)

@AoifeHughes
Copy link
Collaborator Author

AoifeHughes commented May 14, 2024

Just gave permissions for pre-commit.ci to work on repo. Which I don't think will get picked up until the pre-commit config is merged to main? Ah it runs fine https://results.pre-commit.ci/run/github/155712190/1715680636.vNG1yoPgRXOT4btyKXwFxg

@JimMadge
Copy link
Member

I think there's no simple rule that could do that - since fullstops are used in decimals and web addresses (and probably other things I'm not thinking of). This might genuinely be the sort of thing that a LLM might be useful for (insert rant against LLMs - ethical/environmental etc)

Definitely splitting around full stops isn't the solution. I haven't thought too much about if there is a good heuristic. There is a TeX linter which is very good at telling the difference between full stops as the end of a sentence and as part of a truncation of abbreviation.

@AoifeHughes
Copy link
Collaborator Author

I'm going to move some of this over to the PR, but just to keep some of this conversation going; @JimMadge do you think it's something that's also worth opening another issue on and keep thinking about the line splitting stuff? Or would it be way out of scope?

@JimMadge
Copy link
Member

I think having an issue about enforcing semantic line breaks would be good to have on the Infrastructure WG backlog. I don't know if there is a solution, but if one becomes available we should look into it.

No need to solve that problem here or in your PR.

@jezcope
Copy link
Collaborator

jezcope commented May 17, 2024 via email

@JimMadge
Copy link
Member

@jezcope The sembr specification actually suggests breaks should occur at commas and could occur at conjunctions 🚀.

@jezcope
Copy link
Collaborator

jezcope commented May 17, 2024 via email

@AoifeHughes
Copy link
Collaborator Author

From reading the earlier discussions I feel like there's an agreement in some of the basic suggestions for pre-commit and space removing suggested. But there's a lot of interesting discussions happening around the line breaks and other grammar issues. So Iv'e started an issue over here and going to close this issue.

New issue: #3654

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request tools Tools we use, can be combined with question, help etc.
Projects
None yet
Development

No branches or pull requests

5 participants