Add support for pip install dask[jobqueue] #11112

Closed
Andrew-S-Rosen opened this issue May 9, 2024 · 4 comments
Labels
discussion Discussing a topic with no specific actions yet tests Unit tests and/or continuous integration

Comments

@Andrew-S-Rosen
Contributor

It might be worthwhile to add support for doing pip install dask[jobqueue] by adding the dependency to [project.optional-dependencies] in pyproject.toml. Additionally, it could be nice to have this tested in some integration CI run.

@github-actions github-actions bot added the needs triage Needs a response from a contributor label May 9, 2024
@fjetter
Member

fjetter commented May 21, 2024

There are many different dask extensions that could be added this way, and so far we have chosen not to do this. The optional dependencies currently cover only things related to dask core functionality and don't include any deployment-specific optionals.
I'm inclined not to go down the route of adding all possible projects, as there are too many.


Regarding testing, I see the benefits of having fully integrated CI runs but I'm concerned about the cost (mostly time and developer productivity).
Most of the test suite is actually not set up to run with a distributed cluster and only uses the trivial schedulers. To enable an integration test, we'd have to migrate the entire test suite to a setup that uses a LocalCluster variant (which could then be replaced by dask-jobqueue et al. with their respective clusters). This is a lot of work and would likely slow down CI considerably, even without accounting for the additional test runs against jobqueue, kubernetes, gateway, mpi, ...

@fjetter fjetter added discussion Discussing a topic with no specific actions yet tests Unit tests and/or continuous integration and removed needs triage Needs a response from a contributor labels May 21, 2024
@fjetter
Member

fjetter commented May 21, 2024

cc @jacobtomlinson who is heavily involved in development and maintenance of the various OSS deployment tools

@Andrew-S-Rosen
Contributor Author

I think that's fair enough! Feel free to close if you wish.

@jacobtomlinson
Member

I think the amount of complexity in the CI of the deployment projects is enough to put me off doing something like this. Historically we've kept projects separate in Dask in order to give agency to the maintainers of each subproject.

I'm not especially +1/-1 on adding some more extras to the dask package. But it doesn't feel like it gives that much value over using package names directly. It would also mean we end up with a circular dependency, as most subprojects depend on dask, which I could imagine would cause a maintenance headache from time to time.
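The circular reference can be sketched at the project level, assuming we draw an edge whenever one project's metadata (extras included) names the other. The graphs below are simplified, hypothetical stand-ins rather than the real dependency data, and `has_cycle` is a helper invented for this example. It says nothing about whether an installer could still resolve such a setup.

```python
from graphlib import CycleError, TopologicalSorter

# Simplified, hypothetical graphs: project -> projects its metadata references.
today = {
    "dask": set(),
    "dask-jobqueue": {"dask"},
}
# With a hypothetical dask[jobqueue] extra, dask would reference dask-jobqueue too.
with_extra = {
    "dask": {"dask-jobqueue"},
    "dask-jobqueue": {"dask"},
}


def has_cycle(graph: dict[str, set[str]]) -> bool:
    """Return True if the graph cannot be topologically ordered."""
    try:
        # static_order() raises CycleError when a cycle is present.
        tuple(TopologicalSorter(graph).static_order())
    except CycleError:
        return True
    return False


print(has_cycle(today))
print(has_cycle(with_extra))
```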

I'm always keen to discuss ideas like this. Anything we can do to make the user experience of Dask better is worth talking about. But I don't see any compelling reasoning in this particular case to motivate making a change like this. Given @fjetter is also not positive on the idea, I'm going to close this out. But thanks for starting the discussion @Andrew-S-Rosen!

@jacobtomlinson jacobtomlinson closed this as not planned May 21, 2024

3 participants