I'm curious what folks would think about adding use case-specific pages to the Dask docs. Specifically, I was thinking about pages for machine learning and workflow orchestration where there is an especially broad ecosystem of libraries that you can use with Dask, but it'd be hard to find these by looking through the Dask docs. Maybe there are other good use cases too.
I'm not quite sure how these would fit into the current docs.dask.org table of contents. Some ideas:
add a new "use cases" category in the left TOC
create a use cases drop-down under "how to use" that can link out to common use cases
...
machine learning
We have https://ml.dask.org/, which is nice because it covers many ML-specific libraries that integrate well with Dask. A lot of this information is out of date, though, and I think it would be nice to have a single, concise page in docs.dask.org that links out to examples, relevant libraries (XGBoost, LightGBM, RAPIDS, scikit-learn, and the dask-ml functions that are still used/maintained), etc.
workflows/etl
I think the closest thing we have to this right now is the Prefect example in examples.dask.org. I'm imagining this page could link to using Dask with Prefect, but also with other workflow orchestration tools like Airflow and Dagster. Maybe it could also mention things like dask-sql, dask-bigquery, and delta-rs + Dask.
I'm very +1 on adding use case examples. As you say we already have ml.dask.org and examples.dask.org. Are you suggesting putting some effort into getting those up to date? Or merging those into the main Dask docs?
I don't think that ml.dask.org is particularly good. I think that we should have a Machine Learning doc inside docs.dask.org that points people in different directions. My guess is that it points people to ...
HPO systems
Optuna
Futures
Gradient boosted trees
xgboost
lightgbm
Batch inference
Futures
Dask dataframe map_partitions
PyTorch training on large models (the saturn thing maybe?)
I think that I would find this more valuable than ml.dask.org, which is today focused on the Dask ML package, which is, as far as I'm aware, largely unused.
cc @jrbourbeau @fjetter @mrocklin @jacobtomlinson