Add ability to limit get (and thus install) --recursive installation of subdatasets #7567

yarikoptic · 2024-03-06T19:03:38Z

This case has come up a number of times, and I do not think we have arrived at any facility to address it. ATM we only have -R|--recursion-limit, which limits recursion to a fixed integer depth. Here I would like to collect use cases which could drive us toward a solution:

"YODA'ed datasets from above": it is typical to recommend installing some version of a dataset from the flattened level above, e.g. something like datalad create -d . derivatives/deriv1 && cd derivatives/deriv1 && datalad clone -d . ../../rawdata sourcedata && ... && cd - && datalad save -m "Finalized deriv1 derivative" -d . derivatives/deriv1 && datalad uninstall derivatives/deriv1/sourcedata. Later on, someone could then install the hierarchy of the dataset without installing those sourcedata/ subdatasets.
- note that the URL in .gitmodules for sourcedata/ might be replaced with a public URL, not just a local one.
- good examples:
  - https://wittkuhn.mpib.berlin/highspeed/ study by @lnnrtwttkhn
- a rule here might be "do not install a subdataset whose UUID matches that of a superdataset, or of an immediate subdataset of any superdataset"
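
To make the UUID-based rule above more concrete, here is a minimal sketch on a toy hierarchy. Nothing in it is DataLad API: the Dataset class, the plan_install function, the UUID strings and the code/containers path are all made up for illustration, and the reading of the rule (skip a subdataset if its UUID is already known from any superdataset or from an immediate subdataset of any superdataset) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Toy stand-in for a dataset record; NOT a DataLad class."""
    path: str
    uuid: str
    subdatasets: list = field(default_factory=list)


def plan_install(ds, known=frozenset()):
    """Return paths of subdatasets that would be installed under the rule.

    `known` holds UUIDs of all superdatasets above `ds` and of their
    immediate subdatasets (hypothetical bookkeeping, see lead-in).
    """
    known = known | {ds.uuid}  # ds is a superdataset of its own children
    siblings = {sub.uuid for sub in ds.subdatasets}
    installed = []
    for sub in ds.subdatasets:
        if sub.uuid in known:
            continue  # already available somewhere above: skip it
        installed.append(sub.path)
        # below `sub`, its siblings also count as "known from above"
        installed.extend(plan_install(sub, known | siblings))
    return installed


# Toy layout mirroring the example: sourcedata is a clone of rawdata,
# so both carry the same (made-up) UUID "raw".
top = Dataset("", "top", [
    Dataset("rawdata", "raw"),
    Dataset("derivatives/deriv1", "d1", [
        Dataset("derivatives/deriv1/sourcedata", "raw"),
        Dataset("derivatives/deriv1/code/containers", "cont"),
    ]),
])

print(plan_install(top))
```

In this toy case derivatives/deriv1/sourcedata is skipped because its UUID equals that of rawdata, an immediate subdataset of the top superdataset, while the unrelated nested subdataset is still installed.
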
YODA'ed datasets from somewhere else: an example is https://github.com/OpenNeuroDerivatives/OpenNeuroDerivatives, whose top-level superdataset does not include https://github.com/ReproNim/containers/ and https://github.com/poldracklab/tacc-openneuro/, although both are included in every subdataset. Currently this could be "addressed" via
- -R 1, but that might also prevent installing some "ad-hoc" subdataset
- expressing it in terms of the previous use case by including those subdatasets in the top-level superdataset (might be the best way to go)
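
As a rough illustration of why a plain depth limit is too blunt here, the sketch below walks a toy hierarchy with an integer limit. The tree, the names deriv-a, deriv-b, containers and ad-hoc, and the within_limit function are all hypothetical; the sketch only mimics the idea of a recursion limit and does not call DataLad.

```python
def within_limit(tree, limit, prefix=""):
    """Return subdataset paths reachable within `limit` levels.

    `tree` maps a subdataset name to a dict of its own subdatasets
    (a toy structure, not something DataLad provides).
    """
    if limit == 0:
        return []
    paths = []
    for name, children in tree.items():
        path = prefix + name
        paths.append(path)
        paths.extend(within_limit(children, limit - 1, path + "/"))
    return paths


# Every derivative carries a shared "containers" subdataset;
# one of them also has a genuinely wanted "ad-hoc" subdataset.
tree = {
    "deriv-a": {"containers": {}, "ad-hoc": {}},
    "deriv-b": {"containers": {}},
}

print(within_limit(tree, 1))  # shared copies are skipped, but so is ad-hoc
print(within_limit(tree, 2))  # ad-hoc is back, and so is every shared copy
```

With limit 1 the repeated containers copies are avoided but deriv-a/ad-hoc is lost too; with limit 2 everything comes back. A depth integer alone cannot tell the two kinds of subdataset apart.
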
Avoid private: install until hitting a private (on GitHub) repo. Use case:
- Skip embargoed datasets somehow? dandi/dandisets-healthstatus#73
  Maybe this could be re-expressed via
- adding some labeling within .gitmodules records, and then explicitly allowing users to say not to install submodules with a specific label/tag.
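
A minimal sketch of what the labeling idea could look like, assuming labels live in a hypothetical datalad-label key of each .gitmodules record (that key name, the example URLs and the subdatasets_to_install function are invented here; no such option exists in DataLad as far as this issue states). It only parses the file and decides which paths would be kept.

```python
import configparser

# Example .gitmodules content; "datalad-label" is a made-up key.
GITMODULES = """\
[submodule "sub-01"]
    path = sub-01
    url = https://example.com/sub-01
[submodule "sub-02"]
    path = sub-02
    url = https://example.com/sub-02
    datalad-label = embargoed private
"""


def subdatasets_to_install(gitmodules_text, exclude_labels):
    """Return submodule paths whose labels do not hit `exclude_labels`."""
    parser = configparser.ConfigParser(interpolation=None)
    # Strip indentation so keys are never read as continuation lines.
    cleaned = "\n".join(line.strip() for line in gitmodules_text.splitlines())
    parser.read_string(cleaned)
    excluded = set(exclude_labels)
    keep = []
    for section in parser.sections():
        if not section.startswith('submodule "'):
            continue
        # Labels are assumed to be whitespace-separated in one value.
        labels = set(parser[section].get("datalad-label", "").split())
        if labels & excluded:
            continue
        keep.append(parser[section]["path"])
    return keep


print(subdatasets_to_install(GITMODULES, {"embargoed"}))
```

Keeping the label in .gitmodules means the decision can be made before cloning anything, which is what the embargoed/private use case needs.
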

WDYT @datalad/developers and @datalad/contributors -- did you have related use-cases?

yarikoptic added the enhancement label Mar 6, 2024

yarikoptic mentioned this issue Mar 6, 2024

Skip embargoed datasets somehow? dandi/dandisets-healthstatus#73

Closed
