How to get files' ids? #32

janxkoci · 2020-12-14T10:53:29Z

Hi, sorry for the basic question, but I don't know how to get file ids so that I can download individual files from a Dryad dataset.

I tried looking at our published dataset with:

> dryad_dataset("10.5061/dryad.7nt8f")
# truncated output
$`10.5061/dryad.7nt8f`$id
[1] 6817

However, if I try to use that id to get files, the result points to a different DOI for this id:

> dryad_files(6817)
# truncated output
$`6817`$`_links`$`stash:dataset`$href
[1] "/api/v2/datasets/doi%3A10.5061%2Fdryad.nf757"

That is, the returned DOI is 10.5061/dryad.nf757 rather than 10.5061/dryad.7nt8f.
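One way to check which dataset an id resolves to is to decode the percent-encoded DOI at the end of the href. This is a minimal sketch using only base R; the helper name `doi_from_href` is hypothetical and not part of rdryad.

```r
# Hypothetical helper (not part of rdryad): recover the DOI from a
# "stash:dataset" href such as the one printed above.
doi_from_href <- function(href) {
  stopifnot(is.character(href), length(href) == 1)
  encoded <- basename(href)             # last path segment, still percent-encoded
  decoded <- utils::URLdecode(encoded)  # "%3A" becomes ":", "%2F" becomes "/"
  sub("^doi:", "", decoded)             # drop the "doi:" prefix
}

doi_from_href("/api/v2/datasets/doi%3A10.5061%2Fdryad.nf757")
# [1] "10.5061/dryad.nf757"
```

Comparing the result with the DOI that was originally requested makes a mismatch like the one above easy to spot.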

So how do I get:

1. the proper ids for my dataset, to be used in functions like dryad_files?
2. a link to a particular file (e.g. Appendix S2.txt in the dataset at the DOI above)?

Session Info

R version 3.6.1 (2019-07-05)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: elementary OS 5.1.7 Hera

Matrix products: default
BLAS/LAPACK: /home/jena/miniconda3/lib/libopenblasp-r0.3.12.so

locale:
 [1] LC_CTYPE=cs_CZ.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=cs_CZ.UTF-8        LC_COLLATE=cs_CZ.UTF-8    
 [5] LC_MONETARY=cs_CZ.UTF-8    LC_MESSAGES=cs_CZ.UTF-8   
 [7] LC_PAPER=cs_CZ.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=cs_CZ.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rdryad_1.0.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      magrittr_2.0.1  rappdirs_0.3.1  uuid_0.1-4     
 [5] R6_2.5.0        rlang_0.4.8     hoardr_0.5.2    tools_3.6.1    
 [9] htmltools_0.5.0 ellipsis_0.3.1  digest_0.6.27   httpcode_0.3.0 
[13] tibble_3.0.4    lifecycle_0.2.0 crayon_1.3.4    zip_2.1.1      
[17] IRdisplay_0.7.0 repr_1.1.0      base64enc_0.1-3 vctrs_0.3.5    
[21] triebeard_0.3.0 IRkernel_1.1.1  curl_4.3        crul_1.0.0     
[25] evaluate_0.14   mime_0.9        pbdZMQ_0.3-3.1  compiler_3.6.1 
[29] pillar_1.4.7    urltools_1.7.3  jsonlite_1.7.1  pkgconfig_2.0.3


janxkoci · 2020-12-14T11:02:20Z

Update

I noticed that I can use the number from a file's link on the Dryad website as ids, and it seems to work properly and return the right file. But how do I get those ids from rdryad?

For example the file Appendix S2.txt mentioned above is linked with the following url: https://datadryad.org/stash/downloads/file_stream/33893

Using 33893 as ids in these functions returns the right DOI, file description, etc.:

> dryad_files(33893)
# truncated output
$`33893`$`_links`$`stash:dataset`$href
[1] "/api/v2/datasets/doi%3A10.5061%2Fdryad.7nt8f"
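Going the other way, a download link can be rebuilt from a file id. This sketch assumes that the URL pattern seen for Appendix S2.txt holds for other files as well; `file_stream_url` is a hypothetical helper, and the commented-out download call has not been tested against the live site.

```r
# Hypothetical helper: build a download URL from a Dryad file id,
# assuming the file_stream pattern of the link quoted above.
file_stream_url <- function(file_id) {
  # as.integer() avoids scientific notation for large ids
  sprintf("https://datadryad.org/stash/downloads/file_stream/%d",
          as.integer(file_id))
}

s2_url <- file_stream_url(33893)
s2_url
# [1] "https://datadryad.org/stash/downloads/file_stream/33893"

# Untested: fetch the file with base R
# download.file(s2_url, destfile = "Appendix S2.txt", mode = "wb")
```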

sckott · 2020-12-15T18:54:23Z

Thanks for opening the issue. How it works is quite a mystery to me too, but I'll have a look.

sckott · 2020-12-15T20:16:29Z

Sorry for the confusion on this. I hate to point fingers, but Dryad has not explained their API well at all, especially how the different ids work and why we have to deal with their internal IDs rather than just the DOI for the dataset itself. And they don't really respond to questions, so it really is a joy!

sckott · 2020-12-15T20:21:26Z

Okay, so this should work, where you have to get version information first:

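library(rdryad)

# return the last element of a vector
last <- function(x) x[length(x)]
# look up the dataset's versions by DOI
z <- dryad_dataset_versions("10.5061/dryad.7nt8f")
# path to the version; its last segment is the internal version ID
idpath <- z[[1]]$`_embedded`$`stash:versions`$`_links.self.href`
id <- as.numeric(last(strsplit(idpath, "/")[[1]]))
# gives you information about the files, including their individual IDs
dryad_versions_files(id)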

Then you still have to extract the IDs from the strings for each file (with a regex, etc.).
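That extraction step could be sketched as follows. This assumes each file entry carries a link string that ends in the numeric file ID, e.g. /api/v2/files/33893; that shape, the example `hrefs` vector (the second ID is made up), and the helper name `extract_file_ids` are illustrative assumptions, not something rdryad provides.

```r
# Hypothetical helper: pull the trailing numeric ID out of each link string.
extract_file_ids <- function(hrefs) {
  has_id <- grepl("/[0-9]+$", hrefs)
  # capture the digits after the last slash; NA where the string has none
  ids <- ifelse(has_id, sub("^.*/([0-9]+)$", "\\1", hrefs), NA_character_)
  as.integer(ids)
}

# Illustrative input only; real values would come from dryad_versions_files(id)
hrefs <- c("/api/v2/files/33893", "/api/v2/files/33894", "/api/v2/files/")
extract_file_ids(hrefs)
# [1] 33893 33894    NA
```

Returning NA for strings without a trailing number keeps the output aligned with the input, so the ids can be matched back to file names.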

We really need to make this easier. Pull requests are welcome, as I don't have a lot of time to devote to this.

janxkoci · 2020-12-16T09:46:10Z

Thanks for your reply and tips.

Early next year I plan to work on a pipeline that starts by pulling data from Dryad, so I will be working more closely with this package. I cannot promise anything, but I will see if I can help improve this in some way.

sckott · 2020-12-16T16:07:01Z

Thanks, sounds good

sckott added this to the v1.1 milestone Dec 15, 2020

sckott added the Bug label Dec 15, 2020
