-
Notifications
You must be signed in to change notification settings - Fork 179
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support ls_recursive
in Azure.
#4981
Conversation
8d7bfba
to
0a9d023
Compare
0002936
to
a618ed1
Compare
I was having trouble debugging this in my Windows machine, because of asserts caused by MSVC's checked iterators feature. Opened SC-47581 to track this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, the only change I would like to see before giving a 👍 is C++ / C API tests using filters that won't accept all entries. I left a separate comment that points to existing tests for S3 that could be reused. We should also list Azure as supported for external docs in C++ (please update comment to include local FS here too) and C APIs.
Otherwise just left some suggestions for comments. Nice work!
tiledb/sm/filesystem/azure.h
Outdated
* iterators returned by a previous request. To be able to detect this, we | ||
* can track the batch number and compare it to the batch number associated | ||
* with the iterator returned by the previous request. Batch number can be | ||
* tracked by the total number of times we submit a ListObjectsV2 request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* tracked by the total number of times we submit a ListObjectsV2 request | |
* tracked by the total number of times we submit a ListBlobs request |
tiledb/sm/filesystem/azure.h
Outdated
* exit, will hold the continuation token to pass to the next listing | ||
* operation, or nullopt if there are no more results. | ||
* | ||
* @return A vector with the blobs and directories found. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* @return A vector with the blobs and directories found. | |
* @return Vector of results with each entry being a pair of the string URI | |
* and object size. |
// Currently only S3 is supported for VFS::ls_recursive. | ||
using TestBackends = std::tuple<S3Test>; | ||
// Currently only S3 and Azure is supported for VFS::ls_recursive. | ||
using TestBackends = std::tuple<S3Test, AzureTest>; | ||
TEMPLATE_LIST_TEST_CASE( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For tests using custom filters we could update these tests
https://github.com/TileDB-Inc/TileDB/blob/dev/test/src/unit-cppapi-vfs.cc#L510-L649
It looks like only one of them is using TEMPLATE_LIST_TEST_CASE at the moment but if it's not too much trouble I think all three of them could be updated to do this for local, s3, and azure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar tests for C APIs here that could be updated
https://github.com/TileDB-Inc/TileDB/blob/dev/tiledb/api/c_api/vfs/test/unit_capi_ls_recursive.cc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The local filesystem seems to fail. Investigating…
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I figured out how to fix the local filesystem test, but will make the fix in a separate PR. For now I reverted enabling it.
tiledb/sm/filesystem/vfs.h
Outdated
@@ -581,6 +581,28 @@ class VFS : private VFSBase, protected S3_within_VFS { | |||
return results; | |||
} | |||
|
|||
/** | |||
* Lists objects and object information that start with `prefix`, invoking |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit to update comment, even though the name for the function documents itself.
* Lists objects and object information that start with `prefix`, invoking | |
* Recursively lists objects and object information that start with `prefix`, invoking |
Feedback addressed. |
1738356
to
9d2c8d1
Compare
CI is green @shaunrd0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for digging into the local FS failures
SC-44943
This PR implements
ls_recursive
in the Azure VFS. The design is based on the S3 implementation, but unlike S3, the Azure SDK calls are cleanly separated, which prevents the Azure SDK headers from leaking to the VFS header.TYPE: C_API
DESC: Support VFS
ls_recursive
API for Azure filesystem.