Extend Arrow support to cover nullable data #4049
base: dev
This pull request has been linked to Shortcut Story #27472: Extend arrow i/o operations to nullable columns.
Looks like a few MSVC errors: https://github.com/TileDB-Inc/TileDB/actions/runs/4758463351/jobs/8456556083?pr=4049#step:10:593
I also saw an Ubuntu build fail but figure that would be spurious. My changes so far are quite cautious and limited. Let's see what happens once new tests get injected.
e237b20 to 17c1492
@eddelbuettel if you see my last commit, we were basically never testing col sizes greater than zero.
or something similar. Do you have any idea? I spent some hours on it but couldn't find a solution.
Howdy. Love that you are trying to resurrect the PR. These days it would probably be better to wrap some of the code in the nanoarrow (header-only) helper functions we already use in tiledb-r and tiledbsoma (mostly the R parts). See https://github.com/apache/arrow-nanoarrow and check out the recently expanded Python side (to manipulate objects; nanoarrow is really three packages, namely the header-only C/C++ helper, a somewhat mature R package to work on objects, and a growing Python package). I think I wrote this PR after I had similar work in the R package, where I have since changed things. (Just took a look.) Hm, that is on the other hand quite motivated by SOMA's libtiledbsoma and its ColumnBuffers etc. My pedestrian approach has usually been to set up a similar mock function standalone, or as an R extension, and then abstract out the pure C++ side for Core.
Back to your PR: I honestly do not know what the col size is mad about. Is there something driving this now and/or urgency?
The `arrowio` header provides import and export support between TileDB and Arrow through its interface of two `void*` pointers. This PR extends that support to cover 'nullable' (aka 'validity map') data.

The PR is in need of some tests, but the existing test setup is a little involved between Python, pybind11 and C++, so @ihnorton has kindly 'volunteered' to add them.
The PR will remain a draft until we have tests.
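
For readers unfamiliar with the two validity representations, here is a minimal standalone sketch of the conversion this kind of support needs. It is not the code in this PR. It assumes the TileDB side holds one `uint8_t` per cell (1 = valid, 0 = null) and relies on the Arrow columnar format's bit-packed validity bitmap, least-significant bit first. The function names `bitmap_to_bytemap` and `bytemap_to_bitmap` are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: expand an Arrow-style validity bitmap (1 bit per
// value, least-significant bit first) into a bytemap with one uint8_t per
// value (1 = valid, 0 = null). `offset` is the logical offset of the first
// value, in bits, as used by sliced Arrow arrays.
std::vector<uint8_t> bitmap_to_bytemap(
    const uint8_t* bitmap, std::size_t n, std::size_t offset = 0) {
  std::vector<uint8_t> out(n, 1);
  if (bitmap == nullptr) {
    // Arrow allows the validity buffer to be absent: every value is valid.
    return out;
  }
  for (std::size_t i = 0; i < n; ++i) {
    const std::size_t j = i + offset;
    out[i] = static_cast<uint8_t>((bitmap[j / 8] >> (j % 8)) & 1u);
  }
  return out;
}

// Hypothetical helper: pack a bytemap (non-zero = valid) into an Arrow-style
// bitmap. Unused trailing bits in the last byte are left as zero.
std::vector<uint8_t> bytemap_to_bitmap(const std::vector<uint8_t>& bytemap) {
  std::vector<uint8_t> out((bytemap.size() + 7) / 8, 0);
  for (std::size_t i = 0; i < bytemap.size(); ++i) {
    if (bytemap[i] != 0) {
      out[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
    }
  }
  return out;
}
```

A round trip through both helpers should return the original bytemap, which is the sort of property a test for nullable columns could check with more than zero cells.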
TYPE: FEATURE
DESC: Nullable support for Arrow