-
Notifications
You must be signed in to change notification settings - Fork 162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C ABI records #2082
Comments
Having a Taking a step back, the real question is just how to make it more ergonomic to create arrays of tuples through the C API. It would not be particularly difficult to make a |
I think for arrays of tuples/records, reusing the same format of Separate arrays for each component is a nice format for any language that has constructs like The aim for Futhask is generate Haskell types that mirror the Futhark types and can be easily and efficiently converted to and from, while being easy to construct and manipulate in Haskell. While creating the abstractions necessary to maintain ergonomics with this type of representation might be difficult in some languages, I think it'd be a shame if every conversion involved this type of packing/unpacking transformation. |
This is an enhancement proposal, aiming at better ergonomics and/or speed.
Passing arrays or records or tuples into entry points can currently be done efficiently by marshalling them transposed. To send an argument like
the caller needs to call tuple_new() N times, potentially(?) allocating each time, and crossing the FFI barrier, making optimizations harder (if LTO can eliminate the calls; otherwise impossible).
To get back that speed, the arguments must be like this:
foo0: [N]u8
foo1: [N]u16
In this case, the array needs to be created only twice, but no extra allocations happen.
The issue with this approach is that this kind of output is rarely idiomatic in today's programming languages. To get this output, all data, if it exists in the form [N](u8, u16), must be processed again by the CPU (slow) in order to match this new shape.
Some languages without a stable internal data representation get around this problem by having another kind of record type. In Rust, a regular struct can be turned into one with a defined field order by specifying the C representation:
Such structs can be accessed on the C side as:
and also as array members.
Futhark already supports primitive types laid out consecutively in memory. The purpose of the C ABI record would be to have a representation that lets it get packed into memory by the caller. Then a value like
foo: [N](c_abi(u8, u16))
would be created from an array directly:As a result, the caller has to do nearly no extra work.
Details
This is faster than the old way only if element accesses on the futhark side are as fast as for normal records/tuples.
While the operations this gets rid of are all O(elements), this should have impact where bandwidth to the processing unit is low compared to the computations done by each entry call.
This scheme can be nested, but it makes sense to start with only primitive types at first.
The idea came to me when learning that small arrays are inefficient. I wanted an array of records, but records aren't easy to pass to Futhark. That pushed me to an array of small arrays, which turn out to be inefficient. Now I'm packing data into u32, which is both limited and annoying. It's still less annoying than splitting the data into multiple arrays.
The text was updated successfully, but these errors were encountered: