Pandas 2.0 and Arrow #12868
Replies: 3 comments 12 replies
-
I definitely need to look into this some more (and get a better understanding of Bokeh internals, for that matter), but Bokeh 3.1 should have no problem handling pandas pyarrow arrays, since they subclass pandas ExtensionArrays (see below). However, any time a pandas array is encoded, it is just converted to a NumPy array. This means that some of the main benefits of Arrow (i.e., minimizing array copies) are lost. Working code example for
import pandas as pd
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource, LabelSet


def main():
    # Every column is a pyarrow-backed pandas extension array.
    source = ColumnDataSource({
        'x': pd.array([0, 1, 2], dtype='int64[pyarrow]'),
        'y': pd.array([0, 1, 2], dtype='int64[pyarrow]'),
        'text': pd.array(['hello', None, 'world'], dtype='string[pyarrow]'),
    })
    p = figure()
    # No fields are passed explicitly: this relies on the default
    # 'x', 'y', and 'text' column names.
    p.circle(source=source)
    p.add_layout(LabelSet(source=source))
    show(p)


if __name__ == '__main__':
    main()
-
For reference, here's a Gist demonstrating how to serialize an Arrow table across a Jupyter Comm: https://gist.github.com/manzt/5c5e3de6d2cea65d2bb68af03db88249 This should be pretty similar to how we would handle this.
-
Hopefully the team can take issue #10464 into consideration. If Bokeh supported the Arrow format, or used it as its internal data representation, I guess we could get better performance when the user provides data to
-
Pandas 2.0 is coming out and is increasing support for Apache Arrow as a backend:
https://datapythonista.me/blog/pandas-20-and-the-arrow-revolution-part-i
I wanted to start this discussion just to get some preliminary idea of what this might mean for us: will everything work out of the box, do we need any new explicit codepaths, are there longer term opportunities (e.g. to leverage pyarrow and arrow.js?)
More immediately: if things don't work out of the box (or nearly), do we need a pandas upper bound for version 3.1?
cc @bokeh/dev @benrussell80