Individual Plans H1 2023

Pavithra Eswaramoorthy edited this page Mar 31, 2023 · 10 revisions

This page is a place for @bokeh/core to collect high-level thoughts and informal plans for things they expect or hope to be able to work on in the next ~6 months.

Bryan

Vectorized annotations

There is historical design cruft around the division between some glyphs and some annotations. I would like us to arrive, as a group, at a consistent plan forward that will clear up the current situation (with any necessary deprecations) and guide future work when new drawing features are added. A specific outcome is to clearly answer questions like:

  • what are firm technical differences between glyphs and annotations
  • what features and APIs should all annotations support (are there sub-categories?)
  • what are consistent naming conventions we can adopt now and carry into future work

Keep in mind that some things may fall into both categories, e.g. we may want a Text glyph and also a vectorized Labels (since the term "label" has widespread significance and meaning within the dataviz world). They should share as much implementation as possible, but also clearly demonstrate the conceptual intentions behind "glyph" and "annotation" individually.

This work is primarily planning and discussion, but eventually someone(s) will do work around the glyph and annotations modules of both Bokeh and BokehJS.

Documentation improvements

  • revamp example handling

    I think we will want to go back to having a custom parser for .py files. Now that examples have been consolidated under examples we can point the custom parser there, and have it pre-process all the files up front. This will allow us to have better control over caching, etc. and to avoid re-evaluating examples unnecessarily.

    We may also want to explore options like switching to JSON embeds rather than relying on old and complicated autoload_static so much.

  • speed up docs build

    Our docs currently take very long to build. The examples work described above may help improve things, but is probably not the entire story.

  • auto API linking

    Example code in our docs should auto-link to the corresponding API reference entries.

    There are some existing tools that almost work out of the box, at least for functions and classes, etc. But realistically we would like auto-links to our model properties as well. This may necessitate developing a custom Sphinx extension yet again.

Any work should be mostly confined to bokeh.sphinxext and docs/bokeh in the main repo. However, it is possible we may want to add hooks into the model system to facilitate API auto linking.
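As an illustration of the caching idea under "revamp example handling", here is a minimal sketch that decides which example files need re-evaluating by comparing content hashes against a stored cache. The function names (`content_digest`, `stale_examples`, `save_cache`) and the JSON cache file are hypothetical and are not part of bokeh.sphinxext; a real implementation would probably also need to account for the Bokeh version and any data files an example reads.

```python
import hashlib
import json
from pathlib import Path


def content_digest(path):
    """Return a SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def stale_examples(example_dir, cache_file):
    """Return (stale_paths, digests) for all .py files under example_dir.

    A file is considered stale when it is new or when its digest differs
    from the digest recorded in cache_file (hypothetical JSON mapping of
    path -> digest).
    """
    cache_path = Path(cache_file)
    cached = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    digests = {
        str(p): content_digest(p)
        for p in sorted(Path(example_dir).rglob("*.py"))
    }
    stale = [p for p, d in digests.items() if cached.get(p) != d]
    return stale, digests


def save_cache(cache_file, digests):
    """Persist the digests so the next build can skip unchanged examples."""
    Path(cache_file).write_text(json.dumps(digests, indent=2, sort_keys=True))
```

A docs build could call `stale_examples` once up front, evaluate only the returned paths, and then call `save_cache` with the new digests.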

Demo site deployment automation

The demo site was recently re-built using terraform. A full infrastructure tear-down and setup using terraform requires highly elevated AWS credentials. However, the current overall infrastructure really should not need to change at this point. Instead, it should be possible to simply swap out new containers for new Bokeh versions into the existing infrastructure. For this smaller task, a well-scoped set of minimal credentials can be determined, and automation set up so that anyone can kick off a site update when needed.

Any work should be confined to https://github.com/bokeh/demo.bokeh.org and AWS. I do not anticipate any impact on ongoing work in the main repo.

Testing improvements

  • Integration tests

    Our integration tests are currently disabled due to flakiness. Some level of full end-to-end cross-runtime testing capability needs to be restored.

  • Notebook tests

    We are ten years overdue for having any automated testing that actually exercises real notebooks. It's not clear what the best approach is, so some exploration and discussion will be necessary.

I would expect work to be mostly contained under bokeh/tests and .github/workflows.

Ian

Contouring improvements

We have a working implementation of contour plots, but improvements are needed to make it really useful. These ideas were originally in the Contouring Roadmap discussion.

  • Automatic calculation of contour levels. User specifies the number of levels required and they are calculated based on the supplied data limits. There are possibilities for linearly and logarithmically spaced levels, and linearly symmetric about zero, maybe more later. User may not receive exactly the requested number of levels as the requirement that the levels are sensibly spaced is more important.

  • Ways of specifying vector visual properties without knowing the number of levels in advance. This applies to all fill and line visual properties, and there will be extra palette-specific possibilities for fill_color and line_color. I am expecting the validation and calculation of these properties to occur on the Python side as all other contour validation and calculation occurs there.

  • Extending colorbar above and below the level range for filled contours. Before this, there are always contour lines calculated and drawn at the lower and upper level limits. If you "extend above" there will be an additional set of filled polygons from the maximum level upwards. If you "extend below" there will be an additional set of filled polygons from the minimum level downwards. This needs a sensible API and a way of indicating it visually on a colorbar, e.g. through the use of filled triangles at the upper and/or lower limits.
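To make the first item more concrete, here is a minimal sketch of one possible way to compute sensibly spaced levels from data limits. It is not the planned Bokeh implementation; the function names `nice_linear_levels` and `log_levels` are hypothetical, and the choice of step multiples 1, 2, 5 and 10 is an assumption. Note how the number of levels returned can differ from the number requested, because sensible spacing takes priority.

```python
import math


def nice_linear_levels(zmin, zmax, n_requested):
    """Return linearly spaced levels with a "nice" step covering [zmin, zmax].

    The step is the smallest of 1, 2, 5 or 10 times a power of ten that is
    at least as large as the naive step, so fewer levels than requested
    may be returned.
    """
    if not (math.isfinite(zmin) and math.isfinite(zmax)) or zmax <= zmin:
        raise ValueError("need finite data limits with zmin < zmax")
    if n_requested < 2:
        raise ValueError("need at least two levels")
    raw_step = (zmax - zmin) / (n_requested - 1)
    magnitude = 10.0 ** math.floor(math.log10(raw_step))
    for multiple in (1.0, 2.0, 5.0, 10.0):
        step = multiple * magnitude
        if step >= raw_step:
            break
    first = math.floor(zmin / step)
    last = math.ceil(zmax / step)
    return [i * step for i in range(first, last + 1)]


def log_levels(zmin, zmax):
    """Return levels at whole powers of ten covering [zmin, zmax]."""
    if zmin <= 0 or zmax <= zmin:
        raise ValueError("need 0 < zmin < zmax for logarithmic levels")
    lo = math.floor(math.log10(zmin))
    hi = math.ceil(math.log10(zmax))
    return [10.0 ** e for e in range(lo, hi + 1)]
```

For example, requesting 5 levels for data limits 0.0 and 1.0 gives a naive step of 0.25, which is rounded up to 0.5, so only three levels (0.0, 0.5, 1.0) are returned.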

WebGL improvements

These ideas are mostly from the WebGL Roadmap discussion.

  • Improved single line rendering. I have an idea for a different approach in the line shaders that could be both simpler code and faster to run. It should also address some of the current dashed line limitations.

  • Dashed line support for markers, meaning fixed-shape glyphs. Initially I am only considering some of the more commonly used and easy to implement shapes such as squares and circles.

  • Multiline glyph. The WebGL part of this is fairly easy, but it also requires a restructuring of the BokehJS rendering loop so that a single render call can blit and clear the WebGL canvas multiple times.

  • Arbitrary area glyphs, including with holes in them. This is going to need JavaScript or WASM tessellation/triangulation functionality.

  • Mechanisms to minify the shader code, and allow insertion of code into shaders. For example, we only want a single copy of the code that draws the various hatch patterns, and this needs to be used in both marker and arbitrary area shaders.

Note that I am not aiming here for full WebGL support.

BokehJS documentation

I'd like documentation on BokehJS to be auto-generated, whether that is standalone or part of the Sphinx doc build system. Primarily my interest is to provide contributors with better information on the API, including the availability or not of handy utilities such as ndarray classes and functions. But it should also be useful for users who are interested in accessing BokehJS functionality through extensions and/or callbacks.

Mateusz

Finalize support for canvas layouts

Add support for:

  • multiple plots per canvas and arbitrary plot positioning
  • layouts of legends, color bars and possibly other annotations
  • legends, color bars and plot independent layouts on separate canvases

Support multi-threading in bokehjs

Allow all data intensive computations (set_data, map_data, hit testing, etc.) and painting (using offscreen canvas where applicable) to be performed off the main thread in dedicated web workers. In the longer term, consider performing all data processing in web workers, including receiving data through web sockets. Focus on making bokehjs' UI more responsive when handling large amounts of data. This work needs to be coordinated with adding support for web assembly and bokehjs packaging reorganization.

Robustify data handling in bokehjs

Introduce web assembly to bokehjs, to allow using tooling and a programming language better suited for handling data. A language and a respective tool chain need to be chosen. Currently Rust is under evaluation in PR #12961. As with multi-threading support, consider re-implementation of data processing logic. Consider utilization of SIMD where applicable. Add support for 64-bit integer arrays, handle complex dtypes and non-native number types (e.g. fixed-point arithmetic). Generally improve support for handling ndarrays. Consider supporting other data/array formats (e.g. arrow).

Improve bokehjs packaging and development

Rethink how we build, package and bundle bokehjs. Specifically reduce or completely eliminate usage of tsc's compiler APIs in favour of a faster tool chain (e.g. swc). Split bokehjs into smaller self-contained packages. Split off plotting/vis code into its own bundle. Finalize support for ESM bundles. Investigate alternative bundling schemes (e.g. use scope hoisting and eliminate Bokeh.require()).

Improve validation and error handling

Add support for validating whether the shape and types of data match the capabilities of the associated glyphs. Currently it's impossible to verify whether the supplied data makes sense, at least not until bokehjs tries to process it. Allow reporting many or all issues during validation, instead of quitting on the first one. This principle should be applied more generally across bokeh and bokehjs.
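The "report many or all issues" idea could look roughly like the sketch below, written in Python for brevity. The names `Issue` and `validate_columns` are hypothetical and do not correspond to any existing bokeh or bokehjs API; the checks (presence, numeric values, equal lengths) are assumed examples of what a glyph might require.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    column: str   # column the issue refers to, or "*" for cross-column issues
    message: str  # human-readable description of the problem


def validate_columns(data, required):
    """Collect every problem found instead of raising on the first one."""
    issues = []
    lengths = {}
    for name in required:
        if name not in data:
            issues.append(Issue(name, "missing column"))
            continue
        values = data[name]
        lengths[name] = len(values)
        bad = [i for i, v in enumerate(values) if not isinstance(v, (int, float))]
        if bad:
            issues.append(Issue(name, f"non-numeric values at indices {bad}"))
    # Cross-column check runs even when earlier checks already failed.
    if len(set(lengths.values())) > 1:
        issues.append(Issue("*", f"column lengths differ: {lengths}"))
    return issues
```

With data `{"x": [1, 2, 3], "y": [1, "a"]}` and required columns `("x", "y", "size")`, a single call reports the non-numeric value in `y`, the missing `size` column, and the length mismatch together.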

Separately, make bokehjs more error resilient by not allowing unhandled exceptions. Any such exceptions should only indicate bugs and not usage errors. All usage errors should be either recoverable or presented in the UI with an explanation of how to recover from them manually.

Create JSON schema for the protocol

Create a JSON schema for bokeh's protocol and in general for any JSON generated by bokeh (base schema). In the long term, create a tool for generating a schema for all models and their properties (detailed schema). Having a schema for the protocol should make it easier to create new tools targeting bokeh's protocol and bokehjs.

Timo

Documentation

  • Continuously improve readability and structure
  • Update to most recent version of Sphinx (currently pinned to 5.1) and theme (currently pinned to 0.9)
  • Help with automating documentation

Tutorial

Pavithra

Outreach

  • Coordinate participation in Outreachy's May 2023 round
  • Coordinate presentations and sprints at conferences this year, esp. SciPy US
  • Coordinate the call for JS-focused Bokeh core-dev
  • Work with Victoria on regular social and blog posts

Grants

  • Work on proposals for improving Bokeh's accessibility. We can apply to NASA's HPOSS grant and the next CZI EOSS round.

Administration

  • Update bokeh/pm repository
  • Create and publish a Privacy Policy
  • Look into setting up a different analytics tool (move away from GA)