Working Document: Document API and Bokeh Server

Bryan Van de Ven edited this page Mar 18, 2017 · 1 revision
BEP 5 New Document API and document-driven server
Authors Havoc Pennington and Bryan Van de Ven
Status Implemented (alpha)
Implementation https://github.com/bokeh/bokeh/pull/2794
Discussion https://github.com/bokeh/bokeh/pull/2794 for small topics or email https://groups.google.com/a/continuum.io/forum/#!forum/bokeh for big picture

Motivation

This BEP accomplishes several things at once.

  • Developing a Bokeh server app should use only the Bokeh API, rather than requiring knowledge of "web technology" (JavaScript, HTTP, Flask, etc.). Dropping down to web tech is still possible for those who wish to do so.
  • A new concept of spellings allows experimentation with new syntaxes and file formats for specifying Bokeh models and apps.
  • Simplify the Bokeh server so that it only renders a set of Bokeh models. The generic web framework (Flask) has been removed, and the server no longer has multiuser concerns - it's only the Bokeh renderer.
  • Bokeh server should have reasonable scalability and production-readiness.
  • The Document API was complicated by the old bokeh-server architecture, so it has been simplified.
  • A new bokeh command allows scripts to be "output agnostic"; if a script just fills in curdoc() and does not call output_server, output_file, save, or push, then it can be run with bokeh serve or bokeh html to get a server app or a standalone html file, respectively.
  • More web framework agnosticism: the new server depends only on Tornado, and it can also run in its own process, with its plots embedded in another web app.

What did NOT change

This PR does not change the actual plot or widget APIs (bokeh.plotting, bokeh.models, etc.). It only changes the "container" outside of those APIs (Document, PlotContext, bokeh-server, bokeh.embed, bokeh.io).

A concrete example of a new bokeh serve script

In this commit, you can see how the sliders_app example ports from the old server to the new: https://github.com/bokeh/bokeh/commit/0fc0974cf545ae69b6663450dd234f9abb776e65

The following concepts and APIs are no longer in the new version:

  • bokeh.properties and Instance
  • subtyping of HBox
  • extra_generated_classes
  • setup_events
  • specifying on_change with a string name instead of a regular function
  • object_page
  • bokeh_app and bokeh_app.route

The new sliders_app.py adds models to curdoc() and sets up on_change callbacks triggered when those models change. That's it. No hoops to jump through.

Drawbacks

The new server is significantly incompatible with the old bokeh-server command, because server apps are no longer Flask apps; they are simply Bokeh scripts that build up curdoc().

There are some minor API changes to Document and PlotObject (now renamed Model) that, as far as we know, should be easy to adapt to, because most of the changed API was only there to support the old bokeh-server.

For those who wish to write a traditional web app using a framework such as Flask, it should be possible to embed the new server in another web app. Flask no longer has special status, however: it should now be equally easy to use Bokeh's embedded server with any framework, or at least any framework that can integrate with Tornado. It should also now be possible to run the Bokeh server in its own process, and embed plots from that process in your own app.

The new server no longer tries to handle document persistence and publication. Instead, this is a separate concern; there's a new SpellingHandler concept which is an abstraction of "where documents come from." In principle (though not in this PR) one could have a spelling handler that loaded documents from Redis or any other kind of storage. More on SpellingHandler below.

Tour of the new code

The PR is extremely large, which was unavoidable because Document, io.py, state.py, and the server are highly entangled. The best way to review the PR is probably to read the new files in their entirety. Here's a discussion of some of the design points and changes.

The regular GitHub diff view cannot display the large diff, but you can see a plain text diff here: https://github.com/bokeh/bokeh/compare/master...tornado.diff

Document

The new document.py: https://github.com/bokeh/bokeh/blob/tornado/bokeh/document.py

  • there's no longer a concept of "server document"
  • there's no longer a PlotContext; instead the Document has "roots" which are those objects you want in the Document even if no other object refers to them
  • autostore and autoadd are now part of State, rather than Document
  • Documents no longer have their own ID string
  • Document.load and Document.merge no longer exist; they were needed by old bokeh-server to manage partial updates. There are a number of new serialization methods that were simpler to implement, including from_json, replace_with_json, create_json_patch_string and apply_json_patch_string
  • A new on_change method lets you monitor change events for the entire document (all included models) at once
  • there's no longer a prune() method because the model graph is automatically tracked without manual pruning.
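The "roots" idea in the list above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Bokeh's implementation: Node, Document, add_root, and all_models here are hypothetical stand-ins written for this example. It shows why manual pruning becomes unnecessary when the set of models is derived from whatever is reachable from the roots.

```python
# Toy illustration only: these classes are hypothetical stand-ins,
# not the real bokeh.document.Document or bokeh.model.Model.

class Node:
    """Stand-in for a model that may refer to other models."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class Document:
    """Toy document: holds roots and derives everything else from them."""

    def __init__(self):
        self.roots = []

    def add_root(self, node):
        # A root stays in the document even if nothing else refers to it.
        if node not in self.roots:
            self.roots.append(node)

    def all_models(self):
        """Return the names of every model reachable from a root."""
        seen = {}
        stack = list(self.roots)
        while stack:
            node = stack.pop()
            if id(node) in seen:
                continue
            seen[id(node)] = node
            stack.extend(node.children)
        return {node.name for node in seen.values()}


plot = Node("plot")
slider = Node("slider")
box = Node("box", [plot, slider])

doc = Document()
doc.add_root(box)
before = doc.all_models()    # box, plot, and slider are all reachable

box.children.remove(slider)  # drop the only reference to the slider
after = doc.all_models()     # slider is gone without any prune() call
```

Because membership is recomputed from the roots, a model that nothing refers to simply stops being part of the document.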

figure vs. Figure

The figure() function creates a Figure and also adds the Figure to the document. Before this patch, it was OK to then go on and add the figure to a layout container such as a box. After this patch, that will put the figure in two places (it will be a document root, and also in the box). You should get a warning about this. To fix the warning, change from lowercase figure() to uppercase Figure() so that your figure is only in the box and not also a document root. There is no reason to use lowercase figure() if you're going to use a layout box.

Model, the class formerly known as PlotObject

The new model.py: https://github.com/bokeh/bokeh/blob/master/bokeh/model.py

We renamed PlotObject to Model to reflect that not all models are necessarily related to plots. (They could be a slider or a layout or something.)

  • Each model is attached to a single Document.
  • Serialization now always happens in the context of an entire Document; models cannot be serialized standalone. As a result, the load_json, dump, and finalize methods have been removed.
  • on_change now expects a plain function for callbacks, rather than objects plus a callback name.
  • setup_events should not be needed and has been removed.
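The plain-function callback style described above can be sketched as follows. This is a minimal toy, not the real Model class: the get and set methods, the per-attribute registration, and the (attr, old, new) callback signature are assumptions made for illustration.

```python
# Toy illustration only: a hypothetical stand-in for a model with
# on_change, not the real bokeh.model.Model.

class Model:
    def __init__(self, **props):
        self._props = dict(props)
        self._callbacks = {}

    def on_change(self, attr, callback):
        # Callbacks are plain callables, not an object plus a method name.
        if not callable(callback):
            raise TypeError("callback must be a plain callable")
        self._callbacks.setdefault(attr, []).append(callback)

    def get(self, attr):
        return self._props[attr]

    def set(self, attr, new):
        old = self._props.get(attr)
        self._props[attr] = new
        if old != new:  # notify only on an actual change
            for callback in self._callbacks.get(attr, []):
                callback(attr, old, new)


events = []


def record_change(attr, old, new):
    """Plain function used as the callback."""
    events.append((attr, old, new))


slider = Model(value=1)
slider.on_change("value", record_change)
slider.set("value", 5)  # triggers record_change
slider.set("value", 5)  # no change, so no callback
```

Registering a function directly removes the need for a separate event-setup step, which is why setup_events has no counterpart in this sketch.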

Application and SpellingHandler

These are new concepts; see https://github.com/bokeh/bokeh/blob/tornado/bokeh/application/application.py and https://github.com/bokeh/bokeh/blob/tornado/bokeh/application/spellings/handler.py

Most apps won't use these APIs directly. Instead, the new bokeh command constructs an Application using the spelling handlers appropriate for the files or directories you provide to the command. For example, bokeh serve foo.py would construct an Application with one ScriptHandler to handle foo.py.

An Application is a Document factory. It has a method create_document().

A SpellingHandler is a Document modifier. It has a method modify_document().

Application keeps a list of spelling handlers. To create a document, it creates an empty document, and then allows each handler to modify_document().

So a Bokeh server application is a script that modifies a document, and that's it. The server calls create_document() once per browser session - each time someone opens up a browser tab, they get a new fresh document.
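The factory and modifier roles described above can be sketched like this. The method names create_document() and modify_document() come from the text; everything else (the toy Document, the CallableHandler class, and the constructor signatures) is a hypothetical simplification, not the code in the PR.

```python
# Toy illustration only: simplified stand-ins for the concepts above.

class Document:
    """Stand-in for a Bokeh Document: just a list of roots."""

    def __init__(self):
        self.roots = []

    def add_root(self, model):
        self.roots.append(model)


class SpellingHandler:
    """A handler is a Document modifier."""

    def modify_document(self, doc):
        raise NotImplementedError


class CallableHandler(SpellingHandler):
    """Hypothetical handler that delegates to a plain function."""

    def __init__(self, func):
        self._func = func

    def modify_document(self, doc):
        self._func(doc)


class Application:
    """An application is a Document factory built from handlers."""

    def __init__(self, *handlers):
        self._handlers = list(handlers)

    def create_document(self):
        doc = Document()               # start from an empty document
        for handler in self._handlers:
            handler.modify_document(doc)  # each handler fills it in
        return doc


def add_title(doc):
    doc.add_root("title")


def add_plot(doc):
    doc.add_root("plot")


app = Application(CallableHandler(add_title), CallableHandler(add_plot))

# One call per browser session: every session gets its own fresh document.
session_a = app.create_document()
session_b = app.create_document()
```

Because each call starts from an empty document, changes made in one session's document never leak into another's.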

The created document can have callbacks (using Document.on_change or Model.on_change). When changes are made to the document on the client side (in JavaScript), they are synced to the server, potentially triggering your callbacks. If your callbacks in turn modify the document, those changes are synced to the client side.

What does this mean? If you're writing foo.py for bokeh serve foo.py, all you do is put the stuff you want in curdoc() and set up any on_change callbacks you want to have.

The same script can be output as a standalone html file using bokeh html foo.py, though standalone HTML doesn't support on_change callbacks.

If you want to support a new syntax or convention for specifying (or modifying) a Bokeh document, you'd just add a new SpellingHandler subtype.

Python Client API

Code: https://github.com/bokeh/bokeh/blob/tornado/bokeh/client/session.py

The new client-server protocol is entirely websocket-based. One consequence is that you can open a session, and have session.document continuously synced with the server; your changes to the doc go to the server, and the server's changes go to your copy of the doc.

This somewhat obsoletes the old output_server concept. For compatibility, output_server still works similarly to the old way, but rather than making changes then doing a big push(), the new best practice would be to simply leave the websocket open and continuously syncing.

JavaScript API

Server

Bokeh command

There's a new bin/bokeh which is implemented here: https://github.com/bokeh/bokeh/blob/tornado/bokeh/command/__init__.py

You pass the command the files that spell a Bokeh Document; right now these will always be scripts that modify curdoc(). The command uses them to create an Application, which either powers a server (bokeh serve) or emits standalone HTML (bokeh html).

Currently unimplemented is bokeh serve --develop, which will be the "develop mode" demonstrated a couple months ago. In develop mode, for example, we might automatically reload the application as you edit it. Our intent right now is to merge the PR without develop mode and then add it in an incremental PR.

The bokeh-server command no longer exists.