Working Document: Document API and Bokeh Server
BEP 5 | New Document API and document-driven server |
---|---|
Authors | Havoc Pennington and Bryan Van de Ven |
Status | Implemented (alpha) |
Implementation | https://github.com/bokeh/bokeh/pull/2794 |
Discussion | https://github.com/bokeh/bokeh/pull/2794 for small topics or email https://groups.google.com/a/continuum.io/forum/#!forum/bokeh for big picture |
This BEP accomplishes several things at once.
- Developing a Bokeh server app should use only the Bokeh API, rather than requiring knowledge of "web technology" (JavaScript, http, Flask, etc.). Dropping down to web tech is still possible for those who wish to do so.
- A new concept of spellings allows experimentation with new syntaxes and file formats for specifying Bokeh models and apps.
- Simplify the Bokeh server so that it only renders a set of Bokeh models. The generic web framework (Flask) has been removed, and the server no longer has multiuser concerns - it's only the Bokeh renderer.
- Bokeh server should have reasonable scalability and production-readiness.
- The Document API was complicated by the old bokeh-server architecture, so it has been simplified.
- A new `bokeh` command allows scripts to be "output agnostic"; if a script just fills in `curdoc()` and does not call `output_server`, `output_file`, `save`, or `push`, then it can be run with `bokeh serve` or `bokeh html` to get a server app or a standalone html file, respectively.
- More web framework agnosticism; the new server only uses Tornado, and can also be run in its own process (embedding plots in another web app).
This PR does not change the actual plot or widget APIs (bokeh.plotting, bokeh.models, etc.); it only changes the "container" outside of those APIs (Document, PlotContext, bokeh-server, bokeh.embed, bokeh.io).
In this commit, you can see how the sliders_app example ports from the old server to the new: https://github.com/bokeh/bokeh/commit/0fc0974cf545ae69b6663450dd234f9abb776e65
The following concepts and APIs are no longer in the new version:
- bokeh.properties and `Instance`
- subtyping of HBox
- `extra_generated_classes`
- `setup_events`
- specifying `on_change` with a string name instead of a regular function
- `object_page`
- `bokeh_app` and `bokeh_app.route`
The new sliders_app.py adds models to `curdoc()` and sets up `on_change` callbacks triggered when those models change. That's it. No hoops to jump through.
The new server is significantly incompatible with the old `bokeh-server` command, because server apps are no longer Flask apps; they are simply Bokeh scripts that build up `curdoc()`.
There are some minor API changes to `Document` and `PlotObject` that (as far as we know) should be easy to adapt to, because most of the changed API was only there to support the old `bokeh-server`.
For those who wish to write a traditional web app using a framework such as Flask, it should be possible to embed the new server in another web app. However, now it would be equally easy to use Bokeh's embedded server with any framework, or at least any framework that can integrate with Tornado. It should also now be possible to run the Bokeh server in its own process, and embed plots from that process in your own app.
The new server no longer tries to handle document persistence and publication. Instead, this is a separate concern; there's a new `SpellingHandler` concept which is an abstraction of "where documents come from." In principle (though not in this PR) one could have a spelling handler that loaded documents from Redis or any other kind of storage. More on `SpellingHandler` below.
The PR is extremely large, which was unavoidable because `Document`, io.py, state.py, and the server are highly entangled. The best way to review the PR is probably to read the new files in their entirety.
Here's a discussion of some of the design points and changes.
The regular GitHub diff view cannot display the large diff, but you can see a plain text diff here: https://github.com/bokeh/bokeh/compare/master...tornado.diff
The new document.py: https://github.com/bokeh/bokeh/blob/tornado/bokeh/document.py
- there's no longer a concept of "server document"
- there's no longer a PlotContext; instead the Document has "roots" which are those objects you want in the Document even if no other object refers to them
- `autostore` and `autoadd` are now part of State, rather than Document
- Documents no longer have their own ID string
- `Document.load` and `Document.merge` no longer exist; they were needed by old `bokeh-server` to manage partial updates. There are a number of new serialization methods that were simpler to implement, including `from_json`, `replace_with_json`, `create_json_patch_string` and `apply_json_patch_string`
- A new `on_change` method lets you monitor change events for the entire document (all included models) at once
- there's no longer a `prune()` method because the model graph is automatically tracked without manual pruning.
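The "roots" idea and the removal of `prune()` can be illustrated with a toy sketch. This is not Bokeh's implementation: `ToyModel`, `ToyDocument`, `add_root`, `remove_root` and `all_models` are hypothetical names chosen for illustration. The sketch only shows why manual pruning becomes unnecessary once the set of models is derived from the roots.

```python
class ToyModel:
    """Hypothetical stand-in for a Bokeh model (not the real class)."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class ToyDocument:
    """Hypothetical document that stores only its roots."""

    def __init__(self):
        self.roots = []

    def add_root(self, model):
        if model not in self.roots:
            self.roots.append(model)

    def remove_root(self, model):
        self.roots.remove(model)

    def all_models(self):
        """Names of every model reachable from a root (graph traversal)."""
        seen = {}
        stack = list(self.roots)
        while stack:
            model = stack.pop()
            if id(model) in seen:
                continue
            seen[id(model)] = model
            stack.extend(model.children)
        return {m.name for m in seen.values()}


slider = ToyModel("slider")
plot = ToyModel("plot")
box = ToyModel("box", children=[slider, plot])

doc = ToyDocument()
doc.add_root(box)
models_with_box = doc.all_models()       # the box and everything it refers to

doc.remove_root(box)
models_after_removal = doc.all_models()  # nothing is reachable any more
```

Because membership is computed from the roots, removing a root implicitly drops every model that was only reachable through it; there is no separate cleanup step.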
The `figure()` function creates a `Figure` and also adds the Figure to the document.
Before this patch, it was OK to then go on and add the figure to a layout container such as a box.
After this patch, that will put the figure in two places (it will be a document root, and also in the box). You should get a warning about this. To fix the warning, change from lowercase `figure()` to uppercase `Figure()` so that your figure is only in the box and not also a document root. There is no reason to use lowercase `figure()` if you're going to use a layout box.
The new model.py: https://github.com/bokeh/bokeh/blob/master/bokeh/model.py
We renamed `PlotObject` to `Model` to reflect that not all models are necessarily related to plots. (They could be a slider or a layout or something.)
- Each model is attached to a single `Document`.
- Serialization now always happens in the context of an entire `Document`; models cannot be serialized standalone. As a result, the `load_json`, `dump`, and `finalize` methods have been removed.
- `on_change` now expects a plain function for callbacks, rather than objects plus a callback name.
- `setup_events` should not be needed and has been removed.
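As a rough illustration of "a plain function for callbacks", here is a minimal sketch using a toy class. `ToyModel`, its `set` method, and the `(name, old, new)` callback signature are assumptions made for this example; they are not taken from the PR and may differ from the real `Model.on_change`.

```python
class ToyModel:
    """Hypothetical model that notifies plain callables when a property changes."""

    def __init__(self, **props):
        self._props = dict(props)
        self._callbacks = []

    def on_change(self, callback):
        # Only plain callables are accepted; a string name is rejected.
        if not callable(callback):
            raise TypeError("on_change expects a function, got %r" % (callback,))
        self._callbacks.append(callback)

    def set(self, name, value):
        old = self._props.get(name)
        if old == value:
            return  # nothing changed, so nothing to report
        self._props[name] = value
        for callback in self._callbacks:
            callback(name, old, value)


events = []


def record(name, old, new):
    """Plain function used as the callback."""
    events.append((name, old, new))


slider = ToyModel(value=1)
slider.on_change(record)
slider.set("value", 2)
slider.set("value", 2)  # same value again: no event is recorded
```

Passing the function object itself means no lookup by name is needed, which is presumably part of why `setup_events` could go away.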
These are new concepts, see https://github.com/bokeh/bokeh/blob/tornado/bokeh/application/application.py and https://github.com/bokeh/bokeh/blob/tornado/bokeh/application/spellings/handler.py
Most apps won't use these APIs directly. Instead, the new `bokeh` command constructs an `Application` using the spelling handlers appropriate for the files or directories you provide to the command.
For example, `bokeh serve foo.py` would construct an `Application` with one `ScriptHandler` to handle `foo.py`.
An `Application` is a Document factory. It has a method `create_document()`.
A `SpellingHandler` is a Document modifier. It has a method `modify_document()`.
`Application` keeps a list of spelling handlers. To create a document, it creates an empty document, and then allows each handler to `modify_document()`.
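The factory/modifier relationship can be sketched in a few lines. Only the method names `create_document()` and `modify_document()` come from the text above; `ToyDocument`, `ToyHandler`, `ToyApplication` and the use of plain strings as "roots" are hypothetical simplifications, not the real classes.

```python
class ToyDocument:
    """Hypothetical empty document; roots are plain strings here."""

    def __init__(self):
        self.roots = []


class ToyHandler:
    """Hypothetical spelling handler that adds one root to a document."""

    def __init__(self, root_name):
        self.root_name = root_name

    def modify_document(self, doc):
        doc.roots.append(self.root_name)


class ToyApplication:
    """Hypothetical document factory holding an ordered list of handlers."""

    def __init__(self, *handlers):
        self._handlers = list(handlers)

    def create_document(self):
        doc = ToyDocument()            # start from an empty document
        for handler in self._handlers:
            handler.modify_document(doc)   # each handler gets its turn
        return doc


app = ToyApplication(ToyHandler("plot"), ToyHandler("slider"))
first = app.create_document()
second = app.create_document()   # a fresh, independent document
```

Each call to `create_document()` runs the handlers again on a new document, so two callers never share state.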
So a Bokeh server application is a script that modifies a document, and that's it. The server calls `create_document()` once per browser session: each time someone opens up a browser tab, they get a new fresh document.
The created document can have callbacks (using `Document.on_change` or `Model.on_change`). When changes are made to the document on the client side (in JavaScript), they are synced to the server, potentially triggering your callbacks. If your callbacks in turn modify the document, those changes are synced to the client side.
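The round trip described above (client change, server callback, change sent back) can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyServerDocument`, `apply_client_patch`, and the use of a plain dict as a "patch" are hypothetical and say nothing about the actual wire protocol.

```python
class ToyServerDocument:
    """Hypothetical server-side document holding named values."""

    def __init__(self, values):
        self.values = dict(values)
        self._callbacks = []
        self._outgoing = {}

    def on_change(self, callback):
        self._callbacks.append(callback)

    def set(self, key, value, from_client=False):
        old = self.values.get(key)
        if old == value:
            return
        self.values[key] = value
        if not from_client:
            # Changes made on the server must be sent back to the client.
            self._outgoing[key] = value
        for callback in self._callbacks:
            callback(key, old, value)

    def apply_client_patch(self, patch):
        """Apply client changes, run callbacks, return the changes to send back."""
        self._outgoing = {}
        for key, value in patch.items():
            self.set(key, value, from_client=True)
        return dict(self._outgoing)


server_doc = ToyServerDocument({"slider": 1, "title": "value is 1"})


def update_title(key, old, new):
    # A callback that modifies the document in response to a change.
    if key == "slider":
        server_doc.set("title", "value is %d" % new)


server_doc.on_change(update_title)

# The client moves the slider and sends the change to the server.
client_values = {"slider": 1, "title": "value is 1"}
client_values["slider"] = 5
reply = server_doc.apply_client_patch({"slider": 5})
client_values.update(reply)   # the client applies what the server sent back
```

After the exchange both sides hold the same values, and the client's own change is not echoed back to it.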
What does this mean? If you're writing `foo.py` for `bokeh serve foo.py`, all you do is put the stuff you want in `curdoc()` and set up any `on_change` callbacks you want to have.
The same script can be output as a standalone html file using `bokeh html foo.py`, though standalone HTML doesn't support `on_change` callbacks.
If you want to support a new syntax or convention for specifying (or modifying) a Bokeh document, you'd just add a new `SpellingHandler` subtype.
Code: https://github.com/bokeh/bokeh/blob/tornado/bokeh/client/session.py
The new client-server protocol is entirely websocket-based. One consequence is that you can open a session, and have `session.document` continuously synced with the server; your changes to the doc go to the server, and the server's changes go to your copy of the doc.
This somewhat obsoletes the old `output_server` concept. For compatibility, `output_server` still works similarly to the old way, but rather than making changes then doing a big `push()`, the new best practice would be to simply leave the websocket open and continuously syncing.
- https://github.com/bokeh/bokeh/blob/tornado/bokehjs/src/coffee/common/client.coffee The client API in JS is very similar to the Python one.
- https://github.com/bokeh/bokeh/blob/tornado/bokehjs/src/coffee/server/embed.coffee The embed API has been simplified and adapted to the new server.
- https://github.com/bokeh/bokeh/blob/tornado/bokehjs/src/coffee/common/document.coffee There's now a Document (not the DOM one, this is a Bokeh document) on the client side.
- A server hosts one or more `Application` (remember that an `Application` is a thing which can `create_document`) https://github.com/bokeh/bokeh/blob/tornado/bokeh/server/server.py
- For each application we have sessions https://github.com/bokeh/bokeh/blob/tornado/bokeh/server/application_context.py
- A ServerSession is a live instance of a document usually associated with a single browser tab https://github.com/bokeh/bokeh/blob/tornado/bokeh/server/session.py
- Sessions are kept in sync with a client-side document over a websocket https://github.com/bokeh/bokeh/blob/tornado/bokeh/server/views/ws.py
- There's an HTML page that displays a document https://github.com/bokeh/bokeh/blob/tornado/bokeh/server/views/doc_handler.py#L68 and this is the main link for your application
- The server also provides an autoload embed script, like the old server did, for embedding from another process https://github.com/bokeh/bokeh/blob/tornado/bokeh/server/views/autoload_js_handler.py
There's a new bin/bokeh which is implemented here: https://github.com/bokeh/bokeh/blob/tornado/bokeh/command/__init__.py
You pass the command the files that spell a Bokeh Document; right now these will always be scripts that modify `curdoc()`. The command can use these to create an `Application` which powers a server (`bokeh serve`) or emits standalone HTML (`bokeh html`).
Currently unimplemented is `bokeh serve --develop`, which will be the "develop mode" demonstrated a couple months ago. In develop mode, for example, we might automatically reload the application as you edit it. Our intent right now is to merge the PR without develop mode and then add it in an incremental PR.
The `bokeh-server` command no longer exists.