Jupyter(Hub) conceptual intro #2726
e86f1a8
to
c129841
Compare
@rkdarst thank you for putting in this excellent work to give a cohesive overview of how things relate! I appreciate how it is written in a readable, non-reference-like manner. I'm thinking, for example, of the clarifications in the early part of the text, where you explain that the purpose of the text is to give an overview.
cece31d
to
78f424f
Compare
I did a big pass on this, and I think we should start working on polishing. It's late, and I'm sure everyone will be able to find little improvements. Instead of making things perfect, I'll just let everyone read and make their suggestions. The CI failures seem unrelated, but at least the docs one should be fixed.
Thanks for coming back to this! The docs build error is:
which I don't fully understand. There is one definition https://github.com/jupyterhub/jupyterhub/blob/master/docs/source/api/auth.rst (the "other instance"). But where is the first definition? |
The weird thing is that on |
On Wed, Apr 15, 2020 at 03:09:51PM -0700, Tim Head wrote:
There is one definition https://github.com/jupyterhub/jupyterhub/blob/master/docs/source/api/auth.rst (the "other instance"). But where is the first definition?
My current hypothesis is that it is about autodoc somehow: it's
parsing the docstring to turn it into docs, but then one of the
`.. autoconfigurable:: Authenticator` includes the same thing.
And actually, scratch that. Look at the current docs:
https://jupyterhub.readthedocs.io/en/stable/api/auth.html
jupyterhub.auth.Authenticator has *every* attribute, including
admin_users, duplicated! So, there *is* some actual, existing
duplication. I guess something is stricter on warnings?
https://github.com/jupyterhub/autodoc-traits/blob/master/autodoc_traits/autodoc_traits.py#L28 I added some "print" statements right before this line, and sure enough, |
- When run with `:members:`, traitlets traits are duplicated, because they are added to the autodoc list both from `:members:` and from this autodetection (I think).
- Discussed in at least jupyterhub/jupyterhub#2726. This is currently causing the JupyterHub docs to fail, because sphinx gives an error if there are duplicate autodoced traits and it is run with `-W`.
- This seems to have started with a new version of sphinx, but we aren't completely sure. JH has been using the `-W` option since 2017.
- I'm unsure if this is the right solution, but it works and gets me past these errors.
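To picture the duplication described in this commit message, here is a minimal Python sketch. It does not use Sphinx or autodoc-traits at all: `merge_members` and the sample member lists are hypothetical names made up for illustration, and the actual change in autodoc-traits may look quite different.

```python
from typing import Iterable, List, Tuple

Member = Tuple[str, str]  # (attribute name, kind); illustrative only


def merge_members(*sources: Iterable[Member]) -> List[Member]:
    """Merge member lists, keeping only the first entry for each name."""
    seen = set()
    merged = []
    for source in sources:
        for name, kind in source:
            if name in seen:
                # Already collected from an earlier source: skip the duplicate.
                continue
            seen.add(name)
            merged.append((name, kind))
    return merged


# Hypothetical inputs: the same trait arrives via two collection paths.
from_members_option = [("admin_users", "trait"), ("authenticate", "method")]
from_trait_detection = [("admin_users", "trait"), ("auto_login", "trait")]

# Naive concatenation documents admin_users twice.
naive = from_members_option + from_trait_detection
# Deduplicating by name documents each attribute once.
deduplicated = merge_members(from_members_option, from_trait_detection)
```

With a strict build (warnings treated as errors), the `naive` list is the situation that would fail, since `admin_users` appears twice.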
After this, I am able to get the build to run further - but then there are other new errors. I guess sphinx really has gotten a lot stricter in more ways than one. Easy fix is removing |
Thanks for diving into autodocs! We introduced the However we don't want to block progress with this. Maybe as a compromise we can limit ourselves to sphinx<3 for now? A next step would be to start a (hopefully) small coordinated effort to get the docs building with v3. Curiously the last build on |
Discussion of docs failure moved to #3021.
78f424f
to
87704e3
Compare
Hurrah! All CI robots are happy again. Great work digging into sphinx and autotraits!
Now that CI works, would someone like to take a look at this? It is ready now; I've done multiple passes. Some hints on what to check are at the top, but roughly: a) is the placement within the docs good, or should it be somewhere else entirely? and b) fact-checking. I've done a lot with Jupyter, so I think there's not much wrong, but there's so much material that any other eyes will help.
## JupyterHub | ||
|
||
**JupyterHub** is the central piece that provides multi-user | ||
login. Despite this, the end user only briefly interacts with |
"... provides multi-user login $thing" The sentence somehow ends abruptly. Could it be "...multi-user login capabilities." or "...functionality."?
[reference](../reference/authenticators)) if the | ||
username/password is valid(&). The authenticator can also return user | ||
groups and admin status of users, so that JupyterHub can do some | ||
higher-level management. The authenticator returns a username(&), |
Can we move the "The authenticator can also return..." sentence to the end of the paragraph? Let's tell people about the main thing the authenticator does ("return a username") and then afterwards tell them about "but wait, there is more".
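For readers following the thread, the ordering suggested here (username first, extras second) can be sketched in plain Python. This is a toy model under stated assumptions, not JupyterHub's real `Authenticator` class: the function name, the in-memory credential table, and the returned dictionary keys are all hypothetical, and real deployments should never compare plain-text passwords like this.

```python
from typing import Optional

# Hypothetical in-memory credential store, for illustration only.
_PASSWORDS = {"alice": "wonderland", "bob": "builder"}
_ADMINS = {"alice"}


def authenticate(username: str, password: str) -> Optional[dict]:
    """Return user info if the credentials are valid, otherwise None."""
    if _PASSWORDS.get(username) != password:
        # Unknown user or wrong password: authentication fails.
        return None
    # The main result is the username; admin status is the optional extra.
    return {"name": username, "admin": username in _ADMINS}


valid = authenticate("alice", "wonderland")
invalid = authenticate("bob", "wrong-password")
```

The main thing the caller needs is `valid["name"]`; the `admin` flag is the "but wait, there is more" part.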
the user's notebook servers. It actually isn't directly between, | ||
because the JupyterHub **proxy** relays connections between the users | ||
and their single-user notebook servers. What this basically means is | ||
that the hub itself can shut down, and if the proxy can continue to |
Missing words in the second half of the sentence?
I think the problem was an extra "if". Removed.
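As a rough illustration of the relay behaviour the quoted paragraph describes, here is a small Python sketch of a routing table with longest-prefix matching. `ToyProxy`, its methods, and the addresses are hypothetical and are not the API of JupyterHub or its proxy; the sketch also assumes every request path starts with `/`.

```python
from typing import Dict


class ToyProxy:
    """Hypothetical stand-in for a routing proxy, for illustration only."""

    def __init__(self, hub_url: str) -> None:
        # The default route sends anything unmatched to the hub.
        self.routes: Dict[str, str] = {"/": hub_url}

    def add_route(self, prefix: str, target: str) -> None:
        """Register a single-user server under a URL prefix."""
        self.routes[prefix] = target

    def resolve(self, path: str) -> str:
        """Return the target of the longest route prefix matching path."""
        matches = [
            prefix
            for prefix in self.routes
            if path == prefix or path.startswith(prefix.rstrip("/") + "/")
        ]
        return self.routes[max(matches, key=len)]


proxy = ToyProxy(hub_url="http://127.0.0.1:8081")
proxy.add_route("/user/alice", "http://10.0.0.5:8888")

hub_target = proxy.resolve("/hub/home")
alice_target = proxy.resolve("/user/alice/tree")
```

Nothing in `resolve` consults the hub process, which is one way to see why an already registered route could keep working while the hub itself is down.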
@rkdarst as promised I took a read through of the docs here. I think this puts a lot of useful information in one place to try to dispel some folks' confusion about what's what. I made a few comments where some things may need to be clarified.
@@ -0,0 +1,465 @@ | |||
# What is Jupyter and JupyterHub? | |||
|
|||
JupyterHub is not what you think it is. Most things you think are |
"is not" -> "may not be" ?
part of JupyterHub are actually handled by some other component, for | ||
example the spawner or notebook server itself, and it's not always | ||
obvious how the parts relate. The knowledge contained here hasn't | ||
been assembled in one place before, and is essential to understand |
and -> but
In this document, we occasionally leave things out or bend the truth | ||
where it helps in explanation, and give our explanations in terms of | ||
Python even though Jupyter itself is language-neutral. The "(&)" | ||
symbol highlights important points where there is more. |
"where there is more."
Not sure what there is more of? Did the ending get chopped off or should it be "there is more to it" or something?
Before we get too far, let's remember what our end goal is. A | ||
**Jupyter Notebook** is really nothing more than a Python(&) process | ||
which is getting commands from a web browser and displaying the output | ||
via that browser. What the process actually sees can roughly like |
"can roughly like" -> "can roughly be thought of like" ?
JupyterHub: when someone wants a notebook server, the spawner allocates | ||
resources and starts the server. The notebook server could run on the | ||
same machine as JupyterHub, on another machine, on some cloud service, | ||
or even more. They can limit resources (CPU, memory) or isolate users |
Not sure who "They" refers to, is this the administrator who configured the hub?
opens in a separate tab. It is traditionally started by `jupyter | ||
notebook`. | ||
|
||
Does anything need to be said here? |
I think you don't have enough for 2 ###-level sections, maybe just smush them into the single user notebook server.
agree - I don't think we need much depth on the interfaces, maybe beyond mentioning that they'll live at different URL prefixes.
I removed the sections but not the text; someone else can do that later. Anyway, there is jupyter_server now, which will need some sort of updates here, right?
|
||
## I want to... | ||
|
||
TODO: answers to common cross-layer questions. |
I think the foregoing text actually does a good job of laying things out. You might want to omit this section for now to move forward with getting these docs integrated and then add this section later if things come up that aren't handled better any other way.
there are still plenty of details, implementations, and exceptions. | ||
When setting up JupyterHub, the first step is to consider the above | ||
layers, decide the right option for each of them, then begin putting | ||
everything together. |
Maybe cite the JupyterCon talk?
Does that mean this one? https://www.youtube.com/watch?v=JxyKBNJnfVM
Since it's mine and perhaps old, I'll let someone else decide to add it.
Great to see this PR moving forward again. I will take a more detailed look. One thing I'm thinking is that we should come up with an improved title, for example "JupyterHub Concepts for New Users" or "JupyterHub: A Conceptual Look". @choldgraf, I added you to the reviewers as well.
Thanks @willingc for pinging me. I took a pass through the document and made several suggestions and comments.
I think that this content will be really helpful for people trying to wrap their heads around JupyterHub (and the broader Jupyter ecosystem). In my opinion we should do a round or two to make sure the content is of "MVP quality" and then get it in the docs, and iterate on it over time. The PR is big enough that I worry it'll get bogged down for a long time if we try to make it perfect. Does that make sense to others?
@@ -0,0 +1,465 @@ | |||
# What is Jupyter and JupyterHub? | |||
|
|||
JupyterHub is not what you think it is. Most things you think are |
I think that we should avoid narrative flourishes like this. I think it's good writing and I enjoy it, but I don't think it's the most helpful for newcomers who are learning technical concepts. It may also make things harder to understand for non-native English speakers.
e.g., rather than saying "Most things you think are part of JupyterHub are actually handled by some other component", we could say "JupyterHub is designed in a modular fashion, and much of its functionality is handled by pluggable components."
|
||
JupyterHub is not what you think it is. Most things you think are | ||
part of JupyterHub are actually handled by some other component, for | ||
example the spawner or notebook server itself, and it's not always |
can we link to sections where we discuss these in the docs?
part of JupyterHub are actually handled by some other component, for | ||
example the spawner or notebook server itself, and it's not always | ||
obvious how the parts relate. The knowledge contained here hasn't | ||
been assembled in one place before, and is essential to understand |
I think we can remove the "story of this document" stuff like "hasn't been assembled in one place before"
|
||
In this document, we occasionally leave things out or bend the truth | ||
where it helps in explanation, and give our explanations in terms of | ||
Python even though Jupyter itself is language-neutral. The "(&)" |
Does the `&` symbol have a functional use on the page (e.g., does it create a hyperlink or something like that)? If not, I think we should either:
- Turn these into footnotes or in-line links to other sections
- Remove the `&` symbol, because I think many readers won't really understand what to do with it
I got this idea from `regex(7)`, which uses `(!)` to indicate a certain thing that is commonly said but shouldn't be spelled out each time (possibly non-portable decisions). I thought I had two choices: a) say things that are not entirely true or not the whole story, or b) very often say "but there is more/this is a simplification". Perhaps the definition of `(&)` could be c) emphasized in an admonition, or d) changed to "everything on this page may be simplified and expect some inaccuracies here". But I'd expect (d) could lead to lots of updates adding more details, making the page too long for its main purpose...
Also - somebody (@willingc? @minrk?) should enable ReadTheDocs builds for PRs so we can preview these changes! https://readthedocs.org/projects/jupyterhub/ (I don't have permissions)
RTD build is active, this needs a rebase since |
All of a sudden I'm reminded that this exists. I'll try to make more improvements to it based on the suggestions, but I'm not very good at following up with things these days, so if anyone wants to push things forward, by all means go ahead! Does anyone know if rebasing (but not renaming the file) to resolve the conflicts above will mess up the per-line issues? My thought is "probably not" but just want to make sure...
No ideas re: rebasing but I usually find it to work sensibly for this kind of thing. I am a big fan of merging this one quickly. It has a lot of useful information and I'd prefer merging something imperfect and then iterating, rather than having all of this knowledge locked up in a PR draft.
- Single-user servers are the same as you get with `jupyter notebook`.
- Kernels are by default in the single-user server environment but don't have to be.
- Apparently recommonmark does intelligently use links like sphinx+rst, and you shouldn't use `.html` on the links.
Thanks to @betatim
Co-authored-by: Tim Head <betatim@gmail.com>
- Thanks to @betatim for the suggestions.
Co-authored-by: Chris Holdgraf <choldgraf@gmail.com>
obvious how the parts relate. The knowledge contained here hasn't | ||
been assembled in one place before, and is essential to understand | ||
when setting up a sufficiently complex Jupyter(Hub) setup. | ||
|
||
This document was originally written to assist in debugging: very | ||
often, the actual problem is not where one thinks it is and thus | ||
people can't easily debug. In order to tell this story, we start at | ||
JupyterHub and go all the way down to the fundamental components of | ||
Jupyter. |
obvious how the parts relate. The knowledge contained here hasn't | |
been assembled in one place before, and is essential to understand | |
when setting up a sufficiently complex Jupyter(Hub) setup. | |
This document was originally written to assist in debugging: very | |
often, the actual problem is not where one thinks it is and thus | |
people can't easily debug. In order to tell this story, we start at | |
JupyterHub and go all the way down to the fundamental components of | |
Jupyter. | |
obvious how the parts relate. |
Removing as per suggestion
company), or whitelist only the allowed users (e.g. your group's | ||
Github usernames). Some other popular authenticators include: | ||
|
||
- **OAuthenticator** uses the standard OAuth protocol to verify users. |
yes, we should (not just here but many places scattered around here), but to get it out I'll save that for later...
what it does out of the box) and makes the hub not too dissimilar to | ||
an advanced ssh server. | ||
|
||
There are many more advanced spawners: |
I added a link to `/reference/spawners`, but didn't remove any from here, since they were somehow chosen to represent the diversity of what's available... I'll let someone else pick what to remove.
|
||
The proxy always runs as a separate process to JupyterHub (even though | ||
JupyterHub can start it for you). JupyterHub has one set of | ||
configuration options for the proxy addresses (`bind_url`) and one for |
I think I got this right now (in the push that is upcoming), might be worth someone checking that it matches modern standards...
51f8129
to
e3d645d
Compare
for more information, see https://pre-commit.ci
In the spirit of "get it out and update", I rebased onto current upstream and did the quick revisions from the reviews. There are still more extensive things to do, but I'll leave those for someone else, for later.
The linkcheck failures seem unrelated.
At the JupyterHub/BinderHub workshop, one of our ideas was to make a conceptual intro to JupyterHub so that people could know what it does, and in particular what it doesn't do (what is handled by other components). We get many issues that end up misdirected or that have a root cause of not understanding what the components are.
This PR is my initial draft - comments welcome. I've written it informally and with a certain opinion - it's supposed to be like a teacher giving a first lesson, and not a technical reference. It goes far beyond just JupyterHub, but the reason for having it here is that you really need to know this stuff once you start administering a JupyterHub; before that, you can sort of get by without a perfect mental model.
Issues I know of: