
The Datahub Workflow

Anthony Suen edited this page Jun 27, 2017 · 3 revisions

Workflow Steps

Below are the steps in the general workflow to start distributing content to students in the class. Datahub runs on Jupyterhub and leverages Jupyter Notebooks, Github, and interact links to create, store, and distribute notebooks in data science courses.

Workflow

  1. Create a notebook.
  2. Push the notebook to Github.
  3. Distribute the notebook to students.
  4. Students submit their completed notebooks to you.

Derived from Patty's description of her workflow.

1. How to Create a new Python notebook in DataHub

This document describes one possible workflow for this process. See the DS8 Connector-Instructors Workflow Wiki for another.

ASSUMPTIONS:

CREATE NEW ASSIGNMENT

  1. I log in to my account on the Datahub.
  2. I create a new notebook for my students, or I upload one that I've created on my computer.
  • Once I'm logged in, I can click the Upload button to upload files, or New > Folder to create a new folder for my assignment.
  3. I upload any data files I need into the same jupyter folder (this keeps things simple).
  4. I run my notebook on the jupyter hub to make sure it works.

THIS STEP IS ESSENTIAL, as your local python environment may not be the same as the one on the data8 jupyter hub.

  5. I clear all outputs from my notebook to reduce the file size (Cell > All Output > Clear).

  6. I download my notebook if I have made any changes.
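If you prefer to clear outputs on a downloaded copy rather than through the menu, the step above can be sketched in Python. This is a minimal sketch, not part of the original workflow: it assumes the notebook is a standard version 4 `.ipynb` file (which is plain JSON), and the function names `clear_outputs` and `clear_outputs_in_file` are made up for illustration.

```python
import json


def clear_outputs(notebook: dict) -> dict:
    """Remove outputs and execution counts from every code cell.

    The dict is modified in place and also returned for convenience.
    """
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_outputs_in_file(path: str) -> None:
    """Load a notebook file, clear its outputs, and write it back."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    clear_outputs(notebook)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1)
        f.write("\n")
```

Running the notebook again on the hub after clearing is still the safest check that nothing was lost.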

ADD NEW ASSIGNMENT TO GITHUB

You can download your notebook & data files and create a zipfile that you add to a bCourses assignment. If you do this, your students need to download it from bCourses and upload it to the DS8 jupyter hub. Using github makes the process of opening a new notebook in the Jupyter hub as easy as opening a url. But it creates more work for you.
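For the bCourses route, the zipfile can also be built with a short script instead of by hand. The following is only a sketch under assumptions: the notebook and data files sit together in one local folder (here called `hw1`, matching the folder name used later on this page), and the helper name `bundle_assignment` is hypothetical.

```python
import zipfile
from pathlib import Path


def bundle_assignment(folder: str, zip_path: str) -> list[str]:
    """Zip every file under `folder` and return the archived names."""
    root = Path(folder)
    target = Path(zip_path).resolve()
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for file in sorted(root.rglob("*")):
            # Skip directories and the archive itself if it lives in `folder`.
            if not file.is_file() or file.resolve() == target:
                continue
            # Keep the folder name as a prefix so files unpack into hw1/.
            arcname = file.relative_to(root.parent).as_posix()
            zf.write(file, arcname)
            names.append(arcname)
    return names
```

For example, `bundle_assignment("hw1", "hw1.zip")` would produce an archive that students can download from bCourses and upload to the hub.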

The process below is one of the easiest ways to add notebooks to your connector github repo.

  1. To add my assignment to github, I open my connector Github repository site in my web browser, for me: https://github.com/data-8/geospatial-connector

  2. I create a new folder for my assignment, e.g. hw1/readme.md, by clicking on the Create new file button and typing hw1/.

  • That forward slash makes a folder, but the folder cannot be empty, so I add a readme.md file.
  3. I add a message to the readme.md file about the homework, e.g. This is homework 1.
  4. I scroll down to add a one-line commit message like Created a new folder for HW1 and then click Commit new file.
  5. In github, I click on the new folder to change to that directory.
  6. Then I upload my new assignment to this folder by clicking on "Upload files".
  • If the Upload files or Create new file button is not active, then you do not have the correct permissions for your github repo.
  7. I scroll down to add a one-line commit message like added homework data and notebook files and then click Commit new file.

OPEN GITHUB NOTEBOOK IN THE JUPYTER HUB

  1. Once I create my notebook, I add it to bCourses as an assignment that links to the notebook. To create an interact link that allows my students to open the notebook on the class jupyter hub, I update the url below to point to the folder with my assignment.
  • http://datahub.berkeley.edu/user-redirect/interact?repo=<repo_name>&path=<path_name>

For example:

  2. And then I open the link and run the whole notebook again to make sure nothing broke.
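Filling in the interact link template by hand is error-prone, so a small helper can build it. This is a sketch, not an official Datahub tool: the function name `interact_link` is hypothetical, and the values `geospatial-connector` and `hw1` are taken from the repository and folder mentioned earlier on this page purely for illustration. Check that the resulting link opens correctly before handing it to students.

```python
from urllib.parse import urlencode

# Base of the interact link template given on this page.
DATAHUB_INTERACT = "http://datahub.berkeley.edu/user-redirect/interact"


def interact_link(repo_name: str, path_name: str) -> str:
    """Fill <repo_name> and <path_name> into the interact link template."""
    # safe="/" keeps folder separators readable instead of encoding them as %2F.
    query = urlencode({"repo": repo_name, "path": path_name}, safe="/")
    return f"{DATAHUB_INTERACT}?{query}"


# Illustrative values only; substitute your own repo and assignment folder.
print(interact_link("geospatial-connector", "hw1"))
```

Using `urlencode` rather than string concatenation means spaces or other special characters in a folder or file name are escaped automatically.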

Tips