Storing Notebooks on GitHub
- Web Interface
- GitHub
- Outside Hosts
- Shared Copy on JupyterHub
- Direct Upload
- Large Datasets
- Command Line
- Additional Resources
You can perform many actions, such as uploads and downloads, directly through GitHub's web interface, without having to use the command line. The directions below explain how to upload assignments to GitHub. If you did your development on JupyterHub, first download the notebook onto your computer. Then go to your connector's GitHub repository and click Upload Files on the right side.
You can drag and drop your desired files onto the page. Then, write a short sentence describing the files you're adding. This short sentence is called a commit message.
You will then see an option to select the branch for your changes. The default for most repositories will be the master branch. If you are a Git beginner, you can stick to the default and add your changes to the master branch. If you are a more advanced Git user and want to use different branches, you may want to select the option to create a new branch. Please see the additional GitHub resources on this page to learn more about branching.
Once you've gone through the above steps, you can save your changes. A set of changes in Git is called a commit.
Datasets and the corresponding Jupyter Notebook can be stored in a folder on GitHub. You can then create an interact link for the entire folder. When students click this link, the entire folder will appear on their JupyterHub account.
You can store the data on an online host such as Box, Google Drive, or even GitHub. You can then include a cell with this download_dataset function or have students read the data directly via URL. The read_table function for the Table data structure supports URLs.
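As a minimal sketch of what such a cell might look like, the helper below downloads a file from a URL unless a local copy already exists. The body of download_dataset shown here is an assumption, not the course's actual implementation, and the URL and filename in the usage comments are placeholders.

```python
import os
import urllib.request


def download_dataset(url, filename):
    """Download url to filename unless the file already exists.

    Hypothetical helper: a sketch, not the course's actual implementation.
    """
    if os.path.exists(filename):
        # Skip the download so that re-running the cell is cheap.
        return filename
    urllib.request.urlretrieve(url, filename)
    return filename


# Example usage with a placeholder URL:
# download_dataset("https://example.com/data.csv", "data.csv")
# from datascience import Table
# table = Table.read_table("data.csv")  # read_table can also take the URL itself
```

Skipping the download when the file is already present keeps repeated runs of the notebook from fetching the same data again.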
Contact us on Piazza if you want your data to be saved in a shared folder on JupyterHub directly. Notebooks stored on JupyterHub will be able to access this data. This is the preferred method for large datasets.
Students can directly upload data files that you provide them onto their JupyterHub accounts. This method can get messy if notebooks expect the data to be stored at a certain filepath and students upload the files to a different location. Therefore, we recommend using the other methods listed on this page.
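If you do rely on direct uploads, one way to reduce the confusion is to have the notebook check for the file before reading it and report where it looked. This is only a sketch: the function name require_data_file is hypothetical, and the path in the usage comment is a placeholder.

```python
from pathlib import Path


def require_data_file(path):
    """Return the resolved path, or raise with a hint if the file is missing.

    Hypothetical helper for illustration only.
    """
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(
            f"Expected data file at {resolved}. "
            "Upload it to this location or update the path in the notebook."
        )
    return resolved


# Example usage with a placeholder path:
# data_path = require_data_file("~/health-connector/data.csv")
```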
For datasets on the order of gigabytes, we recommend that you contact us regarding hosting a shared copy on JupyterHub. You can also use outside hosts and provide students with a URL to the data, which they can then read into a Table or other data structure.
GitHub can also be used via the command line. You can store your connector's Git repository locally and use a local terminal application to access the command line. You can also store the repository on datahub.berkeley.edu and use the terminal that is present on the JupyterHub site. The instructions below are tailored towards command line use over JupyterHub, but the commands listed can be run on a local terminal as well.
You can access the terminal on JupyterHub by clicking on the New dropdown, and then clicking on Terminal.
You will then see a terminal page in the browser.
In order to push to your connector's repository, you must have the repository downloaded (also known as cloned). If you have not yet cloned the repository, type the command below into the terminal. The <repo_name> is the name of the repository for your connector. The repository names are listed at https://github.com/data-8. Once you run the command, you will see a folder for your repository in your home directory on JupyterHub. You only need to do this step once.
git clone https://github.com/data-8/<repo_name>
For example, if your repository is called health-connector, you'd type:
git clone https://github.com/data-8/health-connector
After this step, you should be able to see your connector's folder at https://datahub.berkeley.edu. Create, upload, or move content (Notebooks, datasets, etc.) into the folder. For more information on creating Notebooks, see this page. For more information on storing datasets, see this page. Once you have your content in the newly created connector repository folder, you can follow the steps below on the terminal to push to GitHub.
cd ~/<repo_name>
git status
You should see something that lists the files you've changed or added. If your files don't show up, ensure that they are in your repo's folder.
git add -A
git commit -m "Update"
git push origin master
If the push is successful, you should be able to go to GitHub and see the newly uploaded file in the connector repo. If you run into something that looks like the error below, contact us on Piazza and we will make sure you have the permissions needed.
ERROR: Permission to data-8/some-connector.git denied
Here are the above commands, consolidated. This workflow is intended for Git beginners. Git offers many additional features that are not demonstrated in these steps.
git clone https://github.com/data-8/<repo_name>
cd ~/<repo_name>
git status
git add -A
git commit -m "Update"
git push origin master
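If you would rather run these steps from a notebook cell than from the terminal, the same commands could be wrapped in Python along the following lines. This is a sketch under assumptions: the function names push_commands and push_changes are hypothetical, the repository is assumed to be already cloned, and the default is a dry run that only prints the commands.

```python
import subprocess
from pathlib import Path


def push_commands(message="Update", branch="master"):
    """Return the Git commands from the steps above as argument lists."""
    return [
        ["git", "status"],
        ["git", "add", "-A"],
        ["git", "commit", "-m", message],
        ["git", "push", "origin", branch],
    ]


def push_changes(repo_dir, message="Update", branch="master", dry_run=True):
    """Run (or, by default, only print) the commands inside repo_dir."""
    repo = Path(repo_dir).expanduser()
    for cmd in push_commands(message, branch):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failing command, e.g. a denied push.
            subprocess.run(cmd, cwd=repo, check=True)


# Example usage (prints the commands without running them):
# push_changes("~/health-connector", message="Add lab 1 notebook")
```

Passing a descriptive message instead of the default "Update" makes the commit history easier to read later.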
Web Interface
- Managing Files - contains information under the "Managing Files on GitHub" section on how to perform many basic file operations using the GitHub web interface.
- Hello World Exercise - a short exercise that walks you through additional GitHub features such as branches and pull requests.
Command Line
- Atlassian Tutorials - tutorials for Git users of different levels.
Desktop GUI
- Desktop GUI site - information on using a GitHub desktop GUI.