Development workflow

Paul Nulty edited this page Mar 6, 2014 · 7 revisions

General contributor guidelines

Make small, iterative changes. Contributing a new .R file to the project is a seamless and safe way to contribute, because it cannot conflict with other edits in progress or break other parts of the library. Discuss any changes to core functions, such as those in corpus.R and get-texts.R, or tokenize() and clean(), before making them (just mail Ken or Paul). Git is complicated: the longer the time between your commits and pushes to GitHub, the more our git history begins to resemble the plot of Primer.

Make a small change, test the code, add and commit the changes locally as you go, and then push to the dev branch on GitHub before you start working on another aspect of the code. Committing locally about every 30 minutes and pushing to GitHub every 2-3 hours works well, but the right interval depends on how insulated your changes are from the rest of the library.

Each function should encapsulate a single, generalizable behaviour. A function is a single operation that a user might want to perform often, or on more than one text or corpus. If a function is more than 30-50 lines of code, consider whether it can be broken into more than one operation.
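As a rough illustration of what "a single, generalizable behaviour" can look like, here is a minimal sketch in base R. The function countWords() is hypothetical and is not part of quanteda; it only shows a small function that does one thing to any number of texts, with a roxygen header covering description, param, return, and examples.

```r
#' Count the words in each text
#'
#' Hypothetical example, not part of quanteda: splits each text on
#' whitespace and counts the resulting pieces.
#'
#' @param texts a character vector, one element per text
#' @return an integer vector of word counts, the same length as \code{texts}
#' @examples
#' countWords(c("a small example", "two words"))
#' @export
countWords <- function(texts) {
  stopifnot(is.character(texts))
  # strsplit() returns one vector of pieces per text;
  # trimws() avoids an empty leading piece when a text starts with a space
  pieces <- strsplit(trimws(texts), "[[:space:]]+")
  # lengths() counts the pieces in each vector
  lengths(pieces)
}

# Works on one text or on many, without any change to the function
countWords(c("a small example", "two words"))
```

Because the function takes a character vector and returns one count per element, the same call works for a single text or for every text in a corpus.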

Suggested Roxygen and Devtools workflow

  • Make sure your local copy is synced with GitHub before any edits

  • setwd() to your local quanteda directory, e.g. setwd('Dropbox/code/quanteda/')

  • Clean the workspace, and call library(devtools)

  • Make a temporary, local (informal) test script that starts with load_all() and calls the function you want to create or edit. Because you might edit more than one .R file at once, load_all() ensures the latest local versions are sourced each time you run your test script

  • Iteratively edit the quanteda file you are changing and run your test script to observe the results until you are finished.

  • All documentation should be in the source code, following the roxygen2 format: https://github.com/yihui/roxygen2#why-use-roxygen2

  • Document at minimum the description, return, param, and examples

  • Use data() calls to load data for examples in roxygen

  • Make sure to add the roxygen @export tag to every public-facing function (don't edit NAMESPACE manually)

  • Follow the style guide

  • When finished:

      • Call document() to generate the roxygen documentation and NAMESPACE

      • Call check() to see that everything works

          • If check() fails with an opaque error in the examples, try calling document(clean=TRUE) before check()

          • To see whether a check() error comes from the examples, try run_examples()

      • If check() fails, don't commit to GitHub.

      • After check() has passed, commit to GitHub (dev branch); see the git guide

      • If you're really paranoid about breaking something, you can then try install_github("kbenoit/quanteda", ref="dev") and run the amicus example from the GitHub readme.
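
The steps above can be sketched as a pair of small helper functions. This is only a sketch, assuming devtools is installed and that the working directory is the package root; the helper names try_my_change() and finish_up() are hypothetical and exist only for illustration. The devtools calls are wrapped in functions so that nothing runs until you call them yourself.

```r
# A throwaway test script: load_all() re-sources every .R file in the
# package, so edits spread across several files are picked up together.
try_my_change <- function(pkg = ".") {
  devtools::load_all(pkg)
  # Call the function you are creating or editing here.
  invisible(TRUE)
}

# The checks to run when finished, in the order listed above.
finish_up <- function(pkg = ".") {
  devtools::document(pkg)  # regenerate the Rd files and NAMESPACE from roxygen
  devtools::check(pkg)     # full package check; commit only if this passes
}

# If check() reports an error, this helps narrow down whether the
# examples are the cause.
diagnose_examples <- function(pkg = ".") {
  devtools::run_examples(pkg)
}
```

A typical session would call try_my_change() repeatedly while editing, then finish_up() once before committing to the dev branch.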