Google Summer of Code 2018

Levi John Wolf edited this page Jan 17, 2019 · 1 revision

PySAL is inviting students to join in PySAL's development by applying for Google Summer of Code 2018. This is the third year PySAL will be seeking to participate, and we hope to again work under the umbrella of the Python Software Foundation (PSF).

Introduction

PySAL is an open source library of spatial analysis functions written in Python, intended to support the development of high-level applications. See our documentation for more details. The developer guide describes in more detail how to contribute to PySAL and the workflow we use for contributions. Our issues are also on GitHub; they include bug reports, 'wishlist' items, and enhancement plans and ideas.

If you are interested in participating in GSoC as a student, the best approach is to become an active and engaged contributor to the project right away. You should take a look at some of the existing issues on GitHub and see if there are any you think you might be able to take a crack at. Try submitting a pull request for something and start getting the hang of the process and interacting with the PySAL code base and development community.

Guidelines and Prerequisites

Students should start by reading the guidelines for participation. Google also provides guidelines to help with writing a proposal as part of their GSoC Student Guide. It is a good idea to start on your proposal early, post a draft to the pysal-dev mailing list and iterate based on the feedback you receive. This will not only improve the quality of your proposal, but also help you find a suitable mentor.

Please note that as a sub-organization of the PSF (and active members of the Python community), we ask that all mentors and students working with PySAL abide by the Python Community Code of Conduct.

Project Ideas

Below is a list of possible projects that students might consider. We also encourage students to propose their own projects, though several of the following topics are relatively high on our priority list. Our priority list is flexible, and it is important that the topic matches the interest and background of the student.

When considering the following projects, don't be put off by the knowledge prerequisites -- you don't need to be an expert, and there is some scope for research and learning within the GSoC period. However, familiarity with and interest in the subject area and involved technologies will be helpful!

Geovisualization Module

PySAL was originally conceived as a library implementing advanced spatial statistics and econometric methods. Because the Python ecosystem already offered many visualization toolkits and GIS packages, visualization was not a focus of our library. Over time, however, users of PySAL wanted to visualize the results of the computations that the analytical components provide. In response, a contributed module, viz, was developed to explore alternative approaches to providing lightweight visualization for PySAL.

The goal of the viz module is to provide a simple-to-use, lightweight interface that connects PySAL to popular visualization toolkits. While much progress has been made, there is more to do on the viz project, because the visualization space is constantly evolving.

Specific activities for the viz project include:

  • Refinement and extension of the matplotlib interface (e.g. legends, views for analytics, regression object plots)
  • Development of interactive visualizations in jupyter
  • Exploration of potential interfaces for alternative packages (e.g., Bokeh, folium, D3)
  • Exploration of collaboration with geopandas

Difficulty level: intermediate

Mentors: Dani Arribas-Bel, Serge Rey, Joris Van den Bossche

Bayesian Spatial Models

Bayesian estimation methods have long existed for many of the models in pysal.spreg. However, because common Bayesian spatial analysis packages lack support for the simultaneous autoregressive (SAR) specifications, many statistical users end up writing custom Gibbs samplers for new model specifications.

To help the Bayesian computation community in Python and the spatial analysis community generally, a project demonstrating implementations of the common SAR specifications in pysal.spreg, in addition to spatial Gaussian process models, would provide a set of common reference implementations for Bayesian spatial econometrics. These implementations could target either PyMC3 or Stan, but the goal would be to provide examples that allow Hamiltonian Monte Carlo (HMC) techniques to be used to estimate common spatial econometric models.

To make these estimation techniques efficient, we anticipate that interested candidates may need familiarity with sparse matrix techniques and libraries in Python, namely theano.sparse and scipy.sparse. This module may be rolled together with the new multilevel SAR-Error model estimators in spvcm. Together, this would include any custom classes, distributions, or utilities required to state and estimate models efficiently in either PyMC3 or Stan, as well as examples demonstrating how to do so.
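
As a rough illustration of why sparse matrix techniques matter here, the sketch below evaluates the Gaussian log-likelihood of a spatial lag model, y = rho * W y + X beta + e, using scipy.sparse. The costly term is the log-determinant of (I - rho * W), which is taken from a sparse LU factorization. The helper names ring_weights and sar_lag_loglik are hypothetical and are not part of pysal.spreg or spvcm; the ring-shaped weights matrix is only a toy assumption, and a PyMC3 or Stan implementation would need its own equivalent of this term.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import splu


def ring_weights(n):
    """Hypothetical helper: row-standardized weights for n units on a ring,
    where each unit's neighbours are the units immediately before and after it."""
    idx = np.arange(n)
    rows = np.concatenate([idx, idx])
    cols = np.concatenate([(idx - 1) % n, (idx + 1) % n])
    vals = np.full(2 * n, 0.5)
    return sparse.csc_matrix((vals, (rows, cols)), shape=(n, n))


def sar_lag_loglik(rho, beta, sigma2, y, X, W):
    """Hypothetical helper: log-likelihood of y = rho*W*y + X*beta + e,
    with e ~ N(0, sigma2 * I)."""
    n = y.shape[0]
    A = (sparse.identity(n, format="csc") - rho * W).tocsc()
    # log|det(A)| from the sparse LU factors: L has a unit diagonal, and the
    # permutations only affect the sign, so only U's diagonal contributes.
    logdet = np.sum(np.log(np.abs(splu(A).U.diagonal())))
    resid = A @ y - X @ beta
    return (logdet
            - 0.5 * n * np.log(2.0 * np.pi * sigma2)
            - resid @ resid / (2.0 * sigma2))


# Usage on simulated data (all values here are arbitrary).
rng = np.random.default_rng(0)
n = 200
W = ring_weights(n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.normal(size=n)
ll = sar_lag_loglik(0.4, np.array([1.0, -0.5]), 1.5, y, X, W)
```

A sampler would evaluate a function like this at every proposed value of rho, which is why keeping W sparse, rather than forming a dense n-by-n matrix, tends to dominate the cost for large n.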

Skills:

  • Familiarity with Theano, NumPy, Stan, and PyMC3
  • Background or familiarity with econometric methods and techniques
  • Basic understanding of Bayesian statistics, particularly Bayesian linear models or Gaussian process models

Related Readings:

  • Banerjee, S., B. Carlin, and A. Gelfand. 2014. Hierarchical Modeling and Analysis for Spatial Data
  • LeSage, J. and R.K. Pace. 2010. Introduction to Spatial Econometrics

Difficulty Level: intermediate

Mentors: Levi John Wolf, Serge Rey

PySAL Refactoring

We have begun a major refactoring of the library as described here. This project would be perfect for a student who is interested in learning how a library can be redesigned for improved modularity, while maintaining backwards compatibility.

Difficulty level: intermediate

Mentors: Serge Rey, Dani Arribas-Bel, Wei Kang

Other

PySAL is an open source project, and as such we invite contributions from any interested developer. If you have an idea for an enhancement to PySAL, please contact one of the developers to discuss the possibilities for a GSoC 2018 project.

Some of the above guidelines were 'borrowed' from previously successful GSoC Mentoring Organizations, such as Julia and Statsmodels.

Timeline

  • January 4 - 19: Sub-organization applications due
  • February 12: Organizations announced
  • February 27 - March 20: Students discuss applications with mentoring organizations
  • March 12 - March 27: Student application period
  • April 23: Accepted student proposals announced
  • April 23 - May 13: Community bonding
  • May 14 - August 14: Coding
  • August 22: Results announced

Source: https://summerofcode.withgoogle.com

Student Application Template

Python Software Foundation's student application template.