Google Summer of Code 2019

James Gaboardi edited this page Mar 1, 2019 · 10 revisions

PySAL is inviting students to join in PySAL's development by applying for Google Summer of Code 2019. This is the fourth year PySAL will be seeking to participate, and we hope to again work under the umbrella of the Python Software Foundation (PSF).

Introduction

PySAL is an open source library of spatial analysis functions written in Python, intended to support the development of high-level applications. See our documentation for more details. The developer guide describes in more detail how to contribute to PySAL and the workflow we follow. Our issues are also on GitHub, and include bug reports, 'wishlist' items, and enhancement plans and ideas.

If you are interested in participating in GSoC as a student, the best approach is to become an active and engaged contributor to the project right away. You should take a look at some of the existing issues on GitHub and see if there are any you think you might be able to take a crack at. Try submitting a pull request for something and start getting the hang of the process and interacting with the PySAL code base and development community.

Guidelines and Prerequisites

Students should start by reading the guidelines for participation. Google also provides guidelines to help with writing a proposal as part of their GSoC Student Guide. It is a good idea to start on your proposal early, post a draft to the pysal chat room and iterate based on the feedback you receive. This will not only improve the quality of your proposal, but also help you find a suitable mentor.

Please note that as a sub-organization of the PSF (and active members of the Python community), we ask that all mentors and students working with PySAL abide by the Python Community Code of Conduct.

Project Ideas

Below is a list of possible projects that students might consider. We also encourage students to propose their own projects, though several of the following topics are relatively high on our priority list. Our priority list is flexible, and it is important that the topic matches the interest and background of the student.

When considering the following projects, don't be put off by the knowledge prerequisites -- you don't need to be an expert, and there is some scope for research and learning within the GSoC period. However, familiarity with and interest in the subject area and involved technologies will be helpful!

Geovisualization Module

PySAL was originally conceived as a library implementing advanced spatial statistics and econometric methods. Given that there were many different visualization toolkits in the Python ecosystem, as well as GIS packages, visualization was not a focus of our library. Over time, however, users of PySAL wanted the ability to visualize the results of the computations that the analytical components provided. In response, a contributed module, viz, was developed to explore alternative approaches to providing lightweight visualization for PySAL.

The goal of the viz module is to provide a simple to use and lightweight interface that connects PySAL to different popular visualization toolkits. While much progress has been made, there is more that can be done on the viz project as the visualization space is one that is constantly evolving.

Specific activities for the viz project include:

  • Refinement and extension of the matplotlib interface (e.g. legends, views for analytics, regression object plots)
  • Development of interactive visualizations in jupyter
  • Exploration of potential interfaces for alternative packages (e.g., Bokeh, folium, D3)
  • Exploration of collaboration with geopandas

Difficulty level: intermediate

Mentors: Dani Arribas-Bel, Serge Rey, Joris Van den Bossche, James Gaboardi

Bayesian Spatial Models

It has long been possible to estimate many of the models in pysal.spreg using Bayesian methods. However, because common Bayesian spatial analysis packages lack support for the simultaneous autoregressive specifications, many statistical users end up writing custom Gibbs samplers for new model specifications.

To help the Bayesian computation community in Python and the spatial analysis community generally, this project would demonstrate implementations of the common simultaneous autoregressive spatial model specifications in pysal.spreg. Alongside spatial Gaussian process models, conditional autoregressive models, geographically weighted regressions, or spatial kernel regressions, these would provide a set of common reference implementations for Bayesian spatial econometrics. The implementations could target either PyMC3 or Stan; the goal is to provide examples that allow HMC techniques to be used to estimate common spatial econometric models.

To make these estimation techniques efficient, we anticipate that interested candidates may need familiarity with sparse matrix techniques and libraries in Python, namely tensorflow.sparse and scipy.sparse. This module may be rolled together with the new multilevel model estimators in spvcm or the local model estimators in mgwr. Together, this would include any custom classes, distributions, or utilities required to state and estimate models efficiently in either PyMC3 or Stan, as well as examples demonstrating how to do so.
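As an illustration of where sparse techniques typically matter: the likelihood of a simultaneous autoregressive model includes a log-determinant term, log|I - rho W|, which becomes expensive with dense matrices as the number of observations grows. The sketch below computes it with a sparse LU factorization from scipy.sparse. It is only a minimal example under stated assumptions, not part of PySAL: the function names `lattice_weights` and `sar_logdet` are hypothetical, and the weights matrix is a toy rook-contiguity lattice rather than real data.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu


def lattice_weights(nrows, ncols):
    """Row-standardized rook-contiguity weights for a regular lattice.

    Hypothetical helper for illustration; assumes nrows >= 2 and ncols >= 2.
    """
    def chain(n):
        # Adjacency of a 1-D chain: each cell neighbors the cells beside it.
        ones = np.ones(n - 1)
        return sp.diags([ones, ones], offsets=[-1, 1], shape=(n, n))

    # Rook neighbors on a grid: same row (left/right) plus same column (up/down).
    adjacency = (
        sp.kron(sp.identity(nrows), chain(ncols), format="csr")
        + sp.kron(chain(nrows), sp.identity(ncols), format="csr")
    )
    row_sums = np.asarray(adjacency.sum(axis=1)).ravel()
    # Divide each row by its sum so that every row of W sums to one.
    return sp.csr_matrix(sp.diags(1.0 / row_sums) @ adjacency)


def sar_logdet(rho, W):
    """log|det(I - rho * W)| via sparse LU (hypothetical helper)."""
    n = W.shape[0]
    A = sp.identity(n, format="csc") - rho * W.tocsc()
    lu = splu(A)
    # SuperLU returns L with a unit diagonal, and the permutations only
    # change the sign, so the log absolute determinant comes from U alone.
    return float(np.sum(np.log(np.abs(lu.U.diagonal()))))


if __name__ == "__main__":
    W = lattice_weights(30, 30)  # 900 observations
    print(sar_logdet(0.5, W))
```

In a sampler this quantity would be re-evaluated for each proposed value of rho, which is why the sparse factorization (or a precomputed approximation) is usually worth the effort.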

Skills:

  • Familiarity with TensorFlow, NumPy, Stan, and PyMC3
  • Background or familiarity with econometric methods and techniques
  • Basic understanding of Bayesian statistics, particularly Bayesian linear models or Gaussian process models

Related Readings:

  • Banerjee, S., B. Carlin, and A. Gelfand. 2014. Hierarchical Modeling and Analysis for Spatial Data
  • LeSage, J. and R.K. Pace. 2010. Introduction to Spatial Econometrics

Related Viewings:

Difficulty level: intermediate

Mentors: Levi John Wolf, Serge Rey, Wei Kang

Other

PySAL is an open source project and, as such, we invite contributions from any interested developer. If you have an idea for an enhancement to PySAL, please contact one of the developers to discuss the possibilities for a GSoC 2019 project.

Some of the above guidelines were 'borrowed' from previously successful GSoC Mentoring Organizations, such as Julia and Statsmodels.

Timeline

  • January 15 - February 4: sub-organization applications due to the PSF
  • February 26: organizations announced
  • February 27 - March 20: students discuss applications with mentoring organizations
  • March 25 - April 9: student application period
  • May 6: accepted student proposals announced
  • May 6 - May 27: community bonding
  • May 27 - August 26: coding
  • September 3: results announced

Source: https://developers.google.com/open-source/gsoc/timeline

Student Application Template

Python Software Foundation's student application template.