Taylor Oshan edited this page Aug 16, 2016 · 16 revisions

Progress

Proposal Description/Timeline

  • Generalized linear model (GLM) base class for modeling count data (Poisson model). (Week 1 & 2; ~ May 23rd - June 3rd) Blog Post 1 / Blog Post 2 / Blog Post 3 / Blog Post 4 / Blog Post 5

  • Zero flows, zero-inflation, overdispersion, and heteroskedasticity. (Week 3 & 4; ~ June 6th - June 17th) Blog Post 6

    • Tests for overdispersion
    • Poisson Quasi Maximum Likelihood
    • Unit tests/documentation
  • Exploratory tools. (Week 5 & 6; ~ June 20th - July 1st)

    • Vector-based spatial autocorrelation statistic.
    • Vector randomization for permutation-based hypothesis testing of vector spatial autocorrelation
    • Automate origin/destination specific calibration to investigate non-stationary processes
    • Unit tests/documentation
  • Flow-based spatial weight specifications. (Week 7 & 8; ~ July 4th - July 15th)

    • Origin-destination weights
    • Network origin-destination weights
    • Unit tests/documentation

  • Spatial autoregressive (SAR) specifications. (Week 9 & 10 & 11; ~ July 18th - August 5th)

    • Log-normal SAR (Not production ready, but explored)
    • Unit tests/documentation
  • Wrap up and prepare module for release. (Week 12 & 13; ~ August 8th - August 23rd)

    • Optimize code
    • Coefficient estimation via maximum likelihood and gradient optimization (using scipy and/or autograd)
    • Zero-inflated Poisson Model
    • Double check tests/documentation
    • Finalize educational materials and provide sample analysis workflow using exploratory tools, diagnostic tests, and formal models
  • Additional goals if there is any extra time and project is ahead of schedule:

    • Competing destinations specifications
    • Spatial eigenvector filter (SF) specifications
    • Non-parametric “universal” model varieties
    • Non-parametric Neural Network routines for calibrating spatial interaction models

General

For general development issues.

SAR (lag) models

For discussion and notes pertaining to the Poisson SAR model and its theory, estimation, and implementation.

Literature

  1. Spatial Econometric Modeling of Origin-Destination Flows

Summary

Questions/Issues

  2. [A Spatial Autoregressive Poisson Gravity Model](http://onlinelibrary.wiley.com/doi/10.1111/gean.12007/abstract)

Summary

They propose a Poisson model for flows with an autoregressive component made up of an origin-based dependence and a destination-based dependence. They omit the third type of dependence originally proposed by LeSage & Pace (2008), an origin-destination-based dependence, and do not really state why. They suggest a two-stage nonlinear least squares estimator for the model. Interestingly, this estimator assumes that the sum of the spatial autocorrelation parameters on the origin-based dependence and the destination-based dependence is less than or equal to one. There is no mention of what happens when this assumption is violated, i.e., when the two parameters sum to more than 1. Furthermore, using the two-stage estimation routine means that final estimates of the spatial autocorrelation parameters are never actually obtained. Simulation results show that the estimator is unbiased, but these results rest on a relatively unrealistic sample (a linear arrangement of units in which every unit except the first and last has two neighbors) and on the premise that the spatial structure has been correctly specified (first-order contiguity).
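To make the two dependence structures and the sum constraint concrete, here is a minimal sketch (not the paper's code). It builds origin-based and destination-based weights as Kronecker products in the style of LeSage & Pace (2008), using the same linear first-order contiguity layout as the simulation. The function names, the values of `rho_o` and `rho_d`, and the stacking order of the flow vector (index `o * n + d`, origin as the slow index) are assumptions made for illustration; with a different stacking order the two Kronecker products swap roles.

```python
import numpy as np


def linear_contiguity(n):
    """Row-standardized first-order contiguity for n units arranged on a line.

    Interior units have two neighbors; the first and last have one.
    """
    W = np.zeros((n, n))
    for i in range(n - 1):
        W[i, i + 1] = 1.0
        W[i + 1, i] = 1.0
    return W / W.sum(axis=1, keepdims=True)


def od_weights(W):
    """Origin- and destination-based weights for the n*n stacked flow vector.

    Assumes flows are stacked so that element o * n + d is the flow from
    origin o to destination d (an illustrative convention, see lead-in).
    """
    n = W.shape[0]
    I = np.eye(n)
    W_o = np.kron(W, I)  # averages flows from neighboring origins to the same destination
    W_d = np.kron(I, W)  # averages flows from the same origin to neighboring destinations
    return W_o, W_d


def spectral_radius(A):
    """Largest eigenvalue modulus of a square matrix."""
    return np.max(np.abs(np.linalg.eigvals(A)))


# Hypothetical parameter values chosen so that rho_o + rho_d <= 1.
rho_o, rho_d = 0.4, 0.3
W = linear_contiguity(5)
W_o, W_d = od_weights(W)
A = rho_o * W_o + rho_d * W_d

print("shape of combined filter:", A.shape)
print("spectral radius:", spectral_radius(A))
```

Because both weight matrices are row-standardized and non-negative, every row of `A` sums to `rho_o + rho_d`, so with non-negative parameters the spectral radius of the combined matrix equals that sum. This is one way to see why a bound on the sum of the two parameters matters: once it exceeds 1, `I - A` is no longer guaranteed to be invertible.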

Questions/Issues

  1. In this paper the authors state that they are using a nonlinear least squares estimator because the maximum likelihood estimate of the multivariate Poisson distribution does not have an analytically closed form. In the notes they then clarify that it is possible to write out the likelihood but that computationally intensive recursive algorithms are needed to compute it. What's more confusing to me is how/why they made the jump to the multivariate Poisson distribution. On pages 181-182 they describe the basic Poisson distribution and its ML estimator. They then introduce the SAR component and discuss dispersion properties and model interpretation (i.e., direct and indirect effects), before finally describing their estimator, where they have now assumed a multivariate Poisson (p. 188). Does anyone have any insights or literature on how/why they made the move to the multivariate Poisson given the introduction of the SAR component?
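For reference, the basic (independent) Poisson case is the one where maximum likelihood is straightforward. The sketch below fits a log-linear Poisson regression by Newton-Raphson on simulated data. It is only the standard textbook estimator, not the paper's two-stage nonlinear least squares routine and not a multivariate Poisson; the function names, the simulated covariate, and the true coefficients are all hypothetical.

```python
import numpy as np


def poisson_loglik(beta, X, y):
    """Poisson log-likelihood with a log link, up to the constant -sum(log(y!))."""
    eta = X @ beta
    return np.sum(y * eta - np.exp(eta))


def fit_poisson(X, y, tol=1e-10, max_iter=50):
    """Maximum likelihood estimate of beta via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        mu = np.exp(X @ beta)          # fitted means under the log link
        score = X.T @ (y - mu)         # gradient of the log-likelihood
        hessian = -(X.T * mu) @ X      # second derivative, negative definite
        step = np.linalg.solve(hessian, score)
        beta = beta - step             # Newton update
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Simulated data with hypothetical true coefficients.
rng = np.random.default_rng(0)
n_obs = 2000
x = rng.normal(size=n_obs)
X = np.column_stack([np.ones(n_obs), x])
beta_true = np.array([0.5, 0.3])
y = rng.poisson(np.exp(X @ beta_true))

beta_hat = fit_poisson(X, y)
final_score = X.T @ (y - np.exp(X @ beta_hat))
print("estimates:", beta_hat)
print("log-likelihood at estimates:", poisson_loglik(beta_hat, X, y))
```

Here each observation contributes its own term to the log-likelihood because the counts are treated as independent given the covariates. Once a SAR component links the flows to one another, that factorization no longer holds, which is presumably where a joint (multivariate) distribution enters the picture.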