[WIP] Local (aspatial) Correlation #89

ljwolf · 2019-09-17T11:09:23Z

Correlation coefficients can also be "localized" as a LISA:

r[i] = (x[i]*y[i]) / numpy.sqrt((x**2).sum() * (y**2).sum())

for any two variates x,y. This provides the "contribution" each site makes to the global correlation between two variables. When the statistic is large (close to one/negative one), the site contributes to the correlation in the direction of the sign of this local statistic. When this is small (close to zero), the site isn't as important to the correlation.
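A minimal sketch of that decomposition, assuming both variates are mean-centered first (the formula above only recovers Pearson's r under that assumption). The data are simulated purely for illustration:

```python
import numpy

rng = numpy.random.default_rng(0)
x = rng.normal(size=50)
y = 0.6 * x + rng.normal(size=50)

# Assumption: center both variates before applying the local formula.
xc = x - x.mean()
yc = y - y.mean()

# Local contribution of each site to the global correlation.
r_local = (xc * yc) / numpy.sqrt((xc**2).sum() * (yc**2).sum())

# The local values add up to the global Pearson coefficient.
r_global = numpy.corrcoef(x, y)[0, 1]
print(r_local.sum(), r_global)
```

Because the contributions sum to the global coefficient, each site's value can be read as its share of the overall correlation.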

I believe this also gives us local Spearman, if x,y are ranked.
This'd also be helped along by abstracting the permutational inference machinery (both global/unconditioned and local/conditional) into a mixin class.
If this goes here, should Tau_Local go here, too? or, at least cross-listing them by importing giddy & adding them to an esda.correlation namespace?
Before this PR gets merged, we will need to:

actually implement the statistic
run the input checks
clarify/disclaim the null in the local permutations for the docstring.
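On the ranked-variate remark above, a quick numerical check that the same local formula applied to ranks sums to Spearman's rho. This is only a sketch: `local_correlation` and `ranks` are hypothetical helpers, not part of esda or giddy, and the ranking assumes no ties.

```python
import numpy

def local_correlation(x, y):
    """Local contributions to Pearson's r (hypothetical helper)."""
    xc = numpy.asarray(x, dtype=float) - numpy.mean(x)
    yc = numpy.asarray(y, dtype=float) - numpy.mean(y)
    return (xc * yc) / numpy.sqrt((xc**2).sum() * (yc**2).sum())

def ranks(values):
    """Ranks 1..n via a double argsort; assumes there are no ties."""
    return numpy.argsort(numpy.argsort(values)) + 1

rng = numpy.random.default_rng(1)
x = rng.normal(size=40)
y = x**3 + rng.normal(size=40)

# Local Spearman: the local statistic computed on ranks.
rho_local = local_correlation(ranks(x), ranks(y))

# Tie-free closed form for the global Spearman coefficient.
n = len(x)
d = ranks(x) - ranks(y)
rho_global = 1 - 6 * (d**2).sum() / (n * (n**2 - 1))
print(rho_local.sum(), rho_global)
```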

ljwolf · 2019-09-18T22:57:21Z

hey @weikang9009 @sjsrey, with giddy.rank.Tau_Local, I don't see an explicit permutational inference strategy. Thinking about it, I'm not sure how to do this for a correlation coefficient: an observation with a small X but outlying Y would have strongly different reference distributions for their local statistic if you permute X and hold Y fixed vs. permuting Y holding X fixed. I'm not yet sure if randomly picking which variate to permute each iteration will work, either.

Do you (A) have a permutation strategy in the works for giddy.rank.Tau_Local, or (B) know of any relevant lit that might suggest a permutation test for this?
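To make the asymmetry concrete, here is a small simulation, sketched under the assumption that the local statistic is the centered cross-product formula from the first comment. Site 0 is constructed by hand to have a small x and an outlying y; everything else is simulated noise.

```python
import numpy

rng = numpy.random.default_rng(2)
n = 200
x = rng.normal(size=n)
y = rng.normal(size=n)
x[0], y[0] = 0.05, 4.0  # site 0: small x, outlying y

xc = x - x.mean()
yc = y - y.mean()
# The denominator is unchanged by any permutation of x or y.
denom = numpy.sqrt((xc**2).sum() * (yc**2).sum())

k = 999
# Permute x, hold y fixed: site 0 keeps its outlying y and receives a random x.
ref_permute_x = numpy.array(
    [rng.permutation(xc)[0] * yc[0] for _ in range(k)]
) / denom
# Permute y, hold x fixed: site 0 keeps its small x and receives a random y.
ref_permute_y = numpy.array(
    [xc[0] * rng.permutation(yc)[0] for _ in range(k)]
) / denom

spread_x = ref_permute_x.std()
spread_y = ref_permute_y.std()
print(spread_x, spread_y)
```

The reference distribution is scaled by whichever site value is held fixed, so the two strategies give very different spreads for this kind of observation.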

weikang9009 · 2019-12-19T00:44:26Z

> If this goes here, should Tau_Local go here, too? or, at least cross-listing them by importing giddy & adding them to an esda.correlation namespace?

We all think it makes sense to (1) move the source code of Kendall's tau as well as its spatial counterpart and local decomposition to esda; (2) in giddy, import these functions from esda ; (3) keep the current APIs of these functions in giddy intact. An esda.correlation namespace sounds very reasonable!

@ljwolf if you can start an esda.correlation namespace, I will move the source code from giddy to this namespace and adjust giddy accordingly.

weikang9009 · 2019-12-19T01:00:10Z

> hey @weikang9009 @sjsrey, with giddy.rank.Tau_Local, I don't see an explicit permutational inference strategy. Thinking about it, I'm not sure how to do this for a correlation coefficient: an observation with a small X but outlying Y would have strongly different reference distributions for their local statistic if you permute X and hold Y fixed vs. permuting Y holding X fixed. I'm not yet sure if randomly picking which variate to permute each iteration will work, either.
>
> Do you (A) have a permutation strategy in the works for giddy.rank.Tau_Local, or (B) know of any relevant lit that might suggest a permutation test for this?

This is a very reasonable point. We do not have inference for local Tau in giddy at the moment; it seems to be a pretty tricky one for permutation-based inference. I completely agree with you: permuting the values of X would very possibly give quite different results from permuting Y. We have done some simulation experiments to examine the sampling distributions of local Tau sorted by starting ranks (like ranks in X here). These distributions vary to a great extent: starting with a middle rank gives a much narrower distribution than starting with a more extreme rank.

Since for investigating the dynamics (or exchanges) with tau or local tau, there is a temporal dimension, we can potentially use the starting rank as the reference point to build the sampling distribution. I guess we can try considering a certain variable (X or Y) as the conditioning variable and build sampling distributions starting from there?
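The rank dependence described above can be reproduced with a rough simulation. This is a sketch, not giddy's code: `tau_local` is a hypothetical stand-in that assumes the local statistic is (concordant minus discordant pairs involving site i) divided by n - 1, with starting ranks held fixed and ending ranks shuffled.

```python
import numpy

def tau_local(x, y):
    """Assumed local tau: (concordant - discordant pairs with i) / (n - 1)."""
    sx = numpy.sign(x[:, None] - x[None, :])
    sy = numpy.sign(y[:, None] - y[None, :])
    # The diagonal contributes zero because sign(0) == 0.
    return (sx * sy).sum(axis=1) / (len(x) - 1)

rng = numpy.random.default_rng(3)
n = 21
start = numpy.arange(n)  # starting ranks, held fixed

# Shuffle the ending ranks many times and record every site's local tau.
sims = numpy.array(
    [tau_local(start, rng.permutation(n)) for _ in range(2000)]
)

spread_by_start_rank = sims.std(axis=0)
extreme = spread_by_start_rank[0]
middle = spread_by_start_rank[n // 2]
print(extreme, middle)
```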


ljwolf · 2020-04-09T13:21:27Z

Revisiting this with @weikang9009's advice, I've started to implement the following strategy for inference.
Pearson_Local would take a conditional_inference keyword with four options:

conditional_inference=False. For each observation, fix their site values. Randomize both the x & the y of remaining sites independently in all k local permutations. It should correctly condition on the local value of xi*yi relative to the space of all remaining cross-products.
conditional_inference='x' conditions on site i's x[i] value. For each observation, fix its x value. Randomize all y values & remaining x values in all k local permutations.
conditional_inference='y' conditions on site i's y[i] value. For each observation, fix its y value. Randomize all x values and remaining y values in all k local permutations.
conditional_inference=True splits permutations//2 and runs half with x conditional and half with y conditionals for each site. Then, a post-hoc paired t-test is done for each site to check if the two distributions have the same mean correlation. If they don't, then a warning is raised suggesting to use conditional_inference = False.
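A rough sketch of the 'x' and 'y' options above, to make the mechanics concrete. `local_reference` is a hypothetical name, not the PR's implementation, and it assumes the centered cross-product statistic from the first comment. Under that statistic the denominator is invariant to permutation, so only the value paired with the fixed site value changes from draw to draw. Each shuffle is reused across all sites.

```python
import numpy

def local_reference(x, y, conditional_on, permutations=999, seed=None):
    """Hypothetical sketch: (n, permutations) reference values per site."""
    rng = numpy.random.default_rng(seed)
    xc = numpy.asarray(x, dtype=float) - numpy.mean(x)
    yc = numpy.asarray(y, dtype=float) - numpy.mean(y)
    denom = numpy.sqrt((xc**2).sum() * (yc**2).sum())
    if conditional_on == "x":
        # Every site keeps its own x; shuffling y hands it a random partner.
        draws = numpy.column_stack(
            [rng.permutation(yc) for _ in range(permutations)]
        )
        return xc[:, None] * draws / denom
    if conditional_on == "y":
        # Every site keeps its own y; shuffling x hands it a random partner.
        draws = numpy.column_stack(
            [rng.permutation(xc) for _ in range(permutations)]
        )
        return draws * yc[:, None] / denom
    raise ValueError("this sketch only handles conditional_on='x' or 'y'")

rng = numpy.random.default_rng(4)
x = rng.normal(size=30)
y = 0.5 * x + rng.normal(size=30)

# The split strategy would run roughly half the permutations each way.
ref_x = local_reference(x, y, "x", permutations=499, seed=0)
ref_y = local_reference(x, y, "y", permutations=499, seed=1)
```

The per-site comparison of `ref_x` and `ref_y` (the post-hoc test in the last option) is left out of this sketch.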

So,

We might want to change this to read conditional_on or something?
Is it OK to re-use permutations? For instance, the code currently fixes all sites' x and shuffles Y. Since there's no spatial configuration, this should be ok?
We still need to force the site-specific conditioning when conditional_inference=False, too. Right now, it's permuting all observations every time. It should be sufficient to:

numpy.delete(permutation, i), and then use row_stack((iless_permutation, (xi, yi))) for the computation of the statistic.

Whatever is implemented should also apply for Tau_Local, and I think the right move is the conditional_inference=False. @sjsrey, @weikang9009, perspective?
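One way the numpy.delete and row_stack step mentioned above might look, as a sketch only. `permute_holding_site` is a hypothetical helper. It drops site i before shuffling, which is a small variation on deleting from an already-built permutation, and it uses `numpy.vstack`, which `row_stack` is an alias for.

```python
import numpy

def permute_holding_site(x, y, i, rng):
    """Hypothetical helper: shuffle every site except i, then re-append site i."""
    data = numpy.column_stack((x, y))
    rest = numpy.delete(data, i, axis=0)  # drop site i before shuffling
    rest = numpy.column_stack(
        (rng.permutation(rest[:, 0]),     # shuffle remaining x values
         rng.permutation(rest[:, 1]))     # shuffle remaining y values independently
    )
    # Site i rejoins unchanged as the last row.
    return numpy.vstack((rest, data[i]))

rng = numpy.random.default_rng(5)
x = rng.normal(size=10)
y = rng.normal(size=10)
shuffled = permute_holding_site(x, y, 3, rng)
```

Note that site i ends up in the last row rather than at its original index, so any statistic computed from the result has to read it from there.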

ljwolf added enhancement new-estimator WIP work in progress (for discussion) labels Sep 17, 2019

ljwolf mentioned this pull request Sep 18, 2019

Lee statistics missing unittests #91

Open

ljwolf added 5 commits April 9, 2020 13:06

start stub of local correlation scores

3ffe94b

Add actual statistic, including pearson_matrix

6b50763

rename Spatial_Pearson_Local to match convention in module

1ea4a22

add missed tests and finish rename for Spatial_Pearson local

c9e942c

handle migration to external tests

78d52a7

ljwolf force-pushed the correlation branch from bae1699 to 78d52a7 Compare April 9, 2020 12:08

ljwolf added 2 commits April 9, 2020 13:53

add permutation inference for local pearson with split conditional st…

b6a17da

…rategies

add movement on the permutation inference.

78ad3a1

ljwolf added a commit to ljwolf/esda that referenced this pull request May 24, 2023

add test_lee.py from pysal#89

81805a9
