BUG: relative neighbourhood depends on the order of observations #573

Open
martinfleis opened this issue Sep 22, 2023 · 2 comments

The graph produced by the relative neighbourhood depends on the order of observations in the df. I guess that should not happen.

import geodatasets
import geopandas

from libpysal.graph._triangulation import _relative_neighborhood

stores = geopandas.read_file(geodatasets.get_path("geoda liquor_stores")).explode(
    index_parts=False
)
stores_unique = stores.drop_duplicates(subset="geometry")

for _ in range(5):
    data = stores_unique.sample(frac=1)
    head, tail, weight = _relative_neighborhood(data)
    print(head.shape)

(3360,)
(3338,)
(3360,)
(3346,)
(3342,)

This issue is also present in the existing implementation in weights, not only in the new graph one.

from libpysal import weights

for _ in range(5):
    data = stores_unique.sample(frac=1)
    W = weights.Relative_Neighborhood.from_dataframe(data)
    print(W.to_adjlist().shape)

(3352, 3)
(3350, 3)
(3340, 3)
(3344, 3)
(3352, 3)
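
Edge counts only show that the runs differ, not which edges come and go. A minimal sketch for comparing two runs edge by edge follows. It assumes that head and tail carry the index labels of the data rather than positions in the shuffled frame (not verified here), and `edge_set` is a hypothetical helper, not part of libpysal. Toy label lists stand in for real output.

```python
def edge_set(head, tail):
    """Collect undirected edges as a set of unordered label pairs."""
    # frozenset makes (a, b) and (b, a) the same edge; self-loops are skipped.
    return {frozenset((h, t)) for h, t in zip(head, tail) if h != t}


# Toy label lists standing in for head/tail from two runs on shuffled data.
run_a = edge_set(["a", "b", "b", "c"], ["b", "a", "c", "b"])
run_b = edge_set(["c", "b", "a", "c"], ["b", "c", "c", "b"])

# Edges that appear in only one of the two runs.
unstable = sorted(tuple(sorted(edge)) for edge in run_a ^ run_b)
print(unstable)
```

If the labels assumption does not hold, the positions would first need to be mapped back through the index of the shuffled frame before comparing.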

ljwolf commented Sep 22, 2023

Wow, indeed, that should not happen.

I will take a look next week.

ljwolf commented Sep 22, 2023

I think the issue is that the dkmax search (the inner loop of filter_relativehood) has to iterate over all pairs, rather than terminating early.
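
For reference, a brute-force sketch of the textbook definition: i and j are relative neighbours unless some third point k is strictly closer to both than they are to each other. This is not the libpysal implementation (which filters a triangulation); the function names are made up for illustration, and reading `dkmax` as max(d(i, k), d(j, k)) is an assumption. Because every k is checked for every pair, the result cannot depend on the order of the rows, so it may serve as a baseline on small inputs.

```python
import numpy as np


def relative_neighborhood_bruteforce(coords):
    """Return relative neighbourhood edges as a set of (i, j) with i < j."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    # Full pairwise Euclidean distance matrix, O(n^2) memory.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            # For every k, the larger of d(i, k) and d(j, k).
            dkmax = np.maximum(dist[i], dist[j])
            # i and j themselves can never block their own edge.
            dkmax[[i, j]] = np.inf
            # Every candidate k is checked; there is no early termination.
            if not (dkmax < dist[i, j]).any():
                edges.add((i, j))
    return edges


def edges_by_label(edges, labels):
    """Translate positional edges into order-independent label pairs."""
    return {frozenset((labels[i], labels[j])) for i, j in edges}


rng = np.random.default_rng(0)
points = rng.random((60, 2))
labels = np.arange(len(points))

reference = edges_by_label(relative_neighborhood_bruteforce(points), labels)
for _ in range(5):
    order = rng.permutation(len(points))
    shuffled = edges_by_label(
        relative_neighborhood_bruteforce(points[order]), labels[order]
    )
    # Shuffling the rows must not change the set of labelled edges.
    assert shuffled == reference
```

The double loop with a vectorised inner check is O(n^3) in time, so this is only practical for a few hundred points.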
