Regionalization with minimum and maximum thresholds of an attribute value (or maximizing homogenization among regions) #274

Open
orlandombaa opened this issue Sep 12, 2022 · 7 comments
Labels
enhancement New feature or request region

Comments

@orlandombaa

orlandombaa commented Sep 12, 2022

Hello everyone

I am trying to solve a regionalization problem. I want to create spatially contiguous regions that are as homogeneous as possible in one internal variable (say, population), or for which I can set a minimum and maximum value per region, without specifying the number of regions. So far I have tried the MaxPHeuristic algorithm, which lets me set a threshold value. This is a good approach, but the homogenization of my internal variable is not very good: with my data, some regions end up with around 40% more population than others.

Is it possible to specify the number of clusters with MaxPHeuristic and increase the homogeneity of the regions, or should I choose another algorithm?

@orlandombaa orlandombaa changed the title Rationalization with defined cluster numbers and maximizing homogeneity in one variable Rationalization with defined cluster numbers and a minimum or maximum thresholds of and attribute value Sep 12, 2022
@orlandombaa orlandombaa changed the title Rationalization with defined cluster numbers and a minimum or maximum thresholds of and attribute value Rationalization with defined cluster numbers and a minimum and maximum thresholds of and attribute value (or maximazing homogenization among regions) Sep 12, 2022
@jGaboardi
Member

@knaaptime Do you have some advice for this?

@orlandombaa orlandombaa changed the title Rationalization with defined cluster numbers and a minimum and maximum thresholds of and attribute value (or maximazing homogenization among regions) Rationalization with a minimum and maximum thresholds of and attribute value (or maximazing homogenization among regions) Sep 12, 2022
@orlandombaa
Author

In ArcGIS Pro, this problem is called Spatially Constrained Multivariate Clustering:

https://pro.arcgis.com/en/pro-app/2.8/tool-reference/spatial-statistics/how-spatially-constrained-multivariate-clustering-works.htm

@orlandombaa orlandombaa changed the title Rationalization with a minimum and maximum thresholds of and attribute value (or maximazing homogenization among regions) Regionalization with a minimum and maximum thresholds of and attribute value (or maximazing homogenization among regions) Sep 12, 2022
@knaaptime
Member

knaaptime commented Sep 12, 2022

With the max-p algorithm, the p parameter (the number of regions/clusters) is endogenous. Instead of setting the number of clusters a priori, the analyst sets a minimum value for a threshold variable, and the algorithm works to maximize the number of regions subject to that threshold constraint. The algorithm then tries to maximize homogeneity inside the resulting regions, as long as doing so doesn't reduce p.

max-p is greedy with respect to p, which means it will keep increasing the number of regions (even if it has to sacrifice internal homogeneity) as long as the minimum value for the threshold variable is met. So in your case, it will keep creating new regions as long as each one meets your population threshold, even if the resulting regions don't have homogeneous population levels.
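
To make the "p first, homogeneity second" ordering concrete, here is a toy brute-force sketch. It is not the spopt MaxPHeuristic implementation: it assumes the areas form a simple 1D chain (each area touches only the next one), enumerates every contiguous partition, and applies the max-p objective exactly. The function names and the population values are made up for illustration.

```python
from itertools import combinations


def chain_partitions(n):
    """Yield every split of areas 0..n-1 (a 1D chain) into contiguous regions."""
    for k in range(n):
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [list(range(a, b)) for a, b in zip(bounds, bounds[1:])]


def heterogeneity(regions, attr):
    """Sum of pairwise absolute attribute differences inside each region."""
    return sum(
        abs(attr[i] - attr[j]) for r in regions for i, j in combinations(r, 2)
    )


def exact_max_p(attr, threshold_var, threshold):
    """Exact max-p on a chain: most regions first, then least heterogeneity."""
    feasible = [
        p
        for p in chain_partitions(len(attr))
        if all(sum(threshold_var[i] for i in r) >= threshold for r in p)
    ]
    if not feasible:
        raise ValueError("no partition satisfies the threshold")
    # Lexicographic objective: p is maximized before homogeneity is considered.
    return min(feasible, key=lambda p: (-len(p), heterogeneity(p, attr)))


# Hypothetical populations for six areas along a line.
population = [60, 50, 55, 45, 140, 10]
regions = exact_max_p(population, population, threshold=100)
print(regions)  # [[0, 1], [2, 3], [4, 5]]
print([sum(population[i] for i in r) for r in regions])  # [110, 100, 150]
```

In this toy case the number of regions (3) falls out of the threshold rather than being chosen, and the region totals range from 100 to 150, which is the kind of imbalance described in the original question.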

If you want to set the number of regions exogenously, you might try a different method such as SKATER or hierarchical clustering with a spatial constraint. The ArcGIS method you linked to uses SKATER.
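
As a rough illustration of how a SKATER-style method takes the number of regions as an input, here is a minimal pure-Python sketch: build a minimum spanning tree over the contiguity graph, weighting each edge by attribute dissimilarity, then repeatedly remove the tree edge that leaves the lowest total within-region sum of squared deviations until `n_clusters` pieces remain. This is a simplified sketch under those assumptions, not the spopt or ArcGIS implementation, and the lattice and attribute values are invented.

```python
def minimum_spanning_tree(n, edges, attr):
    """Kruskal's algorithm; edge cost is the attribute gap between neighbours."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    tree = []
    for i, j in sorted(edges, key=lambda e: abs(attr[e[0]] - attr[e[1]])):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree


def components(n, forest):
    """Connected components of the forest, as sorted lists of node ids."""
    adj = {i: [] for i in range(n)}
    for i, j in forest:
        adj[i].append(j)
        adj[j].append(i)
    seen, comps = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        comps.append(sorted(comp))
    return comps


def ssd(comps, attr):
    """Total within-component sum of squared deviations from the mean."""
    total = 0.0
    for comp in comps:
        mean = sum(attr[i] for i in comp) / len(comp)
        total += sum((attr[i] - mean) ** 2 for i in comp)
    return total


def skater_sketch(n, edges, attr, n_clusters):
    """Prune MST edges until the forest has n_clusters connected pieces."""
    forest = minimum_spanning_tree(n, edges, attr)
    while len(components(n, forest)) < n_clusters:
        best = min(
            forest,
            key=lambda e: ssd(components(n, [f for f in forest if f != e]), attr),
        )
        forest.remove(best)
    return components(n, forest)


# Hypothetical 2x3 lattice (rook contiguity): 0 1 2 / 3 4 5
edges = [(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (1, 4), (2, 5)]
attr = [10, 11, 30, 12, 29, 31]
print(skater_sketch(6, edges, attr, n_clusters=2))  # [[0, 1, 3], [2, 4, 5]]
```

Here the number of regions is fixed up front and homogeneity is the only objective, which is the opposite trade-off from max-p.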

@orlandombaa
Author

Thank you @knaaptime, your comment really helped me.

@orlandombaa
Author

Hi @knaaptime, I have another question. I have been looking at the examples and documentation for the SKATER algorithm and for hierarchical clustering. In the SKATER case, the threshold I can give the algorithm is expressed as the number of spatial objects per region. I have been testing the same algorithm in pygeoda, where the threshold can be given in terms of a variable.
Is there any variant of SKATER in PySAL where I can give both the number of clusters and a threshold tied to a variable?

Best regards,
Orlando

@knaaptime
Member

Great question. At the moment, our SKATER implementation has a floor argument that sets the minimum number of observations assigned to each cluster, but it looks like we don't have an analog to the threshold_name argument available in max-p, which lets you use a variable as a floor condition.

We should add that
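
For discussion, one way a variable-based floor could work in a tree-pruning method is to accept a candidate cut only if every resulting piece still sums to at least the minimum on the threshold variable. The sketch below is purely hypothetical: `side_totals` and `cut_meets_floor` are illustrative names, not existing spopt functions, and the tree and population values are invented.

```python
def side_totals(n, tree, cut, values):
    """Sum `values` over each connected piece of `tree` once `cut` is removed."""
    adj = {i: [] for i in range(n)}
    for i, j in tree:
        if (i, j) != cut:
            adj[i].append(j)
            adj[j].append(i)
    seen, totals = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, total = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            total += values[node]
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        totals.append(total)
    return totals


def cut_meets_floor(n, tree, cut, threshold_values, minimum):
    """True if every piece left by the cut reaches `minimum` on the variable."""
    return all(t >= minimum for t in side_totals(n, tree, cut, threshold_values))


# Hypothetical spanning tree over six areas and their populations.
tree = [(0, 1), (0, 3), (3, 4), (4, 5), (2, 5)]
population = [40, 30, 20, 50, 60, 70]
print(cut_meets_floor(6, tree, (3, 4), population, 100))  # True  (120 and 150)
print(cut_meets_floor(6, tree, (0, 1), population, 100))  # False (240 and 30)
```

A count-based floor is the special case where every area has a value of 1, so a check like this would generalize the existing behaviour rather than replace it.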

cc @xf37

@jGaboardi
Member

The corresponding keywords could be:

  • floor -> ceiling
  • quorum -> threshold

@jGaboardi jGaboardi added the enhancement New feature or request label Dec 27, 2022
3 participants