-
-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add corner case to Degree Pearson Correlation Coefficient #7392
base: main
Are you sure you want to change the base?
Add corner case to Degree Pearson Correlation Coefficient #7392
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(related corner case) Right now for a Graph with 1 nodes and 1 edges
, nx.degree_pearson_correlation_coefficient(G)
gives ValueError: x and y must have length at least 2.
>>> G = nx.Graph()
>>> G.add_node(1)
>>> G.add_edge(1,1)
>>> nx.degree_pearson_correlation_coefficient(G)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<class 'networkx.utils.decorators.argmap'> compilation 4", line 3, in argmap_degree_pearson_correlation_coefficient_1
File "/workspaces/networkx/networkx/utils/backends.py", line 633, in __call__
return self.orig_func(*args, **kwargs)
File "/workspaces/networkx/networkx/algorithms/assortativity/correlation.py", line 168, in degree_pearson_correlation_coefficient
return float(sp.stats.pearsonr(x, y)[0])
File "/home/codespace/.local/lib/python3.10/site-packages/scipy/stats/_stats_py.py", line 4727, in pearsonr
raise ValueError('x and y must have length at least 2.')
ValueError: x and y must have length at least 2.
I'll add for this, thankyou so much! What are your thoughts on adding NetworkXUnfeasible if len(x) or len(y) is less than 2? Any other idea is welcome. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @vanshika230 and @Schefflera-Arboricola for the review! I agree with @Schefflera-Arboricola that the additional corner cases should also be added.
Also, as a general comment, let's go ahead and switch all the new exceptions to NetworkXError
- the other, more specific, exceptions (NetworkXPointlessConcept
and NetworkXUnfeasible
) have specialized meanings that don't necessarily map cleanly to these cases IMO!
Noted, I have added the corner case told by @Schefflera-Arboricola , thankyou for that! Please let me know any further changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @vanshika230 ! I've left a few comments below. LMK what u think.
I have made the changes suggested in the review, Thankyou! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As often happens, I think this one is a bit trickier than it appears at first glance!
if nx.is_empty(G): | ||
raise nx.NetworkXError( | ||
"Cannot compute correlation coefficient on an empty graph." | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming my reasoning in the above comment is correct (please double-check!), I think the real condition that we want to catch here is that the graph with self-loop edges removed is not empty. One way to check that is by comparing the edge set to the set of self-loop edges, e.g. set(G.edges) - set(nx.selfloop_edges(G))
.
However, this will also lead to subtle changes in the algorithm for the case where the graph with multiple nodes that only contain self-loop edges. Right now, this case gives:
>>> G = nx.Graph([(1, 1), (2, 2), (3, 3)])
>>> nx.degree_pearson_correlation_coefficient(G)
ConstantInputWarning: An input array is constant; the correlation coefficient is not defined.
nan
whereas after the change this would raise an exception instead of producing the nan
. This seems reasonable to me, but I'm no expert!
------ | ||
NetworkXError | ||
If the graph 'G' is empty. | ||
If the graph 'G' has nodes with degrees less than two. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not the condition that triggers the exception - nodes with degree less than 2 are fine (otherwise the path_graph(4)
example in the docstring would fail, since the end nodes have degree 1!). IIUC, the corner case is related to graphs that have only a single self-loop edge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, the corner case is related to graphs that have only a single self-loop edge.
we also get ValueError: x and y must have length at least 2.
for 1 edge DiGraph
s. And there probably might be some other cases too where the len(x) < 2 or len(y) < 2
condition might be true.
import networkx as nx
G = nx.DiGraph([(1, 2)])
G1 = nx.Graph(G)
print(nx.degree_pearson_correlation_coefficient(G1))
print(nx.degree_pearson_correlation_coefficient(G))
# output
"""
nan
/usr/local/lib/python3.10/dist-packages/scipy/stats/_stats_py.py:4781: ConstantInputWarning: An input array is constant; the correlation coefficient is not defined.
warnings.warn(stats.ConstantInputWarning(msg))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
[<ipython-input-14-73f922e088fa>](https://localhost:8080/#) in <cell line: 5>()
3 G1 = nx.Graph(G)
4 print(nx.degree_pearson_correlation_coefficient(G1))
----> 5 print(nx.degree_pearson_correlation_coefficient(G))
6 xy = nx.node_degree_xy(G, x="out", y="in")
7 print(xy)
3 frames
[/usr/local/lib/python3.10/dist-packages/scipy/stats/_stats_py.py](https://localhost:8080/#) in pearsonr(x, y, alternative, method)
4766
4767 if n < 2:
-> 4768 raise ValueError('x and y must have length at least 2.')
4769
4770 x = np.asarray(x)
ValueError: x and y must have length at least 2.
"""
Closes #6913
Please do let me know any changes to be made. Thankyou!