Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH, BUG]: added compute_colliders and fixed compute_v_structures #7398

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

Schefflera-Arboricola
Copy link
Contributor

@Schefflera-Arboricola Schefflera-Arboricola commented Apr 9, 2024

closes #7385

previously compute_v_structures was computing colliders. Now, I've renamed compute_v_structures to compute_colliders(and updated its implementation too) and added the correct implementation for compute_v_structures. Also, updated and added the relevant docs and tests.

Thank you :)

@Schefflera-Arboricola Schefflera-Arboricola changed the title added compute_colliders and fixed compute_v_structures [ENH, BUG]: added compute_colliders and fixed compute_v_structures Apr 9, 2024
@Schefflera-Arboricola
Copy link
Contributor Author

Schefflera-Arboricola commented Apr 9, 2024

Needed to sort parents because doctests were failing.

Doctest logs
=================================== FAILURES ===================================
_____________ [doctest] networkx.algorithms.dag.compute_colliders ______________
1242     Raises
1243     ------
1244     NetworkXNotImplemented
1245         If `G` is an undirected graph.
1246 
1247     Examples
1248     --------
1249     >>> G = nx.DiGraph()
1250     >>> G.add_edges_from([(1, 2), (0, 5), (3, 1), (2, 4), (3, 1), (4, 5), (1, 5)])
1251     >>> sorted(nx.compute_colliders(G))
Expected:
    [(0, 5, 1), (0, 5, 4), (4, 5, 1)]
Got:
    [(0, 5, 4), (1, 5, 0), (1, 5, 4)]

/home/runner/work/networkx/networkx/networkx/algorithms/dag.py:1251: DocTestFailure
____________ [doctest] networkx.algorithms.dag.compute_v_structures ____________
1284     Raises
1285     ------
1286     NetworkXNotImplemented
1287         If `G` is an undirected graph.
1288 
1289     Examples
1290     --------
1291     >>> G = nx.DiGraph()
1292     >>> G.add_edges_from([(1, 2), (0, 5), (3, 1), (2, 4), (3, 1), (4, 5), (1, 5), (5, 2)])
1293     >>> sorted(nx.compute_v_structures(G))
Expected:
    [(0, 5, 1), (0, 5, 4), (4, 5, 1)]
Got:
    [(0, 5, 4), (1, 5, 0), (1, 5, 4)]

/home/runner/work/networkx/networkx/networkx/algorithms/dag.py:1293: DocTestFailure

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to change this much, maybe we should consider changing the name so it doesn't include compute. We don't usually include words like that in the function names. What do you think about v_structures(G) and colliders(G)?

I'd love to have a better name that v_structures :)

networkx/algorithms/dag.py Outdated Show resolved Hide resolved
networkx/algorithms/dag.py Outdated Show resolved Hide resolved
networkx/algorithms/dag.py Outdated Show resolved Hide resolved
networkx/algorithms/dag.py Outdated Show resolved Hide resolved
networkx/algorithms/dag.py Outdated Show resolved Hide resolved
networkx/algorithms/dag.py Outdated Show resolved Hide resolved
Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if these new names describe what is going on well.
Should we use causal_colliders to refer to the literature? Or is colliders better/shorter/faster (searching for "collider" doesn't lead to results about causal graphs colliders. But "collider graph theory" does...

networkx/algorithms/dag.py Outdated Show resolved Hide resolved
networkx/algorithms/dag.py Outdated Show resolved Hide resolved
networkx/algorithms/dag.py Outdated Show resolved Hide resolved
@Schefflera-Arboricola Schefflera-Arboricola marked this pull request as draft April 15, 2024 12:08
@Schefflera-Arboricola Schefflera-Arboricola marked this pull request as ready for review April 16, 2024 13:14
Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the liberty of pushing up some changes that remove both sorting and mins. It works locally so maybe will work on CI.

Removing both the sorting and the min of parents seems to work. The triplets are formed in a deterministic way -- so its the same each time it is run. So I guess this unsorted way of testing doesn't allow other correct solutions, but it does raise if the code no longer works the same way -- and it doesn't encourage users to include lots of sorting and min function calls in their own calls.

In doing this, I noticed that the examples are no longer DAGs. (which led to more investigation)
The docs say these functions should have DAGs as inputs. But it doesn't seem to be checked anywhere. That means we are returning colliders even for directed digraphs with cycles. For better or for worse, there are papers on possibly cyclic causal directed graphs.
https://www.cmu.edu/dietrich/philosophy/docs/tech-reports/68_Richardson.pdf
https://www.sciencedirect.com/science/article/pii/S0888613X19301410
https://arxiv.org/pdf/1309.6836.pdf
And they do talk about colliders in these situations.

So I think we should leave it possible to use these functions on cyclic graphs. But we should add a warning in the doc_string to check that the graph is acyclic if that is what they want.

I have changed the examples to be DAGs and added a check to the example codes to show that it is a good idea to check for a DAG if you want to restrict to that formulation of a collider. And I've added references and notes to the doc_string.

Please change any of this you want

@Schefflera-Arboricola
Copy link
Contributor Author

Thanks for adding the points about directed cyclic graphs.

And about the sort and min-max on parents, I think we should keep them so that the tests pass for all OS, because right now the dispatch test is failing. Also ref. this comment, here also the doctests in dispatch were failing and (ubuntu, 3.11)'s doctests were also failing in CI(from past commits in this PR) bcoz we didn't use sorted and min-max. LMK what u think. Thx!

Copy link
Contributor

@rossbar rossbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that removing compute_v_structures in this way will be a breaking change for users (e.g. the user who reported #7385).

IMO the easiest way to deal with this would be to reintroduce a compute_v_structures function that essentially just raises a deprecation warning then calls the new colliders.

It may also be worth discussing whether these functions should indeed be exported to the main namespace. From a readthrough, I understand that they are intended to work on any acyclic graphs (not only DAGs); however, I don't know how grokkable they are in a general context. IMO nx.dag.v_structures is a lot clearer than nx.v_structures...

@Schefflera-Arboricola
Copy link
Contributor Author

Schefflera-Arboricola commented May 22, 2024

IMO the easiest way to deal with this would be to reintroduce a compute_v_structures function that essentially just raises a deprecation warning then calls the new colliders.

+1 on this.

It may also be worth discussing whether these functions should indeed be exported to the main namespace. From a readthrough, I understand that they are intended to work on any acyclic graphs (not only DAGs); however, I don't know how grokkable they are in a general context. IMO nx.dag.v_structures is a lot clearer than nx.v_structures...

I'm not sure by "these functions" if you meant all the functions in dag.py or just these 2 functions. But, I also think having nx.dag. is better than nx. for most of these functions. And I was going through the functions in dag.py and

  • descendants, ancestors, all_topological_sorts, is_aperiodic, transitive_closure, seemed like general functions(not specific to DAGs)
  • is_directed_acyclic_graph can have a not_implemented_for("undirected") and not_implemented_for("multigraph") decorators instead of checking G.is_directed() inside the function
  • rest of the functions seemed dag-specific

@dschult
Copy link
Member

dschult commented May 23, 2024

With #7432 merged, this should be rebased-on/merged-with main and will hopefully pass the dispatch tests now that nx-loopback preserves edge order when converting during the tests. I don't think there will be any conflicts on the merge, but if it is messy let me know and I'll take a stab at it. (git merge --abort is helpful if it gets messy)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

compute_v_structures output and docstring do not agree
3 participants