RuntimeError: No coarse-grained stationary distribution found #1135

Open
xyxuq opened this issue Oct 24, 2023 · 6 comments

xyxuq commented Oct 24, 2023

Hi there,

I am trying to compute the initial and terminal states of cells from time-series experiments, using the code in the attached cellrank_check_script.txt.

The script worked on a small subset of the data but failed on the whole dataset of 1,222,515 cells.
Calling g.predict_initial_states(allow_overlap=False) raised RuntimeError: No coarse-grained stationary distribution found.

I checked the script step by step: g.coarse_stationary_distribution is empty when I run it on the whole dataset.
Could you please help me look into this issue? Thanks in advance.

The log file is below.

Computing Schur decomposition
Adding `adata.uns['eigendecomposition_fwd']`
       `.schur_vectors`
       `.schur_matrix`
       `.eigendecomposition`
    Finish (1:59:34)
Computing `15` macrostates
Adding `.macrostates`
       `.macrostates_memberships`
       `.coarse_T`
       `.coarse_initial_distribution
       `.coarse_stationary_distribution`
       `.schur_vectors`
       `.schur_matrix`
       `.eigendecomposition`
    Finish (4:33:05)
Writing `GPCCA[kernel=RealTimeKernel[n=1222515], initial_states=None, terminal_states=None]` to `test.initial_terminal_state.fate_probabilities.pickle`
Adding `adata.obs['term_states_fwd']`
       `adata.obs['term_states_fwd_probs']`
       `.terminal_states`
       `.terminal_states_probabilities`
       `.terminal_states_memberships
    Finish`
Writing `GPCCA[kernel=RealTimeKernel[n=1222515], initial_states=None, terminal_states=['0_1', '0_2', '0_3', '14', '18_1', '18_2', '18_3', '19_1', '19_2', '5_1', '5_2', '5_3', '6_1', '6_2', '6_3']]` to `test.initial_terminal_state.fate_probabilities.pickle`
Traceback (most recent call last):
  File "cellrank_macrostates.test.py", line 45, in <module>
    g.predict_initial_states(allow_overlap=False)
  File "~/miniconda3/envs/cellrank/lib/python3.11/site-packages/cellrank/estimators/terminal_states/_gpcca.py", line 368, in predict_initial_states
    raise RuntimeError("No coarse-grained stationary distribution found.")
RuntimeError: No coarse-grained stationary distribution found.

The version of packages:

cellrank==2.0.0 scanpy==1.9.5 anndata==0.9.2 numpy==1.24.4 numba==0.57.1 scipy==1.11.2 pandas==1.5.3 pygpcca==1.0.4 scikit-learn==1.1.3 statsmodels==0.14.0 python-igraph==0.10.8 scvelo==0.3.0 pygam==0.8.0 matplotlib==3.6.3 seaborn==0.12.2

cellrank_check_script.txt

xyxuq added the "question" label (Further information is requested) on Oct 24, 2023
@Marius1311
Collaborator

Hm, not sure what's going on here. Do you have any idea, @michalk8?

@xyxuq
Author

xyxuq commented Nov 1, 2023

I took random subsets of the whole dataset, increasing the subset size by 5% of the cells each time. The script ran successfully only on subsets of up to 30% of the cells (366,754).
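
The subsetting experiment described above can be sketched as follows. This is a minimal illustration, not the script used in the thread: the helper name `nested_subsets`, the fixed seed, and the choice to make each subset contain the previous one are all assumptions.

```python
import numpy as np


def nested_subsets(n_cells, step_percent=5, seed=0):
    """Yield (percent, sorted cell indices) for random subsets of growing size.

    Hypothetical helper: one shuffle is reused, so every subset contains
    the previous one and only the newly added cells differ between runs.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_cells)
    for percent in range(step_percent, 101, step_percent):
        size = n_cells * percent // 100  # integer arithmetic avoids float rounding
        yield percent, np.sort(order[:size])


# Subset sizes for the dataset size reported in the thread.
sizes = {percent: len(idx) for percent, idx in nested_subsets(1_222_515)}
print(sizes[30])  # 366754, the largest subset reported to run successfully
```

Each index array could then be used to slice the AnnData object (for example `adata[idx].copy()`) before rebuilding the kernel and the estimator. That part depends on the attached script and is not shown here.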

@michalk8
Collaborator

michalk8 commented Nov 1, 2023

> I checked the script step by step: g.coarse_stationary_distribution is empty when I run it on the whole dataset.

The (coarse) stationary distribution is not guaranteed to exist. Running g.predict_initial_states(allow_overlap=False) can need it when only 1 initial macrostate is detected automatically.

To work around this, if you know how many initial macrostates to expect, you can pass that number as g.predict_initial_states(n_states=..., allow_overlap=False), since this won't require access to the coarse stationary distribution.
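
To illustrate how a stationary distribution can be missing, here is a small NumPy sketch on toy transition matrices. It is not pyGPCCA's or CellRank's implementation; the function name `stationary_distribution_or_none` and the tolerance are assumptions made for the example. A row-stochastic matrix has a unique stationary distribution only if exactly one eigenvalue equals 1. A chain that splits into disconnected blocks has several such eigenvalues, so no single distribution can be reported.

```python
import numpy as np


def stationary_distribution_or_none(P, tol=1e-8):
    """Return the unique stationary distribution of a row-stochastic matrix.

    Hypothetical helper: returns None when the eigenvalue 1 is missing or
    repeated, i.e. when no unique stationary distribution exists.
    """
    P = np.asarray(P, dtype=float)
    # Left eigenvectors of P are the right eigenvectors of P transposed.
    evals, evecs = np.linalg.eig(P.T)
    is_one = np.abs(evals - 1.0) < tol
    if is_one.sum() != 1:
        return None
    pi = np.real(evecs[:, np.argmax(is_one)])
    return pi / pi.sum()  # normalise so the entries sum to 1


connected = [[0.9, 0.1], [0.5, 0.5]]
disconnected = np.eye(2)  # two states that never exchange probability mass

print(stationary_distribution_or_none(connected))     # approximately [0.833 0.167]
print(stationary_distribution_or_none(disconnected))  # None
```

On the CellRank side, the corresponding check would be whether g.coarse_stationary_distribution is None before calling g.predict_initial_states.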

@ramadatta
Member

Hi @michalk8,

I tried your suggestion above, but I still get the same error.

g.fit(cluster_key="annotation_cell_states", n_states=[0, 25],n_cells=15)
Computing Schur decomposition
Adding `adata.uns['eigendecomposition_fwd']`
       `.schur_vectors`
       `.schur_matrix`
       `.eigendecomposition`
    Finish (0:00:14)
WARNING: Minimum value must be larger than `1`, found `2`. Setting `min=2`
WARNING: In most cases, 2 clusters will always be optimal. If you really expect 2 clusters, use `n_states=2`. Setting `min=3`
Calculating minChi criterion in interval `[3, 25]`
Computing `22` macrostates
Adding `.macrostates`
       `.macrostates_memberships`
       `.coarse_T`
       `.coarse_initial_distribution
       `.coarse_stationary_distribution`
       `.schur_vectors`
       `.schur_matrix`
       `.eigendecomposition`
    Finish (0:04:44)
GPCCA[kernel=PseudotimeKernel[n=93458], initial_states=None, terminal_states=None]
g.predict_initial_states(n_states=22, allow_overlap=False)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[87], line 1
----> 1 g.predict_initial_states(n_states=22, allow_overlap=False)

File ~/anaconda3/envs/trajectories_1/lib/python3.11/site-packages/cellrank/estimators/terminal_states/_gpcca.py:368, in GPCCA.predict_initial_states(self, n_states, n_cells, allow_overlap)
    366 stat_dist = self.coarse_stationary_distribution
    367 if stat_dist is None:
--> 368     raise RuntimeError("No coarse-grained stationary distribution found.")
    370 states = list(stat_dist[np.argsort(stat_dist)][:n_states].index)
    371 return self.set_initial_states(states, n_cells=n_cells, allow_overlap=allow_overlap)

RuntimeError: No coarse-grained stationary distribution found.

I have tried every n_states value from 1 to 22, but I still get the same error. Can this be fixed? Many thanks!
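
The traceback excerpt above suggests why passing n_states did not help here: in the lines shown (366 to 368), the check for a missing coarse stationary distribution comes before n_states is used on line 370. The sketch below mimics only those visible lines with a toy pandas Series. `pick_initial_macrostates` is a hypothetical stand-in, the code preceding line 366 is not shown in the thread and is not modelled, and `sort_values` is swapped in for the `np.argsort` indexing of the original line.

```python
import pandas as pd


def pick_initial_macrostates(stat_dist, n_states=1):
    """Return the names of the n_states macrostates with the smallest stationary mass."""
    if stat_dist is None:
        # Mirrors the guard in the traceback: raised whatever n_states is.
        raise RuntimeError("No coarse-grained stationary distribution found.")
    return list(stat_dist.sort_values().index[:n_states])


# Made-up macrostate names and stationary probabilities.
toy = pd.Series({"A": 0.50, "B": 0.05, "C": 0.45})
print(pick_initial_macrostates(toy, n_states=2))  # ['B', 'C']

try:
    pick_initial_macrostates(None, n_states=2)
except RuntimeError as err:
    print(err)  # No coarse-grained stationary distribution found.
```

Line 371 of the excerpt passes the chosen names to set_initial_states, so naming the initial macrostates explicitly through that method may avoid the check. This has not been verified here.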

@Marius1311
Collaborator

Hm, any idea, @michalk8?

@shaln

shaln commented Apr 25, 2024

Hi, I thought I'd mention that I've been getting the same error too, though only when computing the initial states, both with and without specifying n_states.

I was able to compute the terminal states with no errors.
