Markov Chain - Index out of bounds error #1083

Open
Koenig128 opened this issue Mar 7, 2024 · 2 comments

@Koenig128

Hi,
I was trying to fit a Markov Chain to my data and got an error. When I was searching the issues, I found that someone else had already reported this error. It would be great if you could help me on this!

Thank you very much in advance!

I experimented with changing the data; however, the issue is also reproducible with small random data.
import numpy as np
from pomegranate.markov_chain import MarkovChain

np.random.seed(137)
seq_data = np.random.randint(0, 10, (1,10,1))

model = MarkovChain(k = 1)
model.fit(seq_data) 

throws

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[99], line 5
      2 seq_data = np.random.randint(0, 10, (1,6,1))
      4 model = MarkovChain(k = 1)
----> 5 model.fit(seq_data)

File /opt/conda/lib/python3.10/site-packages/pomegranate/markov_chain.py:216, in MarkovChain.fit(self, X, sample_weight)
    193 def fit(self, X, sample_weight=None):
    194 	"""Fit the model to optionally weighted examples.
    195 
    196 	This method will fit the provided distributions given the data and
   (...)
    213 	self
    214 	"""
--> 216 	self.summarize(X, sample_weight=sample_weight)
    217 	self.from_summaries()
    218 	return self

File /opt/conda/lib/python3.10/site-packages/pomegranate/markov_chain.py:276, in MarkovChain.summarize(self, X, sample_weight)
    274 for i in range(X.shape[1] - self.k):
    275 	j = i + self.k + 1
--> 276 	distribution.summarize(X[:, i:j], sample_weight=sample_weight)

File /opt/conda/lib/python3.10/site-packages/pomegranate/distributions/conditional_categorical.py:168, in ConditionalCategorical.summarize(self, X, sample_weight)
    165 strides = torch.tensor(self._xw_sum[j].stride(), device=X.device)
    166 X_ = torch.sum(X[:, :, j] * strides, dim=-1)
--> 168 self._xw_sum[j].view(-1).scatter_add_(0, X_, sample_weight[:,j])
    169 self._w_sum[j][:] = self._xw_sum[j].sum(dim=-1)

RuntimeError: index 42 is out of bounds for dimension 0 with size 28

Originally posted by @salpers in #1077 (comment)
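
For anyone reading the traceback: the failing line builds a flat index from the category values and the strides of the count table, then scatters into the flattened table. The snippet below is only a pure-Python sketch of that arithmetic, not pomegranate's code. The table shape (4, 7) is an assumption chosen because it has 28 entries, matching the size in the error message; the thread does not confirm the actual shape, and the helper names are made up for illustration.

```python
def row_major_strides(shape):
    """Return element strides for a contiguous row-major table of this shape."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def flat_index(values, shape):
    """Flatten one tuple of category values, mirroring sum(X * strides)."""
    return sum(v * s for v, s in zip(values, row_major_strides(shape)))


# Assumed (hypothetical) table shape: 4 rows for the previous symbol,
# 7 columns for the next symbol, so 28 entries in total.
shape = (4, 7)
size = shape[0] * shape[1]

# The largest pair that fits inside the table.
in_range = flat_index((3, 6), shape)

# A previous symbol of 6 has no row in a 4-row table, so the flat index
# lands past the end of the flattened table.
out_of_range = flat_index((6, 0), shape)

print(size, in_range, out_of_range)
```

Under that assumption, a pair such as (6, 0) flattens to 42, which is past the end of a 28-entry table. That would be consistent with a count table sized for fewer categories than actually occur in the data, but it is a guess from the numbers alone.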

@jmschrei
Owner

Hi @Koenig128. Sorry for the delay on this. It turns out that there are a series of small bugs that sometimes mask each other. I am working my way through the code resolving these. A challenge I'm encountering is that I remember finishing the implementation of ConditionalCategorical at an airport just before boarding and thinking "thank god I never have to think about that again" and helpfully leaving myself only the docstring """Still under development.""" for the class.

@jmschrei
Owner

This should be fixed in v1.0.4 and I've added in a unit test with this as an example. Please let me know if you encounter any other issues.
