[BUG]Speed issue #1075

Open
tanyasarkjain opened this issue Jan 18, 2024 · 3 comments

@tanyasarkjain

Describe the bug
A clear and concise description of what the bug is, including what you were expecting to happen and what actually happened. Please report the version of pomegranate that you are using and the operating system. Also, please make sure that you have upgraded to the latest version of pomegranate before submitting the bug report.

I am using the latest version of pomegranate. My code is extremely slow to run, even though all I am doing is creating an uninitialized HMM and fitting it to my data. The model has 2 states, and the emissions are multivariate (2 features), each taking on a range of about 20 values. Furthermore, when I print the predictions, every state assignment is 1. A run with just 2 iterations took about 6 minutes.

To Reproduce
Please provide a snippet of code that can reproduce this error. It is much easier for us to track down bugs and fix them if we have an example script that fails until we're successful.

import pomegranate
import seaborn; seaborn.set_style('whitegrid')
import torch
#https://pomegranate.readthedocs.io/en/latest/tutorials/B_Model_Tutorial_4_Hidden_Markov_Models.html#Initializing-Hidden-Markov-Models

print(pomegranate.__version__)

from pomegranate.hmm import DenseHMM

Here is a snippet of what mv_emissions looks like:
[[[16, 11],
  [16, 12],
  [13, 12],
  [15, 12],
  [15, 9],
  [14, 6],
  [15, 3],
  [9, 6]]]

Response time
Although I will likely respond during weekdays if I am not on vacation, I am not likely to be able to merge PRs or write code until the weekend.

@jmschrei
Owner

That doesn't sound right. Unfortunately, without code to check what's going on, it'll be difficult for me to provide feedback. Are you using a GPU? What happens if you set max_iter to be a small number? What is the shape of the data you're training on?
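
One dependency-free way to answer the shape question is sketched below. The helper `nested_shape` is hypothetical (it is not part of pomegranate) and assumes the nested list is rectangular, since it only inspects the first element at each level. The toy input reuses the `mv_emissions` values quoted in the report.

```python
def nested_shape(data):
    """Return the shape of a rectangular nested list as a tuple."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        if not data:
            break
        data = data[0]  # assumption: every sibling has the same length
    return tuple(shape)


# Toy input taken from the report: 1 sequence, 8 observations, 2 features.
mv_emissions = [[[16, 11], [16, 12], [13, 12], [15, 12],
                 [15, 9], [14, 6], [15, 3], [9, 6]]]

print(nested_shape(mv_emissions))  # (1, 8, 2)
```

A three-element result of the form (number of sequences, sequence length, number of features) matches the `(4550, 100, 2)` shape reported later in this thread.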

@tanyasarkjain
Author

tanyasarkjain commented Jan 19, 2024

Okay, it is no longer taking as long now that I have fixed an issue with the dimensions of my sequences. However, I am now getting 'nan' improvement:

```python
import numpy as np

from pomegranate.hmm import DenseHMM
from pomegranate.distributions import Categorical

starts = [0.5, 0.5]
d = Categorical().fit(all_seq_100_equal[1])
print('d', d.probs)

model = DenseHMM([d, d], starts=starts, max_iter=10, verbose=True)
model.fit(all_seq_100_equal)
print(np.array(all_seq_100_equal).shape)
```

The output is:
d Parameter containing:
tensor([[0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0100, 0.0000, 0.0300, 0.0400, 0.0300, 0.1500, 0.1300, 0.2000, 0.2300, 0.1700, 0.0100],
        [0.0000, 0.0000, 0.0000, 0.0000, 0.0200, 0.0200, 0.4700, 0.0300, 0.0000, 0.0300, 0.2500, 0.0900, 0.0100, 0.0000, 0.0800, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000]])
[1] Improvement: nan, Time: 0.3021s
[2] Improvement: nan, Time: 0.336s
[3] Improvement: nan, Time: 0.3485s
[4] Improvement: nan, Time: 0.3055s
[5] Improvement: nan, Time: 0.299s
[6] Improvement: nan, Time: 0.296s
[7] Improvement: nan, Time: 0.3206s
[8] Improvement: nan, Time: 0.32s
[9] Improvement: nan, Time: 0.321s
[10] Improvement: nan, Time: 0.629s
(4550, 100, 2)

Furthermore, all the predictions have a value of 0, which is perhaps why the improvement is nan.
For context on the type of data I'm working with, the first element of all_seq_100_equal is:
[[16, 1], [18, 10], [17, 7], [13, 9], [14, 1], [17, 14], [13, 10], [18, 10], [18, 9], [15, 9], [13, 15], [16, 1], [18, 9], [17, 1], [16, 14], [16, 14], [12, 1], [14, 4], [15, 5], [18, 6], [17, 4], [18, 14], [12, 9], [12, 11], [14, 11], [18, 9], [18, 15], [7, 14], [11, 15], [11, 1], [18, 7], [16, 6], [14, 6], [17, 3], [14, 1], [14, 10], [17, 11], [15, 14], [12, 11], [15, 11], [14, 10], [14, 11], [17, 6], [18, 11], [15, 1], [13, 3], [14, 2], [15, 3], [17, 11], [16, 11], [12, 6], [13, 2], [14, 3], [16, 2], [12, 11], [14, 6], [14, 1], [13, 11], [15, 2], [16, 1], [10, 15], [15, 12], [15, 6], [17, 7], [17, 6], [13, 1], [14, 2], [12, 6], [16, 2], [15, 2], [16, 11], [15, 1], [13, 10], [13, 7], [16, 10], [14, 7], [12, 10], [17, 10], [13, 2], [17, 11], [15, 11], [15, 7], [17, 9], [17, 7], [16, 10], [15, 11], [17, 1], [16, 14], [16, 12], [18, 10], [16, 8], [18, 11], [19, 10], [17, 11], [14, 10], [14, 7], [18, 10], [15, 7], [14, 5], [18, 11]]

@jmschrei
Owner

When you pass in DenseHMM([d, d]) I'm pretty sure you're passing in the same object to each state and so both will always be identical. Try making two copies of the object?
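
The aliasing described here can be illustrated with a minimal sketch. `ToyDistribution` is a hypothetical stand-in used only for illustration (it is not a pomegranate class), and `copy.deepcopy` is one assumed way of "making two copies".

```python
import copy


class ToyDistribution:
    """Hypothetical stand-in for a fitted distribution (not pomegranate)."""

    def __init__(self, probs):
        self.probs = list(probs)


d = ToyDistribution([0.5, 0.5])

# [d, d] holds two references to ONE object, so an update made through
# the first "state" is visible through the second.
shared = [d, d]
shared[0].probs[0] = 0.9
same_object = shared[0] is shared[1]
shared_probs_match = shared[1].probs[0] == 0.9

# Deep copies give each state its own independent parameters.
d1, d2 = copy.deepcopy(d), copy.deepcopy(d)
separate = [d1, d2]
separate[0].probs[0] = 0.1
independent = separate[1].probs[0] == 0.9

print(same_object, shared_probs_match, independent)
```

In the reported code, the analogous change would presumably be to pass two separately constructed or copied `Categorical` objects to `DenseHMM` instead of `[d, d]`. Whether that alone resolves the nan improvement is not established in this thread, and two copies that start with identical parameters may still need different initial values for the states to diverge during training.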
