[BUG]Speed issue #1075

tanyasarkjain · 2024-01-18T08:39:43Z

Describe the bug
A clear and concise description of what the bug is, including what you were expecting to happen and what actually happened. Please report the version of pomegranate that you are using and the operating system. Also, please make sure that you have upgraded to the latest version of pomegranate before submitting the bug report.

I am using the latest version of pomegranate. My code is taking an incredibly long time to run; all I am doing is creating an uninitialized HMM and fitting it to my data. The model has 2 states, and the emissions are multivariate (2 features), each taking on a range of about 20 values. Furthermore, when I print out the predictions, I only get a state assignment of 1. I tried it with just 2 iterations and it took about 6 minutes.

To Reproduce
Please provide a snippet of code that can reproduce this error. It is much easier for us to track down bugs and fix them if we have an example script that fails until we're successful.

import pomegranate
import seaborn; seaborn.set_style('whitegrid')
import torch
#https://pomegranate.readthedocs.io/en/latest/tutorials/B_Model_Tutorial_4_Hidden_Markov_Models.html#Initializing-Hidden-Markov-Models

print(pomegranate.__version__)

from pomegranate.hmm import DenseHMM

Here is a snippet of what mv_emissions looks like:

[[[16, 11],
  [16, 12],
  [13, 12],
  [15, 12],
  [15, 9],
  [14, 6],
  [15, 3],
  [9, 6],]]

Response time
Although I will likely respond during weekdays if I am not on vacation, I am not likely to be able to merge PRs or write code until the weekend.

jmschrei · 2024-01-18T17:23:18Z

That doesn't sound right. Unfortunately, without code to check what's going on, it'll be difficult for me to provide feedback. Are you using a GPU? What happens if you set max_iter to be a small number? What is the shape of the data you're training on?

tanyasarkjain · 2024-01-19T08:44:58Z

Okay, it is no longer taking as long now; I fixed an issue with the dimensions of my sequences. However, I am now getting 'nan' improvement:

```python
import numpy as np

from pomegranate.hmm import DenseHMM
from pomegranate.distributions import Categorical

# Uniform start probabilities for the two states
starts = [0.5, 0.5]

# Fit one categorical emission distribution on a single sequence
d = Categorical().fit(all_seq_100_equal[1])
print('d', d.probs)

model = DenseHMM([d, d], starts = starts, max_iter=10, verbose=True)
model.fit(all_seq_100_equal)
print(np.array(all_seq_100_equal).shape)
```

The output is:
d Parameter containing:
tensor([[0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0100, 0.0000, 0.0300, 0.0400, 0.0300, 0.1500, 0.1300, 0.2000, 0.2300, 0.1700, 0.0100],
        [0.0000, 0.0000, 0.0000, 0.0000, 0.0200, 0.0200, 0.4700, 0.0300, 0.0000, 0.0300, 0.2500, 0.0900, 0.0100, 0.0000, 0.0800, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000]])
[1] Improvement: nan, Time: 0.3021s
[2] Improvement: nan, Time: 0.336s
[3] Improvement: nan, Time: 0.3485s
[4] Improvement: nan, Time: 0.3055s
[5] Improvement: nan, Time: 0.299s
[6] Improvement: nan, Time: 0.296s
[7] Improvement: nan, Time: 0.3206s
[8] Improvement: nan, Time: 0.32s
[9] Improvement: nan, Time: 0.321s
[10] Improvement: nan, Time: 0.629s
(4550, 100, 2)

Furthermore, all the predictions have a value of 0, which is perhaps why there is nan improvement.
The first element of all_seq_100_equal, for context on the type of data I'm working with, is:
[[16, 1], [18, 10], [17, 7], [13, 9], [14, 1], [17, 14], [13, 10], [18, 10], [18, 9], [15, 9], [13, 15], [16, 1], [18, 9], [17, 1], [16, 14], [16, 14], [12, 1], [14, 4], [15, 5], [18, 6], [17, 4], [18, 14], [12, 9], [12, 11], [14, 11], [18, 9], [18, 15], [7, 14], [11, 15], [11, 1], [18, 7], [16, 6], [14, 6], [17, 3], [14, 1], [14, 10], [17, 11], [15, 14], [12, 11], [15, 11], [14, 10], [14, 11], [17, 6], [18, 11], [15, 1], [13, 3], [14, 2], [15, 3], [17, 11], [16, 11], [12, 6], [13, 2], [14, 3], [16, 2], [12, 11], [14, 6], [14, 1], [13, 11], [15, 2], [16, 1], [10, 15], [15, 12], [15, 6], [17, 7], [17, 6], [13, 1], [14, 2], [12, 6], [16, 2], [15, 2], [16, 11], [15, 1], [13, 10], [13, 7], [16, 10], [14, 7], [12, 10], [17, 10], [13, 2], [17, 11], [15, 11], [15, 7], [17, 9], [17, 7], [16, 10], [15, 11], [17, 1], [16, 14], [16, 12], [18, 10], [16, 8], [18, 11], [19, 10], [17, 11], [14, 10], [14, 7], [18, 10], [15, 7], [14, 5], [18, 11]]
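
One possible contributor to the nan values, which nobody in the thread confirms, is the exact zeros in the printed probabilities. The sketch below uses only the standard library and hypothetical toy numbers. It assumes that "improvement" is the difference between successive log-likelihoods, and it replaces the HMM with a simplified two-state mixture. If any observed symbol has probability 0 under every state, the log-likelihood is -inf, and the difference of two -inf values is nan.

```python
import math

# Hypothetical emission probabilities over three symbols for two states.
# Symbol 2 has probability 0 in both states, like the zero entries printed above.
emission_probs = [
    [0.7, 0.3, 0.0],  # state 0
    [0.4, 0.6, 0.0],  # state 1
]

def log_likelihood(sequence, emission_probs, weights=(0.5, 0.5)):
    """Log-likelihood of a sequence under a simple mixture of the states.

    This ignores transitions, so it is a simplification of an HMM and
    not how pomegranate computes its log-likelihood.
    """
    total = 0.0
    for symbol in sequence:
        p = sum(w * probs[symbol] for w, probs in zip(weights, emission_probs))
        total += math.log(p) if p > 0 else -math.inf
    return total

seen_only = [0, 1, 1, 0]     # every symbol has nonzero probability
with_unseen = [0, 1, 2, 0]   # symbol 2 has zero probability everywhere

ll_seen = log_likelihood(seen_only, emission_probs)
ll_prev = log_likelihood(with_unseen, emission_probs)
ll_next = log_likelihood(with_unseen, emission_probs)

# Difference of two -inf log-likelihoods
improvement = ll_next - ll_prev

print(ll_seen, ll_prev, improvement)
```

If this is what is happening, fitting the initial distributions on more than one sequence, so that every observed value gets nonzero probability, would be worth trying.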

jmschrei · 2024-03-11T00:26:54Z

When you pass in DenseHMM([d, d]) I'm pretty sure you're passing in the same object to each state and so both will always be identical. Try making two copies of the object?
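
The aliasing described above can be sketched without pomegranate. `ToyCategorical` is a hypothetical stand-in class, not pomegranate's `Categorical`; the point is only that `[d, d]` holds two references to one object, so an update to one "state" also shows up in the other, while independent copies do not share updates.

```python
import copy

class ToyCategorical:
    """Hypothetical stand-in for an emission distribution (not pomegranate's class)."""
    def __init__(self, probs):
        self.probs = list(probs)

# Same object used for both states
d = ToyCategorical([0.2, 0.8])
shared = [d, d]
shared_is_same = shared[0] is shared[1]

shared[0].probs = [0.9, 0.1]                  # update "state 0"
state1_after_shared_update = shared[1].probs  # "state 1" changed as well

# Independent copies, one per state
base = ToyCategorical([0.2, 0.8])
separate = [copy.deepcopy(base), copy.deepcopy(base)]
separate_is_same = separate[0] is separate[1]

separate[0].probs = [0.9, 0.1]                    # update "state 0"
state1_after_separate_update = separate[1].probs  # "state 1" is unaffected

print(shared_is_same, state1_after_shared_update)
print(separate_is_same, state1_after_separate_update)
```

In the code from this thread, the equivalent change would be to build two distinct `Categorical` objects, for example by fitting a second one or copying the first, and pass those to `DenseHMM`. Which approach works best with pomegranate is not confirmed here.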
