
[Question] What is the difference between predict_proba and log_probability methods for HMMs #1089

Open
ko62147 opened this issue Apr 1, 2024 · 6 comments


ko62147 commented Apr 1, 2024

Hello,

I fitted an HMM to a set of observation sequences; however, I get positive log-probability values (or probability values greater than 1) when I call the log_probability method on some test observation sequences. What do positive log-probability values mean in the context of HMM inference, and how is the log_probability method different from the predict_proba method?

jmschrei (Owner) commented Apr 2, 2024

predict_proba gives you the posterior probability that each observation aligns to each hidden state in the model, given all of the other observations in the sequence; it is computed with the forward-backward algorithm.
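
To make the difference concrete, here is a minimal sketch (assuming the current v1.x DenseHMM/Normal API; exact argument names and output shapes may differ slightly in your version):

```python
import numpy as np
from pomegranate.distributions import Normal
from pomegranate.hmm import DenseHMM

# Toy data: 10 sequences, each of length 50, with one continuous feature.
X = np.random.randn(10, 50, 1).astype(np.float32)

# Two hidden states with Normal emissions, fit to the sequences.
model = DenseHMM([Normal(), Normal()])
model.fit(X)

# predict_proba: posterior P(state | whole sequence) for every observation,
# computed with forward-backward; roughly shape (10, 50, 2), rows summing to 1.
posteriors = model.predict_proba(X)

# log_probability: one value per sequence, the log of the sequence's density
# under the model; for continuous data it can legitimately be positive.
logp = model.log_probability(X)
```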

log_probability can be positive when you have continuous observations and a distribution with a very small variance. For instance, if you have a normal distribution with a mean of 0 and a std of 0.0001, a value of 0 will have a probability density well above 1, and hence a positive log probability.
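
You can check that with any stats library; for example, with scipy (not pomegranate-specific):

```python
from scipy.stats import norm

# Normal distribution with mean 0 and a very small standard deviation.
tight = norm(loc=0.0, scale=1e-4)

print(tight.pdf(0.0))     # ~3989.4: the density at 0 is far above 1
print(tight.logpdf(0.0))  # ~8.29: so the "log probability" is positive
```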

ko62147 (Author) commented Apr 2, 2024

Thanks for the reply. I am trying to understand the physical meaning of the results from the log_probability method. Does a positive log probability (or a probability greater than 1) for continuous observations mean there is complete certainty (i.e. 100% probability) that the observations/data were generated by the distribution/model?

jmschrei (Owner) commented Apr 2, 2024

I think you're entering one of the confusing areas of probability theory. Basically, just because a point estimate is above 1 doesn't mean that the event is guaranteed to happen. For instance, in my example above, p(0.0001) would be above 1, but so would p(0.00011), and both can't be guaranteed to happen. Instead, people usually look at the probability of an event falling within a range of the distribution and then make that range very small, e.g. F(x+ε) − F(x−ε) for a small ε; that quantity is always at most 1, and dividing it by 2ε recovers the density, which is what can exceed 1. In my experience, the most practical interpretation of densities greater than 1 is that your model has overfit to something.
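
Continuing the scipy illustration from above: the density at a point can be huge, but the probability of landing in any interval around that point is still at most 1:

```python
from scipy.stats import norm

tight = norm(loc=0.0, scale=1e-4)
eps = 1e-5

density = tight.pdf(0.0)                 # ~3989.4, far greater than 1
mass = tight.cdf(eps) - tight.cdf(-eps)  # P(-eps <= X <= eps) ~ 0.08, a genuine probability
print(density, mass, mass / (2 * eps))   # mass / (2 * eps) approximates the density again
```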

ko62147 (Author) commented Apr 3, 2024

Understood. Thanks for the clarification. What do you recommend to reduce overfitting for HMMs?

ko62147 (Author) commented Apr 3, 2024

I am fitting/training HMMs on time series (datetime) data transformed into radial basis functions or sine/cosine vectors and scaled with a min-max scaler (a rough sketch of the sine/cosine encoding is below, after the questions). However, I keep obtaining positive log_probability values for some of the test observation sequences with these transformations. Based on your experience:

  1. What would you recommend to address the positive log_probability values returned for the test observation sequences?
  2. What time series (datetime) transformation would you recommend for datetime observations when fitting an HMM?
  3. What do you recommend to eliminate overfitting in HMMs trained on these (continuous) observations?
  4. Is it viable/reasonable to combine (transformed/preprocessed) datetime and binary features into observation sequences to fit/train an HMM?
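
For reference, this is roughly what the sine/cosine encoding looks like (a simplified sketch with made-up timestamps; the real pipeline also builds the radial basis features):

```python
import numpy as np
import pandas as pd

# Hypothetical hourly timestamps standing in for the real observation times.
timestamps = pd.date_range("2024-01-01", periods=100, freq="H")

# Cyclic encoding of the hour of day: each timestamp becomes a (sin, cos) pair.
hours = timestamps.hour.to_numpy()
sin_hour = np.sin(2 * np.pi * hours / 24)
cos_hour = np.cos(2 * np.pi * hours / 24)

# Min-max scale each feature into [0, 1] before fitting the HMM.
features = np.stack([sin_hour, cos_hour], axis=1)
features = (features - features.min(axis=0)) / (features.max(axis=0) - features.min(axis=0))
```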

jmschrei (Owner) commented Apr 4, 2024

  1. Having positive log probability values isn't a problem that needs fixing. The math is still valid; one just needs to know what the values mean and why they occur.
  2. If you're going to use values explicitly scaled to the 0-1 range, you might want to use a distribution like a Beta (you'd have to implement your own) that is explicitly defined on that range. If you want negative log probabilities and are using a Normal distribution, you might try mean/std scaling instead.
  3. It depends on the model parameters. What does the transition matrix look like? What are the distributions and what do their parameters look like?
  4. Sure, just use https://github.com/jmschrei/pomegranate/blob/master/pomegranate/distributions/independent_components.py (a rough sketch is below). This class lets you pass in one univariate distribution for each feature, and each can be a totally different distribution type. The one catch is that it doesn't learn covariance across features.
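
Roughly like this (a sketch against the current v1.x API; check the linked file for the exact constructor arguments, since this is untested):

```python
import numpy as np
from pomegranate.distributions import Normal, Bernoulli, IndependentComponents
from pomegranate.hmm import DenseHMM

# Toy data: 10 sequences of length 50 with three features per observation --
# two continuous columns (e.g. the sin/cos datetime encoding) and one binary column.
continuous = np.random.randn(10, 50, 2)
binary = np.random.randint(0, 2, size=(10, 50, 1))
X = np.concatenate([continuous, binary], axis=-1).astype(np.float32)

# One univariate distribution per feature; the types can differ, but no
# covariance is learned across features.
def emission():
    return IndependentComponents([Normal(), Normal(), Bernoulli()])

model = DenseHMM([emission(), emission()])  # two hidden states
model.fit(X)
```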
