HMM with covariates? #1094

Open
christer-watson opened this issue Apr 15, 2024 · 6 comments

Comments

@christer-watson

More a question for confirmation than anything (and first time posting). Not a bug report.

I am trying to construct a 2-state HMM whose emission distributions are Gamma distributions. I would like the transition matrix to depend on the day of the week (and, down the line, other variables). I think the term for these is covariates.

It appears that this is not currently implemented in Pomegranate. I think my only option would be to create a new HMM distribution that had a more complex treatment of transition matrices. Is all that right? If so, it is probably outside my commitment & knowledge level. However, I wanted to confirm there wasn't another approach before giving up.

@jmschrei
Owner

Howdy

Yeah, sorry, time-dependent HMMs are super cool but aren't currently supported. If you have a resource for how they're implemented I'd love to take a look to confirm my understanding but I can't commit to implementing them soon.

@christer-watson
Author

Just one last question. Push aside the difficulties of a time covariate. Is something like a temperature covariate (i.e., not cyclic) easy to incorporate, or would that too require creating a new HMM distribution?

@jmschrei
Owner

The short answer is that you'd probably need to make a new class in either situation, but the modifications could potentially be minimal.

If these covariates linearly modify the log probabilities in the transition matrix you could modify the forward and backward functions (e.g., this line https://github.com/jmschrei/pomegranate/blob/master/pomegranate/hmm/dense_hmm.py#L339 and maybe also https://github.com/jmschrei/pomegranate/blob/master/pomegranate/hmm/dense_hmm.py#L333) to add in your covariates at that time step. Conceptually, if you're able to fix the forward and backward methods (and pass the tensor of covariates through all the other functions, e.g. fit and forward_backward) you should also be able to train the model without any other larger modifications.
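To make the linear case concrete, here is a minimal pure-Python sketch of a forward pass in which a covariate shifts the log transition scores at each time step. It is not pomegranate's code: the function names, the `weights` coefficients, the scalar covariate, and the row renormalisation after the shift are all assumptions made for illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def covariate_log_transitions(base_log_T, weights, x_t):
    """Shift each baseline log transition score by weights[i][j] * x_t.

    Rows are renormalised afterwards so each still sums to 1 in probability
    space (an assumption; the thread does not say how to normalise).
    """
    adjusted = []
    for base_row, w_row in zip(base_log_T, weights):
        scores = [b + w * x_t for b, w in zip(base_row, w_row)]
        norm = logsumexp(scores)
        adjusted.append([s - norm for s in scores])
    return adjusted


def forward_log_likelihood(log_start, base_log_T, weights, log_emissions, covariates):
    """Forward algorithm where the transition into step t uses covariates[t].

    log_emissions[t][j] is the log-probability of observation t under state j.
    """
    n_states = len(log_start)
    alpha = [log_start[j] + log_emissions[0][j] for j in range(n_states)]
    for t in range(1, len(log_emissions)):
        log_T = covariate_log_transitions(base_log_T, weights, covariates[t])
        alpha = [
            logsumexp([alpha[i] + log_T[i][j] for i in range(n_states)])
            + log_emissions[t][j]
            for j in range(n_states)
        ]
    return logsumexp(alpha)
```

With all weights set to zero this reduces to the ordinary forward algorithm, which is a convenient sanity check before wiring covariates through the backward pass as well.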

If these covariates non-linearly modified/generated the transition matrix you'd probably need to do something more complicated that takes in the covariates as input and outputs a transition matrix at each step. I can see this being possible, particularly with PyTorch, but I don't know exactly how you'd optimize the whole thing.
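As a rough illustration of the non-linear case, the sketch below maps a covariate vector to a full transition matrix through a small one-hidden-layer network followed by a row-wise softmax. Everything here is hypothetical (the function name, the parameter layout, the choice of tanh), and it only shows the covariates-in, matrix-out mapping; it says nothing about how to train the parameters.

```python
import math


def softmax(scores):
    """Convert a list of real scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def transition_matrix(covariates, W_hidden, b_hidden, W_out, b_out, n_states):
    """Map a covariate vector to a row-stochastic n_states x n_states matrix.

    Hypothetical parameterisation: tanh hidden units feed n_states * n_states
    output logits, and each block of n_states logits becomes one softmax row.
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(w_row, covariates)) + b)
        for w_row, b in zip(W_hidden, b_hidden)
    ]
    logits = [
        sum(w * h for w, h in zip(w_row, hidden)) + b
        for w_row, b in zip(W_out, b_out)
    ]
    return [
        softmax(logits[i * n_states:(i + 1) * n_states])
        for i in range(n_states)
    ]
```

Calling this once per time step with that step's covariates yields the per-step matrix the forward and backward recursions would consume.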

@christer-watson
Author

Thanks for responding here. One last question. Let's say I want to develop an HMM where each hidden state has two different emission distributions. That is, we observe two different features. One feature is continuous, drawn from a Gaussian, say. The second feature, however, is categorical: a totally different kind of property.

Is this possible?

Reading through the very-well written documentation, I can't figure out if this is possible or not. I've tried playing around, without any luck.

@jmschrei
Owner

jmschrei commented Apr 25, 2024

Yes, you should be able to use IndependentComponents, which treats each feature independently with a different distribution: https://github.com/jmschrei/pomegranate/blob/master/pomegranate/distributions/independent_components.py#L15

A downside of this is that it does treat the two features independently -- which may not be exactly what you want.
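To show what treating the features independently means in practice, here is a small library-free sketch: the joint log-probability of one observation is just the sum of a Gaussian log-density for the continuous feature and a categorical log-probability for the discrete one. The function names and argument layout are made up for this example and are not pomegranate's API.

```python
import math


def normal_log_prob(x, mean, std):
    """Log-density of a univariate Gaussian at x."""
    return -0.5 * math.log(2 * math.pi) - math.log(std) - 0.5 * ((x - mean) / std) ** 2


def categorical_log_prob(k, probs):
    """Log-probability of category index k given a list of category probabilities."""
    return math.log(probs[k])


def independent_log_prob(observation, mean, std, probs):
    """Joint log-probability of (continuous value, category index).

    Independence means the joint factorises, so the log-probabilities add
    and no covariance between the two features is modelled.
    """
    value, category = observation
    return normal_log_prob(value, mean, std) + categorical_log_prob(category, probs)
```

In an HMM, each hidden state would carry its own `mean`, `std`, and `probs`, and this sum would serve as that state's emission log-probability.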

@christer-watson
Author

I think that is exactly what I want. No covariance. Thanks. Btw, the documentation on Independent Components from earlier versions doesn't appear to have migrated to the v1.0 documentation, so I didn't even know this class existed. Reading the older documentation and the code makes clear how to use it, though.

When I tried to create an HMM that fits a Gaussian feature and a separate categorical feature, I ran into an error. The data are converted to floats, but the categorical distribution expects an Int or Long to index into its probabilities. I went into the code and, at the appropriate point, cast the data to Int, and it appears to work. I didn't add any error checking, however, so it probably makes the code more fragile.

Would you want me to try to make a pull request with this change or just leave it?
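For reference, the kind of error checking that the cast described above is missing could look roughly like the following. This is a standalone sketch on plain Python lists, not the actual change made to pomegranate; the function name and the `n_categories` argument are hypothetical.

```python
def to_category_indices(values, n_categories):
    """Cast float-coded category labels to ints, rejecting invalid entries.

    Raises ValueError for NaN, infinities, non-integral values such as 1.5,
    and indices outside the range [0, n_categories).
    """
    indices = []
    for v in values:
        f = float(v)
        # is_integer() is False for NaN and +/-inf as well as for fractions.
        if not f.is_integer():
            raise ValueError(f"category value {v!r} is not a whole number")
        k = int(f)
        if not 0 <= k < n_categories:
            raise ValueError(f"category index {k} is outside [0, {n_categories})")
        indices.append(k)
    return indices
```

Validating before the cast means a stray value such as 1.5 is reported instead of being silently truncated to category 1.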
