bas.predict error when n<p when using estimator= BMA and se.fit =T #70

Open
petersen-f opened this issue May 11, 2023 · 2 comments

Describe the bug
Great package! I noticed a bug, however, when trying to make predictions.
When training a model with more predictors than observations (n < p) via method = 'BAS', predicting on new data with se.fit = TRUE and estimator = 'BMA' fails with the following error: Error in solve.default(qr.R(qr(oldX))) : 'a' (14 x 15) must be square
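
The dimensions in the error message can be reproduced without BAS. The following is a minimal sketch in base R, assuming a simulated 14 x 15 design matrix; the name oldX is borrowed from the error message and the data are made up, so this only illustrates the failing call, not the package internals. For a matrix with fewer rows than columns, qr.R() returns a factor with min(n, p) rows, which solve() cannot invert.

```r
# Simulated stand-in for the design matrix (assumption, not the bodyfat data):
# 14 observations, an intercept column plus 14 predictors.
set.seed(1)
n <- 14
p <- 15
oldX <- cbind(1, matrix(rnorm(n * (p - 1)), nrow = n))

# The R factor of a QR decomposition has min(n, p) rows.
R <- qr.R(qr(oldX))
dim(R)   # 14 15, so R is not square when n < p

# solve() requires a square matrix; capture the error message instead of stopping.
res <- tryCatch(solve(R), error = function(e) conditionMessage(e))
print(res)
```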

To Reproduce
Steps to reproduce the behavior:

library(BAS)
data("bodyfat")
bas_mod <- bas.lm(Bodyfat ~ ., data = bodyfat[1:14, ], method = "BAS")
pred <- predict(bas_mod, newdata = bodyfat[15:20, ], se.fit = TRUE, estimator = "BMA")

Expected behavior
The function should return predictions with the 95% credible interval.
If this behavior is not a bug and this type of prediction is impossible, I would expect a more informative error stating that se.fit = TRUE is not supported in n < p scenarios with estimator = 'BMA' and method = 'BAS'. It seems to work fine if the method is set to MCMC, however.

Desktop (please complete the following information):

  • OS: Ubuntu 22.04
  • R Version: 4.2.2
  • BAS version: 1.6.4
@merliseclyde merliseclyde self-assigned this May 11, 2023
merliseclyde (Owner) commented May 11, 2023

Thanks! That is a bug in the n < p case. My guess is that it does not happen with MCMC because the sampler is not visiting the non-full-rank models, whereas BAS in this case is enumerating them, so non-full-rank models are part of the model space. (There should have been a warning about that, which is an additional issue.)
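
To illustrate what a non-full-rank model means here, the sketch below uses base R on simulated data. The helper is_full_rank is hypothetical and is not part of BAS. Any model whose design matrix has more columns than the 14 observations cannot have full column rank.

```r
# Simulated 14 x 15 design matrix (assumption): intercept plus 14 predictors.
set.seed(2)
n <- 14
X <- cbind(Intercept = 1, matrix(rnorm(n * 14), nrow = n))

# Hypothetical helper: TRUE when the design matrix built from the
# selected columns has full column rank.
is_full_rank <- function(X, cols) {
  Xm <- X[, cols, drop = FALSE]
  qr(Xm)$rank == ncol(Xm)
}

is_full_rank(X, 1:5)    # small model: 5 columns, 14 rows
is_full_rank(X, 1:15)   # FALSE: 15 columns but only 14 rows
```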

merliseclyde (Owner) commented Dec 4, 2023

The error is also triggered using method = 'deterministic', as this also samples all models.

library(BAS)
data("bodyfat")
bas_mod <- bas.lm(Bodyfat ~ ., data = bodyfat[1:14, ], method = "deterministic")
pred <- predict(bas_mod, newdata = bodyfat[15:20, ], se.fit = TRUE, estimator = "BMA")

This is also a problem in bas.glm.

  1. Short-term fix: add an option in predict to remove rank-deficient models (post-processing).
  2. Assign rank-deficient models prior probability 0 in C, which would fix this via the solution to issue #74 (bas.lm and bas.glm ignoring prior model probabilities that are 0).
  3. Fix the code so that rank-deficient models are not saved for sampling methods BAS, deterministic and MCMC+BAS.
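
The first option could look roughly like the sketch below. It is written in base R on simulated data; the names models, postprobs and drop_rank_deficient are hypothetical and do not correspond to the BAS API or to the eventual fix. It drops models whose design matrix is not of full column rank and renormalizes the remaining posterior probabilities.

```r
# Simulated 14 x 15 design matrix (assumption): intercept plus 14 predictors.
set.seed(3)
n <- 14
X <- cbind(1, matrix(rnorm(n * 14), nrow = n))

# Hypothetical enumerated models (column indices of X) and made-up
# posterior probabilities for them.
models    <- list(1, c(1, 2, 3), 1:10, 1:15)
postprobs <- c(0.1, 0.4, 0.3, 0.2)

# Keep only models with a full-column-rank design matrix, then
# renormalize the posterior probabilities of the models that remain.
drop_rank_deficient <- function(X, models, postprobs) {
  keep <- vapply(models, function(cols) {
    Xm <- X[, cols, drop = FALSE]
    qr(Xm)$rank == ncol(Xm)
  }, logical(1))
  list(models    = models[keep],
       postprobs = postprobs[keep] / sum(postprobs[keep]))
}

out <- drop_rank_deficient(X, models, postprobs)
length(out$models)   # 3: the 15-column model is dropped
out$postprobs        # renormalized weights of the remaining models
```

Renormalizing changes the model-averaged predictions whenever the dropped models carried non-negligible posterior mass, which is one reason to treat this as a short-term measure only.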
