Incomplete info in documentation regarding the combinations for `<group>=NA` and `re_formula=NULL` #1652

mattansb · 2024-05-12T06:44:30Z

Currently, the docs for prepare_predictions() read:

newdata
[...] NA values within factors are interpreted as if all dummy variables of this factor are zero.
re_formula
[...] If NULL (default), include all group-level effects; if NA, include no group-level effects.

The newdata argument seems to suggest that setting newdata = data.frame(..., group = NA) should have the same effect as re_formula = NA since in both cases the group-specific coefficients are set to 0.

But this is not the case.

Instead, it seems that

prepare_predictions(
  newdata = data.frame(..., group = NA), 
  re_formula = NULL, # default
  allow_new_levels = FALSE # default
)

is closer to

prepare_predictions(
  newdata = data.frame(..., group = "<NEW>"), 
  re_formula = NULL, # default
  allow_new_levels = TRUE
)

(even though new levels throw an error when allow_new_levels = FALSE).

It is not clear which of sample_new_levels = c("uncertainty", "gaussian") is used in this case.


paul-buerkner · 2024-05-14T11:21:04Z

newdata = data.frame(..., group = NA) just defines a new grouping level, which does not affect any dummy variables, since random effects don't have dummy variables. Such variables only apply for fixed effects. How can we make this clearer?

mattansb · 2024-05-15T05:05:50Z

I was expecting newdata = data.frame(..., group = NA) to be the same as re_formula = NA because I interpreted "NA values within factors are interpreted as if all dummy variables of this factor are zero." to mean that in a mixed model

$$ y = bX + uZ + e $$

Then all $Z$ are set to 0, similar to how if group was a fixed effect all $X$ would be set to 0.
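
Spelled out in the notation above, the two candidate readings of group = NA can be sketched as follows. This is only an illustration: $u_{\text{new}}$ is a symbol introduced here, not something taken from the docs.

Reading 1, all $Z$ set to 0 (no group-level effects, which is what re_formula = NA is documented to do):

$$ y = bX + e $$

Reading 2, NA treated as one more, previously unseen level of the grouping factor:

$$ y = bX + u_{\text{new}} + e $$

In the second reading the group-level term does not vanish. Instead, $u_{\text{new}}$ stands for a group-level effect of a level that was not in the fitted data, and how it is obtained would presumably depend on sample_new_levels.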

But if newdata = data.frame(..., group = NA) is just another "new" level, then it should also give an error if not setting allow_new_levels:

library(brms)

fit <- brm(count ~ 1 + (1|patient),
           data = epilepsy, family = poisson())


posterior_epred(fit,
  newdata = data.frame(patient = "<NEW>")
)
#> Error: Levels '<NEW>' of grouping factor 'patient' cannot be found in the 
#> fitted model. Consider setting argument 'allow_new_levels' to TRUE.

# Does not throw an error...
posterior_epred(fit,
  newdata = data.frame(patient = NA)
)
#>           [,1]
#> [1,]  1.772992
#> [2,]  4.682992
#> [3,] 11.606553
#> [4,]  2.182194
#> [5,]  1.660112
#> [6,]  2.234523
#> .....

If this is the intended behavior, it should also require setting allow_new_levels = TRUE, and maybe the docs should read:

NA values within fixed factors are interpreted as if all dummy variables of this factor are zero. NA values within random factors are treated as a new level.

paul-buerkner · 2024-05-15T05:48:52Z

Good points. Let me check in more detail.


paul-buerkner added the documentation label May 21, 2024

paul-buerkner added this to the brms 2.22.0 milestone May 21, 2024
