get_variance() and null_model() functions not working depending on the way you call the objects of your model #837

Open
Sebastian-Montoya-B opened this issue Dec 4, 2023 · 1 comment
Labels
3 investigators ❔❓ (Need to look further into this issue) · bug 🐛 (Something isn't working) · get_variance (function specific labels)

Comments

@Sebastian-Montoya-B

Dear easystats team.

I was trying to use the get_variance() function of the insight package, and I got the following warning:

Warning message:
Can't calculate model's distribution-specific variance. Results are not reliable.

While doing the traceback of that warning, I found out that the function null_model() "does not like" the way I call the response variable in a glmmTMB model. So, when I use null_model(my_model), I get

NULL

as the result, even though there is nothing "wrong" with my model. This result also affects other functions, such as R2() from the performance package. Additionally, less experienced R users might not be able to find out what is actually happening because, in this specific case, the warning message is misleading.

Here is a reproducible example using the attached csv file, where, as you will see, the null_model() function will only work correctly for model2 and model3:
dfexample.csv

library(insight)
library(glmmTMB)

dfex <- read.csv("dfexample.csv")

# model1: response given by column index; null_model() returns NULL
model1 <- glmmTMB(dfex[, 2] ~ response + (1 | random), data = dfex, family = ordbeta)
get_variance(model1)
null_model(model1, verbose = TRUE)

# model2: predictor given by column index; null_model() works
model2 <- glmmTMB(response ~ dfex[, 4] + (1 | random), data = dfex, family = ordbeta)
get_variance(model2)
null_model(model2, verbose = TRUE)

# model3: all variables given by name; null_model() works
model3 <- glmmTMB(response ~ predictor + (1 | random), data = dfex, family = ordbeta)
get_variance(model3)
null_model(model3, verbose = TRUE)

This produces the following output:

null_model(model1, verbose = TRUE)
NULL

null_model(model2, verbose = TRUE)
Formula: response ~ (1 | random)
Data: dfex
AIC BIC logLik df.resid
-445.3939 -430.6148 227.6969 137
Random-effects (co)variances:

Conditional model:
Groups Name Std.Dev.
random (Intercept) 0.4869

Number of obs: 142 / Conditional model: random, 3

Dispersion parameter for ordbeta family (): 26.7

lower cutoff estimate: 0.000579, 1

Fixed Effects:

Conditional model:
(Intercept)
-2.324

null_model(model3, verbose = TRUE)
Formula: response ~ (1 | random)
Data: dfex
AIC BIC logLik df.resid
-445.3939 -430.6148 227.6969 137
Random-effects (co)variances:

Conditional model:
Groups Name Std.Dev.
random (Intercept) 0.4869

Number of obs: 142 / Conditional model: random, 3

Dispersion parameter for ordbeta family (): 26.7

lower cutoff estimate: 0.000579, 1

Fixed Effects:

Conditional model:
(Intercept)
-2.324
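
A possible explanation for the difference, offered as an assumption and not checked against the insight source: any helper that rebuilds the null model from the response name has an easy job when the left-hand side of the formula is a plain column name, and a much harder one when it is an indexing expression. The base R sketch below only inspects the two formulas; it fits no model and needs no data.

```r
# Formulas as written for model1 and model3; no data or model fit is needed
f1 <- dfex[, 2] ~ response + (1 | random)
f3 <- response ~ predictor + (1 | random)

# Left-hand side (the response) of each formula
lhs1 <- f1[[2]]
lhs3 <- f3[[2]]

is.name(lhs1)  # FALSE: the response is a call to `[`, not a column name
is.name(lhs3)  # TRUE: the response is a plain symbol

deparse(lhs1)  # "dfex[, 2]"
all.vars(f1)   # includes the data frame name `dfex` itself
all.vars(f3)   # only names that could be columns of `data`
```

In the first formula, the "variable" on the left-hand side is the whole data frame plus a position, so there is no column name that could be looked up in `data` when the model is refitted.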

Here is my session info:

R version 4.2.3 (2023-03-15 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=Spanish_Colombia.utf8 LC_CTYPE=Spanish_Colombia.utf8
[3] LC_MONETARY=Spanish_Colombia.utf8 LC_NUMERIC=C
[5] LC_TIME=Spanish_Colombia.utf8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] glmmTMB_1.1.8 insight_0.19.7.3

loaded via a namespace (and not attached):
[1] zoo_1.8-11 tidyselect_1.2.0 TMB_1.9.6 purrr_1.0.2
[5] listenv_0.8.0 splines_4.2.3 lattice_0.20-45 bestNormalize_1.9.1
[9] vctrs_0.6.3 generics_0.1.3 mgcv_1.8-42 utf8_1.2.2
[13] survival_3.5-3 prodlim_2023.08.28 rlang_1.1.1 nloptr_2.0.3
[17] pillar_1.9.0 glue_1.6.2 rngtools_1.5.2 doRNG_1.8.6
[21] multcomp_1.4-25 emmeans_1.8.8 foreach_1.5.2 lifecycle_1.0.3
[25] lava_1.7.2.1 timeDate_4022.108 commonmark_1.9.0 mvtnorm_1.1-3
[29] future_1.29.0 recipes_1.0.8 coda_0.19-4 codetools_0.2-19
[33] doParallel_1.0.17 parallel_4.2.3 class_7.3-21 fansi_1.0.3
[37] TH.data_1.1-2 Rcpp_1.0.9 xtable_1.8-4 ipred_0.9-14
[41] parallelly_1.32.1 lme4_1.1-34 digest_0.6.30 dplyr_1.1.3
[45] numDeriv_2016.8-1.1 grid_4.2.3 hardhat_1.3.0 cli_3.6.1
[49] tools_4.2.3 sandwich_3.0-2 magrittr_2.0.3 tibble_3.2.1
[53] future.apply_1.11.0 pkgconfig_2.0.3 MASS_7.3-58.2 Matrix_1.6-1.1
[57] data.table_1.14.6 xml2_1.3.3 estimability_1.4.1 lubridate_1.9.0
[61] timechange_0.1.1 gower_1.0.1 minqa_1.2.5 butcher_0.3.3
[65] rstudioapi_0.14 iterators_1.0.14 R6_2.5.1 globals_0.16.2
[69] rpart_4.1.19 boot_1.3-28.1 nnet_7.3-19 nlme_3.1-162
[73] compiler_4.2.3

Thank you very much for your time.

@bwiernik
Contributor

bwiernik commented Dec 6, 2023

It might be possible to fix this, but of the three, only model3 is a reliable way to specify a model. Mixing the data argument with vectors supplied directly (by subsetting the original data frame inside the formula) is extremely unreliable and prone to errors or incorrect results.
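
If the columns are only known by position, one way to stay with the model3 style is to look up the names first and build the formula from them. This is a minimal sketch in base R: the toy data frame and its column names are assumptions standing in for dfexample.csv, and I have not verified how null_model() treats a formula passed in through a variable.

```r
# Toy data frame standing in for dfexample.csv (column names are assumptions)
dfex <- data.frame(
  random    = rep(c("a", "b", "c"), each = 4),
  response  = seq(0.1, 0.65, length.out = 12),
  other     = 1:12,
  predictor = rep(c(0, 1), times = 6)
)

# Translate column positions into names once, outside the formula
resp_name <- names(dfex)[2]
pred_name <- names(dfex)[4]

# Build a formula that refers to columns by name only
f <- reformulate(c(pred_name, "(1 | random)"), response = resp_name)
print(f)  # response ~ predictor + (1 | random)

# The model call would then look like (not run here):
# glmmTMB(f, data = dfex, family = ordbeta)
```

The simplest option is still to print `names(dfex)` and type the names directly into the formula, as in model3.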

@strengejacke strengejacke added bug 🐛 Something isn't working 3 investigators ❔❓ Need to look further into this issue labels Jan 30, 2024
@strengejacke strengejacke added the get_variance function specific labels label Mar 11, 2024