Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with icc() when running glmmTMB and family = ordbeta() or beta_family() #764

Open
reiniervlinschoten opened this issue May 8, 2023 · 6 comments
Labels
3 investigators ❔❓ Need to look further into this issue get_variance function specific labels

Comments

@reiniervlinschoten
Copy link

It seems like there is an error when running the icc() function on a glmmTMB object that is ran with the ordbeta() family.
The icc() is way lower than expected when for example running the same model but with a gaussian distribution (without adjusting the range).
Or it is way too high, when using a beta distribution (while adjusting the range).

See also this reprex:

library(glmmTMB)
library(lme4)
#> Loading required package: Matrix
library(DHARMa)
#> This is DHARMa 0.4.6. For overview type '?DHARMa'. For recent changes, type news(package = 'DHARMa')
library(performance)
library(tidyverse)

set.seed(101)
dd <- data.frame(z = rnorm(1000, 80, 20), 
                 group = rep(-9:10, each = 50)) %>%
  mutate(re = rnorm(1000, group, 10)) %>%
  mutate(z = pmax(z + re, 0.1))

mod_lme4 <- lmer(z ~ (1|group), data = dd)
icc(mod_lme4)
#> # Intraclass Correlation Coefficient
#> 
#>     Adjusted ICC: 0.073
#>   Unadjusted ICC: 0.073


mod_TMB <- glmmTMB(z ~ (1|group), data = dd, 
                   control = glmmTMBControl(rank_check = "adjust"))
icc(mod_TMB)
#> # Intraclass Correlation Coefficient
#> 
#>     Adjusted ICC: 0.069
#>   Unadjusted ICC: 0.069


ordbeta_TMB <- glmmTMB(z ~ (1|group), data = dd %>% mutate(z = z/max(z)), 
                       family=ordbeta(),
                       control = glmmTMBControl(rank_check = "adjust"))
icc(ordbeta_TMB)
#> # Intraclass Correlation Coefficient
#> 
#>     Adjusted ICC: 0.003
#>   Unadjusted ICC: 0.003

beta_TMB <- glmmTMB(z ~ (1|group), data = dd %>% mutate(z = z/((max(z) + 0.01))), 
                    family=beta_family(),
                    control = glmmTMBControl(rank_check = "adjust"))
icc(beta_TMB)
#> Warning: mu of 1.1 is too close to zero, estimate of random effect variances may
#>   be unreliable.
#> Warning: Model's distribution-specific variance is negative. Results are not
#>   reliable.
#> # Intraclass Correlation Coefficient
#> 
#>     Adjusted ICC: 1.496
#>   Unadjusted ICC: 1.496
Created on 2023-05-08 with [reprex v2.0.2](https://reprex.tidyverse.org/)
@strengejacke strengejacke added 3 investigators ❔❓ Need to look further into this issue get_variance function specific labels labels May 9, 2023
@strengejacke
Copy link
Member

Might be related to #664. glmmTMB updated the family$variance() functions (resp. the ordered-beta family is rather new and not yet considered in insight::get_variance()), and I have to check the code, since the returned variance from glmmTMB has changed.

@mattansb
Copy link
Member

mattansb commented May 9, 2023

Other than the issue reported by Daniel,
icc() returns the model implied ICC on the link scale, so it is generally expected the use of different families or link functions will produce difference ICC values. (note that ordbeta_TMB and beta_TMB produce predictions that are not in agreement).

library(lme4)
#> Warning: package 'lme4' was built under R version 4.2.3
#> Loading required package: Matrix
#> Warning: package 'Matrix' was built under R version 4.2.3
library(performance)
#> Warning: package 'performance' was built under R version 4.2.3

m_lin <- lmer(cyl ~ 1 + (1 | gear),
              data = mtcars)

m_pois <- glmer(cyl ~ 1 + (1 | gear),
                family = poisson("log"),
                data = mtcars)

m_pois2 <- glmer(cyl ~ 1 + (1 | gear),
                 family = poisson("sqrt"),
                 data = mtcars)

icc(m_lin)
#> # Intraclass Correlation Coefficient
#> 
#>     Adjusted ICC: 0.544
#>   Unadjusted ICC: 0.544
icc(m_pois)
#> # Intraclass Correlation Coefficient
#> 
#>     Adjusted ICC: 0.155
#>   Unadjusted ICC: 0.155
icc(m_pois2)
#> # Intraclass Correlation Coefficient
#> 
#>     Adjusted ICC: 0.145
#>   Unadjusted ICC: 0.145

Created on 2023-05-09 with reprex v2.0.2

@reiniervlinschoten
Copy link
Author

Other than the issue reported by Daniel, icc() returns the model implied ICC on the link scale, so it is generally expected the use of different families or link functions will produce difference ICC values. (note that ordbeta_TMB and beta_TMB produce predictions that are not in agreement).

@mattansb , is there any way to get the ICC from different models on the same scale? Or are ICCs from different models not comparable?

The ordered beta paper can be found here: https://www.cambridge.org/core/journals/political-analysis/article/abs/ordered-beta-regression-a-parsimonious-wellfitting-model-for-continuous-data-with-lower-and-upper-bounds/89F4141DA16D4FC217809B5EB45EEE83

This might be different than the beta distribution due to the cut-points used for the lower and upper bounds (see also equation (5))

@mattansb
Copy link
Member

mattansb commented May 9, 2023

You can potentially compute a prediction based R2:

library(glmmTMB)
#> Warning: package 'glmmTMB' was built under R version 4.2.3
#> Warning in checkDepPackageVersion(dep_pkg = "TMB"): Package version inconsistency detected.
#> glmmTMB was built with TMB version 1.9.3
#> Current TMB version is 1.9.4
#> Please re-install glmmTMB from source or restore original 'TMB' package (see '?reinstalling' for more information)
library(dplyr)
#> Warning: package 'dplyr' was built under R version 4.2.3
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

set.seed(101)
dd <- data.frame(z = rnorm(1000, 80, 20), 
                 group = rep(-9:10, each = 50)) %>%
  mutate(re = rnorm(1000, group, 10)) %>%
  mutate(z = pmax(z + re, 0.1),
         z2 = z / max(z),
         z3 = z / (max(z) + 0.01))

m_gauss <- glmmTMB(z ~ (1|group), data = dd)

m_ordbeta <- glmmTMB(z2 ~ (1|group), data = dd, 
                     family = ordbeta())

m_beta <- glmmTMB(z3 ~ (1|group), data = dd, 
                  family = beta_family())


dd$pred_m_gauss <- predict(m_gauss, type = "response", re.form = NULL)
dd$pred_m_ordbeta <- predict(m_ordbeta, type = "response", re.form = NULL)
dd$pred_m_beta <- predict(m_beta, type = "response", re.form = NULL)

with(dd, cor(z, pred_m_gauss)^2)
#> [1] 0.08750488
with(dd, cor(z2, pred_m_ordbeta)^2)
#> [1] 0.08537998
with(dd, cor(z3, pred_m_beta)^2)
#> [1] 0.08477542

Created on 2023-05-09 with reprex v2.0.2

@reiniervlinschoten
Copy link
Author

That is an interesting option, but I am looking for the ICC (adjusted) for one of the grouping factors (random intercepts) in my data. Is there a comparable workaround?

@mattansb
Copy link
Member

That's a bit harder to do - but you can probably use this method to get those as well, by subtracting the re.form = NA from the re.form = NULL predictions or something of the like...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 investigators ❔❓ Need to look further into this issue get_variance function specific labels
Projects
None yet
Development

No branches or pull requests

3 participants