
ICC above 1 for glmmTMB with beta_family #664

Open
roaldarbol opened this issue Oct 10, 2022 · 5 comments
Labels
3 investigators ❔❓ Need to look further into this issue · bug 🐛 Something isn't working · get_variance function specific labels

Comments


roaldarbol commented Oct 10, 2022

I'm currently performing repeatability analysis, mostly with the rptR package; however, my data are proportions (not just ones and zeros) and are best modelled with a beta family, so I've been using glmmTMB with a beta_family.
To verify my repeatability estimates, I tested the performance::icc() function. However, for certain datasets I get ICCs above 1 - which makes me a bit suspicious about using ICCs for beta models in general.

I also get two warnings: 1: mu of 1.1 is too close to zero, estimate of random effect variances may be unreliable. and 2: Model's distribution-specific variance is negative. Results are not reliable. The mu warning seems to show up for all the models I run using a beta family.
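For context, a guess at where the negative variance comes from (an assumption on my part, not a reading of insight's source): a beta distribution with mean μ and precision φ has variance μ(1−μ)/(1+φ), which is only positive for μ inside (0, 1). A μ estimate of 1.1, as in the first warning, would make that formula negative, matching the second warning. A quick numeric sketch:

```python
def beta_variance(mu: float, phi: float) -> float:
    # Variance of a beta distribution in the mean/precision parameterisation
    # (mean mu, precision phi); only meaningful for 0 < mu < 1.
    return mu * (1 - mu) / (1 + phi)

# mu inside (0, 1): variance is positive, as it should be
print(beta_variance(0.5, 9.2))

# mu = 1.1, as reported in the warning: the formula goes negative
print(beta_variance(1.1, 9.2))
```

The phi value of 9.2 here is just a placeholder in the same ballpark as the Sigma reported further down; the sign of the result does not depend on it.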

I managed to create a MRE:

set.seed(123)
a <- seq(from = 5, to = 95)
b1 <- jitter(a, factor = 20)
b2 <- jitter(a, factor = 20) + 30
b3 <- jitter(a, factor = 20) + 30
t_a <- rep('a', length(a))
t_b <- rep('b', length(a))

c <- factor(a)  # base factor(); avoids the forcats dependency of as_factor()
d <- tibble::tibble(id = c(c, c, c, c),
                    value = c(a, b1, b2, b3),
                    treatment = c(t_a, t_a, t_b, t_b))
# Rescale into (0, 1) for the beta family
d$value <- d$value / (max(d$value) + 0.1)
glmm_rpd <- glmmTMB::glmmTMB(value ~ treatment + (1 | id),
                             data = d,
                             family = glmmTMB::beta_family())
performance::icc(glmm_rpd)
# # Intraclass Correlation Coefficient
#
#    Adjusted ICC: 1.001
#  Unadjusted ICC: 0.746
# Warning messages:
# 1: mu of 1.1 is too close to zero, estimate of random effect variances may be unreliable. 
# 2: Model's distribution-specific variance is negative. Results are not reliable. 

For my own analysis, I often get Adjusted ICC values in the 0.90s, though I'm a bit unsure how to interpret them properly. I've read the relevant papers, but either I haven't found a concise formulation, or the values I'm getting are plain wrong and that's why things don't add up in my head. (I think the Adjusted ICC is the share of the variance remaining after accounting for the fixed effects that is explained by the random effects - is that way off?)
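For intuition (made-up variance components, not the package's actual code): the adjusted ICC divides the random-intercept variance by the sum of random-intercept and residual (distribution-specific) variance, while the unadjusted ICC also counts the fixed-effects variance in the denominator. That arithmetic shows how a negative distribution-specific variance, as in the warning above, can push the adjusted ICC past 1:

```python
def adjusted_icc(var_random: float, var_resid: float) -> float:
    # Adjusted ICC: between-group variance over between-group + residual variance
    return var_random / (var_random + var_resid)

def unadjusted_icc(var_fixed: float, var_random: float, var_resid: float) -> float:
    # Unadjusted ICC: fixed-effects variance joins the denominator
    return var_random / (var_fixed + var_random + var_resid)

# With a positive residual variance the ratio stays in [0, 1]:
print(adjusted_icc(0.50, 0.20))

# With a slightly negative residual variance estimate (as the warning
# reports), the denominator shrinks below the numerator and the
# "ICC" exceeds 1 - much like the 1.001 in the output above:
print(adjusted_icc(0.50, -0.0005))
```

The numbers are hypothetical; only the structure of the ratio is the point.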

(PS I'm posting the issue here as I think the issue might stem from the insight::compute_variances() function)

@strengejacke strengejacke added the bug 🐛 Something isn't working and 3 investigators ❔❓ Need to look further into this issue labels on Oct 10, 2022

roaldarbol commented Oct 13, 2022

Just to add some information, in case it's helpful: if I try to do the same thing with a Gaussian model, I do not seem to get the error. And the same issue appears to affect the R2 (or at least both are affected).

glmm_gaussian <- glmmTMB::glmmTMB(total_scaled ~ time_of_day + sex + (1 | animal_id),
                                  data = paired_long_ld,
                                  family = gaussian)
glmm_beta <- glmmTMB::glmmTMB(total_scaled ~ time_of_day + sex + (1 | animal_id),
                              data = paired_long_ld,
                              family = glmmTMB::beta_family())
performance::compare_performance(glmm_gaussian, glmm_beta)
# Some of the nested models seem to be identical and probably only vary in their random effects.
# # Comparison of Model Performance Indices
#
# Name           |   Model | AIC (weights) | AICc (weights) | BIC (weights) | R2 (cond.) | R2 (marg.) |   ICC |  RMSE | Sigma
# --------------------------------------------------------------------------------------------------------------------------
# glmm_gaussian | glmmTMB | -51.4 (<.001) |  -50.8 (<.001) | -37.8 (<.001) |      0.658 |      0.134 | 0.605 | 0.128 | 0.144
# glmm_beta     | glmmTMB | -68.9 (>.999) |  -68.3 (>.999) | -55.3 (>.999) |      1.037 |      0.200 | 1.047 | 0.129 | 9.221

As this is not-yet-published data, I can't share it publicly, but if one of the devs shoots me a message/email, I'll be happy to share it along with some reproducible code.


roaldarbol commented Oct 13, 2022

Alright, I'll have to share some data (here). Now I'm getting negative ICCs and R2 values above 1. It's also worth noting that whenever I get these spurious values, the Homogeneity of Variance subplot in check_model() comes up empty, in case that helps triage the issue.

library(glmmTMB)
library(performance)
library(readr)    # read_csv()
library(ggplot2)

# See distributions in image below
df_weird <- read_csv('minimal_data.csv')
ggplot(df_weird, aes(value)) +
  geom_histogram() +
  facet_grid(
    rows = vars(condition_b),
    cols = vars(condition_a)
  )

# Fit model
glmm_weird <- glmmTMB::glmmTMB(value ~ condition_a + condition_b + (1 | id),
                               data = df_weird,
                               family = beta_family())

# Check model
performance::check_model(glmm_weird)

# See model performance
performance::model_performance(glmm_weird)
# # Indices of model performance
#
# AIC       |      AICc |       BIC | R2 (cond.) | R2 (marg.) |    ICC |  RMSE | Sigma
# ------------------------------------------------------------------------------------
# -1140.006 | -1139.720 | -1106.099 |      1.230 |      1.095 | -1.419 | 0.139 | 6.207

The data doesn't look too crazy:
[Screenshot: histograms of value, faceted by condition_a and condition_b]

And the model check doesn't look completely awful - though notice the missing subplot:
[Screenshot: check_model() diagnostic panels, with the Homogeneity of Variance panel empty]

@strengejacke strengejacke added the get_variance function specific label on Nov 18, 2022
roaldarbol commented

Hey there! Just wanted to check whether this is on the radar? :-) I know there's probably plenty of stuff to work on, I just think it would be worth some attention as this currently seems to be giving actual wrong output.

strengejacke commented

Yes, it is :-) But it indeed needs some more thorough digging into the details.

roaldarbol commented

Perfect, just wanted to make sure. :-) Do you have a rough sense of when it will be dug into? A month, a quarter, half a year?
