Differences between get_predicted and get_predicted_ci for mixed models #797

fhui28 · 2023-08-16T00:19:24Z

This is something that came up during a visit to @bbolker with @emitanaka. Basically, for (G)LMMs, the standard errors returned by get_predicted and get_predicted_ci differ.

An example for LMMs first:

library(tidyverse)
library(lme4)
library(glmmTMB)
library(insight)

m1 <- lmer(Petal.Width ~ Sepal.Length + (1 | Species), iris)
get_predicted(m1, ci = 0.95) %>% as.data.frame() %>% head
predictions <- predict(m1)
get_predicted_ci(m1, predictions) %>% as.data.frame() %>% head 
#' Matches. Currently, both of these standard errors are actually wrong because they 
#' do not account for the uncertainty in the predicted random effects. 
#' This is an lme4 thing though, and Ben Bolker among others is aware of this. 


m2 <- glmmTMB(Petal.Width ~ Sepal.Length + (1 | Species), iris, REML = TRUE)
get_predicted(m2, ci = 0.95) %>% as.data.frame() %>% head 
predictions <- predict(m2)
get_predicted_ci(m2, predictions) %>% as.data.frame() %>% head 
#' Does not match. The former gets the TMB prediction error,  which is better. 
#' But the latter reverts back to the lme4 incorrect prediction error.

For the issue above, and with respect to the insights package, my bigger concern is the inconsistency in modem m2. As mentioned, the issue why the prediction errors differ between lmer and glmmTMB is something @bbolker is already aware of; see this github issue from glmmTMB

Things gets a bit weirder for GLMMs though, and this is where I will throw my hands up in the air (even higher!)

m1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd), data = cbpp, family = binomial)
get_predicted(m1, ci = 0.95) %>% as.data.frame() %>% head
predictions <- predict(m1, type = "response")
get_predicted_ci(m1, predictions) %>% as.data.frame() %>% head 
#' These two do not match (the latter is conisderably larger); not sure why.


m2 <- glmmTMB(cbind(incidence, size - incidence) ~ period + (1 | herd), data = cbpp, family = binomial)
get_predicted(m2, ci = 0.95) %>% as.data.frame() %>% head
predictions <- predict(m2, type = "response")
get_predicted_ci(m2, predictions) %>% as.data.frame() %>% head 
#' These two also do not match (the latter is conisderably larger); not sure why. 
#' Also, note the latter gets very similar results to get_predicted_ci(m1, predictions), 
#' while the former is not the same as get_predicted(m1, ci = 0.95)

The text was updated successfully, but these errors were encountered:

strengejacke · 2023-09-26T20:31:38Z

get_predicted_ci() is mostly (internally) used for those model classes that do not return any standard errors for predictions (i.e. when there's nothing like se.fit = TRUE). This also applies to merMod-models. For these models, we use something like:

M <- model.matrix(m1)
V <- vcov(m1)
se <- sqrt(diag(M %*% V %*% t(M)))

based on https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#lme4. Thus, when calling get_predicted_ci(), standard errors are always based on the above formula. get_predicted() returns standard errors from predict() where possible. That's why get_predicted.glmmTMB() relies on standard errors returned by predict.glmmTMB(), but when calling get_predicted_ci() - even for a glmmTMB model - the above formula is used. That explains the difference between standard errors for glmmTMB models returned by get_predicted() or get_predicted_ci().

I think this should be clearly documented, and that get_predicted() is preferred (we may even hide get_predicted_ci() from the users?) in order to get SE and CI.

strengejacke · 2023-09-26T20:32:57Z

Tagging @bbolker just FYI.

Fixes #797

…ls (#814) * Differences between get_predicted and get_predicted_ci for mixed models Fixes #797 * news, version * docs * this is intentional, turn styler off * typos

strengejacke · 2023-09-27T11:28:32Z

Not sure if the documentation I added is sufficient, so I'll keep this issue open.

strengejacke added the 3 investigators ❔❓ Need to look further into this issue label Aug 21, 2023

strengejacke mentioned this issue Sep 26, 2023

How are predict() standard errors calculated? glmmTMB/glmmTMB#915

Open

strengejacke self-assigned this Sep 26, 2023

strengejacke added a commit that referenced this issue Sep 26, 2023

Differences between get_predicted and get_predicted_ci for mixed models

a9937a0

Fixes #797

strengejacke mentioned this issue Sep 26, 2023

Differences between get_predicted and get_predicted_ci for mixed models #814

Merged

strengejacke closed this as completed in #814 Sep 27, 2023

strengejacke reopened this Sep 27, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Differences between get_predicted and get_predicted_ci for mixed models #797

Differences between get_predicted and get_predicted_ci for mixed models #797

fhui28 commented Aug 16, 2023 •

edited by etiennebacher

strengejacke commented Sep 26, 2023 •

edited

strengejacke commented Sep 26, 2023

strengejacke commented Sep 27, 2023

Differences between get_predicted and get_predicted_ci for mixed models #797

Differences between get_predicted and get_predicted_ci for mixed models #797

Comments

fhui28 commented Aug 16, 2023 • edited by etiennebacher

strengejacke commented Sep 26, 2023 • edited

strengejacke commented Sep 26, 2023

strengejacke commented Sep 27, 2023

fhui28 commented Aug 16, 2023 •

edited by etiennebacher

strengejacke commented Sep 26, 2023 •

edited