Detect the package that is needed to create or predict a model object #849

Open
tripartio opened this issue Feb 7, 2024 · 8 comments
Labels
wontfix 🚫 This will not be worked on

Comments

@tripartio

Hello. First, thanks for your fantastic package, which I only recently discovered. It has greatly simplified some tricky parts of my ale package.

One thing I would like to be able to do is to detect the package that is needed to create or predict a model object. For example, if I give the {insight} package a gam model object, I would like it to tell me that this object was created using the {mgcv} package. Is this possible with {insight}?

@bwiernik
Contributor

bwiernik commented Feb 7, 2024

If the object is an S3 object, this should work --

getS3method("predict", class(object)[1]) |> rlang::ns_env_name()

If the object is S4 (eg, from lme4) --

attr(findClass('lmerMod')[[1]], "name")

@tripartio
Author

@bwiernik Thanks for your suggestions. I will try to test them soon when I have a free moment.

Could such functionality be integrated as a function in the package?

@bwiernik
Contributor

bwiernik commented Feb 8, 2024

Can you say more about the use case? If you have the package installed to fit the model, then that package should certainly be available to supply its methods.

@tripartio
Author

@bwiernik, the context is parallel processing with my ale package. Most of the package functions receive a model as input and then analyze the model for various interpretable machine learning (IML) tasks.

So, for example, when my ale() function runs sequentially (without parallel processing), there is no problem. As you indicated, as long as the package with which the model was created is installed on the system, the ale() function runs fine. But since I added parallel processing, the process has started choking, even though I use furrr, an advanced parallel processing package that takes care of almost everything automagically. The problem is that furrr has to send the necessary package environments to each parallel worker so that each worker can independently run the code. It cannot automatically detect that my code needs further packages, so it chokes. So, my code needs to tell furrr which extra packages are needed for the parallel workers to do their tasks.

The current version of my code requires users to specify a model_packages argument just for the sake of parallel processing. I would like to modify my code so that it automatically detects the model's package so that users would not need to supply this argument. This is what I would like insight to do for me.

So, I think that my use case could be generalized to parallel processing when the source package of certain complex objects (in my case, models) needs to be detected.
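The worker-package problem described above can be sketched as follows. This is only an illustration, not the ale package's actual code: `predict_in_parallel` is a hypothetical wrapper, and `model_packages` mirrors the argument named in this comment. It assumes {furrr} is installed and relies on the `packages` argument of `furrr::furrr_options()`, which lists packages to attach on each worker.

```r
# Hypothetical wrapper (illustration only): run predict() on chunks of data
# in parallel, telling furrr which extra packages the workers must attach.
predict_in_parallel <- function(model, data_chunks, model_packages = NULL) {
  furrr::future_map(
    data_chunks,
    function(chunk) stats::predict(model, newdata = chunk),
    # `packages` is forwarded to the workers; NULL means no extra packages.
    .options = furrr::furrr_options(packages = model_packages)
  )
}

# Possible usage, assuming a multisession plan and a fitted mgcv::gam model:
# future::plan(future::multisession, workers = 2)
# predict_in_parallel(gam_fit, split(new_data, new_data$group), "mgcv")
```

With automatic detection, the value passed as `model_packages` would come from inspecting the model object rather than from the user.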

@tripartio
Author

If the object is an S3 object, this should work --

getS3method("predict", class(object)[1]) |> rlang::ns_env_name()

@bwiernik Thanks; I have now implemented this check for S3 objects in my package and it now automatically detects and loads the appropriate packages.

I don't understand the S4 check code you gave (probably because I rarely work with S4 objects), so my package is now configured to automatically check for the S3 package and then give a graceful error message if it cannot be detected. Then users can explicitly specify the packages with my existing manual mechanism. That is acceptable, since it should work automatically for most users and only require manual intervention for a few complicated cases.

Would it be feasible to incorporate such a check into the {insight} package, extended with checks for S4 objects as well?

Regardless, I appreciate your help. Your little S3 code has let me simplify my function usage for most users.

@bwiernik
Contributor

bwiernik commented Feb 9, 2024

R has two widely used class systems—S3 and S4 (and several less-used ones). S3 is most widely used, but some major modeling packages do use S4 (lme4 and OpenMx are the first ones that come to mind).

The function isS4() can detect if an object is S4 or not. The code above will return the namespace associated with the class of the S4 model.

@tripartio
Author

If the object is S4 (eg, from lme4) --

attr(findClass('lmerMod')[[1]], "name")

But you specified the name of the S4 class in the code. I am not sufficiently familiar with S4 to convert this to code where I have an object of undetermined type and then probe its namespace (as with your S3 snippet above).

@bwiernik
Contributor

Oh sorry

attr(findClass(class(object))[[1]], "name")
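For readers who want a single check covering both class systems, here is a minimal sketch. It is not part of {insight}, and `model_package` is a hypothetical name. It makes two substitutions relative to the snippets in this thread: base R's `environmentName(topenv(environment(fn)))` stands in for `rlang::ns_env_name()`, and for S4 objects it reads the `"package"` attribute that R attaches to the class vector instead of calling `findClass()`. It also walks the whole class vector, on the assumption that the first class may not have its own `predict` method.

```r
# Hypothetical helper (not part of {insight}): guess which package supplies
# the class definition (S4) or the predict() method (S3) for a model object.
model_package <- function(object) {
  if (isS4(object)) {
    # The class vector of an S4 object carries a "package" attribute
    # naming where the class was defined.
    pkg <- attr(class(object), "package")
    return(if (is.null(pkg)) NA_character_ else pkg)
  }
  # S3: try each class in turn until one has a registered predict method.
  for (cls in class(object)) {
    method <- utils::getS3method("predict", cls, optional = TRUE)
    if (!is.null(method)) {
      # The method's top-level environment is the namespace that defines it.
      return(environmentName(topenv(environment(method))))
    }
  }
  # No method found: the caller can fall back to a manual specification.
  NA_character_
}

fit <- lm(mpg ~ wt, data = mtcars)
model_package(fit)
```

Returning `NA_character_` rather than raising an error leaves room for the fallback described earlier in the thread, where users name the packages explicitly.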

@strengejacke strengejacke added the wontfix 🚫 This will not be worked on label Mar 20, 2024
3 participants