Questions about running GROBID on a HPC cluster #1080

Open
kyuhunl opened this issue Feb 1, 2024 · 1 comment

Comments

kyuhunl commented Feb 1, 2024

Hi, I am trying to run GROBID on my organization's HPC cluster. I have tried it on my own laptop (an ARM MacBook), but I keep having trouble running the Docker image. Our cluster does not support Docker images but does support Singularity. Will pulling GROBID as a Singularity image on the cluster work? Also, our HPC administrator recommends running GROBID in batch mode, contrary to your recommendation of service mode. What kind of issues should I expect when using batch mode instead of service mode?

@lfoppiano
Collaborator

@kyuhunl, it should work with Singularity, but I have never tried it, and I do not have access to an HPC service.

If you run it without Docker/Singularity, you might have issues configuring the deep learning models.
Generally, batch mode is effective when you pass it the directory containing all the PDF documents in a single run. If you run the batch command multiple times, it will spend a lot of time loading and unloading models on every invocation; in that case, the service running on a normal server (or servers) might be more effective.
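
To illustrate the point about passing the whole directory in one go, here is a minimal Python sketch that builds a single batch invocation for an entire input directory, so the models are loaded once rather than once per file. The flags (`-gH`, `-dIn`, `-dOut`, `-exe processFullText`) are the ones I recall from the GROBID batch documentation; please check them against the version you install. The jar path, directory names, and heap size below are placeholders, not values from this thread.

```python
import shlex


def build_batch_command(jar, grobid_home, input_dir, output_dir, heap="4G"):
    """Build ONE GROBID batch command that covers a whole directory of PDFs.

    All paths are hypothetical placeholders; the flag names are assumed from
    the GROBID batch documentation and should be verified for your version.
    """
    return [
        "java", f"-Xmx{heap}",       # JVM heap size (assumed default, tune it)
        "-jar", str(jar),            # path to the grobid-core "onejar" build
        "-gH", str(grobid_home),     # path to the grobid-home directory
        "-dIn", str(input_dir),      # directory holding ALL the input PDFs
        "-dOut", str(output_dir),    # directory for the resulting TEI files
        "-exe", "processFullText",   # batch method to run
    ]


if __name__ == "__main__":
    cmd = build_batch_command(
        jar="grobid-core-onejar.jar",   # placeholder file name
        grobid_home="grobid-home",
        input_dir="pdfs",
        output_dir="tei",
    )
    # Print the command for a job script instead of running it here.
    # To execute it directly: subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```

The idea is to submit this as one job per large directory, rather than looping over files and starting a new JVM (and reloading the models) for each PDF.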
