Attribute error while harvesting #35

Open
epatters opened this issue Jul 9, 2020 · 1 comment


epatters commented Jul 9, 2020

When running the command

oai-harvest -p arXiv http://export.arxiv.org/oai2

I get the error

INFO     Harvesting from http://export.arxiv.org/oai2
ERROR    'Record' object has no attribute 'identifier'
Traceback (most recent call last):
  File "/home/epatters/miniconda3/lib/python3.7/site-packages/oaiharvest/harvest.py", line 181, in main
    **kwargs
  File "/home/epatters/miniconda3/lib/python3.7/site-packages/oaiharvest/harvesters/directory_harvester.py", line 65, in harvest
    record.identifier, metadataPrefix
AttributeError: 'Record' object has no attribute 'identifier'

after downloading about 100K records. I've tried several times, with the same result each time. I am running Python 3.7 on a Linux system.


epatters commented Jul 9, 2020

Update: The error comes from a logging statement about record delete requests. Commenting out the logging statement seems to work around the error.
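For anyone who would rather keep the log message than comment it out, a guarded version might look like the sketch below. The `Record` and `Header` classes and the `describe_deleted` and `log_deleted` helpers are hypothetical stand-ins, not the actual oai-harvest or pyoai code. The only point is that reading the attribute with `getattr` and a fallback keeps a log message from aborting the harvest.

```python
import logging

logger = logging.getLogger("harvest-sketch")


class Record:
    """Hypothetical stand-in for a record object that lacks `identifier`."""


class Header:
    """Hypothetical stand-in for an object that does carry an identifier."""

    def __init__(self, identifier):
        self.identifier = identifier


def describe_deleted(obj, metadata_prefix):
    """Build the log message without assuming `obj.identifier` exists."""
    identifier = getattr(obj, "identifier", "<unknown identifier>")
    return "Delete request for {0} ({1})".format(identifier, metadata_prefix)


def log_deleted(obj, metadata_prefix):
    # Logging should never abort the harvest, so the message is built defensively.
    message = describe_deleted(obj, metadata_prefix)
    logger.info(message)
    return message
```

With this shape, an object without the attribute is logged with a placeholder instead of raising `AttributeError`.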

One final update, in case it helps anybody: to avoid timeouts while harvesting, I also had to increase the WAIT_MAX constant in pyoai. With these changes, I successfully harvested the complete arXiv metadata.
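To illustrate why raising such a limit can help, here is a minimal sketch of a capped wait-and-retry loop. It assumes that WAIT_MAX acts as a cap on how many times the client waits and retries before giving up; that is my reading, not something checked against the pyoai source here. `RetryLater` and `fetch_with_retries` are hypothetical names and do not come from pyoai.

```python
import time


class RetryLater(Exception):
    """Hypothetical signal that the server asked the client to come back later."""

    def __init__(self, wait_seconds):
        super().__init__("retry after {0}s".format(wait_seconds))
        self.wait_seconds = wait_seconds


def fetch_with_retries(fetch, wait_max, sleep=time.sleep):
    """Call `fetch` until it succeeds, waiting between at most `wait_max` retries."""
    retries = 0
    while True:
        try:
            return fetch()
        except RetryLater as exc:
            if retries >= wait_max:
                raise  # cap reached: surface the failure to the caller
            retries += 1
            sleep(exc.wait_seconds)
```

Under that assumption, a long harvest that is told to wait many times in a row fails once the cap is hit, so a larger cap lets it ride out more of those pauses.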
