Affiliations not captured from crossref data #96

Open
seasidesparrow opened this issue Mar 12, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@seasidesparrow
Member

Describe the bug
For at least some Crossref records, each author's affiliation information is returned in a nested structure tagged as <affiliations><institution><institution_name> (see e.g. 10.1364/AO.505607). The Crossref parser currently looks for the tag <affiliation> (singular, not <affiliations>) and extracts its contents with .get_text(), so it misses the nested structure entirely.
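
A minimal sketch of one way to read both layouts, for illustration only. It uses the standard library's `xml.etree.ElementTree` rather than the parser's own code, the sample XML is a made-up, trimmed-down record (not the actual response for the DOI above), and `local_name` and `extract_affiliations` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one author with the nested layout, one with the flat tag.
SAMPLE = """<contributors>
  <person_name sequence="first" contributor_role="author">
    <given_name>Ada</given_name>
    <surname>Example</surname>
    <affiliations>
      <institution>
        <institution_name>Example University</institution_name>
      </institution>
      <institution>
        <institution_name>Example Institute of Optics</institution_name>
      </institution>
    </affiliations>
  </person_name>
  <person_name sequence="additional" contributor_role="author">
    <given_name>Bob</given_name>
    <surname>Sample</surname>
    <affiliation>Sample Observatory</affiliation>
  </person_name>
</contributors>"""


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def clean_text(element):
    """Join all text inside an element and collapse runs of whitespace."""
    return " ".join("".join(element.itertext()).split())


def extract_affiliations(person):
    """Return affiliation strings from a <person_name> element.

    Handles the flat <affiliation> tag and the nested
    <affiliations><institution><institution_name> structure.
    """
    found = []
    for child in person:
        name = local_name(child.tag)
        if name == "affiliation":
            text = clean_text(child)
            if text:
                found.append(text)
        elif name == "affiliations":
            for node in child.iter():
                if local_name(node.tag) == "institution_name":
                    text = clean_text(node)
                    if text:
                        found.append(text)
    return found


if __name__ == "__main__":
    for person in ET.fromstring(SAMPLE):
        print(extract_affiliations(person))
```

Matching on the local tag name keeps the sketch independent of whichever XML namespace the record declares.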

To Reproduce
Steps to reproduce the behavior: harvest the Crossref XML (e.g. for the DOI above) from the Crossref API and parse it with adsingestp.parsers.crossref. Authors 1 and 5 will have ORCIDs, but no affiliation information will be present in the parsed output.


seasidesparrow added the bug label Mar 12, 2024
seasidesparrow self-assigned this Mar 12, 2024
@seasidesparrow
Member Author

The Crossref parser is actually parsing Crossref XML that has passed through Habanero's content negotiation method, so it needs to be able to read data in the UNIXREF-XML query return format, documented here: https://www.crossref.org/schema/unixref1.1.xsd.
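
For anyone who wants to inspect a UNIXREF record by hand, here is a rough sketch of requesting it through DOI content negotiation with only the standard library. This is not how adsingestp or Habanero does it internally. `build_unixref_request` and `fetch_unixref` are hypothetical helper names, and the sketch assumes the `application/vnd.crossref.unixref+xml` media type is accepted by the doi.org resolver for this DOI.

```python
import urllib.parse
import urllib.request

# Media type used to ask for Crossref UNIXREF XML via content negotiation.
UNIXREF_MEDIA_TYPE = "application/vnd.crossref.unixref+xml"


def build_unixref_request(doi):
    """Build (but do not send) a content-negotiation request for a DOI."""
    url = "https://doi.org/" + urllib.parse.quote(doi, safe="/")
    return urllib.request.Request(url, headers={"Accept": UNIXREF_MEDIA_TYPE})


def fetch_unixref(doi, timeout=30):
    """Send the request and return the response body as text (needs network)."""
    with urllib.request.urlopen(build_unixref_request(doi), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # The DOI from the bug report above.
    print(fetch_unixref("10.1364/AO.505607")[:500])
```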
