All parsers including JATS should populate the fulltext item depending on whether the publisher supplied file has a "<body>" tag or equivalent. #79

Open
seasidesparrow opened this issue Nov 30, 2023 · 0 comments
Labels: bug (Something isn't working)

Describe the bug
Publisher-supplied metadata may or may not include the body of the fulltext. When we do receive the fulltext from the publisher, we need to populate the fulltext element of the Document. fulltext is a dict with the keys language and body. For JATS content, for example, the fulltext is enclosed within the <body></body> tag. We need to extract this and write it into fulltext.body.
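
A minimal sketch of the extraction for the JATS case, to make the expected shape of the output concrete. It uses only the Python standard library. The function name `extract_fulltext`, the `default_language` fallback, and the choice to return `None` when no `<body>` is present are assumptions for illustration, not the parsers' existing API.

```python
import xml.etree.ElementTree as ET

# Key ElementTree uses for the predefined xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def extract_fulltext(jats_xml, default_language="en"):
    """Return {"language": ..., "body": ...}, or None if there is no <body>.

    Hypothetical helper: name, signature and the "en" fallback are assumptions.
    """
    root = ET.fromstring(jats_xml)

    # Look for a <body> element anywhere below the root element.
    body = root.find(".//body")
    if body is None:
        return None

    # Join all text nodes and collapse runs of whitespace.
    # Caveat: joining with a space also separates inline markup
    # (e.g. H<sub>2</sub>O becomes "H 2 O").
    text = " ".join(" ".join(body.itertext()).split())
    if not text:
        return None

    # Use the root element's xml:lang if present, else the assumed default.
    language = root.get(XML_LANG, default_language)
    return {"language": language, "body": text}


sample = (
    '<article xml:lang="en"><front><article-meta/></front>'
    "<body><sec><title>Intro</title><p>Hello   world.</p></sec></body>"
    "</article>"
)
fulltext = extract_fulltext(sample)
```

Returning `None` when the body is absent or empty lets the caller leave fulltext unset for publishers that supply metadata only. Real publisher files may carry a DOCTYPE with named entities that `ET.fromstring` cannot resolve, so a production parser would likely need a more tolerant XML library.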


@seasidesparrow seasidesparrow added the bug Something isn't working label Nov 30, 2023
@seasidesparrow seasidesparrow self-assigned this Dec 19, 2023