
Add a generic XML Handler #287

Open
cmahnke opened this issue Mar 15, 2024 · 5 comments
Labels
status: blocked type: feature New feature or request

Comments

@cmahnke

cmahnke commented Mar 15, 2024

Currently the user is required to implement other XML metadata formats on their own, see docs.

There should be a generic class mapping that works this way (pseudo code):

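from xml.etree import ElementTree

records = oai.list_records(metadata_prefix='lido', set_='fowi')
for record in records:
    # Proposed behaviour: record.metadata is an XML element, not a dict
    xml_string = ElementTree.tostring(record.metadata, encoding='unicode')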

Here record.metadata would be an xml.etree.ElementTree element. This matters because it allows document-relative calls to .xpath(), so one doesn't have to take the enclosing OAI response into account.

Alternatively, a helper function on the Record class, similar to get_metadata, would be nice. From my point of view, it's more helpful to interact directly with the XML than to write a handler that maps complex metadata into a Python dictionary.
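
To make the request concrete, here is a minimal sketch of document-relative access using only the standard library. It does not use or reflect this library's API: the helper name iter_metadata_roots and the shortened sample response are made up for illustration. Note that the standard library's elements offer find()/findall()/findtext() with a limited XPath subset, whereas .xpath() is provided by lxml.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
LIDO_NS = "http://www.lido-schema.org"

# Made-up, heavily shortened response, for illustration only.
SAMPLE_RESPONSE = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <lido:lido xmlns:lido="http://www.lido-schema.org">
          <lido:lidoRecID lido:type="local">record_1</lido:lidoRecID>
        </lido:lido>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def iter_metadata_roots(response_xml):
    """Yield the payload element inside each <metadata> wrapper (hypothetical helper)."""
    root = ET.fromstring(response_xml)
    for metadata in root.iter(f"{{{OAI_NS}}}metadata"):
        # The payload document is the first child of <metadata>.
        payload = next(iter(metadata), None)
        if payload is not None:
            yield payload


for lido_root in iter_metadata_roots(SAMPLE_RESPONSE):
    # The query is relative to the LIDO root, not to the OAI envelope.
    rec_id = lido_root.findtext("lido:lidoRecID", namespaces={"lido": LIDO_NS})
    print(rec_id)  # prints: record_1
```

With the payload element in hand, the caller never has to spell out the path through OAI-PMH, ListRecords, record and metadata.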

@cmahnke cmahnke added the type: feature New feature or request label Mar 15, 2024
@afuetterer
Owner

afuetterer commented Apr 4, 2024

Thank you for opening this.

I want to finish #126 first, which changes the way the XML records are parsed anyway. It should make things easier. But an option to return XML directly sounds fair.

You need to work with LIDO records, yes? Can you provide an example or the XSD?

Are you referring to this schema?
https://cidoc.mini.icom.museum/working-groups/lido/lido-overview/lido-schema/

@cmahnke
Author

cmahnke commented Apr 4, 2024

Yes, I work with LIDO data. It's the schema you referred to (but not the latest version, see http://www.lido-schema.org/schema/v1.0/lido-v1.0.xsd).

From my point of view there is neither a direct need (though it would certainly be nice) nor a direct dependency. Since I suspect you need to extract the Record document before feeding it into this framework, why not just expose this extracting method to the user?

I just had a look at #126. It seems that the XML-to-data mapping applies to the whole OAI response, so my previous comment doesn't make that much sense.
BTW: Please don't include generated code in the repository, regenerate it on build...

@afuetterer
Owner

Can you give an example of a lido record (URL) or are these only available through authentication at https://sammlungen.uni-goettingen.de/oai?

About your second comment:

Yes, the mapping will parse the entire OAI response to be able to conveniently access error codes, resumption tokens etc.
The auto-generated classes will be part of the library, at least for oai-pmh and oai-dc. These will be needed for things like type checking, testing etc.
So why shouldn't I include these in the repository?
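
As a rough illustration of why the whole response matters, the sketch below pulls error codes and the resumption token out of an OAI-PMH response with the standard library. It is not the mapping from #126 and not this library's implementation: the function name summarize_response and both sample responses are invented for the example.

```python
import xml.etree.ElementTree as ET

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Made-up minimal responses, for illustration only.
ERROR_RESPONSE = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
    '<error code="badVerb">Illegal verb</error>'
    "</OAI-PMH>"
)
PAGED_RESPONSE = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
    "<ListRecords><resumptionToken>token-123</resumptionToken></ListRecords>"
    "</OAI-PMH>"
)


def summarize_response(response_xml):
    """Collect envelope-level information (hypothetical helper)."""
    root = ET.fromstring(response_xml)
    # <error> elements are direct children of the <OAI-PMH> root.
    errors = [
        (error.get("code"), (error.text or "").strip())
        for error in root.findall("oai:error", NS)
    ]
    # An absent or empty <resumptionToken> means there is no further page.
    token = root.findtext(".//oai:resumptionToken", default=None, namespaces=NS)
    return {"errors": errors, "resumption_token": token or None}


print(summarize_response(ERROR_RESPONSE))
print(summarize_response(PAGED_RESPONSE))
```

Both pieces of information live outside the individual records, which is why a record-only view cannot provide them.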

@cmahnke
Author

cmahnke commented Apr 5, 2024

The LIDO records themselves are also available for download: just press the LIDO button at the top of each single record page, for example http://sammlungen.uni-goettingen.de/lidoresolver?id=record_kuniweb_592744

I would consider it bad practice to include code that's reproducible by build tools, since:

  • It doesn't add anything genuine to the repository
  • It might be misleading to anyone who tries to understand what the code does
  • It's a violation of the DRY principle

Just make sure your builds are reproducible, that is, make sure to use a defined version of the generator (not just the latest one).
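
One possible way to enforce such a pin in a build step is sketched below. Everything specific in it is an assumption: the package name "example-codegen", the version "1.2.3" and the helper require_pinned_version are placeholders, not names taken from this project. Only importlib.metadata.version is a real standard-library call.

```python
from importlib.metadata import version


def require_pinned_version(package, pinned, get_version=version):
    """Fail the build if the installed generator differs from the pinned version."""
    installed = get_version(package)
    if installed != pinned:
        raise RuntimeError(
            f"{package} {installed} is installed, but the build expects {pinned}"
        )
    return installed


# Demonstration with a stubbed lookup, so the sketch runs without the
# placeholder package being installed. A real build step would omit
# get_version and let importlib.metadata query the environment.
def fake_lookup(package):
    return "1.2.3"


print(require_pinned_version("example-codegen", "1.2.3", get_version=fake_lookup))
```

Running the check before regeneration means a silently upgraded generator stops the build instead of producing different code.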

@afuetterer
Owner

Alright, let's agree to disagree. I think your arguments are not valid.

If you want to discuss #126 further, please comment over there and try to be constructive.
