Enhance author parsing with case mixing #109

Open
aaccomazzi opened this issue Oct 9, 2020 · 0 comments

We have cases where the input author list may be in upper or lower case, and we want to create a mixed-case version of it. Classic has a Perl module which is used to manipulate author names and offers this functionality. I link below to the relevant parts of the code for reference (private repo) and to some of the test cases that should serve as the basis for unit tests of our Python code.

Because the mixing of case depends on the specific name fragment (last vs. first vs. suffix), the Perl module creates an object for each name that needs to be parsed. If we follow this approach the code may get more complicated, but the advantage is that we may be able to deal better with things such as teams and collaborations as well.
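
To make the per-fragment idea concrete, here is a minimal Python sketch of what an object-per-element design could look like. It is not a port of the Perl module: the class name `NameElement`, the helper `mix_case`, the particle and suffix lists, and the "Last, First, Suffix" output order are all hypothetical and chosen only for illustration. The real rules should be taken from the linked code and tests.

```python
import re
from dataclasses import dataclass

# Hypothetical rule sets for illustration only; the actual rules live in
# the Perl module and may differ.
PARTICLES = {"van", "von", "de", "der", "den", "la", "le", "di", "du"}
ROMAN_SUFFIXES = {"II", "III", "IV"}


def _capitalize_parts(text: str) -> str:
    """Capitalize every run of letters: "O'BRIEN-SMITH" -> "O'Brien-Smith"."""
    return re.sub(r"[^\W\d_]+", lambda m: m.group(0).capitalize(), text)


@dataclass
class NameElement:
    kind: str  # "last", "first", or "suffix"
    text: str

    def mixed_case(self) -> str:
        if self.kind == "suffix":
            # Roman numerals stay upper case; other suffixes are capitalized.
            upper = self.text.upper().rstrip(".")
            if upper in ROMAN_SUFFIXES:
                return upper
            return _capitalize_parts(self.text)
        if self.kind == "last":
            # Assumption: known particles are lower-cased, other words capitalized.
            return " ".join(
                w.lower() if w.lower() in PARTICLES else _capitalize_parts(w)
                for w in self.text.split()
            )
        # First names and initials: capitalize each run of letters.
        return _capitalize_parts(self.text)


def mix_case(last: str, first: str = "", suffix: str = "") -> str:
    """Join the mixed-case fragments as "Last, First, Suffix" (assumed order)."""
    elements = [
        NameElement("last", last),
        NameElement("first", first),
        NameElement("suffix", suffix),
    ]
    return ", ".join(e.mixed_case() for e in elements if e.text)


print(mix_case("ACCOMAZZI", "A."))
print(mix_case("van der waals", "JOHANNES"))
print(mix_case("O'BRIEN-SMITH", "j. r.", "jr."))
```

This sketch does not handle prefixes such as "Mc" or "Mac" ("MCDONALD" becomes "Mcdonald"), nor teams and collaborations. Those could be added as further element kinds with their own rules, which is the main argument for the object-based approach.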

Code:
https://github.com/adsabs/adsperl/blob/master/ADS/Authors/ADS-Authors-Names/lib/ADS/Authors/Names.pm#L286
https://github.com/adsabs/adsperl/blob/master/ADS/Authors/ADS-Authors-Names/lib/ADS/Authors/Names/Element.pm#L341

Tests:
https://github.com/adsabs/adsperl/blob/master/ADS/Authors/ADS-Authors-Names/t/names.t
https://github.com/adsabs/adsperl/blob/master/ADS/Authors/ADS-Authors-Names/t/parse_names.t
https://github.com/adsabs/adsperl/blob/master/ADS/Authors/ADS-Authors-Names/t/element.t
