First author sort and unicode leads to unexpected results #1882

Open
aaccomazzi opened this issue Jul 2, 2019 · 0 comments

Consider the query author:"simek, m" year:1985-1987, which yields 16 records, many of them authored by Šimek, M. Sorting by first author produces an unexpected list:
https://ui.adsabs.harvard.edu/search/p_=0&q=author%3A%22simek%2C%20m%22%20year%3A1985-1987&sort=first_author%20desc%2C%20bibcode%20desc

As you can see, the first papers in the (descending) list are authored by Šimek, M., followed by Znojil, V., followed by Simek, M. Since these are Unicode strings, the sort compares them by code point, and Š (U+0160) falls after every ASCII letter, so Šimek, M. ends up on the far side of Znojil, V. instead of next to Simek, M. From a user's perspective this feels unnatural: one would expect Šimek, M. and Simek, M. to be grouped together.
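The ordering can be reproduced outside the search engine. The snippet below is only an illustration: it assumes the index compares strings roughly the way Python does (by code point), which is not confirmed here, and the three names are a hand-picked subset of the result list.

```python
# Illustration only: Python compares str values by code point, which is
# assumed (not confirmed) to mirror how the first_author sort behaves.
authors = ["Simek, M.", "\u0160imek, M.", "Znojil, V."]  # \u0160 is "Š"

# Descending sort, as in the URL above (sort=first_author desc).
descending = sorted(authors, reverse=True)
print(descending)

# Code points of the three leading letters, highest first.
leading = [hex(ord(name[0])) for name in descending]
print(leading)
```

Š has a higher code point than Z, so in a descending code-point sort it comes first and Znojil, V. is wedged between the two spellings of Simek.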

We could accomplish this by switching the sort field from first_author to first_author_norm, which is an ASCII transliteration of the first_author field. Should we? And if not, why not?
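As a rough sketch of what sorting on a transliterated key would do, the following uses a hypothetical helper, `ascii_fold`, that strips combining marks after Unicode decomposition. This is an assumption about what first_author_norm contains, not the actual normalization used to build that field, and it does not handle letters without a decomposition (for example ø or ł).

```python
import unicodedata


def ascii_fold(name: str) -> str:
    """Hypothetical stand-in for first_author_norm: drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


authors = ["Znojil, V.", "\u0160imek, M.", "Simek, M."]  # \u0160 is "Š"

# Sort on the folded key, but keep displaying the original spelling.
by_norm = sorted(authors, key=ascii_fold, reverse=True)
print(by_norm)
```

With the folded key, both spellings compare equal and stay adjacent. Their relative order then depends on the secondary sort (bibcode desc in the URL above).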
