First author sort and unicode leads to unexpected results #1882

aaccomazzi · 2019-07-02T15:34:51Z

Consider the following query: author:"simek, m" year:1985-1987, which yields 16 records, many authored by Šimek, M. If we sort by first author, we get an unexpected list:
https://ui.adsabs.harvard.edu/search/p_=0&q=author%3A%22simek%2C%20m%22%20year%3A1985-1987&sort=first_author%20desc%2C%20bibcode%20desc

As you can see, the first papers in the list are authored by Šimek, M., followed by Znojil, V., followed by Simek, M. Since these are Unicode strings, the sort appears to follow raw code point order (Š sorts after Z) rather than a language-aware collation, and from a user perspective it feels unnatural (one would expect Šimek, M. and Simek, M. to be bunched together).
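
To make the ordering above concrete, here is a minimal Python sketch. It assumes the index compares strings by raw code point, which is how Python's default string comparison works; it does not reproduce the actual search backend, and the three names are just the ones mentioned in this issue.

```python
# Illustrative only: Python compares strings by code point,
# which is assumed here to mirror the behaviour seen in the search results.
names = ["Simek, M.", "Znojil, V.", "Šimek, M."]

# Descending sort, as in the URL above (sort=first_author desc).
desc = sorted(names, reverse=True)

# "Š" is U+0160, which is greater than "Z" (U+005A) and "S" (U+0053),
# so Šimek lands ahead of Znojil, and Simek comes last.
print(desc)  # ['Šimek, M.', 'Znojil, V.', 'Simek, M.']
```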

We could accomplish this by switching the sort from first_author to first_author_norm, which is an ASCII transliteration of the first_author field. Should we? And if not, why not?
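
As a rough illustration of what sorting on a transliterated key would do, the sketch below builds an ASCII-like sort key by stripping combining marks. The helper `ascii_key` is hypothetical and only approximates what first_author_norm might contain; the real transliteration rules for that field are not described in this issue.

```python
import unicodedata


def ascii_key(name: str) -> str:
    """Hypothetical stand-in for an ASCII-transliterated sort key."""
    # NFKD splits "Š" into "S" plus a combining caron (U+030C).
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


authors = ["Šimek, M.", "Znojil, V.", "Simek, M."]

# Descending sort on the normalized key: both spellings of Simek share
# the same key, so they end up adjacent instead of straddling Znojil.
by_norm = sorted(authors, key=ascii_key, reverse=True)
print(by_norm)  # ['Znojil, V.', 'Šimek, M.', 'Simek, M.']
```

Note that this decomposition trick only covers letters that decompose into a base letter plus a diacritic; characters such as "ø" or "ł" have no such decomposition and would need an explicit transliteration table.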

aaccomazzi added the question label Jul 2, 2019
