`similar() + score` doesn't produce the same results as `topn(x, similar(), "score desc")` #185

romanchyla · 2021-12-23T18:25:38Z

```
(base) aaccomazzilap5:~ aaccomazzi$ alias curlads
alias curlads='curl -H '\''Authorization: Bearer:TOKEN'\'''
(base) aaccomazzilap5:~ aaccomazzi$ curlads 'https://ui.adsabs.harvard.edu/v1/search/query?fl=bibcode&p_=0&q=similar(%22solar%20wind%22%20SWEAP)&rows=500&sort=score%20desc%2C%20bibcode%20desc' > curl-similar.payload && curlads 'https://ui.adsabs.harvard.edu/v1/search/query?fl=bibcode&p_=0&q=topn(500%2C%20similar(%22solar%20wind%22%20SWEAP)%2C%20%22score%20desc%2C%20bibcode%20desc%22)&rows=500&sort=score%20desc%2C%20bibcode%20desc' > curl-topn.payload
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 25400  100 25400    0     0  59905      0 --:--:-- --:--:-- --:--:-- 59905
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 25439  100 25439    0     0   8756      0  0:00:02  0:00:02 --:--:--  8753
(base) aaccomazzilap5:~ aaccomazzi$ jq '{ bibcode: [ .response.docs[].bibcode ]}' < curl-similar.payload | sort > curl-similar.bib
(base) aaccomazzilap5:~ aaccomazzi$ jq '{ bibcode: [ .response.docs[].bibcode ]}' < curl-topn.payload | sort > curl-topn.bib
(base) aaccomazzilap5:~ aaccomazzi$ diff curl-similar.bib curl-topn.bib | wc -l
```
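The comparison in the session above can also be sketched in Python, which makes it easier to separate "different documents" from "same documents in a different order". This is only an illustration: it assumes the payloads have the `response.docs[].bibcode` shape used by the `jq` filter, and the function names `bibcodes_from_payload` and `compare_rankings` are hypothetical helpers, not part of any ADS tooling.

```python
import json
from pathlib import Path


def bibcodes_from_payload(payload):
    """Return bibcodes in the order the search response lists them."""
    return [doc["bibcode"] for doc in payload["response"]["docs"]]


def compare_rankings(first, second):
    """Report membership differences and the first position where the order differs."""
    set_first, set_second = set(first), set(second)
    # Index of the first rank at which the two lists disagree, or None if
    # they agree over their common length.
    first_mismatch = next(
        (i for i, (a, b) in enumerate(zip(first, second)) if a != b), None
    )
    return {
        "only_in_first": sorted(set_first - set_second),
        "only_in_second": sorted(set_second - set_first),
        "first_mismatch": first_mismatch,
    }


def compare_payload_files(path_a, path_b):
    """Load two saved JSON responses and compare their bibcode rankings."""
    a = bibcodes_from_payload(json.loads(Path(path_a).read_text()))
    b = bibcodes_from_payload(json.loads(Path(path_b).read_text()))
    return compare_rankings(a, b)
```

With the files from the session, the call would be `compare_payload_files("curl-similar.payload", "curl-topn.payload")`.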

1. using a stable sort (e.g. on bibcode, or classic_score) produces the same results for both queries
2. with millions of documents scored, and with score being the result of floating point calculations, we can expect out-of-order results (even when breaking ties on bibcode); in the first 500 hits I see 24 docs with the same score (score desc without bibcode)
3. comparing topn(...., "score desc") and similar(....)&sort=score+desc: the first 19 docs have the same scores, but the 20th has a different one, by quite a large delta (>0.001), so that is not a floating point issue
4. scores are monotonically decreasing (for topn(...) without exterior sort); so that rules out a potential bug in the collector (but that is a weak indication)
5. at this point, I'm suspicious of this: https://github.com/romanchyla/montysolr/blob/8a1871e21004d2a92744265e30740815ee20506e/contrib/adsabs/src/java/org/apache/lucene/search/AbstractSecondOrderCollector.java#L443 -- the only thing we have available to us inside the second order collector is a score (which was produced from score+bibcode higher up), but if two docs end up with the same score, we cannot break the tie again
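The tie-breaking concern in the last point can be illustrated with a small Python sketch. It is not the collector code: the bibcodes and scores are made-up placeholders, and the two functions only contrast sorting on `(score, bibcode)` with sorting on `score` alone, where the order of tied documents depends on the order in which they arrive.

```python
# Hypothetical (bibcode, score) pairs; the first two documents are tied on score.
DOCS = [("2020A", 1.5), ("2021B", 1.5), ("2019C", 2.0)]


def sort_with_tiebreak(docs):
    """Sort by score desc, then bibcode desc: independent of input order."""
    return sorted(docs, key=lambda d: (d[1], d[0]), reverse=True)


def sort_score_only(docs):
    """Sort by score desc only: tied docs keep whatever order they arrived in."""
    return sorted(docs, key=lambda d: d[1], reverse=True)


arrival_a = DOCS
arrival_b = list(reversed(DOCS))

# With the bibcode tie-breaker both arrival orders give the same ranking.
stable_a = sort_with_tiebreak(arrival_a)
stable_b = sort_with_tiebreak(arrival_b)

# With only the score available, the tied pair can come out in either order.
score_only_a = sort_score_only(arrival_a)
score_only_b = sort_score_only(arrival_b)
```

If a component downstream only receives the score, it is in the position of `sort_score_only`: it has no second key with which to restore a deterministic order among ties.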

unfortunately, for the topn I can't get debug output -- there is a bug resulting in a NullPointerException that I have to fix first, before I can really figure out why the scores are different


aaccomazzi · 2021-12-25T17:22:50Z

One additional observation: our score is a function of the lucene score and the boost factor. Is it possible that topn uses just the lucene score in truncating the list? This would explain the different lists of papers. The topn list has papers with a cumulative count of 5,300 citations whereas the list selected from the 500 top similar papers has 52,000!

romanchyla · 2021-12-29T00:00:41Z

@aaccomazzi your intuition is correct. Because of how the custom scoring is implemented, lucene_score + (cite_read_boost + AA constant), we'll be getting different values even if both calls (i.e. topn and the top-level custom rescoring) use the same parameters; they are doing different things
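A toy Python sketch of why the two calls can return different papers: truncating on the lucene score and rescoring afterwards is not the same operation as rescoring everything and then truncating. The additive formula follows the one quoted above; the document ids and numbers are invented for illustration, and `AA_CONSTANT = 0.5` is an assumption.

```python
AA_CONSTANT = 0.5  # assumed value of the "AA constant"

# Hypothetical docs: (id, lucene_score, cite_read_boost)
DOCS = [
    ("d1", 3.0, 0.0),
    ("d2", 2.8, 0.1),
    ("d3", 2.5, 0.9),  # modest lucene score, large boost
]


def ads_score(lucene_score, boost):
    """Custom score as described in the thread: lucene_score + (boost + constant)."""
    return lucene_score + (boost + AA_CONSTANT)


def rescore_then_truncate(docs, n):
    """Rank all docs by the custom score, then keep the top n."""
    ranked = sorted(docs, key=lambda d: ads_score(d[1], d[2]), reverse=True)
    return [d[0] for d in ranked[:n]]


def truncate_then_rescore(docs, n):
    """Keep the top n by lucene score only, then order those by the custom score."""
    top = sorted(docs, key=lambda d: d[1], reverse=True)[:n]
    ranked = sorted(top, key=lambda d: ads_score(d[1], d[2]), reverse=True)
    return [d[0] for d in ranked]
```

Here `rescore_then_truncate(DOCS, 2)` keeps the highly boosted `d3`, while `truncate_then_rescore(DOCS, 2)` has already dropped it before the boost is applied, which would be consistent with the large difference in citation counts reported above.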

It took me this long to fix the underlying bugs in the 2nd order collectors; these bugs were only affecting debug output, but were quite difficult to identify. I've then figured out how to modify topn() -- tailor is deploying it to DEV as I write this; but I'm not yet ready to commit to using this in prod.

Also to note: in dev, topn() is wrapped by custom -- so we'll be doing the computation twice; and because of the rescoring, we will not get the same order as when it is done once

But we'll be able to test...

romanchyla · 2022-02-18T20:05:42Z

placeholder: I'm going to include the customized topn() in the next release - but it is not yet the ideal solution

it is using the custom ads scoring formula `custom(SecondOrderQuery(title:foo, collector=SecondOrderCollectorTopN(2)), sum(float(cite_read_boost),const(0.5)))` where previously it would only use the lucene score: `SecondOrderQuery(title:foo, collector=SecondOrderCollectorTopN(2))`

the trouble is: docs get rescored twice (and change order) -- I have to solve this differently
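A small Python sketch of the double-rescoring problem. It assumes the additive combination described earlier in the thread (score + (cite_read_boost + 0.5)) is applied once inside topn() and once more by the outer wrapper; how `custom(...)` actually combines the values is not shown here, so treat that as an assumption. The documents and numbers are made up.

```python
CONST = 0.5  # mirrors const(0.5) in the query above

# Hypothetical docs: id -> (lucene_score, cite_read_boost)
DOCS = {
    "A": (1.0, 0.10),
    "B": (0.7, 0.35),
}


def rescore(score, boost):
    """One application of the assumed additive rescoring."""
    return score + (boost + CONST)


def rank(scores):
    """Return doc ids ordered by score, highest first."""
    return sorted(scores, key=scores.get, reverse=True)


# Rescored once: the intended ordering.
once = {doc: rescore(lucene, boost) for doc, (lucene, boost) in DOCS.items()}

# Rescored twice: the boost is counted a second time on top of the first result.
twice = {doc: rescore(once[doc], boost) for doc, (_, boost) in DOCS.items()}

order_once = rank(once)
order_twice = rank(twice)
```

In this toy case `A` ranks first after one rescoring and `B` ranks first after two, because the second pass weights the boost more heavily relative to the lucene score.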

JCRPaquin added bug debugging query score labels Aug 7, 2023
