Optimise second-chained properties results, refs 3722 #5036

Seb35 · 2021-08-03T23:11:57Z

This PR is made in reference to: #3722

This PR addresses or contains:

When a chained result is requested, there is already an optimisation for the first-chained property with a bulk request of this property accross all results (Optimize chain lookup in prefetch mode, refs 3722 #4572); this optimises further the second-chained property (then third, etc) with a bulk request of this property accross all results (results from the first-chained property results).
solves Chained properties not displayed when occuring multiple times #4988

The “prefetch cache” operation is now global: all prefetching is done at the first execution of PrefetchCache::prefetch() for all lines of the result, instead of one prefetching on each line.

The real cache (PrefetchCache->cache) have keys "PropertyName" or "-PropertyName" and the values are arrays [ "semantic object ID" => [ list of WikiPages ] ] ; so that it is easy to pick values in PrefetchCache::getPropertyValues(). (I say WikiPages here and after, but it could be other data types.)

The second cache (PrefetchCache->lookupCache) is aligned with the first one; its keys are the whole property chain like "Prop1.-Prop2.Prop3…PropN-1" for the N-level chain (so that it is empty for first-level chain) and the values are the WikiPages of the result for this level N; this is mainly used to be able to compute the next level starting from this result.

Earlier versions of this PR failed on tests p-0434.json and p-0467.json due to the vicious repetition of a same property in the chain, which why the whole property chain must be cached somewhere (here in lookupCache) and not only the property.

On a request #ask with 999 results and two 2-levels chain properties (with no common property), it took 18.4 seconds before this patch and 3.7 seconds with this patch (mean over a sample of 5 reloads of the page; it is not impossible other factors influence the performance like MySQL internal cache, but the figures are anyway quite different). This was on SMW 3.2.2 + MW 1.33 + ElasticStore + PHP 7.3.

This PR includes:

Tests (unit/integration)
CI build passed

Seb35 · 2021-08-03T23:35:49Z

@mwjames: this is the logical continuation of #4572, could you review it?

Seb35 · 2021-08-04T08:58:50Z

I will add a test to catch issue 4988, probably by adding a new Ask request in p-0434.json or p-0467.json.

When a chained result is requested, there is already an optimisation for the first-chained property with a bulk request of this property accross all results (SemanticMediaWiki#4572); this optimises further the second-chained property (then third, etc) with a bulk request of this property accross all results (results from the first-chained property results).

Seb35 · 2021-08-04T17:21:24Z

I added the parser test p-0468.json, but I’m not sure about the exact naming logic. I copied q-0601.json and adapted it to trigger issue #4988, which fails as expected before this PR.

gesinn-it-gea · 2022-05-17T13:47:31Z

I did a local test: looks good to me. @JeroenDeDauw what do you think?

jaideraf · 2024-04-23T12:26:56Z

I did a local test: looks good to me.

@JeroenDeDauw could this be merged? Can we run the current GitHub automated tests here? I don't understand how to do it (or I don't have the permissions to do it here by the UI).

JeroenDeDauw · 2024-04-23T12:34:58Z

I also do not know how to trigger the tests in this case. @Seb35 could do a rebase against master and force-push that, which should do the trick. Or we could do this ourselves and make a new PR.

Seb35 force-pushed the issue-4988 branch from 6b87451 to b297e09 Compare August 4, 2021 17:15

gesinn-it-gea requested review from mwjames and JeroenDeDauw and removed request for mwjames May 17, 2022 13:44

gesinn-it-gea added this to the SMW 4.0.2 milestone May 17, 2022

gesinn-it-gea requested a review from gesinn-it-wam May 18, 2022 14:38

gesinn-it-gea modified the milestones: SMW 4.0.2, SMW 4.0.3 Jul 12, 2022

jaideraf mentioned this pull request May 20, 2024

fix SMW\SQLStore\EntityStore\PrefetchCache cache key overwrap #5623

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimise second-chained properties results, refs 3722 #5036

Optimise second-chained properties results, refs 3722 #5036

Seb35 commented Aug 3, 2021 •

edited

Seb35 commented Aug 3, 2021

Seb35 commented Aug 4, 2021

Seb35 commented Aug 4, 2021

gesinn-it-gea commented May 17, 2022 •

edited

jaideraf commented Apr 23, 2024 •

edited

JeroenDeDauw commented Apr 23, 2024

Optimise second-chained properties results, refs 3722 #5036

Are you sure you want to change the base?

Optimise second-chained properties results, refs 3722 #5036

Conversation

Seb35 commented Aug 3, 2021 • edited

Seb35 commented Aug 3, 2021

Seb35 commented Aug 4, 2021

Seb35 commented Aug 4, 2021

gesinn-it-gea commented May 17, 2022 • edited

jaideraf commented Apr 23, 2024 • edited

JeroenDeDauw commented Apr 23, 2024

Seb35 commented Aug 3, 2021 •

edited

gesinn-it-gea commented May 17, 2022 •

edited

jaideraf commented Apr 23, 2024 •

edited