Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise second-chained properties results, refs 3722 #5036

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Seb35
Copy link
Contributor

@Seb35 Seb35 commented Aug 3, 2021

This PR is made in reference to: #3722

This PR addresses or contains:

The “prefetch cache” operation is now global: all prefetching is done at the first execution of PrefetchCache::prefetch() for all lines of the result, instead of one prefetching on each line.

The real cache (PrefetchCache->cache) have keys "PropertyName" or "-PropertyName" and the values are arrays [ "semantic object ID" => [ list of WikiPages ] ] ; so that it is easy to pick values in PrefetchCache::getPropertyValues(). (I say WikiPages here and after, but it could be other data types.)

The second cache (PrefetchCache->lookupCache) is aligned with the first one; its keys are the whole property chain like "Prop1.-Prop2.Prop3…PropN-1" for the N-level chain (so that it is empty for first-level chain) and the values are the WikiPages of the result for this level N; this is mainly used to be able to compute the next level starting from this result.

Earlier versions of this PR failed on tests p-0434.json and p-0467.json due to the vicious repetition of a same property in the chain, which why the whole property chain must be cached somewhere (here in lookupCache) and not only the property.

On a request #ask with 999 results and two 2-levels chain properties (with no common property), it took 18.4 seconds before this patch and 3.7 seconds with this patch (mean over a sample of 5 reloads of the page; it is not impossible other factors influence the performance like MySQL internal cache, but the figures are anyway quite different). This was on SMW 3.2.2 + MW 1.33 + ElasticStore + PHP 7.3.

This PR includes:

  • Tests (unit/integration)
  • CI build passed

@Seb35
Copy link
Contributor Author

Seb35 commented Aug 3, 2021

@mwjames: this is the logical continuation of #4572, could you review it?

@Seb35
Copy link
Contributor Author

Seb35 commented Aug 4, 2021

I will add a test to catch issue 4988, probably by adding a new Ask request in p-0434.json or p-0467.json.

When a chained result is requested, there is already an optimisation
for the first-chained property with a bulk request of this property
accross all results (SemanticMediaWiki#4572); this optimises further the second-chained
property (then third, etc) with a bulk request of this property accross
all results (results from the first-chained property results).
@Seb35
Copy link
Contributor Author

Seb35 commented Aug 4, 2021

I added the parser test p-0468.json, but I’m not sure about the exact naming logic. I copied q-0601.json and adapted it to trigger issue #4988, which fails as expected before this PR.

@gesinn-it-gea gesinn-it-gea requested review from mwjames and JeroenDeDauw and removed request for mwjames May 17, 2022 13:44
@gesinn-it-gea gesinn-it-gea added this to the SMW 4.0.2 milestone May 17, 2022
@gesinn-it-gea
Copy link
Member

gesinn-it-gea commented May 17, 2022

I did a local test: looks good to me. @JeroenDeDauw what do you think?

@jaideraf
Copy link
Member

jaideraf commented Apr 23, 2024

I did a local test: looks good to me.

@JeroenDeDauw could this be merged? Can we run the current GitHub automated tests here? I don't understand how to do it (or I don't have the permissions to do it here by the UI).

@JeroenDeDauw
Copy link
Member

I also do not know how to trigger the tests in this case. @Seb35 could do a rebase against master and force-push that, which should do the trick. Or we could do this ourselves and make a new PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants