-
-
Notifications
You must be signed in to change notification settings - Fork 218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimise second-chained properties results, refs 3722 #5036
base: master
Are you sure you want to change the base?
Conversation
I will add a test to catch issue 4988, probably by adding a new Ask request in p-0434.json or p-0467.json. |
When a chained result is requested, there is already an optimisation for the first-chained property with a bulk request of this property accross all results (SemanticMediaWiki#4572); this optimises further the second-chained property (then third, etc) with a bulk request of this property accross all results (results from the first-chained property results).
I added the parser test p-0468.json, but I’m not sure about the exact naming logic. I copied q-0601.json and adapted it to trigger issue #4988, which fails as expected before this PR. |
I did a local test: looks good to me. @JeroenDeDauw what do you think? |
@JeroenDeDauw could this be merged? Can we run the current GitHub automated tests here? I don't understand how to do it (or I don't have the permissions to do it here by the UI). |
I also do not know how to trigger the tests in this case. @Seb35 could do a rebase against master and force-push that, which should do the trick. Or we could do this ourselves and make a new PR. |
This PR is made in reference to: #3722
This PR addresses or contains:
The “prefetch cache” operation is now global: all prefetching is done at the first execution of
PrefetchCache::prefetch()
for all lines of the result, instead of one prefetching on each line.The real cache (PrefetchCache->cache) have keys "PropertyName" or "-PropertyName" and the values are arrays [ "semantic object ID" => [ list of WikiPages ] ] ; so that it is easy to pick values in PrefetchCache::getPropertyValues(). (I say WikiPages here and after, but it could be other data types.)
The second cache (PrefetchCache->lookupCache) is aligned with the first one; its keys are the whole property chain like "Prop1.-Prop2.Prop3…PropN-1" for the N-level chain (so that it is empty for first-level chain) and the values are the WikiPages of the result for this level N; this is mainly used to be able to compute the next level starting from this result.
Earlier versions of this PR failed on tests p-0434.json and p-0467.json due to the vicious repetition of a same property in the chain, which why the whole property chain must be cached somewhere (here in lookupCache) and not only the property.
On a request #ask with 999 results and two 2-levels chain properties (with no common property), it took 18.4 seconds before this patch and 3.7 seconds with this patch (mean over a sample of 5 reloads of the page; it is not impossible other factors influence the performance like MySQL internal cache, but the figures are anyway quite different). This was on SMW 3.2.2 + MW 1.33 + ElasticStore + PHP 7.3.
This PR includes: