PrefixMapStd is very slow for lookups that 'miss' #1474

Open
Aklakan opened this issue Aug 9, 2022 · 0 comments · May be fixed by #1475
Aklakan commented Aug 9, 2022

PR: #1475

Version

4.6.0-SNAPSHOT

What happened?

Another follow-up on https://issues.apache.org/jira/browse/JENA-2309
I investigated the slow PrefixMapStd further. When a lookup 'hits' its fast-track path it is very fast. On a 'miss', however, it becomes extremely slow, because it then falls back to iterating over all prefixes in the map.
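
To illustrate the hit/miss behaviour described above, here is a simplified sketch. It is not Jena's actual PrefixMapStd code: the class `NaivePrefixLookup`, its methods, and the `scans` counter are hypothetical, and the assumption that the fast track splits the IRI after the last `#` or `/` is mine, made to match the benchmark numbers below (a `:` separator never hits the fast track).

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the real PrefixMapStd.
final class NaivePrefixLookup {
    // prefix label -> namespace IRI
    private final Map<String, String> prefixToIri = new LinkedHashMap<>();
    // namespace IRI -> prefix label, used by the fast track
    private final Map<String, String> iriToPrefix = new HashMap<>();
    // counts how often the slow path (full scan) was taken
    int scans = 0;

    void add(String prefix, String namespaceIri) {
        prefixToIri.put(prefix, namespaceIri);
        iriToPrefix.put(namespaceIri, prefix);
    }

    /** Returns "prefix:localName", or null if no prefix matches. */
    String abbreviate(String iri) {
        // Fast track (assumed): split after the last '#' or '/' and
        // look the candidate namespace up directly in a hash map.
        int idx = Math.max(iri.lastIndexOf('#'), iri.lastIndexOf('/'));
        if (idx >= 0) {
            String prefix = iriToPrefix.get(iri.substring(0, idx + 1));
            if (prefix != null) {
                return prefix + ":" + iri.substring(idx + 1);
            }
        }
        // Miss: fall back to scanning every entry, O(number of prefixes).
        scans++;
        for (Map.Entry<String, String> e : prefixToIri.entrySet()) {
            if (iri.startsWith(e.getValue())) {
                return e.getKey() + ":" + iri.substring(e.getValue().length());
            }
        }
        return null;
    }
}
```

With many prefixes registered, every IRI that misses the fast track pays for a full scan, which is the cost pattern this issue reports.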

A small benchmark runner (part of the PR) shows that PrefixMapStd can easily be made to take dozens of seconds for lookups that can actually be completed in a fraction of a second.
I therefore created a new PrefixMap, intended as a replacement, that combines the best of both worlds: the fast track of PrefixMapStd and the trie backing (together with a Guava cache) of the removed FastAbbreviatingPrefixMap.
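
As a rough idea of why a trie helps on misses, here is a minimal longest-prefix-match sketch using only the JDK. It is not the code from the PR: the class `TriePrefixLookup` and its methods are hypothetical, there is no fast track and no Guava cache, and it does not check whether the remaining local name is valid.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the PrefixMap from the PR.
final class TriePrefixLookup {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        String prefix; // non-null if a registered namespace IRI ends here
    }

    private final Node root = new Node();

    void add(String prefix, String namespaceIri) {
        Node node = root;
        for (int i = 0; i < namespaceIri.length(); i++) {
            node = node.children.computeIfAbsent(namespaceIri.charAt(i), c -> new Node());
        }
        node.prefix = prefix;
    }

    /** Returns "prefix:localName" for the longest matching namespace, or null. */
    String abbreviate(String iri) {
        Node node = root;
        String bestPrefix = null;
        int bestLength = -1;
        // Walk the IRI character by character. The cost is bounded by the
        // IRI length, not by the number of prefixes in the map.
        for (int i = 0; i < iri.length(); i++) {
            node = node.children.get(iri.charAt(i));
            if (node == null) {
                break; // no registered namespace continues along this path
            }
            if (node.prefix != null) {
                bestPrefix = node.prefix;
                bestLength = i + 1;
            }
        }
        return bestPrefix == null ? null : bestPrefix + ":" + iri.substring(bestLength);
    }
}
```

A miss ends as soon as the walk leaves the trie, so it costs about the same as a hit.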

Approach 0 = Jena's original PrefixMapStd:

Run 1 with base <http://example.org/> and separator / using approach 0: Static IRI lookups took 0.082 seconds
Run 1 with base <http://example.org/> and separator / using approach 0: Dynamic IRI lookups took 33.443 seconds
Run 1 with base <urn:foo:bar:> and separator : using approach 0: Static IRI lookups took 25.482 seconds
Run 1 with base <urn:foo:bar:> and separator : using approach 0: Dynamic IRI lookups took 31.088 seconds

Approach 1 = my optimized PrefixMap:

Run 1 with base <http://example.org/> and separator / using approach 1: Static IRI lookups took 0.070 seconds
Run 1 with base <http://example.org/> and separator / using approach 1: Dynamic IRI lookups took 0.401 seconds
Run 1 with base <urn:foo:bar:> and separator : using approach 1: Static IRI lookups took 0.099 seconds
Run 1 with base <urn:foo:bar:> and separator : using approach 1: Dynamic IRI lookups took 0.433 seconds

Regarding the interface contract of PrefixMap, I am not sure whether synchronized / thread-safe maps need to be used.
PrefixMapStd used a ConcurrentHashMap, but if synchronization is not actually a concern, a LinkedHashMap, which gives control over the order, may be a better choice.
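
For reference, a small sketch of the trade-off between these map types, using only JDK classes. The class `PrefixOrderDemo` and its helper method are hypothetical and do not reflect how PrefixMapStd is wired up. A LinkedHashMap iterates in insertion order, a ConcurrentHashMap makes no ordering guarantee, and `Collections.synchronizedMap` is one way to keep the ordering if thread safety turns out to be required.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only.
final class PrefixOrderDemo {
    /** Inserts three prefixes and returns the keys in the map's iteration order. */
    static List<String> keysAfterInsert(Map<String, String> map) {
        map.put("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#");
        map.put("ex", "http://example.org/");
        map.put("a", "urn:foo:bar:");
        // Single-threaded here; with a synchronizedMap, concurrent callers
        // would have to synchronize on the map while iterating.
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // Insertion order is preserved.
        System.out.println(keysAfterInsert(new LinkedHashMap<>()));
        // Thread-safe wrapper that still preserves insertion order.
        System.out.println(keysAfterInsert(Collections.synchronizedMap(new LinkedHashMap<>())));
        // Thread-safe, but the iteration order is unspecified.
        System.out.println(keysAfterInsert(new ConcurrentHashMap<>()));
    }
}
```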

Are you interested in making a pull request?

Yes

@Aklakan Aklakan added the bug label Aug 9, 2022
@Aklakan Aklakan linked a pull request Aug 9, 2022 that will close this issue
4 tasks
Aklakan added a commit to Aklakan/jena that referenced this issue Aug 12, 2022
Aklakan added a commit to Aklakan/jena that referenced this issue Aug 24, 2022
Aklakan added a commit to Aklakan/jena that referenced this issue Aug 24, 2022
Aklakan added a commit to Aklakan/jena that referenced this issue Aug 30, 2022