[EXPERIMENT] Branch mispredictions in twitter.json #2061

Draft: wants to merge 10 commits into master

@jkeiser jkeiser commented Sep 5, 2023

This is an accounting of what causes our various branch misses in stage 2. I took twitter.json and transformed it, through a series of steps, into a single file containing a single array, with an empty string wherever the original had a scalar. At every step I kept the file size identical and the values in roughly the same positions. With the exception of the step that removes array nesting, I also kept the number of structurals and the size of the stage 2 output identical.

The raw results from icelake are here, but here's the rundown. These numbers assume that the branch misses from these sources are completely independent, which probably isn't the case but probably isn't completely wrong, either. Fascinatingly to me, this does seem to be a complete list: removing all of these sources of misprediction brings branch misses down from 773 to 3.

  • Integers with more than 1 digit (*) - 30% of branch misses in the file
  • Container type switching - 28%
  • String length - 18%
  • String/number/bool/null type switching - 17%
  • Backslashes - 6%
  • UTF-8 - 1%
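
To make the independence assumption concrete, here is a minimal sketch that turns the shares above into approximate miss counts. It assumes each share applies to the 773 baseline, in whatever unit the raw results use. The names `SHARES` and `estimated_misses` are made up for illustration and are not part of the PR.

```python
# Shares of branch misses per source, copied from the list above (percent).
SHARES = {
    "Integers with more than 1 digit": 30,
    "Container type switching": 28,
    "String length": 18,
    "String/number/bool/null type switching": 17,
    "Backslashes": 6,
    "UTF-8": 1,
}

# Baseline figure quoted in the description (unit as in the raw results).
BASELINE_MISSES = 773


def estimated_misses(baseline, percent):
    """Misses attributed to one source, assuming sources are independent."""
    return baseline * percent / 100


if __name__ == "__main__":
    for name, percent in SHARES.items():
        print(f"{name}: ~{estimated_misses(BASELINE_MISSES, percent):.0f}")
    print("sum of shares:", sum(SHARES.values()), "%")
```

If the sources interact, removing one may change how many misses another causes, so these per-source counts are rough estimates only.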

(*) Of note: a file where all numbers are replaced with 18-digit integers (or likewise with 8-digit integers) seems to be just about as unpredictable as a file where the numbers have all sorts of different sizes. 1-digit numbers do not have this issue. I'm not sure why this is.
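
For anyone who wants to poke at the fixed-width case, the following is a hypothetical sketch of how an array of same-width integers could be generated. It is not the transformation used for this PR: it does not preserve the size or layout of twitter.json, and the function names are invented for the example.

```python
import json
import random


def fixed_width_integers(count, digits, seed=0):
    """Return `count` random non-negative integers with exactly `digits` digits."""
    rng = random.Random(seed)  # fixed seed so the input file is reproducible
    low = 10 ** (digits - 1) if digits > 1 else 0
    high = 10 ** digits - 1
    return [rng.randint(low, high) for _ in range(count)]


def to_json_array(values):
    """Serialize as a compact JSON array with no whitespace between tokens."""
    return json.dumps(values, separators=(",", ":"))


if __name__ == "__main__":
    # 18-digit values stay below 2**63, so they still fit in a signed 64-bit integer.
    print(to_json_array(fixed_width_integers(5, 18)))
```

Changing `digits` to 8 or 1 gives the other two cases mentioned in the note.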

@jkeiser jkeiser force-pushed the jkeiser/branch-test branch 2 times, most recently from 02b0d46 to 65c243a Compare September 12, 2023 19:01