
Add a variant of runBenchmark which also takes iteration range. #254

Open
phadej opened this issue Jan 12, 2022 · 2 comments

Comments

@phadej
Contributor

phadej commented Jan 12, 2022

Currently there is:

runBenchmark :: Benchmarkable -> Double -> IO (Vector Measured, Double)

Run a single benchmark, and return measurements collected while executing it, along with the amount of time the measurement process took.

The Double argument is a lower bound on how long the benchmarking process should take. In practice, this time limit may be exceeded in order to generate enough data to perform meaningful statistical analyses.
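For context, a minimal usage sketch of the current API (assuming the post-split module layout, i.e. Criterion.Measurement and Criterion.Measurement.Types; the workload and the 5-second time bound are arbitrary):

```haskell
import qualified Data.Vector as V
import Criterion.Measurement (initializeTime, runBenchmark, secs)
import Criterion.Measurement.Types (nf)

main :: IO ()
main = do
  initializeTime  -- the clock must be initialized before measuring
  -- Benchmark evaluating a small list to normal form; 5.0 s is the time bound.
  (meas, took) <- runBenchmark (nf (map (* 2)) [1 .. 1000 :: Int]) 5.0
  putStrLn $ show (V.length meas) ++ " samples in " ++ secs took
```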

I'd like to have a variant where we can specify optional lower and upper bounds for the iteration count (e.g. run at least 10 times, but no more than 100 times).

A lower bound would help ensure statistical significance; an upper bound would avoid running an individual benchmark for too long.

I think we'd then need both lower and upper bounds for time as well (the current time argument actually behaves more like a soft upper bound, but since there are no other end criteria it is effectively a lower bound too).

I think it would be best to put these four options into a record, to avoid Double-blindness (four positional numeric arguments are easy to mix up).
If no bounds are specified, we could either fail or fall back to some default time bound.

EDIT: summary:

  • The end criteria would be in N dimensions: you can stop only when all lower bounds have been crossed and at least one upper bound has been reached (see the sketch below).
  • If no lower bound is specified, it defaults to zero.
  • If no upper bounds are specified, the time bound is set to a default; the other unset upper bounds are treated as "infinity"/unreachable.
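Roughly, as a sketch (all names here are hypothetical; none of this exists in criterion yet):

```haskell
-- Hypothetical sketch only; none of these names are existing criterion API.
data RunOpts = RunOpts
  { minIters :: Maybe Int     -- run at least this many iterations (unset = 0)
  , maxIters :: Maybe Int     -- stop once this many iterations are done
  , minTime  :: Maybe Double  -- run for at least this many seconds (unset = 0)
  , maxTime  :: Maybe Double  -- stop once this many seconds have elapsed
  }

-- End criterion: all lower bounds crossed, and at least one upper bound hit.
-- If no upper bound is set at all, the caller would substitute a default maxTime.
shouldStop :: RunOpts -> Int -> Double -> Bool
shouldStop opts iters elapsed = lowersMet && someUpperHit
  where
    lowersMet    = maybe True  (iters   >=) (minIters opts)
                && maybe True  (elapsed >=) (minTime  opts)
    someUpperHit = maybe False (iters   >=) (maxIters opts)
                || maybe False (elapsed >=) (maxTime  opts)
```

The variant itself could then be something like runBenchmarkWith :: RunOpts -> Benchmarkable -> IO (Vector Measured, Double), with the existing runBenchmark kept as a thin wrapper that only sets the time bound.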

EDIT 2: tasty-bench does the analysis and decides the end criteria on the fly. That's another option, and it could be used to implement the above; but I think the above is still conceptually simpler.

I offer to write a patch.

@RyanGlScott
Member

I agree that we should offer more control over how this works for the benefit of power users. See also #210/#218, where users have requested the ability to reduce the number of iterations of very long-lived benchmarks. My only request is that we should very carefully document the pitfalls of changing the defaults that criterion uses, as it has the potential to make the data more susceptible to statistical noise.

If you submit a PR, I can review it more carefully. From a conceptual level, however, I'd support this direction of travel.

@phadej
Contributor Author

phadej commented Jan 12, 2022

Yes, I wouldn't change the current defaults, i.e. the behavior of the already-existing functions.

phadej added a commit to phadej/criterion that referenced this issue Jan 12, 2022
Refactor runBenchmark to use it.
Resolves haskell#254
phadej added a commit to phadej/criterion that referenced this issue Jan 13, 2022
Refactor runBenchmark to use it.
Resolves haskell#254