Add runBenchmarkWith, general benchmark runner #255

phadej · 2022-01-12T16:50:16Z

Refactor runBenchmark to use it.
Resolves #254

I opted out of generating a data type with configuration, etc. The logic is simply easier to write directly as code.

phadej · 2022-01-12T17:56:55Z

I'll add semigroups compat if this is otherwise ok, or when I make tweaks.

RyanGlScott · 2022-01-12T22:36:03Z

criterion-measurement/src/Criterion/Measurement.hs

@@ -289,41 +291,69 @@ runBenchmark :: Benchmarkable
             -- exceeded in order to generate enough data to perform
             -- meaningful statistical analyses.
             -> IO (V.Vector Measured, Double)
-runBenchmark bm timeLimit = do
+runBenchmark bm timeLimit = runBenchmarkWith endAt 0 bm where
+  endAt :: Int -> Int64 -> Double -> NonEmpty Measured -> Double -> Maybe (Int64, Double)


Since this is the default behavior for runBenchmarkWith's termination check, I'd favor factoring this out into a top-level definition so that a user can use it if they wish to tweak it slightly. Since this definition closes over timeLimit, it might make things more clear to turn this into a newtype:

    newtype TerminationCheck s = TerminationCheck
      { getTerminationCheck :: Int -> Int64 -> Double -> NonEmpty Measured -> s -> Maybe (Int64, s) }

    defaultTerminationCheck :: Double -> TerminationCheck s
    defaultTerminationCheck timeLimit = TerminationCheck $ \count iters delta ms prev -> do
      ...

Then you can mention in the Haddocks for runBenchmark that runBenchmark bm timeLimit = runBenchmarkWith (defaultTerminationCheck timeLimit) 0 bm.
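A minimal sketch of how this suggestion could be laid out, not the code from the PR. Several things here are assumptions made for illustration: `Measured` is a one-field stand-in for criterion's record, the `Double` argument is treated as elapsed time, the state is fixed to `Double`, `threshold`, `nextIters` and the three termination conditions are guesses at what a default might check, and `runWith` is a pure stand-in for `runBenchmarkWith` that takes a simulated cost function instead of measuring anything.

```haskell
import Data.Int (Int64)
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- Stand-in for criterion's Measured; only the field this sketch needs.
newtype Measured = Measured { measTime :: Double }

-- Arguments: run count (from zero), iterations just run, elapsed time
-- (assumed meaning), measurements so far (newest first), user state.
-- 'Nothing' terminates; 'Just (nextIters, nextState)' continues.
newtype TerminationCheck s = TerminationCheck
  { getTerminationCheck
      :: Int -> Int64 -> Double -> NonEmpty Measured -> s -> Maybe (Int64, s)
  }

-- Assumed cutoff (seconds) below which a measurement counts as too short.
threshold :: Double
threshold = 0.03

-- Assumed growth rule for the iteration count.
nextIters :: Int64 -> Int64
nextIters i = max (i + 1) (ceiling (fromIntegral i * 1.05 :: Double))

-- Assumed default: stop once the time limit is spent, enough time above
-- the threshold has accumulated, and at least four samples exist.
defaultTerminationCheck :: Double -> TerminationCheck Double
defaultTerminationCheck timeLimit =
  TerminationCheck $ \count iters elapsed (m :| _) prev ->
    let overThresh = max 0 (measTime m - threshold) + prev
        done = elapsed >= timeLimit
            && overThresh > threshold * 10
            && count + 1 >= 4
    in if done then Nothing else Just (nextIters iters, overThresh)

-- Pure stand-in for runBenchmarkWith: 'cost' simulates how long a run
-- of n iterations takes. Returns measurements in chronological order.
runWith :: (Int64 -> Double) -> TerminationCheck s -> s -> [Measured]
runWith cost (TerminationCheck check) = go 0 1 0 []
  where
    go count iters elapsed acc st =
      let m        = Measured (cost iters)
          elapsed' = elapsed + measTime m
          ms       = m :| acc
      in case check count iters elapsed' ms st of
           Nothing            -> reverse (NE.toList ms)
           Just (iters', st') -> go (count + 1) iters' elapsed' (m : acc) st'
```

In this sketch, `runWith cost (defaultTerminationCheck 1) 0` plays the role of `runBenchmarkWith (defaultTerminationCheck timeLimit) 0 bm`.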

phadej · 2022-01-13 (edited)

I'm not so sure about that. The function would need to be factored so that its pieces are reusable, but I don't know how to factor it. There are at least two parts, updating the state and deciding termination, and the latter already consists of three checks.

I.e., you can make it end sooner, but I don't see how you could make it run longer (in some situations).

RyanGlScott · 2022-01-14

I think I explained myself poorly in #255 (comment), so let me try again. I agree that because termination checks will need to perform arbitrary computation, there's not a good way in general to configure everything that a termination check might want to do. That being said, I still think there is value in exposing the default termination check to users. This is because in #218, a user wishes to do what runBenchmark does plus an additional constraint on the number of iterations performed. In terms of code, this could look something like this:

    longLivedComputationCheck :: TerminationCheck r s
    longLivedComputationCheck = TerminationCheck $ \... -> do
      getTerminationCheck defaultTerminationCheck ...
      if <time exceeds certain threshold>
        then Left ...
        else Right ...

For some use cases, there might not be a better alternative than copy-pasting the source code of defaultTerminationCheck and tweaking it, but I don't want that to always be the case.
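A hedged sketch of the reuse argued for here: wrapping an existing check so that it additionally stops once a total iteration budget is used up. It uses the `Maybe`-returning shape from the earlier suggestion rather than the `Left`/`Right` form, whose exact type is not shown in this thread. `withIterationBudget`, `timeLimitCheck`, and the one-field `Measured` are hypothetical names invented for the example.

```haskell
import Data.Int (Int64)
import Data.List.NonEmpty (NonEmpty (..))

-- Stand-in for criterion's Measured.
newtype Measured = Measured { measTime :: Double }

-- 'Nothing' terminates; 'Just (nextIters, nextState)' continues.
newtype TerminationCheck s = TerminationCheck
  { getTerminationCheck
      :: Int -> Int64 -> Double -> NonEmpty Measured -> s -> Maybe (Int64, s)
  }

-- Hypothetical inner check: stop when the elapsed time (assumed meaning
-- of the Double argument) reaches the limit, otherwise double the count.
timeLimitCheck :: Double -> TerminationCheck ()
timeLimitCheck limit = TerminationCheck $ \_ iters elapsed _ () ->
  if elapsed >= limit then Nothing else Just (iters * 2, ())

-- Hypothetical combinator: defer to an existing check, but also stop
-- once 'budget' iterations have been run in total. The running total is
-- paired with the inner check's state.
withIterationBudget
  :: Int64 -> TerminationCheck s -> TerminationCheck (Int64, s)
withIterationBudget budget (TerminationCheck inner) =
  TerminationCheck $ \count iters elapsed ms (used, st) ->
    let used' = used + iters
    in if used' >= budget
         then Nothing
         else do
           -- The inner check may still end the run on its own terms.
           (next, st') <- inner count iters elapsed ms st
           -- Clamp the next run so the budget is never exceeded.
           Just (min next (budget - used'), (used', st'))
```

A wrapper like this can only make a run end sooner. When the inner check returns `Nothing`, its next state is gone, so extending a run this way is not possible without changing the inner check itself.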

RyanGlScott · 2022-01-12T22:37:25Z

criterion-measurement/src/Criterion/Measurement.hs

+-- and should return 'Nothing' for run to terminate
+-- or @'Just' (nextIters, nextState)@ for run to continue with new iteration count and state.
+--
+runBenchmarkWith :: forall s. (Int -> Int64 -> Double -> NonEmpty Measured -> s -> Maybe (Int64, s))


Speaking of which, what is the best way to tweak the default behavior? I imagine that a common use case would be to use runBenchmarkWith to impose a lower maximum on the number of iterations being run, as per #210/#218. Is this straightforward to accomplish with this design?

phadej · 2022-01-13

I don't see another way than copying the source and tweaking it.

RyanGlScott · 2022-01-12T22:38:06Z

criterion-measurement/src/Criterion/Measurement.hs

+-- * user defined state
+--
+-- and should return 'Nothing' for run to terminate
+-- or @'Just' (nextIters, nextState)@ for run to continue with new iteration count and state.


This needs a loud disclaimer that you should be careful when picking a termination check, as its implementation could affect the statistical quality of the data.

phadej · 2022-01-13

analyseSample doesn't have any preconditions specified. Looking at the code, it apparently needs at least 4 samples; beyond that, the statistical quality is estimated (in the confidence interval).

I.e., I don't know how to put it other than "The termination check should be picked so that the data generated is suitable for whatever further analysis is performed on it". Isn't that obvious?

RyanGlScott · 2022-01-14

To be clear, I'm not looking for a super-detailed analysis of how to make the data statistically significant. I'm just looking for a generic warning saying that you should exercise caution when using this function, as it has the potential to distort the meaningfulness of the data if implemented haphazardly.
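One way a caller might guard against a termination check that stops too early is to validate the sample count before any analysis. The sketch below is only an illustration: `validateSamples` and `minSamples` are hypothetical names, and the value 4 is taken from the remark in this thread that the analysis apparently needs at least 4 samples, not from documented behaviour.

```haskell
-- Assumed minimum sample count, based on the remark in this thread.
minSamples :: Int
minSamples = 4

-- Reject a run whose custom termination check produced too few samples
-- for the later analysis to be meaningful.
validateSamples :: [a] -> Either String [a]
validateSamples ms
  | n < minSamples =
      Left ("only " ++ show n ++ " samples; need at least " ++ show minSamples)
  | otherwise = Right ms
  where
    n = length ms
```

A count check like this catches only the crudest problem. It says nothing about whether the retained measurements are long enough or representative.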

RyanGlScott · 2022-01-12T22:38:35Z

criterion-measurement/src/Criterion/Measurement.hs

+--
+-- The first argument is termination check, it takes as arguments:
+--
+-- * current run count (starting from zero)


If TerminationCheck were factored out into a newtype, these Haddocks could be moved to the newtype instead.

phadej · 2022-01-13T09:04:53Z

I changed the type of the function. It now better fits the use case where some analysis is already done at each step (in a roughly Bayesian style), so you won't end up doing it twice on the last sample (and without access to the state).
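The revised type is not shown in this thread, so the following is a sketch under assumptions rather than the PR's code. It assumes the check returns `Left result` to stop and `Right (nextIters, nextState)` to continue, and it drops the run count and measurement list to stay short. A running standard-error test (Welford's online algorithm) stands in for whatever per-step analysis the author had in mind. `Stats`, `Check`, `stdErrCheck`, and `runWith` are hypothetical names.

```haskell
import Data.Int (Int64)

-- Running statistics carried as the user state (Welford's algorithm).
data Stats = Stats { sCount :: !Int, sMean :: !Double, sM2 :: !Double }

emptyStats :: Stats
emptyStats = Stats 0 0 0

addSample :: Double -> Stats -> Stats
addSample x (Stats n mean m2) =
  let n'    = n + 1
      d     = x - mean
      mean' = mean + d / fromIntegral n'
      m2'   = m2 + d * (x - mean')
  in Stats n' mean' m2'

-- Assumed shape of the revised check: iterations just run, time taken,
-- state; 'Left r' stops with a result, 'Right' continues.
type Check r s = Int64 -> Double -> s -> Either r (Int64, s)

-- Stop once at least four samples exist and the standard error of the
-- per-iteration time is within 'tol'; the result is the running mean,
-- so no separate analysis pass over the last sample is needed.
stdErrCheck :: Double -> Check Double Stats
stdErrCheck tol iters t st
  | sCount st' >= 4 && stdErr <= tol = Left (sMean st')
  | otherwise                        = Right (iters, st')
  where
    st'    = addSample (t / fromIntegral iters) st
    n      = fromIntegral (sCount st')
    stdErr = sqrt (sM2 st' / (n - 1) / n)

-- Pure stand-in driver: 'cost' simulates the time for n iterations.
runWith :: (Int64 -> Double) -> Check r s -> s -> r
runWith cost check = go 1
  where
    go iters st = case check iters (cost iters) st of
      Left r              -> r
      Right (iters', st') -> go iters' st'
```

The point of the `Either` shape in this sketch is that the value computed while deciding to stop is the value returned, and the check has the final state in hand when it produces it.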

phadej · 2022-12-10T00:12:13Z

I don't need this functionality anymore.

phadej force-pushed the runBenchmarkWith branch 3 times, most recently from 7fe703e to 9b89f0c Compare January 12, 2022 17:32

RyanGlScott reviewed Jan 12, 2022

Add runBenchmarkWith, general benchmark runner

235709d

Refactor runBenchmark to use it. Resolves haskell#254

phadej force-pushed the runBenchmarkWith branch from 9b89f0c to 235709d Compare January 13, 2022 09:04

kderme pushed a commit to input-output-hk/criterion that referenced this pull request Feb 22, 2022

Mention haskell#244/haskell#255 in the changelog

bfc31ff

[ci skip]

phadej closed this Dec 10, 2022
