-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add runBenchmarkWith, general benchmark runner #255
Conversation
7fe703e
to
9b89f0c
Compare
I'll add semigroups compat if this is otherwise ok, or when I make tweaks. |
@@ -289,41 +291,69 @@ runBenchmark :: Benchmarkable | |||
-- exceeded in order to generate enough data to perform | |||
-- meaningful statistical analyses. | |||
-> IO (V.Vector Measured, Double) | |||
runBenchmark bm timeLimit = do | |||
runBenchmark bm timeLimit = runBenchmarkWith endAt 0 bm where | |||
endAt :: Int -> Int64 -> Double -> NonEmpty Measured -> Double -> Maybe (Int64, Double) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is the default behavior for runBenchmarkWith
's termination check, I'd favor factoring this out into a top-level definition so that a user can use it if they wish to tweak it slightly. Since this definition closes over timeLimit
, it might make things more clear to turn this into a newtype:
newtype TerminationCheck s = TerminationCheck
{ getTerminationCheck :: Int -> Int64 -> Double -> NonEmpty Measured -> s -> Maybe (Int64, s) }
defaultTerminationCheck :: Double -> TerminationCheck s
defaultTerminationCheck timeLimit = TerminationCheck $ \count iters delta ms prev -> do ...
Then you can mention in the Haddocks for runBenchmark
that runBenchmark bm timeLimit = runBenchmarkWith (defaultTerminationCheck timeLimit) 0 bm
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not so sure about. That function needs to be factored so its pieces could be usable, but I don't know how we could factor it. There are at least two parts: updating the state and deciding termination, but that consist of three checks already.
i.e. you can make it end sooner, but I don't see how you can make it run longer (in some situations).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I explained myself poorly in #255 (comment), so let me try again. I agree that because termination checks will need to perform arbitrary computation, there's not a good way in general to configure everything that a termination check might want to do. That being said, I still think there is value in exposing the default termination check to users. This is because in #218, a user wishes to do what runBenchmark
does plus an additional constraint on the number of iterations performed. In terms of code, this could look something like this:
longLivedComputationCheck :: TerminationCheck r s
longLivedComputationCheck = TerminationCheck $ \... -> do
getTerminationCheck defaultTerminationCheck ...
if <time exceeds certain threshold>
then Left ...
else Right ...
For some use cases, there might not be a better alternative than having to copy-paste the source code of defaultTerminationCheck
and tweaking it, but I don't want that to always be the case.
-- and should return 'Nothing' for run to terminate | ||
-- or @'Just' (nextIters, nextState)@ for run to continue with new iteration count and state. | ||
-- | ||
runBenchmarkWith :: forall s. (Int -> Int64 -> Double -> NonEmpty Measured -> s -> Maybe (Int64, s)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see an other way then copying the source and tweaking it.
-- * user defined state | ||
-- | ||
-- and should return 'Nothing' for run to terminate | ||
-- or @'Just' (nextIters, nextState)@ for run to continue with new iteration count and state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs a loud disclaimer that you should be careful when picking a termination check, as its implementation could affect the statistical quality of the data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
analyseSample
doesn't have any pre condiotions specified. Looking at the code, it apperently needs at least 4 samples, then the statistical quality will be estimated (in the confidence interval).
I.e. I don't know how to put it "Termination check condition should be picked so the data generated is suitable for whatever further analysis is performed on it". Isn't that obvious?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I'm not looking for a super-detailed analysis of how to make the data statistically significant. I'm just looking for a generic warning saying that you should exercise caution when using this function, as it has the potential to distort the meaningfulness of the data if implemented haphazardly.
-- | ||
-- The first argument is termination check, it takes as arguments: | ||
-- | ||
-- * current run count (starting from zero) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If TerminationCheck
were factored out into a newtype, these Haddocks could be moved to the newtype instead.
Refactor runBenchmark to use it. Resolves haskell#254
9b89f0c
to
235709d
Compare
I changed the type of function. It fits better the usage where some analysis is done already on each step (in ~bayesian style), so you won't end doing it twice on the last sample (and without access to the state). |
I don't need this functionality anymore. |
Refactor runBenchmark to use it.
Resolves #254
I opted out to generating some data-type with configuration etc. The logic is just easier to write as code directly.