Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add runBenchmarkWith, general benchmark runner #255

Closed
wants to merge 1 commit into from

Conversation

phadej
Copy link
Contributor

@phadej phadej commented Jan 12, 2022

Refactor runBenchmark to use it.
Resolves #254


I opted out to generating some data-type with configuration etc. The logic is just easier to write as code directly.

@phadej phadej force-pushed the runBenchmarkWith branch 3 times, most recently from 7fe703e to 9b89f0c Compare January 12, 2022 17:32
@phadej
Copy link
Contributor Author

phadej commented Jan 12, 2022

I'll add semigroups compat if this is otherwise ok, or when I make tweaks.

@@ -289,41 +291,69 @@ runBenchmark :: Benchmarkable
-- exceeded in order to generate enough data to perform
-- meaningful statistical analyses.
-> IO (V.Vector Measured, Double)
runBenchmark bm timeLimit = do
runBenchmark bm timeLimit = runBenchmarkWith endAt 0 bm where
endAt :: Int -> Int64 -> Double -> NonEmpty Measured -> Double -> Maybe (Int64, Double)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is the default behavior for runBenchmarkWith's termination check, I'd favor factoring this out into a top-level definition so that a user can use it if they wish to tweak it slightly. Since this definition closes over timeLimit, it might make things more clear to turn this into a newtype:

newtype TerminationCheck s = TerminationCheck
  { getTerminationCheck :: Int -> Int64 -> Double -> NonEmpty Measured -> s -> Maybe (Int64, s) }

defaultTerminationCheck :: Double -> TerminationCheck s
defaultTerminationCheck timeLimit = TerminationCheck $ \count iters delta ms prev -> do ...

Then you can mention in the Haddocks for runBenchmark that runBenchmark bm timeLimit = runBenchmarkWith (defaultTerminationCheck timeLimit) 0 bm.

Copy link
Contributor Author

@phadej phadej Jan 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not so sure about. That function needs to be factored so its pieces could be usable, but I don't know how we could factor it. There are at least two parts: updating the state and deciding termination, but that consist of three checks already.

i.e. you can make it end sooner, but I don't see how you can make it run longer (in some situations).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I explained myself poorly in #255 (comment), so let me try again. I agree that because termination checks will need to perform arbitrary computation, there's not a good way in general to configure everything that a termination check might want to do. That being said, I still think there is value in exposing the default termination check to users. This is because in #218, a user wishes to do what runBenchmark does plus an additional constraint on the number of iterations performed. In terms of code, this could look something like this:

longLivedComputationCheck :: TerminationCheck r s
longLivedComputationCheck = TerminationCheck $ \... -> do
  getTerminationCheck defaultTerminationCheck ...
  if <time exceeds certain threshold>
    then Left ...
    else Right ...

For some use cases, there might not be a better alternative than having to copy-paste the source code of defaultTerminationCheck and tweaking it, but I don't want that to always be the case.

-- and should return 'Nothing' for run to terminate
-- or @'Just' (nextIters, nextState)@ for run to continue with new iteration count and state.
--
runBenchmarkWith :: forall s. (Int -> Int64 -> Double -> NonEmpty Measured -> s -> Maybe (Int64, s))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speaking of which, what is the best way to tweak the default behavior? I imagine that a common use case would be to use runBenchmarkWith to impose a lower maximum on the number of iterations being run, as per #210/#218. Is this straightforward to accomplish with this design?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see an other way then copying the source and tweaking it.

-- * user defined state
--
-- and should return 'Nothing' for run to terminate
-- or @'Just' (nextIters, nextState)@ for run to continue with new iteration count and state.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a loud disclaimer that you should be careful when picking a termination check, as its implementation could affect the statistical quality of the data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

analyseSample doesn't have any pre condiotions specified. Looking at the code, it apperently needs at least 4 samples, then the statistical quality will be estimated (in the confidence interval).

I.e. I don't know how to put it "Termination check condition should be picked so the data generated is suitable for whatever further analysis is performed on it". Isn't that obvious?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, I'm not looking for a super-detailed analysis of how to make the data statistically significant. I'm just looking for a generic warning saying that you should exercise caution when using this function, as it has the potential to distort the meaningfulness of the data if implemented haphazardly.

--
-- The first argument is termination check, it takes as arguments:
--
-- * current run count (starting from zero)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If TerminationCheck were factored out into a newtype, these Haddocks could be moved to the newtype instead.

Refactor runBenchmark to use it.
Resolves haskell#254
@phadej
Copy link
Contributor Author

phadej commented Jan 13, 2022

I changed the type of function. It fits better the usage where some analysis is done already on each step (in ~bayesian style), so you won't end doing it twice on the last sample (and without access to the state).

kderme pushed a commit to input-output-hk/criterion that referenced this pull request Feb 22, 2022
@phadej
Copy link
Contributor Author

phadej commented Dec 10, 2022

I don't need this functionality anymore.

@phadej phadej closed this Dec 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a variant of runBenchmark which also takes iteration range.
2 participants