
Test Coverage Metrics #57

Open
pepperbob opened this issue Sep 30, 2015 · 2 comments

@pepperbob (Contributor)

We have the rdfunit-junit integration up and running, which gives us a good overview of failing test cases, especially in conjunction with an IDE and/or CI server. If a test is "red", we can trust that something broke.

However, the issue with "green" tests is that we do not actually know why they are green: either the data is valid according to the test case, or there is no data to validate at all. The latter would decrease the significance of that test (at least in the given context).

Furthermore, we are missing metrics on how much of the input data is actually covered by the test cases. TestCoverageEvaluator looks usable for this, though we need some elaboration: it is currently not clear what input it expects.

Request:

  • Can we figure out on a per-test-case basis whether there is data to be tested (before/after the test is run)? We could use the "test-skipped" notification of JUnit to provide an overview of how many tests are not testing anything (see the sketch after this list).
  • Could the API of TestCoverageEvaluator be elaborated?
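
As a rough sketch of the first point, assuming plain JUnit 4 and Apache Jena (the query, class, and field below are hypothetical, not RDFUnit API): probe the input data with an ASK query first and report the test as skipped when nothing matches, so a vacuous "green" shows up as "skipped" in the IDE/CI overview.

    import org.apache.jena.query.QueryExecution;
    import org.apache.jena.query.QueryExecutionFactory;
    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.rdf.model.ModelFactory;
    import org.junit.Assume;
    import org.junit.Test;

    public class CoverageAwareTest {

        // Placeholder: replace with however the dataset under test is actually loaded.
        private final Model inputData = ModelFactory.createDefaultModel();

        @Test
        public void rangeOfFoafAge() {
            // Does the input contain any data this test case applies to?
            String hasRelevantData = "ASK { ?s <http://xmlns.com/foaf/0.1/age> ?o }";
            QueryExecution qe = QueryExecutionFactory.create(hasRelevantData, inputData);
            try {
                // If nothing matches, report the test as skipped instead of (vacuously) green.
                Assume.assumeTrue("No data for this test case", qe.execAsk());
            } finally {
                qe.close();
            }
            // ... run the actual validation for this test case here ...
        }
    }
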
jimkont (Member) commented Sep 30, 2015

TestCoverageEvaluator was created at the very beginning of the project and has not been used since. I updated the code a bit so that it does not throw errors.

It is still not working correctly, but it now shows some (wrong) output when you pass -c as a CLI parameter:

[INFO  TestCoverageEvaluator] Fdom Coverage: 0.0
[INFO  TestCoverageEvaluator] fRang Coverage: 0.0
[INFO  TestCoverageEvaluator] fDep Coverage: 0.0
[INFO  TestCoverageEvaluator] fCard Coverage: 0.0
[INFO  TestCoverageEvaluator] fMem Coverage: 0.0
[INFO  TestCoverageEvaluator] fCDep Coverage: 0.0

If this is updated to provide the correct numbers, it could probably handle your use case.
Each metric measures a specific kind of test coverage, as described on pages 3-4 of http://svn.aksw.org/papers/2014/WWW_Databugger/public.pdf

The lower the metric numbers, the fewer cases are actually tested in the input source.
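
Roughly speaking (a loose reading, not the paper's exact definitions; those are on the pages linked above), each metric is a ratio in [0, 1] of the form

    F = (occurrences in the input source checked by at least one test case)
        / (all relevant occurrences in the input source)

so 0.0 means the test cases touch none of the relevant data for that aspect and 1.0 means all of it is exercised.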

This class was more of a hack to generate Table 4 in the paper.
What it does (or what I remember it doing) is:

  • relate RDFUnit patterns to each metric
  • calculate class & property statistics for the input source (a rough sketch of this step follows the list)
  • exploit some RDFUnit pattern hacks to get the classes / properties / patterns associated with each test case and calculate the metrics
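
For the class & property statistics step, this is roughly what I mean, assuming Apache Jena (an illustration only, not the actual TestCoverageEvaluator code):

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.jena.query.QueryExecution;
    import org.apache.jena.query.QueryExecutionFactory;
    import org.apache.jena.query.QuerySolution;
    import org.apache.jena.query.ResultSet;
    import org.apache.jena.rdf.model.Model;

    public class SourceStatistics {

        /** How often each property occurs in the input source. */
        public static Map<String, Integer> propertyCounts(Model input) {
            return countByKey(input,
                "SELECT ?key (COUNT(*) AS ?cnt) WHERE { ?s ?key ?o } GROUP BY ?key");
        }

        /** How many instances each class has in the input source. */
        public static Map<String, Integer> classCounts(Model input) {
            return countByKey(input,
                "SELECT ?key (COUNT(?s) AS ?cnt) WHERE { ?s a ?key } GROUP BY ?key");
        }

        private static Map<String, Integer> countByKey(Model input, String query) {
            Map<String, Integer> counts = new HashMap<>();
            QueryExecution qe = QueryExecutionFactory.create(query, input);
            try {
                ResultSet rs = qe.execSelect();
                while (rs.hasNext()) {
                    QuerySolution row = rs.next();
                    counts.put(row.getResource("key").getURI(), row.getLiteral("cnt").getInt());
                }
            } finally {
                qe.close();
            }
            return counts;
        }
    }

The idea being that the metrics can then weight the classes/properties covered by each test case by how often they actually occur in the source.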

It needs some work to get this in good shape and usable. Let me know if working in this direction covers your goal.

jimkont (Member) commented Oct 1, 2015

Note that, ideally, the metrics should be identified by doing pattern identification inside the SPARQL queries themselves.
That would also work for non-pattern-based test cases, or for pattern-based test cases where the pattern is not associated with coverage metrics.
However, the current approach handles most cases easily.
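
As a rough sketch of what "pattern identification inside the SPARQL queries" could look like, assuming Apache Jena's ARQ syntax walker (illustration only, not RDFUnit code): parse a test case's query and collect the properties it touches, so that even test cases without pattern metadata can be related to the coverage metrics.

    import java.util.HashSet;
    import java.util.Set;

    import org.apache.jena.graph.Node;
    import org.apache.jena.query.Query;
    import org.apache.jena.query.QueryFactory;
    import org.apache.jena.sparql.core.TriplePath;
    import org.apache.jena.sparql.syntax.ElementPathBlock;
    import org.apache.jena.sparql.syntax.ElementVisitorBase;
    import org.apache.jena.sparql.syntax.ElementWalker;

    public class QueryPropertyExtractor {

        /** Collect the concrete predicate URIs used in a test case's SPARQL query. */
        public static Set<String> predicatesOf(String sparql) {
            Set<String> predicates = new HashSet<>();
            Query query = QueryFactory.create(sparql);
            ElementWalker.walk(query.getQueryPattern(), new ElementVisitorBase() {
                @Override
                public void visit(ElementPathBlock block) {
                    block.patternElts().forEachRemaining((TriplePath tp) -> {
                        Node p = tp.getPredicate(); // null for complex property paths
                        if (p != null && p.isURI()) {
                            predicates.add(p.getURI());
                        }
                    });
                }
            });
            return predicates;
        }
    }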
