[WIP] Adds support for Pipelines with ModelVisualizers #955

bbengfort · 2019-08-27T00:25:36Z

This PR fixes #498 and adds support for Pipelines in most ScoreVisualizers. It also adds automatic checks that extend the functionality described in #180.

I have made the following changes:

Added an is_pipeline type check
Added functionality to ModelVisualizer to access the final estimator in a pipeline
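
A minimal sketch of the two changes listed above, assuming the type check is a plain `isinstance` test against scikit-learn's `Pipeline` and that the final estimator is the last entry of `Pipeline.steps`. The helper name `final_estimator` is hypothetical; the actual code in this PR lives in `yellowbrick/utils/types.py` and `yellowbrick/base.py` and may differ.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def is_pipeline(model):
    """Return True if model is a scikit-learn Pipeline (assumed check)."""
    return isinstance(model, Pipeline)


def final_estimator(model):
    """Hypothetical helper: unwrap the last step of a Pipeline.

    Non-pipeline estimators are returned unchanged. Note that the last
    step of a Pipeline may also be "passthrough", which this sketch
    does not handle.
    """
    if is_pipeline(model):
        # steps is a list of (name, estimator) tuples
        return model.steps[-1][1]
    return model


clf = LogisticRegression()
pipe = Pipeline([("scale", StandardScaler()), ("clf", clf)])
```

With this shape, a visualizer could wrap either `clf` or `pipe` and look up learned attributes on the same underlying object.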

TODOs and questions

Still to do:

modify visualizers to use new code
create tests for pipelines with model visualizers
document pipelines in visualizers

CHECKLIST

Is the commit message formatted correctly?
Have you noted the new functionality/bugfix in the release notes of the next release?

Included a sample plot to visually illustrate your changes?
Do all of your functions and methods have docstrings?
Have you added/updated unit tests where appropriate?
Have you updated the baseline images if necessary?
Have you run the unit tests using pytest?
Is your code style correct (are you using PEP8, pyflakes)?
Have you documented your new feature/functionality in the docs?

Have you built the docs using make html?

bbengfort · 2019-08-27T00:27:19Z

yellowbrick/base.py

+        try:
+            return getattr(self._final_estimator(), attr)
+        except AttributeError as e:
+            raise NotFitted(str(e))


It makes sense to me that this raises NotFitted, but I don't like the message "object has no attribute coef_". I'm not sure whether it would be better to just raise the AttributeError instead.
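
One possible middle ground, sketched under assumptions: raise NotFitted with a more descriptive message and chain the original AttributeError so the traceback still shows it. `NotFitted` below is a local stand-in for the Yellowbrick exception, and `get_learned_attr` and `UnfittedModel` are hypothetical names used only for illustration. The `raise ... from` syntax requires Python 3.

```python
class NotFitted(Exception):
    """Local stand-in for the Yellowbrick NotFitted exception."""


class UnfittedModel:
    """Hypothetical estimator that has no learned attributes yet."""


def get_learned_attr(estimator, attr):
    """Fetch a learned attribute, translating a miss into NotFitted."""
    try:
        return getattr(estimator, attr)
    except AttributeError as e:
        msg = "{} has no attribute '{}'; has the estimator been fitted?".format(
            type(estimator).__name__, attr
        )
        # Chain the original error so it remains visible in the traceback
        raise NotFitted(msg) from e
```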

bbengfort · 2019-08-28T14:37:07Z

tests/test_api.py

+from sklearn.datasets import make_blobs, make_classification, make_regression
+
+
+BASES = [


This file takes the work from the audit and adds systematic checks to be tested across different groups of visualizers. It replaces the old tests/checks.py framework by using pytest to perform all the checks.
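
As a rough illustration of the approach, here is a sketch of one parametrized check that runs the same assertion against every class in a list. The two dummy classes and the contents of `VISUALIZERS` are hypothetical stand-ins; the real `tests/test_api.py` defines its own groups and uses the Yellowbrick dataset fixtures.

```python
import pytest


class DummyVisualizer:
    """Hypothetical stand-in for a visualizer whose fit() returns self."""

    def fit(self, X, y=None):
        self.fitted_ = True
        return self


class OtherDummyVisualizer(DummyVisualizer):
    """Second hypothetical stand-in, to show the parametrization."""


# Hypothetical group; the real module builds lists of Yellowbrick classes.
VISUALIZERS = [DummyVisualizer, OtherDummyVisualizer]


@pytest.mark.parametrize("Viz", VISUALIZERS)
def test_fit_returns_self(Viz):
    # The same API check is applied once per class in VISUALIZERS
    oz = Viz()
    X = [[0.0, 1.0], [1.0, 0.0]]
    y = [0, 1]
    assert oz.fit(X, y) is oz
```

pytest reports each class as a separate test case, so a failure points at the specific visualizer that breaks the API contract.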

bbengfort · 2019-08-28T14:37:52Z

tests/test_api.py

+
+@pytest.mark.skip("too many edge cases, is tested in most visualizer-specific tests")
+@pytest.mark.parametrize("Viz", VISUALIZERS)
+def test_fit(Viz):


I thought this might be an easy one to implement, but there are a lot of edge cases here, so I'm thinking about removing it.

bbengfort · 2019-08-28T14:46:58Z

tests/test_api.py

+    assert oz.fit(data.X.train, data.y.train) is oz
+
+
+@pytest.mark.xfail(reason="quick methods aren't primetime yet")


This check also tells us a lot about the state of our quick methods: only 5 xpass (i.e. they pass even though the test is marked xfail).

bbengfort · 2019-08-28T14:47:36Z

tests/test_base.py

+##########################################################################
+
+
+class TestModelVisualizer(VisualTestCase):


Adds tests for the final estimator functionality implemented for Pipelines.

bbengfort · 2019-08-28T14:48:11Z

yellowbrick/features/__init__.py

@@ -17,18 +17,23 @@
 ## Imports
 ##########################################################################


Ensures all feature visualizers are imported into the top level.

In fact, all of the modifications to __init__.py files are to ensure that the visualizers and bases are imported at the top level so they can be imported into the check functionality.

bbengfort · 2019-08-28T14:49:42Z

yellowbrick/utils/types.py

+
+
+# Alias for closer name to isinstance and issubclass
+ispipeline = is_pipeline


This only checks Pipeline objects. I'm not sure if we also need to check FeatureUnion; I'm open to suggestions.
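
For context on the FeatureUnion question, a short sketch showing that an `isinstance` check against `Pipeline` does not match a `FeatureUnion`, which exposes `transformer_list` rather than `steps` and therefore has no final estimator to unwrap. The body of `is_pipeline` here is an assumption, not necessarily the code in this PR.

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import StandardScaler


def is_pipeline(model):
    """Assumed implementation: a plain isinstance check."""
    return isinstance(model, Pipeline)


union = FeatureUnion([("scale", StandardScaler()), ("pca", PCA(n_components=1))])
pipe = Pipeline([("features", union), ("scale", StandardScaler())])

# A FeatureUnion is not a Pipeline subclass, so the check rejects it
union_is_pipeline = is_pipeline(union)
pipe_is_pipeline = is_pipeline(pipe)
```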

working toward model visualizers working with pipelines

77a6b54

added API checks framework

dd8d6c5

quick methods check

5cf2c4b
