Add solution downsampling feature #537

Open · wants to merge 3 commits into master
Conversation

aymkhalil (Contributor)

This PR enables downsampling a solution, to save write time and disk space when the solution is written out to files. It uses local averaging over user-defined averaging factors along each axis:

  1. To enable downsampling, simply pass a tuple of downsampling factors to the controller. For example, in 3D (a fuller usage sketch follows this list):

     claw = pyclaw.Controller()
     claw.downsampling_factors = (2, 2, 2)

  2. The downsampling factors should evenly divide the grid in each dimension.
  3. An optional dependency on scikit-image is introduced only for users who wish to use this feature. If that dependency is undesirable, we could implement our own local averaging method.
  4. Test cases are yet to be added.
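For context, a minimal end-to-end sketch of the intended usage. This is standard 1D advection boilerplate from the PyClaw examples; only the downsampling_factors line comes from this PR, and the 1D factor tuple (2,) is assumed by analogy with the 3D example above:

    import numpy as np
    from clawpack import pyclaw, riemann

    solver = pyclaw.ClawSolver1D(riemann.advection_1D)
    solver.bc_lower[0] = pyclaw.BC.periodic
    solver.bc_upper[0] = pyclaw.BC.periodic

    x = pyclaw.Dimension(0.0, 1.0, 100, name='x')
    domain = pyclaw.Domain(x)
    state = pyclaw.State(domain, solver.num_eqn)
    state.problem_data['u'] = 1.0  # advection speed
    xc = state.grid.x.centers
    state.q[0, :] = np.exp(-100.0 * (xc - 0.5) ** 2)

    claw = pyclaw.Controller()
    claw.solution = pyclaw.Solution(state, domain)
    claw.solver = solver
    claw.tfinal = 1.0
    claw.downsampling_factors = (2,)  # attribute added by this PR: 100 cells -> 50 on disk
    claw.run()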

ketch (Member) commented Feb 21, 2016

I have looked over this with @aymkhalil and it seems like a good first step. In terms of design, a downside is that it requires creating new Solution, State, and DA objects. But it has the advantage that no modifications are needed to the read and write functions themselves.

One important question is whether to keep the scikit-image optional dependency or replace it with some complicated numpy code (which seems like it would necessarily introduce a copy, at least in 3D).
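For reference, a pure-numpy sketch of the local mean (not the PR's implementation): assuming C-contiguous arrays and factors that evenly divide each axis, the block reshape below is itself a view, so the only new allocation is the reduced output array. Whether that counts as avoiding the copy in question is part of what would need checking.

    import numpy as np

    def block_mean(q, factors):
        # Hedged sketch, not code from this PR. Assumes q is C-contiguous
        # and each factor evenly divides its axis length, in which case
        # the block reshape below is a view, not a copy.
        shape = []
        for n, f in zip(q.shape, factors):
            assert n % f == 0, "factors must evenly divide the grid"
            shape.extend([n // f, f])
        # average over the interleaved block axes (1, 3, 5, ...)
        return q.reshape(shape).mean(axis=tuple(range(1, len(shape), 2)))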

An obvious next step after this is to allow different types of output (e.g. "checkpoints" and smaller ones) with different frequencies.

@mandli, any comments?

mandli (Member) commented Feb 21, 2016

Reviewing; I will try to send some feedback by tonight. One thought after reading through the PR once: there are a couple of spots that deviate from Python coding style and could be corrected easily. I will try to point these out in code comments.

Review comment on the diff (context):

    - ds_domain_ranges: global ranges of the downsampled solution
    """

    from skimage.transform import downscale_local_mean
Can we add some specifics as to what downscale_local_mean is doing? For instance, what is the stencil size? At the very least, note in the doc-string that the function comes from scikit-image.

mandli (Member) commented Feb 22, 2016

I think this generally looks good. I would suggest adding tests, since this is a fairly complex procedure, especially in parallel. I also wonder whether this functionality should be implemented as a separate function that takes a solution and returns the downsampled solution. I only suggest this because most of the functionality ends up passing through self anyway, and I could see some merit in making this its own module in which we could implement a few different types of downsampling. For instance, if I wanted to downsample by simply striding through the array, it would be great to have a common interface to this functionality (see the sketch below).
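For illustration, one possible shape for such a common interface. The names are hypothetical, and it operates on a bare array rather than a Solution object, purely to keep the sketch self-contained:

    import numpy as np

    def downsample(q, factors, method='local_mean'):
        # Hypothetical interface sketch, not code from this PR.
        if method == 'local_mean':
            shape = []
            for n, f in zip(q.shape, factors):
                shape.extend([n // f, f])
            return q.reshape(shape).mean(axis=tuple(range(1, len(shape), 2)))
        elif method == 'stride':
            # keep every factors[i]-th point along axis i; cheap, but
            # can alias features that the local mean would smooth out
            return q[tuple(slice(None, None, f) for f in factors)]
        raise ValueError("unknown downsampling method: %r" % (method,))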

aymkhalil (Contributor, Author)

@mandli I'm working on the test cases and your comments and will push the new modifications soon.

Regarding your last comment about implementing the functionality in a separate module: I agree. However, I wanted the downsampled version of the solution to persist in memory as long as the original solution object lives, to avoid the overhead of re-initializing the new DA objects if the write function is called several times during the simulation. Implementing a different downsampling technique will require some refactoring, I agree, but at least the averaging method (which is performed locally only) is already isolated in downscale_local_mean, which I am working on replacing with custom code (most likely in util.py).

That leaves _downsample_global_to_local, which handles the mapping of ranges between the original and the downsampled solution objects. It sits between constructing the additional DA objects and calling the local averaging function itself. I'll give it another thought and see whether I can move this logic out of the solution object; I really think it is reusable by any technique based on a "rolling window" style of downsampling.

Anyway, let's have another round of comments after I push my changes.

mandli (Member) commented Feb 22, 2016

@aymkhalil You make some good points, and the core routine, downscale_local_mean, is really the one we would most likely want to replace. As a particular test case of what we may want to support, take an application where the aux array's consistency with q is important. This happens in GeoClaw with bathymetry data and is quite sensitive: for instance, using linear interpolation near the shore can cause water to get where it is not supposed to be. Nothing to be particularly concerned about now, perhaps, but something to keep in mind when designing the interface.

I have some concern regarding memory pressure, but as that is usually not an issue for us I am not too worried about it.

aymkhalil (Contributor, Author)

  1. I pushed some test cases and added a couple of doc-strings.
  2. I couldn't find a better implementation than the one in scikit-image for computing the local average. What it does, basically, is first view the array to be downsampled as blocks of the size of the downsampling factors, then apply the averaging function (or any NumPy universal function) to each block.
     The key method is view_as_blocks:
>>> import numpy as np
>>> from skimage.util.shape import view_as_blocks
>>> a = np.arange(4*4).reshape(4,4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

>>> view_as_blocks(a, (2, 2))
array([[[[ 0,  1],
         [ 4,  5]],

        [[ 2,  3],
         [ 6,  7]]],

       [[[ 8,  9],
         [12, 13]],

        [[10, 11],
         [14, 15]]]])

If the scikit-image dependency is undesirable, we could just copy the functions we need, e.g. the view_as_blocks method:
https://github.com/scikit-image/scikit-image/blob/master/skimage/util/shape.py#L10-L104

Then we could do the following:

>>> a = view_as_blocks(a, (2, 2))
>>> a.mean(-1).mean(-1)
array([[  2.5,   4.5],
       [ 10.5,  12.5]])

However, other methods in scikit-image may come in really handy, especially if we want to generalize this feature.

  3. Refactoring is required to support a generic downsampling feature. It would be great to allow users to set their desired factors and method:

     claw.ds_factors = (2, 2, 2)
     claw.ds_method = 'local_mean'

Some methods may require non-overlapping blocks, overlapping windows, padding, etc., and this needs more discussion as to which methods are expected and what techniques they have in common, as @mandli has pointed out (see the padding sketch below). @ketch, do you want this as part of this PR?
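To make the padding point concrete, a small sketch (assuming edge replication is an acceptable way to extend the grid; pad_to_multiple is a made-up helper name, not from this PR):

    import numpy as np

    def pad_to_multiple(q, factors):
        # pad each axis up to the next multiple of its factor so that
        # block-based methods still apply to non-divisible grids
        pad = [(0, (-n) % f) for n, f in zip(q.shape, factors)]
        return np.pad(q, pad, mode='edge')

    q = np.arange(10.0).reshape(5, 2)
    print(pad_to_multiple(q, (2, 2)).shape)  # (6, 2): one replicated row appended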

mandli (Member) commented Feb 23, 2016

Thanks for putting in the tests; just a few more comments:

  • I think it's fine to depend on scikit-image as long as we do not run afoul of Travis limits.
  • Minor detail: can you wrap the lines that go past 80 columns?
  • I think we can move ahead with this PR while creating an issue to add the flexibility you suggest. As long as the changes needed are mostly at a lower level and do not touch the interface, we should be fine.
