Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remaining Pixel Data work #2031

Open
5 of 6 tasks
scaramallion opened this issue Mar 21, 2024 · 6 comments
Open
5 of 6 tasks

Remaining Pixel Data work #2031

scaramallion opened this issue Mar 21, 2024 · 6 comments

Comments

@scaramallion
Copy link
Member

scaramallion commented Mar 21, 2024

How do we want to handle migrating to the new pixels backend?

  1. Switch Dataset.pixel_array to use pixels in v3.0, add an easy to use method to switch back to using pixel_data_handlers
  2. Keep Dataset.pixel_array as-is, add pixels as an option (use_future) and switch over in v3.x

Personally I think 1 is best:

  • Using 2 will likely result in little real-world testing of pixels and so any issues in pixels will suddenly be apparent the instant we switch over anyway.
  • There are backwards incompatible changes with pixels so they should be in the v3.0 release

I think it's fair to say that pixel_data_handlers and encoders can be removed in v4.0.

The remaining pixels module work I have left for v3.0 is:

  • Add J2K encoding (mostly done, I've run into an upstream issue with encoding failures for random 1-bit images but I don't think it will be an issue in practice with non-random data)
  • Add support for Dataset to pixels.utils.pixel_array() and pixels.utils.iter_pixels()
  • Migrate the functions in pixel_data_handlers.util to pixels.processing and pixels.utils
  • Refactor the Dataset pixel data methods, I think maybe just limit it to compress(), decompress() and maybe one other for controlling Dataset.pixel_array
  • Migrate Dataset.pixel_array backend? See above.
  • Documentation updates
@darcymason
Copy link
Member

Personally I think 1 is best:

Definitely agree.

... any issues in pixels will suddenly be apparent the instant we switch over anyway.

I've been thinking that v3.0 should be taken through a pre-release process. At least a release-candidate version, possibly even a beta release before that. That limits any initial problems to a few users with enough interest to try things out ahead or time, and in a test environment. Feedback at that stage should cause fewer problems for others later.

I think it's fair to say that pixel_data_handlers and encoders can be removed in v4.0.

+1 here too.

@mrbean-bremen
Copy link
Member

Same here. Also agree about the pre-release, this makes sense for a release with breaking changes.

Side note - I will be away without access to a computer (not counting my mobile), will be back in april.

@scaramallion
Copy link
Member Author

scaramallion commented Apr 19, 2024

What sort of behaviour would you like for Dataset.pixel_array with the new backend? Should it just emulate the current behaviour by returning the full pixel data or should I allow for full flexibility; frame indexing, colour space control, buffer view only, etc, etc? Keeping in mind the full functionality (will be) available via the pixels.utils.pixel_array() function or the specific encoder instance's as_array() method.

Secondly, I'd really, really like to convert Dataset.pixel_array to a class method, although probably not in v3.X. I completely understand that we have a huge amount of user code that depends on the property, but the fix for the compatibility breakage would only be a pair of brackets: ds.pixel_array -> ds.pixel_array(), and the property is getting really clunky with all the added flexibility.

Do you have an opinion on converting it at some point or is it just something I'll have to live with?

@mrbean-bremen
Copy link
Member

but the fix for the compatibility breakage would only be a pair of brackets: ds.pixel_array -> ds.pixel_array(), and the property is getting really clunky with all the added flexibility

I agree that with a property you won't expect any potentially costly operations, so that is not ideal.
I don't think that changing it from a property to a method is the best idea, as using it the old way would get the user unexpected warnings / errors, or just crashes later on. I would rather deprecate the property in favor of a method with a new name.

As for the functionality, given that all the advanced options are available via the pixels module, I would probably go for the old behavior (e.g. return full pixel data). Not sure, though...

@darcymason
Copy link
Member

I was just starting to type, only to see @mrbean-bremen's comments which very much echo mine:

I think there is too much history with pixel_array and it is too highly used to change it. In my past experience, someone running a legacy script only to see a couple pages of Python traceback is high probably just to say, 'its broken' and not really investigate unless the script was critical.

Another new name with the new behavior would be preferable. Perhaps even just pixels() returning a full-featured object, to avoid looking too much like pixel_array. Although pixels is also a submodule name, but at least they wouldn't actually conflict in any way.

With a new name and the documentation examples all converted over to it, maybe in a distance future the pixel_array property could be deprecated (to remove in an even more distant future).

@scaramallion
Copy link
Member Author

OK, fair enough. pixel_array is a pretty good name though, it would be a shame to retire it. I'll leave it alone for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants