SeekableFileObject needs readline() for Pillow plugin #1007

Open
Dotrar opened this issue Jun 15, 2023 · 2 comments

Dotrar commented Jun 15, 2023

Hey all,

I've found an issue that crashes imageio partway through its "try every plugin" fallback, so it never gets as far as trying all of them and eventually reporting that the image is not supported.

I'm trying to open a URL of a larger JPG from S3 storage. imageio is unable to determine the file type, so it goes for the "try every plugin" approach:

  1. in imopen, if we can't find the resource, we will "try everything" here

  2. We use Request which will return a SeekableFileObject as shown here

  3. Using Pillow ImImagePlugin to read a file, we attempt a readline as seen here @ Pillow

But SeekableFileObject does not have readline, causing an error. https://github.com/imageio/imageio/blob/master/imageio/core/request.py#L653
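The failure mode can be sketched without imageio at all. The class below is a hypothetical stand-in, not the real SeekableFileObject: it is only assumed to expose read, seek and tell, which is enough to show what happens when a reader calls readline on such a wrapper.

```python
import io


class MinimalSeekable:
    """Hypothetical stand-in for a seekable wrapper that has no readline()."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, n=-1):
        return self._buf.read(n)

    def seek(self, offset, whence=0):
        return self._buf.seek(offset, whence)

    def tell(self):
        return self._buf.tell()


f = MinimalSeekable(b"\xff\xd8\xff\xe0 fake JPEG header bytes")

# A reader that expects a full file-like object calls readline() and fails
# with AttributeError instead of a "cannot read this format" style error.
try:
    f.readline()
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
```

Whether the surrounding plugin loop catches this kind of exception is a separate question that this sketch does not answer.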

Dotrar commented Jun 17, 2023

Forgot to mention: adding this method to SeekableFileObject

def readline(self):
    return self.read(100)

fixes the crash and allows imageio to open my image, by the way.

@Dotrar Dotrar changed the title SeekableFileObject causes crash, needs readline for obscure PIL plugin SeekableFileObject needs readline() for Pillow plugin Jun 17, 2023
@FirefoxMetzger
Contributor

> I'm trying to open a url of a larger JPG from an s3 storage. It's unable to ascertain the file-type

How come?

Assuming you pass a URL and the file name ends with a JPEG suffix (e.g. .jpg), ImageIO will try plugins known to read JPEG first and only fall back to trying everything if the "known-to-read-jpeg" plugins fail. If the URL doesn't end with a JPEG suffix, but you know it to be JPEG, you can also pass extension=".jpg" as an additional kwarg to imopen, imread, etc. to set/overwrite the suffix of the resource itself.

> it goes for the "try every plugin" approach:

If you already know you want to read using pillow, you can also pass plugin="pillow" as an additional kwarg, in which case plugin selection is skipped in favour of using the plugin you specified.
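As a rough sketch of the two kwargs mentioned so far, assuming the v3 API (imageio.v3) is in use; the function names and the uri argument are placeholders for illustration, not part of imageio:

```python
def read_known_jpeg(uri):
    """Tell imageio the resource is a JPEG even if the URL has no suffix."""
    # Imported lazily so this sketch can be loaded without imageio installed.
    import imageio.v3 as iio

    # extension=".jpg" sets/overwrites the suffix used for plugin selection.
    return iio.imread(uri, extension=".jpg")


def read_with_pillow(uri):
    """Skip plugin selection entirely and read with the pillow plugin."""
    import imageio.v3 as iio

    return iio.imread(uri, plugin="pillow")
```

Either call should keep imageio from falling through to the "try every plugin" path, provided the chosen plugin can actually read the file.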

> But SeekableFileObject does not have readline, causing an error.

SeekableFileObject is a binary file and lacks the notion of a "line of text" or an "EOL character (\n)". As such, a function that reads a single line from the file doesn't make much sense in this context, which is why we didn't implement readline.

> Using Pillow ImImagePlugin to read a file, we attempt a readline as seen here @ Pillow

The plugin you are quoting here is the plugin for reading IFUNC Image Memory, not JPEG. I have actually never encountered this format, and Google and the pillow repo are suspiciously silent about it. I suspect that this is a legacy format that is no longer in active use and should probably not be used further.

def readline(self):
    return self.read(100)

The problem with making readline read the next 100 bytes is that this is an arbitrary cutoff. It may or may not mean something for the underlying data of the stream, and it will most likely leave the read position at an awkward and unexpected location.

I am still not convinced we want to add readline, but were we to do so, we would probably want it to read until the end of the file. In my mind, "read until you encounter the end of the current line" plus "there is no end of line for binary files" results in "read until the end of the file".
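To make the difference concrete, here is a small sketch using io.BytesIO from the standard library as a stand-in stream (an assumption for illustration; SeekableFileObject itself is not used, and the helper names are made up). It compares where the read position ends up for the fixed 100-byte workaround and for the read-to-EOF interpretation. For reference, it also shows that Python's own binary streams split readline on the byte b"\n".

```python
import io

# Fake binary payload: 2 header bytes, 150 filler bytes, one newline byte,
# then 50 more filler bytes (203 bytes in total).
payload = b"\xff\xd8" + b"A" * 150 + b"\n" + b"B" * 50


def readline_fixed(stream):
    """Workaround from this thread: read an arbitrary 100 bytes."""
    return stream.read(100)


def readline_to_eof(stream):
    """Alternative discussed here: treat the rest of the file as one line."""
    return stream.read()


s = io.BytesIO(payload)
readline_fixed(s)
pos_after_fixed = s.tell()  # stops mid-data, wherever byte 100 happens to be

s = io.BytesIO(payload)
readline_to_eof(s)
pos_after_eof = s.tell()  # stops at the end of the stream

s = io.BytesIO(payload)
stdlib_line = s.readline()  # io.BytesIO reads up to and including b"\n"

print(pos_after_fixed, pos_after_eof, len(stdlib_line))
```

Which of these behaviours a caller such as a Pillow plugin expects depends on the format being parsed, so none of them is obviously correct for every stream.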


Regarding the actual problem of ImageIO being unable to read a large JPEG from S3, are you able to share a code snippet that reproduces the failure?

Walking over all plugins and trying them out is expensive and more of a last resort than something we want to do on a regular basis. If we can avoid this by doing something smart during plugin selection I'd be quite interested in this option.