Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: GridSpec do not type check as Iterable in zip #28151

Open
Jacob-Stevens-Haas opened this issue Apr 29, 2024 · 7 comments
Open

[Bug]: GridSpec do not type check as Iterable in zip #28151

Jacob-Stevens-Haas opened this issue Apr 29, 2024 · 7 comments

Comments

@Jacob-Stevens-Haas
Copy link
Contributor

Bug summary

GridSpec objects are iterable because they have a __getitem__ method. However, they are not a collections.abc.Iterable:

[Iterable] does not detect classes that iterate with the __getitem__() method.

This means that they do not type check in zip, at least according to typeshed

Code for reproduction

from matplotlib.gridspec import GridSpec

gs = GridSpec(3, 3)
counter = range(9)
for cell in gs:
    print(cell)

for i, cell in zip(counter, gs):
    print(cell)

Actual outcome

Code runs, but type type checking says that GridSpec is not Iterable

Expected outcome

Type checks OK

Additional information

No response

Operating system

Linux

Matplotlib Version

3.8.4

Matplotlib Backend

No response

Python version

3.10.12

Jupyter version

N/A

Installation

pip

@Jacob-Stevens-Haas Jacob-Stevens-Haas changed the title [Bug]: GridSpec are not Iterable [Bug]: GridSpec do not type check as Iterable in zip Apr 29, 2024
@Jacob-Stevens-Haas
Copy link
Contributor Author

FWIW, adding the following roll-your-own __getitem__() iterator works. There's probably a way to make it more generic.

class GridSpecIterator:
    def __init__(self, gs: GridSpec):
        self.gs = gs
        self.counter = 0

    def __next__(self):
        try:
            subplotspec = self.gs.__getitem__(self.counter)
            self.counter += 1
            return subplotspec
        except IndexError:
            raise StopIteration
    
    def __iter__(self):
        return self


class GridSpec(GridSpec):
    def __iter__(self):
        return GridSpecIterator(self)

@ksunden
Copy link
Member

ksunden commented Apr 29, 2024

There are a couple of things that could be done here.

I'm not too enthused by adding a new custom object here, as in your example. (It is an additional layer of complexity/public (or at least leaky to public) API.

One option would be to lean in to the Sequence protocol (which the iter function assumes works for things that implement __getitem__, and is why it works, despite the mere existence of __getitem__ not fully qualifying it as "Iterable")

  • Inherit from collections.abc.Sequence
  • add a __len__ (return self.nrows * self.ncols)

Upside is that it is easy to do, and makes it conform to standards.
Downside is that the Sequence protocol brings with it additional (albeit automatically included) API such as index, count, and __reversed__, which may not be particularly useful and thus desirable to make an option. (e.g. count will only ever be 0 or 1, and generally depends on having already extracted the SubplotSpec from the GS you are querying)

The other option would be to just implement __iter__ as a generator:

class GridSpec(GridSpecBase):
    ...
    
    def __iter__(self):
        for i in range(self.nrows * self.ncols):
            yield self[i]

The generator object is effectively the same as the custom class, but returns a standard python type. It would seem to me to be a smaller leap from iter(gs) giving an iterator (standard python) object.

But it still satisfies type checking, and is slightly (in my opinion) easier to follow what is happening.

I think my leaning is to go with adding a __iter__ generator for explicitness. Though I am willing to hear arguments for full Sequence or something else

@anntzer
Copy link
Contributor

anntzer commented Apr 30, 2024

I guess the typeshed issue you want to link to is python/typeshed#8597.
Personally I would be quite strongly opposed (as always) to add boilerplate just to appease type checkers. It is a well documented fact that classes with getitem are iterable so mypy & friends are just not in sync with reality if they don't realize that such classes are iterable (whether their maintainers like that or not does not change the documented python behavior).

I suspect python/typeshed#7817 may also provide a sufficient workaround.

@tacaswell
Copy link
Member

An explicit __iter__ is not unreasonable, but I also lean towards the position that this is a bug in the typechecking tools and we should not pick up epicycles to placate arguably broken tools.

@Jacob-Stevens-Haas
Copy link
Contributor Author

Thanks all for sharing, and good to know iter(gs) solves problems on the user side.

It doesn't look like the community is going to come to a consensus soon on whether it's a

  • typeshed responsibility (by integrating the _GetItemIterable to anything that accepts collections.abc.Iterable),
  • type checker's responsibility (by special casing this behavior),
  • downstream package's responsibility (by implementing explicit __iter__).

I'm slightly in favor of the explicit __iter__, if only from the limited perspective of "I'm a user and this is the fastest way to solve the problem for users like me."

@timhoffm
Copy link
Member

timhoffm commented May 2, 2024

Would this be a reasonable workaround? https://github.com/pytorch/pytorch/pull/77116/files

It fixes the issue but does not change runtime behavior. So it's only "type"-overhead. AFAICS, the only reason it was not integrated in pytorch is that the respective class is generally mapping-like. While many users use interger indices, which then allow iteration, the behavoir is not guaranteed in general. I think it would always work for GridSpec.

@ksunden
Copy link
Member

ksunden commented May 3, 2024

Since we use pyi files, we can actually just put the def __iter__(self) -> Iterable[SubplotSpec]: ... in there.

It will trigger stubtest to fail since runtime has no __iter__, but we can add an explicit ignore for it to the ci/mypy_stubtest_allowlist.txt.

This is effectively the same as the pytorch example, just not touching runtime file at all.

I will defend the choice made by type checkers here because this style of "iterable" is dependent on a particular meaning applied to __getitem__ that not all classes implement. While it is true that it is documented, I think explicitly opting in to type checking as iterable is not a particularly high ask for clarity, and enables using __getitem__ without being lumped into iterable. That said, I could see why having __getitem__(self, value: int) (even if int is part of a union, as it is in this example) could get you over that "old style iteration applies" explicitness hurdle, and would not fight against tools adopting that standard (perhaps not perfect still, but I'd say better than "implements getitem without type hints").

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants