Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix accessing builtin dict's attrs for a missing field #1557

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

mahenzon
Copy link
Contributor

@mahenzon mahenzon commented Apr 5, 2020

I've faced this issue, when a schema has a field with name same as builtin dict's has attribute:

from marshmallow import fields, Schema


class MyNestedSchema(Schema):
    update = fields.Boolean(required=False)


class MySchema(Schema):
    data = fields.Nested(MyNestedSchema)


class MyObj:
    def __init__(self):
        self.data = {
            # NOT passing a value, so it's missing
            # 'update': True,
        }


def main():
    obj = MyObj()
    data = MySchema().dump(obj)
    print(data)


if __name__ == '__main__':
    main()

Fails with this:

Traceback (most recent call last):
  File "/home/khorenyan/.PyCharm2019.2/config/scratches/scratch_42.py", line 40, in <module>
    main()
  File "/home/khorenyan/.PyCharm2019.2/config/scratches/scratch_42.py", line 35, in main
    data = MySchema().dump(obj)
  File "/home/khorenyan/projs/marshmallow/src/marshmallow/schema.py", line 556, in dump
    result = self._serialize(processed_obj, many=many)
  File "/home/khorenyan/projs/marshmallow/src/marshmallow/schema.py", line 520, in _serialize
    value = field_obj.serialize(attr_name, obj, accessor=self.get_attribute)
  File "/home/khorenyan/projs/marshmallow/src/marshmallow/fields.py", line 311, in serialize
    return self._serialize(value, attr, obj, **kwargs)
  File "/home/khorenyan/projs/marshmallow/src/marshmallow/fields.py", line 566, in _serialize
    return schema.dump(nested_obj, many=many)
  File "/home/khorenyan/projs/marshmallow/src/marshmallow/schema.py", line 556, in dump
    result = self._serialize(processed_obj, many=many)
  File "/home/khorenyan/projs/marshmallow/src/marshmallow/schema.py", line 520, in _serialize
    value = field_obj.serialize(attr_name, obj, accessor=self.get_attribute)
  File "/home/khorenyan/projs/marshmallow/src/marshmallow/fields.py", line 311, in serialize
    return self._serialize(value, attr, obj, **kwargs)
  File "/home/khorenyan/projs/marshmallow/src/marshmallow/fields.py", line 1103, in _serialize
    elif value in self.truthy:
TypeError: unhashable type: 'dict'

It happens this way:

  • When being serialized, object is passed to the marshmallow.utils._get_value_for_key
  • because of the missing update field in data on the line obj[key] the KeyError exception is raised
  • and on the next line the getattr(obj, key, default) is returned

If the key is not found here, it returns default, so if my field is 'update_' or 'qwerty', it works as expected -- the result is returned and it's {'data': {}}

But the 'update' matches the builtin dict's attribute (method) so it's returned. And that's not expected at all.

Of course, I could override the get_attribute method in MyNestedSchema this way:

class MyNestedSchema(Schema):
    def get_attribute(self, obj: dict, attr: str, default: Any):
        try:
            return obj[attr]
        except KeyError:
            return default

    update = fields.Boolean(required=False)

But I think, that needs to be fixed.

I thought about checking if the getattr(obj, key, default) expression returns a method of a builtin's object, or checking a key to be one of the builtin's attrs, for example:

>>> dir(dict)
['__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'clear', 'copy', 'fromkeys', 'get', 'items', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values']

I'm not sure if that's a proper way to do this check. Maybe we need to check even for isinstance instead of the object's type, but here's my current solution.

@mahenzon mahenzon force-pushed the fix-accessing-dicts-methods-on-missing branch from fa24661 to 4661e9a Compare April 5, 2020 21:53
AUTHORS.rst Show resolved Hide resolved
@mahenzon
Copy link
Contributor Author

@deckar01 any ideas? it's been a long time already. If we need any changes, let's discuss them. I'm not sure about explicit types to be passed to the schemas' methods

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants