Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let data_key and attribute default to field name #1897

Open
wants to merge 7 commits into
base: dev
Choose a base branch
from

Conversation

lafrech
Copy link
Member

@lafrech lafrech commented Oct 27, 2021

Closes #1864.

Trying to address #1864, I get typing issues, as foreseen in #1864 (comment). I wouldn't mind a bit of help, here.

This change spares users the hassle of checking if data_key is None, as can be seen in the changes in schema.py. This would be used in apispec, for instance.

Notes:

  • In Pluck, we don't check for None but only for truthyness, which may be wrong for empty strings. I didn't look into it any further but this PR would fix it.
  • I also intend to add data_key to the repr. The printed name will be the actual value, be it the parameter or the field name (default). I don't think it is an issue.
  • I also intend to do the same with attribute, symmetrically.

field_name = (
field_obj.data_key if field_obj.data_key is not None else attr_name
)
field_name = field_obj.data_key
raw_value = data.get(field_name, missing)
Copy link
Member

@deckar01 deckar01 Oct 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error: Argument 1 to "get" of "Mapping" has incompatible type "Optional[str]"; expected "str"

The problem is that Fields.__init__ assigned a Optional[str] type to data_key. _bind_to_schema guarantees that field_obj.data_key is a str, but the type checker doesn't know that _deserialize is guaranteed to run after _bind_to_schema. I guess the get type annotation enforces this argument, but AFAICT, None doesn't actually error. 🤷

It wants you to ensure data_key is never None. The only thing I can think of is to make data_key a @property that raises an exception if the field is unbound.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we agree on the reason behind the error.

It would be a shame to "complicate" the code, even slightly, to please the linter.

I suppose we could add comments to skip test on each failing line, but I'd rather find a way to centralize that by somehow forcing the type of self.data_key once and for all.

We could use a cast:

self.data_key = typing.cast(str, data_key)

but doing this in __init__ before field binding would be wrong.

Or we do it just before the type check error:

field_name = typing.cast(str, field_obj.data_key)

Still not ideal because:

  • It is not done everywhere, only where the error was triggered. Other errors may occur later here or in other libs.
  • It is executed on each deserialization, although IIUC it is neglectable.

I just pushed a commit doing that.

@lafrech lafrech force-pushed the default_data_key_attribute_to_field_name branch from 64c268d to a53475a Compare October 28, 2021 06:55
@lafrech lafrech marked this pull request as ready for review October 28, 2021 06:55
@lafrech
Copy link
Member Author

lafrech commented Oct 28, 2021

Notes:

  • I renamed a few local variables in Schema methods to be more explicit: data_key instead of field_name. We might want to rename field_name attribute of _call_and_store because it not necessarily the field name. That can be done later.

  • In Field.get_value, I had to keep the if self.attribute is None check because the field can be used while not bound to a schema.

@lafrech
Copy link
Member Author

lafrech commented Oct 28, 2021

I performed a quick test on apispec codebase to see how things could be simplified there and the result is a bit disappointing. apispec operates on schema instances or classes, and when operating on a class, fields are not bound to the schema so we can't rely on data_key or attribute being set.

This feature would be more useful if it occurred on schema class creation, not instantiation. Technically, that could mean adding a _set_name method to Field that would be called from Schema._get_fields. This is what I do in the last commit. Although setting name at schema creation seems more logical to me, the downside of this is that custom fields, especially container fields, may now have to override both _bind_to_schema and _set_name. (Part of the trouble there is related to inferred or Meta.include fields, which may be dropped in marshmallow 4 (at least inferred ones: #1356).)

With this last commit, the changes in apispec work.

Notes:

  • The change in apispec is minimal (2 lines changed), so the benefit of this is rather limited.)
  • If I understand @kasium correctly in Always set data_key #1864, they are using _declared_fields so they are working on Schema classes, not instances, therefore, they fall in the same case as apispec and the change would only be beneficial if done at Schema creation.

@kasium
Copy link

kasium commented Oct 28, 2021

@lafrech yes, that's right. I use the declared_fields attribute. But I guess using a schema instance would not add too much trouble if somehow cached (not your worry). However, having it on class level would still be nice

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Always set data_key
3 participants