Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing nil values after deserializing avro #251

Open
mt3593 opened this issue May 26, 2020 · 4 comments
Open

Removing nil values after deserializing avro #251

mt3593 opened this issue May 26, 2020 · 4 comments

Comments

@mt3593
Copy link

mt3593 commented May 26, 2020

An optional field in AVRO is currently set by setting a default to nil and setting up a union type with "null" as the first type in that union.

When read back it fills out the values with nil.

This means that I have to now deal with the nils in my code rather than the absence of the key in the map. Not just that but it also effects my tests, when I pull back a value from a avro topic I now pull back these nil that I did not set, this means that all schema changes break my tests as I now have to update to assert they have the nil values.

Personally I think we should strip out nil values, given I didn't set the nil and it's an optional field we can't determine which is set by default and which is set by the user so I think it's safe to remove and makes coding against it easier.

@cddr
Copy link
Contributor

cddr commented May 28, 2020

I think changing this behaviour would risk breaking existing programs but it shouldn't be difficult to get the behaviour you'd like using something which strips out the nils after deserialization.

But ignoring that, I think that the existing "in-memory" representation of avro messages already deviates more than it should from the "standard json" representation which includes a tag to identify which element of the union is satisfied by the value.

Personally if I was starting a project from scratch, I'd avoid the avro implementation in here which has a few mistakes that make it less interoperable with other avro based systems than it could/should be. This PR has more specifics on the problem I'm talking about #224

@creese
Copy link
Contributor

creese commented Jun 1, 2020

Does it matter that the map contains keys your applications doesn't need? I believe clojure.spec follows a similar philosophy here.

@mt3593
Copy link
Author

mt3593 commented Jun 2, 2020

@creese with Clojure spec if your map contains keys it would apply the spec against the key (if it has a spec for the key of course).

(s/def ::name string?)
(s/def ::abc (s/keys :opt [::name]))

(s/assert* ::abc {}) ;; Happy with this
(s/assert* ::abc {::name ""}) ;; Happy with this
(s/assert* ::abc {::name nil}) ;; Not happy with this

Spec prefers you to remove the nils.

Spec does allow you to provide more keys than are declared in the spec:

(s/assert* ::abc {::name "" :id "1234" :foo 98})

Which I think is the principle you are talking about but that doesn't apply here as we only get back what's in the schema.

@cddr currently I've done exactly that, after deserialization I strip out the nil's. Agree changing this behaviour would break existing programs (or just give someone some headache updating tests). Ideally want to keep things down to one library, makes updating easier and it's nice to only have a handful of options open source wise. Would you entertain the idea of having this as a config? Something you have to explicitly turn on? Or given your comment about interoperability could we create a version 2 and slowly deprecate the old version?

@cddr
Copy link
Contributor

cddr commented Jun 2, 2020

Spec prefers you to remove the nils.

Remember that spec also has the s/nilable constructor which might be a better match for avro's ["null", "some-type"]

Would you entertain the idea of having this as a config?

Thankfully not my call really. I'm just an enthusiastic contributor now :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants