Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add round-tripping Equality/IComparable contract generation #838

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

lipchev
Copy link
Collaborator

@lipchev lipchev commented Sep 21, 2020

Addresses issues posed in #717

  • CompareTo implemented in terms of the two-way comparison between 'this.As(other)' & 'other.As(this)'
  • Equals changed to CompareTo(other) == 0
  • Comparison operators re-written in terms of CompareTo values
  • GetHashCode generation modified such that it is consistent with the Equals function- i.e. not including the UnitType
  • Added tests (QuantityComparisonTests) for the IComparable, Equality contracts, including the (previously failing) comparisons with problematic values such as 0.001

- CompareTo implemented in terms of the two-way comparison between  'this.As(other)' & 'other.As(this)'
- Equals changed to CompareTo(other) == 0
- Comparison operators re-written in terms of CompareTo values
- GetHashCode generation modified such that it is consistent with the Equals function- i.e. not including the UnitType
- Added tests (QuantityComparisonTests) for the IComparable, Equality contracts, including the (previously failing) comparisons with problematic values such as 0.001
@codecov-commenter
Copy link

codecov-commenter commented Sep 21, 2020

Codecov Report

Merging #838 (28d55bc) into master (d89306b) will increase coverage by 2.96%.
The diff coverage is 99.60%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #838      +/-   ##
==========================================
+ Coverage   80.45%   83.42%   +2.96%     
==========================================
  Files         285      285              
  Lines       42607    43876    +1269     
==========================================
+ Hits        34281    36603    +2322     
+ Misses       8326     7273    -1053     
Impacted Files Coverage Δ
...et/GeneratedCode/Quantities/MassConcentration.g.cs 89.14% <20.00%> (+1.73%) ⬆️
...nitsNet/GeneratedCode/Quantities/Acceleration.g.cs 82.60% <100.00%> (+3.15%) ⬆️
...et/GeneratedCode/Quantities/AmountOfSubstance.g.cs 82.93% <100.00%> (+3.09%) ⬆️
...tsNet/GeneratedCode/Quantities/AmplitudeRatio.g.cs 77.85% <100.00%> (+3.61%) ⬆️
UnitsNet/GeneratedCode/Quantities/Angle.g.cs 82.06% <100.00%> (+2.88%) ⬆️
...tsNet/GeneratedCode/Quantities/ApparentEnergy.g.cs 77.31% <100.00%> (+3.70%) ⬆️
...itsNet/GeneratedCode/Quantities/ApparentPower.g.cs 77.85% <100.00%> (+3.61%) ⬆️
UnitsNet/GeneratedCode/Quantities/Area.g.cs 84.51% <100.00%> (+1.22%) ⬆️
UnitsNet/GeneratedCode/Quantities/AreaDensity.g.cs 75.09% <100.00%> (+5.74%) ⬆️
.../GeneratedCode/Quantities/AreaMomentOfInertia.g.cs 79.48% <100.00%> (+3.75%) ⬆️
... and 200 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a7c52a8...b2ba83f. Read the comment docs.

return _value.CompareTo(other.GetValueAs(this.Unit));
var asFirstUnit = other.GetValueAs(this.Unit);
var asSecondUnit = GetValueAs(other.Unit);
return (_value.CompareTo(asFirstUnit) - other.Value.CompareTo(asSecondUnit)) / 2;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully grok this one.
I tried reading up on he last comments in #717, but I still don't quite see why this works and whether it is a good idea or if it has some edge-cases to it. Could you try to ELI5 ?

I get that we are doing an average of comparing A and B in both A's unit and in B's unit.

What exactly does this solve? Is it when A's and B's units are really different and trying to represent either quantity in the other's unit results in big rounding errors? Why does averaging help?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure- so basically the HashCode is used in some collections for fast lookup of objects. It is part of the 'Equality Contract' - which states that 'Equals(x,y) => Equals(x.GetHashCode(), y.GetHashCode()))', yet at the same time the result of GetHashCode is not considered as uniquely identifying (it is possible to have the same HashCode for two objects for which Equals(x,y) == false). When creating hashing functions one tries to make the result as unique as possible- but not such that it violates the first condition- typically we take the result of combining the hashes of the properties that make up the Equals function.
In practice all collections that use the hash function by definition compare both the result of the hash comparison, followed by a full Equals check- the first condition skipping 99.99% of the non-equal entities, while the second one ensures that we are not being tricked by a hash-collision.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why we wouldn't want this compared to raw double comparison in equivalent units.

@@ -792,7 +794,8 @@ public bool Equals({_quantity.Name} other, double tolerance, ComparisonType comp
/// <returns>A hash code for the current {_quantity.Name}.</returns>
public override int GetHashCode()
{{
return new {{ QuantityType, Value, Unit }}.GetHashCode();
var roundedBaseValue = Math.Round(GetValueInBaseUnit(), 5);
return new {{ QuantityType, roundedBaseValue }}.GetHashCode();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This deserves some comments explaining the rationale for the GetHashCode. It is not intuitive to me, had I not known about the discussion.

Why is 5 a good choice here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QuantityType was added as you could have two different units (say Length and Area) with identical values. If they also had the same unit enum value (treated as int), they would have the same hash code.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it problematic that 1 kilogram and 1 meter shares hash code though? They would still not be equal.
As long two equal values produce the same hash code, we should be fine. There is no requirement that in-equal values should produce different hash codes, as mentioned in #838 (comment).

I believe we could add back the QuantityType though and keep the new rounded value, but I'm just trying to understand what the impact of dropping quantity type would be. Less performant lookup of quantities in an array due to hash code collisions?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly- it's an over-estimation of the max rounding error that we could get by converting to base (it's so stated in the Precision chapter of the wiki).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#717 (comment)

HashSet may have been a better example - for set operations. It too supports an IEqualityComparer to be passed in.

So one impact of not including quantity type in the hash code would be that

var hs = new HashSet<IQuantity>();
hs.Add(Mass.FromKilograms(1));
hs.Add(Mass.FromKilograms(1));
hs.Add(Length.FromMeters(1));
hs.Dump();

/* Today
HashSet<IQuantity> (2 items)
--
1 kg
1 m
*/

/* This PR would change it to this?
HashSet<IQuantity> (1 item)
--
1 m
*/

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I'm not sure what I'm reading but QuantityValue is still there hah. Ignore that.

Just for our own learning: it does work! I thought it would disallow it.
The difference is without the quantity type it has to call Equals to check if it's a real collision or not. Given how easy it is to avoid, without a significant hash cost, it should be included for performance.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way (I didn't know that) but renaming any of the variables forming the anonymous class new { QuantityType, roundedBaseValue }, would produce a wholly different result from the GetHashCode that follows it :)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I'm not sure what I'm reading but QuantityValue is still there hah. Ignore that.

LOL! I read it that way too! 😆

var secondMass = firstMass.ToUnit(MassUnit.Microgram);

Assert.Equal(0, firstMass.CompareTo(secondMass));
Assert.Equal(0, secondMass.CompareTo(firstMass));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This resulted in -1 and 0 before this PR, so that seems like an improvement.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before considering this failure of the implementation of the IComparable interface I was actually leaning more towards the Equals contract that also included the current Unit (without doing any conversions) - however coming up with rounding-error-proof (hopefully) implementation of the CompareTo tipped the scales for me in the other direction.
In any case- for the Equality contract both options are still on the table..

var firstMass = Mass.FromGrams(0);
var secondMass = Mass.FromGrams(double.Epsilon);

Assert.Equal(firstMass.GetHashCode(), secondMass.GetHashCode());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you mentioned this earlier from the docs.

If two objects compare as equal, the GetHashCode() method for each object must return the same value. However, if two objects do not compare as equal, the GetHashCode() methods for the two objects do not have to return different values.

Here the values are not equal, but the hash codes are still allowed to be equal. Nice!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They don't have to, but it can save a lot of equality checking.

double v1 = 0.0;
double v2 = double.Epsilon;
		
Console.WriteLine(v1.GetHashCode());
Console.WriteLine(v2.GetHashCode());
0
1


namespace UnitsNet.Tests
{
public class QuantityComparisonTests
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation made a whole lot more sense after I found these tests, thank you!
I still don't fully understand the implementation and whether there can be any issues with it, but I am way more comfortable after seeing the excellent work on these test cases 👍

@angularsen
Copy link
Owner

I think this looks pretty good. Just some loose questions/comments above.

@tmilnthorp you were involved in #717 too, maybe take a look when you have the time?

}}

/// <summary>Returns true if greater than.</summary>
public static bool operator >({_quantity.Name} left, {_quantity.Name} right)
{{
return left.Value > right.GetValueAs(left.Unit);
return left.CompareTo(right) == 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this notation better

@stale
Copy link

stale bot commented Nov 21, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Nov 21, 2020
@stale stale bot closed this Nov 28, 2020
@lipchev lipchev self-assigned this Nov 28, 2020
@lipchev
Copy link
Collaborator Author

lipchev commented Nov 28, 2020

Hey guys, ignore this if you're busy- just pinging out to ask if there is something I should do to move this forward.

@lipchev lipchev reopened this Nov 28, 2020
@angularsen
Copy link
Owner

angularsen commented Nov 29, 2020

I tried to refresh my memory by looking back at the comments.
I think for me, the only think I'm still unsure about is the hard coded constant of 5. Can we run into any issues by this choice? Also add some code comments to that part.

#838 (comment)

This deserves some comments explaining the rationale for the GetHashCode. It is not intuitive to me, had I not known about the discussion.
Why is 5 a good choice here?

Honestly- it's an over-estimation of the max rounding error that we could get by converting to base (it's so stated in the Precision chapter of the wiki)

I updated the precision wiki article a bit to be a bit more accurate with today's code base, but the essence remains as before.
https://github.com/angularsen/UnitsNet/wiki/Precision

@angularsen angularsen added the pinned Issues that should not be auto-closed due to inactivity. label Nov 29, 2020
@lipchev
Copy link
Collaborator Author

lipchev commented Nov 30, 2020

I gave it some more thought tonight- and I no longer think this is a good idea. Not that it doesn't do what it was supposed to (q == q.ToUnit(X)) but it just doesn't make any sense to actually use it- given the limiting constraint that q.ToUnit(X).ToUnit(Y) is not necessarily equal to q (the precision lost on the way is non-recoverable). Any method with a quantity as input cannot thus be allowed to make any sane assumptions of equality beyond "both unit and value match" (you'd just be one more ToUnit() away from ruining your day).
I therefore propose that we revert to using the GetValueInBaseUnit() for the Comparator and the strict Equals/HashCode equality contract (using both Unit & Value)

@angularsen
Copy link
Owner

angularsen commented Dec 1, 2020

I agree it sounds like the safest option. I still don't feel this equality is all that useful though, for any non-trivial unit conversion there will be loss of precision and two very similar quantities will still be considered in-equal. And even worse, it is an implementation detail where some quantities use decimal.

I think custom equality comparison methods adds more value than trying to fit the quantities into IEquatable and IComparable, but if we should offer an implementation I would vote for the most conservative/safest option.

@lipchev
Copy link
Collaborator Author

lipchev commented Dec 8, 2020

The problem is that the "safest" option in this case {unit, value} would be quite a breaking change. Anyone who has ever used something like x == Mass.Zero or default(Mass) could be in trouble if 'x' does not come with the default unit (kg in this case).
Also there are some edge cases that might come as a surprise: like having X<=Y & X >= Y but X != Y.

@angularsen
Copy link
Owner

angularsen commented Dec 8, 2020

Good point. Changing equality from base unit to value+unit would break things like Zero comparison, I hadn't thought of that. So. Hm.

Actually. I think we should honor this contract, not value+unit:

Length.FromCentimeters(0).Equals(Length.Zero)

For me, that is the most intuitive. I know, equality won't work except for trivial examples like Length.FromCentimeters(100).Equals(Length.FromMeters(1)), but I just don't see a better compromise given this very limited contract.

…her.As(Unit)

- changed the GetHashCode generation scheme: depending on the QuantityType alone (the unit & value are ignored)
- updated the two relevant tests (that were previously passing using mixed units)
@lipchev
Copy link
Collaborator Author

lipchev commented Mar 7, 2021

So, I've reverted the Equals method (same as before) and made the GetHashCode() return only the QuantityType's hash-code- as such it would match the contract requirements. However, I'm pretty sure there is 'reflexivity' requirement that the Equals method does not currently satisfy (or at least not always). I've adjusted the mixed-unit tests such as to demonstrate the problematic behavior.

As far as the Comparable interface- there I've kept the proposed scheme (that is reflexive). I think this is the safer option, even if there could be cases where A == B && A.CompareTo(B) != 0, the inverse is always true, i.e. A != B => A.CompareTo(B) != 0.

All in all, I'm not terribly happy with the whole thing- it would have been so much neater with a decimal-like value representation. In any case, even if we kept the original implementation- it wouldn't hurt to grab the tests from this branch (and make a note about the limitations in the wiki).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement pinned Issues that should not be auto-closed due to inactivity.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants