All types are Equatable, but some are more equal than others

Vincent Esche
DefiniteLoops
Published in
9 min readMar 27, 2017

--

It’s probably safe to say that hardly more than a week will get to go by upon taking up learning Swift before one encounters the Equatable protocol. It is one of the simplest and probably the most adopted protocol in the Swift stdlib:

Implementing Equatable, alongside Hashable allows one’s types to be used as keys in a Dictionary<K, V> or a values in a Set<V>.

Taking a closer look at Dictionary and Set reveals that they actually share quite a lot in semantics and can thus even be implemented in terms of each other. As such one could implement a Dictionary<K, V> in terms of a Set<(K, V)>, conceptually*. Likewise one could implement a Set<V> in terms of a Dictionary<V, ()>**. One possible reason one might choose to do so is sharing logic, which otherwise would have to be maintained redundantly.

* Non-unary tuples in Swift cannot implement methods, so the proper way to do things would be actually using a small wrapper struct Pair<K, V> instead of (K, V), resulting in Set<Pair<K, V>>.

** The () (aka Void) type is a zero-sized type in Swift, also known as the unit type. While itself – as well as its instances – are present during type checking, the compiler will actually strip them from the resulting binary. A tuple of type (V, ()) would thus boil down to the same size as just (V) (aka V).

For the sake of simplicity we shall only look at Set<V>, but most of the concepts and points to be made apply to Dictionary<K, V>, likewise.

So, what exactly is a Set<V>? And how does it differ from Array<V>?

Unlike an Array<V>, which retains the order of its elements, a Set<V> data structure (ignoring implementation details of whether it’s implemented in terms of an ordered TreeSet<V> or an unordered HashSet<V>) is defined to store its unique members in an unspecified order. Any prior order of its member objects is lost on insertion. Inserting two objects a and b for which the expression a == b evaluates to true, will result in only one of them actually getting inserted*. And unlike a CountingSet<V>, a Set<V> does not keep count of how often an object has been inserted. As such, inserting an object twice, followed by a single removal of said object, will result in the actual removal of said object from the set.

* Which one is inserted/kept depends on the implementation. In case of Swift calling set.insert(object) will not have any effect if object is already found in set. If one however wants to replace existing members one has to call set.update(with: object) instead.

The bare minimum API of a Set<Element> would thus be …

  • init()
  • func insert(_ e: Element)
  • func remove(_ e: Element)
  • func contains(_ e: Element) -> Bool

… as well as some way to access its members without explicitly querying for members:

  • func makeIterator() -> Iterator

Everything else can be implemented in terms of these methods.

So here are a couple of behaviors one would expect to hold true for a s: Set<_>:

  • Calling s.insert(object) must either increment s.count by one (if object is not yet a member), or have it stay the same (if object is already a member).
  • Calling s.remove(object) must either decrement s.count by one (if object is a member), or have it stay the same (if object is not a member).
  • Calling (s.union(s)) must return a Set that is equivalent to s.
  • Likewise, calling (s.intersection(s)) must return a Set that is equivalent to s, too.
  • Calling (s.subtracting(s)) must return a Set that is empty.
  • Constructing s from a sequence of non-unique objects must result in a Set containing only unique objects from said sequence, while dropping all the redundant ones.
  • For any pair of Sets s & t where s == t
  • … where e is any member of t (or vice versa) calling s.contains(e) must also return true.
  • … calling s.isSubset(of: t) (or vice versa) must return true.
  • … calling s.isSuperset(of: t) (or vice versa) must return true.
  • … calling s.isDisjoint(with: t) (or vice versa) must return false.
  • … calling s.symmetricDifference(t) (or vice versa) must return an empty Set.

Now what If I told you that none of these hold true for Set<Float>, Set<Double>, and consequently Set<V>?

How can this be, given that both, Float and Double conform to Hashable (and therefor also Equatable), one of the (quite literally) key-requirements of Set<Float> and Set<Double>.

Let’s take a look at Apple’s official documentation for Equatable to make sure we’re on the same page on what constitutes conformance to Equatable. The important part is this one:

Since equality between instances of Equatable types is an equivalence relation, any of your custom types that conform to Equatable must satisfy three conditions, for any values a, b, and c:

a == a is always true (Reflexivity)

a == b implies b == a (Symmetry)

a == b and b == c implies a == c (Transitivity)

We already discussed in detail what constitutes the conformance of a type to Hashable in an earlier article, so we will not go int any detail on it here but will instead focus entirely on Equatable.

You have probably used the functionality of Equatable in combination with either Float and Double more often than you can remember and are wondering at this point what kind of weird behavior this could be alluding to.

After pondering for a moment you might remember that a fellow programmer once told you to never compare two floating-point numbers against each other. Bad things would happen, they said.

And indeed what they were referring to were the infamous rounding errors found in common floating-point representations. As such 0.1 + 0.2 does not necessarily have to evaluate to the same value as 0.3 on its own. In fact Swift will indeed evaluate the expression 0.1 + 0.2 to a value of 0.30000000000000004, yet the expression 0.3 to a value of 0.29999999999999999. Some real numbers simply cannot be cleanly expressed as floating-point numbers. To add insult to injury even an expression of (a + b) + c could yield a (oh so slightly) different result than a + (b + c) for floating-point values. IEEE 754 floating-point numbers do not guarantee proper associativity.

This however is actually not even what’s at play here. After all once a floating-point value is instantiated it does not change its value and thus should be equal to itself at any time. Once 42.0, always 42.0.

Now it’s quite likely that you never heard of NaN before, or have you? If so, no big surprise. After all we prefer to not talk about NaN around here, if possible. NaN is floating-point’s dirty laundry. It’s weird. It’s a wolf pack in sheep’s clothing.

Yep, there’s indeed more than one NaN: a total of 8388606 of them in Float alone, to be specific. And even more of them are to be found in Double. All of them operating under the facade of a single identifier and yet boldly proclaiming “I’m not with them!” when asked.

Remember this rule from Equatable?

Reflexivity: a == a is always true.

Well, as it turns out Float.nan != Float.nan. Spooky, eh?

What at first glance looks like a bug in Swift is unfortunately precisely how NaN is supposed to work. In pretty much every other major programming language, too.

When it comes to NaN it’s all about what NaN is not: a number.

It is a single name used for all members of the group of possible combinations of 32 (or 64 for Double) bits, that happen to not be a valid floating-point number representation, according to IEEE 754.

That’s quite a mouth-full and a rabbit hole for those who feel so inclined.

The strange behavior of Set is a direct result of this weird detail about Float/Double. It is blinded by its trust of Float/Double to play by the rules they both agreed upon (i.e. Equatable).

So, if this dubious NaN is so troublesome, one might ask, then why don’t we just remove it from Float/Double?

Well, turns out we can’t. NaN is an inevitable consequence of how floating-point numbers are represented in IEEE 754.

You could think of every Float/Double deep down actually being a Float?/Double? in disguise with Float.nan/Double.nan being the nil/.none case. Having them that way would actually clean things up drastically. We can’t however do it like that without both: breaking compatibility with every other major programming language, and saying goodbye to any future prospect of having a foreign function interface. We would also be losing all the benefits of hardware-accelerated floating-point operations, which we could no longer make any use of.

So what can we do to fix this then?

Well, turns out there is a way to avoid this issue and make Swift’s stdlib more expressible at the same time. And it’s pretty straight-forward, too: Split up Equatable into two protocols.

With Equatable and PartiallyEquatable in place Set<Element : Hashable> and Dictionary<Key : Hashable, Value> could remain as they are, but Float’s and Double’s conformance would be demoted from Equatable to PartiallyEquatable.

So while one could still compare two floating-point values a and b against each other via a == b (while still bringing all the weirdness of NaN with it!) one could not use them as a key or element to any data structure that requires conformance to Equatable (be it directly, or indirectly).

As such this would essentially prohibit any use of Set<Float/Double> or Dictionary<Float/Double, _> (and rightly so from a type-theoretical perspective), thus being a rather obvious breaking change. But one that should only break code that was probably an ill-informed mistake to begin with. After all as we saw earlier it’s not just NaN that’s a bit weird about them.

If one really, really wanted to use a T: FloatingPoint in a Hashable/Equatable context one would have to wrap it like this and then use SafeFloat<_> as the key/element:

The guard !inner.isNaN else { return nil } is what makes this work safely.

Why and how the same problem/and solution (Comparable, PartiallyComparable) applies to Comparable will be left as an exercise for the reader. (Hint: protocol Comparable: Equatable { … })

If Swift had true generic protocols (No, what you’re thinking of are PATs, protocols with associated types. A solution to a different problem.), then the proper and more general definition of Equatable and PartiallyEquatable would actually be:

… and similarly so for Comparable and PartiallyComparable.

This leaves us with one final question: “Why on earth should we care about NaN to begin with?” You have most likely never been in the presence of NaN before anyway.

What would you expect sequence.sorted() to do? If you answered with “Well, it returns the elements of the sequence, sorted” in your head, then you’re right on track.

As it turns out it is in fact exactly what the docs have to say about func sorted():

Returns the elements of the sequence, sorted.

So we should be able to safely expect the result of sequence.sorted() to hold true for any pair of elements a and b where a < b the following property: a is to be found at a lower index in the sequence, than b .

But does it? Let’s find out:

Well, … that’s not quite what I would call sorted.

One more, just for fun:

I’ll leave it as an exercise for the reader to figure out what’s going on here and why this is pretty bad.

So, why should we care, again?

We should care, because every time one adds NaN to the mix one risk messing with the integrity of one’s program’s semantics.

Reflexivity requires for every value of type T to return true for self == self. Float.nan is member of Float, but Float.nan == Float.nan evaluates to false. Thus Float/Double should clearly never have been declared to conform to Equatable. The fact that both Float and Double deliberately proclaim to conform to Equatable against better knowledge is leaving a gaping hole in the safety and soundness promises of Swift’s type system.

As we saw earlier Set<Float/Double> and Dictionary<Float/Double, _> are closely related to each other and basically share a lot of their semantics. As such everything we observed for the former will also apply to the latter in some form or another.

While one could consider Set<Float/Double> or Dictionary<Float/Double, _> a clear edge-case, it is important to understand that the weirdness creeps in as soon as one carelessly includes an instance property let _: Float or let _: Double in the implementation of a type’s static func == (lhs: Self, rhs: Self) -> Bool.

It doesn’t help to have the sturdiest of all walls to secure your fortified castle if you then leave one of the outer doors wide open at night.

Update 1:

Sounds good!

Update 2:

The gift that keeps on giving:

--

--