Best practices for overriding isEqual: and hash

How do you properly override isEqual: in Objective-C? The "catch" seems to be that if two objects are equal (as determined by the isEqual: method), they must have the same hash value.

The Introspection section of the Cocoa Fundamentals Guide does have an example on how to override isEqual: , copied as follows, for a class named MyWidget :

- (BOOL)isEqual:(id)other {
    if (other == self)
        return YES;
    if (!other || ![other isKindOfClass:[self class]])
        return NO;
    return [self isEqualToWidget:other];
}

- (BOOL)isEqualToWidget:(MyWidget *)aWidget {
    if (self == aWidget)
        return YES;
    if (![(id)[self name] isEqual:[aWidget name]])
        return NO;
    if (![[self data] isEqualToData:[aWidget data]])
        return NO;
    return YES;
}

It checks pointer equality, then class equality, and finally compares the objects using isEqualToWidget: , which only checks the name and data properties. What the example doesn't show is how to override hash .

Let's assume there are other properties that do not affect equality, say age . Shouldn't the hash method be overridden such that only name and data affect the hash? And if so, how would you do that? Just add the hashes of name and data ? For example:

- (NSUInteger)hash {
    NSUInteger hash = 0;
    hash += [[self name] hash];
    hash += [[self data] hash];
    return hash;
}

Is that sufficient? Is there a better technique? What if you have primitives, like int ? Convert them to NSNumber to get their hash? Or structs like NSRect ?

( Brain fart : Originally wrote "bitwise OR" them together with |= . Meant add.)


Start with

 NSUInteger prime = 31;
 NSUInteger result = 1;

Then for every primitive you do

 result = prime * result + var

For 64bit you might also want to shift and xor.

 result = prime * result + (int) (var ^ (var >>> 32));

For objects you use 0 for nil and otherwise their hashcode.

 result = prime * result + [var hash];

For booleans you use two different values

 result = prime * result + (var)?1231:1237;

Explanation and Attribution

This is not tcurdt's work, and comments were asking for more explanation, so I believe an edit for attribution is fair.

This algorithm was popularized in the book "Effective Java", and the relevant chapter can currently be found online here. That book popularized the algorithm, which is now a default in a number of Java applications (including Eclipse). It derived, however, from an even older implementation which is variously attributed to Dan Bernstein or Chris Torek. That older algorithm originally floated around on Usenet, and certain attribution is difficult. For example, there is some interesting commentary in this Apache code (search for their names) that references the original source.

Bottom line is, this is a very old, simple hashing algorithm. It is not the most performant, and it is not even proven mathematically to be a "good" algorithm. But it is simple, and a lot of people have used it for a long time with good results, so it has a lot of historical support.


I'm just picking up Objective-C myself, so I can't speak for that language specifically, but in the other languages I use if two instances are "Equal" they must return the same hash - otherwise you are going to have all sorts of problems when trying to use them as keys in a hashtable (or any dictionary-type collections).

On the other hand, if 2 instances are not equal, they may or may not have the same hash - it is best if they don't. This is the difference between an O(1) search on a hash table and an O(N) search - if all your hashes collide you may find that searching your table is no better than searching a list.

In terms of best practices your hash should return a random distribution of values for its input. This means that, for example, if you have a double, but the majority of your values tend to cluster between 0 and 100, you need to make sure that the hashes returned by those values are evenly distributed across the entire range of possible hash values. This will significantly improve your performance.

There are a number of hashing algorithms out there, including several listed here. I try to avoid creating new hash algorithms as it can have large performance implications, so using the existing hash methods and doing a bitwise combination of some sort as you do in your example is a good way to avoid it.


I found this thread extremely helpful supplying everything I needed to get my isEqual: and hash methods implemented with one catch. When testing object instance variables in isEqual: the example code uses:

if (![(id)[self name] isEqual:[aWidget name]])
    return NO;

This repeatedly failed (ie, returned NO) without and error, when I knew the objects were identical in my unit testing. The reason was, one of the NSString instance variables was nil so the above statement was:

if (![nil isEqual: nil])
    return NO;

and since nil will respond to any method, this is perfectly legal but

[nil isEqual: nil]

returns nil, which is NO, so when both the object and the one being tested had a nil object they would be considered not equal (ie, isEqual: would return NO).

This simple fix was to change the if statement to:

if ([self name] != [aWidget name] && ![(id)[self name] isEqual:[aWidget name]])
    return NO;

This way, if their addresses are the same it skips the method call no matter if they are both nil or both pointing to the same object but if either is not nil or they point to different objects then the comparator is appropriately called.

I hope this saves someone a few minutes of head scratching.

链接地址: http://www.djcxy.com/p/58482.html

上一篇: Ruby集合类:集合的平等

下一篇: 重写isEqual的最佳实践:和散列