Java defines the equality operators == and != for testing reference equality but uses the equals() method defined in Object and its subclasses for testing abstract object equality. Naïve programmers often confuse the intent of the == operation with that of the Object.equals() method. This confusion is frequently evident in the context of processing String objects.
As a general rule, use the Object.equals() method to check whether two objects have equivalent contents, and use the equality operators == and != to test whether two references refer to the same object. This latter test is referred to as referential equality. Classes that override the default equals() implementation must also override the hashCode() method (see MET09-J. Classes that define an equals() method must also define a hashCode() method).
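The pairing of equals() and hashCode() can be illustrated with a minimal sketch. The class name Point and its fields are hypothetical and serve only to show the pattern:

```java
import java.util.Objects;

// Hypothetical value class; the name Point and its fields are illustrative only.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Abstract object equality: two points are equal when their coordinates match.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true; // same reference, trivially equal
        }
        if (!(o instanceof Point)) {
            return false; // also rejects null
        }
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    // Equal objects must report equal hash codes so hash-based collections work.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}
```

With this definition, two separately constructed Point objects holding the same coordinates are equal under equals() and can find each other in a HashSet, even though == still reports them as distinct references.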
Numeric boxed types (for example, Double) should also be compared using Object.equals() rather than the == operator. Reference equality may appear to work for Integer values in the range −128 to 127, because autoboxing is only guaranteed to reuse cached objects for values in that range, but it may fail if either operand in the comparison is outside that range. Numeric relational operators other than equality (such as >=) unbox their operands and can be safely used to compare boxed primitive types (see EXP03-J. Do not use the equality operators when comparing values of boxed primitives for more information).
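A short sketch of the boxed-type behavior described above. The class and method names are assumptions made for illustration. The result of == on the boxed values is deliberately left without an expected output because it depends on the JVM's boxing cache:

```java
class BoxedComparison {
    // Compares by value; behaves the same across the whole Integer range.
    static boolean sameValue(Integer a, Integer b) {
        return a.equals(b);
    }

    // Relational operators unbox both operands, so this compares int values.
    static boolean atLeast(Integer a, Integer b) {
        return a >= b;
    }

    public static void main(String[] args) {
        Integer a = 1000; // outside the -128..127 range
        Integer b = 1000;
        System.out.println(sameValue(a, b)); // true
        System.out.println(atLeast(a, b));   // true
        // Reference comparison: the outcome depends on the boxing cache
        // and is commonly false for values outside -128..127.
        System.out.println(a == b);
    }
}
```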
Noncompliant Code Example
This noncompliant code example declares two distinct String objects that contain the same value:
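A minimal sketch consistent with this description follows. The class, method, and variable names are assumptions:

```java
class StringReferenceComparison {
    // Noncompliant: compares references, not contents.
    static boolean sameReference() {
        String str1 = new String("one");
        String str2 = new String("one");
        return str1 == str2; // false: each new expression creates a distinct object
    }

    public static void main(String[] args) {
        System.out.println(sameReference()); // prints "false"
    }
}
```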
The reference equality operator == evaluates to true only when the values it compares refer to the same underlying object. The references in this example are unequal because they refer to distinct objects.
Compliant Solution (Object.equals())
This compliant solution uses the Object.equals() method when comparing string values:
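A minimal sketch of this approach, with assumed class, method, and variable names:

```java
class StringEqualsComparison {
    // Compliant: compares the character contents of the two strings.
    static boolean sameContents() {
        String str1 = new String("one");
        String str2 = new String("one");
        return str1.equals(str2); // true: both strings contain "one"
    }

    public static void main(String[] args) {
        System.out.println(sameContents()); // prints "true"
    }
}
```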
Compliant Solution (String.intern())
Reference equality behaves like abstract object equality when it is used to compare two strings that are results of the String.intern() method. This compliant solution uses String.intern() and can perform fast string comparisons when only one copy of the string "one" is required in memory.
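A minimal sketch of the interning approach, with assumed class, method, and variable names. It relies on the documented guarantee that s.intern() == t.intern() exactly when s.equals(t):

```java
class StringInternComparison {
    // Compliant: both references come from String.intern(), so strings with
    // equal contents map to the same canonical object.
    static boolean sameInterned() {
        String str1 = new String("one").intern();
        String str2 = new String("one").intern();
        return str1 == str2; // true: both refer to the interned "one"
    }

    public static void main(String[] args) {
        System.out.println(sameInterned()); // prints "true"
    }
}
```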
String.intern() should be reserved for cases in which the tokenization of strings either yields an important performance enhancement or dramatically simplifies code. Examples include programs engaged in natural language processing and compiler-like tools that tokenize program input. For most other programs, performance and readability are often improved by the use of code that applies the Object.equals() approach and that lacks any dependence on reference equality.
The Java Language Specification (JLS) [JLS 2013] provides very limited guarantees about the implementation of String.intern(). For example,
- The cost of String.intern() grows as the number of interned strings grows. Performance should be no worse than O(n log n), but the JLS lacks a specific performance guarantee.
- In early Java Virtual Machine (JVM) implementations, interned strings became immortal: they were exempt from garbage collection. This can be problematic when large numbers of strings are interned. More recent implementations can garbage-collect the storage occupied by interned strings that are no longer referenced. However, the JLS lacks any specification of this behavior.
- In JVM implementations prior to Java 1.7, interned strings are allocated in the permgen storage region, which is typically much smaller than the rest of the heap. Consequently, interning large numbers of strings can lead to an out-of-memory condition. In many Java 1.7 implementations, interned strings are allocated on the heap, relieving this restriction. Once again, the details of allocation are unspecified by the JLS; consequently, implementations may vary.
String interning may also be used in programs that accept repetitively occurring strings. Its use boosts the performance of comparisons and minimizes memory consumption.
When canonicalization of objects is required, it may be wiser to use a custom canonicalizer built on top of ConcurrentHashMap; see Joshua Bloch's Effective Java, second edition, Item 69 [Bloch 2008], for details.
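One possible shape for such a canonicalizer is sketched below. The class name Canonicalizer and the method name canonicalize are hypothetical, and this is not necessarily the exact code given in the cited item. It assumes the canonicalized type has consistent equals() and hashCode() implementations:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical canonicalizer: returns one shared instance per distinct value.
final class Canonicalizer<T> {
    private final ConcurrentMap<T, T> pool = new ConcurrentHashMap<>();

    // Returns the canonical instance equal to value, registering value itself
    // if no equal instance has been seen. Null values are not supported
    // because ConcurrentHashMap rejects null keys.
    T canonicalize(T value) {
        T existing = pool.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }
}
```

Values returned by the same Canonicalizer can then be compared with ==. Note that this sketch never evicts entries, so the pool grows with the number of distinct values it has seen.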
Confusing reference equality and object equality can lead to unexpected results.
Using reference equality in place of object equality is permitted only when the defining classes guarantee the existence of at most one object instance for each possible object value. The use of static factory methods, rather than public constructors, facilitates instance control; this is a key enabling technique. Another technique is to use an enum type.
Use reference equality to determine whether two references point to the same object.
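The instance-control technique mentioned above, static factory methods in place of public constructors, can be sketched as follows. The class CurrencyCode and its factory method of() are hypothetical names chosen for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical instance-controlled class: at most one object exists per code,
// so reference equality coincides with value equality for its instances.
final class CurrencyCode {
    private static final ConcurrentMap<String, CurrencyCode> INSTANCES =
            new ConcurrentHashMap<>();

    private final String code;

    // Private constructor: callers cannot create uncontrolled instances.
    private CurrencyCode(String code) {
        this.code = code;
    }

    // Static factory: returns the single shared instance for the given code,
    // creating it atomically on first request.
    static CurrencyCode of(String code) {
        return INSTANCES.computeIfAbsent(code, CurrencyCode::new);
    }

    String code() {
        return code;
    }
}
```

Because the constructor is private and the factory always returns the cached instance, CurrencyCode.of("USD") == CurrencyCode.of("USD") holds, and callers may use == safely for this class.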
|The Checker Framework|
|Interning Checker||Check for errors in equality testing and interning (see Chapter 5)|
|[Bloch 2008]||Item 69, "Prefer Concurrency Utilities to |
ES, "Comparison of String Objects Using