A memory leak is a situation in which memory that is no longer needed isn't released: an object can't be collected, yet the running code no longer uses it. In an environment with a GC, such as the JVM, a memory leak may happen when a reference to an object that's no longer needed is still stored in another object. This happens due to logical errors in program code, when an object holds a reference to another one even though the latter isn't used and isn't accessible in the program code anymore. The following diagram represents this case:
The GC takes care of unreachable (also known as unreferenced) objects, but handling unused referenced objects is up to the application logic. Leaked objects occupy memory, which means that less space is available for new objects. So if there's a memory leak, the GC runs more frequently and the risk of an OutOfMemoryError increases.
Let's look at an example written in Kotlin that uses the popular RxJava2 library:
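To make the difference between unreachable and unused objects concrete, here is a minimal sketch that uses only the Kotlin standard library. The Registry and Listener names are hypothetical and serve only as an illustration: a long-lived collection keeps an object reachable until the application explicitly removes it.

```kotlin
// Hypothetical names (Listener, Registry) used only for illustration.
class Listener(val name: String)

object Registry {
    // A long-lived collection: everything stored here stays reachable.
    private val listeners = mutableListOf<Listener>()

    fun register(listener: Listener) {
        listeners.add(listener)
    }

    // Without a matching unregister call, a registered listener
    // can never be collected, even if nobody uses it anymore.
    fun unregister(listener: Listener) {
        listeners.remove(listener)
    }

    fun size(): Int = listeners.size
}

fun main() {
    val listener = Listener("temporary")
    Registry.register(listener)
    // Suppose the program is done with the listener at this point.
    // The GC cannot know that: the registry still references it.
    println(Registry.size()) // 1

    Registry.unregister(listener)
    println(Registry.size()) // 0
}
```

Only the application knows when the listener stops being useful, which is why the unregister call has to come from application logic rather than from the GC.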
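import io.reactivex.Observable
import java.util.concurrent.TimeUnit

fun main(vars: Array<String>) {
    var memoryLeak: MemoryLeak? = MemoryLeak()
    memoryLeak?.start()
    // Dropping our only reference does not stop the subscription.
    memoryLeak = null
    memoryLeak = MemoryLeak()
    memoryLeak.start()
    // Block the main thread so the application keeps running.
    Thread.currentThread().join()
}

class MemoryLeak {
    init {
        objectNumber++
    }

    private val currentObjectNumber = objectNumber

    fun start() {
        // The lambda reads currentObjectNumber, so it captures this instance.
        Observable.interval(1, TimeUnit.SECONDS)
            .subscribe { println(currentObjectNumber) }
    }

    companion object {
        @JvmField
        var objectNumber = 0
    }
}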
In this example, the join() method of the main thread is used to keep the application running while other threads do their work. The objectNumber field of the MemoryLeak class counts created instances. Whenever a new instance of the MemoryLeak class is created, the value of objectNumber is incremented and copied to the currentObjectNumber property.
The MemoryLeak class also has the start() method. This method creates an instance of Observable that emits an incrementing number every second. Observable is the multi-valued base-reactive class that offers factory methods, intermediate operators, and the ability to consume synchronous and/or asynchronous reactive data-flows. Observable has many factory functions that create new instances to perform different actions. In our case, we use the interval function, which takes two arguments: the period between emissions and an instance of the TimeUnit enum (https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/TimeUnit.html), which is the unit in which the period is expressed. The subscribe method takes an instance of a class that has the Consumer type. The most common approach is to create a lambda to handle emitted values.
The main function is the starting point of our application. In this function, we create a new instance of the MemoryLeak class, then invoke the start() method. After this, we assign null to the memoryLeak reference and repeat the previous step.
This is one of the most common issues when using RxJava. The first instance of the MemoryLeak class cannot be collected because the Consumer passed to subscribe holds a reference to it. The Consumer is in turn referenced by one of the active threads, which is a root object, so the first instance of MemoryLeak remains reachable. Since our code no longer has a reference to this object, it's unused, but it can't be collected. The output of the application looks as follows:
1
2
1
2
2
1
1
2
1
2
1
As you can see, both instances of Observable keep running and use the currentObjectNumber property, and consequently both instances of the MemoryLeak class continue to occupy memory. That's why we should release resources when an object is no longer needed. To deal with this issue, we have to rewrite the code as follows:
import io.reactivex.Observable
import io.reactivex.disposables.Disposable
import java.util.concurrent.TimeUnit

fun main(vars: Array<String>) {
    var memoryLeak: NoMemoryLeak? = NoMemoryLeak()
    memoryLeak?.start()
    // Cancel the subscription before dropping the reference.
    memoryLeak?.disposable?.dispose()
    memoryLeak = NoMemoryLeak()
    memoryLeak.start()
    Thread.currentThread().join()
}

class NoMemoryLeak {
    init {
        objectNumber++
    }

    private val currentObjectNumber = objectNumber

    // Handle to the running subscription, set by start().
    var disposable: Disposable? = null

    fun start() {
        disposable = Observable.interval(1, TimeUnit.SECONDS)
            .subscribe { println(currentObjectNumber) }
    }

    companion object {
        @JvmField
        var objectNumber = 0
    }
}
And now the output looks like this:
2
2
2
2
2
2
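The subscribe() method returns an instance of the Disposable type, which has the dispose() method. Calling dispose() cancels the subscription, so the Consumer, and with it the NoMemoryLeak instance that it references, can be released. Using this approach, we can prevent the memory leak.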
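The same principle applies outside RxJava. The following sketch is a JDK-only analogue, not RxJava code: it assumes a hypothetical Ticker class that schedules a periodic task with ScheduledExecutorService, where cancelling the returned ScheduledFuture plays the role of dispose().

```kotlin
import java.util.concurrent.Executors
import java.util.concurrent.ScheduledExecutorService
import java.util.concurrent.ScheduledFuture
import java.util.concurrent.TimeUnit

// Hypothetical class for illustration; it is not part of RxJava.
class Ticker(private val id: Int, private val scheduler: ScheduledExecutorService) {
    private var task: ScheduledFuture<*>? = null

    // True while a task is scheduled and has not been cancelled.
    val isRunning: Boolean
        get() = task?.isCancelled == false

    fun start() {
        // The lambda reads `id`, so it captures this Ticker. The scheduler
        // keeps the Ticker reachable for as long as it holds the task.
        task = scheduler.scheduleAtFixedRate({ println(id) }, 1, 1, TimeUnit.SECONDS)
    }

    // Counterpart of Disposable.dispose(): cancel the task so that it is
    // not run again and the scheduler can drop its reference to it.
    fun stop() {
        task?.cancel(false)
        task = null
    }
}

fun main() {
    val scheduler = Executors.newSingleThreadScheduledExecutor()

    val first = Ticker(1, scheduler)
    first.start()
    first.stop() // release the task before we stop using `first`

    val second = Ticker(2, scheduler)
    second.start()
    Thread.sleep(2500) // give the second ticker time to print
    second.stop()
    scheduler.shutdown()
}
```

In this sketch only the second ticker is expected to print, because the first one is cancelled before its first run is due. The design point is the same as with Disposable: whoever starts a long-running task should keep the handle needed to stop it.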
Using instances of mutable classes without overriding the equals() and hashCode() methods as keys for Map can also lead to a memory leak. Let's look at the following example:
class MutableKey(var name: String? = null)

fun main(vars: Array<String>) {
    val map = HashMap<MutableKey, Int>()
    map.put(MutableKey("someName"), 2)
    print(map[MutableKey("someName")])
}
The output will be the following:
null
The get method of HashMap uses the hashCode() and equals() methods of a key to find and return a value. The current implementation of the MutableKey class doesn't override these methods, so two instances are equal only if they are the same object. That's why, if you lose the reference to the original key instance, you won't be able to retrieve or remove the value by key. It's a memory leak because map is a local variable and, consequently, a root object, so everything stored in it stays reachable.
We can remedy the situation by declaring MutableKey as a data class. If a class is marked as data, the compiler automatically derives the equals() and hashCode() methods from all properties declared in the primary constructor. So the MutableKey class will look as follows:
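The following sketch shows what is left in the map in this situation. PlainKey is a hypothetical stand-in for a class without equals() and hashCode() overrides, and buildMap is a helper introduced only for this example.

```kotlin
// Hypothetical class for illustration: no equals()/hashCode() overrides,
// so instances are compared by identity.
class PlainKey(var name: String? = null)

// Stores a value under a key that nobody keeps a reference to.
fun buildMap(): HashMap<PlainKey, Int> {
    val map = HashMap<PlainKey, Int>()
    map[PlainKey("someName")] = 2
    return map
}

fun main() {
    val map = buildMap()

    println(PlainKey("someName") == PlainKey("someName")) // false
    println(map[PlainKey("someName")])                     // null
    println(map.size)                                      // 1

    // The entry is still there; it can only be reached by iterating.
    map.keys.removeAll { it.name == "someName" }
    println(map.size)                                      // 0
}
```

The entry occupies memory for as long as the map is reachable, even though no lookup by key can find it.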
data class MutableKey(var name: String? = null)
And now the output will be:
2
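For comparison, here is a hand-written sketch of roughly what the data modifier provides for equality and hashing. ManualKey is a hypothetical name, and the real generated code also includes members such as copy() and componentN() functions that are omitted here.

```kotlin
// Hypothetical hand-written counterpart of a data class with one property.
class ManualKey(var name: String? = null) {
    // Two keys are equal when their names are equal.
    override fun equals(other: Any?): Boolean =
        this === other || (other is ManualKey && name == other.name)

    // Equal keys must produce equal hash codes.
    override fun hashCode(): Int = name?.hashCode() ?: 0

    override fun toString(): String = "ManualKey(name=$name)"
}

fun main() {
    val map = HashMap<ManualKey, Int>()
    map[ManualKey("someName")] = 2
    println(map[ManualKey("someName")]) // 2
}
```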
Now, this class works as expected. But we can face another issue with the MutableKey class. Let's rewrite main as follows:
fun main(vars: Array<String>) {
    val key = MutableKey("someName")
    val map = HashMap<MutableKey, Int>()
    map.put(key, 2)
    key.name = "anotherName"
    print(map[key])
}
Now, the output will be:
null
This happens because the hash code changes after the name property is reassigned:
fun main(vars: Array<String>) {
    val key = MutableKey("someName")
    println(key.hashCode())
    val map = HashMap<MutableKey, Int>()
    map.put(key, 2)
    key.name = "anotherName"
    println(key.hashCode())
    print(map[key])
}
The output will now be:
1504659871
-298337234
null
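The sketch below shows that the entry is stranded rather than gone, and outlines one possible workaround. The rename helper is hypothetical and introduced only for this example: it removes the entry while the hash code still matches, changes the key, and re-inserts the value.

```kotlin
data class MutableKey(var name: String? = null)

// Hypothetical helper: changes a key without stranding its entry.
fun <V : Any> rename(map: MutableMap<MutableKey, V>, key: MutableKey, newName: String) {
    val value = map.remove(key)          // remove while the hash code still matches
    key.name = newName
    if (value != null) map[key] = value  // re-insert under the new hash code
}

fun main() {
    val key = MutableKey("someName")
    val map = HashMap<MutableKey, Int>()
    map[key] = 2

    key.name = "anotherName"
    println(map[key])        // null: the lookup uses the new hash code
    println(map.remove(key)) // null: removal fails for the same reason
    println(map.size)        // 1: the entry is still stored

    key.name = "someName"
    println(map[key])        // 2: reachable again with the original name

    rename(map, key, "anotherName")
    println(map[key])        // 2: the entry moved together with the key
}
```

Such a helper only works if every mutation goes through it, which is hard to guarantee in a larger code base.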
This means that our code is neither simple nor reliable, and we can still have a memory leak: the entry stays in the map, but it can no longer be found by its key. The concept of an immutable object is extremely helpful in this case. Using this concept, we can protect objects from corruption, which is exactly the issue we need to prevent.
A strategy for creating classes for immutable objects in Java is complex and includes the following key points:
- Do not provide setters
- All fields have to be marked with the final and private modifiers
- Mark a class with the final modifier
- References that are held by fields of an immutable class should refer to immutable objects
- Objects that are composed by an immutable class have to also be immutable
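The last two points are easy to overlook, because a final field can still refer to an object that changes. A minimal Kotlin sketch, assuming a hypothetical Team class, shows one common way to address this with a defensive copy:

```kotlin
// Hypothetical class for illustration.
class Team(val name: String, members: List<String>) {
    // Defensive copy: later changes to the caller's list do not leak in.
    val members: List<String> = members.toList()
}

fun main() {
    val source = mutableListOf("Ann", "Bob")
    val team = Team("core", source)

    source.add("Eve")     // the caller keeps mutating its own list
    println(team.members) // [Ann, Bob]
}
```

Without the toList() call, the Team instance would share the caller's list and its state could change after construction, even though every property is declared with val.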
An immutable class that is created according to this strategy may look as follows:
public final class ImmutableKey {
    private final String name;

    public ImmutableKey(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }
}
This is all very easy in Kotlin:
data class ImmutableKey(val name: String? = null)
All we need to do is define all properties with val in the primary constructor. We'll get a compiler error if we try to assign a new value to the name property. Immutability is an extremely powerful concept that allows us to implement some mechanisms, such as the String pool.
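As a brief sketch of how this plays out with a map, the example below uses the copy() function that Kotlin generates for data classes. A "changed" key is a new object, so the entry stored under the original key stays reachable.

```kotlin
data class ImmutableKey(val name: String? = null)

fun main() {
    val key = ImmutableKey("someName")
    val map = HashMap<ImmutableKey, Int>()
    map[key] = 2

    // key.name = "anotherName" // does not compile: val cannot be reassigned

    // copy() creates a new object; the original key is unchanged.
    val renamed = key.copy(name = "anotherName")

    println(map[key])     // 2
    println(map[renamed]) // null
}
```

Because the hash code of an ImmutableKey can't change after construction, an entry can't be stranded in the map the way it was with MutableKey.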