If you are not new to programming, string must be your best friend so far. In many cases, you may like it more than your spouse or partner. As we all know, you can't live without string; in fact, you can't even complete your application without a single use of string. OK, enough has been said about string, and I am already feeling dizzy from the string usage, just like the JVM in earlier versions. Jokes apart, let's talk about what has changed in Java 9 that will help your application perform better. Although this is an internal change, as an application developer it is important to understand the concept so you know where to focus for performance improvements.
Java 9 has taken a step toward improving string performance. If you have ever come across JDK 6's failed attempt, UseCompressedStrings, then you must have been looking for ways to improve string performance. Since UseCompressedStrings was an experimental feature that was error prone and not designed very well, it was removed in JDK 7. Don't feel bad about it; I know it's terrible, but as always, the golden days eventually come. The JEP team has gone through immense pain to add a compact string feature that will reduce the footprint of string and its related classes.
Compact strings will improve the footprint of string and help in using memory space efficiently. The feature also preserves compatibility for all related Java and native interfaces. The second important feature is Indify String Concatenation, which will optimize string concatenation at runtime.
In this section, we will take a closer look at these two features and their impact on overall application performance.
Why Bother Compressing Strings?
Now that you know a little bit about the heap, let's look at the String class and how strings are represented on the heap. If you dissect the heap of your application, you will notice that there are two objects: one is the Java language String object, which references the second object, a char[] that actually holds the data. The char datatype is a UTF-16 code unit and hence takes up 2 bytes. Let's look at the following example of how two different language strings look:
So you can see that a Latin1 String needs only 1 byte per character, and hence we are losing about 50% of the space here. There is an opportunity to represent it in a denser form and improve the footprint, which will eventually help in speeding up garbage collection as well.
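To make the 50% figure concrete, here is a minimal sketch that measures how many bytes the same text needs in each encoding. The class and method names are made up for illustration, and the sketch only measures encoded sizes; it does not inspect the JVM's internal storage.

```java
import java.nio.charset.StandardCharsets;

class StringFootprintSketch {

    // Bytes needed when every character fits in one byte (Latin-1).
    static int latin1Bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Bytes needed with two bytes per char (UTF-16, no byte-order mark).
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String text = "performance";
        System.out.println(latin1Bytes(text)); // 11
        System.out.println(utf16Bytes(text));  // 22
    }
}
```

For text made up only of Latin-1 characters, the two-byte form is exactly twice as large, which is the space the compact representation aims to win back.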
Now, before making any changes to this, it is important to understand its impact on real-life applications. It is essential to know whether applications mostly use strings that need 1 byte per char or strings that need 2 bytes per char.
To get an answer to this, the JPM team analyzed a lot of heap dumps of real-world data. The result highlighted that in a majority of heap dumps, around 18 percent to 30 percent of the entire heap is consumed by char[] arrays, which come from strings. Also, it was prominent that most strings could be represented with a single byte per char. So, it is clear that if we try to improve the footprint for strings with a single byte, it will give a significant performance boost to many real-life applications.
After having gone through a lot of different solutions, the JPM team finally decided on a strategy to compress a string during its construction. First, optimistically try to compress it to 1 byte per character, and if that is not successful, copy it as 2 bytes per character. There are a few shortcuts possible, for example, the use of a special case encoder like ISO-8859-1, which will always produce 1 byte per character.
This implementation is a lot better than JDK 6's UseCompressedStrings implementation, which was only helpful to a handful of applications as it was compressing strings by repacking and unpacking on every single instance. The performance gain here comes from the fact that the new implementation can work on both forms.
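The compress-at-construction idea can be sketched roughly as follows. This is not the JDK's code; the class, the method names, and the big-endian layout of the 2-byte form are assumptions made for illustration.

```java
class CompactStorageSketch {

    // Hypothetical coder values: 0 for one byte per char, 1 for two bytes per char.
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    // Optimistically try one byte per char; return null if any char does not fit.
    static byte[] tryCompress(char[] chars) {
        byte[] out = new byte[chars.length];
        for (int i = 0; i < chars.length; i++) {
            if (chars[i] > 0xFF) {
                return null; // not representable in a single byte
            }
            out[i] = (byte) chars[i];
        }
        return out;
    }

    // Fallback: copy as two bytes per char (big-endian here for simplicity).
    static byte[] toUtf16(char[] chars) {
        byte[] out = new byte[chars.length * 2];
        for (int i = 0; i < chars.length; i++) {
            out[2 * i] = (byte) (chars[i] >> 8);
            out[2 * i + 1] = (byte) chars[i];
        }
        return out;
    }

    // Compress if possible, otherwise store the 2-byte form.
    static byte[] store(char[] chars) {
        byte[] compact = tryCompress(chars);
        return compact != null ? compact : toUtf16(chars);
    }

    // Which form store(chars) would pick.
    static byte coderFor(char[] chars) {
        return tryCompress(chars) != null ? LATIN1 : UTF16;
    }
}
```

With this sketch, `store("abc".toCharArray())` yields 3 bytes, while a string containing a character outside Latin-1, such as the euro sign, falls back to 2 bytes per char.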
What is the Escape Route?
Even though it all sounds great, it may affect the performance of your application if it only uses strings that need 2 bytes per char. In that case, it makes sense to skip the previously mentioned check and directly store the string as 2 bytes per char. Hence, the JPM team has provided a kill switch, -XX:-CompactStrings, with which you can disable this feature.
What is the Performance Gain?
The previous optimization affects the heap since, as we saw earlier, strings are represented in the heap. Hence, it affects the memory footprint of the application. In order to evaluate the performance, we really need to focus on the garbage collector. We will explore the garbage collection topic later, but for now, let's just focus on the run-time performance.
Indify String Concatenation
I am sure you must be thrilled by the concept of the compact string feature we just learned about. Now let's look at the most common usage of string, which is concatenation. Have you ever wondered what really happens when we try to concatenate two strings? Let's explore. Take the following example:
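As a minimal stand-in for such an example, the following sketch concatenates a few string pieces with an int value. The class, method, and parameter names are hypothetical. The second method spells out, by hand, the kind of StringBuilder chain that javac generated for this expression before Java 9.

```java
class ConcatExample {

    // String concatenation mixing string pieces with an int value.
    static String describe(String name, int count) {
        return "User " + name + " has " + count + " items";
    }

    // Hand-written equivalent of what javac emitted before Java 9:
    // one StringBuilder, one append per piece, then toString().
    static String describeDesugared(String name, int count) {
        return new StringBuilder()
                .append("User ")
                .append(name)
                .append(" has ")
                .append(count)
                .append(" items")
                .toString();
    }
}
```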
In the preceding example, we are trying to concatenate a few strings with the int value. The compiler will then take your awesome strings, initialize a new StringBuilder instance, and then append all these individual strings. Take a look at the following bytecode generated by javac. I have used the ByteCode Outline plugin for Eclipse to visualize the disassembled bytecode of this method. You may download it from http://andrei.gmxhome.de/bytecode/index.html:
Quick Note: How do we interpret this?
- INVOKESTATIC: This is useful for invoking static methods
- INVOKEVIRTUAL: This uses dynamic dispatch for invoking public and protected non-static methods
- INVOKEINTERFACE: This is very similar to INVOKEVIRTUAL except that the method dispatch is based on an interface type
- INVOKESPECIAL: This is useful for invoking constructors, methods of a superclass, and private methods
However, at runtime, thanks to the -XX:+OptimizeStringConcat optimization in the JIT compiler, the JVM can identify the StringBuilder append and toString chains. When a match is identified, it produces low-level code for optimum processing: it computes the length of all the arguments, figures out the final capacity, allocates the storage, copies the strings, and does the in-place conversion of primitives. After this, it hands over this array to the String instance without copying. It is a profitable optimization.
But this also has a few drawbacks in terms of concatenation. One example is that when concatenating a string with a long or double, it will not optimize properly. This is because the compiler has to do .getChar first, which adds overhead.
Also, if you are appending an int to a String, then it works great; however, if you have an incremental operator like i++, then it breaks. The reason behind this is that you need to rewind to the beginning of the expression and re-execute, so you are essentially doing ++ twice. And now the most important change: Java 9 compact strings. The length is now spelled like value.length >> coder, and C2 cannot optimize it as it does not know about this pattern in the IR.
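To see what that length expression means, here is a small sketch. It assumes a coder of 0 for 1-byte storage and 1 for 2-byte storage; the class and method names are made up for illustration.

```java
class CoderLengthSketch {

    // Assumed coder values: the shift amount doubles as the storage kind.
    static final byte LATIN1 = 0; // one byte per char
    static final byte UTF16 = 1;  // two bytes per char

    // Number of chars held in 'value': the byte count shifted right by the coder.
    static int length(byte[] value, byte coder) {
        return value.length >> coder;
    }
}
```

A 6-byte array holds 6 chars in the 1-byte form but only 3 chars in the 2-byte form, so the length is no longer a plain array length that the optimizer can recognize.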
Hence, to solve the problem of compiler optimization and runtime support, we need to control the bytecode, and we cannot expect javac to handle that.
We need to delay the decision of how the concatenation is done until runtime. So can we just have a String.concat method that will do the magic? Well, don't rush into this yet; how would you design the concat method? Let's take a look. One way to go about this is to accept an array of String instances:
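A hypothetical shape for such a method is sketched below. The class name and the body are assumptions for illustration; this is not a JDK API.

```java
class ConcatStringsSketch {

    // Accepts only String instances; callers must convert everything else first.
    static String concat(String... args) {
        StringBuilder sb = new StringBuilder();
        for (String s : args) {
            sb.append(s); // a null element is appended as "null"
        }
        return sb.toString();
    }
}
```

A call such as `ConcatStringsSketch.concat("n=", String.valueOf(42L))` shows the catch: every primitive has to be turned into a String before the method can even be called.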
However, this approach will not work with primitives, as you now need to convert each primitive to a String instance, and also, as we saw earlier, long and double string concatenation will not allow us to optimize it. I know, I can sense the glow on your face, like you got a brilliant idea to solve this painful problem. You are thinking about using Object instances instead of String instances, right? As you know, Object is a catch-all. Let's look at your brilliant idea:
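Sketched with the same hypothetical naming as before (again, not a JDK API), the Object-based variant might look like this:

```java
class ConcatObjectsSketch {

    // Accepts anything; primitives passed here are boxed by the compiler.
    static String concat(Object... args) {
        StringBuilder sb = new StringBuilder();
        for (Object o : args) {
            sb.append(o); // uses String.valueOf(o) under the hood
        }
        return sb.toString();
    }
}
```

A call such as `ConcatObjectsSketch.concat("n=", 42L, ", ratio=", 0.5)` compiles fine, but the long and the double are wrapped in Long and Double objects, and all four arguments travel in a freshly allocated Object[] array.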
First, if you are using Object instances, then the compiler needs to do autoboxing. Additionally, you are passing in a varargs array, so it will not perform optimally. So, are we stuck here? Does it mean we cannot use the preeminent compact string feature with string concatenation? Let's think a bit more; maybe instead of using a method at runtime, let javac handle the concatenation and just give us the optimized bytecode. That sounds like a good idea. Well, wait a minute, I know you are thinking the same thing. What if JDK 10 optimizes this further? Does that mean that when I upgrade to the new JDK, I have to recompile my code and deploy it again? In some cases, it's not a problem; in other cases, it is a big problem. So, we are back to square one.
We need something that can be handled at runtime. OK, so that means we need something that will dynamically invoke the methods. Well, that rings a bell. If we go back in our time machine to the dawn of the JDK 7 era, it gave us invokedynamic. I know you can see the solution; I can sense the sparkle in your eyes. Yes, you are right, invokedynamic can help us here. If you are not aware of invokedynamic, let's spend some time understanding it. Those who have already mastered the topic could skip it, but I would recommend you go through this again.
The invokedynamic feature is one of the most notable features in the history of Java. Rather than being limited to the JVM's fixed bytecodes, we can now define our own way for operations to work. So what is invokedynamic? In simple terms, it is user-definable bytecode. This bytecode (instead of the JVM) determines the execution and optimization strategies. It offers various method pointers and adapters in the form of the method handle APIs. The JVM then works on the pointers given in the bytecode and uses reflection-like method pointers to optimize it. This way, you, as a developer, can get full control over the execution and optimization of code.
It is essentially a mix of user-defined bytecode (which is known as bytecode + bootstrap) and method handles. I know you are also wondering about the method handles--what are they and how to use them? Ok, I heard you, let's talk about method handles.
Method handles provide various pointers, including field, array, and method, to pass data and get results back. With this, you can do argument manipulation and flow control. From JVM's point of view, these are native instructions that it can optimize as if it were bytecode. However, you have the option to programmatically generate this bytecode.
Let's zoom in on method handles and see how it all ties together. The main package is java.lang.invoke, which has MethodHandle, MethodType, and MethodHandles. MethodHandle is the pointer that will be used to invoke the function. MethodType is a representation of the argument types and the return type of the method. The utility class MethodHandles provides the lookup and adapter methods that obtain an instance of MethodHandle and map the arguments.
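To see the three classes working together, here is a minimal sketch that looks up String.concat through a method handle and invokes it. The wrapper class and method name are made up for illustration; the java.lang.invoke calls are the standard ones.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class MethodHandleSketch {

    static String join(String left, String right) {
        try {
            // MethodType: String.concat takes a String and returns a String.
            MethodType type = MethodType.methodType(String.class, String.class);
            // MethodHandles: look up the virtual method; the receiver becomes the first argument.
            MethodHandle concat =
                    MethodHandles.lookup().findVirtual(String.class, "concat", type);
            // MethodHandle: invoke with the exact signature (String, String) -> String.
            return (String) concat.invokeExact(left, right);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Calling `MethodHandleSketch.join("compact", "Strings")` behaves like `"compact".concat("Strings")`, only routed through a handle that the JVM can treat much like ordinary bytecode.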
We won't be going deep in this section, as the aim was just to make you aware of what the invokedynamic feature is and how it works so you will understand the string concatenation solution. So, this is where we get back to our discussion on string concatenation. I know, you were enjoying the invokedynamic discussion, but I guess I was able to give you just enough insight to make you understand the core idea of Indify String Concatenation.
Let's get back to the concatenation part, where we were looking for a solution to concatenate our awesome compact strings. For concatenating the compact strings, we need to take care of the types and the number of arguments, and this is what invokedynamic gives us.
So let's use invokedynamic for concat. Well, not so quick, my friend. There is a fundamental problem with this approach. We cannot just use invokedynamic as it is to solve this problem. Why? Because there is a circular reference. The concat function needs java.lang.invoke, which uses concat. This continues, and eventually you will get a StackOverflowError.
Take a look at the following code:
So if we were to use invokedynamic here, the invokedynamic call would look like this:
There is a need to break the circular reference. However, in the current JDK implementation, you cannot control what java.lang.invoke calls from the complete JDK library. Also, removing the complete JDK library reference from java.lang.invoke has severe side effects. We only need the java.base module for Indify String Concatenation, and if we can figure out a way to just call the java.base module, then it will significantly improve the performance and avoid unpleasant exceptions. I know what you are thinking. We just studied the coolest addition to Java 9, Project Jigsaw. It provides modular source code, and now we can depend on only the java.base module. This solves the biggest problem we were facing in terms of concatenating two strings, primitives, and so on.
After going through a couple of different strategies, the Java Performance Management team has settled on the following strategy:
- Make a call to the toString() method on all reference args.
- Make a call to the length() method or, since all the underlying methods are exposed, just call T.stringSize(T t) on every arg.
- Figure out the coders and call coder() for all reference args.
- Allocate byte[] storage and then copy all args. And then, convert primitives in-place.
- Invoke a private String constructor by handing over the array for concatenation.
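The steps above can be sketched in plain Java as follows. This is only an approximation under stated assumptions: user code cannot reach the private String constructor or coder(), so the sketch uses public charset APIs, and it takes boxed Object arguments instead of converting primitives in place. All names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ConcatStrategySketch {

    // True if every char of s fits in a single byte.
    static boolean isLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    static String concat(Object... args) {
        String[] parts = new String[args.length];
        int length = 0;
        boolean latin1 = true;
        for (int i = 0; i < args.length; i++) {
            parts[i] = String.valueOf(args[i]); // step 1: toString on every arg
            length += parts[i].length();        // step 2: total length
            latin1 &= isLatin1(parts[i]);       // step 3: pick a common coder
        }
        Charset charset = latin1 ? StandardCharsets.ISO_8859_1 : StandardCharsets.UTF_16BE;
        // Step 4: allocate the storage once and copy every part into it.
        byte[] storage = new byte[latin1 ? length : length * 2];
        int pos = 0;
        for (String part : parts) {
            byte[] encoded = part.getBytes(charset);
            System.arraycopy(encoded, 0, storage, pos, encoded.length);
            pos += encoded.length;
        }
        // Step 5: the JDK hands the array to a private constructor without copying;
        // the public constructor used here decodes and copies instead.
        return new String(storage, charset);
    }
}
```

The point of the design is that the final size and encoding are known before any storage is allocated, so there is exactly one array and no intermediate StringBuilder growth.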
With this, we are able to get an optimized string concat in the same code and not in the C2 IR. This strategy gives us 2.9x better performance and 6.4x less garbage.
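From Java 9 onward, the bootstrap method behind this mechanism is exposed as java.lang.invoke.StringConcatFactory. Normally only javac-generated invokedynamic instructions call it, but as a rough sketch you can drive it by hand. The wrapper class, method name, and recipe text below are made up for illustration; in the recipe, the character \1 marks where each argument goes.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

class IndyConcatSketch {

    static String describe(String name, int count) {
        try {
            // The concatenation takes (String, int) and produces a String.
            MethodType type = MethodType.methodType(String.class, String.class, int.class);
            // Ask the bootstrap method to link a call site for this shape and recipe.
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(), "concat", type, "\1 has \1 items");
            MethodHandle target = site.dynamicInvoker();
            // No boxing and no varargs array: the int is passed as an int.
            return (String) target.invokeExact(name, count);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Because the strategy is chosen when the call site is linked, a newer JDK can swap in a better implementation without the application being recompiled.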