Chapter 1. Learning Java 9 Underlying Performance Improvements

Just when you think you have a handle on lambdas and all the performance-related features of Java 8, along comes Java 9. What follows are several of the capabilities that made it into Java 9 that you can use to help improve the performance of your applications. These go beyond byte-level changes like for string storage or garbage collection changes, which you have little control over. Also, ignore implementation changes like those for faster object locking, since you don't have to do anything differently and you automatically get these improvements. Instead, there are new library features and completely new command-line tools that will help you create apps quickly.

In this lesson, we will cover the following topics:

Modular development and its impact on performance
Various string-related performance improvements, including compact string and indify string concatenation
Advancement in concurrency
Various underlying compiler improvements, such as tiered attribution and Ahead-of-Time (AOT) compilation
Security manager improvements
Enhancements in graphics rasterizers

Introducing the New Features of Java 9

In this lesson, we will explore many under the cover improvements to performance that you automatically get by just running your application in the new environment. Internally, string changes also drastically reduce memory footprint requirements for times when you don't need full-scale Unicode support in your character strings. If most of your strings can be encoded either as ISO-8859-1 or Latin-1 (1 byte per character), they'll be stored much more efficiently in Java 9. So, let's dive deep into the core libraries and learn the underlying performance improvements.

Modular Development and Its Impact

In software engineering, modularity is an important concept. From the point of view of performance as well as maintainability, it is important to create autonomous units called modules. These modules can be tied together to make a complete system. The modules provides encapsulation where the implementation is hidden from other modules. Each module can expose distinct APIs that can act as connectors so that other modules can communicate with it. This type of design is useful as it promotes loose coupling, helps focus on singular functionality to make it cohesive, and enables testing it in isolation. It also reduces system complexity and optimizes application development process. Improving performance of each module helps improving overall application performance. Hence, modular development is a very important concept.

I know you may be thinking, wait a minute, isn't Java already modular? Isn't the object-oriented nature of Java already providing modular operation? Well, object-oriented certainly imposes uniqueness along with data encapsulation. It only recommends loose coupling but does not strictly enforce it. In addition, it fails to provide identity at the object level and also does not have any versioning provision for the interfaces. Now you may be asking, what about JAR files? Aren't they modular? Well, although JARs provide modularization to some extent, they don't have the uniqueness that is required for modularization. They do have a provision to specify the version number, but it is rarely used and also hidden in the JAR's manifest file.

So we need a different design from what we already have. In simple terms, we need a modular system in which each module can contain more than one package and offers robust encapsulation compared to the standard JAR files.

This is what Java 9's modular system offers. In addition to this, it also replaces the fallible classpath mechanism by declaring dependencies explicitly. These enhancements improve the overall application performance as developers can now optimize the individual self-contained unit without affecting the overall system.

This also makes the application more scalable and provides high integrity.

Let's look at some of the basics of the module system and how it is tied together. To start off with, you can run the following commands to see how the module system is structured:

$java --list-modules

If you are interested in a particular module, you can simply add the module name at the end of the command, as shown in the following command:

$java --list-modules java.base

The earlier command will show all the exports in packages from the base module. Java base is the core of the system.

This will show all the graphical user interface packages. This will also show requires which are the dependencies:

$java --list-modules java.desktop

So far so good, right? Now you may be wondering, I got my modules developed but how to integrate them together? Let's look into that. Java 9's modular system comes with a tool called JLink. I know you can guess what I am going to say now. You are right, it links a set of modules and creates a runtime image. Now imagine the possibilities it can offer. You can create your own executable system with your own custom modules. Life is going to be a lot more fun for you, I hope! Oh, and on the other hand, you will be able to control the execution and remove unnecessary dependencies.

Let's see how to link modules together. Well, it's very simple. Just run the following command:

$jlink --module-path $JAVA_HOME/jmods:mlib --add-modules java.desktop --output myawesomeimage

This linker command will link all the modules for you and create a runtime image. You need to provide a module path and then add the module that you want to generate a figure and give a name. Isn't it simple?

Now, let's check whether the previous command worked properly or not. Let's verify the modules from the figure:

$myawesomeimage/bin/java --list-modules

The output looks like this:

With this, you will now be able to distribute a quick runtime with your application. It is awesome, isn't it? Now you can see how we moved from a somewhat monolithic design to a self-contained cohesive one. Each module contains its own exports and dependencies and JLink allows you to create your own runtime. With this, we got our modular platform.

Note that the aim of this section is to just introduce you to the modular system. There is a lot more to explore but that is beyond the scope of this book. In this book, we will focus on the performance enhancement areas.

Quick Introduction to Modules

I am sure that after reading about the modular platform, you must be excited to dive deep into the module architecture and see how to develop one. Hold your excitement please, I will soon take you on a journey to the exciting world of modules.

As you must have guessed, every module has a property name and is organized by packages. Each module acts as a self-contained unit and may have native code, configurations, commands, resources, and so on. A module's details are stored in a file named module-info.java, which resides in the root directory of the module source code. In that file, a module can be defined as follows:

module <name>{
}

In order to understand it better, let's go through an example. Let's say, our module name is PerformanceMonitor. The purpose of this module is to monitor the application performance. The input connectors will accept method names and the required parameters for that method. This method will be called from our module to monitor the module's performance. The output connectors will provide performance feedback for the given module. Let's create a module-info.java file in the root directory of our performance application and insert the following section:

module com.java9highperformance.PerformanceMonitor{
}

Awesome! You got your first module declaration. But wait a minute, it does not do anything yet. Don't worry, we have just created a skeleton for this. Let's put some flesh on the skeleton. Let's assume that our module needs to communicate with our other (magnificent) modules, which we have already created and named--PerformanceBase, StringMonitor, PrimitiveMonitor, GenericsMonitor, and so on. In other words, our module has an external dependency. You may be wondering, how would we define this relationship in our module declaration? Ok, be patient, this is what we will see now:

module com.java9highperformance.PerformanceMonitor{
    exports com.java9highperformance.StringMonitor;
    exports com.java9highperformance.PrimitiveMonitor;
    exports com.java9highperformance.GenericsMonitor;
    requires com.java9highperformance.PerformanceBase;
    requires com.java9highperformance.PerformanceStat;
    requires com.java9highperformance.PerformanceIO;
}

Yes, I know you have spotted two clauses, that is, exports and requires. And I am sure you are curious to know what they mean and why we have them there. We'll first talk about these clauses and what they mean when used in the module declaration:

exports: This clause is used when your module has a dependency on another module. It denotes that this module exposes only public types to other modules and none of the internal packages are visible. In our case, the module com.java9highperformance.PerformanceMonitor has a dependency on com.java9highperformance.StringMonitor, com.java9highperformance.PrimitiveMonitor, and com.java9highperformance.GenericsMonitor. These modules export their API packages com.java9highperformance.StringMonitor, com.java9highperformance.PrimitiveMonitor, and com.java9highperformance.GenericsMonitor, respectively.
requires: This clause denotes that the module depends upon the declared module at both compile and runtime. In our case, com.java9highperformance.PerformanceBase, com.java9highperformance.PerformanceStat, and com.java9highperformance.PerformanceIO modules are required by our com.java9highperformance.PerformanceMonitor module. The module system then locates all the observable modules to resolve all the dependencies recursively. This transitive closure gives us a module graph which shows a directed edge between two dependent modules.

Note

Note: Every module is dependent on java.base even without explicitly declaring it. As you already know, everything in Java is an object.

Now you know about the modules and their dependencies. So, let's draw a module representation to understand it better. The following figure shows the various packages that are dependent on com.java9highperformance.PerformanceMonitor.

Modules at the bottom are exports modules and modules on the right are requires modules.

Now let's explore a concept called readability relationship. Readability relationship is a relationship between two modules where one module is dependent on another module. This readability relationship is a basis for reliable configuration. So in our example, we can say com.java9highperformance.PerformanceMonitor reads com.java9highperformance.PerformanceStat.

Let's look at com.java9highperformance.PerformanceStat module's description file module-info.java:

module com.java9highperformance.PerformanceStat{
    requires transitive java.lang;
}

This module depends on the java.lang module. Let's look at the PerformanceStat module in detail:

package com.java9highperformance.PerformanceStat;
import java.lang.*;

public Class StringProcessor{
    public String processString(){...}
}

In this case, com.java9highperformance.PerformanceMonitor only depends on com.java9highperformance.PerformanceStat but com.java9highperformance.PerformanceStat depends on java.lang. The com.java9highperformance.PerformanceMonitor module is not aware of the java.lang dependency from the com.java9highperformance.PerformanceStat module. This type of problem is taken care of by the module system. It has added a new modifier called transitive. If you look at com.java9highperformance.PerformanceStat, you will find it requires transitive java.lang. This means that any one depending on com.java9highperformance.PerformanceStat reads on java.lang.

See the following graph which shows the readability graph:

Now, in order to compile the com.java9highperformance.PerformanceMonitor module, the system must be able to resolve all the dependencies. These dependencies can be found from the module path. That's obvious, isn't that? However, don't misunderstand the classpath with the module path. It is a completely different breed. It doesn't have the issues that the packages have.

String Operations Performance

If you are not new to programming, string must be your best friend so far. In many cases, you may like it more than your spouse or partner. As we all know, you can't live without string, in fact, you can't even complete your application without a single use of string. OK, enough has been expressed about string and I am already feeling dizzy by the string usage just like JVM in the earlier versions. Jokes apart, let's talk about what has changed in Java 9 that will help your application perform better. Although this is an internal change, as an application developer, it is important to understand the concept so you know where to focus for performance improvements.

Java 9 has taken a step toward improving string performance. If you have ever come across JDK 6's failed attempt UseCompressedStrings, then you must be looking for ways to improve string performance. Since UseCompressedStrings was an experimental feature that was error prone and not designed very well, it was removed in JDK 7. Don't feel bad about it, I know it's terrible but as always the golden days eventually come. The JEP team has gone through immense pain to add a compact string feature that will reduce the footprint of string and its related classes.

Compact strings will improve the footprint of string and help in using memory space efficiently. It also preserves compatibility for all related Java and native interfaces. The second important feature is Indify String Concatenation, which will optimize a string at runtime.

In this section, we will take a closure look at these two features and their impact on overall application performance.

Compact String

Before we talk about this feature, it is important to understand why we even care about this. Let's dive deep into the underworld of JVM (or as any star wars fan would put it, the dark side of the Force). Let's first understand how JVM treats our beloved string and that will help us understand this new shiny compact string improvement. Let's enter into the magical world of heap. And as a matter of fact, no performance book is complete without a discussion of this mystical world.

The World of Heap

Each time JVM starts, it gets some memory from the underlining operating system. It is separated into two distinct regions called heap space and Permgen. These are home to all your application's resources. And as always with all good things in life, this home is limited in size. This size is set during the JVM initialization; however, you can increase or decrease this by specifying the JVM parameters, -Xmx, and -XX:MaxPermSize.

The heap size is divided into two areas, the nursery or young space and the old space. As the name suggests, the young space is home to new objects. This all sounds great but every house needs a cleanup. Hence, JVM has the most efficient cleaner called garbage collector (most efficient? Well... let's not get into that just yet). As any productive cleaner would do, the garbage collector efficiently collects all the unused objects and reclaims memory. When this young space gets filled up with new objects, the garbage collector takes charge and moves any of those who have lived long enough in the young space to the old space. This way, there is always room for more objects in the young space.

And in the same way, if the old space becomes filled up, the garbage collector reclaims the memory used.

Why Bother Compressing Strings?

Now you know a little bit about heap, let's look at the String class and how strings are represented on heap. If you dissect the heap of your application, you will notice that there are two objects, one is the Java language Stringobject that references the second object char[] that actually handles the data. The char datatype is UTF-16 and hence takes up to 2 bytes. Let's look at the following example of how two different language strings look:

2 byte per char[]
Latin1 String : 1 byte per char[]

So you can see that Latin1 String only consumes 1 byte, and hence we are losing about 50% of the space here. There is an opportunity to represent it in a more dense form and improve the footprint, which will eventually help in speeding up garbage collection as well.

Now, before making any changes to this, it is important to understand its impact on real-life applications. It is essential to know whether applications use 1 byte per char[] strings or 2 bytes per char[] strings.

To get an answer to this, the JPM team analyzed a lot of heap dumps of real-world data. The result highlighted that a majority of heap dumps have around 18 percent to 30 percent of the entire heap consumed by chars[], which come from string. Also, it was prominent that most strings were represented by a single byte per char[]. So, it is clear that if we try to improve the footprint for strings with a single byte, it will give significant performance boost to many real-life applications.

What Did They Do?

After having gone through a lot of different solutions, the JPM team has finally decided to come up with a strategy to compress string during its construction. First, optimistically try to compress in 1 byte and if it is not successful, copy it as 2 bytes. There are a few shortcuts possible, for example, the use of a special case encoder like ISO-8851-1, which will always spit 1 byte.

This implementation is a lot better than JDK 6's UseCompressedStrings implementation, which was only helpful to a handful of applications as it was compressing string by repacking and unpacking on every single instance. Hence the performance gain comes from the fact that it can now work on both the forms.

What is the Escape Route?

Even though it all sounds great, it may affect the performance of your application if it only uses 2 byte per char[]string. In that case, it make sense not to use the earlier mentioned, check, and directly store string as 2 bytes per char[]. Hence, the JPM team has provided a kill switch --XX: -CompactStrings using which you can disable this feature.

What is the Performance Gain?

The previous optimization affects the heap as we saw earlier that the string is represented in the heap. Hence, it is affecting the memory footprint of the application. In order to evaluate the performance, we really need to focus on the garbage collector. We will explore the garbage collection topic later, but for now, let's just focus on the run-time performance.

Indify String Concatenation

I am sure you must be thrilled by the concept of the compact string feature we just learned about. Now let's look at the most common usage of string, which is concatenation. Have you ever wondered what really happens when we try to concatenate two strings? Let's explore. Take the following example:

public static String getMyAwesomeString(){
    int javaVersion = 9;
    String myAwesomeString = "I love " + "Java " + javaVersion + " high       performance book by Mayur Ramgir";
    return myAwesomeString;
}

In the preceding example, we are trying to concatenate a few strings with the int value. The compiler will then take your awesome strings, initialize a new StringBuilder instance, and then append all these individuals strings. Take a look at the following bytecode generation by javac. I have used the ByteCode Outline plugin for Eclipse to visualize the disassembled bytecode of this method. You may download it from http://andrei.gmxhome.de/bytecode/index.html:

// access flags 0x9
public static getMyAwesomeString()Ljava/lang/String;
  L0
  LINENUMBER 10 L0
  BIPUSH 9
  ISTORE 0
  L1
  LINENUMBER 11 L1
  NEW java/lang/StringBuilder
  DUP
  LDC "I love Java "
  INVOKESPECIAL java/lang/StringBuilder.<init> (Ljava/lang/String;)V
  ILOAD 0
  INVOKEVIRTUAL java/lang/StringBuilder.append (I)Ljava/lang/StringBuilder;
  LDC " high performance book by Mayur Ramgir"
  INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
  INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
  ASTORE 1
  L2
  LINENUMBER 12 L2
  ALOAD 1
  ARETURN
  L3
  LOCALVARIABLE javaVersion I L1 L3 0
  LOCALVARIABLE myAwesomeString Ljava/lang/String; L2 L3 1
  MAXSTACK = 3
  MAXLOCALS = 2

Quick Note: How do we interpret this?

INVOKESTATIC: This is useful for invoking static methods
INVOKEVIRTUAL: This uses of dynamic dispatch for invoking public and protected non-static methods
INVOKEINTERFACE: This is very similar to INVOKEVIRTUAL except that the method dispatch is based on an interface type
INVOKESPECIAL: This is useful for invoking constructors, methods of a superclass, and private methods

However, at runtime, due to the inclusion of -XX:+-OptimizeStringConcat into the JIT compiler, it can now identify the append of StringBuilder and the toString chains. In case the match is identified, produce low-level code for optimum processing. Compute all the arguments' length, figure out the final capacity, allocate the storage, copy the strings, and do the in place conversion of primitives. After this, handover this array to the String instance without copying. It is a profitable optimization.

But this also has a few drawbacks in terms of concatenation. One example is that in case of a concatenating string with long or double, it will not optimize properly. This is because the compiler has to do .getChar first which adds overhead.

Also, if you are appending int to String, then it works great; however, if you have an incremental operator like i++, then it breaks. The reason behind this is that you need to rewind to the beginning of the expression and re-execute, so you are essentially doing ++ twice. And now the most important change in Java 9 compact string. The length spell like value.length >> coder; C2 cannot optimize it as it does not know about the IR.

Hence, to solve the problem of compiler optimization and runtime support, we need to control the bytecode, and we cannot expect javac to handle that.

We need to delay the decision of which concatenation can be done at runtime. So can we have just method String.concat which will do the magic. Well, don't rush into this yet as how would you design the method concat. Let's take a look. One way to go about this is to accept an array of the String instance:

public String concat(String... n){
    //do the concatenation
}

However, this approach will not work with primitives as you now need to convert each primitive to the Stringinstance and also, as we saw earlier, the problem is that long and double string concatenation will not allow us to optimize it. I know, I can sense the glow on your face like you got a brilliant idea to solve this painful problem. You are thinking about using the Object instance instead of the String instance, right? As you know the Objectinstance is catch all. Let's look at your brilliant idea:

public String concat(Object... n){
    //do the concatenation
}

First, if you are using the Object instance, then the compiler needs to do autoboxing. Additionally, you are passing in the varargs array, so it will not perform optimally. So, are we stuck here? Does it mean we cannot use the preeminent compact string feature with string concatenation? Let's think a bit more; maybe instead of using the method runtime, let javac handle the concatenation and just give us the optimized bytecode. That sounds like a good idea. Well, wait a minute, I know you are thinking the same thing. What if JDK 10 optimizes this further? Does that mean, when I upgrade to the new JDK, I have to recompile my code again and deploy it again? In some cases, its not a problem, in other cases, it is a big problem. So, we are back to square one.

We need something that can be handled at runtime. Ok, so that means we need something which will dynamically invoke the methods. Well, that rings a bell. If we go back in our time machine, at the dawn of the era of JDK 7 it gave us invokedynamic. I know you can see the solution, I can sense the sparkle in your eyes. Yes, you are right, invokedynamic can help us here. If you are not aware of invokedynamic, let's spend some time to understand it. For those who have already mastered the topic, you could skip it, but I would recommend you go through this again.

Invokedynamic

The invokedynamic feature is the most notable feature in the history of Java. Rather than having a limit to JVM bytecode, we now can define our own way for operations to work. So what is invokedynamic? In simple terms, it is the user-definable bytecode. This bytecode (instead of JVM) determines the execution and optimization strategies. It offers various method pointers and adapters which are in the form of method handling APIs. The JVM then work on the pointers given in the bytecode and use reflection-like method pointers to optimize it. This way, you, as a developer, can get full control over the execution and optimization of code.

It is essentially a mix of user-defined bytecode (which is known as bytecode + bootstrap) and method handles. I know you are also wondering about the method handles--what are they and how to use them? Ok, I heard you, let's talk about method handles.

Method handles provide various pointers, including field, array, and method, to pass data and get results back. With this, you can do argument manipulation and flow control. From JVM's point of view, these are native instructions that it can optimize as if it were bytecode. However, you have the option to programmatically generate this bytecode.

Let's zoom in to the method handles and see how it all ties up together. The main package's name is java.lang.invoke, which has MethodHandle, MethodType, and MethodHandles. MethodHandle is the pointer that will be used to invoke the function. MethodType is a representation of a set of arguments and return value coming from the method. The utility class MethodHandles will act as a pointer to a method which will get an instance of MethodHandle and map the arguments.

We won't be going in deep for this section, as the aim was just to make you aware of what the invokedynamic feature is and how it works so you will understand the string concatenation solution. So, this is where we get back to our discussion on string concatenation. I know, you were enjoying the invokedynamic discussion, but I guess I was able to give you just enough insight to make you understand the core idea of Indify String Concatenation.

Let's get back on the concatenation part where we were looking for a solution to concatenate our awesome compact strings. For concatenating the compact strings, we need to take care of types and the number of types of methods and this is what the invokedynamic gives us.

So let's use invokedynamic for concat. Well, not so quick, my friend. There is a fundamental problem with this approach. We cannot just use invokedynamic as it is to solve this problem. Why? Because there is a circular reference. The concat function needs java.lang.invoke, which uses concat. This continues, and eventually you will get StackOverflowError.

Take a look at the following code:

String concat(int i, long l, String s){
    return s + i + l
}

So if we were to use invokedynamic here, the invokedynamic call would look like this:

InvokeDynamic #0: makeConcat(String, int, long)

There is a need to break the circular reference. However, in the current JDK implementation, you cannot control what java.invoke calls from the complete JDK library. Also, removing the complete JDK library reference from java.invoke has severe side effects. We only need the java.base module for Indify String Concatenation, and if we can figure out a way to just call the java.base module, then it will significantly improve the performance and avoid unpleasant exceptions. I know what you are thinking. We just studied the coolest addition to Java 9, Project Jigsaw. It provides modular source code and now we can only accept the java.base module. This solves the biggest problem we were facing in terms of concatenating two strings, primitives, and so on.

After going through a couple of different strategies, the Java Performance Management team has settled on the following strategy:

Make a call to the toString() method on all reference args.
Make a call to the tolength() method or since all the underlying methods are exposed, just call T.stringSize(T t) on every args.
Figure out the coders and call coder() for all reference args.
Allocate byte[] storage and then copy all args. And then, convert primitives in-place.
Invoke a private constructor String by handing over the array for concatenation.

With this, we are able to get an optimized string concat in the same code and not in C2 IR. This strategy gives us 2.9x better performance and 6.4x less garbage.

Storing Interned Strings in CDS Archives

The main goal of this feature is to reduce memory footprint caused by creating new instances of string in every JVM process. All the classes that are loaded in any JVM process can be shared with other JVM processes via Class Data Sharing (CDS) archives.

Oh, I did not tell you about CDS. I think it's important to spend some time to understand what CDS is, so you can understand the underlying performance improvement.

Many times, small applications in particular spend a comparatively long time on startup operations. To reduce this startup time, a concept called CDS was introduced. CDS enables sharing of a set of classes loaded from the system JAR file into a private internal representation during the JRE installation. This helps a lot as then any further JVM invocations can take advantage of these loaded classes' representation from the shared archive instead of loading these classes again. The metadata related to these classes is shared among multiple JVM processes.

CDS stores strings in the form of UTF-8 in the constant pool. When a class from these loaded classes begins the initialization process, these UTF-8 strings are converted into String objects on demand. In this structure, every character in every confined string takes 2 bytes in the String object and 1 byte to 3 bytes in the UTF-8, which essentially wastes memory. Since these strings are created dynamically, different JVM processes cannot share these strings.

Shared strings need a feature called pinned regions in order to make use of the garbage collector. Since the only HotSpot garbage collector that supports pinning is G1; it only works with the G1 garbage collector.

Concurrency Performance

Multithreading is a very popular concept. It allows programs to run multiple tasks at the same time. These multithreaded programs may have more than one unit which can run concurrently. Every unit can handle a different task keeping the use of available resources optimal. This can be managed by multiple threads that can run in parallel.

Java 9 improved contended locking. You may be wondering what is contended locking. Let's explore. Each object has one monitor that can be owned by one thread at a time. Monitors are the basic building blocks of concurrency. In order for a thread to execute a block of code marked as synchronized on an object or a synchronized method declared by an object, it must own this object's monitor. Since there are multiple threads trying to get access to the mentioned monitor, JVM needs to orchestrate the process and only allow one thread at a time. It means the rest of threads go in a wait state. This monitor is then called contended. Because of this provision, the program wastes time in the waiting state.

Also, Java Virtual Machine (JVM) does some work orchestrating the lock contention. Additionally, it has to manage threads, so once the existing thread finishes its execution, it can allow a new thread to go in. This certainly adds overhead and affects performance adversely. Java 9 has taken a few steps to improve in this area. The provision refines the JVM's orchestration, which will ultimately result in performance improvement in highly contested code.

The following benchmarks and tests can be used to check the performance improvements of contented Java object monitors:

CallTimerGrid (This is more of a stress test than a benchmark)
Dacapo-bach (earlier dacapo2009)
_ avrora
_ batik
_ fop
_ h2
_ luindex
_ lusearch
_ pmd
_ sunflow
_ tomcat
_ tradebeans
_ tradesoap
_ xalan
DerbyContentionModelCounted
HighContentionSimulator
LockLoops-JSR166-Doug-Sept2009 (earlier LockLoops)
PointBase
SPECjbb2013-critical (earlier specjbb2005)
SPECjbb2013-max
specjvm2008
volano29 (earlier volano2509)

Compiler Improvements

Several efforts have been made to improve the compiler's performance. In this section, we will focus on the improvements to the compiler side.

Tiered Attribution

The first and foremost change providing compiler improvement is related to Tiered Attribution (TA). This change is more related to lambda expressions. At the moment, the type checking of poly expression is done by type checking the same tree multiple times against different targets. This process is called Speculative Attribution (SA), which enables the use of different overload resolution targets to check a lambda expression.

This way of type checking, although a robust technique, adversely affects performance significantly. For example, with this approach, n number of overload candidates check against the same argument expression up to n * 3 once per overload phase, strict, loose, and varargs. In addition to this, there is one final check phase. Where lambda returns a poly method call results in combinatorial explosion of attribution calls, this causes a huge performance problem. So we certainly need a different method of type checking for poly expressions.

The core idea is to make sure that a method call creates bottom-up structural types for each poly argument expression with every single details, which will be needed to execute the overload resolution applicability check before performing the overload resolution.

So in summary, the performance improvement was able to achieve an attribute of a given expression by decreasing the total number of tries.

Ahead-of-Time Compilation

The second noticeable change for compiler improvement is Ahead-of-Time compilation. If you are not familiar with the term, let's see what AOT is. As you probably know, every program in any language needs a runtime environment to execute. Java also has its own runtime which is known as Java Virtual Machine (JVM). The typical runtime that most of us use is a bytecode interpreter, which is JIT compiler as well. This runtime is known as HotSpot JVM.

This HotSpot JVM is famous for improving performance by JIT compilation as well as adaptive optimization. So far so good. However, this does not work well in practice for every single application. What if you have a very light program, say, a single method call? In this case, JIT compilation will not help you much. You need something that will load up faster. This is where AOT will help you. With AOT as opposed to JIT, instead of compiling to bytecode, you can compile into native machine code. The runtime then uses this native machine code to manage calls for new objects into mallocs as well as file access into system calls. This can improve performance.

Security Manager Improvements

Ok, let's talk about security. If you are not one of those who cares about application security over pushing more features in a release, then the expression on your face may be like Uh! What's that? If you are one those, then let's first understand the importance of security and find a way to consider this in your application development tasks. In today's SaaS-dominated world, everything is exposed to the outside world. A determined individual (a nice way of saying, a malicious hacker), can get access to your application and exploit the security holes you may have introduced through your negligence. I would love to talk about application security in depth as this is another area I am very much interested in. However, application security is out of the scope of this book. The reason we are talking about it here is that the JPM team has taken an initiative to improve the existing security manager. Hence, it is important to first understand the importance of security before talking about the security manager.

Hopefully, this one line of description may have generated secure programming interest in you. However, I do understand that sometimes you may not have enough time to implement a complete secure programming model due to tight schedules. So, let's find a way which can fit with your tight schedule. Let's think for a minute; is there any way to automate security? Can we have a way to create a blueprint and ask our program to stay within the boundaries? Well, you are in luck, Java does have a feature called security manager. It is nothing but a policy manager that defines a security policy for the application. It sounds exciting, doesn't it? But what does this policy look like? And what does it contain? Both are fair questions to ask. This security policy basically states actions that are dangerous or sensitive in nature. If your application does not comply with this policy, then the security manager throws SecurityException. On the other side, you can have your application call this security manager to learn about the permitted actions. Now, let's look at the security manager in detail.

In case of a web applet, a security manager is provided by the browser, or the Java Web Start plugin runs this policy. In many cases, applications other than web applets run without a security manager unless those applications implement one. It's a no brainer to say that if there is no security manager and no security policy attached, the application acts without restrictions.

Now we know a little about the security manager, let's look at the performance improvement in this area. As per the Java team, there may be a possibility that an application running with a security manager installed degrades performance by 10 percent to 15 percent. However, it is not possible to remove all the performance bottlenecks but narrowing this gap can assist in improving not only security but also performance.

The Java 9 team looked at some of the optimizations, including the enforcement of security policy and the evaluation of permissions, which will help improve the overall performance of using a security manager. During the performance testing phase, it was highlighted that even though the permission classes are thread safe, they show up as a HotSpot. Numerous improvements have been made to decrease thread contention and improve throughput.

Computing the hashcode method of java.security.CodeSource has been improved to use a string form of the code source URL to avoid potentially expensive DNS lookups. Also, the checkPackageAccess method of java.lang.SecurityManager, which contains the package checking algorithm, has been improved.

Some other noticeable changes in security manager improvements are as follows:

The first noticeable change is that using ConcurrentHashMap in place of Collections.synchronizedMap helps improving throughput of the Policy.implie method. Look at the following graph, taken from the OpenJDK site, which highlights the significant increase in the throughput with ConcurrentHashMap:
In addition to this, HashMap, which had been used for maintaining internal collection of CodeSource in java.security.SecureClassLoader, has been replaced by ConcurrentHashMap.
There are a few other small improvements like an improvement in the throughput by removing the compatibility code from the getPermissions method (CodeSource), which synchronizes on identities.
Another significant gain in performance is achieved using ConcurrentHashMap instead of HashMap surrounded by synchronized blocks in the permission checking code, which yielded in greater thread performance.

Graphics Rasterizers

If you are into Java 2D and using OpenJDK, you will appreciate the efforts taken by the Java 9 team. Java 9 is mainly related to a graphics rasterizer, which is part of the current JDK. OpenJDK uses Pisces, whereas Oracle JDK uses Ductus. Oracle's closed-source Ductus rasterizer performs better than OpenJDK's Pisces.

These graphics rasterizers are useful for anti-aliased rendering except fonts. Hence, for a graphics-intensive application, the performance of this rasterizer is very important. However, Pisces is failing in many fronts and its performance problems are very visible. Hence, the team has decided to replace this with a different rasterizer called Marlin Graphics Renderer.

Marlin is developed in Java and, most importantly, it is the fork of the Pisces rasterizer. Various tests have been done on it and the results are very promising. It consistently performs better than Pisces. It demonstrates multithreaded scalability and even outperforms the closed-source Ductus rasterizer for a single-threaded application.

Summary

In this lesson, we have seen some of the exciting features that will improve your application's performance without making any effort from your end.

In the next lesson, we will learn about JShell and the Ahead-of-Time (AOT) compiler. We will also learn about Read-Eval-Print Loop (REPL) tool.

Assessments

JLink is a ___________ of Java 9 modular system.
What is the relationship between two modules where one module is dependent on another module?
1. Readability relationship
2. Operability relationship
3. Modular relationship
4. Entity relationship
State whether True or False: Each time JVM starts, it gets some memory from the underlining operating system.
Which of the following perform some work orchestrating the lock contention?
1. Pinned regions
2. Readability relationship
3. Java Virtual Machine
4. Class data sharing
Which of the following enables the use of different overload resolution targets to check a lambda expression?
1. Tiered attribution
2. HotSpot JVM
3. Speculative attribution
4. Permgen