Storing interned strings in CDS archives [JEP 250]
The way strings are stored in, and read back from, Class Data Sharing (CDS) archives is inefficient: it is excessively time consuming and wastes memory. The following diagram illustrates how Java stores interned strings in a CDS archive:
The inefficiency stems from the current storage schema. Specifically, when the Class Data Sharing tool dumps the classes into the shared archive file, the constant pools containing CONSTANT_String items hold those strings in a UTF-8 representation.
Note
UTF-8 is a variable-length character encoding standard built on 8-bit units: each character is encoded in one to four bytes.
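To make the variable-length point concrete, the following minimal sketch prints how many bytes standard UTF-8 needs for a few characters. The class name Utf8Lengths and the helper utf8Length are hypothetical names chosen for illustration; they are not part of the JDK or of the CDS tooling.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Lengths {

    // Hypothetical helper: number of bytes the standard UTF-8 encoding of s occupies.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // 1 byte  (U+0041)
        System.out.println(utf8Length("\u00e9"));       // 2 bytes (U+00E9)
        System.out.println(utf8Length("\u20ac"));       // 3 bytes (U+20AC)
        System.out.println(utf8Length("\uD83D\uDE00")); // 4 bytes (U+1F600)
    }
}
```

Plain ASCII text, which makes up most identifiers and many string literals, therefore costs one byte per character in this form.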
The problem
With the current use of UTF-8, the strings must be converted to string objects, that is, instances of the java.lang.String class. This conversion takes place on demand, which can slow the system down and use memory unnecessarily. The processing time is extremely short, but the memory usage cannot be overlooked. Every character in an interned string requires at least 3 bytes of memory...
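One way to read the figure of at least 3 bytes is that the same character is held twice: once in the UTF-8 form kept with the constant pool (1 byte or more) and once in the materialized String object. The sketch below imitates that on-demand conversion and adds up the two payloads. It is an illustration only, not how the JVM implements CDS: the class InternedStringCost and the helper payloadBytes are hypothetical, the 2 bytes per character assumes a char-array-backed String, and object headers are ignored.

```java
import java.nio.charset.StandardCharsets;

public class InternedStringCost {

    // Hypothetical helper: approximate payload bytes when both the UTF-8
    // copy and the String copy of the same text are kept in memory.
    static int payloadBytes(String s) {
        int utf8Bytes = s.getBytes(StandardCharsets.UTF_8).length; // archived form
        int charBytes = s.length() * 2; // assumption: 2 bytes per char in the String
        return utf8Bytes + charBytes;
    }

    public static void main(String[] args) {
        // Stand-in for the UTF-8 bytes stored with a CONSTANT_String item.
        byte[] stored = "hello".getBytes(StandardCharsets.UTF_8);

        // On-demand conversion: decode into a java.lang.String, then intern it.
        String materialized = new String(stored, StandardCharsets.UTF_8).intern();

        // intern() returns the canonical instance, the same one the literal uses.
        System.out.println(materialized == "hello"); // true

        // 5 UTF-8 bytes + 10 char bytes for a 5-character ASCII string.
        System.out.println(payloadBytes("hello")); // 15
    }
}
```

For ASCII text this works out to exactly 3 bytes per character, and more for characters that need longer UTF-8 sequences.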