After watching Brian Goetz’s presentation
on Valhalla, I started thinking more seriously about how value classes work. There are a few things
that are exciting, but a few that are pretty concerning too. Below are my thoughts; please
reach out if I missed something!
Equality (==) is No Longer Cheap
Pre-Valhalla, checking whether two variables were the same was cheap: a single word comparison.
Valhalla changes that to depend on the runtime type of the object. This also implies an extra
null check, since the VM can’t load the class word eagerly. With a segfault handler
to try and skip the null check, the performance of == would no longer be consistent.
This isn’t the end of the world for high performance computing, but it doesn’t seem like that
big of a win. Everyone’s code bears the cost.
It appears most of the performance optimizations available to Valhalla are not yet in, so it’s
hard to tell if the memory layout improvements are worth the expense.
Minor: IdentityHashMap now is a performance liability. Don’t accidentally put in a value object
or else.
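To make the extra work concrete, here is a rough sketch of the decisions a value-aware == has to make. It is only an illustration, not how HotSpot implements the comparison: stock Java has no value classes, so an ordinary record stands in for one, the field walk uses reflection, and the names SubstitutabilitySketch and sameValue are made up for this example. Primitive fields are compared through their boxed equals, which only approximates a bitwise comparison.

```java
import java.lang.reflect.Method;
import java.lang.reflect.RecordComponent;

final class SubstitutabilitySketch {
    // An ordinary record standing in for a value class (assumption: no `value` modifier available).
    record Point(int x, int y, int z) {}

    // Hypothetical helper: the steps a value-aware `a == b` would need to take.
    static boolean sameValue(Object a, Object b) {
        if (a == b) return true;                    // the old single-word comparison
        if (a == null || b == null) return false;   // the extra null check
        Class<?> type = a.getClass();               // needs the class word
        if (type != b.getClass()) return false;
        if (!type.isRecord()) return false;         // identity object: the pointer check was final
        try {
            // Field-by-field comparison, recursing into nested record fields.
            for (RecordComponent component : type.getRecordComponents()) {
                Method accessor = component.getAccessor();
                accessor.setAccessible(true);
                Object fieldA = accessor.invoke(a);
                Object fieldB = accessor.invoke(b);
                boolean same = component.getType().isPrimitive()
                        ? fieldA.equals(fieldB)
                        : sameValue(fieldA, fieldB);
                if (!same) return false;
            }
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2, 3);
        Point q = new Point(1, 2, 3);
        System.out.println(p == q);                           // false today: two identities
        System.out.println(sameValue(p, q));                  // true
        System.out.println(sameValue(p, new Point(1, 2, 4))); // false
    }
}
```

Every branch here is work that the single-word comparison never had to do, which is the cost this section is worried about.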
AtomicReference
How value classes will interact with AtomicReference seems to be an issue. While value objects
can be passed around by value, they can also be passed by reference, depending on the VM.
However, AtomicReference is defined in terms of == for ops like compareAndSet. Value objects
no longer have an atomic comparison. What will happen? Consider the following sequence of
events:
value record Point(int x, int y, int z) {}
static final AtomicReference<Point> POINT =
new AtomicReference<>(new Point(1, 2, 3));
A regular AtomicReference would return false for T1, despite the value being the expected value
before, during, and after the call. We can use it to resolve a race. A value based object though:
what could it do?
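For reference, this is how compareAndSet behaves today with an ordinary (identity) record. It is a small sketch that runs on stock Java, so Point is declared without the value modifier, and CasIdentityDemo and its method names are invented for the example. It shows only the current identity-based semantics, not what a Valhalla VM would do.

```java
import java.util.concurrent.atomic.AtomicReference;

final class CasIdentityDemo {
    // Identity record: same shape as the Point above, minus the `value` modifier.
    record Point(int x, int y, int z) {}

    // The expected value is equal() to the current one, but it is a different object.
    static boolean casWithEqualCopy() {
        AtomicReference<Point> ref = new AtomicReference<>(new Point(1, 2, 3));
        return ref.compareAndSet(new Point(1, 2, 3), new Point(4, 5, 6));
    }

    // The expected value is the very same reference that is stored.
    static boolean casWithSameReference() {
        Point original = new Point(1, 2, 3);
        AtomicReference<Point> ref = new AtomicReference<>(original);
        return ref.compareAndSet(original, new Point(4, 5, 6));
    }

    public static void main(String[] args) {
        System.out.println(casWithEqualCopy());     // false: compared with ==, not equals()
        System.out.println(casWithSameReference()); // true
    }
}
```

With identity objects, the first call fails even though the contents match, and that failure is what lets two racing writers tell each other apart.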
Where is the Class Word?
Without object identity, most of the object header isn’t needed. The identity hash code,
synchronization bits, and probably any GC bits aren’t needed any more. But, what about
valueObj.getClass() ?
I can’t see an easy way of implementing it. If the class word is adjacent to the object state in
memory, we don’t get nearly the memory savings we wanted.
If we had a single class pointer for an array of value objects, it still doesn’t help. Consider:
value record Point(int x, int y, int z) {}
Object[] points =
new Object[]{new Point(1, 2, 3), new Point(4, 5, 6)};
for (Object p : points) { System.out.println(p.getClass()); }
The VM would have to either prove every object in the array has the same class, or else store it
per object.
It would be great to see how the class pointer is elided in real life.
Intrusive Linked Lists and Trees
Value objects’ state is implicitly final, which means they can’t really be used for mutable data
structures. One of the things I miss from my C days is having a value included in a linked list
node. This saves space, but doesn’t appear to work for value objects. The same goes for trees.
I haven’t thought extensively about it, but denser data-structures don’t seem to be served by the
Valhalla update.
Values Really Don’t Have Identities.
Ending on a positive note, one of the things I liked about JEP 401
was the attention called to mutating a value object. Specifically:
Field mutation is closely tied to identity: an object whose field is being updated is the same
object before and after the update
Many years ago, I had an argument with a coworker about Go’s non-reentrant mutex vs. Java’s
reentrant synchronizers. As most [civil] arguments go, both of us learned something new: Go’s
mutexes can be locked multiple times. Behold!
package main
import (
"fmt"
"sync"
)
func main() {
var m sync.Mutex
m.Lock()
m = *(new(sync.Mutex))
m.Lock()
defer m.Unlock()
fmt.Println("Hello")
}
This code shows the problem. The mutex becomes a new object upon reassignment, despite being
the same variable. If the second .Lock() call is removed, this code actually panics, despite
the Lock call coming before the Unlock, and there being the same number of Locks and Unlocks.
Java is saying the same thing here. Mutability implies identity.
Conclusion
At this point, I think the Valhalla branch is interesting, but not enough to carry its own weight.
Without being able to see the awesome performance and memory improvements, it’s hard to tell if
the language and VM complexity are justified.
import java.time.Duration;
import java.time.Instant;

public class Timer {
    public static void main(String[] args) throws Exception {
        Instant start = Instant.now();
        System.err.println("Starting at " + start);
        Thread.sleep(Duration.ofSeconds(10));
        Instant end = Instant.now();
        System.out.println("Slept for " + Duration.between(start, end));
    }
}
On the surface, it looks correct. The code tries to sleep for 10 seconds, and then prints out how long it actually slept for. However, there is a subtle bug: it’s using calendar time instead of monotonic time.
Instant.now() is Calendar Time
Instant.now() seems like a good API to use. It’s typesafe, modern, and has nanosecond resolution! All good, right? The problem is that the time comes from the computer’s clock, which can move around unpredictably. To show this, I recorded running this program:
As we can see, the program takes a little over 10 seconds to run. However, what would happen if the system clock were to be adjusted? Let’s look:
Time went backwards and our program didn’t measure the duration correctly! This can happen during daylight savings time switches, users changing their system clock manually, and even when returning from sleep or hibernate power states.
Use System.nanoTime to Measure Duration
To avoid clock drift, we can use System.nanoTime(). This API returns a timestamp that is arbitrary, but is consistent during the run of our program. Here’s how to use it:
import java.time.Duration;

public class Timer {
    public static void main(String[] args) throws Exception {
        long start = System.nanoTime();
        System.err.println("Starting at " + start);
        Thread.sleep(Duration.ofSeconds(10));
        long end = System.nanoTime();
        System.out.println("Slept for " + Duration.ofNanos(end - start));
    }
}
We don’t get to use the object oriented time APIs, but those weren’t meant for recording duration anyways. It feels a little more raw to use long primitives, but the result is always correct. If you are looking for a typesafe way to do this, consider using Guava’s Stopwatch class.
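If pulling in Guava isn’t an option, a few lines are enough to hide the raw long values. This is a minimal sketch and not Guava’s implementation; the ElapsedTimer class and its method names are made up for illustration.

```java
import java.time.Duration;

final class ElapsedTimer {
    // Captured once at construction; only differences between nanoTime() values are meaningful.
    private final long startNanos = System.nanoTime();

    private ElapsedTimer() {}

    static ElapsedTimer start() {
        return new ElapsedTimer();
    }

    // Time since start(), measured on the monotonic clock.
    Duration elapsed() {
        return Duration.ofNanos(System.nanoTime() - startNanos);
    }

    public static void main(String[] args) throws InterruptedException {
        ElapsedTimer timer = ElapsedTimer.start();
        Thread.sleep(50);
        System.out.println("Slept for " + timer.elapsed());
    }
}
```

The caller only ever sees a Duration, so there is no way to accidentally mix a nanoTime() reading with a wall-clock timestamp.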
The nanoTime() call is great in lots of situations:
Logging how long a Function takes to run
Calculating how long to wait in an exponential back-off retry loop
Picking a time to schedule future work
Recording in metrics how long a Function takes to run
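As one example from the list above, here is a sketch of an exponential back-off loop that keeps its deadline on the monotonic clock. The BackoffRetry class, the retry method, and its parameters are hypothetical names chosen for this example. Note that it always subtracts two nanoTime() readings rather than comparing them directly, since the raw values are arbitrary and may wrap around.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

final class BackoffRetry {
    // Retries `attempt` until it succeeds or `timeoutNanos` has elapsed, doubling the wait each time.
    static boolean retry(BooleanSupplier attempt, long initialDelayNanos, long timeoutNanos) {
        long deadline = System.nanoTime() + timeoutNanos;
        long delay = initialDelayNanos;
        while (true) {
            if (attempt.getAsBoolean()) {
                return true;
            }
            // Subtract the readings; comparing raw nanoTime() values is not overflow-safe.
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                return false;
            }
            try {
                TimeUnit.NANOSECONDS.sleep(Math.min(delay, remaining));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
            // Double the delay, capped at the overall timeout to avoid overflow.
            delay = delay > timeoutNanos / 2 ? timeoutNanos : delay * 2;
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        boolean ok = retry(() -> ++calls[0] >= 3, 1_000_000L, 2_000_000_000L);
        System.out.println("succeeded=" + ok + " after " + calls[0] + " attempts");
    }
}
```

Because the deadline never touches the calendar clock, a system clock adjustment in the middle of the loop can neither cut the retries short nor make them run forever.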
What about System.currentTimeMillis()?
While this function worked well for a long time, it has been superseded by Instant.now(). I usually see other programmers use this function because they only care about millisecond granularity. However, it suffers from the same clock drift problem as Instant.now().
Recently I’ve been working on improving the ergonomics of my tracing library PerfMark. One of the interesting things I noticed is that the JVM was loading some classes I didn’t expect it to. Let’s find out why!
“That’s Weird”
PerfMark works on much older JVMs, but strives to be as fast as possible. To accomplish this, the code bootstraps itself based on what advanced JVM features are available (MethodHandles, VarHandles, Hidden Classes, etc.). Because of this, the library needs to avoid loading classes that aren’t usable. In the event that no advanced features are available, the library safely disables itself. The code shows how to load optionally available classes safely:
// SecretPerfMarkImpl.PerfMarkImpl
static {
Generator gen = null;
Throwable problem = null;
try {
Class<?> clz =
Class.forName(
"io.perfmark.java7.SecretMethodHandleGenerator$MethodHandleGenerator");
gen = clz.asSubclass(Generator.class).getConstructor().newInstance();
} catch (Throwable t) {
problem = t;
}
if (gen != null) {
generator = gen;
} else {
generator = new NoopGenerator();
}
// More Initialization code.
}
As we can see, the static initializer tries to load a special class that requires Java 7 or higher. If it isn’t available, we fall back to the no-op implementation. The actual types of the classes are not as important, except that Generator is an abstract class. We can check to see what is actually loaded by passing the
-Xlog:class+load=info
flag to the JVM. This lets us see what classes are loaded and when:
This is kind of strange. The MethodHandleGenerator class is definitely available, but it loads after the NoopGenerator. Worse, it seems like both classes end up being loaded. What’s going on?
“When Flags Aren’t Enough”
Let’s ratchet up the verbosity to see what the loader is doing:
-Xlog:class+resolve=debug
Running with this shows that the class loading is caused by the verification step the JVM performs when loading classes:
While the JVM is well documented, it is hard to penetrate for someone who isn’t making their own implementation. What we want to know is why this class is relevant to verification. Rather than go over the Specification with a fine tooth comb, let’s just put a breakpoint into the JVM itself!
Instrumenting the JVM
First, let’s get a copy:
git clone https://github.com/openjdk/jdk.git
After fumbling with the configuration arguments, let’s try out a slowdebug build.
I (ab)used the JDK that Gradle downloaded for me, but the rest of the configuration is pretty regular. I am using slowdebug and with-native-debug-symbols because for some reason GDB was unable to find the function names in the back trace. I used --enable-headless-only because I don’t have all the header files locally. Okay, let’s build!
CONF=linux-x86_64-server-slowdebug make
This takes about 8 minutes on my Skylake processor. Soon enough, we have a fully functional JDK. Because this is a hack, I manually modified the java command Gradle builds for me to call GDB:
$ JAVA_HOME=~/git/jdk/build/linux-x86_64-server-slowdebug/jdk/ \
./build/install/perfmark-examples/bin/perfmark-examples
GNU gdb (Debian 12.1-4+b1) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/carl/git/jdk/build/linux-x86_64-server-slowdebug/jdk//bin/java...
(gdb)
Alright, we can boot our JVM with GDB, and we are ready to see what it’s doing. However, we need a way to see when the NoopGenerator class really gets loaded. After some walking through the code, I found that verification happens in a file called verifier.cpp. Let’s add a print in Verifier::trace_class_resolution:
void Verifier::trace_class_resolution(
Klass* resolve_class, InstanceKlass* verify_class) {
assert(verify_class != nullptr, "Unexpected null verify_class");
ResourceMark rm;
Symbol* s = verify_class->source_file_name();
const char* source_file = (s != nullptr ? s->as_C_string() : nullptr);
const char* verify = verify_class->external_name();
const char* resolve = resolve_class->external_name();
if (strstr(resolve, "NoopGenerator") != nullptr) {
log_info(class, load)("Found NoopGenerator: %s", resolve);
}
// print in a single call to reduce interleaving between threads
if (source_file != nullptr) {
log_debug(class, resolve)(
"%s %s %s (verification)", verify, resolve, source_file);
} else {
log_debug(class, resolve)("%s %s (verification)", verify, resolve);
}
}
As of this writing, this happens on line 129. Let’s rebuild and re-run:
At this point, we need to make GDB not pester us as we are stepping through. The JVM uses segmentation faults to implement efficient null checks (the source of NullPointerException), so we want to avoid being notified of those. It also uses other signals (e.g. SIGUSR2) for thread pausing and resuming, which we aren’t interested in:
Reading symbols from /home/carl/git/jdk/build/linux-x86_64-server-slowdebug/jdk//bin/java...
(gdb) handle SIGSEGV noprint nostop
Signal Stop Print Pass to program Description
SIGSEGV No No Yes Segmentation fault
(gdb) handle SIGUSR2 noprint nostop
Signal Stop Print Pass to program Description
SIGUSR2 No No Yes User defined signal 2
(gdb)
Okay, let’s insert a breakpoint:
(gdb) break verifier.cpp:129
No source file named verifier.cpp.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (verifier.cpp:129) pending.
(gdb)
Alright, let’s run!
(gdb) run
Starting program: /home/carl/git/jdk/build/linux-x86_64-server-slowdebug/jdk/bin/java
// .... Lots of output
Thread 2 "java" hit Breakpoint 1, Verifier::trace_class_resolution
(resolve_class=0x8000c44d8, verify_class=0x8000c4000)
at /home/carl/git/jdk/src/hotspot/share/classfile/verifier.cpp:129
129 log_info(class, load)("Found NoopGenerator: %s", resolve);
(gdb)
Now that we have hit our breakpoint, let’s see how the class loader actually got here.
(gdb) bt
#0 Verifier::trace_class_resolution (resolve_class=0x8000c44d8, verify_class=0x8000c4000) at /home/carl/git/jdk/src/hotspot/share/classfile/verifier.cpp:129
#1 0x00007ffff6c99e82 in VerificationType::resolve_and_check_assignability (klass=0x8000c4000, name=0x7ffff05705c8, from_name=0x7ffff0570d48, from_field_is_protected=false, from_is_array=false,
from_is_object=true, __the_thread__=0x7ffff0032830) at /home/carl/git/jdk/src/hotspot/share/classfile/verificationType.cpp:85
#2 0x00007ffff6c9a055 in VerificationType::is_reference_assignable_from (this=0x7ffff571b5a0, from=..., context=0x7ffff571c870, from_field_is_protected=false, __the_thread__=0x7ffff0032830)
at /home/carl/git/jdk/src/hotspot/share/classfile/verificationType.cpp:122
#3 0x00007ffff6b1827f in VerificationType::is_assignable_from (this=0x7ffff571b5a0, from=..., context=0x7ffff571c870, from_field_is_protected=false, __the_thread__=0x7ffff0032830)
at /home/carl/git/jdk/src/hotspot/share/classfile/verificationType.hpp:289
#4 0x00007ffff6cad7ad in StackMapFrame::pop_stack (this=0x7ffff571bdf0, type=..., __the_thread__=0x7ffff0032830) at /home/carl/git/jdk/src/hotspot/share/classfile/stackMapFrame.hpp:236
#5 0x00007ffff6ca8321 in ClassVerifier::verify_field_instructions (this=0x7ffff571c870, bcs=0x7ffff571bd80, current_frame=0x7ffff571bdf0, cp=..., allow_arrays=true, __the_thread__=0x7ffff0032830)
at /home/carl/git/jdk/src/hotspot/share/classfile/verifier.cpp:2367
#6 0x00007ffff6ca439a in ClassVerifier::verify_method (this=0x7ffff571c870, m=..., __the_thread__=0x7ffff0032830) at /home/carl/git/jdk/src/hotspot/share/classfile/verifier.cpp:1693
#7 0x00007ffff6c9d1c6 in ClassVerifier::verify_class (this=0x7ffff571c870, __the_thread__=0x7ffff0032830) at /home/carl/git/jdk/src/hotspot/share/classfile/verifier.cpp:645
#8 0x00007ffff6c9b499 in Verifier::verify (klass=0x8000c4000, should_verify_class=true, __the_thread__=0x7ffff0032830) at /home/carl/git/jdk/src/hotspot/share/classfile/verifier.cpp:201
#9 0x00007ffff6444437 in InstanceKlass::verify_code (this=0x8000c4000, __the_thread__=0x7ffff0032830) at /home/carl/git/jdk/src/hotspot/share/oops/instanceKlass.cpp:752
#10 0x00007ffff6444a91 in InstanceKlass::link_class_impl (this=0x8000c4000, __the_thread__=0x7ffff0032830) at /home/carl/git/jdk/src/hotspot/share/oops/instanceKlass.cpp:873
#11 0x00007ffff64444c2 in InstanceKlass::link_class (this=0x8000c4000, __the_thread__=0x7ffff0032830) at /home/carl/git/jdk/src/hotspot/share/oops/instanceKlass.cpp:758
#12 0x00007ffff644539d in InstanceKlass::initialize_impl (this=0x8000c4000, __the_thread__=0x7ffff0032830) at /home/carl/git/jdk/src/hotspot/share/oops/instanceKlass.cpp:1027
#13 0x00007ffff64443a2 in InstanceKlass::initialize (this=0x8000c4000, __the_thread__=0x7ffff0032830) at /home/carl/git/jdk/src/hotspot/share/oops/instanceKlass.cpp:740
#14 0x00007ffff659bc52 in find_class_from_class_loader (env=0x7ffff0032b50, name=0x7ffff0572b78, init=1 '\001', loader=..., protection_domain=..., throwError=0 '\000', __the_thread__=0x7ffff0032830)
at /home/carl/git/jdk/src/hotspot/share/prims/jvm.cpp:3537
#15 0x00007ffff658c395 in JVM_FindClassFromCaller (env=0x7ffff0032b50, name=0x7ffff571edd0 "io/perfmark/impl/SecretPerfMarkImpl$PerfMarkImpl", init=1 '\001', loader=0x7ffff571ef08, caller=0x7ffff571ef00)
at /home/carl/git/jdk/src/hotspot/share/prims/jvm.cpp:825
#16 0x00007ffff54bff29 in Java_java_lang_Class_forName0 (env=0x7ffff0032b50, this=0x7ffff571eef0, classname=0x7ffff571ef18, initialize=1 '\001', loader=0x7ffff571ef08, caller=0x7ffff571ef00)
at /home/carl/git/jdk/src/java.base/share/native/libjava/Class.c:145
#17 0x00007fffe855aaad in ?? ()
#18 0x0000000000000002 in ?? ()
#19 0x00007fffbc187288 in ?? ()
#20 0x0000000000000000 in ?? ()
I highlighted the relevant methods. It seems like the JVM is keen on making sure that NoopGenerator is assignable to Generator, in the event that line is executed. With this in mind, we can guess how to fix it.
Cheating the Verifier
So far, the class hierarchy looks something like:
public abstract class Generator {
// ...
}
final class NoopGenerator extends Generator {
// ...
}
Recall the original function that triggers this behavior:
// SecretPerfMarkImpl.PerfMarkImpl
static {
Generator gen = ...
if (gen != null) {
generator = gen;
} else {
generator = new NoopGenerator();
}
// More Initialization code.
}
The JVM wants to be sure that in case NoopGenerator is loaded, it will actually be a Generator. The problem is that it is eagerly doing the verification of the type, even if it is seldom loaded. With this in mind, we can come up with a solution.
Warning: Hacks
Do not write the following code without a pretty beefy comment:
static {
Generator gen = ...
if (gen != null) {
generator = gen;
} else {
generator = (Generator) (Object) new NoopGenerator();
}
// More Initialization code.
}
This double cast (Object masking?) modifies the byte code to doubt itself about its assignment. Note the change in the byte code:
This minor change causes the verifier to not double check the type until runtime. I won’t re-paste all the debug output, but let’s see that the change had an effect. We will use JMH with the classloader profiler to measure the changes:
BEFORE
Benchmark Mode Cnt Score Error Units
forName_init ss 400 8730.958 ± 142.833 us/op
forName_init:·class.load ss 400 1.166 ± 0.015 classes/sec
forName_init:·class.load.norm ss 400 72.105 ± 0.112 classes/op
forName_init:·class.unload ss 400 ≈ 0 classes/sec
forName_init:·class.unload.norm ss 400 ≈ 0 classes/op
AFTER
Benchmark Mode Cnt Score Error Units
forName_init ss 400 8411.036 ± 148.305 us/op
forName_init:·class.load ss 400 1.136 ± 0.016 classes/sec
forName_init:·class.load.norm ss 400 71.040 ± 0.113 classes/op
forName_init:·class.unload ss 400 ≈ 0 classes/sec
forName_init:·class.unload.norm ss 400 ≈ 0 classes/op
As we can see, the number of classes loaded drops by about one. We won’t read too much into the speed-up, since the error bars are pretty high already.
Conclusion
The JVM checks our safety by making sure our classes are sound, but sometimes we want to defer those checks until later. This post shows how to diagnose such cases, and how to avoid doing unnecessary class loads.