
Carl Mastrangelo

A programming and hobby blog.


Java’s Mysterious Interrupt

One of the lesser known parts of Java is InterruptedException. With its vague-sounding name and non-committal Javadoc, few people are able to figure out what they need to do with it. Since I myself didn’t really grok it, I decided to put together a post explaining how interruption works. If you are looking for a more practical guide on how to handle it, I recommend Dealing with InterruptedException by Brian Goetz. This post is more focused on how interruption works.

Thread Interruption in Java

At the core, thread interruption is a way to wake up certain JDK method calls. When you call Thread.interrupt(), it does exactly one thing:

  1. Sets the interrupted bit on the thread to true.

Pretty simple, right? In reality it isn’t so simple. The JDK authors also added secondary behavior that is triggered by calling interrupt():

  1. If the thread that was just interrupted is currently executing Object.wait(), Thread.join(), or Thread.sleep(), or any of their overloads, then those methods will clear the interrupted bit and throw an InterruptedException.

  2. If the thread that was just interrupted is currently executing LockSupport.park() or any of its sibling calls, the park call will return early.

  3. More vaguely, if the thread is currently blocked on network I/O (e.g. Selector.select()), the network channel (or Socket) will be closed. Note that this is only true of the JDK implementation of network calls, not in general.
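The first two behaviors are easy to see from plain Java. Below is a minimal sketch (the class and method names are made up for illustration) in which a thread interrupts itself and then calls Thread.sleep() and LockSupport.parkNanos(). Under the rules above, sleep() should throw immediately and clear the bit, while the park call should return early and leave the bit set.

```java
import java.util.concurrent.locks.LockSupport;

class InterruptBasics {
    /** Returns {bit set after interrupt(), sleep() threw, bit still set after the throw}. */
    static boolean[] sleepClearsFlag() {
        Thread.currentThread().interrupt();               // set the bit on ourselves
        boolean setBefore = Thread.currentThread().isInterrupted();
        boolean threw = false;
        try {
            Thread.sleep(10_000);                         // throws at once: the bit is already set
        } catch (InterruptedException e) {
            threw = true;
        }
        boolean setAfter = Thread.currentThread().isInterrupted();
        return new boolean[] {setBefore, threw, setAfter};
    }

    /** Returns true if parkNanos() came back early and left the bit set. */
    static boolean parkKeepsFlag() {
        Thread.currentThread().interrupt();
        long start = System.nanoTime();
        LockSupport.parkNanos(10_000_000_000L);           // asks for 10 s, returns right away
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        boolean stillSet = Thread.interrupted();          // read and clear, so the caller is left clean
        return elapsedMillis < 5_000 && stillSet;
    }

    public static void main(String[] args) {
        boolean[] r = sleepClearsFlag();
        System.out.println("set=" + r[0] + " threw=" + r[1] + " setAfterThrow=" + r[2]);
        System.out.println("park returned early with bit set: " + parkKeepsFlag());
    }
}
```

The difference matters in practice: code that catches InterruptedException has lost the bit, while code built on park() has to check the bit itself to learn why it woke up.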

That’s it! All other interruption behavior comes from these mechanisms. All the Future class behavior, all the Executor behavior, all the Lock and Condition behavior regarding interrupts is built on top. Before we dig into how it really works though, let’s take a quick detour to ask “Why?”

Why Use Interrupts or InterruptedException?

I don’t have a single answer here, but generally you shouldn’t. There is enough confusion that the chances of getting it right are slim. Also, library code you depend on probably doesn’t handle it properly. With that in mind, here are some legitimate use cases:

While there are occasional uses for interruption, it should be avoided. The fundamental problem with interrupts is that they are effectively globals. They are scoped to the thread, rather than to the task. You don’t really want to interrupt the thread, you’d rather interrupt the Future or whatever unit of work.
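To make the thread-versus-task distinction concrete, here is a small sketch using a Future as the unit of work. The caller never touches a Thread; it cancels the task, and the executor decides which thread to interrupt. TaskCancelDemo and cancelInterruptsTask are hypothetical names, and the single-thread pool is just an assumption to keep the example short.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class TaskCancelDemo {
    /** Returns true if cancelling the Future interrupted the task's sleep. */
    static boolean cancelInterruptsTask() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch sawInterrupt = new CountDownLatch(1);
        try {
            Future<?> task = pool.submit(() -> {
                started.countDown();
                try {
                    Thread.sleep(60_000);                 // stand-in for real blocking work
                } catch (InterruptedException e) {
                    sawInterrupt.countDown();
                }
            });
            started.await();                              // make sure the task is running
            task.cancel(true);                            // true = may interrupt the worker thread
            return sawInterrupt.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("task saw the interrupt: " + cancelInterruptsTask());
    }
}
```

Under the hood this is still a thread interrupt, which is exactly the problem: the pool thread is shared, so the task has to handle the interrupt before the thread moves on to other work.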

That said, it would be better if we understood how it worked before we ward it off. That way it’s an informed decision if and when we use it.

Where Do Interrupts Come From?

Since we are often on the receiving side, a natural question is where did the interrupt come from? The answer, as the trope goes, is that it came from inside the house. It comes from our own code. Unlike other threading wake-ups, interrupts are not spurious and do have an origin.

While hinted above, pressing Ctrl-C doesn’t actually cause an interrupt. SIGINT and other signals don’t result in interrupt() being called. (We could certainly add this behavior, though.) Since calling interrupt() is the only way to cause an interrupt, we can reasonably audit all places where it came from.
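As a rough sketch of what adding that behavior could look like, the code below registers a shutdown hook that interrupts the main thread. It assumes the JVM runs shutdown hooks when it receives SIGINT, which is the usual behavior but can be changed with JVM flags. CtrlCInterrupt and interruptingHook are made-up names, and the two-second grace period is arbitrary.

```java
class CtrlCInterrupt {
    /** Builds (but does not register) a hook thread that interrupts the target thread. */
    static Thread interruptingHook(Thread target) {
        return new Thread(() -> {
            target.interrupt();
            try {
                target.join(2_000);       // give the target a moment to clean up before exit
            } catch (InterruptedException ignored) {
                // the JVM is shutting down anyway
            }
        }, "interrupt-on-shutdown");
    }

    public static void main(String[] args) {
        Thread mainThread = Thread.currentThread();
        Runtime.getRuntime().addShutdownHook(interruptingHook(mainThread));
        try {
            Thread.sleep(60_000);         // press Ctrl-C while this is sleeping
            System.out.println("finished without being interrupted");
        } catch (InterruptedException e) {
            System.out.println("interrupted by the shutdown hook");
        }
    }
}
```

Even here the interrupt still originates from a call to interrupt() in our own code, so the auditing argument holds.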

How Do Interrupts Work?

To see what calling Thread.interrupt() does, let’s start following the call path. Starting in Thread.java:

    public void interrupt() {
        if (this != Thread.currentThread())
            checkAccess();

        synchronized (blockerLock) {
            Interruptible b = blocker;
            if (b != null) {
                interrupt0();           // Just to set the interrupt flag
                b.interrupt(this);
                return;
            }
        }
        interrupt0();
    }

// ...

private native void interrupt0();

Seems pretty terse. That blocker variable looks interesting, so let’s come back to that later. For now, it seems calling interrupt0() is how it gets triggered. To find that, we need to find the native version of this code. Looking in Thread.c:

// ...
    {"interrupt0",       "()V",        (void *)&JVM_Interrupt},
// ...

Following the indirection again we find JVM_Interrupt is in jvm.cpp:

JVM_ENTRY(void, JVM_Interrupt(JNIEnv* env, jobject jthread))
  ThreadsListHandle tlh(thread);
  JavaThread* receiver = NULL;
  bool is_alive = tlh.cv_internal_thread_to_JavaThread(jthread, &receiver, NULL);
  if (is_alive) {
    Thread::interrupt(receiver);
  }
JVM_END

… which we chase to thread.cpp:

void Thread::interrupt(Thread* thread) {
  os::interrupt(thread);
}

… which finally ends at os_posix.cpp:

void os::interrupt(Thread* thread) {
  OSThread* osthread = thread->osthread();
  if (!osthread->interrupted()) {
    osthread->set_interrupted(true);
    ParkEvent * const slp = thread->_SleepEvent ;
    if (slp != NULL) slp->unpark() ;
  }
  ((JavaThread*)thread)->parker()->unpark();
  ParkEvent * ev = thread->_ParkEvent ;
  if (ev != NULL) ev->unpark() ;
}

We’ll stop here, as we’ve reached the part we are interested in. The code checks whether the interrupted bit is set and, if not, sets it. It then wakes the thread up by unparking it. As we can see, it is safe to interrupt the thread from multiple callsites. Additionally, if we search the source code for set_interrupted(true), we find this is the only place that calls it. This backs up the earlier claim that interruptions only come from calling Thread.interrupt().
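The "check the bit, set it if needed" logic means interrupts do not accumulate. A quick sketch from the Java side (InterruptIdempotent is a hypothetical name): a thread interrupts itself twice, then reads the bit twice with Thread.interrupted(), which returns the bit and clears it.

```java
class InterruptIdempotent {
    /** Interrupts the current thread twice, then reads and clears the bit twice. */
    static boolean[] interruptTwice() {
        Thread self = Thread.currentThread();
        self.interrupt();
        self.interrupt();                        // the bit is already set; nothing is queued
        boolean first = Thread.interrupted();    // reads the bit and clears it
        boolean second = Thread.interrupted();   // the bit was cleared by the previous call
        return new boolean[] {first, second};
    }

    public static void main(String[] args) {
        boolean[] r = interruptTwice();
        System.out.println("first read=" + r[0] + ", second read=" + r[1]);
    }
}
```

Two calls to interrupt() followed by one read leave the thread uninterrupted, so callers cannot use interrupts to count events.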

How does interrupting wake up Thread.sleep()?

How does this wake up threads that are sleeping? If we follow the call chain from Thread.sleep, we actually end up back in the same os_posix.cpp file, edited for brevity:

int os::sleep(Thread* thread, jlong millis, bool interruptible) {
  ParkEvent * const slp = thread->_SleepEvent ;

  if (interruptible) {
    jlong prevtime = javaTimeNanos();

    for (;;) {
      if (os::is_interrupted(thread, true)) {
        return OS_INTRPT;
      }
      jlong newtime = javaTimeNanos();
      if (newtime - prevtime >= 0) {
        millis -= (newtime - prevtime) / NANOSECS_PER_MILLISEC;
      }
      if (millis <= 0) {
        return OS_OK;
      }
      prevtime = newtime;
      JavaThread *jt = (JavaThread *) thread;
      ThreadBlockInVM tbivm(jt);
      OSThreadWaitState osts(jt->osthread(), false /* not Object.wait() */);

      slp->park(millis);

      jt->check_and_wait_while_suspended();
    }
  }
}

The salient point is that Java uses the parking mechanism to put threads to sleep, and unparks them using interrupt. There is tight integration between the JDK and the JVM to make this all work.
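Seen from Java, the park and unpark dance looks like this. The sketch below (names are illustrative, and the 100 ms head start is an arbitrary assumption) starts a thread that asks to sleep for a minute, interrupts it shortly afterwards, and reports how long the sleep actually lasted.

```java
class SleepWakeDemo {
    /** Returns how many milliseconds a 60 s sleep lasted before the interrupt, or -1 if it was never interrupted. */
    static long interruptedSleepMillis() {
        long[] elapsed = {-1};
        Thread sleeper = new Thread(() -> {
            long start = System.nanoTime();
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                elapsed[0] = (System.nanoTime() - start) / 1_000_000;
            }
        });
        sleeper.setDaemon(true);                 // do not keep the JVM alive if something goes wrong
        sleeper.start();
        try {
            Thread.sleep(100);                   // let the sleeper (probably) reach sleep()
            sleeper.interrupt();                 // sets the bit and unparks the sleeper
            sleeper.join();                      // also makes elapsed[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return elapsed[0];
    }

    public static void main(String[] args) {
        System.out.println("sleep lasted about " + interruptedSleepMillis() + " ms");
    }
}
```

If the interrupt happens to land before the sleeper reaches sleep(), the result is the same: the loop in os::sleep checks the bit before parking.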

How does interrupting wake up Object.wait()?

I won’t trace through the JVM source code any more, but the process is pretty similar. What’s interesting is that wait() already has its own wake-up mechanism, the Object.notify() call. Normally, to call notify(), we need to hold the monitor for that object. Eerily, thread interruption can wake up the wait() call without holding the monitor at all. This is how interruption wakes up an object monitor:

void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
  // ...

  // check if the notification happened
  if (!WasNotified) {
    // no, it could be timeout or Thread.interrupt() or both
    // check for interrupt event, otherwise it is timeout
    if (interruptible && Thread::is_interrupted(Self, true) && !HAS_PENDING_EXCEPTION) {
      TEVENT(Wait - throw IEX from epilog);
      THROW(vmSymbols::java_lang_InterruptedException());
    }
  }

  // NOTE: Spurious wake up will be consider as timeout.
  // Monitor notify has precedence over thread interrupt.
}

This is where InterruptedException comes from when we call wait(). Waiting on an object puts the thread to sleep until something interesting happens. Usually the interesting event is user defined, and triggered by calling notify(). In the case of interruption though, there is no counterparty working with wait(). Any outsider can wake up the waiting thread, assuming your SecurityManager allows it.
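Here is a minimal sketch of that "outsider" case, with hypothetical names. The interrupting thread never synchronizes on the lock and nobody calls notify(), yet the waiter wakes up with an InterruptedException. The latch is only there so the example interrupts after the waiter has taken the monitor.

```java
import java.util.concurrent.CountDownLatch;

class WaitInterruptDemo {
    /** Returns true if a thread that never takes the monitor woke the waiter via interrupt(). */
    static boolean outsiderWakesWaiter() {
        Object lock = new Object();
        CountDownLatch aboutToWait = new CountDownLatch(1);
        boolean[] sawException = {false};
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                aboutToWait.countDown();
                try {
                    while (true) {
                        lock.wait();             // nobody ever calls notify()
                    }
                } catch (InterruptedException e) {
                    sawException[0] = true;
                }
            }
        });
        waiter.setDaemon(true);
        waiter.start();
        try {
            aboutToWait.await();
            waiter.interrupt();                  // note: no synchronized (lock) around this call
            waiter.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sawException[0] && !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("outsider woke the waiter: " + outsiderWakesWaiter());
    }
}
```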

Occasionally, calling interrupt() on a waiting thread doesn’t seem to work, and the thread seemingly ignores it. This is because wait() asserts that when it completes, it will be holding the monitor. For example, consider the following snippet of code:

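Thread thread1 = new Thread() {
  @Override
  public void run() {
    synchronized (lock) {
      try {
        while (!ready) {
          lock.wait();
        }
        // use data
      } catch (InterruptedException e) {
        // wait() has already reacquired lock by the time we get here
        Thread.currentThread().interrupt();
      }
    }
  }
};
thread1.start();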

Here, wait() assures us lock is held. But, what happens if our thread is interrupted? The waiter must reacquire the lock before it can throw InterruptedException. This guarantees that when exiting the synchronized block, the lock is correctly released. Where this gets problematic is when the lock is held by another thread. Consider Thread 2:

synchronized (lock) {
  thread1.interrupt();
  thread1.join();
}

Uh oh, a deadlock! This code is trying to terminate thread1 but is holding the lock. Thread 1 is not holding the lock, so how could this be a deadlock? Well, as soon as thread 1 is interrupted, it is going to try to grab the lock and throw an exception. It tries to quickly exit, but since thread 2 is currently holding the lock, it has to wait. Meanwhile, thread 2 is trying to join thread 1, waiting for it to wrap things up. Since they both have something the other one wants, and are unwilling to give up, neither will make progress.

The example above is a little contrived, but we can see it was written with the best intentions. Thread 2 doesn’t even need to hold the lock for very long to cause the interrupt to not work. Instead, it could call thread1.interrupt() and then go do some other work while holding the lock to cause similar behavior.
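The stall can be observed without actually deadlocking. In this sketch (hypothetical names, and a bounded join(300) instead of the unbounded join() from the text), the interrupting thread holds the lock for a while and checks whether the waiter has managed to exit. It should still be alive until the lock is released.

```java
class LockedInterruptDemo {
    /** Returns {waiter still alive while we hold the lock, waiter finished after release}. */
    static boolean[] interruptWhileHoldingLock() {
        Object lock = new Object();
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                try {
                    while (true) {
                        lock.wait();
                    }
                } catch (InterruptedException e) {
                    // Reached only after the monitor has been reacquired.
                }
            }
        });
        waiter.setDaemon(true);
        waiter.start();
        boolean aliveWhileLocked = false;
        try {
            synchronized (lock) {
                waiter.interrupt();
                waiter.join(300);                // bounded, so this cannot deadlock
                aliveWhileLocked = waiter.isAlive();
            }
            waiter.join(5_000);                  // lock released: the waiter can now finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new boolean[] {aliveWhileLocked, !waiter.isAlive()};
    }

    public static void main(String[] args) {
        boolean[] r = interruptWhileHoldingLock();
        System.out.println("alive while locked=" + r[0] + ", finished after release=" + r[1]);
    }
}
```

The simplest fix for the original snippet is to call interrupt() and join() outside the synchronized block.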

How does interrupting wake up Selector.select()?

This is probably the most interesting case of interruption. Selector.select() is a blocking call that waits for some network condition to be true. It’s used to implement non-blocking I/O. On Linux, it is implemented as a call to the epoll_wait syscall. What’s curious about it is that somehow Java interrupts seem to be able to wake it up.

But wait, how can that be possible? We have seen from the OpenJDK source code above that interrupt() just sets a bit on a Java Thread object, and calls unpark(). Unparking is just a lightweight wrapper around pthread_cond_signal, which means epoll_wait doesn’t really know about it. What gives?

Swing and a Miss: Event FDs

A good first guess would be Event FDs. These are designed to be able to wake up a polling operation in the kernel. The usage pattern involves registering the FD with the poller (i.e. poll, epoll, or select). When a thread writes to the FD, the poller notices and wakes up. This is the approach Netty takes in its own epoll implementation. There are two main issues with this idea though. First, looking through the OpenJDK source, its own epoller doesn’t do this; the only FDs it seems to create are pipe FDs. Second, those pipe FDs are used to implement Selector.wakeup(), which isn’t quite the same thing as Thread.interrupt(). So, Event FDs or other special FDs are not the answer.

Strike Two: Signals

A second guess would be POSIX signals. These would be a reasonable guess since the OpenJDK epoll implementation does handle the case of signals. However, there isn’t a place where the signal is actually sent. If it were the case that signals were being used to unblock network operations, then there must be some way that calling interrupt() results in a signal being sent (for example with pthread_kill()). Looking through the source code we don’t find any sign of this.

As a side note, signals wouldn’t work in general here, because other, similar API calls ignore them. For example FileInputStream.read() can be interrupted. If we dig down to the actual syscall read() though, the OpenJDK just retries on signals. The fact that it completely ignores signals, but can be interrupted means that interrupt() cannot be based on sending signals.

The Truth About Uninterruptible Operations

The previous two guesses being out, it would appear that interruption can’t really work. There are operations that block and cannot be interrupted. These are the only way to accomplish certain tasks, like reading from a file. To reconcile this with interruption, the JDK attaches some strings. It promises to make it all work, but at a pretty high cost: the Selector/Channel/File will be closed.

Waaaay back up top, there were a few lines that we put aside which are now relevant. From the implementation of Thread.interrupt() we see:

            Interruptible b = blocker;
            if (b != null) {
                interrupt0();           // Just to set the interrupt flag
                b.interrupt(this);
                return;
            }

OpenJDK has some tricks up its sleeve that allow it to do things regular Java programmers can’t. When an operation would block, the code installs an Interruptible to the thread. When the thread is interrupted, the Interruptible is invoked to run some small bit of code. In the case of Selector, an Interruptible is installed prior to blocking. If Thread.interrupt() is called, the channel is closed, causing the blocking thread to unblock itself. This is also how file reads work, which would otherwise be uninterruptible. LockSupport.park() doesn’t need this trick: it is woken directly by the unpark() calls we saw in os::interrupt.
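The close-on-interrupt trade-off can be observed with any interruptible channel. The sketch below uses a java.nio.channels.Pipe purely because it needs no network setup; the class and method names are made up. A thread blocks reading from the pipe, gets interrupted, and should come back with a ClosedByInterruptException and a closed channel.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.Pipe;

class ChannelInterruptDemo {
    /** Returns {read threw ClosedByInterruptException, channel is closed afterwards}. */
    static boolean[] interruptBlockedRead() {
        boolean[] result = {false, false};
        try {
            Pipe pipe = Pipe.open();
            Pipe.SourceChannel source = pipe.source();
            Thread reader = new Thread(() -> {
                try {
                    source.read(ByteBuffer.allocate(16));   // blocks: nothing is ever written
                } catch (ClosedByInterruptException e) {
                    result[0] = true;
                } catch (IOException e) {
                    // some other failure; leave result[0] false
                }
            });
            reader.setDaemon(true);
            reader.start();
            Thread.sleep(100);                   // let the reader (probably) block in read()
            reader.interrupt();                  // runs the installed Interruptible, closing the channel
            reader.join(5_000);
            result[1] = !source.isOpen();
            pipe.sink().close();
        } catch (IOException e) {
            // could not set up the pipe; report failure via the default result
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[] r = interruptBlockedRead();
        System.out.println("ClosedByInterruptException=" + r[0] + ", channel closed=" + r[1]);
    }
}
```

After this, the source channel is gone for good. Any later read on it fails, which is the cost described here.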

This is the trade-off made to make interruptions work. After the object is interrupted, it isn’t usable anymore. This is acceptable in the case that we want the program to shut down urgently. We probably don’t care about our resources in use, and we were just going to close them anyways.

One final thing to note: the unblocking code happens high up, in the Java layer. In order for this to work, interruptions can only be caused by calling Thread.interrupt(). Therefore, putting a breakpoint in the Thread code should be sufficient to find how our threads are being interrupted.
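When attaching a debugger isn’t practical, a similar effect can be had in code. This is only a sketch under one assumption: that you control how the thread is created (for example through a ThreadFactory). TracedThread is a made-up class that overrides the public interrupt() method to remember who called it.

```java
class TracedThread extends Thread {
    private volatile Throwable lastInterruptSite;

    TracedThread(Runnable body) {
        super(body);
    }

    @Override
    public void interrupt() {
        // Capture the caller's stack before delegating to the real implementation.
        lastInterruptSite = new Throwable(
                "interrupt() called by " + Thread.currentThread().getName());
        super.interrupt();
    }

    /** The stack of the most recent interrupt() call on this thread, or null if none. */
    Throwable lastInterruptSite() {
        return lastInterruptSite;
    }

    public static void main(String[] args) throws InterruptedException {
        TracedThread t = new TracedThread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // expected
            }
        });
        t.start();
        t.interrupt();
        t.join();
        t.lastInterruptSite().printStackTrace();   // shows main() as the origin
    }
}
```

This only sees calls made on instances of the subclass, so it complements a breakpoint rather than replacing it.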

Conclusion

As we can see, thread interruptions in Java have very tight coupling with the JDK and the JVM. They aren’t magic, but they aren’t quite so easy to understand either. While I can’t encourage using them, I do suggest handling them properly in code that you write.

P.S.

There are actually ways to interrupt in Java other than in the Java code itself. In particular, JNI and JVMTI can do it. However, they can break pretty much any rules they want to, and users are taking their fate into their own hands. I point this out for completeness, though it doesn’t really change the way interrupts propagate.



You can find me on Twitter @CarlMastrangelo