Await without waiting – Rado Buransky

Scala has recently introduced the async and await features. They let you write clean, easy-to-understand code in cases that would otherwise need a complex composition of futures. The same thing has existed in C# for quite a while, but I always had a feeling that I didn’t really know how it works. I tried to look at it from my old-school C++ thread point of view. Which thread runs which piece of code, and where is the synchronization between them? Let’s take a look at the following example in Scala:

async {
  ... some code A ...
  await { ... some code B ... }
  ... some code C ...
}

I don’t want to go into the gory details here, but the point is to stop looking at the “async” block as a monolithic sequence of statements. In fact, it gets split into several blocks of code that can be executed independently, but in a well-defined order. Try to imagine that each block becomes a “work item” for a thread. Code is also just a piece of data, a data structure. It can be an item in a queue. When a thread from the thread pool is available, it picks up a work item from the front of the queue and executes it. Executing a work item can in turn produce more work items.
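To make the “code as a work item” idea concrete, here is a minimal sketch of my own, not what the Scala async macro actually generates. The names WorkQueueSketch and WorkItem are made up for illustration, and a single loop stands in for the worker threads so the output stays deterministic; a real pool would have several threads taking items from a thread-safe queue.

```scala
import scala.collection.mutable

object WorkQueueSketch {
  // A work item is just a piece of code stored as data.
  type WorkItem = () => Unit

  def run(): List[String] = {
    val queue = mutable.Queue.empty[WorkItem]
    val log = mutable.ListBuffer.empty[String]

    // Block A: when executed, it queues block B, which in turn queues block C.
    queue.enqueue { () =>
      log += "A"
      queue.enqueue { () =>
        log += "B"
        queue.enqueue(() => log += "C")
      }
    }

    // The "worker": take whatever is at the front of the queue and run it.
    while (queue.nonEmpty) {
      val item = queue.dequeue()
      item()
    }
    log.toList
  }
}
```

Calling WorkQueueSketch.run() returns the blocks in the order they were executed: A, B, C. The order comes from who queues whom, not from any thread waiting.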

I am sure you have started asking how many of these queues there are, how many worker threads serve each queue, and what about their priorities. These are details that you can look up. But back to the original question. Where is the awaiting?

Technically speaking, there’s none. Threads don’t wait for a specific piece of code to finish. Threads are just monkeys. They execute whatever is at the front of the queue. The “await” statement causes the code to be split into separate work items and defines the order in which they must be executed. The execution of block C is chained to the completion of block B. Once B is done, C can be executed, possibly by an arbitrary thread. So the thread executing the body of the async block:

Calls block A
Fires off execution of block B (possibly executed by another thread)
Done. Free to do something else. Go for a beer.
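The steps above can be sketched with plain futures from the Scala standard library. This is only an approximation under my own assumptions: the real async macro rewrites the block into a state machine rather than into these exact calls, and ChainingSketch and the arithmetic inside the blocks are invented for the example.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ChainingSketch {
  def run()(implicit ec: ExecutionContext): Future[Int] =
    Future {
      // Block A: runs as the first work item.
      20
    }.flatMap { a =>
      // Block B: fired off as its own work item, possibly on another thread.
      Future { a + 1 }
    }.map { b =>
      // Block C: a continuation that is queued only once B has completed.
      b * 2
    }
}
```

No thread sits idle between B and C here. The map call just registers C as the next work item. To see the result in a small demo you could call Await.result(ChainingSketch.run()(ExecutionContext.global), 5.seconds), but that blocking call is only there to keep the demo alive; it is exactly the kind of waiting the chained code itself avoids.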

The result is that no thread is blocked waiting for another thread to complete. A thread is either executing code or waiting for a work item to be queued. This is really cool. This way you can run a highly parallel application with just a few threads behind it, usually as many as there are CPU cores. Play Framework works like this. It is quite the opposite of the approach taken by Apache Tomcat, where the default thread pool size is 200. There’s no need to have a thread per HTTP request.
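A rough way to see “many tasks, few threads” for yourself is the sketch below. It is my own illustration, not how Play Framework is configured: SmallPoolSketch, the pool size and the task count are arbitrary assumptions, and the tasks are trivial CPU work rather than real HTTP requests.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object SmallPoolSketch {
  // Runs `tasks` chained futures on a pool of `threads` threads and returns
  // the sum of the results plus the names of the threads that did the work.
  def run(tasks: Int, threads: Int): (Int, Set[String]) = {
    val pool = Executors.newFixedThreadPool(threads)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      val work = Future.traverse((1 to tasks).toList) { i =>
        // Two work items per task: compute, then record which thread continued.
        Future(i * 2).map(r => (r, Thread.currentThread().getName))
      }
      // Blocking only at the very edge of the demo, to collect the results.
      val results = Await.result(work, 10.seconds)
      (results.map(_._1).sum, results.map(_._2).toSet)
    } finally pool.shutdown()
  }
}
```

With SmallPoolSketch.run(1000, 2), a thousand tasks complete while the set of thread names never grows beyond two entries.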

This is heavily oversimplified. The truth is just plain, boring computer science:
SIP-22 – Async
Scala Async Project