We’ve already said that shared-state concurrency can be the less-than-great solution when it comes to writing concurrent software. What is it about treating our data like a children’s ball pit¹ and our algorithms like a gaggle of sugar-infused toddlers that makes it hard to scale, debug, and understand?
And why does a system that doesn’t look like that require so many lines of code that it’s still difficult to debug and understand?
¹ If you’ve never seen half a dozen kids thrown into a pit with about two thousand plastic balls in it, where the chief goal seems to be throwing them at the collection of dads, then you’ve really missed out.
The product
We’ll illustrate some ideas with something that’s a pretty common occurrence: the modeling of a user. We’ll also model something else that’s pretty common: software evolution. When you start writing your app, the requirements are small and you, therefore, build small. As the requirements grow and the audience increases, you pile on the code and the features. The day you need to switch from sequential programming to concurrent programming, all hell breaks loose.
In the beginning
When you start out, you have a nice little User class that looks about the same as any other User class you’d find in the “Hello World” of User classes:
public class User {
    private String first = "";
    private String last = "";
    public String getFirstName() {
        return this.first;
    }
    public void setFirstName(String s) {
        this.first = s;
    }
    public String getLastName() {
        return this.last;
    }
    public void setLastName(String s) {
        this.last = s;
    }
}
A few months or a year goes by and you get to the point where you have a few threads. Everything’s cool until you start to see some weird stuff happening with your output. Every once in a while some names get messed up. How hard could it be, right?
The first concurrency round
What’s happening is that your concurrency isn’t allowing your changes to be visible between threads at the right time, so you toss in some synchronized versions of your methods.
public class User {
    private String first = "";
    private String last = "";
    synchronized public String getFirstName() {
        return this.first;
    }
    synchronized public void setFirstName(String s) {
        this.first = s;
    }
    synchronized public String getLastName() {
        return this.last;
    }
    synchronized public void setLastName(String s) {
        this.last = s;
    }
}
This helps, but someone points out that using volatile would be better, so you do that instead:
public class User {
    private volatile String first = "";
    private volatile String last = "";
    public String getFirstName() {
        return this.first;
    }
    public void setFirstName(String s) {
        this.first = s;
    }
    public String getLastName() {
        return this.last;
    }
    public void setLastName(String s) {
        this.last = s;
    }
}
That looks nicer. Now things are cooking!
The real problem shows up
The visibility of your changes is now awesome, but something new has shown up, and this is the real problem. Every once in a while, in the tradition of the Heisenbug, you retrieve a name that doesn’t exist. You’ve tracked down what you think is the offending line of code:
System.out.println(user.getFirstName() + " " + user.getLastName());
But every once in a while you see this:
Spider Lantern
You shake your head a bit and have a look through your database for a “Spider Lantern” but you can’t find one. You find “Spider Man” and “Green Lantern,” but not “Spider Lantern.”
You really hope that it’s just some weird interleaving of output on the terminal, but it’s not. You’ve got a bigger problem.
You have code running on a thread that is trying to change someone’s name from “Green Lantern” to “Spider Man,” but the method by which it has to do it is pretty messed up:
user.setFirstName("Spider");
user.setLastName("Man");
In a concurrent system, there are many CPU cycles between those two lines of code where something can sneak in and grab the user’s first name and last name. It grabs the new first name, “Spider,” and the old last name, “Lantern,” and spits them out. Damn.
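To make that interleaving concrete, here’s a minimal sketch that forces it on purpose instead of waiting for the scheduler to get unlucky. It assumes the volatile version of User from above; the TornReadDemo class, the two latches, and the readBetweenSetters() helper are made up for this illustration. The latches do nothing except pin the reader into the gap between the two setter calls.

```java
import java.util.concurrent.CountDownLatch;

class TornReadDemo {
    // Same shape as the volatile User class from the text.
    static class User {
        private volatile String first = "";
        private volatile String last = "";
        String getFirstName() { return first; }
        void setFirstName(String s) { first = s; }
        String getLastName() { return last; }
        void setLastName(String s) { last = s; }
    }

    // Forces the reader to run between the writer's two setter calls
    // and returns the name the reader saw.
    static String readBetweenSetters() throws InterruptedException {
        User user = new User();
        user.setFirstName("Green");
        user.setLastName("Lantern");

        CountDownLatch firstNameWritten = new CountDownLatch(1);
        CountDownLatch readerDone = new CountDownLatch(1);
        String[] seen = new String[1];

        Thread writer = new Thread(() -> {
            user.setFirstName("Spider");
            firstNameWritten.countDown();      // the gap between the two lines opens here
            try {
                readerDone.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            user.setLastName("Man");
        });
        Thread reader = new Thread(() -> {
            try {
                firstNameWritten.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            seen[0] = user.getFirstName() + " " + user.getLastName();
            readerDone.countDown();
        });

        writer.start();
        reader.start();
        writer.join();
        reader.join();
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(readBetweenSetters());  // prints "Spider Lantern"
    }
}
```

Every individual read and write here is perfectly visible; the problem is that nothing ties the two fields together.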
The problem’s solution
There are many ways to solve this problem; the good ones involve changing your API. However, you’d have to change so much code that it just doesn’t seem worth it, so you try the cheap way out.
import java.util.concurrent.locks.ReentrantLock;
public class User {
    private ReentrantLock lock = new ReentrantLock();
    private String first = "";
    private String last = "";
    public void lock() {
        lock.lock();
    }
    public void unlock() {
        lock.unlock();
    }
    public String getFirstName() {
        return this.first;
    }
    public void setFirstName(String s) {
        // Acquire before the try, so finally never unlocks a lock we don't hold
        lock();
        try {
            this.first = s;
        } finally {
            unlock();
        }
    }
    public String getLastName() {
        return this.last;
    }
    public void setLastName(String s) {
        lock();
        try {
            this.last = s;
        } finally {
            unlock();
        }
    }
}
The ol’ stand-by revolves around locks. You toss some concurrency locks around the problem and figure that if “getters” try to grab the lock and someone else has it, then you’re golden. What you’ll do is change the one println line to lock the whole guy first:
user.lock();
try {
    System.out.println(user.getFirstName() +
        " " + user.getLastName());
} finally {
    user.unlock();
}
And that totally works! Except that it doesn’t. It reduces the window of the race condition but it doesn’t completely stop the following from happening:
// Thread 1
user.setFirstName("Green");
// Thread 2
user.lock();
try {
    System.out.println(user.getFirstName() +
        " " + user.getLastName());
} finally {
    user.unlock();
}
// Thread 1
user.setLastName("Lantern");
If that happens you have the same problem. In order to get around that, you need to lock during setting, the same way as we’ve done it while getting. Even though we’ve locked inside setFirstName() and setLastName(), that isn’t good enough; we’d need to do the same manual locking that we did while getting inside Thread 2 above.
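Here’s a rough sketch of what locking on both sides looks like. It assumes a trimmed-down copy of the lock-exposing User class; the rename(), fullName(), and countTornReads() helpers are invented for the illustration. The writer holds the lock across both setters, the reader holds it across both getters, and a small stress loop counts how many mixed-up names show up.

```java
import java.util.concurrent.locks.ReentrantLock;

class LockBothSidesDemo {
    // Trimmed copy of the lock-exposing User class from the text.
    static class User {
        private final ReentrantLock lock = new ReentrantLock();
        private String first = "Green";
        private String last = "Lantern";
        void lock() { lock.lock(); }
        void unlock() { lock.unlock(); }
        String getFirstName() { return first; }
        String getLastName() { return last; }
        void setFirstName(String s) { first = s; }
        void setLastName(String s) { last = s; }
    }

    // Writer: hold the lock across BOTH setters so the pair changes as a unit.
    static void rename(User user, String first, String last) {
        user.lock();
        try {
            user.setFirstName(first);
            user.setLastName(last);
        } finally {
            user.unlock();
        }
    }

    // Reader: hold the same lock across BOTH getters.
    static String fullName(User user) {
        user.lock();
        try {
            return user.getFirstName() + " " + user.getLastName();
        } finally {
            user.unlock();
        }
    }

    // Flips the name back and forth on one thread while reading on another,
    // and counts how many reads saw a name that never existed.
    static int countTornReads(int iterations) throws InterruptedException {
        User user = new User();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                if (i % 2 == 0) rename(user, "Spider", "Man");
                else rename(user, "Green", "Lantern");
            }
        });
        writer.start();
        int torn = 0;
        for (int i = 0; i < iterations; i++) {
            String name = fullName(user);
            if (!name.equals("Spider Man") && !name.equals("Green Lantern")) torn++;
        }
        writer.join();
        return torn;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("torn reads: " + countTornReads(100_000));
    }
}
```

It only holds up as long as every single caller, everywhere, remembers to wrap its reads and writes this way.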
But even if you do that, look at what’s happened:
public void lock() {
    lock.lock();
}
public void unlock() {
    lock.unlock();
}
You’ve exposed your lock strategy to the world! There’s absolutely nothing to stop someone (or you) from doing this:
user.lock();
System.out.println(user.getFirstName() +
" " + user.getLastName());
Did someone forget to unlock? I think they did... Best of luck trying to lock that again from a different thread.
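If you’d like to see how that plays out, the following sketch uses a bare ReentrantLock in place of the User class. The ForgottenUnlockDemo class and its helper are hypothetical, and the 200 ms timeout is an arbitrary choice. One thread locks and finishes without unlocking; a second thread then tries for the lock and gives up.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class ForgottenUnlockDemo {
    // Returns whether a second thread managed to get the lock within the timeout.
    static boolean secondThreadGetsLock() throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();

        // Thread 1 locks and "forgets" to unlock before it finishes.
        Thread forgetful = new Thread(lock::lock);
        forgetful.start();
        forgetful.join();

        // Thread 2 (the caller) waits a little while, then gives up.
        boolean acquired = lock.tryLock(200, TimeUnit.MILLISECONDS);
        if (acquired) {
            lock.unlock();
        }
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("second thread got the lock: " + secondThreadGetsLock());
    }
}
```

The lock stays held even though the thread that took it is long gone, so a plain lock() call from the second thread would simply hang.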
BLURGH!
It’s just plain ridiculous. To fix this problem, you have to change the API, use some sort of database (DB) transaction, or something else that’s hideous.
And this is just this one issue; race conditions and concurrency issues can show up in far more subtle ways than this. We didn’t even look at deadlocks, which could have been quite interesting if we had chosen to use two separate locks for this problem. And imagine if this were a library that you had published to the world, and not just made as a convenience for yourself.
If you haven’t experienced this before, trust me, it’s not the sort of thing you want to waste your time on.
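For the curious, here is one possible shape the “change the API” route could take. This is only a sketch, not the fix the rest of the text builds toward. It assumes you’re free to drop the separate setters; the Name class and the rename() and getName() methods are invented for the example. The two halves of the name live in one immutable object, and the user swaps the whole object in a single step.

```java
import java.util.concurrent.atomic.AtomicReference;

class AtomicNameDemo {
    // Immutable pair: a first and last name that always travel together.
    static final class Name {
        final String first;
        final String last;
        Name(String first, String last) {
            this.first = first;
            this.last = last;
        }
        @Override
        public String toString() {
            return first + " " + last;
        }
    }

    static final class User {
        private final AtomicReference<Name> name =
            new AtomicReference<>(new Name("", ""));

        // One call replaces both halves; readers see the old pair or the new one.
        void rename(String first, String last) {
            name.set(new Name(first, last));
        }
        Name getName() {
            return name.get();
        }
    }

    // Same stress loop idea as before: count reads of names that never existed.
    static int countTornReads(int iterations) throws InterruptedException {
        User user = new User();
        user.rename("Green", "Lantern");
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                if (i % 2 == 0) user.rename("Spider", "Man");
                else user.rename("Green", "Lantern");
            }
        });
        writer.start();
        int torn = 0;
        for (int i = 0; i < iterations; i++) {
            String seen = user.getName().toString();
            if (!seen.equals("Spider Man") && !seen.equals("Green Lantern")) torn++;
        }
        writer.join();
        return torn;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("torn reads: " + countTornReads(100_000));
    }
}
```

No lock is exposed and nothing can be left locked, but every caller of the old setFirstName() and setLastName() has to be rewritten, which is exactly the cost that made the cheap way out look attractive.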
Threads
Now that we’ve covered the main problem that programmers have had for so long with shared-state concurrency, we can move forward and look at some of the machinery that helps us run our stuff.
In our modern applications, threads are taking care of the heavy lifting required to keep our code running concurrently. Some people seem to think that threads are cheap. They’re really not. Threads are an incredibly expensive resource to just start spending like you’re dealing with Brewster’s Millions.² Some concurrency frameworks of the past even thought it was reasonable to spin up a new thread for every incoming network request. The rule of thumb with those was to make sure your app didn’t get too many incoming network requests. That’s pretty silly.
I’ve also seen people make their apps go “faster” by spinning up multiple threads to do some work in parallel and then killing them when the work is done. If there are 200 incoming requests, they’ll happily spin up 10,000 threads to do their work for them, and this can happen on and off every couple of seconds! Threads are not meant to be used this way. Really.
This gets worse once those threads start blocking each other with locks, synchronized blocks, or other concurrency mechanisms. The shared-state concurrency model can turn your threading methods into spaghetti really quickly.
Thread pools
If you don’t want to be spinning up threads manually, then what’s the better option? Well, it’s thread pools. Thread pools are important in concurrent programming, and are equally important when programming with Akka, although we don’t often use them directly.
To get any concurrent work done within a single process, you need to have threads. There are several drawbacks to using threads directly:
• They have a fixed life cycle. If you put a java.lang.Runnable on a thread instance, then when that Runnable completes, the thread dies right along with it and can’t be restarted.
• They take time to start up. Creating one certainly doesn’t come at zero cost.
• They’re certainly not free when it comes to memory usage.
• There are operating system limits on these things; you aren’t free to create an infinite number of them.
• You pay a huge cost in the management of threads with respect to context switching. A thread needs to run on a processor, and if it isn’t currently allocated to one, then the OS needs to remove one from a processor and put another one in its place. Moving all that data around is also expensive.
² Yeah, it was a pretty bad movie, but I was a kid when it came out, so it was awesome.
To eliminate and/or hide all of these problems, we use thread pools. Java ships with a set of reusable thread pools that have many of the wonderful aspects of the thread pools developers have built for themselves over the years.
Thread pools create a managed layer on which your concurrent methods can execute. They ensure that the system is being used efficiently, so long as you’re not specifying thread pools of an unreasonable size or number. Using them helps you avoid creating and destroying threads by yourself all the time, and it ensures that concurrent work gets throttled, to a certain degree.
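As a small illustration of that managed layer, the sketch below pushes a pile of tiny tasks through one of the stock pools from java.util.concurrent and records which threads ran them. The PoolReuseDemo class and its helper are made up for the example, and the pool size of 4 is an arbitrary choice.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PoolReuseDemo {
    // Runs `tasks` small jobs on a pool of `poolSize` threads and returns
    // how many distinct threads actually did the work.
    static int distinctWorkerThreads(int poolSize, int tasks) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> workers = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> workers.add(Thread.currentThread().getName()));
        }
        pool.shutdown();                              // no new tasks; queued ones still run
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return workers.size();
    }

    public static void main(String[] args) throws InterruptedException {
        int used = distinctWorkerThreads(4, 1_000);
        System.out.println("1000 tasks ran on " + used + " thread(s)");
    }
}
```

However many tasks you submit, the number of threads never goes past the pool size; the extra tasks wait in the pool’s queue, which is the throttling mentioned above.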
The thread balance
One challenge of managing threads is to ensure that you have enough, but you don’t have too many. If you have two cores and ten thousand threads, you’re probably not doing yourself any favors. The reason is the context switching that the OS must do on your behalf.
All of the threads you have must run at some point; otherwise, they’ll starve for attention from the CPU, which is certainly something that must be avoided. So the OS slices them off some time. In order to do that, it must freeze the running state of a given thread, pull it off of the CPU, store it somewhere fun, and then put the new thread in its place for a few microseconds, and then switch it out to make room for the next one.
Maximizing processor time
All of this context switching takes time and you want to avoid it as much as possible. If you’re multiplying two hundred thousand matrices together on a machine with eight cores, then the right decision is definitely not to break the work up into one thousand threads of two hundred matrices each. In this situation, you want approximately eight threads running approximately twenty-five thousand matrix multiplications each. That will minimize your context switching and maximize the amount of time that your application spends on the processor. It’s not an exact science due to the fact that a general-purpose computing OS will always be doing more than just dealing with your application, but in this case, the approximation isn’t too bad. A couple more threads here might help.
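The arithmetic above can be sketched as follows. This is an illustration under assumptions, not a tuned implementation: doOneUnit() is a hypothetical stand-in for one matrix multiplication, and runChunked() is an invented helper that splits the work into one contiguous chunk per thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkedWorkDemo {
    // Hypothetical stand-in for "multiply one pair of matrices".
    static long doOneUnit(int index) {
        return 1;
    }

    // Splits totalUnits of work into one contiguous chunk per thread
    // and returns how many units were completed in total.
    static long runChunked(int totalUnits, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (totalUnits + threads - 1) / threads;   // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int from = t * chunk;
                final int to = Math.min(totalUnits, from + chunk);
                parts.add(pool.submit(() -> {
                    long done = 0;
                    for (int i = from; i < to; i++) {
                        done += doOneUnit(i);
                    }
                    return done;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();       // wait for each chunk to finish
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Eight threads, 25,000 units each, as in the example above.
        System.out.println(runChunked(200_000, 8));   // prints 200000
    }
}
```

On a real machine you’d probably size the pool from Runtime.getRuntime().availableProcessors() rather than hard-coding eight.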
CPU versus IO
There are really two kinds of work in most applications: that which requires the CPU and that which performs synchronous IO. Asynchronous IO isn’t much of a problem because you don’t have anything tying up threads while the IO is taking place, so we only concern ourselves with synchronous IO.
Clearly, if you have an application that is 100% bound to the CPU (let’s say it’s only calculating π), then you can keep the number of threads down to a value that’s commensurate with the number of cores. But if you’re performing a fair bit of synchronous IO, then what?
IO is a problem because it’s slow. While the IO is performing, your application isn’t busy; it’s just waiting for the IO to complete. And while it’s waiting, it’s tying up a thread in your application. If you’ve only allocated 8 threads, and they’re all doing IO, where’s your CPU work going to go?
It’s often a good idea to separate your IO from CPU work by creating separate thread pools. The IO pool will be “large” in comparison to the CPU pool since the threads on the IO pool spend most of their time avoiding the CPU. You can then tune the IO pool independently from the CPU pool. In general, this is a real pain, which is why the world is really starting to get serious about asynchronous IO.
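A minimal sketch of the two-pool idea follows, with some loud assumptions: the eight-to-one ratio between the IO pool and the CPU pool is a placeholder rather than a recommendation, Thread.sleep() stands in for a slow synchronous read, and the SeparatePoolsDemo class is invented for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SeparatePoolsDemo {
    static String run() throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        // CPU pool: roughly one thread per core.
        ExecutorService cpuPool = Executors.newFixedThreadPool(cores);
        // IO pool: larger, since its threads mostly sit and wait (assumed ratio; tune it).
        ExecutorService ioPool = Executors.newFixedThreadPool(cores * 8);
        try {
            // Blocking "IO": the sleep stands in for a slow synchronous read.
            Future<String> fetched = ioPool.submit(() -> {
                Thread.sleep(50);
                return "payload";
            });
            // CPU work keeps its own threads while the IO thread waits.
            Future<Integer> computed = cpuPool.submit(() -> {
                int sum = 0;
                for (int i = 1; i <= 1000; i++) {
                    sum += i;
                }
                return sum;
            });
            return fetched.get() + ":" + computed.get();
        } finally {
            ioPool.shutdown();
            cpuPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());   // prints "payload:500500"
    }
}
```

A stalled read can now only tie up an IO thread, so the CPU pool always has somewhere to run.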
Blocking calls
To make sure that you’re not making a ton of blocking calls, your language or your toolkit needs to help you. Back in the days before C++11, we didn’t have closures. As such, creating non-blocking code was a big problem. You ended up doing things like this:
class MyBusinessLogic {
public:
    // ... stuff ...
    bool ourCallbackFunction(const SomeResult& results) {
        // do stuff
        return results.ok();
    }
    // ... stuff ...
private:
    void someCode() {
        someObject.call(param1, param2,
            bind(&MyBusinessLogic::ourCallbackFunction, this, _1));
    }
};
And that’s when it’s generally easy. MyBusinessLogic calls someObject, which allows someObject to call back into MyBusinessLogic, but what if that’s not the end of the story? What if we want more chaining, more decision making, and more delegates? It just gets worse and worse.
Java isn’t much better since everything is a noun (i.e., class or instance thereof), even though you can instantiate anonymous classes. C++ at least can have simple verbs (good ol’ functions) that we can pass around if needed.
This is why we have so many blocking calls in our code today: doing anything else is just too damn hard.
Nonblocking APIs
A few years ago, Node.js³ came on the scene with a single-threaded solution to our blocking call problem. Node.js’s bread and butter is the idea that if all IO is asynchronous, then code in user-land is free to execute and react to IO events. Your user-level code is on a single thread so you don’t need any synchronization primitives to protect its data, and with the help of JavaScript’s closure mechanisms, we get the tools we need to write a ton of non-blocking code (albeit, heavily nested at times).
Node.js and Akka differ in many ways, but they both make not blocking a huge goal. If you can swap out a Java library that blocks in its IO for one that doesn’t, do it.
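To give a feel for the callback style on the JVM, here’s a sketch that assumes Java 8 or later, which added lambdas and CompletableFuture. It isn’t Akka, and it isn’t any particular IO library: fetchUserName() is a hypothetical stand-in for a non-blocking call that hands back a future straight away.

```java
import java.util.concurrent.CompletableFuture;

class NonBlockingDemo {
    // Hypothetical stand-in for a non-blocking IO call:
    // the caller gets a future back immediately.
    static CompletableFuture<String> fetchUserName(int id) {
        return CompletableFuture.supplyAsync(() -> "user-" + id);
    }

    static CompletableFuture<String> greeting(int id) {
        // Each step is a callback that runs when the previous one completes;
        // no thread sits blocked waiting for the result in between.
        return fetchUserName(id)
            .thenApply(String::toUpperCase)
            .thenApply(name -> "Hello, " + name);
    }

    public static void main(String[] args) {
        // join() does block, but only here at the edge of the program
        // so the demo can print its result before exiting.
        System.out.println(greeting(7).join());   // prints "Hello, USER-7"
    }
}
```

The chaining reads top to bottom instead of nesting, which takes some of the sting out of the “heavily nested” complaint.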
We won’t discuss Node.js anymore; Node.js and Akka are trying to solve very different problems in the software world and it’s simply not reasonable to compare them. I bring up Node.js at this time since it’s gained quite a following, and it may help ground you in the importance of non-blocking IO.
³ http://www.nodejs.org/