More you can do with routers

Routers have many more possibilities, which you’ll start to recognize as we explore more of what Akka has to offer. Here are some to consider:

• When we get into remote actors, you can immediately see the potential of the ScatterGatherFirstCompletedRouter, as well as the possibility of implementing some rudimentary load balancing across machines, or even some fault tolerance.

• The business logic possibilities are pretty huge. Imagine an evolving protocol, where the messages contain the version of the protocol. As things evolve, you can put in a router or even a family of cascading routers that find the right version of the business logic to handle that protocol level. This helps eliminate the hideousness that often accompanies maintaining backward-compatibility in our code as protocols change.

• The Akka documentation’s example of a custom router involves the counting of votes for the Republican or Democratic party, respectively.

You can grow this solution while incorporating the same ideas behind the protocol versioning earlier, and simply segment your application in any manner you see fit. It’s a clear win for isolating parts of your application logic from each other.

• You can also create a configurable router, which we could have done with the SectionSpecificAttendantRouterSpec, but didn’t. You follow the same kind of pattern, but you have to have a constructor that accepts the com.typesafe.config.Config object, which allows it to be dynamically configured. Consult the Akka documentation on routing for more information.

• You can even code up your own configurable resizers as well!
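The protocol-versioning idea from the list above can be sketched without any Akka machinery at all. This is only an illustration: the names Versioned, handlers, and route are hypothetical, and in a real actor system each handler would be an ActorRef chosen by a custom router rather than a plain function.

```scala
object VersionRoutingSketch {
  // Hypothetical message envelope that carries its protocol version
  final case class Versioned(version: Int, payload: String)

  // Hypothetical handlers, one per protocol level. In an actor system
  // each entry would be an ActorRef instead of a function.
  val handlers: Map[Int, String => String] = Map(
    1 -> ((p: String) => s"v1 handled: $p"),
    2 -> ((p: String) => s"v2 handled: $p")
  )

  // Pick the handler that matches the message's version.
  // Unknown versions yield None instead of reaching business logic.
  def route(msg: Versioned): Option[String] =
    handlers.get(msg.version).map(handle => handle(msg.payload))
}
```

The business logic for each protocol level never needs to know that other levels exist; only the routing table does.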

10.8 With that said. . .

Earlier I said that you shouldn’t try to stuff a router into a solution when it’s not appropriate, and before we finish this chapter, it needs to be said again.

While routers are very keen bits of functionality, there’s no such thing as a silver bullet. Oftentimes, your needs may surpass what is possible in generalized code and routers have a tendency to be very simplistic in many cases.

For some reason, routing messages seems to be a very specific problem in many situations, requiring a specific solution.

I don’t want to rain on the “router parade”—use them as often as it’s reasonable to do so—but don’t be afraid to toss them if they’re not working out for you.

10.9 Chapter summary

Remember a long time ago when I said that you’d be glad to leave the warmth of the type system behind in favour of the untyped actor? Well, routing is a huge win for making that concession. Nice choice you made there.

You’ve gained more understanding about:

• Akka’s routing concepts and methodology and how easy it is to use Akka’s pre-defined routers, both from the configuration system and from code.

• How to structure your application so as to divorce you, the programmer, from the administrators (which may also be you) with respect to tuning your application’s performance.

• How to write your apps so that you can make some pretty drastic changes to them “in the field” without requiring a rebuild and reship. Empowering the customer to be the master of his or her own destiny is a very good and powerful thing.

• How to create your own routers, along with some ideas as to why you might want to do that.

• The foundational aspects of actor programming; you’ve seen more testing concepts and the power of substitutability when it comes to the untyped ActorRef. These concepts, and the acrobatics we can now perform when structuring our actor applications, are absolutely vital to powerful actor programming.

• The TestProbe and how Akka logs using the ActorSystem’s event stream.

You’re really starting to get a deep understanding of how the Akka programming paradigm works, and you should have a lot of confidence right now. In fact, I would recommend that you take that high level of confidence and put it to good use; if you’re not married or otherwise attached, head out to your local singles bar and pick up. Everyone’s attracted to confidence at this insane level!

Dispatchers and Mailboxes

We’re getting ready to put actors aside for the moment and move on to some of Akka’s other features, but before we do that we need to learn a little bit about what keeps Akka ticking.

Way back in Chapter 5, we discussed the dispatcher and the mailbox without going too deeply into what they are, what purpose they really serve, and what control you have over them. This chapter will help bring closure to our understanding of actors while, at the same time, bridging us over into the next topic of futures. We’ll also refer back to routing, where we describe the “missing” router.

11.1 Dispatchers

Up until now, we haven’t really cared how our code was executed, so long as it worked. While you don’t really have to understand how Akka does what it does, since the defaults work so well, it can certainly help you tune your application’s execution to run ever so sweetly.

In case you hadn’t figured it out yet, the job of actually executing the code you write, managing the threads, and all of that goodness is handled by the dispatcher. The dispatcher hides all (well, most) of the complexity from you, so that you can go about your business of writing code. You can tune the dispatcher in a few different ways, as well as supply different dispatchers for different parts of your system. Let’s review the dispatchers that Akka provides.

The Dispatcher

The main dispatcher, which is also the default dispatcher, is simply called Dispatcher. It might become unclear as to when we’re describing the concept dispatcher, and when the concrete default implementation Dispatcher. As such, I’ll adopt a new name for the latter; I’ll refer to the concrete default implementation as the event-based Dispatcher, and the concept will simply be dispatcher.

There’s a reason that the event-based Dispatcher is the default: it’s awesome. It’s generic, can work with any type of actor and any type of mailbox, and defaults to using the JSR166 fork-join pool. This together means ultimate flexibility and blazing speed.¹ The event-based Dispatcher allows you a great level of freedom when writing your concurrent code. As long as you keep your message handlers short and sweet, the event-based Dispatcher will treat you very well.

PinnedDispatcher

At times you might want to deviate from the default. For example, you don’t want to share your threads with ten million other actors. Maybe you require an actor to always be first in the queue for thread time, or you want to ensure that it will simply never starve for CPU attention. This is why the PinnedDispatcher was created; any actor that you assign to the PinnedDispatcher will have a dedicated thread pool of size 1. That actor will be the only actor assigned to that thread pool, and is therefore guaranteed thread time on that pool.
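As a rough sketch, a pinned dispatcher could be declared in configuration along these lines. The path zzz.akka.investigation.pinned-dispatcher is a hypothetical name picked for this illustration, and the keys are the ones I would expect for the Akka version this book targets, so check them against your version's reference.conf.

```hocon
zzz.akka.investigation {
  # "pinned-dispatcher" is a hypothetical name chosen for this sketch
  pinned-dispatcher {
    type = "PinnedDispatcher"
    # A pinned dispatcher runs on a thread pool, not a fork-join pool
    executor = "thread-pool-executor"
  }
}

# An actor is attached to it in code through its Props, for example
# (MyActor is a hypothetical actor class):
#   Props[MyActor].withDispatcher("zzz.akka.investigation.pinned-dispatcher")
```

Each actor created with those Props gets its own thread, which is exactly why you want to keep the number of such actors small.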

The PinnedDispatcher is good for solving the problems previously stated, but it makes an absolutely horrific default. If you have ten million actors, you can’t possibly use a PinnedDispatcher since you can’t allocate ten million dedicated threads. Even if the JVM would let you do this before it tossed its cookies all over your motherboard, you’d spend more of your time context switching than doing anything else.

So, use these with care. The average program shouldn’t need to house more than half a dozen actors on a PinnedDispatcher. If you have a situation where you’re doing more than that, either rethink your solution, rework your problem, or if you really have a legitimate case, please toss me an email; I’d love to hear about it.

¹ We won’t quote numbers in this book. Benchmarking is a black art, dripping with voodoo and rubber chickens. It’s fast, but if you want to really quantify it, grab some rubber chickens and go hog-wild.

BalancingDispatcher

This one was “missing” when we talked about routing back in Chapter 10. What came close was the SmallestMailboxRouter, which would find the actor in the pool that had the smallest mailbox and insert the message into that mailbox. But it couldn’t “steal” work from one actor and give it to another actor that happened to be idle.

The BalancingDispatcher fills this need. However, it can only do so because it is far more limiting than the previous two dispatchers with respect to the actors with which it can work. Whereas the event-based Dispatcher looks something like Figure 11.1, the BalancingDispatcher looks more like Figure 11.2. The fact that Akka decouples the mailbox from the actor and the dispatcher means that no “stealing” needs to happen. Since only one mailbox is shared between all of the actors to which the BalancingDispatcher will dispatch, the BalancingDispatcher can pick any one it wishes without disturbing the others.

Figure 11.1: A conceptual view of the event-based Dispatcher. (Each actor has its own mailbox of messages.)

However, as you probably have already guessed, the actors that the BalancingDispatcher feeds need to have the same actor implementation. Truthfully, nothing’s stopping you from having different implementations for each actor, but in practice you wouldn’t do this. From your code’s perspective, the BalancingDispatcher simply sends messages to one of your actors at random. You can’t guarantee which implementation will be chosen at any given time, so you’d simply make all of them identical to remove the complexity of the perceived randomness.

Figure 11.2: A conceptual view of the BalancingDispatcher. (All of the actors share a single mailbox of messages.)

It shouldn’t come as a surprise that a BalancingDispatcher’s job isn’t “intelligent” routing of messages; that is more suited to a router.

CallingThreadDispatcher

The last dispatcher that ships with Akka is almost something we shouldn’t even mention. . . really. OK, I’ll mention it, but for Ra’s sake don’t use it. The CallingThreadDispatcher was created to write deterministic tests and is, in fact, the dispatcher you use when you create an instance of the TestFSMRef, as we saw in Chapter 9. This dispatcher has no concurrency; it relies on the calling thread for execution. This means that your tests don’t have to worry about concurrency, which is great, but if you used it in production code, you wouldn’t have concurrency.

I’ve spoken with some who think they could be “clever” in using this dispatcher in production code to serialize this, that, or the other thing for some reason or another. Don’t. Once you start taking a system that is designed to be concurrent across multiple threads, and then start synchronizing that across one thread, you’re going to open yourself up to a world of pain and Akka isn’t going to be there to help you. If you need to synchronize work across a number of algorithms, then Akka’s got you covered. Use messages to sequence work, use futures (to be seen soon), dataflow (to be seen soon), etc. This is what Akka does! Don’t try to get it to do what it already does in ways it’s not intended to do it.

Note

Akka leaves a lot of room for you to write clever code, but if you start being “clever” outside of the Akka paradigm then don’t go crying to it when a group of beasts from the pit of hell climb up through the floorboards to give you a really bad pinch on the bum.

11.2 Dispatcher tweaking

The dispatcher has many of its parameters defined in the configuration system, which means you can tweak it. Actually, anyone can tweak it in the running system, allowing people other than you to tune your application for performance. That’s nice.

We won’t go through all of the configuration options available to dispatchers and their underlying thread choices (you can find that in the fantastic Akka reference documentation), but we will look at some of the key concepts. Let’s make a fictitious dispatcher in configuration to look at the possibilities.

    zzz.akka.investigation {
      # "a-dispatcher" is the name with which we refer to it
      a-dispatcher {
        # You could also use BalancingDispatcher, PinnedDispatcher or
        # your own derivation of MessageDispatcherConfigurator.
        type = "Dispatcher"

        # You could also use thread-pool-executor
        executor = "fork-join-executor"

        # By increasing this value, you can maximize thread usage, at
        # the cost of fairness.
        throughput = 10

        # Since we've chosen the executor to be a fork-join-executor, we
        # need to configure it here
        fork-join-executor {
          # The minimum number of threads to have
          parallelism-min = 2

          # The scaling factor for calculating the number of threads to
          # allocate based on the hardware capabilities. The formula
          # used is "ceil(number of processors * parallelism-factor)"
          parallelism-factor = 2.0

          # The maximum number of threads to allocate
          parallelism-max = 32
        }
      }
    }
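To make the thread-count formula from the configuration comments concrete, here is a small worked sketch in plain Scala. The poolSize function is a hypothetical helper written for this illustration, not an Akka API, and it assumes that the computed value is simply clamped between parallelism-min and parallelism-max, as the names of those settings suggest.

```scala
object ParallelismSketch {
  // ceil(number of processors * parallelism-factor), kept within
  // the range [min, max]. The clamping is an assumption based on
  // the names parallelism-min and parallelism-max.
  def poolSize(processors: Int, factor: Double, min: Int, max: Int): Int =
    math.min(math.max(math.ceil(processors * factor).toInt, min), max)
}
```

With the settings above, a 4-core machine would get ceil(4 * 2.0) = 8 threads, a single-core machine would be lifted to the minimum, and a 64-core machine would be capped at 32.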

As stated, that’s not everything that you can do with a dispatcher configuration, but it hits some highlights. One of the most important properties in the dispatcher configuration is throughput. When the dispatcher grabs a thread and starts dispatching work from a mailbox on that thread, throughput specifies the maximum number of messages it can process from that mailbox before moving on to the next one. As the comment above states, you can maximize thread usage at the cost of “fairness.”

Figure 11.3: An altered throughput changes how these mailboxes drain. (Actor 1’s mailbox holds a little over 50 messages, actor 2’s a little over 1,000, and actor 3’s only a handful.)

In Figure 11.3, you can see that the sizes of the mailboxes are different between actors 1, 2, and 3. If we set the value of throughput to 1, then actor 3’s mailbox will clearly drain before either actor 1 or 2’s mailbox. However, if we set the value of throughput to 100, then actor 1’s mailbox may completely drain before actor 3’s, assuming that actor 1’s mailbox gets thread time before actor 3’s.

That’s the essence of the throughput property. It tells Akka to waste less time switching between mailboxes and more time dispatching work from each mailbox individually. However, this is less “fair” with respect to getting thread time per mailbox. From the example in Figure 11.3, we can see that actor 3 is potentially treated unfairly, since its tiny mailbox can’t get thread time until we’ve completely drained actor 1.

Setting a high value for throughput will increase the raw speed of message processing in some cases, but may also manifest certain latencies and create the feeling of “burstiness” from time to time, where work visibly gets done in chunks, with time gaps in between. If those aren’t factors, then you may want to set throughput higher than you would otherwise.
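The fairness trade-off can be seen in a toy simulation. This is a deliberately simplified model and not how Akka schedules work: it assumes a single thread that visits the mailboxes round-robin starting with actor 1, that no new messages arrive, and that the mailboxes hold 52, 1,002, and 4 messages as read off Figure 11.3. The drainOrder function is a hypothetical helper.

```scala
object ThroughputSketch {
  // Returns the (1-based) actor numbers in the order their mailboxes
  // empty, when one thread visits the mailboxes round-robin and
  // processes at most `throughput` messages per visit.
  def drainOrder(sizes: Vector[Int], throughput: Int): List[Int] = {
    require(throughput > 0, "throughput must be positive")

    @annotation.tailrec
    def loop(remaining: Vector[Int], drained: List[Int]): List[Int] =
      if (remaining.forall(_ == 0)) drained.reverse
      else {
        // One full pass over every mailbox that still has messages
        val (next, done) =
          remaining.indices.foldLeft((remaining, drained)) {
            case ((boxes, acc), i) if boxes(i) > 0 =>
              val left = math.max(0, boxes(i) - throughput)
              (boxes.updated(i, left), if (left == 0) (i + 1) :: acc else acc)
            case (state, _) => state
          }
        loop(next, done)
      }

    loop(sizes, Nil)
  }
}
```

Under these assumptions, drainOrder(Vector(52, 1002, 4), 1) yields List(3, 1, 2), while a throughput of 100 yields List(1, 3, 2): actor 3's tiny mailbox has to wait until actor 1's has been emptied in a single visit.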

Modifying the default dispatcher configuration

The reference.conf that ships with the Akka actor package holds the configuration for the default dispatcher. You can always change the values of the default dispatcher by overriding the configuration yourself. For example, if you wanted to change the value of throughput for the default dispatcher, then you would put the following into your application.conf:

akka.actor.default-dispatcher.throughput = 20

You can do this for any parameter that you see, for any configuration that you see. This doesn’t merely apply to dispatchers.

11.3 Mailboxes

Mailboxes are another one of those things we’ve been using for a long time, but haven’t talked about much. They don’t have a ton of complexity, but they do offer another component that you can configure, modify, or even extend with your own implementation.

As we’ve seen, the mailbox contains messages that have not yet been processed. The message that goes into the mailbox is wrapped in an envelope that carries the sender along with it. That envelope is where the sender comes from when the message is dispatched to your actor.

In essence, the mailbox is just a queue, or more accurately it has a queue, but the distinction here is of little value. Akka allows you to change the nature of the mailbox in the dispatcher’s configuration. As with the dispatcher, there are several pre-existing implementations from which you can choose.

UnboundedMailbox

This is the default mailbox. It’s backed by a java.util.concurrent.ConcurrentLinkedQueue, does not block during the enqueue operation, and is not bounded by size. It should serve you very well. The only time it should really be a problem is when it grows without bound and eats up all of your memory.

However, if you have a situation where the UnboundedMailbox fills up and causes an OutOfMemoryError, then what you’re probably seeing is a symptom of a problem, not the root cause, and thus the fix lies elsewhere.

BoundedMailbox

The BoundedMailbox is an alternative to the UnboundedMailbox in that it can only contain a maximum number of messages. When the mailbox is full, the next enqueue operation will block the calling thread for a specified period of time (defaults are in the configuration, which you can override). If the timeout expires before the message can be enqueued, the message will be re-routed to the dead letter office.

Note

The guy performing the enqueue operation isn’t going to be alerted that his operation has failed. Depending on how you see things, this could be good or bad. If you don’t need or want to deal with the problem then you’re golden, but if you need to understand that the message didn’t get to where you wanted it to go, then a bounded mailbox might not be what you’re looking for.
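The enqueue-with-timeout behaviour can be imitated with nothing but the JDK. The sketch below is an analogy, not Akka's implementation: a LinkedBlockingQueue of capacity 2 stands in for the bounded mailbox, and deadLetters is a hypothetical buffer standing in for the dead letter office.

```scala
import java.util.concurrent.{LinkedBlockingQueue, TimeUnit}
import scala.collection.mutable.ListBuffer

object BoundedEnqueueSketch {
  // Hypothetical stand-in for the dead letter office
  val deadLetters = ListBuffer.empty[String]

  // A "mailbox" that can hold at most two messages
  val mailbox = new LinkedBlockingQueue[String](2)

  // Blocks the caller for up to timeoutMillis while the mailbox is
  // full. On timeout the message is quietly diverted to deadLetters;
  // the caller gets no indication either way.
  def enqueue(msg: String, timeoutMillis: Long): Unit = {
    val accepted = mailbox.offer(msg, timeoutMillis, TimeUnit.MILLISECONDS)
    if (!accepted) deadLetters += msg
  }
}
```

Since nothing drains the queue here, a third call to enqueue waits out its timeout and the message lands in deadLetters, while the caller carries on none the wiser.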

This helps solve the out-of-memory problem, but it introduces another problem that’s summed up in a simple question: “What maximum size should I allow for the mailbox?”

That’s a tough question to answer most of the time. Perhaps situations exist where the actor relies on a service that is slow for a few minutes. This may cause the actor’s mailbox to fill up, and thus throw an error at someone eventually. But what happens when, two seconds after the error is thrown, the service speeds up again and drains the mailbox very quickly? In most cases, that lost message might be a real pain in the area where your legs meet your back.

You might want to have a bounded mailbox when you have time limits elsewhere in the application; the idea being that the message is of no value if you can’t enqueue it in the next five seconds. If it can’t be done then some guy earlier up in the chain is just going to move on and do something else. In that case, it’s a huge benefit that it couldn’t get enqueued or processed.

Tip

Prefer unbounded mailboxes over bounded ones. A bounded mailbox may look like the solution to the problem you’re having, but it’s more probable that the problem you’re having is in your code and needs to be solved in your code.

UnboundedPriorityMailbox and BoundedPriorityMailbox

The UnboundedPriorityMailbox and BoundedPriorityMailbox are “priority queue” versions of the previous two. When you create one of these, you specify how Akka should prioritize different messages, and they will route properly to the actor.

To create the logic that maps message types to priority levels, we use the PriorityGenerator, which provides a convenient method for this definition:

    package zzz.akka.investigation

    import akka.dispatch.PriorityGenerator

    case class HighPriority(work: String)
    case class LowPriority(work: String)

    val myPrioComparator = PriorityGenerator {
      // Lower numbers mean higher priority
      case HighPriority(_) => 0
      case LowPriority(_) => 2
      // Default to "medium" priority
      case otherwise => 1
    }
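To see what such a priority mapping does to delivery order, here is a JDK-only sketch. It is an analogy rather than Akka's own mailbox code: a java.util.concurrent.PriorityBlockingQueue stands in for the priority mailbox, and priority and drainInPriorityOrder are hypothetical helpers that mirror the mapping above.

```scala
import java.util.concurrent.PriorityBlockingQueue

object PrioritySketch {
  case class HighPriority(work: String)
  case class LowPriority(work: String)

  // Same mapping as the PriorityGenerator above:
  // lower numbers mean higher priority
  def priority(msg: AnyRef): Int = msg match {
    case HighPriority(_) => 0
    case LowPriority(_)  => 2
    case _               => 1 // everything else is "medium"
  }

  // Enqueue every message, then dequeue until the queue is empty.
  // Messages come out ordered by priority, not by arrival.
  def drainInPriorityOrder(msgs: Seq[AnyRef]): List[AnyRef] = {
    val queue = new PriorityBlockingQueue[AnyRef](
      11, Ordering.by[AnyRef, Int](priority))
    msgs.foreach(m => queue.put(m))
    Iterator.continually(queue.poll()).takeWhile(_ != null).toList
  }
}
```

A LowPriority message that arrives first still comes out last, behind any plain messages and HighPriority ones. The relative order of messages that share a priority level is not something this sketch guarantees.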

A critical look at shared-state concurrency

You grabbed the right toolkit