CHAPTER 5 PARALLELIZATION AND THREADING ENHANCEMENTS

Coordination Data Structures (CDS) and Threading Enhancements

In .NET 4.0, the thread pool has been enhanced, and a number of new synchronization classes have been introduced.

Thread Pool Enhancements

Creating many threads to perform small amounts of work can actually end up taking longer than performing the work on a single thread. This is due to time slicing and the overhead involved in locking and in adding items to and removing items from the thread pool's queue. Previously, the queue of work in the thread pool was held in a linked list structure and utilized a monitor lock. Microsoft improved this by changing to a data structure that is lock-free and creates less work for the garbage collector. Microsoft says that this new structure is very similar to ConcurrentQueue (discussed shortly).

The great news is that if your existing applications use the thread pool and you upgrade them to .NET 4.0, their performance should improve with no changes to your code required.

Thread.Yield()

Calling the new Thread.Yield() method tells the calling thread to give up the remainder of its time slice on the processor to another thread. It is up to the operating system to select the thread that receives the additional time. The thread that called Yield() is then rescheduled in the future. Note that yielding is restricted to the processor/core that the yielding thread is running on.

Monitor.Enter()

The Monitor.Enter() method has a new overload that takes a Boolean parameter by reference and sets it to true if the lock is acquired.
For example:

    bool gotLock = false;
    object lockObject = new object();
    try
    {
        Monitor.Enter(lockObject, ref gotLock);
        //Do stuff
    }
    finally
    {
        //Only release the lock if it was actually acquired
        if (gotLock)
        {
            Monitor.Exit(lockObject);
        }
    }

Concurrent Collections

The concurrent collection classes are thread-safe versions of many of the existing collection classes and should be used in multithreaded or parallelized applications. They can all be found in the System.Collections.Concurrent namespace. When you use any of these classes, it is not necessary to write any locking code because the classes take care of synchronization for you. The MSDN documentation states that these classes also offer superior performance to ArrayList and the generic list classes when accessed from multiple threads.

ConcurrentStack

Thread-safe version of stack (LIFO collection).

ConcurrentQueue

Thread-safe version of queue (FIFO collection).

ConcurrentDictionary

Thread-safe version of dictionary class.

ConcurrentBag

ConcurrentBag is a thread-safe, unordered, high-performance collection of items contained in System.dll. ConcurrentBags are used when it is not important to maintain the order of items in the collection. ConcurrentBags also allow the insertion of duplicates.

ConcurrentBags can be very useful in multithreaded environments because each thread that accesses the bag has its own deque (double-ended queue). When the deque for an individual thread is empty, that thread takes items from the bottom of another thread's deque, reducing the chance of contention occurring. Note that this same technique is used within the thread pool for providing load balancing.

BlockingCollection

BlockingCollection is a collection that enforces upper and lower boundaries in a thread-safe manner. If you attempt to add an item when the upper bound has been reached, the operation will be blocked, and execution will pause.
If, on the other hand, you attempt to remove an item when the BlockingCollection is empty, this operation will also be blocked. This is useful for a number of scenarios, such as the following:

• Increasing performance by allowing threads to add data to the collection and retrieve data from it at the same time. For example, one thread could read from disk or network while another processes the items.
• Preventing additions to a collection until the existing items are processed.

The following example creates two threads: one that will read from the blocking collection and another that will add items to it. Note that we can enumerate through the collection and add to it at the same time, which is not possible with previous collection types.

CAUTION It is important to note that the enumeration will continue indefinitely until the CompleteAdding() method is called.

    using System;
    using System.Collections.Concurrent;
    using System.Threading;

    namespace ConsoleApplication7
    {
        class Program
        {
            //Bounded to five items: Add() blocks while the collection is full
            public static BlockingCollection<string> blockingCol = new BlockingCollection<string>(5);

            static void Main(string[] args)
            {
                //Create thread to read items
                ThreadPool.QueueUserWorkItem(new WaitCallback(ReadItems));
                Console.WriteLine("Created thread to read items");

                //Create thread to add items; note how we are already enumerating the collection!
                ThreadPool.QueueUserWorkItem(new WaitCallback(AddItems));
                Console.WriteLine("Created thread that will add items");

                //Stop app closing
                Console.ReadKey();
            }

            public static void AddItems(object StateInfo)
            {
                int i = 0;
                while (i < 200)
                {
                    blockingCol.Add(i++.ToString());
                    Thread.Sleep(10);
                }

                //Signal that no more items will be added so that the reader's loop can end
                blockingCol.CompleteAdding();
            }

            public static void ReadItems(object StateInfo)
            {
                //Warning: this will run forever unless blockingCol.CompleteAdding() is called
                foreach (string item in blockingCol.GetConsumingEnumerable())
                {
                    Console.WriteLine("Read item: " + item);
                }
            }
        }
    }

Synchronization Primitives

.NET 4.0 introduces a number of synchronization classes (discussed in the following sections).

Barrier

The Barrier class allows you to synchronize threads at a specific point. The MSDN documentation has a good analogy: the Barrier class works a bit like a few friends driving from different cities and agreeing to meet up at a gas station (the barrier) before continuing their journey.

The following example creates two threads: one thread will take twice as long as the other to complete its work. Each thread continues past its call to SignalAndWait() only when both threads have completed their work and made that call.
    using System;
    using System.Threading;

    class Program
    {
        static Barrier MyBarrier;

        static void Main(string[] args)
        {
            //There will be two participants in this barrier
            MyBarrier = new Barrier(2);

            Thread shortTask = new Thread(new ThreadStart(DoSomethingShort));
            shortTask.Start();

            Thread longTask = new Thread(new ThreadStart(DoSomethingLong));
            longTask.Start();

            Console.ReadKey();
        }

        static void DoSomethingShort()
        {
            Console.WriteLine("Doing a short task for 5 seconds");
            Thread.Sleep(5000);
            Console.WriteLine("Completed short task");

            //Blocks here until the long task has also signaled
            MyBarrier.SignalAndWait();
            Console.WriteLine("Off we go from short task!");
        }

        static void DoSomethingLong()
        {
            Console.WriteLine("Doing a long task for 10 seconds");
            Thread.Sleep(10000);
            Console.WriteLine("Completed a long task");

            //Last participant to arrive; releases both threads
            MyBarrier.SignalAndWait();
            Console.WriteLine("Off we go from long task!");
        }
    }

The Barrier class also allows you to change participants at runtime through the AddParticipant() and RemoveParticipant() methods.

Cancellation Tokens

A cancellation token is a struct that provides a consistent means of cancellation. You might want to use a cancellation token to cancel a function or task that is taking too long or using too much of a machine's resources. Support is provided in many of the Task and PLINQ methods for the use of cancellation tokens.

To use cancellation tokens, you first need to create a CancellationTokenSource. You can then pass a cancellation token into the target method by using the source's Token property. Within your method, you can check the token's IsCancellationRequested property and throw an OperationCanceledException if you find it to be true (i.e., a cancellation has been requested).

When you want to perform a cancellation, you simply need to call the Cancel() method on the cancellation source, which will then set the token's IsCancellationRequested property to true.
This sounds more complex than it actually is; the following example demonstrates this process:

    using System;
    using System.Threading;
    using System.Threading.Tasks;

    class Program
    {
        static CancellationTokenSource cts = new CancellationTokenSource();

        static void Main(string[] args)
        {
            Task t = Task.Factory.StartNew(() => DoSomething(), cts.Token);
            Thread.Sleep(2000);

            //Request cancellation after two seconds
            cts.Cancel();
            Console.ReadKey();
        }

        public static void DoSomething()
        {
            try
            {
                while (true)
                {
                    Console.WriteLine("doing stuff");
                    if (cts.Token.IsCancellationRequested)
                    {
                        Console.WriteLine("cancelled");
                        throw new OperationCanceledException(cts.Token);
                    }
                }
            }
            catch (OperationCanceledException)
            {
                //Operation cancelled; do any clean up here
                Console.WriteLine("Cancellation occurred");
            }
        }
    }

CountdownEvent

The new CountdownEvent is initialized with an integer value and can block code until the value reaches 0 (the value is decremented by calling the Signal() method). CountdownEvent is particularly useful for keeping track of scenarios in which many threads have been forked. The following example blocks until the count has been decremented twice:

    using System;
    using System.Threading;

    namespace Chapter5
    {
        class Program
        {
            static CountdownEvent CountDown = new CountdownEvent(2);

            static void Main(string[] args)
            {
                ThreadPool.QueueUserWorkItem(new WaitCallback(CountDownDeduct));
                ThreadPool.QueueUserWorkItem(new WaitCallback(CountDownDeduct));

                //Wait until the countdown has been decremented twice by the CountDownDeduct method
                CountDown.Wait();
                Console.WriteLine("Completed");
                Console.ReadKey();
            }

            static void CountDownDeduct(object StateInfo)
            {
                Thread.Sleep(5000);
                Console.WriteLine("Deducting 1 from countdown");
                CountDown.Signal();
            }
        }
    }

ManualResetEventSlim and SemaphoreSlim

ManualResetEventSlim and SemaphoreSlim are lightweight versions of the existing ManualResetEvent and Semaphore classes.
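As a rough illustration of how the two slim classes are used, the following sketch has four thread pool workers wait on a ManualResetEventSlim for a start signal and then pass through a SemaphoreSlim that admits two of them at a time. The class name, the worker count, and the sleep duration are assumptions made purely for this illustration; the CountdownEvent is there only so that Main can wait for the workers to finish.

```csharp
using System;
using System.Threading;

class SlimPrimitivesSketch
{
    // At most two workers may hold the semaphore at once (initial count of 2)
    static SemaphoreSlim Gate = new SemaphoreSlim(2);

    // Starts unsignaled; workers block on Wait() until Set() is called
    static ManualResetEventSlim StartSignal = new ManualResetEventSlim(false);

    // Used only so that Main can wait for all four workers to finish
    static CountdownEvent Finished = new CountdownEvent(4);

    static void Main(string[] args)
    {
        for (int i = 0; i < 4; i++)
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(Work), i);
        }

        Console.WriteLine("Releasing workers");
        StartSignal.Set();

        Finished.Wait();
        Console.WriteLine("All workers finished");
    }

    static void Work(object state)
    {
        StartSignal.Wait();   // wait for the go-ahead from Main
        Gate.Wait();          // take one of the two available slots
        try
        {
            Console.WriteLine("Worker " + state + " entered");
            Thread.Sleep(500);
        }
        finally
        {
            Gate.Release();   // free the slot for a waiting worker
            Finished.Signal();
        }
    }
}
```

The order in which the workers enter is not deterministic, so the output will vary from run to run.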
Wherever possible, the new classes avoid the resource-expensive kernel features that their predecessors rely on.

SpinLock

SpinLock forces a program to loop until it can obtain and lock access to a particular resource. It should be used only when you expect the wait for the lock to be very short. Although looping (rather than handing control over to another thread) sounds like a wasteful thing to do, it can potentially be much more efficient than stopping to process other threads because it avoids a context switch (a resource-intensive process in which the current CPU state is stored, and a new state is loaded).

    using System;
    using System.Threading;

    class Program
    {
        private static SpinLock MySpinLock = new SpinLock();

        static void Main(string[] args)
        {
            bool Locked = false;
            try
            {
                MySpinLock.Enter(ref Locked);
                //Work that requires lock would be done here
            }
            finally
            {
                //Only release the lock if it was actually acquired
                if (Locked)
                {
                    MySpinLock.Exit();
                }
            }
        }
    }

ThreadLocal<T>

ThreadLocal<T> provides a lazily initialized variable with a separate value for each thread (see Chapter 4 for more info about lazily initialized variables).

Future Considerations

By parallelizing an application, you can greatly speed it up (or slow it down if you do it badly!). It is worth considering the following:

• It is a shame that the ability to utilize all available processing power on a machine (for example, dormant GPUs) was not included in this release.
• Many developers feel the Concurrency and Coordination Runtime (CCR) should have been included in this release. The CCR assists with creating loosely coupled asynchronous applications and was originally included with Microsoft Robotics Studio (it has since been separated out). At the time of writing, the CCR is not free for commercial usage. For more info on CCR, please refer to: http://msdn.microsoft.com/en-gb/library/bb648752.aspx.
• Looking toward the future, is it possible that a future version of Task Manager could allow you to distribute work across multiple machines, paving the way for grid computing libraries within .NET?
• The shift to multicore will mean that existing pricing/licensing models need to be reconsidered.

Danny Shih Interview

I talked to Danny Shih (Program Manager on parallel extensions for the .NET team) about his thoughts on the parallel extensions:

The underlying architecture of the Task Scheduler changed during development to use the thread pool; can you say a bit more about this decision?

Our managed scheduler (on which TPL was originally built) and the ThreadPool basically served the same purpose, and in Dev10, the two teams were working on different enhancements. The ThreadPool team was working on things like hill-climbing (an algorithm to determine and adjust to the optimal number of threads for a given workload), and we had added things like work-stealing queues to our managed scheduler. So to avoid duplicating code and to take advantage of all new enhancements, we wanted to either build the ThreadPool on TPL or build TPL on the ThreadPool. For various reasons, we took the latter approach.

What do you see as the potential pitfalls when using the new parallel enhancements?

I think the major one is adding parallelism to an application when it's unsafe to do so. New APIs like Parallel.For() make it extremely easy to introduce concurrency, both correctly and incorrectly. A common scenario is parallelizing a serial loop that has iterations that depend on other iterations (possibly resulting in deadlock) or that has iterations that access shared state without synchronization (possible race conditions resulting in incorrect results).

Where do you see the .NET parallelization/threading APIs heading in the future?

In future versions, we're definitely trying to refine our APIs (adding stuff we think we missed, mainly). We're also discussing cluster and GPU support, but there's nothing to announce there yet.
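The shared-state pitfall Danny describes can be sketched in a few lines. This is only a minimal illustration, not code from the interview: the class name and the iteration count are assumptions, and Interlocked.Increment() is just one of several ways to make the update safe.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class SharedStateSketch
{
    static void Main(string[] args)
    {
        const int iterations = 1000000;

        // Unsafe: every iteration performs an unsynchronized read-modify-write
        // on the same variable, so updates from different threads can be lost
        int unsafeTotal = 0;
        Parallel.For(0, iterations, i => { unsafeTotal++; });

        // Safe: Interlocked.Increment makes each update atomic
        int safeTotal = 0;
        Parallel.For(0, iterations, i => { Interlocked.Increment(ref safeTotal); });

        Console.WriteLine("Unsynchronized total: " + unsafeTotal + " (may be less than " + iterations + ")");
        Console.WriteLine("Interlocked total: " + safeTotal);
    }
}
```

On a multicore machine the first total will often fall short of the iteration count, although any single run may happen to produce the right answer, which is what makes this kind of bug hard to spot.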
Phil Whinstanley

http://weblogs.asp.net/plip/

I talked to Phil Whinstanley (ASP.NET MVP and author) about his experience of the parallel enhancements in .NET 4.

“Working on a very heavy IO (disk and network) project (zero configuration hosted build server) we parallelized the process in a matter of minutes, changing a foreach to a Parallel.ForEach(), giving us a performance increase which reduced the time taken to execute from one minute and thirty seconds down to fourteen seconds. We were gobsmacked.”

Conclusion

In the future, the majority of machines will have multicore processors. The new parallelization improvements give the developer an easy-to-use and powerful way to take advantage of this resource. Parallelization should enable the creation of applications that would currently run too slowly to be viable. The applications that have the most to gain from parallelization are those in the fields of games, graphics, mathematical/scientific modeling, and artificial intelligence.

Parallelization will require us to make a major shift in the way we design and develop as we move from solving problems in serial to solving them in parallel. We currently have difficulty developing bug-free serial applications, so it is important not to underestimate the additional complexity that running code in parallel will add to your application.

Further Reading

• Axum: a research project that aims to provide a “safe and productive parallel programming model for .NET”
• http://msdn.microsoft.com/en-us/devlabs/dd795202.aspx
• http://www.danielmoth.com/Blog/
• Fantastic free document on parallelization patterns: http://www.microsoft.com/downloads/details.aspx?FamilyID=86b3d32b-ad26-4bb8-a3ae-c1637026c3ee&displaylang=en
• A language aimed at making parallel applications “safer, more scalable and easier to write”: http://blogs.msdn.com/maestroteam/default.aspx
• http://managed-world.com/archive/2009/02/09/an-intro-to-barrier.aspx
• http://channel9.msdn.com/shows/Going+Deep/Erika-Parsons-and-Eric-Eilebrecht CLR-4-Inside-the-new-Threadpool/
• Joe Duffy and Herb Sutter, Concurrent Programming on Windows: Architecture, Principles, and Patterns (Microsoft .NET Development). Addison Wesley, 2008.