start until the others are finished; waiting for the last task implicitly means waiting for all three here. Continuations are slightly more interesting when the initial task produces a result—the continuation can then do something with the output. For example, you might have a task that fetches some data from a server and then have a continuation that puts the result into the user interface. Of course, we need to be on the correct thread to update the UI, but the TPL can help us with this.

Schedulers

The TaskScheduler class is responsible for working out when and how to execute tasks. If you don't specify a scheduler, you'll end up with the default one, which uses the thread pool. But you can provide other schedulers when creating tasks—both StartNew and ContinueWith offer overloads that accept a scheduler. The TPL offers a scheduler that uses the SynchronizationContext, which can run tasks on the UI thread. Example 16-21 shows how to use this in an event handler in a WPF application.

Example 16-21. Continuation on a UI thread

void OnButtonClick(object sender, RoutedEventArgs e)
{
    TaskScheduler uiScheduler =
        TaskScheduler.FromCurrentSynchronizationContext();
    Task<string>.Factory.StartNew(GetData)
        .ContinueWith((task) => UpdateUi(task.Result), uiScheduler);
}

string GetData()
{
    WebClient w = new WebClient();
    return w.DownloadString("http://oreilly.com/");
}

void UpdateUi(string info)
{
    myTextBox.Text = info;
}

This example creates a task that returns a string, using the default scheduler. This task will invoke the GetData function on a thread pool thread. But it also sets up a continuation using a TaskScheduler that was obtained by calling FromCurrentSynchronizationContext. This grabs the SynchronizationContext class's Current property and returns a scheduler that uses that context to run all tasks. Since the continuation specifies that it wants to use this scheduler, it will run the UpdateUi method on the UI thread.
The upshot is that GetData runs on a thread pool thread, and then its return value is passed into UpdateUi on the UI thread.

662 | Chapter 16: Threads and Asynchronous Code

We could use a similar trick to work with APM implementations, because task factories provide methods for creating APM-based tasks.

Tasks and the Asynchronous Programming Model

TaskFactory and TaskFactory<TResult> provide various overloads of a FromAsync method. You can pass this the Begin and End methods from an APM implementation, along with the arguments you'd like to pass, and it will return a Task or Task<TResult> that executes the asynchronous operation, instead of one that invokes a delegate. Example 16-22 uses this to wrap the asynchronous methods we used from the Dns class in earlier examples in a task.

Example 16-22. Creating a task from an APM implementation

TaskScheduler uiScheduler = TaskScheduler.FromCurrentSynchronizationContext();
Task<IPHostEntry>.Factory.FromAsync(
    Dns.BeginGetHostEntry, Dns.EndGetHostEntry,
    "oreilly.com", null)
    .ContinueWith((task) => UpdateUi(task.Result.AddressList[0].ToString()),
                  uiScheduler);

FromAsync offers overloads for versions of the APM that take zero, one, two, or three arguments, which covers the vast majority of APM implementations. As well as passing the Begin and End methods, we also pass the arguments, and the additional object argument that all APM Begin methods accept. (For the minority of APM implementations that either require more arguments or have out or ref parameters, there's an overload of FromAsync that accepts an IAsyncResult instead. This requires slightly more code, but enables you to wrap any APM implementation as a task.)

We've seen the main ways to create tasks, and to set up associations between them either with parent-child relationships or through continuations. But what happens if you want to stop some work after you've started it? Neither the thread pool nor the APM supports cancellation, but the TPL does.
Cancellation

Cancellation of asynchronous operations is surprisingly tricky. There are lots of awkward race conditions to contend with. The operation you're trying to cancel might already have finished by the time you try to cancel it. Or if it hasn't, it might have gotten beyond the point where it is able to stop, in which case cancellation is doomed to fail. Or the work might have failed, or be about to fail, when you cancel it. And even when cancellation is possible, it might take a while to complete. Handling and testing every possible combination is difficult enough when you have just one operation, but if you have multiple related tasks, it gets a whole lot harder.

Fortunately, .NET 4 introduces a new cancellation model that provides a well-thought-out and thoroughly tested solution to the common cancellation problems. This cancellation model is not limited to the TPL—you are free to use it on its own, and it also crops up in other parts of the .NET Framework. (The data parallelism classes we'll be looking at later can use it, for example.)

If you want to be able to cancel an operation, you must pass it a CancellationToken. A cancellation token allows the operation to discover whether cancellation has been requested—it provides an IsCancellationRequested property—and it's also possible to pass a delegate to its Register method in order to be called back if cancellation happens.

CancellationToken only provides facilities for discovering that cancellation has been requested. It does not provide the ability to initiate cancellation. That is provided by a separate class called CancellationTokenSource. The reason for splitting the discovery and control of cancellation across two types is that it would otherwise be impossible to provide a task with cancellation notifications without also granting that task the capability of initiating cancellation.
CancellationTokenSource is a factory of cancellation tokens—you ask it for a token and then pass that into the operation you want to be able to cancel. Example 16-23 is similar to Example 16-21, but it passes a cancellation token to StartNew, and then uses the source to cancel the operation if the user clicks a Cancel button.

Example 16-23. Ineffectual cancellation

private CancellationTokenSource cancelSource;

void OnButtonClick(object sender, RoutedEventArgs e)
{
    cancelSource = new CancellationTokenSource();
    TaskScheduler uiScheduler =
        TaskScheduler.FromCurrentSynchronizationContext();
    Task<string>.Factory.StartNew(GetData, cancelSource.Token)
        .ContinueWith((task) => UpdateUi(task.Result), uiScheduler);
}

void OnCancelClick(object sender, RoutedEventArgs e)
{
    if (cancelSource != null)
    {
        cancelSource.Cancel();
    }
}

string GetData()
{
    WebClient w = new WebClient();
    return w.DownloadString("http://oreilly.com/");
}

void UpdateUi(string info)
{
    cancelSource = null;
    myTextBox.Text = info;
}

In fact, cancellation isn't very effective in this example, because this particular task consists of code that makes a single blocking method call. Cancellation will usually do nothing here in practice—the only situation in which it would have an effect is if the user managed to click Cancel before the task had even begun to execute. This illustrates an important issue: cancellation is never forced—it uses a cooperative approach, because the only alternative is killing the thread executing the work. And while that would be possible, forcibly terminating threads tends to leave the process in an uncertain state—it's usually impossible to know whether the thread you just zapped happened to be in the middle of modifying some shared state. Since this leaves your program's integrity in doubt, the only thing you can safely do next is kill the whole program, which is a bit drastic.
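By way of contrast with the ineffectual version, here is a minimal sketch, not taken from the book, of a task body that does cooperate with cancellation by polling its token between small units of work. The ProcessItems method and the console program around it are invented for illustration:

```csharp
using System;
using System.Threading;

class CooperativeCancellation
{
    // Hypothetical worker: handles "items" one at a time, checking the
    // token between items so cancellation can take effect promptly.
    public static int ProcessItems(int itemCount, CancellationToken token)
    {
        int processed = 0;
        for (int i = 0; i < itemCount; ++i)
        {
            if (token.IsCancellationRequested)
            {
                // Stop cleanly at a safe point instead of being killed mid-work.
                break;
            }
            processed++;  // Stand-in for one small unit of real work
        }
        return processed;
    }

    static void Main()
    {
        CancellationTokenSource source = new CancellationTokenSource();
        source.Cancel();  // Cancellation is already requested for this demo

        int done = ProcessItems(1000000, source.Token);
        Console.WriteLine(done);  // 0: the loop notices the request immediately
    }
}
```

Because the worker itself decides where it is safe to stop, no thread is ever forcibly terminated, which is exactly the cooperation the TPL's model relies on.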
So the cancellation model requires cooperation on the part of the task in question. If you have divided your work into numerous relatively short tasks, cancellation is more useful—if you cancel tasks that have been queued up but not yet started, they will never run at all. Tasks already in progress will continue to run, but if all your tasks are short, you won't have to wait long. If you have long-running tasks, however, you will need to be able to detect cancellation and act on it if you want to handle cancellation swiftly. This means you will have to arrange for the code you run as part of the tasks to have access to the cancellation token, and it must test the IsCancellationRequested property from time to time.

Cancellation isn't the only reason a task or set of tasks might stop before finishing—things might be brought to a halt by exceptions.

Error Handling

A task can complete in one of three ways: it can run to completion, it can be canceled, or it can fault. The Task object's Status property (of type TaskStatus) reflects this through the RanToCompletion, Canceled, and Faulted values, respectively, and if the task enters the Faulted state, its IsFaulted property also becomes true. A code-based task will enter the Faulted state if its method throws an exception. You can retrieve the exception information from the task's Exception property. This returns an AggregateException, which contains a list of exceptions in its InnerExceptions property. It's a list because certain task usage patterns can end up hitting multiple exceptions; for example, you might have multiple failing child tasks.
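As a sketch of what retrieving those exceptions looks like in practice (the failing task and its message below are invented for illustration), waiting on a faulted task rethrows the failure wrapped in an AggregateException, whose InnerExceptions list can then be inspected:

```csharp
using System;
using System.Threading.Tasks;

class FaultedTaskDemo
{
    public static string DescribeFailure()
    {
        // A task whose body throws enters the Faulted state.
        Task failing = Task.Factory.StartNew(() =>
        {
            throw new InvalidOperationException("Demo failure");
        });

        try
        {
            failing.Wait();  // Rethrows the failure as an AggregateException
            return "no error";
        }
        catch (AggregateException aggEx)
        {
            // InnerExceptions holds every exception the task (or its
            // children) hit; here there is just one.
            return aggEx.InnerExceptions[0].Message;
        }
    }

    static void Main()
    {
        Console.WriteLine(DescribeFailure());  // Prints: Demo failure
    }
}
```

Once Wait has thrown, the exception counts as observed, so the finalizer-based escalation described next does not occur for this task.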
If you don't check the IsFaulted property and instead just attempt to proceed, either by calling Wait or by attempting to fetch the Result of a Task<TResult>, the AggregateException will be thrown back into your code.

It's possible to write code that never looks for the exception. Example 16-17 starts two tasks, and since it ignores the Task objects returned by StartNew, it clearly never does anything more with the tasks. If they were children of another task, that wouldn't matter—if you ignore exceptions in child tasks, they end up causing the parent task to fault. But these are not child tasks, so if exceptions occur during their execution, the program won't notice. However, the TPL tries hard to make sure you don't ignore such exceptions—it uses a feature of the garbage collector called finalization to discover when a Task that faulted is about to be collected without your program ever having noticed the exception. When it detects this, it throws the AggregateException, which will cause your program to crash unless you've configured your process to deal with unhandled exceptions. (The .NET Framework runs all finalizers on a dedicated thread, and it's this thread that the TPL throws the exception on.) The TaskScheduler class offers an UnobservedTaskException event that lets you customize the way these unhandled exceptions are dealt with.

The upshot is that you should write error handling for any nonchild tasks that could throw. One way to do this is to provide a continuation specifically for error handling. The ContinueWith method takes an optional argument whose type is the TaskContinuationOptions enumeration, which has an OnlyOnFaulted value—you could use this to build a continuation that will run only when an unanticipated exception occurs. (Of course, unanticipated exceptions are always bad news because, by definition, you weren't expecting them and therefore have no idea what state your program is in.
So you probably need to terminate the program, which is what would have happened anyway if you hadn't written any error handling. However, you do get to write errors to your logs, and perhaps make an emergency attempt to write out unsaved data somewhere in the hope of recovering it when the program restarts.) But in general, it's preferable to handle errors by putting normal try/catch blocks inside your code so that the exceptions never make it out into the TPL in the first place.

Data Parallelism

The final concurrency feature we're going to look at is data parallelism. This is where concurrency is driven by having lots of data items, rather than by explicitly creating numerous tasks or threads. It can be a simple approach to parallelism because you don't have to tell the .NET Framework anything about how you want it to split up the work. With tasks, the .NET Framework has no idea how many tasks you plan to create when you create the first one, but with data parallelism, it has the opportunity to see more of the problem before deciding how to spread the load across the available logical processors. So in some scenarios, it may be able to make more efficient use of the available resources.

Parallel For and ForEach

The Parallel class provides a couple of methods for performing data-driven parallel execution. Its For and ForEach methods are similar in concept to C# for and foreach loops, but rather than iterating through collections one item at a time, on a system with multiple logical processors available they will process multiple items simultaneously. Example 16-24 uses Parallel.For. This code calculates pixel values for a fractal known as the Mandelbrot set, a popular parallelism demonstration because each pixel value can be calculated entirely independently of all the others, so the scope for parallel execution is effectively endless (unless machines with more logical processors than pixels become available).
And since it’s a relatively expensive computation, the benefits of parallel execution are easy to see. Normally, this sort of code would contain two nested for loops, one to iterate over the rows of pixels and one to iterate over the columns in each row. In this example, the outer loop has been replaced with a Parallel.For. (So this particular code cannot exploit more processors than it calculates lines of pixels— therefore, we don’t quite have scope for per-pixel parallelism, but since you would typically generate an image a few hundred pixels tall, there is still a reasonable amount of scope for concurrency here.) Example 16-24. Parallel.For static int[,] CalculateMandelbrotValues(int pixelWidth, int pixelHeight, double left, double top, double width, double height, int maxIterations) { int[,] results = new int[pixelWidth, pixelHeight]; // Non-parallel version of following line would have looked like this: // for(int pixelY = 0; pixelY < pixelHeight; ++pixelY) Parallel.For(0, pixelHeight, pixelY => { double y = top + (pixelY * height) / (double) pixelHeight; for (int pixelX = 0; pixelX < pixelWidth; ++pixelX) { double x = left + (pixelX * width) / (double) pixelWidth; // Note: this lives in the System.Numerics namespace in the // System.Numerics assembly. Complex c = new Complex(x, y); Complex z = new Complex(); int iter; for (iter = 1; z.Magnitude < 2 && iter < maxIterations; ++iter) { z = z * z + c; } if (iter == maxIterations) { iter = 0; } results[pixelX, pixelY] = iter; } }); Data Parallelism | 667 return results; } This structure, seen in the preceding code: Parallel.For(0, pixelHeight, pixelY => { }); iterates over the same range as this: for(int pixelY = 0, pixelY < pixelHeight; ++pixelY) { } The syntax isn’t identical because Parallel.For is just a method, not a language feature. 
The first two arguments indicate the range—the start value is inclusive (i.e., it will start from the specified value), but the end value is exclusive (it stops one short of the end value). The final argument to Parallel.For is a delegate that takes the iteration variable as its argument. Example 16-24 uses a lambda, whose minimal syntax introduces the least possible extra clutter over a normal for loop.

Parallel.For will attempt to execute the delegate on multiple logical processors simultaneously, using the thread pool to attempt to make full, efficient use of the available processors. The way it distributes the iterations across logical processors may come as a surprise, though. It doesn't simply give the first row to the first logical processor, the second row to the second logical processor, and so on. It carves the available rows into chunks, and so the second logical processor will find itself starting several rows ahead of the first. And it may decide to subdivide further depending on the progress your code makes. So you must not rely on the iterations being done in any particular order.

It does this chunking to avoid subdividing the work into pieces that are too small to handle efficiently. Ideally, each CPU should be given work in lumps that are large enough to minimize context switching and synchronization overheads, but small enough that each CPU can be kept busy while there's work to be done. This chunking is one reason why data parallelism can sometimes be more efficient than using tasks directly—the parallelism gets to be exactly as fine-grained as necessary and no more so, minimizing overheads.

Arguably, calling Example 16-24 data parallelism is stretching a point—the "data" here is just the numbers being fed into the calculations. Parallel.For is no more or less data-oriented than a typical for loop with an int loop counter—it just iterates a numeric variable over a particular range.
However, you could use exactly the same construct to iterate over a range of data instead of a range of numbers. Alternatively, there's Parallel.ForEach, which is very similar in use to Parallel.For, except, as you'd expect, it iterates over any IEnumerable<T> like a C# foreach loop, instead of using a range of integers. It reads ahead into the enumeration to perform chunking. (And if you provide it with an IList<T>, it will use the list's indexer to implement a more efficient partitioning strategy.)

There's another way to perform parallel iteration over enumerable data: PLINQ.

PLINQ: Parallel LINQ

Parallel LINQ (PLINQ) is a LINQ provider that enables any IEnumerable<T> to be processed using normal LINQ query syntax, but in a way that works in parallel. On the face of it, it's deceptively simple. This:

var pq = from x in someList
         where x.SomeProperty > 42
         select x.Frob(x.Bar);

will use LINQ to Objects, assuming that someList implements IEnumerable<T>. Here's the PLINQ version:

var pq = from x in someList.AsParallel()
         where x.SomeProperty > 42
         select x.Frob(x.Bar);

The only difference here is the addition of a call to AsParallel, an extension method that the ParallelEnumerable class makes available on all IEnumerable<T> implementations. It's available to any code that has brought the System.Linq namespace into scope with a suitable using directive. AsParallel returns a ParallelQuery<T>, which means that the normal LINQ to Objects implementation of the standard LINQ operators no longer applies. All the same operators are available, but now they're supplied by ParallelEnumerable, which is able to execute certain operators in parallel.

Not all queries will execute in parallel. Some LINQ operators essentially force things to be done in a certain order, so PLINQ will inspect the structure of your query to decide which parts, if any, it can usefully run in parallel.
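A minimal runnable sketch of a PLINQ query (the numbers, predicate, and method names below are invented for illustration): because summing is insensitive to the order in which elements arrive, the parallel query produces the same answer as its sequential counterpart:

```csharp
using System;
using System.Linq;

class PlinqDemo
{
    public static int SumOfBigSquares(int[] source)
    {
        // AsParallel opts this query into PLINQ; ParallelEnumerable now
        // supplies Where, Select, and Sum instead of LINQ to Objects.
        return source.AsParallel()
                     .Where(x => x > 2)
                     .Select(x => x * x)
                     .Sum();
    }

    static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5 };
        // 3*3 + 4*4 + 5*5 = 50, regardless of which thread handles which element.
        Console.WriteLine(SumOfBigSquares(numbers));  // 50
    }
}
```

Order-sensitive consumption of a parallel query is a different story, which is where the next point about foreach and ForAll comes in.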
Iterating over the results with foreach can restrict the extent to which the query can execute in parallel, because foreach asks for items one at a time—upstream parts of the query may still be able to execute concurrently, but the final results will be sequential. If you'd like to execute code for each item and to allow work to proceed in parallel even for this final processing step, PLINQ offers a ForAll operator:

pq.ForAll(x => x.DoSomething());

This will execute the delegate once for each item the query returns, and can do so in parallel—it will use as many logical processors concurrently as possible to evaluate the query and to call the delegate you provide.

This means that all the usual multithreading caveats apply for the code you run from ForAll. In fact, PLINQ can be a little dangerous, as it's not that obvious that your code is going to run on multiple threads—it manages to make parallel code look just a bit too normal. This is not always a problem—LINQ tends to encourage a functional style of programming in its queries, meaning that most of the data involved will be used in a read-only fashion, which makes dealing with threading much simpler. But code executed by ForAll is useful only if it has side effects, so you need to be careful with whatever you put in there.

Summary

To exploit the potential of multicore CPUs, you'll need to run code on multiple threads. Threads can also be useful for keeping user interfaces responsive in the face of slow operations, although asynchronous programming techniques can be a better choice than creating threads explicitly. While you can create threads explicitly, the thread pool—used either directly or through the Task Parallel Library—is often preferable because it makes it easier for your code to adapt to the available CPU resources on the target machine.
For code that needs to process large collections of data or perform uniform calculations across large ranges of numbers, data parallelism can help parallelize your execution without adding too much complication to your code.

No matter what multithreading mechanisms you use, you are likely to need the synchronization and locking primitives to ensure that your code avoids concurrency hazards such as races. The monitor facility built into every .NET object, and exposed through the Monitor class and the C# lock keyword, is usually the best mechanism to use, but some more specialized primitives are available that can work better if you happen to find yourself in one of the scenarios for which they are designed.

CHAPTER 17
Attributes and Reflection

As well as containing code and data, a .NET program can also contain metadata. Metadata is information about the data—that is, information about the types, code, fields, and so on—stored along with your program. This chapter explores how some of that metadata is created and used.

A lot of the metadata is information that .NET needs in order to understand how your code should be used—for example, metadata defines whether a particular method is public or private. But you can also add custom metadata, using attributes.

Reflection is the process by which a program can read its own metadata, or metadata from another program. A program is said to reflect on itself or on another program, extracting metadata from the reflected assembly and using that metadata either to inform the user or to modify the program's behavior.

Attributes

An attribute is an object that represents data you want to associate with an element in your program. The element to which you attach an attribute is referred to as the target of that attribute.
For example, in Chapter 12 we saw the XmlIgnore attribute applied to a property:

[XmlIgnore]
public string LastName { get; set; }

This tells the XML serialization system that we want it to ignore this particular property when converting between XML and objects of this kind.

This illustrates an important feature of attributes: they don't do anything on their own. The XmlIgnore attribute contains no code, nor does it cause anything to happen when the relevant property is read or modified. It only has any effect when we use XML serialization, and the only reason it does anything then is because the XML serialization system goes looking for it. Attributes are passive. They are essentially just annotations. For them to be useful, something somewhere needs to look for them.
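To make "something somewhere needs to look for them" concrete, here is a small sketch (not from the book; the attribute name and sample classes are invented) that defines a custom attribute, applies it, and then goes looking for it via reflection:

```csharp
using System;

// A custom attribute is just a class deriving from System.Attribute.
[AttributeUsage(AttributeTargets.Class)]
class DocumentationAttribute : Attribute
{
    public string Text { get; private set; }
    public DocumentationAttribute(string text) { Text = text; }
}

// Applying the attribute stores the annotation in the assembly's metadata;
// nothing happens until some code asks for it.
[Documentation("A sample class")]
class Annotated { }

class AttributeReader
{
    public static string GetDocumentation(Type t)
    {
        // This reflection call is the "something that looks for" the attribute.
        object[] attrs = t.GetCustomAttributes(typeof(DocumentationAttribute), false);
        return attrs.Length > 0
            ? ((DocumentationAttribute) attrs[0]).Text
            : "(none)";
    }

    static void Main()
    {
        Console.WriteLine(GetDocumentation(typeof(Annotated)));        // A sample class
        Console.WriteLine(GetDocumentation(typeof(AttributeReader))); // (none)
    }
}
```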