CHAPTER 15 ■ LAMBDA EXPRESSIONS

            var expr = Expression<Func<int,int>>.Lambda<Func<int,int>>(
                           Expression.Add(n, Expression.Constant(1)),
                           n );

            Func<int, int> func = expr.Compile();

            for( int i = 0; i < 10; ++i ) {
                Console.WriteLine( func(i) );
            }
        }
    }

The lines that build expr here replace the single line in the prior example in which the expr variable is assigned the lambda expression n => n+1. I think you’ll agree that the first example is much easier to read. However, this longhand example helps express the true flexibility of expression trees. Let’s break down the steps of building the expression. First, you need to represent the parameters in the parameter list of the lambda expression. In this case, there is only one: the variable n. Thus we start with the following:

    var n = Expression.Parameter( typeof(int), "n" );

■ Note In these examples, I am using implicitly typed variables to save myself a lot of typing and to reduce clutter for readability. Remember, the variables are still strongly typed. The compiler simply infers their type at compile time rather than requiring you to provide the type.

This line of code says that we need an expression to represent a variable named n that is of type int. Remember that in a plain lambda expression, this type can be inferred based upon the delegate type provided.

Now, we need to construct a BinaryExpression instance that represents the addition operation, as shown next:

    Expression.Add(n, Expression.Constant(1))

Here, I’ve said that my BinaryExpression should consist of adding an expression representing a constant, the number 1, to an expression representing the parameter n.

You might have already started to notice a pattern. The framework implements a form of the Abstract Factory design pattern for creating instances of expression elements. That is, you cannot create a new instance of BinaryExpression, or any other building block of expression trees, using the new operator along with the constructor of the type.
The constructor is not accessible, so you must use the static methods on the Expression class to create those instances. They give us as consumers the flexibility to express what we want and allow the Expression implementation to decide which type we really need.

■ Note If you look up BinaryExpression, UnaryExpression, ParameterExpression, and so on in the MSDN documentation, you will notice that there are no public constructors on these types. Instead, you create instances of Expression derived types using the Expression type, which implements the factory pattern and exposes static methods for creating instances of Expression derived types.

Now that you have the BinaryExpression, you need to use the Expression.Lambda<> method to bind the expression (in this case, n+1) with the parameters in the parameter list (in this case, n). Notice that in the example I use the generic Lambda<> method so that I can create the type Expression<Func<int,int>>. Using the generic form gives the compiler more type information, so it can catch at compile time any errors I might have introduced rather than let those errors bite me at run time.

One more demonstration of how expressions represent operations as data comes from the Expression Tree Debugger Visualizer in Visual Studio 2010. If you execute the previous example within the Visual Studio Debugger, once you step past the point where you assign the expression into the expr variable, you will notice that in either the “Autos” or “Locals” windows, the expression is parsed and displayed as {n => (n + 1)} even though it is of type System.Linq.Expressions.Expression<System.Func<int,int>>. Naturally, this is a great help while creating complicated expression trees.

■ Note If I had used the nongeneric version of the Expression.Lambda method, the result would have been typed as LambdaExpression rather than Expression<Func<int,int>>.
LambdaExpression also implements the Compile method; however, instead of a strongly typed delegate, it returns an instance of type Delegate. Before you can invoke the Delegate instance, you must either cast it to its actual delegate type (in this case, Func<int, int>) or call DynamicInvoke on the delegate. Either one of those could throw an exception at run time if you have a mismatch between your expression and the type of delegate you think it should generate.

Operating on Expressions

Now I want to show you an example of how you can take an expression tree generated from a lambda expression and modify it to create a new expression tree. In this case, I will take the expression (n+1) and turn it into 2*(n+1):

    using System;
    using System.Linq;
    using System.Linq.Expressions;

    public class EntryPoint
    {
        static void Main() {
            Expression<Func<int,int>> expr = n => n+1;

            // Now, reassign the expr by multiplying the original
            // expression by 2.
            expr = Expression<Func<int,int>>.Lambda<Func<int,int>>(
                       Expression.Multiply( expr.Body,
                                            Expression.Constant(2) ),
                       expr.Parameters );

            Func<int, int> func = expr.Compile();

            for( int i = 0; i < 10; ++i ) {
                Console.WriteLine( func(i) );
            }
        }
    }

The reassignment of expr shows the stage at which I multiply the original lambda expression by 2. It’s very important to notice that the parameters passed into the Lambda<> method (the second argument) need to be exactly the same instances of the parameters that come from the original expression; that is, expr.Parameters. This is required.
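A short sketch can make this requirement concrete. The class and method names below (ParameterIdentitySketch, BuildWithOriginalParameter, FreshParameterFails) are hypothetical and not from the text; only the System.Linq.Expressions calls are real. It assumes that a mismatched parameter surfaces as an InvalidOperationException while the lambda is built or compiled, which matches the exception type quoted in this section; the exact message can differ between framework versions.

```csharp
using System;
using System.Linq.Expressions;

public static class ParameterIdentitySketch
{
    // Reuses expr.Parameters, so the body and the parameter list
    // refer to the very same ParameterExpression instance.
    public static Func<int, int> BuildWithOriginalParameter() {
        Expression<Func<int, int>> expr = n => n + 1;
        var doubled = Expression.Lambda<Func<int, int>>(
            Expression.Multiply( expr.Body, Expression.Constant(2) ),
            expr.Parameters );
        return doubled.Compile();
    }

    // Supplies a brand-new parameter that merely shares the name "n".
    // Returns true if building or compiling the lambda is rejected.
    public static bool FreshParameterFails() {
        Expression<Func<int, int>> expr = n => n + 1;
        var impostor = Expression.Parameter( typeof(int), "n" );
        try {
            var broken = Expression.Lambda<Func<int, int>>(
                Expression.Multiply( expr.Body, Expression.Constant(2) ),
                impostor );
            broken.Compile();
            return false;
        }
        catch( InvalidOperationException ) {
            return true;
        }
    }

    public static void Main() {
        // (3 + 1) * 2
        Console.WriteLine( BuildWithOriginalParameter()(3) );
        Console.WriteLine( FreshParameterFails() );
    }
}
```

The design point is that expression trees identify parameters by object identity, not by name, so the safest habit when rewriting a lambda is to pass along the original Parameters collection untouched.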
You cannot pass a new instance of ParameterExpression to the Lambda<> method; otherwise, at run time you will receive an exception similar to the following because the new ParameterExpression instance, even though it might have the same name, is actually a different parameter instance:

    System.InvalidOperationException: Lambda Parameter not in scope

There are many classes derived from the Expression class and many static methods for creating instances of them and combining other expressions. It would be monotonous for me to describe them all here. Therefore, I recommend that you refer to the MSDN Library documentation regarding the System.Linq.Expressions namespace for all the fantastic details.

Functions as Data

If you have ever studied functional languages such as Lisp, you might notice the similarities between expression trees and how Lisp and similar languages represent functions as data structures. Most people encounter Lisp in an academic environment, and many times concepts that one learns in academia are not directly applicable to the real world. But before you eschew expression trees as merely an academic exercise, I want to point out how they are actually very useful.

As you might already guess, within the scope of C#, expression trees are extremely useful when applied to LINQ. I will give a full introduction to LINQ in Chapter 16, but for our discussion here, the most important fact is that LINQ provides a language-native, expressive syntax for describing operations on data that are not naturally modeled in an object-oriented way. For example, you can create a LINQ expression to search a large in-memory array (or any other IEnumerable type) for items that match a certain pattern. LINQ is extensible and can provide a means of operating on other types of stores, such as XML and relational databases.
In fact, out of the box, C# supports LINQ to SQL, LINQ to DataSet, LINQ to Entities, LINQ to XML, and LINQ to Objects, which collectively allow you to perform LINQ operations on any type that supports IEnumerable.

So how do expression trees come into play here? Imagine that you are implementing LINQ to SQL to query relational databases. The user’s database could be half a world away, and it might be very expensive to perform a simple query. On top of that, you have no way of judging how complex the user’s LINQ expression might be. Naturally, you want to do everything you can to provide the most efficient experience possible.

If the LINQ expression is represented in data (as an expression tree) rather than in IL (as a delegate), you can operate on it. Maybe you have an algorithm that can spot places where an optimization can be utilized, thus simplifying the expression. Or maybe when your implementation analyzes the expression, you determine that the entire expression can be packaged up, sent across the wire, and executed in its entirety on the server.

Expression trees give you this important capability. Then, when you are finished operating on the data, you can translate the expression tree into the final executable operation via a mechanism such as the LambdaExpression.Compile method and go. Had the expression only been available as IL code from the beginning, your flexibility would have been severely limited. I hope now you can appreciate the true power of expression trees in C#.

Useful Applications of Lambda Expressions

Now that I have shown you what lambda expressions look like, let’s consider some of the things you can do with them. You can actually implement most of the following examples in C# using anonymous methods or delegates. However, it’s amazing how a simple syntactic addition to the language can clear the fog and open up the possibilities of expressiveness.
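To make that comparison concrete, here is a small sketch that builds the same n + 1 delegate three ways: from a named method, from an anonymous method, and from a lambda expression. The names DelegateStyles, AddOneNamed, and MakeAll are hypothetical and chosen only for this illustration.

```csharp
using System;

public static class DelegateStyles
{
    // A named method, the only option before anonymous methods existed.
    static int AddOneNamed( int n ) {
        return n + 1;
    }

    public static Func<int, int>[] MakeAll() {
        // Explicit delegate construction from a named method.
        Func<int, int> named = new Func<int, int>( AddOneNamed );
        // Anonymous method syntax, introduced in C# 2.0.
        Func<int, int> anonymous = delegate( int n ) { return n + 1; };
        // Lambda expression syntax, introduced in C# 3.0.
        Func<int, int> lambda = n => n + 1;
        return new[] { named, anonymous, lambda };
    }

    public static void Main() {
        foreach( var f in MakeAll() ) {
            Console.WriteLine( f(41) );   // each delegate yields 42
        }
    }
}
```

All three delegates behave identically; the difference is only in how much ceremony surrounds the one expression that matters.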
Iterators and Generators Revisited

I’ve described how you can create custom iterators with C# in a couple of places in this book already.[5] Now I want to demonstrate how you can use lambda expressions to create custom iterators. The point I want to stress is how the code implementing the algorithm, in this case the iteration algorithm, is then factored out into a reusable method that can be applied in almost any scenario.

■ Note Those of you who are also C++ programmers and familiar with using the Standard Template Library (STL) will find this notion a familiar one. Most of the algorithms defined in the std namespace in the <algorithm> header require you to provide predicates to get their work done. When the STL arrived on the scene back in the early 1990s, it swept the C++ programming community like a refreshing functional programming breeze.

I want to show how you can iterate over a generic type that might or might not be a collection in the strict sense of the word. Additionally, you can externalize the behavior of the iteration cursor as well as how to access the current value of the collection. With a little thought, you can factor out just about everything from the custom iterator creation method, including the type of the item stored, the type of the cursor, the start state of the cursor, the end state of the cursor, and how to advance the cursor.
All of these are demonstrated in the following example, in which I iterate over the diagonal of a two-dimensional array:

    using System;
    using System.Linq;
    using System.Collections.Generic;

    public static class IteratorExtensions
    {
        public static IEnumerable<TItem>
            MakeCustomIterator<TCollection, TCursor, TItem>(
                     this TCollection collection,
                     TCursor cursor,
                     Func<TCollection, TCursor, TItem> getCurrent,
                     Func<TCursor, bool> isFinished,
                     Func<TCursor, TCursor> advanceCursor) {
            while( !isFinished(cursor) ) {
                yield return getCurrent( collection, cursor );
                cursor = advanceCursor( cursor );
            }
        }
    }

    public class IteratorExample
    {
        static void Main() {
            var matrix = new List<List<double>> {
                new List<double> { 1.0, 1.1, 1.2 },
                new List<double> { 2.0, 2.1, 2.2 },
                new List<double> { 3.0, 3.1, 3.2 }
            };

            var iter = matrix.MakeCustomIterator(
                         new int[] { 0, 0 },
                         (coll, cur) => coll[cur[0]][cur[1]],
                         (cur) => cur[0] > 2 || cur[1] > 2,
                         (cur) => new int[] { cur[0] + 1,
                                              cur[1] + 1 } );

            foreach( var item in iter ) {
                Console.WriteLine( item );
            }
        }
    }

Let’s look at how reusable MakeCustomIterator<> is. Admittedly, it takes some time to get used to the lambda syntax, and those used to reading imperative coding styles might find it hard to follow. Notice that it takes three generic type arguments. TCollection is the type of the collection, which in this example is specified as List<List<double>> at the point of use. TCursor is the type of the cursor, which in this case is a simple array of integers that can be considered coordinates of the matrix variable. And TItem is the type that the code returns via the yield statement. The remaining parameters of MakeCustomIterator<> are delegates that it uses to determine how to iterate over the collection.

[5] Chapter 9 introduces iterators via the yield statement, and Chapter 14 expanded on custom iterators in the section titled “Borrowing from Functional Programming.”
First, it needs a way to access the current item in the collection, which, for this example, is expressed in the following lambda expression, which uses the values within the cursor array to index the item within the matrix:

    (coll, cur) => coll[cur[0]][cur[1]]

Then it needs a way to determine whether you have reached the end of the collection, for which I supply the following lambda expression that just checks to see whether the cursor has stepped off the edge of the matrix:

    (cur) => cur[0] > 2 || cur[1] > 2

And finally it needs to know how to advance the cursor, which I have supplied in the following lambda expression, which simply advances both coordinates of the cursor:

    (cur) => new int[] { cur[0] + 1, cur[1] + 1 }

After executing the preceding code, you should see output similar to the following, which shows that you have indeed walked down the diagonal of the matrix from the top left to the bottom right. At each step along the way, MakeCustomIterator<> has delegated the work to the given delegates.

    1
    2.1
    3.2

Other implementations of MakeCustomIterator<> could accept a first parameter of type IEnumerable<T>, which in this example would be IEnumerable<double>. However, when you impose that restriction, whatever you pass to MakeCustomIterator<> must implement IEnumerable<>. The matrix variable does implement IEnumerable<>, but not in a form that is easily usable, because it is IEnumerable<List<double>>. Additionally, you could assume that the collection implements an indexer, as described in the Chapter 4 section “Indexers,” but to do so would be restricting the reusability of MakeCustomIterator<> and which objects you could use it on. In the previous example, the indexer is actually used to access the current item, but its use is externalized and wrapped up in the lambda expression given to access the current item.
Moreover, because the operation of accessing the current item of the collection is externalized, you could even transform the data in the original matrix variable as you iterate over it. For example, I could have multiplied each value by 2 in the lambda expression that accesses the current item in the collection, as shown here:

    (coll, cur) => coll[cur[0]][cur[1]] * 2;

Can you imagine how painful it would have been to implement MakeCustomIterator<> using delegates in the C# 1.0 days? This is exactly what I mean when I say that even just the addition of the lambda expression syntax to C# opens one’s eyes to the incredible possibilities.

As a final example, consider the case in which your custom iterator does not even iterate over a collection of items at all and is used as a number generator instead, as shown here:

    using System;
    using System.Linq;
    using System.Collections.Generic;

    public class IteratorExample
    {
        static IEnumerable<T> MakeGenerator<T>( T initialValue,
                                                Func<T, T> advance ) {
            T currentValue = initialValue;
            while( true ) {
                yield return currentValue;
                currentValue = advance( currentValue );
            }
        }

        static void Main() {
            var iter = MakeGenerator<double>( 1,
                                              x => x * 1.2 );

            var enumerator = iter.GetEnumerator();
            for( int i = 0; i < 10; ++i ) {
                enumerator.MoveNext();
                Console.WriteLine( enumerator.Current );
            }
        }
    }

After executing this code, you will see the following results:

    1
    1.2
    1.44
    1.728
    2.0736
    2.48832
    2.985984
    3.5831808
    4.29981696
    5.159780352

You could allow this method to run infinitely, and it would stop only if advancing the value threw an exception (for example, an overflow in a checked integer computation; a double simply saturates at infinity) or you stopped execution. But the items you are iterating over don’t exist as a collection; rather, they are generated on an as-needed basis each time you advance the iterator. You can apply this concept in many ways, even creating a random number generator implemented using C# iterators.
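As a hedged illustration of that last point, the sketch below reuses the same MakeGenerator<T> shape in two ways. First it bounds an infinite generator with the standard LINQ Take operator instead of driving the enumerator by hand. Then it builds a simple pseudorandom sequence from a linear congruential step. The class name GeneratorSketch and the Lcg helper are hypothetical, and the multiplier and increment are the commonly cited Numerical Recipes constants, chosen here only as an assumption for the example; this is not a generator suitable for serious statistical or security use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GeneratorSketch
{
    // Same shape as the MakeGenerator<T> method shown above.
    public static IEnumerable<T> MakeGenerator<T>( T initialValue,
                                                  Func<T, T> advance ) {
        T currentValue = initialValue;
        while( true ) {
            yield return currentValue;
            currentValue = advance( currentValue );
        }
    }

    // Hypothetical helper: an endless pseudorandom sequence whose first
    // element is the seed itself. Arithmetic wraps around modulo 2^32.
    public static IEnumerable<uint> Lcg( uint seed ) {
        return MakeGenerator( seed,
                              s => unchecked( s * 1664525u + 1013904223u ) );
    }

    public static void Main() {
        // Take(5) pulls only five values, so the endless loop inside
        // MakeGenerator is never advanced past the fifth yield.
        var powersOfTwo = MakeGenerator( 1, x => x * 2 ).Take( 5 );
        Console.WriteLine( string.Join( ", ", powersOfTwo ) );

        foreach( uint value in Lcg( 1 ).Take( 3 ) ) {
            Console.WriteLine( value );
        }
    }
}
```

Because iterators are lazy, composing the generator with Take, Skip, or Where costs nothing until something actually enumerates the result.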
More on Closures (Variable Capture) and Memoization

In the Chapter 10 section titled “Beware the Captured Variable Surprise,” I described how anonymous methods can capture the contexts of their lexical surroundings. Many refer to this phenomenon as variable capture. In functional programming parlance, it’s also known as a closure.[6] Here is a simple closure in action:

    using System;
    using System.Linq;

    public class Closures
    {
        static void Main() {
            int delta = 1;
            Func<int, int> func = (x) => x + delta;

            int currentVal = 0;
            for( int i = 0; i < 10; ++i ) {
                currentVal = func( currentVal );
                Console.WriteLine( currentVal );
            }
        }
    }

The variable delta and the delegate func embody the closure. The expression body references delta, and therefore must have access to it when it is executed at a later time. To do this, the compiler “captures” the variable for the delegate. Behind the scenes, what this means is that the delegate body contains a reference to the actual variable delta. But notice that delta is a value type on the stack. The compiler must be doing something to ensure that delta lives longer than the scope of the method within which it is declared, because the delegate will likely be called later, after that scope has exited.

Moreover, because the captured variable is accessible to both the delegate and the context containing the lambda expression, the captured variable can be changed outside the scope and out of band of the delegate. In essence, two methods (Main and the delegate) both have access to delta. This behavior can be used to your advantage, but when unexpected, it can cause serious confusion.

[6] For a more general discussion of closures, visit http://en.wikipedia.org/wiki/Closure_%28computer_science%29.

■ Note In reality, when a closure is formed, the C# compiler takes all those variables and wraps them up in a generated class. It also implements the delegate as a method of the class.
In very rare cases, you might need to be concerned about this, especially if it is found to be an efficiency burden during profiling.

Now I want to show you a great application of closures. One of the foundations of functional programming is that the function itself is treated as a first-class object that can be manipulated and operated upon as well as invoked. You’ve already seen how lambda expressions can be converted into expression trees so you can operate on them, producing more or less complex expressions. But one thing I have not discussed yet is the topic of using functions as building blocks for creating new functions. As a quick example of what I mean, consider two lambda expressions:

    x => x * 3
    x => x + 3.1415

You could create a method to combine such lambda expressions to create a compound lambda expression as I’ve shown here:

    using System;
    using System.Linq;

    public class Compound
    {
        static Func<T, S> Chain<T, R, S>( Func<T, R> func1,
                                          Func<R, S> func2 ) {
            return x => func2( func1(x) );
        }

        static void Main() {
            Func<int, double> func = Chain( (int x) => x * 3,
                                            (int x) => x + 3.1415 );

            Console.WriteLine( func(2) );
        }
    }

The Chain<> method accepts two delegates and produces a third delegate by combining the two. In the Main method, you can see how I used it to produce the compound expression. The delegate that you get after calling Chain<> is equivalent to the delegate you get when you convert the following lambda expression into a delegate:

    x => (x * 3) + 3.1415

Having a method to chain arbitrary expressions like this is useful indeed, but let’s look at other ways to produce a derivative function. Imagine an operation that takes a really long time to compute. Examples are the factorial operation or the operation to compute the nth Fibonacci number. An example that I ultimately like to show demonstrates the Reciprocal Fibonacci constant, which is

    $$\psi = \sum_{k=1}^{\infty} \frac{1}{F_k} \approx 3.359885666$$

where F_k is a Fibonacci number.[7]
To begin to demonstrate that this constant exists computationally, you need to first come up with an operation to compute the nth Fibonacci number:

    using System;
    using System.Linq;

    public class Proof
    {
        static void Main() {
            Func<int, int> fib = null;
            fib = (x) => x > 1 ? fib(x-1) + fib(x-2) : x;

            for( int i = 30; i < 40; ++i ) {
                Console.WriteLine( fib(i) );
            }
        }
    }

When you look at this code, the first thing that jumps up and grabs you is the formation of the Fibonacci routine; that is, the fib delegate. It forms a closure on itself! This is definitely a form of recursion and behavior that I desire. However, if you execute the example, unless you have a powerhouse of a machine, you will notice how slow it is, even though all I did was output the 30th to 39th Fibonacci numbers! If that is the case, you don’t even have a prayer of demonstrating the Fibonacci constant. The slowness comes from the fact that computing each Fibonacci number takes a little more work than computing the two prior Fibonacci numbers combined, so the work quickly mushrooms.

You can solve this problem by trading a little bit of space for time by caching the Fibonacci numbers in memory. But instead of modifying the original expression, let’s look at how to create a method that accepts the original delegate as a parameter and returns a new delegate to replace the original. The ultimate goal is to be able to replace the first delegate with the derivative delegate without affecting the code that consumes it.

One such technique is called memoization.[8] This is the technique whereby you cache function return values and each return value’s associated input parameters. This works only if the function has no entropy, meaning that for the same input parameters, it always returns the same result.
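The caching idea just described can be sketched as follows. This is a minimal illustration under stated assumptions, not necessarily the implementation the chapter goes on to present: the Memoize extension method, the MemoizeSketch class, the Fib helper, and the choice of Dictionary<TArg, TResult> as the cache are all hypothetical names and decisions made for this example. It handles single-argument functions only and is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

public static class MemoizeSketch
{
    // Returns a new delegate that consults a private cache before
    // falling back to the wrapped function.
    public static Func<TArg, TResult> Memoize<TArg, TResult>(
            this Func<TArg, TResult> func ) {
        var cache = new Dictionary<TArg, TResult>();
        return arg => {
            TResult result;
            if( !cache.TryGetValue( arg, out result ) ) {
                result = func( arg );
                cache[arg] = result;
            }
            return result;
        };
    }

    // Hypothetical helper that builds a memoized Fibonacci delegate
    // and evaluates it once.
    public static long Fib( int n ) {
        Func<int, long> fib = null;
        fib = x => x > 1 ? fib(x - 1) + fib(x - 2) : x;
        // The lambda captured the variable fib, not its value, so after
        // this reassignment the recursive calls also go through the cache.
        fib = fib.Memoize();
        return fib( n );
    }

    public static void Main() {
        for( int i = 30; i < 40; ++i ) {
            Console.WriteLine( Fib(i) );
        }
    }
}
```

The reassignment of fib is the step that makes this work for a recursive function: because of variable capture, the original lambda body now calls the memoized delegate, so each Fibonacci number is computed only once per cache.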
Then, prior to calling the actual function, you first check to see whether the result for the given parameter set has already been computed and return it rather than calling the function. Given a very complex function, this technique trades a little bit of memory space for significant speed gain. Let’s look at an example:

[7] Weisstein, Eric W. "Reciprocal Fibonacci Constant." From MathWorld, A Wolfram Web Resource. http://mathworld.wolfram.com/ReciprocalFibonacciConstant.html

[8] You can read more about memoization at http://en.wikipedia.org/wiki/Memoization. Also, Wes Dyer has an excellent entry regarding memoization on his blog at http://blogs.msdn.com/wesdyer/archive/2007/01/26/function-memoization.aspx.

[...]

... C# at http://blogs.msdn.com/wesdyer. In his blog entry he demonstrates how to implement a Y fixed-point combinator that generalizes the notion of anonymous recursion shown previously.[12]

Summary

In this chapter, I introduced you to the syntax of lambda expressions, which are, for the most part, replacements for anonymous methods. In fact, it’s a shame that lambda expressions did not come along with C# ...

... these techniques in C# 2.0 using anonymous methods, the introduction of lambda syntax to the language makes using such techniques more natural and less cumbersome.

The following chapter introduces LINQ. I will also continue to focus on the functional programming aspects that it brings to the table.

CHAPTER 16 ■■■ LINQ: Language Integrated Query

C-style languages (including C#) are imperative ...

... Hejlsberg and Peter Golde. The idea was to create a more natural and language-integrated way to access data from within a language such as C#. However, at the same time, it was undesirable to implement it in such a way that it would destabilize the implementation of the C# compiler and become too cumbersome for the language. As it turns out, it made sense to implement some building blocks in the language ...
... some of that burden.

[1] For more extensive coverage of LINQ, I suggest you check out Foundations of LINQ in C#, by Joseph C. Rattz, Jr. (Apress, 2007).

A Bridge to Data

Throughout this book, I have stressed how just about all the new features introduced by C# 3.0 foster a functional programming model. There’s a good reason for that, in the sense that data query ...

            ... LastName = "Doe",
                Salary = 123000,
                StartDate = DateTime.Parse("4/12/1998")
            },
            new Employee {
                FirstName = "Milton",
                LastName = "Waddams",
                Salary = 1000000,
                StartDate = DateTime.Parse("12/3/1969")
            }
        };

        var query = from employee in employees
                    where employee.Salary > 100000
                    orderby employee.LastName, employee.FirstName
                    select new { LastName = employee.LastName,
                                 FirstName = employee.FirstName };

        Console.WriteLine( ...

... After all, C# is a language that syntactically evolved from C++ and Java, and the LINQ syntax looks nothing like those languages.

■ Note For those of you familiar with SQL, the first thing you probably noticed is that the query is backward from what you are used to. In SQL, the select clause is normally the beginning of the expression. There are several reasons why the reversal makes sense in C#. One reason ...

... getting the work done. In fact, it’s more or less what the compiler is doing under the covers.

The LINQ syntax is very foreign looking in a predominantly imperative language like C#. It’s easy to jump to the conclusion that the C# language underwent massive modifications in order to implement LINQ. Actually, the compiler simply transforms the LINQ expression into a series of extension method calls that ...

        ... Enumerable.Where( employees,
                               emp => emp.Salary > 100000),
                emp => emp.LastName ),
            emp => emp.FirstName ),
            emp => new {LastName = emp.LastName,
                        FirstName = emp.FirstName} );

But why would you want to do such a thing?
I merely show it here for illustration purposes so you know what is actually going on under the covers. Those who are really attached to C# 2.0 anonymous methods could even go one step further ...

... show an example of this in the later section titled “Techniques from Functional Programming,” in which I build upon an example from Chapter 14.

C# Query Keywords

C# 2008 introduces a small set of new keywords for creating LINQ query expressions, some of which we have already seen in previous sections. They are from, join, where, group, into, let, ascending, ...

... might remember from grade school, albeit not in tabular format:

    using System;
    using System.Linq;

    public class MultTable
    {
        static void Main() {
            var query = from x in Enumerable.Range(0,10)
                        from y in Enumerable.Range(0,10)
                        select new {
                            X = x,
                            Y = y,
                            Product = x * y
                        };

            foreach( var item in query ) {
                Console.WriteLine( "{0} * {1} = {2}",
                                   item.X,
                                   item.Y,
                                   item.Product );
            }
        }
    }

Remember that LINQ expressions ...
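The multiplication-table query above is a convenient place to see the translation into extension method calls at work. Under the C# query expression rules, two consecutive from clauses followed by select correspond to a single SelectMany call. The sketch below writes both forms side by side and checks that they produce the same sequence; the class and method names (MultTableSketch, QuerySyntax, MethodSyntax) are hypothetical, and the projection is reduced to the product alone to keep the comparison simple.

```csharp
using System;
using System.Linq;

public static class MultTableSketch
{
    // Query-expression form with two from clauses.
    public static int[] QuerySyntax() {
        var query = from x in Enumerable.Range(0, 10)
                    from y in Enumerable.Range(0, 10)
                    select x * y;
        return query.ToArray();
    }

    // Hand-written method-call form: the second from clause becomes the
    // collection selector and the select clause becomes the result selector.
    public static int[] MethodSyntax() {
        return Enumerable.Range(0, 10)
                         .SelectMany( x => Enumerable.Range(0, 10),
                                      (x, y) => x * y )
                         .ToArray();
    }

    public static void Main() {
        Console.WriteLine( QuerySyntax().SequenceEqual( MethodSyntax() ) );
    }
}
```

Both methods yield the 100 products in the same order, row by row, which is why the query keywords can be treated as a thin layer of syntax over ordinary method calls.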