Chapter 20: Platform Interoperability and Unsafe Code

• Provide public wrapper methods around the external methods that handle the data type conversions and error handling.
• Overload the wrapper methods and provide a reduced number of required parameters by inserting defaults for the extern method call.
• Use enum or const to provide constant values for the API as part of the API's declaration.
• For all P/Invoke methods that support GetLastError(), be sure to assign the SetLastError named attribute to true. This allows the reporting of errors via System.ComponentModel.Win32Exception.
• Wrap resources, such as handles, into classes that derive from System.Runtime.InteropServices.SafeHandle or that support IDisposable.
• Function pointers in unmanaged code map to delegate instances in managed code. Generally, this requires the declaration of a specific delegate type that matches the signature of the unmanaged function pointer.
• Map input/output and output parameters to ref parameters instead of relying on pointers.

The last bullet implies C#'s support for pointers, described in the next section.

Pointers and Addresses

On occasion, developers will want to be able to access and work with memory, and with pointers to memory locations, directly. This is necessary for certain operating system interaction as well as with certain types of time-critical algorithms. To support this, C# requires use of the unsafe code construct.

Unsafe Code

One of C#'s great features is that it is strongly typed and supports type checking throughout the runtime execution. What makes this feature especially great is that it is possible to circumvent this support and manipulate memory and addresses directly. You would do this when working with things such as memory-mapped devices, or if you wanted to implement time-critical algorithms. The key is to designate a portion of the code as unsafe.
Unsafe code is an explicit code block and compilation option, as shown in Listing 20.11. The unsafe modifier has no effect on the generated CIL code itself. It is only a directive to the compiler to permit pointer and address manipulation within the unsafe block. Furthermore, unsafe does not imply unmanaged.

Listing 20.11: Designating a Method for Unsafe Code

    class Program
    {
        unsafe static int Main(string[] args)
        {
            // ...
        }
    }

You can use unsafe as a modifier to the type or to specific members within the type. In addition, C# allows unsafe as a statement that flags a code block to allow unsafe code (see Listing 20.12).

Listing 20.12: Designating a Code Block for Unsafe Code

    class Program
    {
        static int Main(string[] args)
        {
            unsafe
            {
                // ...
            }
        }
    }

Code within the unsafe block can include unsafe constructs such as pointers.

NOTE
You must explicitly indicate to the compiler that unsafe code is supported.

From the command line, this requires the /unsafe switch. For example, to compile the preceding code, you need to use the command shown in Output 20.1. You need to use the /unsafe switch because unsafe code opens up the possibility of buffer overflows and similar problems that can create security holes. The /unsafe switch includes the ability to directly manipulate memory and execute instructions that are unmanaged. Requiring /unsafe, therefore, makes the choice of potential exposure explicit.

Pointer Declaration

Now that you have marked a code block as unsafe, it is time to look at how to write unsafe code. First, unsafe code allows the declaration of a pointer. Consider the following example.

    byte* pData;

Assuming pData is not null, its value points to a location that contains one or more sequential bytes; the value of pData represents the memory address of the bytes.
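As a minimal sketch of the two forms described above (unsafe as a member modifier and unsafe as a statement), the following program declares a pointer in each context. The method name IsUnassigned is hypothetical and chosen only for illustration; the program assumes compilation with the /unsafe switch.

```csharp
using System;

class Program
{
    // unsafe as a member modifier: the whole method may use pointers.
    // IsUnassigned is a hypothetical helper, not part of any library.
    unsafe static bool IsUnassigned(byte* pData)
    {
        return pData == null;
    }

    static void Main()
    {
        // unsafe as a statement: only this block may use pointers.
        unsafe
        {
            byte* pData = null;   // declared, but not yet pointing at any data
            Console.WriteLine(IsUnassigned(pData));   // prints True
        }
    }
}
```

Without /unsafe the compiler rejects both the modifier and the block, which is the explicit opt-in the text describes.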
The type specified before the * is the referent type, or the type located where the value of the pointer refers. In this example, pData is the pointer and byte is the referent type, as shown in Figure 20.1.

OUTPUT 20.1:

    csc.exe /unsafe Program.cs

Figure 20.1: Pointers Contain the Address of the Data (diagram: byte* pData holds an address, shown as 0x0338EE9C, that locates the bytes 0x18 and 0x42 within byte[] data)

Because pointers (which are just byte values) are not subject to garbage collection, C# does not allow referent types other than unmanaged types, which are types that are not reference types, are not generics, and do not contain reference types. Therefore, the following is not valid:

    string* pMessage

Neither is this:

    ServiceStatus* pStatus

where ServiceStatus is defined as shown in Listing 20.13; the problem again is that ServiceStatus includes a string field.

Listing 20.13: Invalid Referent Type Example

    struct ServiceStatus
    {
        int State;
        string Description;  // Description is a reference type
    }

Language Contrast: C/C++, Pointer Declaration

In C/C++, multiple pointers within the same declaration are declared as follows:

    int *p1, *p2;

Notice the * on p2; this makes p2 an int* rather than an int. In contrast, C# always places the * with the data type:

    int* p1, p2;

The result is two variables of type int*. The syntax matches that of declaring multiple arrays in a single statement:

    int[] array1, array2;

Pointers are an entirely new category of type. Unlike structs, enums, and classes, pointers don't ultimately derive from System.Object.

In addition to custom structs that contain only unmanaged types, valid referent types include enums, predefined value types (sbyte, byte, short, ushort, int, uint, long, ulong, char, float, double, decimal, and bool), and pointer types (such as byte**). Lastly, valid syntax includes void* pointers, which represent pointers to an unknown type.
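The valid referent types listed above can be sketched in one short program. This is an illustration only: the Coordinate struct and all variable names are hypothetical, and the sketch borrows the address (&) and indirection (*) operators to have something to point at. It assumes compilation with /unsafe.

```csharp
using System;

// Hypothetical struct with only unmanaged fields, so Coordinate* is legal.
struct Coordinate
{
    public int X;
    public int Y;
}

class Program
{
    static void Main()
    {
        unsafe
        {
            Coordinate point = new Coordinate();
            point.X = 3;
            point.Y = 4;

            Coordinate* pPoint = &point;      // pointer to a custom unmanaged struct
            Coordinate** ppPoint = &pPoint;   // pointer type as the referent type
            void* pUnknown = pPoint;          // pointer to an unknown type

            Console.WriteLine((**ppPoint).X + (*pPoint).Y);   // prints 7
            Console.WriteLine(pUnknown == pPoint);            // prints True
        }
    }
}
```

Adding a string field to Coordinate would make Coordinate* a compile error, for the same reason ServiceStatus* is rejected.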
Assigning a Pointer

Once code defines a pointer, it needs to assign a value before accessing it. Like reference types, pointers can hold the value null; this is their default value. The value stored by the pointer is the address of a location. Therefore, in order to assign it, you must first retrieve the address of the data.

You could explicitly cast an integer or a long into a pointer, but this rarely occurs without a means of determining the address of a particular data value at execution time. Instead, you need to use the address operator (&) to retrieve the address of the value type:

    byte* pData = &bytes[0];  // Compile error

The problem is that in a managed environment, data can move, thereby invalidating the address. The error message is "You can only take the address of [an] unfixed expression inside of a fixed statement initializer." In this case, the byte referenced appears within an array and an array is a reference type (a moveable type). Reference types appear on the heap and are subject to garbage collection or relocation. A similar problem occurs when referring to a value type field on a moveable type:

    int* a = &"message".Length;

Either way, to complete the assignment, the data needs to be a value type, fixed, or explicitly allocated on the call stack.

Fixing Data

To retrieve the address of a moveable data item, it is necessary to fix, or pin, the data, as demonstrated in Listing 20.14.

Listing 20.14: Fixed Statement

    byte[] bytes = new byte[24];
    fixed (byte* pData = &bytes[0])  // pData = bytes also allowed
    {
        // ...
    }

Within the code block of a fixed statement, the assigned data will not move. In this example, bytes will remain at the same address, at least until the end of the fixed statement.

The fixed statement requires the declaration of the pointer variable within its scope. This avoids accessing the variable outside the fixed statement, when the data is no longer fixed.
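A minimal, complete sketch of the fixed statement follows. It writes one byte through the pinned pointer and reads it back through the array after the block ends. The value 0x18 is arbitrary, and the program assumes compilation with /unsafe.

```csharp
using System;

class Program
{
    static void Main()
    {
        byte[] bytes = new byte[24];

        unsafe
        {
            // The array is pinned only for the duration of the fixed block.
            fixed (byte* pData = &bytes[0])
            {
                *pData = 0x18;   // writes bytes[0] through the pointer
            }
            // pData is out of scope here, and bytes may move again.
        }

        Console.WriteLine("{0:X2}", bytes[0]);   // prints 18
    }
}
```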
However, it is the programmer's responsibility to ensure that the pointer is not assigned to another variable that survives beyond the scope of the fixed statement (possibly in an API call, for example). Similarly, using ref or out parameters will be problematic for data that will not survive beyond the method call.

Since a string is an invalid referent type, it would appear invalid to define pointers to strings. However, as in C++, internally a string is a pointer to the first character of an array of characters, and it is possible to declare pointers to characters using char*. Therefore, C# allows declaring a pointer of type char* and assigning it to a string within a fixed statement. The fixed statement prevents the movement of the string during the life of the pointer. Similarly, it allows any moveable type that supports an implicit conversion to a pointer of another type, given a fixed statement.

You can replace the verbose assignment of &bytes[0] with the abbreviated bytes, as shown in Listing 20.15.

Listing 20.15: Fixed Statement without Address or Array Indexer

    byte[] bytes = new byte[24];
    fixed (byte* pData = bytes)
    {
        // ...
    }

Depending on the frequency and time to execute, fixed statements have the potential to cause fragmentation in the heap because the garbage collector cannot compact fixed objects. To reduce this problem, the best practice is to pin blocks early in the execution and to pin fewer large blocks rather than many small blocks. The .NET Framework 2.0 and later also reduce the problem by means of additional fragmentation-aware code.

Allocating on the Stack

You should use the fixed statement on an array to prevent the garbage collector from moving the data. However, an alternative is to allocate the array on the call stack. Stack-allocated data is not subject to garbage collection or to the finalizer patterns that accompany it.
Like referent types, the requirement is that the stackalloc data is an array of unmanaged types. For example, instead of allocating an array of bytes on the heap, you can place it onto the call stack, as shown in Listing 20.16.

Listing 20.16: Allocating Data on the Call Stack

    byte* bytes = stackalloc byte[42];

Because the data type is an array of unmanaged types, it is possible for the runtime to allocate a fixed buffer size for the array and then to restore that buffer once the pointer goes out of scope. Specifically, it allocates sizeof(T) * E, where E is the array size and T is the referent type. Given the requirement of using stackalloc only on an array of unmanaged types, the runtime returns the buffer to the system simply by unwinding the stack, eliminating the complexities of iterating over the f-reachable queue and compacting reachable data. Therefore, there is no way to explicitly free stackalloc data.

Dereferencing a Pointer

Accessing the value of a type referred to by a pointer requires that you dereference the pointer, placing the indirection operator (*) before the pointer. byte data = *pData;, for example, dereferences the location of the byte referred to by pData and returns the single byte at that location.

Using this principle in unsafe code allows the unorthodox behavior of modifying the "immutable" string, as shown in Listing 20.17. In no way is this recommended, but it does expose the potential of low-level memory manipulation.

Listing 20.17: Modifying an Immutable String

    string text = "S5280ft";
    Console.Write("{0} = ", text);
    unsafe  // Requires /unsafe switch.
    {
        fixed (char* pText = text)
        {
            char* p = pText;
            *++p = 'm';
            *++p = 'i';
            *++p = 'l';
            *++p = 'e';
            *++p = ' ';
            *++p = ' ';
        }
    }
    Console.WriteLine(text);

The results of Listing 20.17 appear in Output 20.2. In this case, you take the original address and increment it by the size of the referent type (sizeof(char)), using the preincrement operator.
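To make the effect of the increment concrete, here is a small sketch that is separate from Listing 20.17 and leaves strings alone. It uses a stack-allocated char buffer and measures how far one preincrement moves the pointer. The variable names are illustrative, and the program assumes compilation with /unsafe.

```csharp
using System;

class Program
{
    static void Main()
    {
        unsafe
        {
            // Two chars on the call stack; released when Main returns.
            char* letters = stackalloc char[2];
            letters[0] = 'a';
            letters[1] = 'b';

            char* p = letters;
            ++p;   // advances by one char, that is, sizeof(char) bytes

            // Casting to byte* exposes the distance in bytes.
            long byteOffset = (byte*)p - (byte*)letters;
            Console.WriteLine(byteOffset == sizeof(char));   // prints True
            Console.WriteLine(*p);                           // prints b
        }
    }
}
```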
Next, you dereference the address using the indirection operator and then assign the location with a different character. Similarly, using the + and - operators on a pointer changes the address by the operand multiplied by sizeof(T), where T is the referent type.

Similarly, the comparison operators (==, !=, <, >, <=, and >=) work to compare pointers, translating effectively to the comparison of address location values.

One restriction on the dereferencing operator is the inability to dereference a void*. The void* data type represents a pointer to an unknown type. Since the data type is unknown, it can't be dereferenced to another type. Instead, to access the data referenced by a void*, you must first cast it to another pointer type and then dereference that type.

OUTPUT 20.2:

    S5280ft = Smile

You can achieve the same behavior as Listing 20.17 by using the index operator rather than the indirection operator (see Listing 20.18).

Listing 20.18: Modifying an Immutable String with the Index Operator in Unsafe Code

    string text;
    text = "S5280ft";
    Console.Write("{0} = ", text);
    unsafe  // Requires /unsafe switch.
    {
        fixed (char* pText = text)
        {
            pText[1] = 'm';
            pText[2] = 'i';
            pText[3] = 'l';
            pText[4] = 'e';
            pText[5] = ' ';
            pText[6] = ' ';
        }
    }
    Console.WriteLine(text);

The results of Listing 20.18 appear in Output 20.3.

Modifications such as those in Listing 20.17 and Listing 20.18 lead to unexpected behavior. For example, if you reassigned text to "S5280ft" following the Console.WriteLine() statement and then redisplayed text, the output would still be Smile because the address of two equal string literals is optimized to one string literal referenced by both variables.
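The sharing of equal string literals can be observed without any unsafe code. The following is a minimal sketch, assuming both literals are compiled into the same assembly; the variable name other is illustrative only.

```csharp
using System;

class Program
{
    static void Main()
    {
        string text = "S5280ft";
        string other = "S5280ft";   // an equal literal, written separately

        // Both variables refer to the same string instance, so a change
        // made through a pointer to one would be visible through the other.
        Console.WriteLine(object.ReferenceEquals(text, other));   // prints True
    }
}
```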
In spite of the apparent assignment text = "S5280ft"; after the unsafe code in Listing 20.17, the internals of the string assignment are an address assignment of the modified "S5280ft" location, so text is never set to the intended value.

Accessing the Member of a Referent Type

Dereferencing a pointer makes it possible for code to access the members of the referent type. However, this is possible without the indirection operator (*). As Listing 20.19 shows, it is possible to directly access a referent type's members using the -> operator (p->x is shorthand for (*p).x).

Listing 20.19: Directly Accessing a Referent Type's Members

    unsafe
    {
        Angle angle = new Angle(30, 18, 0);
        Angle* pAngle = &angle;
        System.Console.WriteLine("{0}° {1}' {2}",
            pAngle->Hours, pAngle->Minutes, pAngle->Seconds);
    }

The results of Listing 20.19 appear in Output 20.4.

OUTPUT 20.3:

    S5280ft = Smile

OUTPUT 20.4:

    30° 18' 0

SUMMARY

This chapter's introduction outlined the low-level access to the underlying operating system C# exposes. To summarize this, consider the Main() function listing for determining whether execution is within a virtual computer (see Listing 20.20).

Listing 20.20: Designating a Block for Unsafe Code

    using System.Runtime.InteropServices;

    class Program
    {
        unsafe static int Main(string[] args)
        {
            // Assign redpill
            byte[] redpill = {
                0x0f, 0x01, 0x0d,        // asm SIDT instruction
                0x00, 0x00, 0x00, 0x00,  // placeholder for an address
                0xc3};                   // asm return instruction

            fixed (byte* matrix = new byte[6],
                redpillPtr = redpill)
            {
                // Move the address of matrix immediately
                // following the SIDT instruction of memory.
                *(uint*)&redpillPtr[3] = (uint)&matrix[0];

                using (VirtualMemoryPtr codeBytesPtr =
                    new VirtualMemoryPtr(redpill.Length))
                {
                    Marshal.Copy(
                    [...]
... installation that includes the compiler and the .NET Framework with C# 3.0 syntax support is the redistributable package for the .NET Framework 3.0 or higher. This is available at http://msdn.microsoft.com/en-us/netframework.
• For a rich IDE that includes IntelliSense and support for project files, install a version of the Visual Studio 2008 IDE or later. This includes Visual C# Express, which is available free at...

... execution of a program that is compiled for the CLI.

A Downloading and Installing the C# Compiler and the CLI Platform

TO COMPILE AND RUN C# programs, it is necessary to install a version of the compiler and the CLI platform.

Microsoft's .NET

The predominant CLI platform is Microsoft .NET, and this is the platform of choice for development on Microsoft Windows.
• The...

... PATH=%PATH%;%Windir%\Microsoft.NET\Framework\<version>, again substituting the value of <version> appropriately. Output A.1 provides an example.

OUTPUT A.1:

    Set PATH=%PATH%;%Windir%\Microsoft.NET\Framework\v2.0.50727

Once the path includes the framework, it is possible to use the .NET C# compiler, CSC.EXE, without providing the full path to its location.

Mono

For CLI development on platforms other than Microsoft...

... visibly outside a module. It includes concepts for how types can be combined to form new types.

TABLE 21.2: Common C#-Related Acronyms (Continued)

    Acronym | Definition                   | Description
    FCL     | .NET Framework Class Library | The class library that comprises Microsoft's .NET Framework. It includes Microsoft's implementation of the BCL as well as a large library of classes for such things as web development, distributed...

... fantastic structure of C#. This chapter demonstrated the ability, in spite of such high-level programming capabilities, to perform very low-level operations as well. Before I end the book, the next chapter briefly describes the underlying execution platform and shifts the focus from the C# language to the broader platform in which C# programs execute.

21 The Common Language Infrastructure

C# programmers encounter...

... implementations

TABLE 21.1: Primary C# Compilers

    Compiler                          | Description
    Microsoft Visual C# .NET Compiler | Microsoft's .NET C# compiler is dominant in the industry, but is limited to running on the Windows family of operating systems. You can download it free as part of the Microsoft .NET Framework SDK from http://msdn.microsoft.com/en-us/netframework/default.aspx
    Mono Project                      | The Mono Project is an open source implementation...

... specification for C# 1.01 and the ECMA-335 specification for the CLI 1.2.2 Furthermore, many implementations include prototype features prior to the establishment of those features in standards.

C# Compilation to Machine Code

The HelloWorld program listing from Chapter 1 is obviously C# code, and you compiled it for execution using the C# compiler. However, the processor still cannot directly interpret compiled C#...

... accessing the underlying platform functionality, rather than writing it all from scratch. The platform portability offered by .NET, DotGNU, Rotor, and Mono varies depending on the goals of the platform developers. For obvious reasons, .NET was targeted to run only on the Microsoft series of operating systems. Rotor, also produced by Microsoft, was primarily designed as a means for teaching and fostering research...

... Its inclusion of support for FreeBSD proves the portability characteristics of the CLI. Some of the libraries included in .NET (such as WinForms, ASP.NET, ADO.NET, and more) are not available in Rotor. DotGNU and Mono were initially targeted at Linux but have since been ported to many different operating systems. Furthermore, the goal of these CLIs was to provide a means for taking .NET applications and porting...

... available for execution from any directory. Without Visual Studio .NET installed, no special compiler command prompt item appears in the Start menu. Instead, you need to reference the full compiler pathname explicitly or add it to the path. The compiler is located at %Windir%\Microsoft.NET\Framework\<version>, where <version> is the version of the .NET Framework (v1.0.3705, v1.1.4322, v2.0.50727, v3.0, and ...

... HelloWorld::Main

    00000000  push  ebp
    00000001  mov   ebp,esp
    00000003  sub   esp,28h
    00000006  mov   dword ptr [ebp-4],0
    0000000d  mov   dword ptr [ebp-0Ch],0
    00000014  cmp   dword ptr ds:[001833E0h],0
    0000001b  je    00000022
    0000001d  call  75F9C9E0
    00000022  mov   ecx,dword ptr ds:[01C31418h]
    00000028  call  dword ptr ds:[03C8E854h]
    0000002e  nop
    0000002f  mov   esp,ebp
    00000031  pop   ebp
    00000032  ret

[Figure labels: C# Code, CIL, Machine Code]