■Note In order to build the previous example, you’ll need to add a reference to the System.Windows.Forms.dll assembly, located in the Microsoft.NET\Framework\v2.0.xxxxx directory.

This example displays the strings using the MessageBox type defined in System.Windows.Forms, since the console isn’t good at displaying Unicode characters. The format specifier chosen is "C", which displays the number in a currency format. For the first display, you use the CultureInfo instance attached to the current thread. For the following two, you create a CultureInfo for Germany and another for Russia. Note that in forming the string, the System.Double type has used the CurrencyDecimalSeparator, CurrencyDecimalDigits, and CurrencySymbol properties of the NumberFormatInfo instance returned from the CultureInfo.GetFormat method. Had you displayed a DateTime instance, the DateTime implementation of IFormattable.ToString() would have used an instance of DateTimeFormatInfo returned from CultureInfo.GetFormat() in a similar way.

Console.WriteLine() and String.Format()

Throughout this book, you’ve seen Console.WriteLine() used in the examples. One useful form of WriteLine(), which mirrors some overloads of String.Format(), lets you build a composite string by replacing format tags within a string with a variable number of parameters. Let’s look at a quick example of string format usage:

    Imports System
    Imports System.Globalization
    Imports System.Windows.Forms

    Public Class EntryPoint
        Shared Sub Main(ByVal args As String())
            If args.Length < 3 Then
                Console.WriteLine("Please provide 3 parameters")
                Return
            End If

            Dim composite As String = _
                String.Format("{0}, {1}, and {2}.", args(0), args(1), args(2))
            Console.WriteLine(composite)
        End Sub
    End Class

Here are the results from the previous example:

    Jack, Jill, and Spot.
CHAPTER 10 ■ WORKING WITH STRINGS

You can see that a placeholder is delimited by braces and that the number within it is the zero-based index into the parameter list that follows. The String.Format method, as well as the Console.WriteLine method, has an overload that accepts a variable number of parameters to use as the replacement values. In this example, the String.Format method replaces each placeholder using the general formatting of the type, which is what you get from a call to the parameterless version of ToString(). If the instance being placed in this spot supports IFormattable, the IFormattable.ToString method is called with a Nothing format specifier, which usually gives the same result as if you had supplied the "G", or general, format specifier. Incidentally, if you need to insert actual braces that will show in the output, you must double them in the source string by writing {{ or }}.

The exact format of the replacement item is {index[,alignment][:formatString]}, where the items within brackets are optional. The index value is a zero-based value used to reference one of the trailing parameters provided to the method. The alignment represents how wide the entry should be within the composite string. For example, if you set it to eight characters in width and the string is narrower than that, the extra space is padded with spaces (a positive width right-aligns the entry, and a negative width left-aligns it). Lastly, the formatString portion of the replacement item allows you to denote precisely what formatting to use for the item. The format string is the same style of string that you would have used if you were to call IFormattable.ToString() on the instance itself. Unfortunately, you can’t specify a particular IFormatProvider instance for each one of the replacement items. If you need to create a composite string from items using multiple format providers or cultures, you must resort to using IFormattable.ToString() directly.
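The replacement-item rules described here can be tried out with a short sketch. It is illustrative only: the module name FormatItemSketch and the sample values are assumptions made for this example, not code from the book. It shows alignment, doubled braces, and one way to mix cultures by calling IFormattable.ToString() on each item before combining the results.

```vbnet
Imports System
Imports System.Globalization

' Hypothetical module name, used only for this sketch.
Public Module FormatItemSketch
    Public Sub Main()
        Dim name As String = "Jill"
        Dim amount As Double = 1234.5

        ' A positive alignment right-aligns within the width;
        ' a negative alignment left-aligns.
        Console.WriteLine(String.Format("[{0,8}] [{0,-8}]", name))
        ' prints: [    Jill] [Jill    ]

        ' Doubled braces are emitted as literal braces.
        Console.WriteLine(String.Format("{{{0}}}", name))
        ' prints: {Jill}

        ' A single provider applies to every item in one String.Format call.
        Dim invariant As CultureInfo = CultureInfo.InvariantCulture
        Console.WriteLine(String.Format(invariant, "{0:F2}", amount))
        ' prints: 1234.50

        ' To mix cultures, format each item with IFormattable.ToString()
        ' first, then combine the resulting strings. The de-DE culture
        ' uses a comma as its decimal separator.
        Dim germany As CultureInfo = New CultureInfo("de-DE")
        Dim mixed As String = amount.ToString("F2", invariant) & " / " & _
            amount.ToString("F2", germany)
        Console.WriteLine(mixed)
    End Sub
End Module
```

Formatting each value separately costs a few extra intermediate strings, but it keeps the choice of culture explicit for every item.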
Examples of String Formatting in Custom Types

Let’s take a look at another example using the venerable Complex type that we’ve used before. This time, let’s implement IFormattable on it to make it a little more useful when generating a string version of the instance:

    Imports System
    Imports System.Text
    Imports System.Globalization

    Public Structure Complex
        Implements IFormattable

        Private real As Double
        Private imaginary As Double

        Public Sub New(ByVal real As Double, ByVal imaginary As Double)
            Me.real = real
            Me.imaginary = imaginary
        End Sub

        'IFormattable implementation
        Public Overloads Function ToString(ByVal format As String, _
            ByVal formatProvider As IFormatProvider) As String _
            Implements IFormattable.ToString

            Dim sb As StringBuilder = New StringBuilder()

            If format = "DBG" Then
                sb.Append(Me.GetType().ToString() & vbCrLf)
                sb.AppendFormat(vbTab & "real:" & vbTab & "{0}" & vbCrLf, real)
                sb.AppendFormat(vbTab & "imaginary:" & vbTab & "{0}" & vbCrLf, _
                    imaginary)
            Else
                sb.Append("( ")
                sb.Append(real.ToString(format, formatProvider))
                sb.Append(" : ")
                sb.Append(imaginary.ToString(format, formatProvider))
                sb.Append(" )")
            End If

            Return sb.ToString()
        End Function
    End Structure

    Public Class EntryPoint
        Shared Sub Main()
            Dim local As CultureInfo = CultureInfo.CurrentCulture
            Dim germany As CultureInfo = New CultureInfo("de-DE")

            Dim cpx As Complex = New Complex(12.3456, 1234.56)

            Dim strCpx As String = cpx.ToString("F", local)
            Console.WriteLine(strCpx)

            strCpx = cpx.ToString("F", germany)
            Console.WriteLine(strCpx)

            Console.WriteLine(vbCrLf & "Debugging output:" & vbCrLf & _
                "{0:DBG}", cpx)
        End Sub
    End Class

This is the output from the previous example:

    ( 12.35 : 1234.56 )
    ( 12,35 : 1234,56 )

    Debugging output:
    ConsoleApplication2.Complex
            real:   12.3456
            imaginary:      1234.56

The real meat of this
example lies within the implementation of IFormattable.ToString(). You implement a "DBG" format string for this type that creates a string showing the internal state of the object, which may be useful for debugging purposes. If the format string is not equal to "DBG", then you simply defer to the IFormattable implementation of System.Double. Notice the use of StringBuilder to create the string that is eventually returned. Also, we chose to use the Console.WriteLine method and its format item syntax to send the debugging output to the console, just to show a little variety in usage.

ICustomFormatter

ICustomFormatter is an interface that allows you to replace or extend a built-in or already existing IFormattable implementation for an object. Whenever you call String.Format() or StringBuilder.AppendFormat() to convert an object instance to a string, before the method calls through to the object’s implementation of IFormattable.ToString(), it first checks to see if the passed-in IFormatProvider provides a custom formatter. It does this by calling IFormatProvider.GetFormat() while passing the ICustomFormatter type. If the provider returns an implementation of ICustomFormatter, then the method will use the custom formatter. Otherwise, it will use the object’s implementation of IFormattable.ToString(), or the object’s implementation of Object.ToString() in cases where it doesn’t implement IFormattable.
Consider the following example where we’ve reworked the previous Complex example but moved the debugging output capabilities outside of the Complex structure:

    Imports System
    Imports System.Text
    Imports System.Globalization

    Public Class ComplexDbgFormatter
        Implements ICustomFormatter
        Implements IFormatProvider

        'IFormatProvider implementation
        Public Function GetFormat(ByVal formatType As Type) As Object _
            Implements System.IFormatProvider.GetFormat

            If formatType Is GetType(ICustomFormatter) Then
                Return Me
            Else
                Return CultureInfo.CurrentCulture.GetFormat(formatType)
            End If
        End Function

        'ICustomFormatter implementation
        Public Function Format(ByVal formatString As String, ByVal arg As Object, _
            ByVal formatProvider As IFormatProvider) As String _
            Implements System.ICustomFormatter.Format

            'Only a Complex instance can be cast below, so test for that type.
            If TypeOf arg Is Complex AndAlso formatString = "DBG" Then
                Dim cpx As Complex = DirectCast(arg, Complex)

                'Generate debugging output for this object.
                Dim sb As StringBuilder = New StringBuilder()
                sb.Append(arg.GetType().ToString() & vbLf)
                sb.AppendFormat(vbTab & "real:" & vbTab & "{0}" & vbLf, cpx.Real)
                sb.AppendFormat(vbTab & "imaginary:" & vbTab & "{0}" & vbLf, cpx.Img)
                Return sb.ToString()
            Else
                'Defer to the object's own formatting when it has any.
                Dim formattable As IFormattable = TryCast(arg, IFormattable)
                If formattable IsNot Nothing Then
                    Return formattable.ToString(formatString, formatProvider)
                Else
                    Return arg.ToString()
                End If
            End If
        End Function
    End Class

    Public Structure Complex
        Implements IFormattable

        Private mReal As Double
        Private mImaginary As Double

        Public Sub New(ByVal real As Double, ByVal imaginary As Double)
            Me.mReal = real
            Me.mImaginary = imaginary
        End Sub

        Public ReadOnly Property Real() As Double
            Get
                Return mReal
            End Get
        End Property

        Public ReadOnly Property Img() As Double
            Get
                Return mImaginary
            End Get
        End Property

        'IFormattable implementation
        Public Overloads Function ToString(ByVal format As String, _
            ByVal formatProvider As IFormatProvider) As String _
            Implements IFormattable.ToString

            Dim sb As StringBuilder = New StringBuilder()
            sb.Append("( ")
            sb.Append(mReal.ToString(format, formatProvider))
            sb.Append(" : ")
            sb.Append(mImaginary.ToString(format, formatProvider))
            sb.Append(" )")

            Return sb.ToString()
        End Function
    End Structure

    Public Class EntryPoint
        Shared Sub Main()
            Dim local As CultureInfo = CultureInfo.CurrentCulture
            Dim germany As CultureInfo = New CultureInfo("de-DE")

            Dim cpx As Complex = New Complex(12.3456, 1234.56)

            Dim strCpx As String = cpx.ToString("F", local)
            Console.WriteLine(strCpx)

            strCpx = cpx.ToString("F", germany)
            Console.WriteLine(strCpx)

            Dim dbgFormatter As ComplexDbgFormatter = New ComplexDbgFormatter()
            strCpx = String.Format(dbgFormatter, "{0:DBG}", cpx)
            Console.WriteLine(vbCrLf & "Debugging output:" & _
                vbCrLf & "{0}", strCpx)
        End Sub
    End Class

Of course, this example is a bit more complex
(no pun intended). But if you were not the original author of the Complex type, then this would be your only way to provide custom formatting for that type. Using this method, you can provide custom formatting for any of the other built-in types in the system.

Comparing Strings

When it comes to comparing strings, the .NET Framework provides quite a bit of flexibility. You can compare strings based on cultural information as well as without cultural consideration. You can also compare strings with or without case sensitivity, and the rules for case-insensitive comparison vary from culture to culture. The Framework offers several ways to compare strings, some of which are exposed directly on the System.String type through the static String.Compare method. You can choose from a few overloads, and the most basic of them use the CultureInfo attached to the current thread to handle comparisons.

You often need to compare strings and don’t want to carry the overhead of culture-specific comparisons. A perfect example is when you’re comparing internal string data from a configuration file or when you’re comparing file directories. The .NET 2.0 Framework introduces a new enumeration, StringComparison, which allows you to choose a true nonculture-based comparison. The StringComparison enumeration looks like the following:

    Public Enum StringComparison
        CurrentCulture
        CurrentCultureIgnoreCase
        InvariantCulture
        InvariantCultureIgnoreCase
        Ordinal
        OrdinalIgnoreCase
    End Enum

The last two items in the enumeration are the items of interest. An ordinal-based comparison is the most basic string comparison: it simply compares the two strings character by character, based on the numeric value of each character (it actually compares the raw binary values of each character).
Doing comparisons this way removes all cultural bias from the comparisons and increases the efficiency of these comparisons tremendously.

The .NET 2.0 Framework features a new class called StringComparer that implements the IComparer interface. Things such as sorted collections can use StringComparer to manage the sort. The System.StringComparer type follows the same pattern as the IFormattable locale support. You can use the StringComparer.CurrentCulture property to get a StringComparer instance specific to the culture of the current thread. Additionally, you can get a StringComparer instance from StringComparer.CurrentCultureIgnoreCase to do case-insensitive comparison, as well as culture-invariant instances using the InvariantCulture and InvariantCultureIgnoreCase properties. Lastly, you can use the Ordinal and OrdinalIgnoreCase properties to get instances that compare based on ordinal string comparison rules.

As you may expect, if the culture information attached to the current thread isn’t what you need, you can create StringComparer instances based upon explicit locales simply by calling the StringComparer.Create method and passing the CultureInfo representing the locale you want, as well as a flag denoting whether you want a case-sensitive or case-insensitive comparer. You identify the locale with the same culture name string (for example, "de-DE") that you pass to the CultureInfo constructor.

When choosing between the various comparison techniques, the general rule of thumb is to use the culture-specific or culture-invariant comparisons for any user-facing data (that is, data that will be presented to end users in some form or fashion) and ordinal comparisons otherwise. However, it’s rare that you’d use InvariantCulture comparisons for strings that are displayed to users. Use the ordinal comparisons when dealing with data that is completely internal.
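A brief sketch of these comparison options follows. The module name CompareSketch and the sample strings are assumptions made for illustration. The calls used are String.Equals and String.Compare with a StringComparison argument, StringComparer.Create, and List(Of T).Sort with a comparer.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization

' Hypothetical module name, used only for this sketch.
Public Module CompareSketch
    Public Sub Main()
        ' Ordinal, case-insensitive: a reasonable fit for internal data
        ' such as configuration keys.
        Dim sameKey As Boolean = _
            String.Equals("LogLevel", "loglevel", _
                StringComparison.OrdinalIgnoreCase)
        Console.WriteLine(sameKey)
        ' prints: True

        ' Ordinal comparison looks at raw character values, so "Z" (90)
        ' sorts before "a" (97).
        Console.WriteLine( _
            String.Compare("Z", "a", StringComparison.Ordinal) < 0)
        ' prints: True

        ' A comparer for an explicit locale; passing True as the second
        ' argument requests a case-insensitive comparer.
        Dim germanComparer As StringComparer = _
            StringComparer.Create(New CultureInfo("de-DE"), True)
        Console.WriteLine(germanComparer.Equals("Apfel", "APFEL"))
        ' prints: True

        ' Collections accept a StringComparer to control the sort order.
        Dim names As New List(Of String)()
        names.Add("zeta")
        names.Add("Alpha")
        names.Add("beta")
        names.Sort(StringComparer.OrdinalIgnoreCase)
        Console.WriteLine(String.Join(", ", names.ToArray()))
        ' prints: Alpha, beta, zeta
    End Sub
End Module
```

Passing the comparison kind explicitly, instead of relying on the default overloads, makes it clear at the call site whether the current culture is involved.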
Working with Strings from Outside Sources

Within .NET, all strings are represented using Unicode UTF-16 character arrays. However, you often might need to interface with the outside world using some other form of encoding, such as UTF-8. Sometimes, even when interfacing with other entities that use 16-bit Unicode strings, those entities may use big-endian Unicode strings, whereas the Intel platform typically uses little-endian Unicode strings. This conversion work is easy with the System.Text.Encoding class. This section goes into some of the details of System.Text.Encoding.

This cursory example demonstrates how to convert to and from various encodings using the Encoding objects served up by the System.Text.Encoding class:

    Imports System
    Imports System.Text
    Imports System.Windows.Forms

    Public Class EntryPoint
        Shared Sub Main()
            Dim leUnicodeStr As String = "???????!"

            Dim leUnicode As Encoding = Encoding.Unicode
            Dim beUnicode As Encoding = Encoding.BigEndianUnicode
            Dim utf8 As Encoding = Encoding.UTF8

            Dim leUnicodeBytes As Byte() = leUnicode.GetBytes(leUnicodeStr)
            Dim beUnicodeBytes As Byte() = _
                Encoding.Convert(leUnicode, beUnicode, leUnicodeBytes)
            Dim utf8Bytes As Byte() = _
                Encoding.Convert(leUnicode, utf8, leUnicodeBytes)

            MessageBox.Show(leUnicodeStr, "Original String")

            Dim sb As StringBuilder = New StringBuilder()
            For Each b As Byte In leUnicodeBytes
                sb.Append(b).Append(" : ")
            Next
            MessageBox.Show(sb.ToString(), "Little Endian Unicode Bytes")

            sb = New StringBuilder()
            For Each b As Byte In beUnicodeBytes
                sb.Append(b).Append(" : ")
            Next
            MessageBox.Show(sb.ToString(), "Big Endian Unicode Bytes")

            sb = New StringBuilder()
            For Each b As Byte In utf8Bytes
                sb.Append(b).Append(" : ")
            Next
            MessageBox.Show(sb.ToString(), "UTF Bytes")
        End Sub
    End Class

The example first starts by creating a System.String with some Russian
text in it. As mentioned, the string holds Unicode text, but is it a big-endian or little-endian Unicode string? The answer depends on what platform you’re running on. On an Intel system, it is normally little-endian. However, since you’re not supposed to access the underlying byte representation of the string because it is encapsulated from you, it doesn’t matter. In order to get the bytes of the string, you should use one of the Encoding objects that you can get from System.Text.Encoding. In the example, you get local references to the Encoding objects for handling big-endian Unicode, little-endian Unicode, and UTF-8. Once you have those, you can use them to convert the string into any byte representation that you want. As you can see, you get three representations of the same string and display the byte sequence values in message boxes.

In this example, the text is based on the Cyrillic alphabet, where each letter takes two bytes in UTF-8 as well as in UTF-16, so the UTF-8 byte array is about the same size as the Unicode byte array. Had the original string been based on the Latin character set, the UTF-8 byte array would be shorter than the Unicode byte array, usually by half, while text in scripts that need three bytes per character in UTF-8 would make it longer. The point is, you should never make any assumption about the storage requirements for any of the encodings. If you need to know how much space is required to store the encoded string, call the Encoding.GetByteCount method to get that value.

■Caution Never make assumptions regarding the internal string representation format of the CLR. Nothing says that the internal representation cannot vary from one platform to the next. It would be unfortunate if your code made assumptions based upon an Intel platform and then failed to run on a Sun platform running the Mono CLR. Microsoft could even choose to run Windows on another platform one day, just as Apple has chosen to start using Intel processors.
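The storage sizes can be checked directly instead of assumed. The following sketch is illustrative only: the module name ByteCountSketch and the three sample strings are assumptions, and the non-Latin text is built from code points with ChrW so that the source file’s own encoding cannot alter it. It calls Encoding.GetByteCount for Cyrillic, Latin, and Japanese text.

```vbnet
Imports System
Imports System.Text

' Hypothetical module name, used only for this sketch.
Public Module ByteCountSketch
    Public Sub Main()
        ' Six Cyrillic letters followed by "!" (U+041F, U+0440, U+0438,
        ' U+0432, U+0435, U+0442).
        Dim russian As String = ChrW(&H41F) & ChrW(&H440) & ChrW(&H438) & _
            ChrW(&H432) & ChrW(&H435) & ChrW(&H442) & "!"
        ' Six plain Latin (ASCII) characters.
        Dim latin As String = "Hello!"
        ' Three CJK characters (U+65E5, U+672C, U+8A9E).
        Dim japanese As String = ChrW(&H65E5) & ChrW(&H672C) & ChrW(&H8A9E)

        ' UTF-16 always needs two bytes for each of these characters.
        Console.WriteLine(Encoding.Unicode.GetByteCount(russian))   ' 14
        Console.WriteLine(Encoding.Unicode.GetByteCount(latin))     ' 12
        Console.WriteLine(Encoding.Unicode.GetByteCount(japanese))  ' 6

        ' UTF-8 needs one byte for ASCII, two for Cyrillic, three for these
        ' CJK characters.
        Console.WriteLine(Encoding.UTF8.GetByteCount(russian))      ' 13
        Console.WriteLine(Encoding.UTF8.GetByteCount(latin))        ' 6
        Console.WriteLine(Encoding.UTF8.GetByteCount(japanese))     ' 9
    End Sub
End Module
```

Because the ratio between the two encodings depends on the script, asking the Encoding object for the count is safer than multiplying the string length by a fixed factor.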
Usually, you need to go the opposite way with the conversion and convert an array of bytes from the outside world into a string that the system can then manipulate easily. For example, the Bluetooth protocol stack uses big-endian Unicode strings to transfer string data. To convert the bytes into a System.String, use the GetString method on the encoder that you’re using. You must also use the encoder that matches the source encoding of your data.

This brings up an important note to keep in mind. When passing string data to and from other systems in raw byte format, you must always know the encoding scheme used by the protocol you’re using. Most importantly, you must always use that encoding’s matching Encoding object to convert the byte array into a System.String, even if you know that the encoding in the protocol is the same as that used internally with System.String on the platform where you’re building the application. Why? Suppose you’re developing your application on an Intel platform and the protocol encoding is little-endian, which you know is the same as the platform encoding. If you take a shortcut and don’t use the System.Text.Encoding.Unicode object to convert the bytes to the string, when you decide to run the application on a platform that happens to use big-endian strings internally, you’ll be surprised when the application starts to crumble because you falsely assumed what encoding System.String uses internally. Efficiency is not a problem if you always use the encoder, because on platforms where the internal encoding is the same as the external encoding, the conversion will essentially boil down to nothing.

In the previous example, you saw use of the StringBuilder class in order to build the list of bytes for display. Let’s now take a look at what the StringBuilder type is all about.
StringBuilder

Since System.String objects are immutable, sometimes they create efficiency bottlenecks when you’re trying to build strings on the fly. You can create composite strings using the + operator as follows:

    Dim compound As String = "Vote" + " for " + "Pedro"

However, this method isn’t efficient, since you have to create four strings to get the job done. Although this line of code is rather contrived, you can imagine that the efficiency of a complex system that does lots of string manipulation can quickly go downhill. Consider a case where you implement a custom base64 encoder that appends characters incrementally as it processes a binary file. The .NET library already offers this functionality in the System.Convert class, but let’s ignore that for the sake of example. If you were to repeatedly use the + operator in a loop to create a large base64 string, your performance would quickly degrade as the source data increased in size. For these situations, you can use the System.Text.StringBuilder class, which implements a mutable string specifically for building composite strings efficiently.

We won’t go over each of the methods of StringBuilder in detail; however, we’ll cover some of the salient points. StringBuilder internally maintains an array of characters that it manages dynamically. The workhorse methods of StringBuilder are Append(), Insert(), and AppendFormat(). These methods are richly overloaded in order to support appending and inserting string forms of the many common types. When you create a StringBuilder instance, you have various constructors to choose from. The default constructor creates a new StringBuilder instance with the system-defined default capacity. However, that capacity doesn’t constrain the size of the string that it can create. Rather, it represents the amount of string data the StringBuilder can hold before it needs to grow the internal buffer and increase the capacity.
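As a small sketch of the workhorse methods named above, the following builds one string with Append(), AppendFormat(), and Insert(). The module name StringBuilderSketch and the capacity value of 64 are arbitrary choices made for this illustration.

```vbnet
Imports System
Imports System.Text

' Hypothetical module name, used only for this sketch.
Public Module StringBuilderSketch
    Public Sub Main()
        ' 64 is an arbitrary starting capacity chosen for this sketch;
        ' the buffer grows on demand if more room is needed.
        Dim sb As StringBuilder = New StringBuilder(64)

        ' Append has overloads for many common types, including Integer,
        ' and returns the builder so that calls can be chained.
        For i As Integer = 1 To 5
            sb.Append(i).Append(" ")
        Next

        ' AppendFormat applies composite formatting directly into the buffer.
        sb.AppendFormat("(total: {0})", 15)

        ' Insert places text at an arbitrary position, here the front.
        sb.Insert(0, "Numbers: ")

        Console.WriteLine(sb.ToString())
        ' prints: Numbers: 1 2 3 4 5 (total: 15)

        ' The capacity is always at least the current length.
        Console.WriteLine(sb.Length <= sb.Capacity)
        ' prints: True
    End Sub
End Module
```

All of the edits happen in the builder’s internal buffer, and a String is created only when ToString() is called at the end.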
If you know how big your string will likely end up being, you can give the StringBuilder that number in one of the constructor overloads, and it will initialize the buffer accordingly. This can keep the StringBuilder instance from having to reallocate the buffer too often while you fill it.

You can also define the maximum-capacity property in the constructor overloads. By default, the maximum capacity is System.Int32.MaxValue, which is currently 2,147,483,647, but that exact value is subject to change as the system evolves. If you need to protect your StringBuilder buffer from growing over a certain size, you may provide an alternate maximum capacity in one of the constructor overloads. If either an append or insert operation forces the need for the buffer to grow greater than the maximum capacity, an ArgumentOutOfRangeException will be thrown.

[...]

    ... First part match " & vbCrLf & _
    "([01]?\d\d?   # At least one digit," & vbCrLf & _
    "              # possibly prepended by 0 or 1" & vbCrLf & _
    "              # and possibly followed by another digit" & _
    vbCrLf & "# OR " & vbCrLf & _
    "|2[0-4]\d     # Starts with a 2, after a number from 0-4" & _
    vbCrLf & "              # and then any digit" & vbCrLf & _
    "# OR " & vbCrLf & _
    "|25[0-5])     # 25 followed by a number from 0-5" & vbCrLf & _
    "\.            # The whole group is followed by a period." & _
    vbCrLf & "# REPEAT " & vbCrLf & "([01]?\d\d?|2[0-4]\d|25[0-5])\. " & _
    vbCrLf & "# REPEAT " & vbCrLf & "([01]?\d\d?|2[0-4]\d|25[0-5])\. " & _
    vbCrLf & "# REPEAT " & vbCrLf & "([01]?\d\d?|2[0-4]\d|25[0-5])"

    Dim regex As Regex = _
        New Regex(pattern, RegexOptions.IgnorePatternWhitespace)

[...]
    Dim array2 As Integer() = New Integer() {2, 4, 6, 8}
    Dim array3 As Integer() = {1, 3, 5, 7}

    Dim i As Integer = 0
    For i = 0 To array1.Length - 1
        array1(i) = i * 2
    Next

    For Each item As Integer In array1
        Console.WriteLine("array1: " + item.ToString)
    Next

    Console.WriteLine(vbCrLf)

    For Each item As Integer In array2
        Console.WriteLine("array2:...

[...]

... collection interfaces and iterators, which are new to Visual Basic 2005 (VB 2005), along with the cool things you can do with them. Traditionally, creating enumerators for collection types has been mundane and annoying. Iterators make this task a breeze, while making your code a lot more readable in the process.

Introduction to Arrays

A VB array is a built-in, implicit type to the runtime. When you declare...

[...]

... group can be from one to three digits in length. This is a simplistic search because it will match an invalid IP address such as 999.888.777.666. A better search for the IP address would look like the following:

    Imports System
    Imports System.Text.RegularExpressions

    Public Class EntryPoint
        Shared Sub Main(ByVal args As String())
            If args.Length < 1 Then
                Console.WriteLine("You must provide a string.")
                Return
            End If

            'Create regex to search for IP address pattern
            Dim pattern As String = "([01]?\d\d?|2[0-4]\d|25[0-5])\." + _
                "([01]?\d\d?|2[0-4]\d|25[0-5])\."...

[...]

... agree that the string and text-handling facilities built into the CLR, .NET, and VB are well designed and easy to use. In Chapter 11, we’ll cover arrays and other, more versatile collection types. After arrays and collection types, our discussion will turn to the topic of iteration.

CHAPTER 11

Arrays and Collections

Collection...

[...]

... that you’re interested in.

■Note All arrays created within VB using the standard VB array declaration syntax will have a lower bound of 0. However, if you’re dealing with arrays used for mathematical purposes, as well as arrays that come from assemblies written in other languages, you may need to consider that the lower bound may not be 0.

[...]

... type first.

■Note Generally, when implementing your own list types, you should derive your implementation from Collection(Of T) in the System.Collections.ObjectModel namespace.

Dictionaries

The .NET 2.0 Framework implements IDictionary(Of TKey, TValue) as a strongly typed counterpart to IDictionary. Concrete types that implement...

[...]

... useful. This namespace contains only three types, and the main reason these types were broken out into their own namespace is that the VB environment already contains a Collection type that is implemented by a namespace it imports by default. The VB team was also concerned that VB users might be confused seeing two types with similar names and drastically different behaviors popping up in IntelliSense. These...