Web Development with PHP and MySQL - Part 14

... return a number greater than zero. If str1 is less than str2, strcmp() will return a number less than zero. This function is case sensitive.

The function strcasecmp() is identical except that it is not case sensitive.

The function strnatcmp() and its non-case-sensitive twin, strnatcasecmp(), were added in PHP 4. These functions compare strings according to a “natural ordering,” which is closer to the way a human would do it. For example, strcmp() would order the string “2” as greater than the string “12” because it is lexicographically greater. strnatcmp() would do it the other way around. You can read more about natural ordering at http://www.linuxcare.com.au/projects/natsort/

Testing String Length with strlen()

We can check the length of a string with the strlen() function. If you pass it a string, this function returns its length. For example, strlen("hello") returns 5.

This can be used for validating input data. Consider the email address on our form, stored in $email. One basic way of validating it is to check its length. By my reasoning, the minimum length of an email address is six characters: for example, a@a.to if you have a country code with no second-level domains, a one-letter server name, and a one-letter email address. Therefore, an error could be produced if the address is shorter than this:

    if (strlen($email) < 6)
    {
      echo "That email address is not valid";
      exit; // finish execution of PHP script
    }

Clearly, this is a very simplistic way of validating this information. We will look at better ways in the next section.

Matching and Replacing Substrings with String Functions

It’s common to want to check if a particular substring is present in a larger string. This partial matching is usually more useful than testing for equality.

In our Smart Form example, we want to look for certain key phrases in the customer feedback and send the mail to the appropriate department.
If we want to send emails talking about Bob’s shops to the retail manager, we want to know if the word “shop” (or derivatives thereof) appears in the message.

Chapter 4: String Manipulation and Regular Expressions

Given the functions we have already looked at, we could use explode() or strtok() to retrieve the individual words in the message, and then compare them using the == operator or strcmp(). However, we could also do the same thing with a single function call to one of the string matching or regular expression matching functions. These are used to search for a pattern inside a string. We’ll look at each set of functions one by one.

Finding Strings in Strings: strstr(), strchr(), strrchr(), stristr()

To find a string within another string you can use any of the functions strstr(), strchr(), strrchr(), or stristr().

The function strstr() is the most generic, and can be used to find a string or character match within a longer string. Note that in PHP, the strchr() function is exactly the same as strstr(), although its name implies that it is used to find a character in a string, similar to the C version of this function. In PHP, either of these functions can be used to find a string inside a string, including finding a string containing only a single character.

The prototype for strstr() is as follows:

    string strstr(string haystack, string needle);

You pass the function a haystack to be searched and a needle to be found. If an exact match of the needle is found, the function returns the haystack from the needle onwards; otherwise it returns false. If the needle occurs more than once, the returned string will start from the first occurrence of needle.
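
The return values just described can be seen in a minimal sketch. The sample text in $haystack is a made-up assumption for illustration only and is not part of the Smart Form application.

```php
<?php
// Hypothetical sample text, used only for this illustration.
$haystack = "Please check the delivery date for my delivery";

// strstr() returns the rest of the haystack, starting at the first match.
$tail = strstr($haystack, "delivery");
echo $tail, "\n";   // delivery date for my delivery

// When the needle is absent, strstr() returns false rather than a string.
$missing = strstr($haystack, "refund");
var_dump($missing); // bool(false)

// strchr() is an alias of strstr(), so it gives the same result.
var_dump(strchr($haystack, "delivery") === $tail); // bool(true)
```

Because a successful call returns a non-empty string and a failed call returns false, the result can usually be used directly as an if condition.
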
For example, in the Smart Form application, we can decide where to send the email as follows:

    $toaddress = "feedback@bobsdomain.com";  // the default value

    // Change the $toaddress if the criteria are met
    if (strstr($feedback, "shop"))
      $toaddress = "retail@bobsdomain.com";
    else if (strstr($feedback, "delivery"))
      $toaddress = "fulfilment@bobsdomain.com";
    else if (strstr($feedback, "bill"))
      $toaddress = "accounts@bobsdomain.com";

This code checks for certain keywords in the feedback and sends the mail to the appropriate person. If, for example, the customer feedback reads “I still haven’t received delivery of my last order,” the string “delivery” will be detected and the feedback will be sent to fulfilment@bobsdomain.com.

Part I: Using PHP

There are two variants on strstr(). The first variant is stristr(), which is nearly identical but is not case sensitive. This will be useful for this application as the customer might type “delivery”, “Delivery”, or “DELIVERY”.

The second variant is strrchr(), which is again nearly identical, but will return the haystack from the last occurrence of the needle onwards. Note that strrchr() uses only the first character of the needle when searching.

Finding the Position of a Substring: strpos(), strrpos()

The functions strpos() and strrpos() operate in a similar fashion to strstr(), except that, instead of returning a substring, they return the numerical position of a needle within a haystack.

The strpos() function has the following prototype:

    int strpos(string haystack, string needle, int [offset]);

The integer returned represents the position of the first occurrence of the needle within the haystack. The first character is in position 0 as usual.

For example, the following code will echo the value 4 to the browser:

    $test = "Hello world";
    echo strpos($test, "o");

In this case, we have only passed in a single character as the needle, but it can be a string of any length.

The optional offset parameter is used to specify a point within the haystack to start searching.
For example:

    echo strpos($test, "o", 5);

This code will echo the value 7 to the browser because PHP has started looking for the character o at position 5, and therefore does not see the one at position 4.

The strrpos() function is almost identical, but will return the position of the last occurrence of the needle in the haystack. Unlike strpos(), it only works with a single character needle. Therefore, if you pass it a string as a needle, it will only use the first character of the string to match.

In any of these cases, if the needle is not in the string, strpos() or strrpos() will return false. This can be problematic because false in a weakly typed language such as PHP is equivalent to 0, that is, the first character in a string.

You can avoid this problem by using the === operator to test return values:

    $result = strpos($test, "H");
    if ($result === false)
      echo "Not found";
    else
      echo "Found at position 0";

Note that this will only work in PHP 4; in earlier versions you can test for false by testing the return value to see if it is a string (that is, false).

Replacing Substrings: str_replace(), substr_replace()

Find-and-replace functionality can be extremely useful with strings. We have used find and replace in the past for personalizing documents generated by PHP, for example by replacing <<name>> with a person’s name and <<address>> with their address. You can also use it for censoring particular terms, such as in a discussion forum application, or even in the Smart Form application.

Again, you can use string functions or regular expression functions for this purpose.

The most commonly used string function for replacement is str_replace(). It has the following prototype:

    string str_replace(string needle, string new_needle, string haystack);

This function will replace all the instances of needle in haystack with new_needle.
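
The document-personalization use mentioned above could look roughly like the following sketch. The template text and the customer values are invented for illustration; only the <<name>> and <<address>> placeholders come from the text.

```php
<?php
// Hypothetical template and customer data, used only for illustration.
$template = "Dear <<name>>, your order will be sent to <<address>>.";
$name     = "Alice";
$address  = "12 High Street";

// Each call replaces every occurrence of the needle (first argument)
// with the replacement (second argument) in the haystack (third argument).
$letter = str_replace("<<name>>", $name, $template);
$letter = str_replace("<<address>>", $address, $letter);

echo $letter, "\n";
// Dear Alice, your order will be sent to 12 High Street.
```

Note that str_replace() returns the modified string and leaves its arguments untouched, which is why the result is assigned back to $letter.
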
For example, because people can use the Smart Form to complain, they might use some colorful words. As programmers, we can prevent Bob’s various departments from being abused in that way (here $offcolor is assumed to hold the term to be censored):

    $feedback = str_replace($offcolor, "%!@*", $feedback);

The function substr_replace() is used to replace a particular substring of a string, identified by its position. It has the following prototype:

    string substr_replace(string string, string replacement, int start, int [length]);

This function will replace part of the string string with the string replacement. Which part is replaced depends upon the values of the start and optional length parameters.

The start value represents an offset into the string where replacement should begin. If it is 0 or positive, it is an offset from the beginning of the string; if it is negative, it is an offset from the end of the string. For example, this line of code will replace the last character in $test with “X”:

    $test = substr_replace($test, "X", -1);

The length value is optional and represents the point at which PHP will stop replacing. If you don’t supply this value, the string will be replaced from start to the end of the string.

If length is zero, the replacement string will actually be inserted into the string without overwriting the existing string.

A positive length represents the number of characters that you want replaced with the new string.

A negative length represents the point at which you’d like to stop replacing characters, counted from the end of the string.

Introduction to Regular Expressions

PHP supports two styles of regular expression syntax: POSIX and Perl. The POSIX style of regular expression is compiled into PHP by default, but you can use the Perl style by compiling in the PCRE (Perl-compatible regular expression) library.
We’ll cover the simpler POSIX style, but if you’re already a Perl programmer, or want to learn more about PCRE, read the online manual at http://php.net.

So far, all the pattern matching we’ve done has used the string functions. We have been limited to exact match, or to exact substring match. If you want to do more complex pattern matching, you should use regular expressions. Regular expressions are difficult to grasp at first but can be extremely useful.

The Basics

A regular expression is a way of describing a pattern in a piece of text. The exact (or literal) matches we’ve done so far are a form of regular expression. For example, earlier we were searching for regular expression terms like “shop” and “delivery”.

Matching regular expressions in PHP is more like a strstr() match than an equal comparison because you are matching a string somewhere within another string. (It can be anywhere within that string unless you specify otherwise.) For example, the string “shop” matches the regular expression “shop”. It also matches the regular expressions “h”, “ho”, and so on.

We can use special characters to indicate a meta-meaning in addition to matching characters exactly. For example, with special characters you can indicate that a pattern must occur at the start or end of a string, that part of a pattern can be repeated, or that characters in a pattern must be of a particular type. You can also match on literal occurrences of special characters. We’ll look at each of these.

Character Sets and Classes

Using character sets immediately gives regular expressions more power than exact matching expressions. Character sets can be used to match any character of a particular type; they’re really a kind of wildcard.

First of all, you can use the . character as a wildcard for any other single character except a new line (\n).
For example, the regular expression

    .at

matches the strings “cat”, “sat”, and “mat”, among others. This kind of wildcard matching is often used for filename matching in operating systems.

With regular expressions, however, you can be more specific about the type of character you would like to match, and you can actually specify a set that a character must belong to. In the previous example, the regular expression matches “cat” and “mat”, but also matches “#at”. If you want to limit this to a character between a and z, you can specify it as follows:

    [a-z]

Anything enclosed in the special square brace characters [ and ] is a character class: a set of characters to which a matched character must belong. Note that the expression in the square brackets matches only a single character.

You can list a set; for example,

    [aeiou]

means any vowel.

You can also describe a range, as we just did using the special hyphen character, or a set of ranges:

    [a-zA-Z]

This set of ranges stands for any alphabetic character in upper- or lowercase.

You can also use sets to specify that a character cannot be a member of a set. For example,

    [^a-z]

matches any character that is not between a and z. The caret symbol means not when it is placed inside the square brackets. It has another meaning when used outside square brackets, which we’ll look at in a minute.

In addition to listing out sets and ranges, a number of predefined character classes can be used in a regular expression. These are shown in Table 4.3.
TABLE 4.3  Character Classes for Use in POSIX-Style Regular Expressions

    Class          Matches
    [[:alnum:]]    Alphanumeric characters
    [[:alpha:]]    Alphabetic characters
    [[:lower:]]    Lowercase letters
    [[:upper:]]    Uppercase letters
    [[:digit:]]    Decimal digits
    [[:xdigit:]]   Hexadecimal digits
    [[:punct:]]    Punctuation
    [[:blank:]]    Tabs and spaces
    [[:space:]]    Whitespace characters
    [[:cntrl:]]    Control characters
    [[:print:]]    All printable characters
    [[:graph:]]    All printable characters except for space

Repetition

Often you want to specify that there might be multiple occurrences of a particular string or class of character. You can represent this using two special characters in your regular expression. The * symbol means that the pattern can be repeated zero or more times, and the + symbol means that the pattern can be repeated one or more times. The symbol should appear directly after the part of the expression that it applies to. For example,

    [[:alnum:]]+

means “at least one alphanumeric character.”

Subexpressions

It’s often useful to be able to split an expression into subexpressions so you can, for example, represent “at least one of these strings followed by exactly one of those.” You can do this using parentheses, exactly the same way as you would in an arithmetic expression. For example,

    (very )*large

matches “large”, “very large”, “very very large”, and so on.

Counted Subexpressions

We can specify how many times something can be repeated by using a numerical expression in curly braces ({}). You can show an exact number of repetitions ({3} means exactly 3 repetitions), a range of repetitions ({2,4} means from 2 to 4 repetitions), or an open-ended range of repetitions ({2,} means at least two repetitions). For example,

    (very ){1,3}

matches “very ”, “very very ”, and “very very very ” (each repetition includes the trailing space).
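
The patterns above can be tried out with a short sketch. One loud assumption: it uses preg_match() from the PCRE extension rather than the POSIX functions this chapter covers, because ereg() and eregi() were removed in PHP 7.0; the patterns themselves are unchanged apart from the / delimiters that PCRE requires. The matches() helper is a hypothetical name introduced only for this example, and its type declarations need PHP 7 or later.

```php
<?php
// Assumption: preg_match() (PCRE) stands in for the POSIX functions;
// the patterns are the ones discussed above, wrapped in / delimiters.

// Hypothetical helper: true when $pattern matches somewhere in $subject.
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(matches('/.at/', 'cat'));          // bool(true)
var_dump(matches('/.at/', '#at'));          // bool(true): . accepts any character
var_dump(matches('/[a-z]at/', '#at'));      // bool(false): # is not in a-z
var_dump(matches('/[[:alnum:]]+/', '!!!')); // bool(false): no alphanumeric character
var_dump(matches('/(very )*large/', 'very very large')); // bool(true)
var_dump(matches('/(very ){1,3}/', 'very large'));       // bool(true)
var_dump(matches('/(very ){2,}/', 'very large'));        // bool(false): only one repetition
```

As with strstr(), each pattern may match anywhere inside the subject string, which is why .at still matches “#at”.
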
Anchoring to the Beginning or End of a String

You can specify whether a particular subexpression should appear at the start, the end, or both. This is pretty useful when you want to make sure that only your search term and nothing else appears in the string.

The caret symbol (^) is used at the start of a regular expression to show that it must appear at the beginning of a searched string, and $ is used at the end of a regular expression to show that it must appear at the end.

For example, this matches bob at the start of a string:

    ^bob

This matches com at the end of a string:

    com$

Finally, this matches any single character from a to z, in the string on its own:

    ^[a-z]$

Branching

You can represent a choice in a regular expression with a vertical pipe. For example, if we want to match com, edu, or net, we can use the expression:

    (com)|(edu)|(net)

Matching Literal Special Characters

If you want to match one of the special characters mentioned in this section, such as ., {, or $, you must put a backslash (\) in front of it. If you want to represent a backslash, you must replace it with two backslashes, \\.

Summary of Special Characters

A summary of all the special characters is shown in Tables 4.4 and 4.5. Table 4.4 shows the meaning of special characters outside square brackets, and Table 4.5 shows their meaning when used inside square brackets.

TABLE 4.4  Summary of Special Characters Used in POSIX Regular Expressions Outside Square Brackets

    Character    Meaning
    \            Escape character
    ^            Match at start of string
    $            Match at end of string
    .            Match any character except newline (\n)
    |            Start of alternative branch (read as OR)
    (            Start subpattern
    )            End subpattern
    *            Repeat 0 or more times
    +            Repeat 1 or more times
    {            Start min/max quantifier
    }            End min/max quantifier

TABLE 4.5  Summary of Special Characters Used in POSIX Regular Expressions Inside Square Brackets

    Character    Meaning
    \            Escape character
    ^            NOT, only if used in initial position
    -            Used to specify character ranges

Putting It All Together for the Smart Form

There are at least two possible uses of regular expressions in the Smart Form application. The first use is to detect particular terms in the customer feedback. We can be slightly smarter about this using regular expressions. Using a string function, we’d have to do three different searches if we wanted to match on “shop”, “customer service”, or “retail”. With a regular expression, we can match all three:

    shop|customer service|retail

The second use is to validate customer email addresses in our application by encoding the standardized format of an email address in a regular expression. The format includes some alphanumeric or punctuation characters, followed by an @ symbol, followed by a string of alphanumeric and hyphen characters, followed by a dot, followed by more alphanumeric and hyphen characters and possibly more dots, up until the end of the string, which encodes as follows:

    ^[a-zA-Z0-9_]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$

The subexpression ^[a-zA-Z0-9_]+ means “start the string with at least one letter, number, or underscore, or some combination of those.” The @ symbol matches a literal @.

The subexpression [a-zA-Z0-9\-]+ matches the first part of the host name including alphanumeric characters and hyphens. Note that we’ve escaped the hyphen with a backslash because it’s a special character inside square brackets.

The \. combination matches a literal dot (.).

The subexpression [a-zA-Z0-9\-\.]+$ matches the rest of a domain name, including letters, numbers, hyphens, and more dots if required, up until the end of the string.

A bit of analysis shows that you can produce invalid email addresses that will still match this regular expression. It is almost impossible to catch them all, but this will improve the situation a little.

Now that you have read about regular expressions, we’ll look at the PHP functions that use them.

Finding Substrings with Regular Expressions

Finding substrings is the main application of the regular expressions we just developed. The two functions available in PHP for matching regular expressions are ereg() and eregi().

The ereg() function has the following prototype:

    int ereg(string pattern, string search, array [matches]);

This function searches the search string, looking for matches to the regular expression in pattern. If matches are found for subexpressions of pattern, they will be stored in the array matches, one subexpression per array element.

The eregi() function is identical except that it is not case sensitive.
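
The two Smart Form uses can be sketched as follows. This is an illustration under stated assumptions, not the chapter's own code: ereg() and eregi() were removed in PHP 7.0, so the sketch swaps in preg_match(), with the /i flag standing in for eregi(). The helper names looks_like_email() and route_feedback() are hypothetical, and only the retail branch of the routing is shown.

```php
<?php
// Assumption: preg_match() stands in for ereg(), and the /i flag for eregi().

// Hypothetical helper: true when $email matches the pattern from the text.
function looks_like_email(string $email): bool
{
    $pattern = '/^[a-zA-Z0-9_]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$/';
    return preg_match($pattern, $email) === 1;
}

// Hypothetical helper: picks a department address from the feedback text.
function route_feedback(string $feedback): string
{
    // One case-insensitive search covers all three retail-related terms.
    if (preg_match('/shop|customer service|retail/i', $feedback) === 1) {
        return "retail@bobsdomain.com";
    }
    return "feedback@bobsdomain.com"; // the default value
}

var_dump(looks_like_email("bob@bobsdomain.com")); // bool(true)
var_dump(looks_like_email("bob@bobsdomain"));     // bool(false): no dot after the host
var_dump(looks_like_email("a@b..c"));             // bool(true): invalid, yet it matches
echo route_feedback("The Customer Service desk was closed"), "\n";
// retail@bobsdomain.com
```

The third call shows the limitation noted above: an address with consecutive dots is not valid, but it still satisfies the pattern.
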

Posted: 06/07/2014, 19:20
