Chapter 4 String Manipulation and Regular Expressions

characters as control characters. The problematic ones are quotes (single and double), backslashes (\), and the NUL character.

We need a way of marking or escaping these characters so that databases such as MySQL can understand that we meant a literal special character rather than a control sequence. To escape these characters, add a backslash in front of them. For example, " (double quote) becomes \" (backslash double quote), and \ (backslash) becomes \\ (backslash backslash). (This rule applies universally to special characters, so if you have \\ in your string, you need to replace it with \\\\.)

PHP provides two functions specifically designed for escaping characters. Before you write any strings into a database, you should reformat them with AddSlashes(), for example:

    $feedback = AddSlashes($feedback);

Like many of the other string functions, AddSlashes() takes a string as a parameter and returns the reformatted string.

When you use AddSlashes(), the string will be stored in the database with the slashes in it. When you retrieve the string, you will need to remember to take the slashes out. You can do this using the StripSlashes() function:

    $feedback = StripSlashes($feedback);

Figure 4.3 shows the actual effects of using these functions on the string.

Figure 4.3 After calling the AddSlashes() function, all the quotes have been slashed out. StripSlashes() removes the slashes.

You can also set PHP up to add and strip slashes automatically. This is called using magic quotes. You can read more about magic quotes in Chapter 21, “Other Useful Features.”

Joining and Splitting Strings with String Functions

Often, we want to look at parts of a string individually. For example, we might want to look at words in a sentence (say, for spellchecking), or split a domain name or email address into its component parts.
PHP provides several string functions (and one regular expression function) that allow us to do this. In our example, Bob wants any customer feedback from bigcustomer.com to go directly to him, so we will split the email address the customer typed in into parts to find out whether they work for Bob’s big customer.

Using explode(), implode(), and join()

The first function we could use for this purpose, explode(), has the following prototype:

    array explode(string separator, string input [, int limit]);

This function takes a string input and splits it into pieces on a specified separator string. The pieces are returned in an array. You can limit the number of pieces with the optional limit parameter, added in PHP 4.0.1.

To get the domain name from the customer’s email address in our script, we can use the following code:

    $email_array = explode('@', $email);

This call to explode() splits the customer’s email address into two parts: the username, which is stored in $email_array[0], and the domain name, which is stored in $email_array[1]. Now we can test the domain name to determine the customer’s origin, and then send their feedback to the appropriate person:

    if ($email_array[1] == 'bigcustomer.com')
        $toaddress = 'bob@example.com';
    else
        $toaddress = 'feedback@example.com';

Note that this test fails if the customer typed the domain with any capital letters, because the == comparison is case sensitive. We can avoid this problem by converting the domain to all lowercase (or all uppercase) before the test, and then comparing it against a string written in that same case:

    $email_array[1] = strtolower($email_array[1]);

You can reverse the effects of explode() using either implode() or join(), which are identical. For example,

    $new_email = implode('@', $email_array);

This takes the array elements from $email_array and joins them together with the string passed in the first parameter. The function call is very similar to explode(), but the effect is the opposite.
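The splitting, case folding, and rejoining steps described above can be pulled together in one place. The following is only a minimal sketch: the helper name route_feedback() is hypothetical and not part of the Smart Form script, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (an assumption for illustration): chooses a recipient
// from the domain part of the customer's email address.
function route_feedback(string $email): string
{
    // Split on '@'; a limit of 2 keeps everything after the first '@' together.
    $email_array = explode('@', $email, 2);

    // With no '@' there is only one piece, so use the general inbox.
    if (count($email_array) < 2) {
        return 'feedback@example.com';
    }

    // Fold the domain to lowercase so 'BigCustomer.COM' still matches.
    $domain = strtolower($email_array[1]);

    return $domain === 'bigcustomer.com'
        ? 'bob@example.com'
        : 'feedback@example.com';
}

echo route_feedback('Jane@BigCustomer.com'), "\n";  // bob@example.com
echo route_feedback('sam@elsewhere.org'), "\n";     // feedback@example.com

// implode() reverses explode(): the pieces are joined with the separator.
$parts = explode('@', 'jane@bigcustomer.com');
echo implode('@', $parts), "\n";                    // jane@bigcustomer.com
```

Checking the number of pieces before reading $email_array[1] is a design choice made here so that an address typed without an @ does not trigger an undefined-index notice.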
Using strtok()

Unlike explode(), which breaks a string into all its pieces at one time, strtok() gets pieces (called tokens) from a string one at a time. strtok() is a useful alternative to using explode() for processing words from a string one at a time.

The prototype for strtok() is

    string strtok(string input, string separator);

The separator can be either a character or a string of characters, but note that the input string will be split on each of the characters in the separator string rather than on the whole separator string (as explode() does).

Calling strtok() is not quite as simple as it seems in the prototype. To get the first token from a string, you call strtok() with the string you want tokenized and a separator. To get the subsequent tokens from the string, you pass just a single parameter: the separator. The function keeps its own internal pointer to its place in the string. If you want to reset the pointer, you can pass the string into it again.

strtok() is typically used as follows:

    $token = strtok($feedback, ' ');
    // strtok() returns false once there are no more tokens
    while ($token !== false)
    {
        echo $token.'<br />';
        $token = strtok(' ');
    }

As usual, it’s a good idea to check that the customer actually typed some feedback in the form, using, for example, empty(). We have omitted these checks for brevity.

This prints each token from the customer’s feedback on a separate line, and loops until there are no more tokens.

Note that prior to version 4.1.0, PHP’s strtok() didn’t work exactly the same as the one in C. If there are two instances of a separator in a row in your target string (in this example, two spaces in a row), strtok() returns an empty string. You cannot differentiate this from the empty string returned when you get to the end of the target string. Also, if one of the tokens is 0, the empty string will be returned. This made PHP’s strtok() somewhat less useful than the one in C.
The new version works correctly, skipping empty strings.

Using substr()

The substr() function enables you to access a substring between given start and end points of a string. It’s not appropriate for our example, but can be useful when you need to get at parts of fixed-format strings.

The substr() function has the following prototype:

    string substr(string string, int start [, int length]);

This function returns a substring copied from within string.

We will look at examples using this test string:

    $test = 'Your customer service is excellent';

If you call it with a positive number for start (only), you will get the string from the start position to the end of the string. For example,

    substr($test, 1);

returns our customer service is excellent. Note that the string position starts from 0, as with arrays.

If you call substr() with a negative start (only), you will get the string from the end of the string minus start characters to the end of the string. For example,

    substr($test, -9);

returns excellent.

The length parameter can be used to specify either a number of characters to return (if it is positive), or the end character of the return sequence (if it is negative). For example,

    substr($test, 0, 4);

returns the first four characters of the string, namely, Your. The following code:

    echo substr($test, 4, -13);

prints the characters from position 4 up to, but not including, the thirteenth-to-last character, that is, customer service (preceded by the space that sits at position 4).

Comparing Strings

So far we’ve just used == to compare two strings for equality. We can do some slightly more sophisticated comparisons using PHP. We’ve divided these into two categories: partial matches and others. We’ll deal with the others first, and then get into partial matching, which we will require to further develop the Smart Form example.

String Ordering: strcmp(), strcasecmp(), and strnatcmp()

These functions can be used to order strings. This is useful when sorting data.
The prototype for strcmp() is

    int strcmp(string str1, string str2);

The function expects to receive two strings, which it will compare. If they are equal, it will return 0. If str1 comes after (or is greater than) str2 in lexicographic order, strcmp() will return a number greater than zero. If str1 is less than str2, strcmp() will return a number less than zero. This function is case sensitive.

The function strcasecmp() is identical except that it is not case sensitive.

The function strnatcmp() and its non-case-sensitive twin, strnatcasecmp(), were added in PHP 4. These functions compare strings according to a “natural ordering,” which is closer to the way a human would do it. For example, strcmp() would order the string "2" as greater than the string "12" because it is lexicographically greater. strnatcmp() would do it the other way around. You can read more about natural ordering at http://www.naturalordersort.org/

Testing String Length with strlen()

We can check the length of a string with the strlen() function. If you pass it a string, this function will return its length. For example, strlen('hello') returns 5.

This can be used for validating input data. Consider the email address on our form, stored in $email. One basic way of validating it is to check its length. By my reasoning, the minimum length of an email address is six characters: for example, a@a.to if you have a country code with no second-level domains, a one-letter server name, and a one-letter email address. Therefore, an error could be produced if the address was not at least this length:

    if (strlen($email) < 6)
    {
        echo 'That email address is not valid';
        exit; // finish execution of PHP script
    }

Clearly, this is a very simplistic way of validating this information. We will look at better ways in the next section.
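Returning briefly to the ordering functions covered earlier in this section, the difference between lexicographic and natural ordering is easiest to see when sorting. The following is a minimal sketch; the file names are made up for illustration, and passing a function name to usort() is just one convenient way to apply each comparison.

```php
<?php
// Illustrative data only (an assumption for this sketch).
$files = ['img12.png', 'img10.png', 'img2.png', 'img1.png'];

// Character-by-character ordering: '1' sorts before '2', so img10 and
// img12 both come before img2.
$lexicographic = $files;
usort($lexicographic, 'strcmp');

// Natural ordering: runs of digits are compared by numeric value.
$natural = $files;
usort($natural, 'strnatcmp');

echo implode(', ', $lexicographic), "\n";  // img1.png, img10.png, img12.png, img2.png
echo implode(', ', $natural), "\n";        // img1.png, img2.png, img10.png, img12.png

// The sign of the return value carries the ordering.
var_dump(strcmp('2', '12') > 0);                // bool(true)
var_dump(strnatcmp('2', '12') < 0);             // bool(true)
var_dump(strcasecmp('Hello', 'hello') === 0);   // bool(true)
```

Only the sign of the result should be relied on, since the exact nonzero value returned by these functions has varied between PHP versions.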
Matching and Replacing Substrings with String Functions

It’s common to want to check whether a particular substring is present in a larger string. This partial matching is usually more useful than testing for equality.

In our Smart Form example, we want to look for certain key phrases in the customer feedback and send the mail to the appropriate department. If we want to send emails talking about Bob’s shops to the retail manager, we want to know whether the word “shop” (or derivatives thereof) appears in the message.

Given the functions we have already looked at, we could use explode() or strtok() to retrieve the individual words in the message, and then compare them using the == operator or strcmp().

However, we could also do the same thing with a single function call to one of the string matching or regular expression matching functions. These are used to search for a pattern inside a string. We’ll look at each set of functions one by one.
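The word-by-word approach mentioned above, using only functions covered so far, might look like the following. This is a minimal sketch: the helper name contains_word() and the set of separator characters are assumptions made for illustration, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (an assumption for illustration): reports whether
// $word occurs as a whole word in $text, ignoring case.
function contains_word(string $text, string $word): bool
{
    // Assumed separators: whitespace and some common punctuation.
    $separators = " \t\n.,;:!?";

    $token = strtok($text, $separators);
    while ($token !== false) {
        // strcasecmp() returns 0 when the strings match, ignoring case.
        if (strcasecmp($token, $word) === 0) {
            return true;
        }
        $token = strtok($separators);
    }
    return false;
}

var_dump(contains_word('Your shop in town is great.', 'shop'));  // bool(true)
var_dump(contains_word('I love shopping with you!', 'shop'));    // bool(false)
```

The second call shows the limitation of comparing whole tokens: a derivative such as “shopping” is not found, which is the gap that partial matching is meant to fill.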