Appendix B: Programming Guidelines 551

value="cplusplus-email-list">
<input type="hidden" name="command-field" value="remove"><p>
<input type="text" size="40" name="email-address">
<input type="submit" name="submit" value="Remove Address From C++ Mailing List">
</p></form></td>
<td width="30" bgcolor="#000000"> </td>
</tr>
</table>
</center></div>

Each form contains one data-entry field called email-address, as well as a couple of hidden fields that don’t accept user input but carry information back to the server nonetheless. The subject-field tells the CGI program the subdirectory where the resulting file should be placed. The command-field tells the CGI program whether the user is asking to be added to or removed from the list. From the action, you can see that a GET is used with a program called mlm.exe (for “mailing list manager”). Here it is:

//: C10:mlm.cpp
// A CGI program to maintain a mailing list
#include "CGImap.h"
#include <cstdlib> // getenv()
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

const string contact("Bruce@EckelObjects.com");
// Paths in this program are for Linux/Unix. You
// must use backslashes (two for each single
// slash) on Win32 servers:
const string rootpath("/home/eckel/");

int main() {
  cout << "Content-type: text/html\n" << endl;
  CGImap query(getenv("QUERY_STRING"));
  if(query["test-field"] == "on") {
    cout << "map size: " << query.size() << "<br>";
    query.dump(cout, "<br>");
  }
  if(query["subject-field"].size() == 0) {
    cout << "<h2>Incorrect form. Contact " << contact << endl;
    return 0;
  }
  string email = query["email-address"];
  if(email.size() == 0) {
    cout << "<h2>Please enter your email address" << endl;
    return 0;
  }
  if(email.find_first_of(" \t") != string::npos) {
    cout << "<h2>You cannot use white space "
      "in your email address" << endl;
    return 0;
  }
  if(email.find('@') == string::npos) {
    cout << "<h2>You must use a proper email"
      " address including an '@' sign" << endl;
    return 0;
  }
  if(email.find('.') == string::npos) {
    cout << "<h2>You must use a proper email"
      " address including a '.'" << endl;
    return 0;
  }
  // The file name is the address plus the command:
  string fname = email;
  if(query["command-field"] == "add")
    fname += ".add";
  else if(query["command-field"] == "remove")
    fname += ".remove";
  else {
    cout << "error: command-field not found. Contact "
      << contact << endl;
    return 0;
  }
  string path(rootpath + query["subject-field"] + "/" + fname);
  ofstream out(path.c_str());
  if(!out) {
    cout << "cannot open " << path << "; Contact "
      << contact << endl;
    return 0;
  }
  out << email << endl;
  cout << "<br><H2>" << email << " has been ";
  if(query["command-field"] == "add")
    cout << "added";
  else if(query["command-field"] == "remove")
    cout << "removed";
  cout << "<br>Thank you</H2>" << endl;
} ///:~

Again, all the CGI work is done by the CGImap. From then on it’s a matter of pulling the fields out, looking at them, and deciding what to do with them, which is easy because you can index into a map and because of the tools available for standard strings. Here, most of the programming has to do with checking for a valid email address. Then a file name is created with the email address as the name and “.add” or “.remove” as the extension, and the email address is placed in the file.

Maintaining your list

Once you have a list of names to add, you can just paste them onto the end of your list. However, you might get some duplicates, so you need a program to remove those.
Because your names may differ only by upper and lowercase, it’s useful to create a tool that will read a list of names from a file and place them into a container of strings, forcing all the names to lowercase as it does:

//: C10:readLower.h
// Read a file into a container of string,
// forcing each line to lower case.
#ifndef READLOWER_H
#define READLOWER_H
#include "../require.h"
#include <iostream>
#include <fstream>
#include <string>
#include <algorithm>
#include <cctype>

inline char downcase(char c) {
  using namespace std; // Compiler bug
  // tolower() expects a value representable as unsigned char:
  return tolower(static_cast<unsigned char>(c));
}

// inline, so the header can be included in more than one file:
inline std::string lcase(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), downcase);
  return s;
}

template<class SContainer>
void readLower(const char* filename, SContainer& c) {
  std::ifstream in(filename);
  assure(in, filename);
  const int sz = 1024;
  char buf[sz];
  while(in.getline(buf, sz))
    // Force to lowercase:
    c.push_back(lcase(buf));
}
#endif // READLOWER_H ///:~

Since it’s a template, it will work with any container of string that supports push_back( ). Again, you may want to change the above to the form readln(in, s) instead of using a fixed-size buffer, since the fixed-size buffer is the more fragile of the two approaches: a line longer than the buffer puts the stream into a failed state and stops the loop.
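As one way to avoid the fixed-size buffer, here is a minimal sketch that reads whole lines with the standard std::getline( ) into a std::string. The names downcaseChar( ), lcaseLine( ) and readLowerLines( ) are hypothetical and are not part of the original header; the function takes an istream rather than a file name so that it can be tried out with a string stream.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Lowercase one character. The cast to unsigned char avoids
// undefined behavior when a char holds a negative value.
inline char downcaseChar(char c) {
  return static_cast<char>(
    std::tolower(static_cast<unsigned char>(c)));
}

// Return a lowercase copy of s.
inline std::string lcaseLine(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), downcaseChar);
  return s;
}

// Hypothetical variant of readLower(): reads every line from
// any input stream, with no limit on line length, and appends
// the lowercase version to the container.
template<class SContainer>
void readLowerLines(std::istream& in, SContainer& c) {
  std::string line;
  while(std::getline(in, line))
    c.push_back(lcaseLine(line));
}
```

Because std::getline( ) grows the string as needed, a long line can’t overflow or truncate anything. To read a file, you would pass an ifstream that has already been opened and checked.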
Once the names are read into the list and forced to lowercase, removing duplicates is trivial:

//: C10:RemoveDuplicates.cpp
// Remove duplicate names from a mailing list
#include "readLower.h"
#include "../require.h"
#include <vector>
#include <algorithm>
#include <iterator> // ostream_iterator
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 2);
  vector<string> names;
  readLower(argv[1], names);
  long before = names.size();
  // You must sort first for unique() to work:
  sort(names.begin(), names.end());
  // unique() only moves adjacent duplicates past the
  // iterator it returns; erase() actually drops them:
  names.erase(unique(names.begin(), names.end()),
    names.end());
  long removed = before - names.size();
  ofstream out(argv[2]);
  assure(out, argv[2]);
  copy(names.begin(), names.end(),
    ostream_iterator<string>(out, "\n"));
  cout << removed << " names removed" << endl;
} ///:~

A vector is used here instead of a list because the sort( ) algorithm requires random-access iterators, which a vector provides and a list does not. (A list has its own member sort( ) for exactly this reason, so it doesn’t need the generic sort( ) algorithm shown above.)

The sort must be performed so that all duplicates are adjacent to each other. Then unique( ) can deal with the adjacent duplicates. Because unique( ) only returns the new logical end of the sequence and doesn’t change the size of the container, erase( ) is applied to the leftover tail so that the names are actually removed. The program also keeps track of how many duplicate names were removed.
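One detail is worth spelling out: std::unique( ) does not shrink the container. It moves the unique elements to the front and returns an iterator to the new logical end, and the size only changes when erase( ) is called on the tail. Here is a minimal sketch of that sort/unique/erase combination, wrapped in a hypothetical helper called removeDuplicates( ) (the name is an assumption, not something from the original program):

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: sorts the names, drops the duplicates,
// and returns how many elements were removed.
inline std::size_t
removeDuplicates(std::vector<std::string>& names) {
  const std::size_t before = names.size();
  // Sorting makes equal names adjacent:
  std::sort(names.begin(), names.end());
  // unique() returns the new logical end; erase() removes
  // everything from there to the physical end:
  names.erase(std::unique(names.begin(), names.end()),
    names.end());
  return before - names.size();
}
```

If you call unique( ) alone and ignore its return value, size( ) reports the same number before and after, so a count computed from the size difference always comes out as zero.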
When you have a file of names to remove from your list, readLower( ) comes in handy again:

//: C10:RemoveGroup.cpp
// Remove a group of names from a list
#include "readLower.h"
#include "../require.h"
#include <fstream>
#include <iostream>
#include <iterator> // ostream_iterator
#include <list>
#include <string>
using namespace std;

typedef list<string> Container;

int main(int argc, char* argv[]) {
  requireArgs(argc, 3);
  Container names, removals;
  readLower(argv[1], names);
  readLower(argv[2], removals);
  long original = names.size();
  Container::iterator rmit = removals.begin();
  while(rmit != removals.end())
    names.remove(*rmit++); // Removes all matches
  ofstream out(argv[3]);
  assure(out, argv[3]);
  copy(names.begin(), names.end(),
    ostream_iterator<string>(out, "\n"));
  long removed = original - names.size();
  cout << "On removal list: " << removals.size()
    << "\n Removed: " << removed << endl;
} ///:~

Here, a list is used instead of a vector (since readLower( ) is a template, it adapts). Although there is a remove( ) algorithm that can be applied to containers, the member function list::remove( ) is the better fit here: the algorithm only shifts the surviving elements forward and leaves the size of the container unchanged, whereas list::remove( ) actually erases the matches. The second command-line argument is the file containing the list of names to be removed. An iterator is used to step through that list, and the list::remove( ) function removes every instance of each name from the master list. Here, the list doesn’t need to be sorted first.

Unfortunately, that’s not all there is to it. The messiest part about maintaining a mailing list is the bounced messages. Presumably, you’ll just want to remove the addresses that produce bounces. If you can combine all the bounced messages into a single file, the following program has a pretty good chance of extracting the email addresses; then you can use RemoveGroup to delete them from your list.
//: C10:ExtractUndeliverable.cpp
// Find undeliverable names to remove from
// mailing list from within a mail file
// containing many messages
#include "../require.h"
#include <cstdio>
#include <cstring> // strstr(), strtok()
#include <string>
#include <set>
using namespace std;

const char* start_str[] = {
  "following address",
  "following recipient",
  "following destination",
  "undeliverable to the following",
  "following invalid",
};

const char* continue_str[] = {
  "Message-ID",
  "Please reply to",
};

// The in() function allows you to check whether
// a string in this set is part of your argument.
class StringSet {
  const char** ss;
  int sz;
public:
  StringSet(const char** sa, int sza) : ss(sa), sz(sza) {}
  bool in(const char* s) {
    for(int i = 0; i < sz; i++)
      if(strstr(s, ss[i]) != 0)
        return true;
    return false;
  }
};

// Calculate array length:
#define ALEN(A) ((sizeof A)/(sizeof *A))

StringSet
  starts(start_str, ALEN(start_str)),
  continues(continue_str, ALEN(continue_str));

int main(int argc, char* argv[]) {
  requireArgs(argc, 2,
    "Usage: ExtractUndeliverable infile outfile");
  FILE* infile = fopen(argv[1], "rb");
  FILE* outfile = fopen(argv[2], "w");
  require(infile != 0);
  require(outfile != 0);
  set<string> names;
  const int sz = 1024;
  char buf[sz];
  while(fgets(buf, sz, infile) != 0) {
    if(starts.in(buf)) {
      puts(buf); // Echo the matching line
      while(fgets(buf, sz, infile) != 0) {
        if(continues.in(buf)) continue;
        // A run of dashes ends the address block (the
        // literal was lost in extraction; "---" is
        // inferred from the description of the program):
        if(strstr(buf, "---") != 0) break;
        const char* delimiters = " \t<>():;,\n\"";
        char* name = strtok(buf, delimiters);
        while(name != 0) {
          if(strstr(name, "@") != 0)
            names.insert(string(name));
          name = strtok(0, delimiters);
        }
      }
    }
  }
  set<string>::iterator i = names.begin();
  while(i != names.end())
    fprintf(outfile, "%s\n", (*i++).c_str());
  fclose(infile);
  fclose(outfile);
} ///:~

The first thing you’ll notice about this program is that it contains some C functions, including C I/O. This is not because of any particular design insight. It just seemed to work when I used the C elements, and it started behaving strangely with C++ I/O.
So the C is there simply because it works, and you may be able to rewrite the program in more “pure C++” using your C++ compiler and produce correct results.

A lot of what this program does is read lines looking for string matches. To make this convenient, I created a StringSet class with a member function in( ) that tells you whether any of the strings in the set are in the argument. The StringSet is initialized with a constant array of C strings and the size of that array. Besides making the code easier to read, the StringSet also makes it easy to add new strings to the arrays.

Both the input file and the output file in main( ) are manipulated with standard I/O, since it’s not a good idea to mix I/O types in a program. Each line is read using fgets( ), and if one of them matches the starts StringSet, then what follows will contain email addresses, until you see some dashes (I figured this out empirically, by hunting through a file full of bounced email). The continues StringSet contains strings whose lines should be ignored. For each of the lines that potentially contains an address, each address is extracted using the Standard C Library function strtok( ) and then added to the set<string> called names. Using a set eliminates duplicates (you may have duplicates based on case, but those are dealt with by RemoveGroup.cpp). The resulting set of names is then printed to the output file.

Mailing to your list

There are a number of ways to connect to your system’s mailer, but the following program just takes the simple approach of calling an external command (“fastmail,” which is part of Unix) using the Standard C library function system( ). The program spends all its time building the external command.

When people don’t want to be on a list anymore they will often ignore instructions and just reply to the message.
This can be a problem if the email address they’re replying from is different from the one that’s on your list (sometimes it has been routed to a new or aliased address). To solve the problem, this program prepends to the text file a message that informs them that they can remove themselves from the list by visiting a URL. Since many email programs will present a URL in a form that allows you to just click on it, this can produce a very simple removal process. If you look at the URL, you can see it’s a call to the mlm.exe CGI program, including removal information that incorporates the same email address the message was sent to. That way, even if the user just replies to the message, all you have to do is click on the URL that comes back with their reply (assuming the message is automatically copied back to you).

//: C10:Batchmail.cpp
// Sends mail to a list using Unix fastmail
#include "../require.h"
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <cstdlib> // system() function
using namespace std;

string subject("New Intensive Workshops");
string from("Bruce@EckelObjects.com");
string replyto("Bruce@EckelObjects.com");
ofstream logfile("BatchMail.log");

int main(int argc, char* argv[]) {
  requireArgs(argc, 2,
    "Usage: Batchmail namelist mailfile");
  ifstream names(argv[1]);
  assure(names, argv[1]);
  string name;
  while(getline(names, name)) {
    ofstream msg("m.txt");
    assure(msg, "m.txt");
    msg << "To be removed from this list, "
      "DO NOT REPLY TO THIS MESSAGE. Instead, \n"
      "click on the following URL, or visit it "
      "using your Web browser. This \n"
      "way, the proper email address will be "
      "removed. Here's the URL:\n"
      << "http://www.mindview.net/cgi-bin/"
      "mlm.exe?subject-field=workshop-email-list"
      "&command-field=remove&email-address="
      << name << "&submit=submit\n\n"
      " \n\n";
    ifstream text(argv[2]);
    assure(text, argv[2]);
    msg << text.rdbuf() << endl;
    msg.close();
    string command("fastmail -F " + from +
      " -r " + replyto + " -s \"" + subject +
      "\" m.txt " + name);
    system(command.c_str());
    logfile << command << endl;
    static int mailcounter = 0;
    // Count each message once, and send a status
    // message after every 500 mailings:
    if((++mailcounter % 500) == 0) {
      // Convert mailcounter to a string:
      ostringstream mcounter;
      mcounter << mailcounter;
      string command2("fastmail -F " + from +
        " -r " + replyto + " -s \"Sent " +
        mcounter.str() +
        " messages \" m.txt eckel@aol.com");
      system(command2.c_str());
    }
  }
} ///:~

The first command-line argument is the list of email addresses, one per line. The names are read one at a time into the string called name using getline( ). Then a temporary file called m.txt is created to build the customized message for that individual; the customization is the note about how to remove themselves, along with the URL. Then the message body, which is in the file specified by the second command-line argument, is appended to m.txt. Finally, the command is built inside a string: the “-F” argument to fastmail is who it’s from, and the “-r” argument is who to reply to. The “-s” is the subject line, the next argument is the file containing the mail, and the last argument is the email address to send it to.

You can start this program in the background and tell Unix not to stop the program when you sign off of the server. However, it takes a while to run for a long list (this isn’t because of the program itself, but because of the mailing process). I like to keep track of the progress of the program by sending a status message to another email account, which is accomplished in the last few lines of the program.
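Since the program spends its time assembling strings, it can help to see the two pieces in isolation. The following is a minimal sketch, not the program’s actual structure: removalUrl( ) and fastmailCommand( ) are hypothetical helper names, and only the URL layout and the “-F”, “-r” and “-s” arguments are taken from the description above. The sketch assumes the address contains no characters that need URL or shell escaping, which a production version would have to check.

```cpp
#include <string>

// Hypothetical helper: builds the removal URL that is placed
// at the top of each message. No URL encoding is performed.
inline std::string removalUrl(const std::string& list,
    const std::string& address) {
  return "http://www.mindview.net/cgi-bin/mlm.exe"
    "?subject-field=" + list +
    "&command-field=remove&email-address=" + address +
    "&submit=submit";
}

// Hypothetical helper: builds the fastmail command line in
// the argument order described in the text. No shell quoting
// is performed beyond the quotes around the subject.
inline std::string fastmailCommand(const std::string& from,
    const std::string& replyto, const std::string& subject,
    const std::string& file, const std::string& to) {
  return "fastmail -F " + from + " -r " + replyto +
    " -s \"" + subject + "\" " + file + " " + to;
}
```

Keeping the string building in small functions like these makes it possible to check the command text without actually calling system( ) and sending mail.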
A general information-extraction CGI program

One of the problems with CGI is that you must write and compile a new program every time you want to add a new facility to your Web site. However, much of the time all that your CGI program does is capture information from the user and store it on the server. If you could use hidden fields to specify what to do with the information, then it would be possible to write a single CGI program that would extract the information from any CGI request. This information could be stored in a uniform format, in a subdirectory specified by a hidden field in the HTML form, and in a file that included the user’s email address. Of course, in the general case the email address doesn’t guarantee uniqueness (the user may post more than one submission), so the date and time of the submission can be mangled in with the file name to make it unique. If you can do this, then you can create a new data-collection page just by defining the HTML and creating a new subdirectory on your server. For example, every time I come up with a new class or workshop, all I have to do is create the HTML form for signups; no CGI programming is required.

The following HTML page shows the format for this scheme. Since a CGI POST is more general and doesn’t have any limit on the amount of information it can send, it will always be used instead of a GET for the ExtractInfo.cpp program that will implement this system. Although this form is simple, yours can be as complicated as you need it to be.

//:! C10:INFOtest.html
<html><head><title>
Extracting information from an HTML POST</title>
</head>
<body bgcolor="#FFFFFF" link="#0000FF" vlink="#800080">
<hr>
<p>Extracting information from an HTML POST</p>
[...]...
DataPair& DataPair::get(istream& in) {
  first.erase();
  second.erase();
  string ln;
  getline(in, ln);
  while(ln.find("[{[") == string::npos)
    if(!getline(in, ln)) return *this; // End
  first = ln.substr(3, ln.find("]}]") - 3);
  getline(in, ln); // Throw away [([
  while(getline(in, ln))
    if(ln.find("])]") == string::npos)
      second += ln + string(" ");
    else
      return *this;
  return *this; // Input ended before the closing ])]
}

FormData::FormData(char*...

Interfacing with Pascal & C (Self-published via the Eisys imprint; only available via the Web site)
Using C++
C++ Inside & Out
Thinking in C++, 1st edition
Black Belt C++, the Master’s Collection (edited by Bruce Eckel) (out of print)
Thinking in Java, 2nd edition

Depth & dark corners

Books that go more deeply into topics of the language, and help you avoid the typical pitfalls inherent in developing...

  ifstream in(fileName);
  assure(in, fileName);
  require(getline(in, filePath) != 0);
  // Should be start of first line:
  require(filePath.find("///{") == 0);
  filePath = filePath.substr(strlen("///{"));
  require(getline(in, email) != 0);
  // Should be start of 2nd line:
  require(email.find("From[") == 0);
  int begin = strlen("From[");
  int end = email.find("]");
  int length = end - begin;
  email = email.substr(begin,...

...command line. Using it under Linux/Unix is easy since file-name global expansion (“globbing”) is handled for you. So you say:

DataToSpreadsheet *.txt >> spread.out

In Win32 (at a DOS prompt) it’s a bit more involved, since you must do the “globbing” yourself:

For %f in (*.txt) do DataToSpreadsheet %f >> spread.out

This technique is generally useful for writing Win32/DOS command lines.
  using namespace std;
  if(argc < minArgs + 1) {
    fprintf(stderr, msg, minArgs);
    fputs("\n", stderr);
    exit(1);
  }
}

inline void assure(std::ifstream& in,
  const char* filename = "") {
  using namespace std;
  if(!in) {
    fprintf(stderr,
      "Could not open file %s\n", filename);
    exit(1);
  }
}

inline void assure(std::ofstream& in,
  const char* filename = "") {
  using namespace std;
  if(!in) {
    fprintf(stderr, "Could not open...

  using namespace std;
  if (!requirement) {
    fputs(msg, stderr);
    fputs("\n", stderr);
    exit(1);
  }
}

inline void requireArgs(int argc, int args,
  const char* msg = "Must use %d arguments") {
  using namespace std;
  if (argc != args + 1) {
    fprintf(stderr, msg, args);
    fputs("\n", stderr);
    exit(1);
  }
}

inline void requireMinArgs(int argc, int minArgs,
  const char* msg =
    "Must use at least %d arguments") {
  using...

...achieved in Win32 with start). Here, no ampersand is used, so system( ) does not return until the command is finished, which is a good thing, since the next operation is to delete the temporary file that is used in the command.

The final operation in this project is to extract the data into an easily-usable form. A spreadsheet is a useful way to handle this kind of information,...
...request:

//: C10:ExtractInfo.cpp
// Extracts all the information from a CGI POST
// submission, generates a file and stores the
// information on the server. By generating a
// unique file name, there are no clashes like
// you get when storing to a single file.
#include "CGImap.h"
#include
#include
#include
#include
using namespace std;

const string contact("Bruce@EckelObjects.com");...

...makes sense to start by creating a general-purpose tool that will automatically parse any file that is created by ExtractInfo.cpp:

//: C10:FormData.h
#include
#include
#include
#include
using namespace std;

class DataPair : public pair {
public:
  DataPair() {}
  DataPair(istream& in) { get(in); }
  DataPair& get(istream& in);
  operator bool() { return