The URL object is created like this:

    use URI::URL;
    $url = new URI::URL('http://www.ora.com/index.html');

And a header object can be created like this:

    use HTTP::Headers;
    $hdrs = new HTTP::Headers(Accept => 'text/plain', 'User-Agent' => 'MegaBrowser/1.0');

Then you can put them all together to make a request:

    use LWP::UserAgent;
    use HTTP::Headers;
    use HTTP::Request;
    use URI::URL;    # loaded explicitly; do not rely on LWP::UserAgent pulling it in

    # A header name that contains a hyphen must be quoted before "=>".
    $hdrs = new HTTP::Headers(Accept => 'text/plain', 'User-Agent' => 'MegaBrowser/1.0');
    # The URL must be absolute, including the scheme.
    $url = new URI::URL('http://www.ora.com/index.html');
    $req = new HTTP::Request('GET', $url, $hdrs);
    $ua = new LWP::UserAgent;
    $resp = $ua->request($req);
    if ($resp->is_success) {
        print $resp->content;
    } else {
        print $resp->message;
    }

Once the user agent has made the request, the server's response is returned as another object, described by HTTP::Response. This object contains the status code of the request, the returned headers, and, if the request succeeded, the content you asked for. In the example, is_success checks whether the request was fulfilled without problems; if so, the content is printed. Otherwise, a message describing the server's response code is printed.

There are other modules and classes that create useful objects for web clients in LWP, but the examples above show the most basic ones. For server applications, many of the objects used above become pieces of a server transaction, which you either create yourself (such as response objects) or receive from a client (such as request objects).

Additional functionality for both client and server applications is provided by the HTML module. This module provides many classes for both the creation and interpretation of HTML documents.

The rest of this chapter provides information for the LWP, HTTP, HTML, and URI modules.
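The request-building steps above can be tried without touching the network. The following is a minimal sketch, assuming the HTTP::Headers and HTTP::Request modules are installed; the host www.example.com is a placeholder chosen for illustration and does not come from the text. It builds the request and prints it instead of sending it.

```perl
use strict;
use warnings;
use HTTP::Headers;
use HTTP::Request;

# Header names containing a hyphen must be quoted before "=>".
my $hdrs = HTTP::Headers->new(
    Accept       => 'text/plain',
    'User-Agent' => 'MegaBrowser/1.0',
);

# www.example.com is a placeholder host used only for this sketch.
my $req = HTTP::Request->new('GET', 'http://www.example.com/index.html', $hdrs);

# Inspect the request object rather than sending it.
print $req->method, "\n";
print $req->header('User-Agent'), "\n";
print $req->as_string;
```

Printing the request with as_string is a convenient way to check what a user agent would send before any server is involved.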
Chapter 17
The LWP Library

17.2 The LWP Modules

The LWP modules provide the core functionality for web programming in Perl. They contain the foundations for networking applications, protocol implementations, media type definitions, and debugging ability.

The modules LWP::Simple and LWP::UserAgent define client applications that implement network connections, send requests, and receive response data from servers. LWP::RobotUA is another client application that is used to build automated web searchers following a specified set of guidelines.

LWP::UserAgent is the primary module used in applications built with LWP. With it, you can build your own robust web client. It is also the basis for the Simple and RobotUA modules, which provide a specialized set of functions for creating clients.

Additional LWP modules provide the building blocks required for web communications, but you often don't need to use them directly in your applications. LWP::Protocol implements the actual socket connections with the appropriate protocol. The most common protocol is HTTP, but mail protocols (like SMTP), FTP for file transfers, and others can be used across networks.

LWP::MediaTypes implements the MIME definitions for media type identification and mapping to file extensions. The LWP::Debug module provides functions to help you debug your LWP applications.

The following sections describe the RobotUA, Simple, and UserAgent modules of LWP.
17.2.1 LWP::RobotUA

The Robot User Agent (LWP::RobotUA) is a subclass of LWP::UserAgent, and is used to create robot client applications. A robot application requests resources in an automated fashion. Robots perform such activities as searching, mirroring, and surveying. Some robots collect statistics, while others wander the Web and summarize their findings for a search engine.

The LWP::RobotUA module defines methods to help program robot applications and observes the Robot Exclusion Standard, which web server administrators can use on their web site to keep robots away from certain (or all) areas of the site.

The constructor for an LWP::RobotUA object looks like this:

    $rob = LWP::RobotUA->new(agent_name, email, [$rules]);

The first parameter, agent_name, is the user agent identifier that is used for the value of the User-Agent header in the request. The second parameter is the email address of the person using the robot, and the optional third parameter is a reference to a WWW::RobotRules object, which is used to store the robot rules for a server. If you omit the third parameter, the LWP::RobotUA module requests the robots.txt file from every server it contacts, and then generates its own WWW::RobotRules object.

Since LWP::RobotUA is a subclass of LWP::UserAgent, the LWP::UserAgent methods are used to perform the basic client activities. The following methods are defined by LWP::RobotUA for robot-related functionality:

● as_string
● delay
● host_wait
● no_visits
● rules

17.2.2 LWP::Simple

LWP::Simple provides an easy-to-use interface for creating a web client, although it is only capable of performing basic retrieving functions. An object constructor is not used for this class; it defines functions to retrieve information from a specified URL and interpret the status codes from the requests.
This module isn't named Simple for nothing. The following lines show how to use it to get a web page and save it to a file:

    use LWP::Simple;
    $homepage = 'oreilly_com.html';
    $status = getstore('http://www.oreilly.com/', $homepage);
    print("hooray") if is_success($status);

The retrieving functions get and head return the URL's contents and header contents respectively. The other retrieving functions return the HTTP status code of the request. The status codes are returned as the constants from the HTTP::Status module, which is also where the is_success and is_error functions are obtained. See Section 17.3.4, "HTTP::Status" later in this chapter for a listing of the response codes.

The user-agent identifier produced by LWP::Simple is LWP::Simple/n.nn, where n.nn is the version number of LWP being used.

The following list describes the functions exported by LWP::Simple:

● get
● getprint
● getstore
● head
● is_error
● is_success
● mirror

17.2.3 LWP::UserAgent

Requests over the network are performed with LWP::UserAgent objects. To create an LWP::UserAgent object, use:

    $ua = new LWP::UserAgent;

You give the object a request, which it uses to contact the server, and the information you requested is returned. The most often used method in this module is request, which contacts a server and returns the result of your query. Other methods in this module change the way request behaves. You can change the timeout value, customize the value of the User-Agent header, or use a proxy server.
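As an illustration of the settings just mentioned, here is a minimal sketch that configures a user agent without sending a request. It assumes LWP::UserAgent is installed; the agent string and the 30-second timeout are arbitrary values chosen for the example.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# Value sent in the User-Agent header of every request (arbitrary example string).
$ua->agent('MegaBrowser/1.0');

# Give up on a request after 30 seconds (arbitrary example value).
$ua->timeout(30);

# Read proxy settings from environment variables such as http_proxy, if any are set.
$ua->env_proxy;

# Called without arguments, the same methods report the current settings.
print $ua->agent, "\n";
print $ua->timeout, "\n";
```

Each of these methods acts as both a setter and a getter, so the configuration can be checked before request is called.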
The following methods are supplied by LWP::UserAgent:

● request
● agent
● clone
● cookie_jar
● credentials
● env_proxy
● from
● get_basic_credentials
● is_protocol_supported
● max_size
● mirror
● no_proxy
● parse_head
● proxy
● timeout
● use_alarm

17.3 The HTTP Modules

The HTTP modules implement an interface to the HTTP messaging protocol used in web transactions. The most useful of them are HTTP::Request and HTTP::Response, which create objects for client requests and server responses. Other modules provide means for manipulating headers, interpreting server response codes, managing cookies, converting date formats, and creating basic server applications.

Client applications created with LWP::UserAgent use HTTP::Request objects to create and send requests to servers. The information returned from a server is saved as an HTTP::Response object. Both of these objects are subclasses of HTTP::Message, which provides general methods of creating and modifying HTTP messages. The header information included in HTTP messages can be represented by objects of the HTTP::Headers class.

HTTP::Status includes functions to classify response codes into the categories of informational, successful, redirection, error, client error, or server error. It also exports symbolic aliases of HTTP response codes; one could refer to the status code of 200 as RC_OK and refer to 404 as RC_NOT_FOUND.

The HTTP::Date module converts date strings from and to machine time. The HTTP::Daemon module can be used to create web server applications, utilizing the functionality of the rest of the LWP modules to communicate with clients.
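To make the date conversion concrete, the sketch below uses the time2str and str2time functions of HTTP::Date, assuming that module is installed. The epoch value 0 is used only because its HTTP date string is easy to recognize.

```perl
use strict;
use warnings;
use HTTP::Date qw(time2str str2time);

# Machine time (seconds since the epoch) to an HTTP date string.
my $stamp = time2str(0);
print "$stamp\n";    # Thu, 01 Jan 1970 00:00:00 GMT

# And back again: an HTTP date string to machine time.
my $epoch = str2time($stamp);
print "$epoch\n";    # 0
```

This kind of round trip is what a client typically needs when it compares a Last-Modified header against a local timestamp.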
17.3.1 HTTP::Request

This module summarizes a web client's request. For a simple GET request, you define an object with the GET method and assign a URL to apply it to. Basic headers would be filled in automatically by LWP. For a POST or PUT request, you might want to specify a custom HTTP::Headers object for the request, or use the contents of a file for an entity body. Since HTTP::Request inherits everything in HTTP::Message, you can use the header and entity body manipulation methods from HTTP::Message in HTTP::Request objects.

The constructor for HTTP::Request looks like this:

    $req = HTTP::Request->new(method, url, [$header, [content]]);

The method and URL values for the request are required parameters. The header and content arguments are optional, and not every request needs them. The parameters are described as follows:

method
    A string specifying the HTTP request method. GET, HEAD, and POST are the most commonly used. Other methods defined in the HTTP specification such as PUT and DELETE are not supported by most servers.

url
    The address and resource name of the information you are requesting. This argument may be either a string containing an absolute URL (the hostname is required), or a URI::URL object that stores all the information about the URL.

$header
    A reference to an HTTP::Headers object.

content
    A scalar that specifies the entity body of the request. If omitted, the entity body is empty.

The following methods can be used on HTTP::Request objects:

● as_string
● method
● url

17.3.2 HTTP::Response

Responses from a web server are described by HTTP::Response objects. An HTTP response message contains a status line, headers, and any content data that was requested by the client (like an HTML file). The status line is the minimum requirement for a response.
It contains the version of HTTP that the server is running, a status code indicating the success, failure, or other condition the request received from the server, and a short message describing the status code.

If LWP has problems fulfilling your request, it internally generates an HTTP::Response object and fills in an appropriate response code. In the context of web client programming, you'll usually get an HTTP::Response object from LWP::UserAgent and LWP::RobotUA. If you plan to write extensions to LWP or to a web server or proxy server, you might use HTTP::Response to generate your own responses.

The constructor for HTTP::Response looks like this:

    $resp = HTTP::Response->new(rc, [msg, [header, [content]]]);

In its simplest form, an HTTP::Response object can contain just a response code. If you would like to specify a more detailed message than "OK" or "Not found," you can specify a text description of the response code as the second parameter. As a third parameter, you can pass a reference to an HTTP::Headers object to specify the response headers. Finally, you can also include an entity body in the fourth parameter as a scalar.

For client applications, it is unlikely that you will build your own response object with the constructor for this class. You receive a response object when you use the request method on an LWP::UserAgent object, for example:

    $ua = LWP::UserAgent->new;
    $req = HTTP::Request->new('GET', $url);
    $resp = $ua->request($req);

The server's response is contained in the object $resp. When you have this object, you can use the HTTP::Response methods to get the information about the response. Since HTTP::Response is a subclass of HTTP::Message, you can also use methods from that class on response objects. See Section 17.3.8, "HTTP::Message" later in this chapter for a description of its methods.
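Although clients rarely construct responses, building one by hand is a simple way to explore the HTTP::Response methods without a server. The following is a sketch under that assumption; the header value and the HTML body are made up for the example, and the HTTP::Headers, HTTP::Response, and HTTP::Status modules are assumed to be installed.

```perl
use strict;
use warnings;
use HTTP::Headers;
use HTTP::Response;
use HTTP::Status;    # exports RC_NOT_FOUND and the other RC_* constants

# Headers and body are invented for this illustration.
my $hdrs = HTTP::Headers->new('Content-Type' => 'text/html');
my $resp = HTTP::Response->new(
    RC_NOT_FOUND,                 # rc
    'Not Found',                  # msg
    $hdrs,                        # header
    "<h1>No such page</h1>\n",    # content
);

print $resp->code, "\n";
print $resp->message, "\n";
print $resp->header('Content-Type'), "\n";
print "request failed\n" if $resp->is_error;
print $resp->content;
```

The same method calls apply unchanged to a response object returned by the request method of a user agent.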
The following methods can be used on objects created by HTTP::Response:

● as_string
● base
● code
● current_age
● error_as_HTML
● freshness_lifetime
● fresh_until
● is_error
● is_fresh
● is_info
● is_redirect
● is_success
● message

17.3.3 HTTP::Headers

This module deals with HTTP header definition and manipulation. You can use these methods on HTTP::Request and HTTP::Response objects to retrieve headers they contain, or to set new headers and values for new objects you are building.

The constructor for an HTTP::Headers object looks like this:

    $h = HTTP::Headers->new([name => val], ...);

This code creates a new headers object. You can set headers in the constructor by providing a header name and its value. Multiple name => val pairs can be used to set multiple headers.

The following methods can be used by objects in the HTTP::Headers class. These methods can also be used on HTTP::Request and HTTP::Response objects, since HTTP::Message, their common base class, makes the HTTP::Headers methods available on them. In fact, most header manipulation will occur on the request and response objects in LWP applications.

● clone
● header
● push_header
● remove_header
● scan

The HTTP::Headers class allows you to use a number of convenience methods on header objects to set (or read) common field values. If you supply a value for an argument, that value will be set for the field. The previous value for the header is always returned. The following methods are available:

    date
    expires
    if_modified_since
    if_unmodified_since
    last_modified
    content_type
    content_encoding
    content_length
    content_language
    title
    user_agent
    server
    from
    referrer
    www_authenticate
    proxy_authenticate
    authorization
    proxy_authorization
    authorization_basic
    proxy_authorization_basic

17.3.4 HTTP::Status

This module provides methods to determine the type of a response code.
It also exports a list of mnemonics that can be used by the programmer to refer to a status code. The following methods are used on response objects:

is_info
    Returns true when the response code is 100 through 199.

is_success
    Returns true when the response code is 200 through 299.

is_redirect
    Returns true when the response code is 300 through 399.

is_client_error
    Returns true when the response code is 400 through 499.

is_server_error
    Returns true when the response code is 500 through 599.

is_error
    Returns true when the response code is 400 through 599. When an error occurs, you might want to use error_as_HTML to generate an HTML explanation of the error.

HTTP::Status exports the following constant functions for you to use as mnemonic substitutes for status codes. For example, you could do something like the following (note the numeric comparison operator ==, not the assignment operator =):

    if ($rc == RC_OK) { }

Here are the mnemonics, followed by the status codes they represent:

    RC_CONTINUE (100)
    RC_SWITCHING_PROTOCOLS (101)
    RC_OK (200)
    RC_CREATED (201)
    RC_ACCEPTED (202)
    RC_NON_AUTHORITATIVE_INFORMATION (203)
    RC_NO_CONTENT (204)
    RC_RESET_CONTENT (205)
    RC_PARTIAL_CONTENT (206)
    RC_MULTIPLE_CHOICES (300)
    RC_MOVED_PERMANENTLY (301)
    RC_MOVED_TEMPORARILY (302)
    RC_SEE_OTHER (303)
    RC_NOT_MODIFIED (304)
    RC_USE_PROXY (305)
    RC_BAD_REQUEST (400)
    RC_UNAUTHORIZED (401)
    RC_PAYMENT_REQUIRED (402)
    RC_FORBIDDEN (403)
    RC_NOT_FOUND (404)
    RC_METHOD_NOT_ALLOWED (405)
    RC_NOT_ACCEPTABLE (406)
    RC_PROXY_AUTHENTICATION_REQUIRED (407)
    RC_REQUEST_TIMEOUT (408)
    RC_CONFLICT (409)
    RC_GONE (410)
    RC_LENGTH_REQUIRED (411)
    RC_PRECONDITION_FAILED (412)
    RC_REQUEST_ENTITY_TOO_LARGE (413)
    RC_REQUEST_URI_TOO_LARGE (414)
    RC_UNSUPPORTED_MEDIA_TYPE (415)
    RC_REQUEST_RANGE_NOT_SATISFIABLE (416)
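The classification functions and mnemonics can be combined as in the following sketch, which assumes HTTP::Status is installed and calls its exported functions directly on status codes rather than on a response object. The list of codes is arbitrary and chosen to hit one code from several classes.

```perl
use strict;
use warnings;
use HTTP::Status;    # exports the RC_* constants, is_success, is_redirect, is_error, status_message

# An arbitrary selection of codes, written as mnemonics or plain numbers.
my @codes = (RC_OK, RC_MOVED_PERMANENTLY, RC_NOT_FOUND, 500);

for my $rc (@codes) {
    # Check the classes in turn; is_error covers both 4xx and 5xx codes.
    my $class = is_success($rc)  ? 'success'
              : is_redirect($rc) ? 'redirect'
              : is_error($rc)    ? 'error'
              :                    'other';
    printf "%d %s: %s\n", $rc, status_message($rc), $class;
}

# Mnemonics compare with the numeric operator ==.
print "not found\n" if $codes[2] == RC_NOT_FOUND;
```

Using the mnemonics instead of bare numbers keeps the intent of each comparison readable in larger clients.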
[...] Quarto. The default is A4.

PaperWidth
    Width of the paper in points.
PaperHeight
    Height of the paper in points.
LeftMargin
    Left margin in points.
RightMargin
    Right margin in points.
HorizontalMargin
    Left and right margin. Default is 4 cm.
TopMargin
    Top margin in points.
BottomMargin
    Bottom margin in points.
VerticalMargin
    Top and bottom margin. Default is 2 cm.
PageNo
    Boolean value to display page numbers. Default is 0 (off).
FontFamily
    Font family to use on the page. Possible values are Courier, Helvetica, and Times. Default is Times.
FontScale
    Scale factor for the font.
Leading
    Space between lines, as a factor of the font size. Default is 0.1.

17.4.5 HTML::FormatText

[...] rightmargin => 80);

The constructor can take two parameters: leftmargin and rightmargin. The value for the margins is given in column numbers. The aliases lm and rm can also be used.

The format method takes an HTML::TreeBuilder object and returns a scalar containing the formatted text. You can print it with:

    print $formatter->format($html);

[...] for reading in HTML text from either a string or a file and then separating out the syntactic structures and data. As a base class, Parser does virtually nothing on its own. The other modules call [...]

17.5 The URI Module

[...]

● as_string
● base
● crack
● default_port
● eparams
● epath
● eq
● equery
● frag
● full_path
● host
● netloc
● params
● password
● path
● port
● query
● rel
● scheme
● strict
● user

Part VII: Perl/Tk

Chapter 18: Perl/Tk

[...] (MainWindow) to act as a parent for any other widgets you create. Line 4 of the program creates a button and displays it using the pack geometry manager. It also gives the button something to do when pushed (in this case, exit the program).

The very last line tells the program to "go do it." MainLoop starts the event handler for the graphical interface, and the program draws any windows until it reaches [...] MainLoop statement. Everything up to that point is preparation; until you reach the MainLoop statement, the program simply prepares its windows and defines what to do when certain events happen (such as a mouse click on the "Hello World!" button). Nothing is drawn until the MainLoop statement is reached.

18.1 Widgets

Widgets in Perl/Tk are created with widget creation commands, which include Button, Canvas, [...]

18.2 Geometry Managers

Creating widgets and determining how to display them are done with separate commands. You can create a widget with one of the widget creation methods (such as Button, Canvas, etc.), but you display them using a geometry manager. The three geometry managers are pack, grid, and place. pack is by far the most commonly used.

You can [...] each other, either partially or completely.

Once a widget is packed into a window, the next widget is packed in the remaining space around it. pack sets up an "allocation rectangle" for each widget, determined by the dimensions of the parent window and the positioning of the widgets already packed into it. This means that the order in which you pack your widgets is very important.

By default, pack places [...]

[...] containing all pack information about that widget.

    $info = $widget->packInfo;

packPropagate
    Suppresses automatic resizing of a Toplevel or Frame widget to accommodate items packed inside of it. The following line turns off automatic resizing:

    $widget->packPropagate(0);