HOUR 16: The Internet: A Closer Look

In Hour 1, “What Is TCP/IP?,” you learned about the organizations governing the Internet, including the Internet Architecture Board (IAB) and the Internet Engineering Task Force (IETF). The language of the Internet is, of course, TCP/IP, but it is worth highlighting a significant element of the TCP/IP infrastructure that provides for Internet messaging on a global scale: the common naming and numbering system overseen by ICANN. The DNS naming system is more than the name resolution protocols described in Hour 11, “Name Resolution.” Name service on a global scale requires an enormous human effort to coordinate the lower-tier organizations that manage the orderly assignment of Internet names. Without the powerful DNS naming system, the Internet would not be the pervasive force in daily life it is today.

FIGURE 16.1 An ISP leases a Point of Presence (POP) on the Internet. (Diagram labels: IXP, ISP 1, ISP 2, Router, To Internet, Point of Presence.)

What Happens on the Internet

The Internet really is a big TCP/IP network, and if you’re not worried about security or time delays, you can use the Internet for almost anything you can do on a routed corporate LAN. Of course, the security considerations are substantial. You definitely should not use the Internet for everything you could do on a routed corporate LAN, but you could if you wanted to. Hours 22 and 23 discuss some of the reasons why you need to be more careful about security in an unprotected space like the open Internet.

It is important to remember that all computers participating in a networking activity (on the Internet or on any other network) have one thing in common: They are running software that was designed for the activity in which they are engaged. Networking doesn’t just happen.
It requires protocol software (such as the TCP/IP software described in Hours 2–7), and it also requires applications at each end of the connection that are specifically designed to communicate with each other. As shown in Figure 16.2, most computers on the Internet can be classified as either clients (computers that request services) or servers (computers that provide services). A client application on the client computer was written specifically to interact with the server application on the server computer. The server application was written to listen for requests from the client and to respond to the requests.

FIGURE 16.2 On the Internet, a computer typically acts as a client or a server. (Diagram labels: Client, Server, Request, Response.)

Figure 16.3 shows the whole teeming ecosystem at a glance. A user sitting at a single computer anywhere in the world can connect to any of thousands of servers elsewhere in the world. A hierarchy of DNS servers resolves the target domain name to an IP address (in a process that is invisible to the user), and the client software on the user’s computer establishes a connection. The server might provide web pages for the user to browse and view, instant messaging, or files to download with FTP. Or perhaps the user is connecting to a mail server to download incoming messages.

From the simple beginning of a few networked mainframes, the Internet has morphed into a sprawling jumble of services that the original professors and researchers couldn’t have imagined. In addition to sending email and surfing the web, a new generation of Internet users can make phone calls, connect webcams, watch television, download music, listen to podcasts, and blog their deepest emotions, all through the miracle of TCP/IP. You’ll learn more about many of these new web technologies in later hours.
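To make the client and server roles concrete, here is a minimal Python sketch of one request and one response over TCP. It is only an illustration: the function names, the reply text, and the use of the loopback address with an automatically chosen port are assumptions made for this example, not part of any real Internet service.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server role: wait for one client, read its request, send a response."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(1024).decode("utf-8")
        # Hypothetical reply format, chosen only for this sketch.
        conn.sendall(f"You asked for: {request}".encode("utf-8"))


def request_response(message: str) -> str:
    """Client role: connect to the server, send a request, return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        # Port 0 asks the operating system for any free port on this machine.
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        port = listener.getsockname()[1]

        worker = threading.Thread(target=serve_once, args=(listener,))
        worker.start()

        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message.encode("utf-8"))
            reply = client.recv(1024).decode("utf-8")

        worker.join()
    return reply


if __name__ == "__main__":
    print(request_response("index page"))
```

Both halves run in one process here only so the sketch is self-contained. On the Internet the two roles normally live on different computers, and the client finds the server through DNS rather than a hard-coded address.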
URIs and URLs

As shown in Figure 16.3, the Internet is a gigantic mass of client systems requesting resources and server systems providing resources. If you look closer at the process, though, you’ll realize that the protocol addressing rules discussed earlier in this book are not enough to support the rich array of services available on the Internet. The IP address or domain name can locate a host. The port number can point to a service running on the host. But what is the client requesting? What is the server supposed to do? Is there input for which the client is requesting output?

Experts have long understood the importance of providing a standard format for requesting Internet resources. Some have argued, in fact, that the presence of a unified request format is another reason why the Internet seems like a single big, cohesive entity rather than just a jumble of computers. The request format most familiar to Internet users is what is commonly called a Uniform Resource Locator (URL). The URL is best known for the classic web address format: http://www.mercurial.org. URLs are so common now that they appear with little or no explanation on TV commercials and bubble gum wrappers.

FIGURE 16.3 The Internet is a vast sea of services accessible from anywhere on the Earth. (Diagram labels: Internet User, DNS Servers, Web Server, Email Server, FTP Server, SSH Remote Access Server, Internet.)

What we think of as a URL is actually a special case of a more general format known as a Uniform Resource Identifier (URI). The two acronyms are sometimes used interchangeably, but the distinction is important. Recent Internet documents have attempted to converge the terms. RFC 3986, “Uniform Resource Identifier (URI): Generic Syntax,” states that future documents should use the more general term URI instead of URL.
The term Identifier is better than Locator for the general case because not every request actually points to a location.

The specification for the structure of a URI is over 60 pages, but the basic format is as follows:

scheme://authority/path?query#fragment

The scheme identifies a system for interpreting the request. The scheme field is often associated with a protocol. Table 16.1 shows some of the schemes used on the Internet today. The classic http scheme is used with web addresses. Although alternative schemes such as gopher are less important than they once were, others, such as ftp, are still in common usage.

The authority, which begins with a double slash (//), defines the user, host, and port associated with the request. A full expression of the authority component might look like:

//joeyesterday@www.bonzai.com:8042

As you learned in Hour 6, a default port number is often associated with the protocol, so the port number is typically omitted. The username is only necessary if the user must provide credentials to access the resource, which is uncommon for the web but more common with a protocol like FTP. Even if the user is required to provide credentials, you still might not need to specify a user in the URI. Many services prompt for a user ID and password after the initial request. Without the user and the port, the authority field looks more like the basic web address we all appreciate:

//www.bonzai.com

or coupled with the scheme component:

http://www.bonzai.com

In this example, the host is expressed as a DNS domain name, but you can also refer to a host by its IP address.

The path component points down through a hierarchy of directories to a file that is the subject of the request. In the case of http, if the path is omitted, the request points to a default web page for the domain (the home page).
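If you want to see how the generic format scheme://authority/path?query#fragment breaks down in practice, the following sketch uses the `urlsplit` function from Python’s standard `urllib.parse` module. The helper name `describe_uri` is made up for this example, and the sample URI combines the authority shown above with a path, query, and fragment that are invented purely for illustration.

```python
from urllib.parse import urlsplit


def describe_uri(uri: str) -> dict:
    """Split a URI into the components of the generic format."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        # The authority is further divided into user, host, and port.
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


if __name__ == "__main__":
    # Hypothetical URI: the path, query, and fragment are assumptions.
    sample = "http://joeyesterday@www.bonzai.com:8042/trees/list.html?size=small#top"
    for name, value in describe_uri(sample).items():
        print(f"{name:9} {value}")

    # With the user and port omitted, those fields come back as None.
    print(describe_uri("http://www.bonzai.com"))
```

Note that a missing user or port is reported as `None` rather than as a default value. The parser does not know that http normally implies port 80; that knowledge belongs to the scheme, not to the generic URI syntax.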
Most users by now are familiar with the need to type in additional directory and filenames after the domain name:

http://www.bonzai.com/trees/LittleTrees.pdf

The query and fragment components of the URI are rarely typed or interpreted by humans. The precise meaning of these components can vary depending on the scheme, and some schemes don’t even support the query and fragment components. The easiest way to observe the query field in the wild is to type a search request into a search engine like Google and then examine the URI that appears in the address bar.

The preceding example considers the URI in the context of the hugely popular HTTP protocol used on the World Wide Web. (You’ll learn more about HTTP and its companion markup language HTML in Hour 17.) Keep in mind, though, that each of the different scheme specifications can define how to interpret the information in the URI. The generic URI specification is intentionally kept separate from the details defined in the specifications for each of the schemes so that the schemes can evolve without requiring a change to the basic format. Table 16.1 also lists the RFCs associated with each scheme.

TABLE 16.1 URI Schemes

Scheme    Description                              Reference
file      A file on the host system                RFC 1738
ftp       File Transfer Protocol                   RFC 1738
gopher    The Gopher protocol                      RFC 4266
http      Hypertext Transfer Protocol              RFC 2616
https     Hypertext Transfer Protocol Secure       RFC 2818
im        Instant Messaging                        RFC 3860
ldap      Lightweight Directory Access Protocol    RFC 4516
mailto    Electronic mail address                  RFC 2368
nfs       Network File System protocol             RFC 2224
pop       Post Office Protocol v3                  RFC 2384
telnet    Telnet Interactive session               RFC 4248

Summary

The Internet consists of computers all over the world requesting and providing services. The URI format offers a standard means for identifying and locating those resources.
All these protocols are different, however, and the details of communication vary depending on the service. Later chapters introduce you to some of the critical services at work on the Internet today.

Q&A

Q. My company wants to become an Internet service provider (ISP). We have attempted to establish a Point of Presence (POP) connection with a nearby NAP, but no places are available. How can we get connected?
A. You can lease bandwidth from a wholesale ISP.

Q. Why have some Asian and Eastern European countries suggested starting their own independent alternatives to DNS and the URI format?
A. The restriction to the Latin character set is unintuitive for users who speak languages with non-Latin characters.

Key Terms

Review the following list of key terms:

- Authority: The portion of the URI identifying the host, user, and port.
- Internet Exchange Point (IXP): A facility where ISPs and other networks connect to exchange traffic, providing access to the wider Internet.
- Point of Presence (POP): An attachment point to the Internet leased by an ISP.
- Scheme: The portion of the URI that identifies the protocol or system for interpreting the rest of the URI.
- Uniform Resource Identifier (URI): An alphanumeric string used to identify an Internet resource.
- Uniform Resource Locator (URL): A type of URI that locates a resource. A common form of URL is the web address (for example, www.sams.com).

HOUR 17: HTTP, HTML, and the World Wide Web

What You’ll Learn in This Hour:

- HTML
- HTTP

The World Wide Web began as a universal graphic display framework for the Internet. Since its inception, the Web has come to dominate public perceptions of the Internet, and it has revolutionized the way we think about application interfaces. This hour provides an introduction to HTTP, HTML, and the Web.

At the completion of this hour, you will be able to

- Show how the World Wide Web works
- Build a basic web page using text and HTML tags
- Discuss the HTTP protocol and describe how it works

What Is the World Wide Web?
The view of the web page you see through the window of your web browser is the result of a conversation between the browser and a web server computer. The language used for that conversation is called Hypertext Transfer Protocol (HTTP). The data delivered from the server to the client is a finely crafted jumble of text, images, addresses, and formatting codes rendered to a unified document through an amazingly versatile formatting language called Hypertext Markup Language (HTML).

The basic elements of what we know today as the World Wide Web were created by Tim Berners-Lee in 1989 at the CERN research institute in Geneva, Switzerland. Berners-Lee created a subtle and powerful information system by bringing together three technologies that were already in development at the time:

- Markup language: A system of instructions and formatting codes embedded in text
- Hypertext: A means for embedding links to documents, images, and other elements in text
- The Internet: (As you know by now) A global computer network of clients requesting services and servers providing services through TCP/IP

Markup languages began in the 1960s as a means for adding formatting and typesetting codes to the simple text used by early computers. At the time, text files were used throughout the computing world for configuration files, online help documents, and electronic mail messages. When people started using computers for letters, memos, and other finished documents, they needed a way to specify elements such as headlines, italics, bold font, and margins. Some of the early markup languages (such as TeX, which is still in use today) were developed as a means for scientists to format and typeset mathematical equations.
By the time modern-day word processing programs began to emerge, vendors had developed numerous systems (many of them proprietary) for coding formatting information into a text document. Some of these systems used ASCII-based codes. Others used different digital markers to denote formatting information.

Of course, these formatting code systems work only if the application that writes the document and the application that reads the document agree on what each code means. Berners-Lee and other HTML pioneers wanted a universal, vendor-neutral system for encoding format information. They wanted this markup system to include not just typesetting codes but also references to image files and links to other documents.

The concept of hypertext (a live link within text that switches the view to the document referenced in the link) also evolved in the 1960s. Berners-Lee brought the hypertext concept to the Internet through the development of the URL (or URI; see Hour 16, “The Internet: A Closer Look”). Links let the reader view the online information in small doses. The reader can choose whether to link to another page for additional information. HTML documents can be assembled into unified systems of pages and links (see Figure 17.1). A visitor can find a different path through the data depending on how the visitor traverses the links. And the Web developer has almost unlimited ability to define where a link will lead. The link can lead to another HTML document in the same directory, a document in a different directory, or even a document on a different computer. The link might lead to a totally different website on another computer across the world.

FIGURE 17.1 A website is a unified system of pages and links.
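The three kinds of link destination just described (same directory, different directory, different computer) correspond to how a browser resolves the address in a link against the address of the current page. The sketch below shows that resolution with `urljoin` from Python’s standard `urllib.parse` module. The page address and the link values are hypothetical, chosen only to illustrate each case, and `resolve_links` is a helper name invented for this example.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that contains the links.
CURRENT_PAGE = "http://www.example.com/guides/intro.html"


def resolve_links(base_url: str, hrefs: list[str]) -> list[str]:
    """Turn each link value into the full URL a browser would request."""
    return [urljoin(base_url, href) for href in hrefs]


if __name__ == "__main__":
    links = [
        "next.html",                # a document in the same directory
        "../images/map.html",       # a document in a different directory
        "http://www.example.org/",  # a document on a different computer
    ]
    for href, target in zip(links, resolve_links(CURRENT_PAGE, links)):
        print(f"{href:26} -> {target}")
```

A relative link inherits the scheme and host of the current page, which is why pages within one website can refer to each other without repeating the full address.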
As you learned in Hour 16, the form of URL most associated with the Web is

http://www.dobro.com

It is also common to see a path and filename appended to the URL:

http://www.dobro.com/techniques/repair/fix.html

A web browser navigates by URLs. You access a web page by entering the URL of the page in the address box of the browser window (see Figure 17.2). When you click on a link, the browser opens the web page specified in the link’s URL.

FIGURE 17.2 Enter the URL in the address box of the browser window. (Diagram label: Address Box.)

To summarize this brief introduction, a basic HTML document contains some combination of

- Text
- Graphics
- Text formatting codes (font and layout information)
- References to secondary files such as graphics files
- Links to other HTML documents or to other locations in the current document

To visit a website, the user enters the URL of the website into the web browser window. The browser initiates a connection to the web server specified in the URL. The server sends the HTML data across the network to the web browser. The web browser interprets the HTML data to create the view of the web page that appears in the browser window.

Understanding HTML

HTML is the payload that is transmitted through the processes of HTTP. As you learned earlier in this hour, an HTML document includes text, formatting codes, [...]

... Header Fields

Field               Value Must Be                              Description
Content-Length      integer                                    Size of the content object in octets
Content-Encoding    x-compress, x-gzip                         Value representing the type of encoding associated with the message
Date                Standard date format defined in RFC 850    Date in Greenwich Mean Time when the object was created
Last-modified date  Standard date format defined in RFC 850    Date in Greenwich Mean Time when the object...
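The exchange described earlier in this hour (the browser connects to the web server, and the server sends back HTML along with header fields such as Content-Length and Date) can be tried out on a single machine. The following sketch uses Python’s standard `http.server` and `http.client` modules. The page content, the `PageHandler` class, and the `fetch_page` function are assumptions made for this illustration, and the server listens only on the loopback address.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content, invented for this sketch.
PAGE = b"<html><body><h1>Hello</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """Server side: answer every GET request with the same small page."""

    def do_GET(self) -> None:
        self.send_response(200)  # also adds the Server and Date headers
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args) -> None:
        pass  # suppress the default request log to keep the output short


def fetch_page():
    """Client side: request "/" and return (status, headers, body)."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request("GET", "/")
        response = conn.getresponse()
        status = response.status
        headers = dict(response.getheaders())
        body = response.read()
        conn.close()
    finally:
        server.shutdown()
        thread.join()
        server.server_close()
    return status, headers, body


if __name__ == "__main__":
    status, headers, body = fetch_page()
    print(status)
    for name, value in headers.items():
        print(f"{name}: {value}")
    print(body.decode("utf-8"))
```

A real browser does the same thing as the client half of this sketch and then renders the HTML it receives instead of printing it.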
... programming interface to pass the data to programs that process the user information. If the user is purchasing a product, these behind-the-scenes programs may check credit card information or send a shipment order to the mail room. If the user is adding his name to a mailing list or joining a restricted online site, a program may add the user information to a database.

FIGURE 17.5 A server-side scripting...

<H1>, <H2>, <H3>, <H4>, <H5>, and <H6>: Marks the beginning and end of a heading. Each heading tag represents a different heading level. <H1> is the highest level.
<B>: Marks the beginning and end of a section of bold text.
<U>: Marks the beginning and end of a section of underlined text.
<I>: Marks the beginning and end of a section of italicized text.
<FONT>: Marks the beginning and end of a section with special font...

A link appears in the HTML file as a tag. The simplest form of a link uses the <A> tag with the URL of the link destination given as a value for the HREF attribute. For instance, in the preceding example, if you would like the words “Archipelago of Parakeets” to appear as hypertext with a link to a website that tells about the archipelago, enclose the words within <A> tags as...

... formatting, file references, and links associated with a web page. Some important HTML tags are shown in Table 17.1.

TABLE 17.1 Some Important HTML Tags

Tag      Description
<HTML>   Marks beginning and end of HTML content in the file
<HEAD>   Marks the beginning and end of the header section
<BODY>   Marks the beginning and end of the body section, which describes the text that will appear in the browser window...

... a single template. It is a fairly simple matter to get a computer program or script to assemble HTML content. This dynamic approach enables a website to interact with the user. The server can formulate the web page in response to user input. Server-side scripting also lets the server accept input from the client and process that input behind the scenes. A common server-side scripting scenario is shown in...
... corresponding </HTML> tag at the end of the file. Within the beginning and ending tags, the document is divided into the following two sections: The head (enclosed between the <HEAD> and </HEAD> tags) contains information about the document. The information in the head does not appear on the web page, although the <TITLE> tag specifies a title that will appear in the title bar of the browser window...

... format for extending the capabilities of Internet email. A MIME-enabled email application encodes the binary attachment into MIME format before transmission. When the message is downloaded to the recipient, a MIME-enabled email application on the recipient’s computer decodes the attachment and restores it to its original form. MIME brings several innovations to Internet mail, including the following: Expanded...

... Internet’s email infrastructure is derived from a pair of documents published in 1982: RFC 821 (“Simple Mail Transfer Protocol”) and RFC 822 (“Standard for the Format of ARPA Internet Text Messages”). Later documents have refined these specifications, including RFC 2821, which defines a new version of SMTP, and RFC 2822, “Internet Message Format.” Other proposed email formats have developed through the...

HOUR 18: Email

... world). Email developed early in the history of networking. Almost as soon as computers were linked into networks, computer engineers began to wonder if humans as well as machines could communicate across those same network links. The current Internet email system dates back to ARPAnet days. Most of the Internet’s email infrastructure is derived...
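The MIME round trip mentioned in the excerpts above (the sending application encodes a binary attachment, and the receiving application decodes it back to its original form) can be sketched with Python’s standard `email` package. The addresses, subject line, filename, and function names below are all hypothetical, and no mail is actually sent; the message is simply serialized to bytes and parsed again.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_message(payload: bytes) -> bytes:
    """Sender side: wrap binary data as a MIME attachment and serialize it."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"      # hypothetical address
    msg["To"] = "recipient@example.com"     # hypothetical address
    msg["Subject"] = "Report attached"
    msg.set_content("The report is attached.")
    # Binary content is base64-encoded so it can travel as plain text.
    msg.add_attachment(
        payload,
        maintype="application",
        subtype="octet-stream",
        filename="report.bin",
    )
    return msg.as_bytes()


def extract_attachment(raw: bytes) -> bytes:
    """Recipient side: parse the message and decode the first attachment."""
    msg = message_from_bytes(raw, policy=policy.default)
    for part in msg.iter_attachments():
        return part.get_content()
    raise ValueError("no attachment found")


if __name__ == "__main__":
    original = bytes(range(256))  # arbitrary binary data
    wire_format = build_message(original)
    print(wire_format.decode("ascii")[:400])
    print(extract_attachment(wire_format) == original)
```

Printing the start of the serialized message shows the MIME headers and boundary lines that the email application adds, which the recipient normally never sees.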