Web Applications

Server-side web programming in Java began with Java servlets, which were in some ways the server-side analogue of Java applets. Servlets are also similar to CGI programs. The main difference is that a servlet has to run in the context of an application known as a servlet container. In this sense, servlets are more like ASP programs. Just as IIS understands how to interpret and execute ASPs, the servlet container understands how to pass requests on to servlets and send the output back to the user. In some cases, the web server serves as the servlet container; in others, the servlet container is a separate application that is connected to the web server. In either case, you write servlets and deploy them on your servlet container. Once you've written a servlet and mapped it to a particular URL, it can respond to requests the same way a CGI program can.

What does this have to do with Java Server Pages? Java Server Pages (or JSPs) are just a simpler way to create servlets. A JSP looks a lot like an ASP: it's an HTML page that optionally contains scriptlets and directives. In the JSP world, the scriptlets are written in Java. The trick here is that when a servlet container serves up a JSP, it converts it into a servlet, compiles that servlet into a Java class file, and then maps it to the path where the JSP is located. So, a JSP at /index.jsp is turned into a servlet that is called whenever that path is requested.

The syntax for JSPs is very similar to that of ASP files. Scriptlets are delimited in exactly the same way. In Java, it looks like this:

    <%
    String aString = "This is a string.";
    out.print(aString);
    %>

The expression evaluation feature in JSP works the same way as it does in ASP. To print out the value of aString without bothering with out.print(), use the following:

    <%= aString %>

JSP has some other constructs as well, including directives.
These directives follow this pattern:

    <%@ page language="java" %>

The fact that it starts with <%@ indicates that it's a directive. page is the name of the directive and language is an attribute. This directive indicates that the language used in the scriptlets on the page is Java. In truth, this is the only valid option; JSPs must use Java as their programming language. Perhaps the most common attribute of the page directive that you'll see is the import attribute, which is used to indicate that a particular class is used on your page. If you're not a Java programmer, imports might be a bit confusing for you. Just remember that you'll see a lot of them if you work on complex JSPs.

There's also an alternative form of directives for JSPs that uses an XML-based notation. To see how it differs from normal directives, let's look at how you include files in the JSP world. The first method uses a normal-looking include:

    <%@ include file="footer.jsp" %>

This directive includes a file called footer.jsp from the current directory in place of the directive. When you use the include directive, the included file is inserted at compile time. What this means is that the file is included before the JSP is converted to a servlet. For programmers, it means that code in the included file can interact with code in the files that include it. For example, you could set the copyright date in a variable in the including file, and reference that variable to print the date in the included file.

You can also include files using JSP's XML-style notation. To include footer.jsp this way, the following code is used:

    <jsp:include page="footer.jsp" />

When you use this type of include, it's treated as a runtime include. This differs from the previous type in that runtime includes are only included after the page has been converted into a servlet and run.
The include is processed or read in separately at that point, which means that variables can't be shared between the included file and the including file.

The last common constructs you'll hear about in the JSP world are taglibs. To make things easier for people who aren't Java programmers, the developers of the J2EE specification created a way to provide custom tags (called taglibs, short for tag libraries) that you can use as part of your pages. Not only can programmers create their own custom tags, but a number of projects are also working to create standard custom tags that encapsulate common functionality needed by many web applications. The taglib directive is used to make a tag library available for a JSP:

    <%@ taglib uri="/WEB-INF/app.tld" prefix="app" %>

The uri attribute provides the URL for the descriptor file for the tag library. The prefix attribute indicates how tags associated with the tag library are identified. For example, if the tag library has a tag called blockquote, it is differentiated from the standard <blockquote> tag by using the prefix, like this:

    <app:blockquote>Some stuff.</app:blockquote>

Many programmers write their own tag libraries that provide functionality specific to their applications. However, Sun has also defined a standard set of tag libraries to provide functionality common to many applications. This group of libraries is called JSTL (JavaServer Pages Standard Tag Library). You can read more about JSTL at http://java.sun.com/products/jsp/jstl/. You can find an actual implementation of JSTL at http://jakarta.apache.org/taglibs/doc/standard-doc/intro.html. The JSTL tags provide functionality for things such as loops, conditional operations, and processing XML.

There's a lot more to building web applications using Java and J2EE than I've discussed here. I've just provided an overview for you in case you need to apply your HTML skills to a web application written in Java.
Hopefully, when you run into one of these applications, you'll have seen enough here not to be confused by the JSP syntax.

PHP

PHP is yet another language that enables you to embed scripts in your web pages. ASP is really part of Microsoft's overall software development platform, and similarly, J2EE is part of the Java universe. PHP, on the other hand, is completely independent. Rather than building on a general-purpose language, PHP is a programming language unto itself. The language uses a C-like syntax that also has some things in common with Perl. Like ASP and JSP, it can be interspersed with your HTML. Usually, you'll find that PHP files have the extension .php, but the web server can be configured to treat any files as PHP files. You can even set things up so that files with the extension .html are treated as PHP files.

There are two ways to include script code in your pages. The first is the standard notation:

    <?php echo("Hello."); ?>

There's also a more concise notation for adding scripts to your page:

    <? echo("Hello."); ?>

The concise form was the traditional notation for adding PHP code to web pages, but it conflicts with XML, so <?php ?> was added to differentiate between the two. If you're starting out, you should stick with the <?php ?> notation because doing so could save you trouble later, and it costs only a few extra keystrokes.

One of the nicest things about PHP is that it's completely free, and can be easily installed to work with Apache, the most popular web server. For this reason, many web hosting providers include support for PHP with their hosting packages. It's fairly simple to install, and is neither large nor unwieldy, so you can run it yourself with little trouble. PHP is also easier to learn than some of the other systems because it's not just an extension of a larger programming environment. You can find out more about PHP at http://www.php.net/.
As I've done with the other technologies, let me explain how to include external files in your pages. In the PHP world, there are four functions that can be used to include external files in PHP documents: include(), require(), include_once(), and require_once(). For practical purposes, all of them behave like compile-time includes: code in the included file shares variables with the page that includes it.

Both include() and require() accept the path to a file as an argument. The difference is that if you use include() and the file cannot be read for some reason, a warning is printed but the page continues to be processed. If you use require(), a fatal error occurs if the included file cannot be read.

include_once() and require_once() behave like include() and require(), respectively, except that if the file to be included has already been included earlier on the page, the include is ignored. This may seem strange if you're thinking about including content, but it's helpful if you're including code. Let's say you have a file that sets up a bunch of variables used later on your page. It probably makes sense to use require_once() to make sure that those variables aren't set more than once.

Server-Side Includes

Now that I've described a number of popular web application platforms, let me describe something much simpler that you may find more useful, at least in the immediate future. Most web servers support an extension called server-side includes (SSI) that enables you to place directives in your pages that are processed by the server when the pages are requested. All directives follow the same basic format:

    <!--#directive attribute=value attribute=value -->

Note that SSI directives are cleverly disguised as HTML comments. The advantage is that if the server doesn't process the directives, the browser will ignore them. This is particularly useful when you're previewing pages locally that have SSI directives embedded in them.
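To make the directive format concrete, here is a minimal Python sketch that picks SSI directives out of a page and splits each one into its name and its attribute/value pairs. It is only an illustration of the format, not how any particular web server parses directives; the function name parse_directives and the two regular expressions are my own assumptions, and the sketch assumes attribute values are wrapped in double quotes.

```python
import re

# Hypothetical pattern for one directive, e.g.
# <!--#include virtual="/includes/footer.html" -->
DIRECTIVE_RE = re.compile(r'<!--#(\w+)\s+(.*?)\s*-->', re.DOTALL)

# Hypothetical pattern for attribute="value" pairs inside a directive.
ATTRIBUTE_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_directives(html):
    """Return a list of (directive_name, {attribute: value}) tuples."""
    found = []
    for match in DIRECTIVE_RE.finditer(html):
        name, raw_attributes = match.group(1), match.group(2)
        # Turn the attribute="value" pairs into a dictionary.
        found.append((name, dict(ATTRIBUTE_RE.findall(raw_attributes))))
    return found


page = '<p>Hi</p><!--#include virtual="/includes/footer.html" -->'
print(parse_directives(page))
```

Because a directive is syntactically an HTML comment, a browser that receives the page unprocessed simply skips it, while a tool like this one (or the server) can still find it by looking for the #name that follows the comment opener.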
Let's look at the directive in more detail. The directive name indicates how the directive should be used. In the ASP section earlier in this lesson, you saw that the include directive is used to include external files in an HTML document. The directive is followed by one or more attribute and value pairs. These pairs define how the directive is used. For example, the include directive has two possible attributes (which are mutually exclusive): file and virtual. The values associated with both of them are paths to files to include. As mentioned previously, the file attribute is used to load files from the current directory or below it. The virtual attribute indicates that the file can appear anywhere in the document root. Here are a couple of examples of include directives:

    <!--#include file="includes/footer.html" -->
    <!--#include virtual="/includes/footer.html" -->

Any web server that claims to support server-side includes will support the include directive. Apache, the torchbearer for SSI, supports many other directives as well. Table 19.1 contains a list of all the directives supported by Apache. You can read more about them at http://httpd.apache.org/docs/mod/mod_include.html.

Table 19.1. SSI Directives Supported by Apache

    Directive   Usage
    config      Enables you to configure the error message to be displayed if a directive fails, and the format for displaying date and time information and file size information.
    echo        Prints out the value of an include variable.
    exec        Executes a CGI program or command and includes the results in the page. (This directive is usually disabled for security reasons.)
    fsize       Prints out the size of the specified file. The virtual and file attributes are used with this directive, just as they are with the include directive.
    flastmod    Prints the date the specified file was last modified. Supports the file and virtual attributes.
    include     Includes a file in the page.
    printenv    Prints a list of all environment variables and their values.
    set         Sets a variable to a value.

Using Server-Side Includes

Let's look at how you might use server-side includes on a real site. As I said before, includes are the most common SSI directives used on most sites. Using includes, you can save a ton of work by bundling up content that's common to more than one page into separate files that are included on each of those pages.

On many sites, you'll find that navigational elements are shared among pages, and the header and footer are common to every page. First of all, let's take a look at the files common to every page, the header and footer. Here's the source code to the header file:

    <div id="header">
    <img src="header.gif" alt="Baseball Online" /><br />
    Welcome to the largest baseball site on the Web.
    </div>

The header consists of a single <div> containing a banner image and the tagline for the site. I've assigned the ID header to the <div> so that I can use CSS to modify how it is presented. Now let's look at the footer. It contains some basic copyright information that's found on every page.

    <div id="footer">
    Copyright 2003, Baseball Online.<br />
    Send questions to <a href="mailto:webmaster@example.com">webmaster</a>.
    </div>

Now for the navigational elements for the site. If I were going the easy route, I'd just create one navigation file to be used on all the pages on the site, but that's not appropriate in many cases. One purpose of navigation is to enable users to move to various areas of the site. However, another purpose is to give users a sense of where they are on the site. So, instead of creating one navigation file to be used everywhere, I'm going to create three navigation files, one for each section of the site. The content of the three files follows.
First is the navigation for the games section:

    <div id="navigation">
    <a href="/">Home</a><br />
    Games<br />
    <a href="/teams/">Teams</a><br />
    <a href="/leagues/">Leagues</a>
    </div>

Here's the navigation for the teams section:

    <div id="navigation">
    <a href="/">Home</a><br />
    <a href="/games/">Games</a><br />
    Teams<br />
    <a href="/leagues/">Leagues</a>
    </div>

And, finally, here's the navigation for the leagues section:

    <div id="navigation">
    <a href="/">Home</a><br />
    <a href="/games/">Games</a><br />
    <a href="/teams/">Teams</a><br />
    Leagues
    </div>

Now that I've created all of my included files, I can put them in a directory called includes off of the root directory. In this example, they'll be called header.html, footer.html, nav_games.html, nav_teams.html, and nav_leagues.html. Now let's look at a real page that uses these includes:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    <title>Games: A Games Page</title>
    <link rel="stylesheet" type="text/css" href="/includes/styles.css" />
    </head>
    <body>
    <!--#include virtual="/includes/header.html" -->
    <!--#include virtual="/includes/nav_games.html" -->
    <div id="content">This is the content for a games page.</div>
    <!--#include virtual="/includes/footer.html" -->
    </body>
    </html>

As you can see, the page I created consists mainly of includes. There are a few other things to note as well. Both in the includes and in the main page, there are no tags that modify how the data is presented. Instead, I just put everything inside <div> tags with IDs. The linked style sheet in this document contains all the style rules for the page, and controls the layout of the entire page.
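The substitution the server performs on such a page can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Apache's implementation: the files dictionary stands in for the document root, the function name expand_includes is hypothetical, only the virtual attribute is handled, and the choice to show an error marker for a missing file is my own simplification.

```python
import re

# Hypothetical pattern for <!--#include virtual="/path" --> directives.
INCLUDE_RE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')

# Assumption: text shown in place of a directive whose file is missing.
ERROR_TEXT = "[an error occurred while processing this directive]"


def expand_includes(page, files):
    """Replace each include directive with the text stored under its path."""
    def replace(match):
        path = match.group(1)
        # Look the path up in the stand-in document root.
        return files.get(path, ERROR_TEXT)
    return INCLUDE_RE.sub(replace, page)


# A tiny stand-in for the includes directory.
files = {
    "/includes/header.html": '<div id="header">Header</div>',
    "/includes/footer.html": '<div id="footer">Footer</div>',
}
page = (
    '<body>\n'
    '<!--#include virtual="/includes/header.html" -->\n'
    '<div id="content">This is the content for a games page.</div>\n'
    '<!--#include virtual="/includes/footer.html" -->\n'
    '</body>'
)
print(expand_includes(page, files))
```

The point of the sketch is that each directive is swapped for the included file's text before the page leaves the server, so the browser never sees the directives themselves.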
Here's the source for the page once all the included files have been included by the web server:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    <title>Games: A Games Page</title>
    <link rel="stylesheet" type="text/css" href="/includes/styles.css" />
    </head>
    <body>
    <div id="header">
    <img src="header.gif" alt="Baseball Online" /><br />
    Welcome to the largest baseball site on the Web.
    </div>
    <div id="navigation">
    <a href="/">Home</a><br />
    Games<br />
    <a href="/teams/">Teams</a><br />
    <a href="/leagues/">Leagues</a>
    </div>
    <div id="content">This is the content for a games page.</div>
    <div id="footer">
    Copyright 2003, Baseball Online.<br />
    Send questions to <a href="mailto:webmaster@example.com">webmaster</a>.
    </div>
    </body>
    </html>

Using Apache Access Control Files

Apache is the most popular web server, and is especially popular among ISPs that provide web hosting. For that reason, when I discuss features of web servers, I'm going to discuss Apache. You can apply the general techniques described here to any web server, but the specifics will apply only to Apache.

Under Apache, assuming the server administrator has given you the right to do so, you can set up access control files that you can use to manage access to your directories, along with a lot of other things. These days, many kinds of server configuration directives can be used on a per-directory basis from the access file. The access control file is generally named .htaccess. It can actually be named anything that the server administrator chooses, but .htaccess is the default, and there's really no good reason to change it.
Because the filename begins with a period, it's a hidden file under UNIX. This keeps it from cluttering things up when you're looking for content files in a directory. Each line of Apache configuration begins with the name of a directive, and the rest of the line consists of the parameters associated with that directive. Directives are entered one per line. The format is usually something like this:

    Directive valueForDirective

Sometimes the configuration directives are slightly more complex, but that's the general format. The first type of configuration I'll explain how to manipulate is actual access to the pages in the directory (and its subdirectories).

Managing Access to Pages

Controlling access to pages can be somewhat complex, mainly because Apache's access control system is very flexible. First, let's look at the access control directives themselves. The four you need to pay attention to are allow, deny, require, and order.

The allow and deny directives enable you to control access to pages based on the IP address or domain name of the computer the visitor is using. Let's say you want to disallow all users from samspublishing.com (meaning that they're using a computer on the samspublishing.com network) from some pages on your site. You could just put this in your .htaccess file:

    deny from samspublishing.com

These directives match parts of hostnames, so even if the user's hostname is firewall.samspublishing.com, he'll still be denied. By the same token, you can deny based on IP address:

    deny from 192.168.1

In this case, anyone on the network 192.168.1.* would be denied. You can be as restrictive as you want. You could create a directive like this, if you wanted:

    deny from com

Needless to say, this would prevent anyone on a .com network from viewing your pages. That's pretty harsh. If you're not careful, you can restrict everyone from seeing your pages.
If that's what you intend to do, you can just use this directive:

    deny from all

Why would you want to do that? Well, it makes more sense when combined with two other directives: order and allow. Using order, you can specify the order in which deny and allow rules are applied. allow is just like deny, except that it allows users that meet the rule you specify to see the pages. So, you can write rules like this (note that there is no space after the comma in the order directive):

    order deny,allow
    deny from all
    allow from samspublishing.com

This restricts use of your pages to only people on the samspublishing.com network. Based on the order directive, the deny rule is applied first, shutting out everyone. Then if the user meets your allow rule, she's allowed in.

The last directive in this family is require. This directive, rather than basing access on how the user's machine identifies itself, is used to require user authentication. Before you can use it, you need to set a few things up. First, you'll need to create a file containing a list of usernames and passwords that can be used to access your site. Apache helpfully provides a tool to create these files, called htpasswd. It's invoked in the following manner:

    htpasswd -c /usr/local/apache/passwords account

The arguments are as follows: the flag -c indicates that the password file should be created if it doesn't already exist. The next argument, /usr/local/apache/passwords, is the name of the password file. The last argument is the name of the account you want to create. The program will then ask you to type the password for the account and confirm it. Once you've done so, the account is created, along with the password file. At that point, you can add more accounts to your file by repeatedly running htpasswd (without the -c flag) and passing in new account names each time.

The reason you can't just create a text file in a text editor is that the passwords are encrypted when they're saved in the file. htpasswd takes care of that.
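To give a feel for why a password file can't simply be typed by hand, here is a small Python sketch that builds one line in the legacy {SHA} format, one of the formats Apache can read, in which the password is stored as a Base64-encoded SHA-1 digest. Treat it purely as an illustration: the function name sha_entry and the sample account and password are hypothetical, this is not necessarily the format htpasswd writes by default, and the scheme is unsalted and too weak for real use, so stick with htpasswd itself for actual password files.

```python
import base64
import hashlib


def sha_entry(account, password):
    """Build an 'account:{SHA}digest' line in the legacy SHA-1 format."""
    # Hash the password, then Base64-encode the raw 20-byte digest.
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    return f"{account}:{{SHA}}{encoded}"


# Hypothetical account name and password, for illustration only.
print(sha_entry("account", "example-password"))
```

Each line of the file pairs an account name with a transformed version of the password rather than the password itself, which is the part htpasswd handles for you.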
You can also create groups of users by creating a group file. To create this file, just use your favorite text editor and use the following format:

    groupname: account1 account2 account3

Substitute groupname with a group name of your own choosing, such as managers, and then list all the accounts from your password file that are members of the group.

Once you've set up your password file and (optionally) your group file, you're ready to start using the require directive. To set up what's referred to as an authentication realm, the following directives are used:

● AuthType: The authentication scheme to use. Except in rare circumstances, you'll use Basic here.
● AuthName: The name of the realm. This name will be displayed when the user is prompted to log in.
● AuthUserFile: The path of the password file you created.
● AuthGroupFile: The path to the group file, if there is one.

Now let's look at how these are used in the file:

    AuthType Basic
    AuthName "Administrator Files"
    AuthUserFile /usr/local/apache/passwords
    AuthGroupFile /usr/local/apache/groups

Once you've set up the authentication realm, you can start using require directives to specify who's allowed to see the pages and who isn't. The format for require directives is as follows:

    require group administrators
    require user fred bob jim betty

First, you specify whether the require directive refers to users or groups, and then you include a list of usernames or group names. If you included the previous directives in your .htaccess file, the users fred, bob, jim, and betty would be able to access the pages, along with any users in the administrators group.

Redirecting Users

Although .htaccess files were once associated strictly with access control, their capabilities were eventually expanded to encompass nearly all of the configuration directives for Apache.
There's a full list of configuration directives for Apache 1.3 at http://httpd.apache.org/docs/mod/directives.html. If you're using Apache 2.0, you can use http://httpd.apache.org/docs/2.0/mod/directives.html. I want to talk about two in particular: Redirect and RedirectMatch.

First, let's talk a bit about redirection. Redirecting users from one URL to another is all too common. For example, let's say you have a directory called aboutus on your website, and you want to move everything to a directory called about. One common way of handling things so that your users don't get a dreaded 404 Not Found error when they go to the old URL is to put an index.html file in that directory that looks like this:

    <html>
    <head>
    <title>Moved</title>
    <meta http-equiv="refresh" content="1; url=http://www.example.com/about" />
    </head>
    <body>
    <p>This page has moved to a <a href="/about/">new location</a>.</p>
    </body>
    </html>

The <meta> tag basically tells the browser to wait one second and then proceed to the URL specified. The content on the page is there just to handle the rare case in which the user's browser doesn't do what