of the HTML form tags. It will only be a matter of time before we use XML to define our Java Beans and compile the XML into the Beans' byte codes. It is interesting to note that in the current JDK 1.2 release all the AWT and Swing components are implemented as Java Beans.

Chapter 9. Application Servers

• How Did We Get to Here?
• What Is an Application Server?
• Some Explanations

Examining what is happening in the information systems (IS) shops of corporate America, we see a move away from the deployment of traditional client/server applications and a large shift toward multitier, Web-based computing. Delivery of application functionality in the form of desktop-based, fat client tools is being dumped in favor of server-generated, lightweight HTML-based user interfaces that derive their presentation layer from the lowly Web browser (which is already installed on almost every desktop in corporate America). The browser, used as an application presentation engine and user input collection device (rather than a processing engine) and coupled to a server architecture that provides processing power and a multitier approach to data connectivity, makes a very powerful and versatile application delivery vehicle.

How did we arrive at this point? The answer is through the evolution of Web-based computing. In the early days of the Web (i.e., the not too distant past) we started to develop applications that used the Web browser as the application presentation layer, speaking via the HyperText Transfer Protocol (HTTP) and the Common Gateway Interface (CGI) to programs run on the Web server at the request of a browser page. Voilà! Our Web server had become an application server; now, instead of delivering content to a Web browser, the Web server was delivering an application.
Many of the early Web-based applications were pretty simple, usually consisting of an HTML form and a script run by the Web server (these came to be known as CGI scripts) to read data returned to the Web server and act on it. As time progressed and applications became more sophisticated, we quickly became aware of the limitations of the stateless HTTP and came up with a number of mechanisms (hidden variables, cookies,…) that allowed us to make our applications more stateful and take on the guise of traditional client/server applications without having to resort to the heavyweight client model. Ahhh, life was good and computing was even better.

As the Web became more popular and we (corporate America) decided that we needed to take commercial advantage of the new application deployment platform, we began to notice that there were some flaws in our new paradigm. First, applications that operated well for a few users didn't do so well when we tried to scale them to thousands and tens of thousands of users. Second, there was the ever-present, pain-in-the-neck problem of state preservation. Determined not to let happen to Web computing what happened in client/server computing, we started to examine our new environment and make improvements where needed.

One of the first things that we noticed was that the CGI scripts we were using to add processing power to our Web pages were a pretty bad way to do it. There were a number of options for increasing the throughput. We could get rid of those shell and Perl scripts and move that processing to a compiled language (C or C++) that would run faster and free up processor horsepower; or we could take the tried-and-true approach to automotive repair and "jack up the radiator cap and drop in a new car" (i.e., if our current machine is too slow, save the software and buy a bigger, more powerful machine). Both of these approaches have severe flaws.
Replacing the scripts with compiled versions only masked or postponed the real problem and actually introduced a few problems of its own. The real problem was in the way that the scripts were being run (i.e., every time a script was called for, the Web server started up a new process to run the CGI program in, and we quickly ran out of system resources). Another problem was that, if we weren't religious about memory management and I/O programming, it was relatively easy for a hacker to figure out how to overflow an I/O buffer and get the process to crash hard enough to bring down the entire site or inadvertently hand control of the machine over to the hacker.

Along comes Java and server-side programming, and developers recognize that, because of the Java security model and its lightweight nature, it would be an ideal tool for doing server-side programming. Servlets are born. The first servlets showed up as .BAT files and scripts that loaded the Java Virtual Machine (JVM) and ran the Java code as a CGI program would be run. The performance and scalability of these servlets were pretty bad because so many copies of the JVM had to be loaded, but the stage was set for the servlet API. It was only a matter of time until the release of the all-Java Web server and the integration of the JVM into OEM Web server products like Netscape's Enterprise Server brought true server-side servlet computing onto the scene.

As we looked more seriously at the Web as an application deployment platform, a new type of program called an application server started to appear. The application server started out as an application that could be run in conjunction with our Web server and would do things like state management and legacy system access. Little by little, the application server architecture took on the general functionality shown in Figure 9-1. Currently there are more than 40 products on the market that all claim to be application servers.

Figure 9-1. General application server architecture.
The following common threads run through all the application servers currently on the market and should be a help when comparing marketing information from the various vendors:

1. Inclusion of a high-performance Web server or the ability to integrate any of the popular commercial Web servers easily.
2. Integrated development environment or the ability to integrate any of the commercially available development IDEs.
3. The ability to interface with Enterprise Resource Planning (ERP) systems, especially SAP, BAAN, or PeopleSoft.
4. The ability to interface with Transaction Processing (TP) monitors.
5. Support for stateless and stateful database connections.
6. Connection pooling of database connections.
7. Access to legacy applications and legacy databases.
8. Massive scalability through hardware replication and load balancing.
9. Automatic fail-over capability in the case of a processor failure.
10. Support of the Enterprise Java Beans Specification.

This is shown in Figure 9-1.

High-Performance Web Servers

Web servers have always played a central role in Web-based computing. In the past they were used to serve content to users and provide the capability to run CGI scripts to create dynamic Web pages with database connectivity. Newer Web servers have integrated JVMs and support the full servlet API, making the use of CGI programming unnecessary. This new breed of Web server also supports remote administration via a Web-based GUI and even the administration of multiple servers through the same GUI.

Integrated Development Environment

Because the environment provided by application servers is so rich and supports so many APIs, OEMs all provide Integrated Development Environments to help the end user create applications that are timely and supportable. Included with the tools are source control and configuration management systems.
Interfacing to Enterprise Resource Planning Systems

Today the darling application of the corporate IS shop is the ERP system. One of the things that was painfully (and expensively) pointed out to IS managers over the last 3 years (Y2K preparation) was how terribly dependent corporations were on old legacy systems written 20 and 30 years ago in COBOL and PL/1. One of the options for becoming Y2K compliant was to move the corporate computing model away from the hodge-podge of legacy applications and databases that had grown up with the corporations that fostered them and toward a relatively standardized model that had been developed relatively recently under the guise of ERP systems.

ERP systems, like SAP, BAAN, and PeopleSoft, are based on a large database model of the entire corporation. After all, most corporations have a similar make-up (i.e., an accounting organization, accounts payable, accounts receivable, payroll, personnel, manufacturing, planning, etc.). If all these functions could share a common database and a common set of processes and procedures, the corporation could run more effectively and efficiently. Since one system can never meet all possible needs, application servers provide certified tools that allow interfacing with ERP systems in such a way that application server-based programs will be well behaved and supported even through new releases of the ERP system software.

Ability to Interface with Transaction Processing Monitors

One of the applications developed during the client/server era of application development was a program called a Transaction Processing (TP) Monitor (BEA Tuxedo is the most notable of these). TP Monitors are systems that handle jobs with high transaction rates, such as airline reservation systems and banking systems. Being able to interface with these systems from the Web is crucial to e-Commerce.
Support Stateful Applications

One of the hardest things about programming the Web has always been how to make stateful applications. In our attempt to do this, we have tried every trick we could come up with, from cookies to hidden variables. Application servers take a more rigorous approach by actually maintaining state databases for our applications. In most cases, state is maintained in two forms: state that can be recovered after a reboot (persistent via a DBMS) and state that cannot be recovered after a reboot (in-memory data caching).

Connection Pooling of Database Connections

One of the lessons we learned from the two-tier client/server model of database programming is that databases are not good connection managers and that making the initial connection to the database in many cases takes longer than the actual database activities we are trying to perform. To help improve overall database performance, the application server will open a number of database connections at startup and then manage those connections for the various applications that are using the database(s). This way the cost (time) of establishing the database connections is only incurred once (at startup). Once opened, the connections are never shut down; instead, they are shared by the applications. After all, a connection is a connection is a connection…and Web-based applications do not usually need a connection for more than the current query.

Access to Legacy Applications and Legacy Databases

There are a number of ways to provide access to legacy system applications.

• One is to incorporate a terminal emulator in the form of a Java applet as one of the client interfaces provided by the application server and actually allow the end user to interact with the legacy system.
• Another is to provide a screen scraper that allows the data portion of a legacy screen to be scraped out and placed in a dynamic HTML form or a Java applet.
• A more common way is to place a CORBA wrapper around the legacy system and provide a new CORBA (Java or C++) client through which the user interfaces with the application.

For access to a legacy database, the approach is somewhat dependent on the database and the hosting operating system. In many cases, we are looking at IBM mainframes and DB2, VSAM, and IMS databases. Because these systems rely on IBM's proprietary SNA (Systems Network Architecture) connectivity scheme rather than TCP/IP, one requirement is the installation of a gateway between the two networks (TCP/IP and SNA) that is responsible for the protocol conversion between them. Because DB2 is a relational database, JDBC drivers can be installed on client workstations; they will allow normal JDBC access to DB2.

VSAM (Virtual Storage Access Method) is an older, flat-file data structure in which many legacy databases are maintained. In many cases, these databases should have been moved to DB2 years ago, but the "if it isn't broken don't fix it" mentality prevailed. IMS is a database model from the 1970s that viewed a database as a hierarchical structure and provided its own database programming language to support it. For VSAM and IMS databases there are tools from IBM and companies like Intersolv and Cross Access that allow these databases to be treated as relational tables. Because of the number of gyrations that these tools have to go through to make everything look like a table, the performance is limited.

Scalability Through Load Balancing

Scalability is the ability of a system to meet the performance demands of an increasingly larger user community. A single processor, no matter how fast, can only service a finite number of user requests with some degree of performance. Adding multiple processors to a box can buy some additional performance but does not really increase the scalability of the overall system.
To increase the scalability of the system in a meaningful way requires the introduction of additional processing units (boxes), one of which is to be used as an HTTP dispatcher. All requests will come in to the dispatcher, and the dispatcher will direct the request to the least busy box in the
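
The dispatcher idea described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not how any particular application server does it: the class and method names (LeastBusyDispatcher, Box, acquire, release) are hypothetical, and "least busy" is assumed here to mean the box with the fewest requests currently in flight. Real products may measure load quite differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a dispatcher that routes each request to the least busy box. */
public class LeastBusyDispatcher {

    /** One back-end box; load is assumed to be its count of in-flight requests. */
    public static final class Box {
        public final String name;
        private int active;               // guarded by the dispatcher's lock

        Box(String name) { this.name = name; }
    }

    private final List<Box> boxes = new ArrayList<>();

    public LeastBusyDispatcher(String... names) {
        if (names.length == 0) {
            throw new IllegalArgumentException("need at least one box");
        }
        for (String n : names) {
            boxes.add(new Box(n));
        }
    }

    /** Picks the box with the fewest in-flight requests (first one wins a tie). */
    public synchronized Box acquire() {
        Box best = boxes.get(0);
        for (Box b : boxes) {
            if (b.active < best.active) {
                best = b;
            }
        }
        best.active++;                    // the chosen box is now one request busier
        return best;
    }

    /** Called when a box has finished handling a request. */
    public synchronized void release(Box box) {
        if (box.active > 0) {
            box.active--;
        }
    }

    public static void main(String[] args) {
        LeastBusyDispatcher dispatcher = new LeastBusyDispatcher("box-a", "box-b");
        Box first = dispatcher.acquire();
        Box second = dispatcher.acquire();
        System.out.println(first.name + " then " + second.name);
        dispatcher.release(second);       // second box finishes its request
        System.out.println("next request goes to " + dispatcher.acquire().name);
    }
}
```

A front end would call acquire() when a request arrives, forward the request to the returned box, and call release() once the response has been sent. Counting in-flight requests is the simplest possible load measure; it was chosen for the sketch only because it needs no feedback from the boxes themselves.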