Data Access in J2EE Applications

JDBC access from custom tags is superficially appealing, because it's efficient and convenient. Consider the following JSP fragment from the JSP Standard Tag Library 1.0 specification, which transfers an amount from one account to another using two SQL updates. We'll discuss the JSP STL Expression Language in Chapter 13. The ${} syntax is used to access variables already defined on the page:

Now let's consider some of the design principles such a JSP violates and the problems that it is likely to produce:

o The JSP source fails to reflect the structure of the dynamic page it will generate. The 16 lines of code shown above are certain to be the most important part of a JSP that contains them, yet they generate no content.
o (Distributed applications only) Reduced deployment flexibility. Now that the web tier is dependent on the database, it needs to be able to communicate with the database, not just the EJB tier of the application.
o Broken error handling. By the time we encounter any errors (such as failure to communicate with the database), we're committed to rendering one particular view. At best we'll end up on a generic error page; at worst, the buffer will have been flushed before the error was encountered, and we'll get a broken page.
o The need to perform transaction management in a JSP, to ensure that updates occur together or not at all. Transaction management should be the responsibility of middle tier objects.
o Subversion of the principle that business logic belongs in the middle tier. There's no supporting layer of middle tier objects. There's no way to expose the business logic contained in this page to non-web clients or even web services clients.
o Inability to perform unit testing, as the JSP exposes no business interface.
o Tight coupling between page generation and data structure. If an application uses this approach and the database schema changes, many JSP pages are likely to need updating.
o Confusion of presentation with content. What if we wanted to expose the data this page presents in PDF (a binary format that JSP can't generate)? What if we wanted to convert the data to XML and transform it with an XSLT stylesheet? We'd need to duplicate the data access code. The business functionality encapsulated in the database update is tied to JSP, a particular view strategy.

Brought to you by ownSky

If there is any place for data access from JSP pages using tag libraries, it is in trivial systems or prototypes (the authors of the JSP standard tag library share this view).

Never perform data access from JSP pages, even when it is given the apparent respectability of a packaged tag library. JSP pages are view components.

Summary

In this chapter we've looked at some of the key issues in data access in J2EE systems. We've discussed:

o The distinction between business logic and persistence logic. While business logic should be handled by Java business objects, persistence logic can legitimately be performed in a range of J2EE components, or even in the database.
o The choice between object-driven and database-driven data modeling, and why database-driven modeling is often preferable.
o The challenges of working with relational databases.
o O/R mapping concepts.
o The use of Data Access Objects - ordinary Java interfaces - to provide an abstraction of data access for use by business objects. A DAO approach differs from an O/R mapping approach in that it is made up of verbs ("disable the accounts of all users in Chile") rather than nouns ("this is a User object; if I set a property the database will be transparently updated"). However, it does not preclude use of O/R mapping.
o Exchanging data in distributed applications. We discussed the Value Object J2EE pattern, which consolidates multiple data values in a single serializable object to minimize the number of expensive remote calls required.
We considered the possible need for multiple value objects to meet the requirements of different use cases, and considered generic alternatives to typed value objects which may be appropriate when remote callers have a wide variety of data requirements.
o Strategies for generating primary keys.
o Where to implement data access in J2EE systems. We concluded that data access should be performed in EJBs or middle-tier business objects, and that entity beans are just one approach. Although middle-tier business objects may actually run in a web container, we saw that data access from web-specific components such as servlets and JSP pages is poor practice.

I have argued that portability is often unduly prioritized in data access. Portability of design matters greatly; trying to achieve portability of code is often harmful. An efficient, simple solution that requires a modest amount of persistence code to be reimplemented if the database changes creates more business value than an inefficient, less natural but 100% portable solution. One of the lessons of XP is that it's often a mistake to try to solve tomorrow's problems today, if this adds complexity in the first instance.

Data Modeling in the Sample Application

Following this discussion, let's consider data access in our sample application.

The Unicorn Group already uses Oracle 8.1.7i. It's likely that other reporting tools will use the database and, in Phase 1, some administration tasks will be performed with database-specific tools. Thus database-driven (rather than object-driven) modeling is appropriate (some of the existing box office application's schema might even be reusable).

This book isn't about database design, and I don't claim to be an expert, so we'll cover the data schema quickly. In a real project, DBAs would play an important role in developing it.
The schema will reflect the following data requirements:

o There will be a number of genres, such as Musical, Opera, Ballet, and Circus.
o There will be a number of shows in each genre. It must be possible to associate an HTML document with each show, containing information about the work to be performed, the cast and so on.
o Each show has a seating plan. A seating plan describes a fixed number of seats for sale, divided into one or more seat types, each associated with a name (such as Premium Reserve) and code (such as AA) that can be displayed to customers.
o There are multiple performances of each show. Each performance will have a price structure which will assign a price to each type of seat.
o Although it is possible for each show to have an individual seating plan, and for each performance to have an individual price structure, it is likely that shows will use the default seating plan for the relevant hall, and that all performances of a show will use the same price structure.
o Users can create booking reservations that hold a number of seats for a performance. These reservations can progress to confirmations (seat purchases) on submission of valid credit card details.

First we must decide what to hold in the database. The database should be the central data repository, but it's not a good place to store HTML content. This is reference data, with no transactional requirements, so it can be viewed as part of the web application and kept inside its directory structure. It can then be modified by HTML coders without the need to access or modify the database. When rendering the web interface, we can easily look up the relevant resources (seating plan images and show information) from the primary key of the related record in the database. For example, the seating plan corresponding to the primary key 1 might be held within the web application at /images/seatingplans/1.jpg.
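The lookup from primary key to web resource described above can be sketched in a few lines of Java. This is only an illustration: the class and method names are hypothetical, the directory and file suffix are copied from the example path in the text, and the check for positive keys is an assumption about how keys are generated.

```java
/** Hypothetical helper: maps a database primary key to a static web resource path. */
final class SeatingPlanResources {

    // Directory taken from the example path in the text.
    private static final String SEATING_PLAN_DIR = "/images/seatingplans/";

    private SeatingPlanResources() {
    }

    /** Returns the path of the seating plan image for the given primary key. */
    static String imagePathFor(long seatingPlanId) {
        // Assumption: primary keys are positive numbers.
        if (seatingPlanId <= 0) {
            throw new IllegalArgumentException("Invalid seating plan id: " + seatingPlanId);
        }
        return SEATING_PLAN_DIR + seatingPlanId + ".jpg";
    }
}
```

A view component would call SeatingPlanResources.imagePathFor(id) with the key it already holds, so no extra database column is needed to locate the image.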
An O/R modeling approach, such as entity EJBs, will produce little benefit in this situation. O/R modeling approaches are usually designed for a read-modify-write scenario. In the sample application, we have some reference data (such as genre and show data) that is never modified through the Internet User or Box Office User interfaces. Such read-only reference data can be easily and efficiently obtained using JDBC; O/R approaches are likely to add unnecessary overhead. Along with accessing reference data, the application needs to create booking records to represent users' seat reservations and purchase records when users confirm their reservation. This dynamic data is not well suited to O/R modeling either, as there is no value in caching it. For example, the details of a booking record will be displayed once, when a user completes the booking process. There is little likelihood of it being needed again, except as part of a periodic reporting process, which might print and mail tickets.

As we know that the organization is committed to using Oracle, we want to leverage any useful Oracle features. For example, we can use Oracle Index Organized Tables (IOTs) to improve performance. We can use PL/SQL stored procedures. We can use Oracle data types, such as the Oracle date type, a combined date/time value which is easy to work with in Java (standard SQL and most other databases use separate date and time objects).

Both these considerations suggest the use of the DAO pattern, with JDBC as the first implementation choice (we'll discuss how to use JDBC without reducing maintainability in Chapter 8). JDBC produces excellent performance in situations where read-only data is concerned and where caching in an O/R mapping layer will produce no benefit. Using JDBC will also allow us to make use of proprietary Oracle features, without tying our design to Oracle.
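To make the DAO idea concrete for read-only reference data, here is a minimal sketch. The names GenreDao, JdbcGenreDao and InMemoryGenreDao are hypothetical, and the table and column in the SQL string are assumptions rather than names taken from the book's DDL. Business objects would depend only on the interface.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.sql.DataSource;

/** Hypothetical DAO interface: callers see no JDBC types. */
interface GenreDao {
    List<String> findGenreNames();
}

/** JDBC implementation; table and column names are assumed, not taken from the DDL. */
class JdbcGenreDao implements GenreDao {
    private final DataSource dataSource;

    JdbcGenreDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public List<String> findGenreNames() {
        String sql = "SELECT name FROM genre ORDER BY name"; // hypothetical schema
        List<String> names = new ArrayList<>();
        // try-with-resources closes the result set, statement and connection in reverse order
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(sql);
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                names.add(rs.getString(1));
            }
        } catch (SQLException ex) {
            // Wrap the checked exception so callers are not tied to JDBC
            throw new IllegalStateException("Could not load genres", ex);
        }
        return names;
    }
}

/** In-memory stand-in, useful for tests or for swapping the persistence strategy. */
class InMemoryGenreDao implements GenreDao {
    private final List<String> names;

    InMemoryGenreDao(String... names) {
        this.names = new ArrayList<>(Arrays.asList(names));
    }

    public List<String> findGenreNames() {
        return new ArrayList<>(names); // defensive copy
    }
}
```

Because both classes implement the same interface, code that renders the genre list does not change when the implementation behind it does.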
The DAOs could be implemented using an alternative strategy if the application ever needs to work with another database.

The following E-R diagram shows a suitable schema:

The DDL file (create_ticket.ddl) is included in the download accompanying this book, in the /db directory. Please refer to it as necessary during the following brief discussion.

The tables can be divided into reference data and dynamic data. All tables except the SEAT_STATUS, BOOKING, PURCHASE, and REGISTERED_USER tables are essentially reference tables, updated only by Admin role functionality. Much of the complexity in this schema will not directly affect the web application. Each show is associated with a seating plan, which may be either a standard seating plan for the relevant hall or a custom seating plan. The SEAT_PLAN_SEAT table associates a seating plan with the seats it contains. Different seating plans may include some of the same seats; for example, one seating plan may remove a number of seats or change which seats are deemed to be adjacent. Seating plan information can be loaded once and cached in Java code. Then there will be no need to run further queries to establish which seats are adjacent etc.

Of the dynamic data, rows in the BOOKING table may represent either a seat reservation (which will live for a fixed time) or a seat purchase (in which case it has a reference to the PURCHASE table).

The SEAT_STATUS table is the most interesting, reflecting a slight denormalization of the data model. If we only created a new seat reservation record for each seat reserved or purchased, we could query to establish which seats were still free (based on the seats for this performance, obtained through the relevant seating plan), but this would require a complex, potentially slow query. Instead, the SEAT_STATUS table is pre-populated with one row for each seat in each performance.
Each row has a nullable reference to the BOOKING table; this will be set when a reservation or booking is made. The population of the SEAT_STATUS table is hidden within the database; a trigger (not shown here) is used to add or remove rows when rows are added to or removed from the PERFORMANCE table.

The SEAT_STATUS table is defined as follows:

    CREATE TABLE seat_status (
        performance_id NUMERIC NOT NULL REFERENCES performance,
        seat_id NUMERIC NOT NULL REFERENCES seat,
        price_band_id NUMERIC NOT NULL REFERENCES price_band,
        booking_id NUMERIC REFERENCES booking,
        PRIMARY KEY(performance_id, seat_id)
    ) organization index;

The price_band_id is also the id of the seat type. Note the use of an Oracle IOT, specified in the final organization index clause.

Denormalization is justified here on the following grounds:

o It is easy to achieve in the database, but simplifies queries and stored procedures.
o It boosts performance by avoiding complex joins.
o The resulting data duplication is not a serious problem in this case. The extent of the duplication is known in advance. The data being duplicated is immutable, so cannot get out of sync.
o It will avoid inserts and deletes in the SEAT_STATUS table, replacing them with updates. Inserts and deletes are likely to be more expensive than updates, so this will boost performance.
o It makes it easy to add functionality that may be required in the future. For example, it would be easy to remove some seats from sale by adding a new column in the SEAT_STATUS table.

It is still necessary to examine the BOOKING table, as well as the SEAT_STATUS table, to check whether a seat is available, but there is no need to navigate reference data tables. A SEAT_STATUS row without a booking reference always indicates an available seat, but one with a booking reference may also indicate an available seat if the booking has expired without being confirmed.
We need to perform an outer join with the BOOKING table to establish this; a query which includes rows in which the foreign key to the BOOKING table is null, as well as rows in which the related row in the BOOKING table indicates an expired reservation.

There is no reason that Java code - even in DAOs - should be aware of all the details of this schema. I have made several decisions to conceal some of the schema's complexity from Java code and hide some of the data management inside the database. For example:

o I've used a sequence and a stored procedure to handle reservations (the approach we discussed earlier in this chapter). This inserts into the BOOKING table, updates the SEAT_STATUS table and returns the primary key for the new booking object as an out parameter. Java code that uses it need not be aware that making a reservation involves updating two tables.
o I've used a trigger to set the purchase_date column in the PURCHASE table to the system date, so that Java code inserting into this table need not set the date. This ensures data integrity and potentially simplifies Java code.
o I've used a view to expose seating availability and hide the outer join required with the BOOKING table. This view doesn't need to be updateable; we're merely treating it as a stored query. (However, Java code that only queries needn't distinguish between a view and a table.) Although the rows in the view come only from the SEAT_STATUS table, seats that are unavailable will be excluded.
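As a rough illustration of the first point, the following sketch shows how Java code might call such a reservation procedure through JDBC and read the new booking's primary key from the out parameter. The procedure is referred to as reserve_seats later in the text, but its parameter list is not shown here, so the three-parameter signature and the comma-separated encoding of seat ids are assumptions made purely for this example.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

/** Sketch of calling the reservation stored procedure; the signature is hypothetical. */
final class ReservationProcedure {

    // Assumed signature: reserve_seats(performance_id IN, seat_ids IN, booking_id OUT)
    static final String CALL_SQL = "{call reserve_seats(?, ?, ?)}";

    private ReservationProcedure() {
    }

    /** Joins seat ids into a comma-separated string (an assumed way to pass several ids). */
    static String joinSeatIds(long[] seatIds) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < seatIds.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(seatIds[i]);
        }
        return sb.toString();
    }

    /** Invokes the procedure and returns the booking id read from the OUT parameter. */
    static long reserve(Connection con, long performanceId, long[] seatIds) throws SQLException {
        try (CallableStatement cs = con.prepareCall(CALL_SQL)) {
            cs.setLong(1, performanceId);
            cs.setString(2, joinSeatIds(seatIds));
            cs.registerOutParameter(3, Types.NUMERIC);
            cs.execute();
            return cs.getLong(3);
        }
    }
}
```

The caller sees a single method and a single return value; the fact that two tables are touched stays inside the database.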
The Oracle view definition is:

    CREATE OR REPLACE VIEW available_seats AS
    SELECT seat_status.seat_id, seat_status.performance_id, seat_status.price_band_id
    FROM seat_status, booking
    WHERE booking.authorization_code is NULL
    AND (booking.reserved_until is NULL or booking.reserved_until < sysdate)
    AND seat_status.booking_id = booking.id(+) ;

Using this view enables us to query for available seats of a given type very simply:

    SELECT seat_id FROM available_seats
    WHERE performance_id = ? AND price_band_id = ?

The advantages of this approach are that the Oracle-specific outer join syntax is hidden from Java code (we could implement the same view in another database with different syntax); Java code is simpler; and persistence logic is handled by the database. There is no need for the Java code to know how bookings are represented. Although it's unlikely that the database schema would be changed once it contained real user data, with this approach it could be changed without necessarily impacting Java code.

Oracle 9i also supports the standard SQL syntax for outer joins. However, the requirement was for the application to work with Oracle 8.1.7i.

In all these cases, the database contains only persistence logic. Changes to business rules cannot affect code contained in the database. Databases are good at handling persistence logic, with triggers, stored procedures, views, and the like, so this results in a simpler application. Essentially, we have two contracts decoupling business objects from the database: the DAO interfaces in Java code; and the stored procedure signatures and those tables and views used by the DAOs. These amount to the database's public interface as exposed to the J2EE application.

Before moving on to implementing the rest of the application, it's important to test the performance of this schema (for example, how quickly common queries will run) and behavior under concurrent usage. As this is database-specific, I won't show this here.
However, it's a part of the integrated testing strategy of the whole application.

Finally, we need to consider the locking strategy we want to apply - pessimistic or optimistic locking. Locking will be an issue when users try to reserve seats of the same type for the same performance. The actual allocation of seats (which will involve the algorithm for finding suitable adjacent seats) is a business logic issue, so we will want to handle it in Java code. This means that we will need to query the AVAILABLE_SEATS view for a performance and seat type. Java code, which will have cached and analyzed relevant seating plan reference data, will then examine the available seat ids and choose a number of seats to reserve. It will then invoke the reserve_seats stored procedure to reserve seats with the relevant ids.

All this will occur in the same transaction. Transactions will be managed by the J2EE server, not the database. Pessimistic locking will mean forcing all users trying to reserve seats for the same performance and seat type to wait until the transaction completes. Pessimistic locking can be enforced easily by adding FOR UPDATE to the SELECT from the AVAILABLE_SEATS view shown above. The next queued user would then be given the seat ids still available, which would remain locked until their transaction completed.

Optimistic locking might boost performance by eliminating blocking, but raises the risk of multiple users trying to reserve the same seats. In this case we'd have to check that the SEAT_STATUS rows associated with the selected seat ids hadn't been changed by a concurrent transaction, and would need to fail the reservation in this case (the Java component trying to make the reservation could retry the reservation request without reporting the optimistic locking failure to the user). Thus using optimistic locking might improve performance, but would complicate application code.
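The extra application code that optimistic locking implies can be sketched as a retry loop around the reservation call. Everything named here is hypothetical: OptimisticLockingFailureException, SeatReservationDao and RetryingReserver are illustration-only types, and the maximum number of attempts is an arbitrary parameter. The seat chooser is called again on each attempt, on the assumption that it re-reads the seats that are still available.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical exception: a concurrent transaction changed one of the selected seats. */
class OptimisticLockingFailureException extends RuntimeException {
    OptimisticLockingFailureException(String message) {
        super(message);
    }
}

/** Hypothetical DAO: reserves the seats and returns the booking id, or throws on a conflict. */
interface SeatReservationDao {
    long reserve(long performanceId, List<Long> seatIds);
}

/** Retries a failed reservation so the user never sees the locking failure directly. */
final class RetryingReserver {
    private final SeatReservationDao dao;
    private final int maxAttempts;

    RetryingReserver(SeatReservationDao dao, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        this.dao = dao;
        this.maxAttempts = maxAttempts;
    }

    /** The supplier is asked for a fresh seat selection before every attempt. */
    long reserveWithRetry(long performanceId, Supplier<List<Long>> seatChooser) {
        OptimisticLockingFailureException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return dao.reserve(performanceId, seatChooser.get());
            } catch (OptimisticLockingFailureException ex) {
                last = ex; // remember the failure and try again with a new selection
            }
        }
        throw last; // every attempt conflicted, so report the failure to the caller
    }
}
```

Even in this small form, the business object has to know about a failure mode that pessimistic locking would leave to the database.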
Using pessimistic locking would pass the work onto the database and guarantee data integrity.

We wouldn't face the same locking issue if we did the seat allocation in the database. In Oracle we could even do this in a Java stored procedure. However, this would reduce maintainability and make it difficult to implement a true OO solution. In accordance with the goal of keeping business logic in Java code running within the J2EE server, as well as ensuring that design remains portable, we should avoid this approach unless it proves to be the only way to ensure satisfactory performance.

The locking strategy will be hidden behind a DAO interface, so we can change it if necessary without needing to modify business objects. Pessimistic locking works well in Oracle, as queries without a FOR UPDATE clause will never block on locked data. This means that using pessimistic locking won't affect queries to count the number of seats still available (required for rendering the Display performance screen). In other databases, such queries may block - a good example of the danger that the same database access code will work differently in different databases.

Thus we'll decide to use the simpler pessimistic locking strategy if possible. However, as there is scope to change it without trashing the application's design, we can implement optimistic locking if performance testing indicates a problem supporting concurrent use or if we need to work with another RDBMS.

Finally, the issue of where to perform data access. In this chapter, we decided to use EJB only to handle the transactional booking process. This means that data access for the booking process will be performed in the EJB tier; other (non-transactional) data access will be performed in business objects running in the web container.

Data Access Using Entity Beans

Entity beans are the data access components described in the EJB specification.
While they have a disappointing track record in practice (which has prompted a major overhaul in the EJB 2.0 specification), their privileged status in the J2EE core means that we must understand them, even if we choose not to use them.

In this chapter we'll discuss:

o What entity beans aim to achieve, and the experience of using them in practice
o The pros and cons of the entity bean model, especially when entity beans are used with relational databases
o Deciding when to use entity beans, and how to use them effectively
o How to choose between entity beans with container-managed persistence and bean-managed persistence
o The significant enhancements in the EJB 2.0 entity bean model, and their implications for using entity beans
o Entity bean locking and caching support in leading application servers
o Entity bean performance

I confess. I don't much like entity beans. I don't believe that they should be considered the default choice for data access in J2EE applications.

If you choose to use entity beans, hopefully this chapter will help you to avoid many common pitfalls. However, I recommend alternative approaches for data access in most applications. In the next chapter we'll consider effective alternatives, and look at how to implement the Data-Access Object pattern. This pattern is usually more effective than entity beans at separating business logic from data-access implementation.

Entity Bean Concepts

Entity beans are intended to free session beans from the low-level task of working with persistent data, thus formalizing good design practice. They became a core part of the EJB specification in version 1.1; version 2.0 introduced major entity bean enhancements. EJB 2.1 brings further, incremental, enhancements, which I discuss when they may affect future strategy, although they are unavailable in J2EE 1.3 development.
Entity beans offer an attractive programming model, making it possible to use object concepts to access a relational database. Although entity beans are designed to work with any data store, this is by far the most common case in reality, and the one I'll focus on in this chapter. The entity bean promise is that the nuts and bolts of data access will be handled transparently by the container, leaving application developers to concentrate on implementing business logic. In this vision, container providers are expected to provide highly efficient data access implementations.

Unfortunately, the reality is somewhat different. Entity beans are heavyweight objects and often don't perform adequately. O/R mapping is a complex problem, and entity beans (even in EJB 2.0) fail to address many of its facets. Blithely using object concepts such as the traversal of associations with entity beans may produce disastrous performance. Entity beans don't remove the complexity of data access; they do reduce it, but largely move it into another layer. Entity bean deployment descriptors (both standard J2EE and container-specific) are very complex, and we simply can't afford to ignore many issues of the underlying data store.

There are serious questions about the whole concept of entity beans, which so far haven't been settled reassuringly by experience. Most importantly:

o Why do entity beans need remote interfaces, when a prime goal of EJB is to gather business logic into session beans? Although EJB 2.0 allows local access to entity beans, the entity bean model and the relatively cumbersome way of obtaining entity bean references reflects the heritage of entity beans as remote objects.
o If entity beans are accessed by reference, why do they need to be looked up using JNDI?
o Why do entity beans need infrastructure to handle transaction delimitation and security? Aren't these business logic issues that can best be handled by session beans?
o Do entity beans allow us to work with relational databases naturally and efficiently? The entity bean model tends to enforce row-level (rather than set-oriented) access to RDBMS tables, which is not what relational databases are designed to do, and may prove inefficient.
o Due to their high overhead, EJBs are best used as components, not fine-grained objects. This makes them poorly suited to modeling fine-grained data objects, which is arguably the only cost-effective way to use entity beans. (We'll discuss entity bean granularity in detail shortly).
o Is entity bean portability achievable or desirable, as databases behave in different ways? There's real danger in assuming that entity beans allow us to forget about basic persistence issues such as locking.

Alternatives such as JDO avoid many of these problems and much of the complexity that entity beans incur as a result.

It's important to remember that entity beans are only one choice for data access in J2EE applications. Application design should not be based around the use of entity beans.

Entity beans are one implementation choice in the EJB tier. Entity beans should not be exposed to clients. The web tier and other EJB clients should never access entity beans directly. They should work only with a layer of session beans implementing the application's business logic. This not only preserves flexibility in the application's design and implementation, but also usually improves performance.

This principle, which underpins the Session Facade pattern, is universally agreed: I can't recall the last time I saw anyone advocate using remote access to entity beans. However, I feel that an additional layer of abstraction is desirable to decouple session beans from entity beans. This is because entity beans are inflexible; they provide an abstraction from the persistence store, but make code that uses it dependent on that somewhat awkward abstraction.
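The additional layer of abstraction suggested here might look like the following sketch: an ordinary Java interface with no EJB types in its signatures, and business logic written against that interface only. All names (UserAccountDao, AccountAdministration, InMemoryUserAccountDao) are hypothetical, and the in-memory implementation merely stands in for one backed by entity beans, JDBC or JDO.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical persistence facade: a plain Java interface, free of EJB types. */
interface UserAccountDao {
    /** Disables all active accounts for the given country and returns how many changed. */
    int disableAccountsInCountry(String country);
}

/** Stand-in for session bean business logic; it depends only on the interface. */
class AccountAdministration {
    private final UserAccountDao dao;

    AccountAdministration(UserAccountDao dao) {
        this.dao = dao;
    }

    String suspendCountry(String country) {
        int changed = dao.disableAccountsInCountry(country);
        return changed + " account(s) disabled in " + country;
    }
}

/** In-memory implementation; an entity bean, JDBC or JDO version could replace it. */
class InMemoryUserAccountDao implements UserAccountDao {
    // Each entry holds {country, status}
    private final List<String[]> accounts = new ArrayList<>();

    void add(String country) {
        accounts.add(new String[] {country, "ACTIVE"});
    }

    public int disableAccountsInCountry(String country) {
        int changed = 0;
        for (String[] account : accounts) {
            if (account[0].equals(country) && account[1].equals("ACTIVE")) {
                account[1] = "DISABLED";
                changed++;
            }
        }
        return changed;
    }
}
```

Swapping the persistence strategy then means supplying a different UserAccountDao to AccountAdministration, with no change to the business logic itself.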
Session beans should preferably access entity beans only through a persistence facade of ordinary Java data access interfaces. While entity beans impose a particular way of working with data, a standard Java interface does not. This approach not only preserves flexibility, but also future-proofs an application. I have grave doubts about the future of entity beans, as JDO has the potential to provide a simpler, more general and higher-performing solution wherever entity beans are appropriate. By using DAO, we retain the ability to switch to the use of JDO or any other persistence strategy, even after an application has been initially implemented using entity beans.

We'll look at examples of this approach in the next chapter.

Due to the significant changes in entity beans introduced in EJB 2.0, much advice on using entity beans from the days of EJB 1.1 is outdated, as we'll see.

Definition

Entity beans are a slippery subject, so let's start with some definitions and reflection on entity beans in practice.

The EJB 2.0 specification defines an entity bean as, "a component that represents an object-oriented view of some entities stored in a persistent storage, such as a database, or entities that are implemented by an existing enterprise application". This conveys the aim of entity beans to "objectify" persistent data. However, it doesn't explain why this has to be achieved by EJBs rather than ordinary Java objects.

Core J2EE Patterns describes an entity bean as, "a distributed, shared, transactional and persistent object". This does explain why an entity bean needs to be an EJB, although the EJB 2.0 emphasis on local interfaces has moved the goalposts and rendered the "distributed" characteristic obsolete.

[...]
representation. For example, both EJBGen and XDoclet can generate local and home interfaces, J2EE standard and application-server-specific deployment descriptors from special Javadoc tags in an entity bean implementation class. Such simple tools are powerful and extensible, and far preferable to hand-coding. There is a strong case that entity beans should never be hand authored. One argument in favor of...

... write and understand than JDBC code.

The disadvantages of SQLJ include:

o The development-deployment cycle is more complex, requiring the use of the SQLJ precompiler.
o SQLJ doesn't simplify JDBC's cumbersome error handling requirements.
o SQLJ syntax is less elegant than Java syntax. On the other hand, SQLJ may achieve some tasks in far fewer lines of code than Java using JDBC.
o Although SQLJ is a standard,...

... between application scenarios and different EJB containers. If possible to get heavy cache hits, using read-only entity beans or because your container has an efficient he entity beans are a good choice and will perform well.

Entity Bean Locking Strategies

There are two main locking strategies for entity beans, both foreshadowed in the EJB specification (§10.5.9 and §10.5.10). The terminology used to...

... simply instantiates multiple entity objects with the same primary key. The locking strategy is up to the database, and will be determined by the transaction isolation level on entity bean methods. Database locking is described in "Commit Options B and C" in the EJB specification (§10.5.9), and JBoss documentation follows this terminology.

Database locking has the following advantages:

o It can support much...
to use SQL, on the other hand, is widely understood. EJB QL will need to become even more complex to be able to meet real-world requirements.
o It's purely a query language. It's impossible to use it to perform updates. The only option is to obtain multiple entities that result from an ejbSelect() method and to modify them individually. This wastes bandwidth between J2EE server and RDBMS, requires the traversal...

... reasons. On the other hand, a strategy of data access from session beans (using helper classes) is workable, and is likely to perform better.

It's vital that components outside the EJB tier don't work directly with entity beans, but work with session beans that mediate access to entity beans. This design principle ensures correct decoupling between data and client-side components, and maximizes flexibility...

... specification (as opposed to slightly more than a fifth of the much shorter EJB 1.1 specification), and much of the complexity of EJB containers. Removing the requirement to implement entity beans would foster competition and innovation in the application server market and would help JDO become a single strong J2EE standard for accessing persistent data. But that's just my opinion! I prefer to manage persistence...

... API in detail, showing how to avoid common JDBC errors and discussing some subtle points that are often neglected. As the JDBC API is relatively low-level, and using it is error-prone and requires an unacceptable volume of code, it's important for application code to work with higher-level APIs built on it. We'll look at the design, implementation, and usage of a JDBC abstraction framework that greatly...
statements to be expressed more concisely than with JDBC and facilitates getting Java variable values to and from the database. A SQLJ precompiler translates the SQLJ syntax (Java code with embedded SQL) into regular Java source code. The concept of embedded SQL is nothing new: Oracle's Pro*C and other products take the same approach to C and C++, and there are even similar solutions for COBOL.

... java.sql.SQLExceptions in SQLJ as in JDBC, and must ensure that we close the connection in the event of errors. However, we don't need to work with JDBC Statement and PreparedStatement objects, making this code less verbose than the JDBC equivalent. SQLJ Parts 1 and 2 attempt to standardize database-specific functionality such as stored procedures. SQLJ can be used in J2EE architecture in the same way as JDBC: