CHAPTER 13 Optimizing fetching and caching

    select i.* from ITEM i

    select b.* from BID b
        where b.ITEM_ID in (select i.ITEM_ID from ITEM i)

In annotations, you again have to use a Hibernate extension to enable this optimization:

    @OneToMany
    @org.hibernate.annotations.Fetch(
        org.hibernate.annotations.FetchMode.SUBSELECT
    )
    private Set<Bid> bids = new HashSet<Bid>();

Prefetching using a subselect is a powerful optimization; we’ll show you a few more details about it later, when we walk through a typical scenario. Subselect fetching is, at the time of writing, available only for collections, not for entity proxies. Also note that the original query that is rerun as a subselect is only remembered by Hibernate for a particular Session. If you detach an Item instance without initializing the collection of bids, and then reattach it and start iterating through the collection, no prefetching of other collections occurs.

All the previous fetching strategies are helpful if you try to reduce the number of additional SELECTs that are natural if you work with lazy loading and retrieve objects and collections on demand. The final fetching strategy is the opposite of on-demand retrieval. Often you want to retrieve associated objects or collections in the same initial SELECT with a JOIN.

13.2.3 Eager fetching with joins

Lazy loading is an excellent default strategy. On the other hand, you can often look at your domain and data model and say, “Every time I need an Item, I also need the seller of that Item.” If you can make that statement, you should go into your mapping metadata, enable eager fetching for the seller association, and utilize SQL joins:

    <class name="Item" table="ITEM">
        <many-to-one name="seller"
                     class="User"
                     column="SELLER_ID"
                     update="false"
                     fetch="join"/>
    </class>

Hibernate now loads both an Item and its seller in a single SQL statement.
For example:

    Item item = (Item) session.get(Item.class, new Long(123));

This operation triggers the following SQL SELECT:

    select i.*, u.*
    from ITEM i
        left outer join USERS u on i.SELLER_ID = u.USER_ID
    where i.ITEM_ID = ?

Obviously, the seller is no longer lazily loaded on demand, but immediately. Hence, fetch="join" disables lazy loading. If you only enable eager fetching with lazy="false", you see an immediate second SELECT. With fetch="join", you get the seller loaded in the same single SELECT. Look at the resultset from this query shown in figure 13.4.

Figure 13.4 Two tables are joined to eagerly fetch associated rows.

Hibernate reads this row and marshals two objects from the result. It connects them with a reference from Item to User, the seller association. If an Item doesn’t have a seller, all u.* columns are filled with NULL. This is why Hibernate uses an outer join, so it can retrieve not only Item objects with sellers, but all of them. But you know that an Item has to have a seller in CaveatEmptor. If you enable <many-to-one not-null="true"/>, Hibernate executes an inner join instead of an outer join.

You can also set the eager join fetching strategy on a collection:

    <class name="Item" table="ITEM">
        <set name="bids" inverse="true" fetch="join">
            <key column="ITEM_ID"/>
            <one-to-many class="Bid"/>
        </set>
    </class>

If you now load many Item objects, for example with createCriteria(Item.class).list(), this is how the resulting SQL statement looks:

    select i.*, b.*
    from ITEM i
        left outer join BID b on i.ITEM_ID = b.ITEM_ID

The resultset now contains many rows, with duplicate data for each Item that has many bids, and NULL fillers for all Item objects that don’t have bids. Look at the resultset in figure 13.5.
Figure 13.5 Outer join fetching of associated collection elements

Hibernate creates three persistent Item instances, as well as four Bid instances, and links them all together in the persistence context so that you can navigate this graph and iterate through collections, even when the persistence context is closed and all objects are detached.

Eager-fetching collections using inner joins is conceptually possible, and we’ll do this later in HQL queries. However, it wouldn’t make sense to cut off all the Item objects without bids in a global fetching strategy in mapping metadata, so there is no option for global inner join eager fetching of collections.

With Java Persistence annotations, you enable eager fetching with a FetchType annotation attribute:

    @Entity
    public class Item {

        @ManyToOne(fetch = FetchType.EAGER)
        private User seller;

        @OneToMany(fetch = FetchType.EAGER)
        private Set<Bid> bids = new HashSet<Bid>();
    }

This mapping example should look familiar: You used it to disable lazy loading of an association and a collection earlier. Hibernate by default interprets this as an eager fetch that shouldn’t be executed with an immediate second SELECT, but with a JOIN in the initial query.

You can keep the FetchType.EAGER Java Persistence annotation but switch from join fetching to an immediate second select explicitly by adding a Hibernate extension annotation:

    @Entity
    public class Item {

        @ManyToOne(fetch = FetchType.EAGER)
        @org.hibernate.annotations.Fetch(
            org.hibernate.annotations.FetchMode.SELECT
        )
        private User seller;
    }

If an Item instance is loaded, Hibernate will eagerly load the seller of this item with an immediate second SELECT.

Finally, we have to introduce a global Hibernate configuration setting that you can use to control the maximum number of joined entity associations (not collections). Consider all many-to-one and one-to-one association mappings you’ve set to fetch="join" (or FetchType.EAGER) in your mapping metadata.
Let’s assume that Item has a successfulBid association, that Bid has a bidder, and that User has a shippingAddress. If all these associations are mapped with fetch="join", how many tables are joined and how much data is retrieved when you load an Item?

The number of tables joined in this case depends on the global hibernate.max_fetch_depth configuration property. By default, no limit is set, so loading an Item also retrieves a Bid, a User, and an Address in a single select. Reasonable settings are small, usually between 1 and 5. You may even disable join fetching for many-to-one and one-to-one associations by setting the property to 0! (Note that some database dialects may preset this property: For example, MySQLDialect sets it to 2.)

SQL queries also get more complex if inheritance or joined mappings are involved. You need to consider a few extra optimization options whenever secondary tables are mapped for a particular entity class.

13.2.4 Optimizing fetching for secondary tables

If you query for objects that are of a class which is part of an inheritance hierarchy, the SQL statements get more complex:

    List result = session.createQuery("from BillingDetails").list();

This operation retrieves all BillingDetails instances. The SQL SELECT now depends on the inheritance mapping strategy you’ve chosen for BillingDetails and its subclasses CreditCard and BankAccount. Assuming that you’ve mapped them all to one table (a table-per-hierarchy), the query isn’t any different than the one shown in the previous section. However, if you’ve mapped them with implicit polymorphism, this single HQL operation may result in several SQL SELECTs against each table of each subclass.
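As an aside on the hibernate.max_fetch_depth property discussed above, the following sketch models how the depth limit might bound the tables pulled into the initial SELECT for the Item, successfulBid, bidder, shippingAddress chain. It is a simplified model, not Hibernate's join-walking code: it assumes one table per association hop, uses a negative value to stand for "no limit set", and the ADDRESS table name is an assumption (the text names only ITEM, BID, and USERS).

```java
import java.util.List;

// Simplified model (an assumption, not Hibernate's actual implementation):
// each hop along a chain of fetch="join" associations adds one table to the
// initial SELECT until the configured depth limit is reached.
final class FetchDepthSketch {

    // Root table followed by the join-fetched chain
    // Item -> successfulBid -> bidder -> shippingAddress.
    // ADDRESS is an assumed table name.
    static final List<String> CHAIN = List.of("ITEM", "BID", "USERS", "ADDRESS");

    // A negative maxFetchDepth stands for "no limit set" in this sketch.
    static List<String> tablesInInitialSelect(int maxFetchDepth) {
        int hops = CHAIN.size() - 1;
        int followed = maxFetchDepth < 0 ? hops : Math.min(maxFetchDepth, hops);
        return CHAIN.subList(0, followed + 1);
    }

    public static void main(String[] args) {
        for (int depth : new int[] {-1, 0, 1, 2}) {
            System.out.println("max_fetch_depth=" + depth
                    + " -> " + tablesInInitialSelect(depth));
        }
    }
}
```

Under this model, a depth of 0 leaves only the root table in the initial statement, which matches the remark that 0 disables join fetching for single-ended associations.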
Outer joins for a table-per-subclass hierarchy

If you map the hierarchy in a normalized fashion (see the tables and mapping in chapter 5, section 5.1.4, “Table per subclass”), all subclass tables are OUTER JOINed in the initial statement:

    select
        b1.BILLING_DETAILS_ID, b1.OWNER, b1.USER_ID,
        b2.NUMBER, b2.EXP_MONTH, b2.EXP_YEAR,
        b3.ACCOUNT, b3.BANKNAME, b3.SWIFT,
        case
            when b2.CREDIT_CARD_ID is not null then 1
            when b3.BANK_ACCOUNT_ID is not null then 2
            when b1.BILLING_DETAILS_ID is not null then 0
        end as clazz
    from
        BILLING_DETAILS b1
        left outer join CREDIT_CARD b2
            on b1.BILLING_DETAILS_ID = b2.CREDIT_CARD_ID
        left outer join BANK_ACCOUNT b3
            on b1.BILLING_DETAILS_ID = b3.BANK_ACCOUNT_ID

This is already an interesting query. It joins three tables and utilizes a CASE ... WHEN ... END expression to fill in the clazz column with a number between 0 and 2. Hibernate can then read the resultset and decide, on the basis of this number, which class each returned row represents an instance of.

Many database-management systems limit the maximum number of tables that can be combined with an OUTER JOIN. You’ll possibly hit that limit if you have a wide and deep inheritance hierarchy mapped with a normalized strategy (we’re talking about inheritance hierarchies that should be reconsidered to accommodate the fact that after all, you’re working with an SQL database).

Switching to additional selects

In mapping metadata, you can then tell Hibernate to switch to a different fetching strategy. You want some parts of your inheritance hierarchy to be fetched with immediate additional SELECT statements, not with an OUTER JOIN in the initial query.
The only way to enable this fetching strategy is to refactor the mapping slightly, as a mix of table-per-hierarchy (with a discriminator column) and table-per-subclass with the <join> mapping:

    <class name="BillingDetails"
           table="BILLING_DETAILS"
           abstract="true">

        <id name="id" column="BILLING_DETAILS_ID" />

        <discriminator column="BILLING_DETAILS_TYPE" type="string"/>

        <subclass name="CreditCard" discriminator-value="CC">
            <join table="CREDIT_CARD" fetch="select">
                <key column="CREDIT_CARD_ID"/>
            </join>
        </subclass>

        <subclass name="BankAccount" discriminator-value="BA">
            <join table="BANK_ACCOUNT" fetch="join">
                <key column="BANK_ACCOUNT_ID"/>
            </join>
        </subclass>

    </class>

This mapping breaks out the CreditCard and BankAccount classes each into its own table but preserves the discriminator column in the superclass table. The fetching strategy for CreditCard objects is select, whereas the strategy for BankAccount is the default, join. Now, if you query for all BillingDetails, the following SQL is produced:

    select
        b1.BILLING_DETAILS_ID, b1.OWNER, b1.USER_ID,
        b2.ACCOUNT, b2.BANKNAME, b2.SWIFT,
        b1.BILLING_DETAILS_TYPE as clazz
    from
        BILLING_DETAILS b1
        left outer join BANK_ACCOUNT b2
            on b1.BILLING_DETAILS_ID = b2.BANK_ACCOUNT_ID

    select cc.NUMBER, cc.EXP_MONTH, cc.EXP_YEAR
    from CREDIT_CARD cc where cc.CREDIT_CARD_ID = ?

    select cc.NUMBER, cc.EXP_MONTH, cc.EXP_YEAR
    from CREDIT_CARD cc where cc.CREDIT_CARD_ID = ?

The first SQL SELECT retrieves all rows from the superclass table and all rows from the BANK_ACCOUNT table. It also returns discriminator values for each row as the clazz column. Hibernate now executes an additional select against the CREDIT_CARD table for each row of the first result that had the right discriminator for a CreditCard. In other words, two queries mean that two rows in the BILLING_DETAILS superclass table represent (part of) a CreditCard object.
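To make the statement count concrete, here is a minimal sketch that tallies how many SQL statements the mixed mapping would produce for a given set of discriminator values. It only models the counting rule described above (one initial SELECT, plus one secondary SELECT per row whose discriminator is "CC"); the class and method names are hypothetical, and nothing here calls Hibernate.

```java
// Counting model only: one initial SELECT, then one immediate SELECT against
// CREDIT_CARD for every row of the first result whose discriminator is "CC".
final class SecondarySelectSketch {

    // Discriminator value assigned to CreditCard in the mapping shown in the text.
    static final String CREDIT_CARD = "CC";

    static int statementsFor(String... discriminatorValues) {
        int statements = 1; // initial SELECT over BILLING_DETAILS joined with BANK_ACCOUNT
        for (String value : discriminatorValues) {
            if (CREDIT_CARD.equals(value)) {
                statements++; // secondary SELECT for this credit card row
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        // Two credit cards and one bank account, as in the SQL shown in the text.
        System.out.println(statementsFor("CC", "BA", "CC"));
    }
}
```

The count grows linearly with the number of CreditCard rows, which is one reason to treat fetch="select" on a <join> as a targeted workaround rather than a default.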
This kind of optimization is rarely necessary, but you now also know that you can switch from a default join fetching strategy to an additional immediate select whenever you deal with a <join> mapping.

We’ve now completed our journey through all options you can set in mapping metadata to influence the default fetch plan and fetching strategy. You learned how to define what should be loaded by manipulating the lazy attribute, and how it should be loaded by setting the fetch attribute. In annotations, you use FetchType.LAZY and FetchType.EAGER, and you use Hibernate extensions for more fine-grained control of the fetch plan and strategy.

Knowing all the available options is only one step toward an optimized and efficient Hibernate or Java Persistence application. You also need to know when and when not to apply a particular strategy.

13.2.5 Optimization guidelines

By default, Hibernate never loads data that you didn’t ask for, which reduces the memory consumption of your persistence context. However, it also exposes you to the so-called n+1 selects problem. If every association and collection is initialized only on demand, and you have no other strategy configured, a particular procedure may well execute dozens or even hundreds of queries to get all the data you require. You need the right strategy to avoid executing too many SQL statements.

If you switch from the default strategy to queries that eagerly fetch data with joins, you may run into another problem, the Cartesian product issue. Instead of executing too many SQL statements, you may now (often as a side effect) create statements that retrieve too much data.

You need to find the middle ground between the two extremes: the correct fetching strategy for each procedure and use case in your application.
You need to know which global fetch plan and strategy you should set in your mapping metadata, and which fetching strategy you apply only for a particular query (with HQL or Criteria).

We now introduce the basic problems of too many selects and Cartesian products and then walk you through optimization step by step.

The n+1 selects problem

The n+1 selects problem is easy to understand with some example code. Let’s assume that you don’t configure any fetch plan or fetching strategy in your mapping metadata: Everything is lazy and loaded on demand. The following example code tries to find the highest Bids for all Items (there are many other ways to do this more easily, of course):

    List<Item> allItems = session.createQuery("from Item").list();
    // List<Item> allItems = session.createCriteria(Item.class).list();

    Map<Item, Bid> highestBids = new HashMap<Item, Bid>();

    for (Item item : allItems) {
        Bid highestBid = null;
        for (Bid bid : item.getBids()) { // Initialize the collection
            if (highestBid == null)
                highestBid = bid;
            if (bid.getAmount() > highestBid.getAmount())
                highestBid = bid;
        }
        highestBids.put(item, highestBid);
    }

First you retrieve all Item instances; there is no difference between HQL and Criteria queries. This query triggers one SQL SELECT that retrieves all rows of the ITEM table and returns n persistent objects. Next, you iterate through this result and access each Item object.

What you access is the bids collection of each Item. This collection isn’t initialized so far, so the Bid objects for each item have to be loaded with an additional query. This whole code snippet therefore produces n+1 selects.

You always want to avoid n+1 selects.
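The counting argument can be reproduced without a database. The sketch below is a toy model, not Hibernate: a hypothetical Item class initializes its bids collection on first access and increments a counter each time a simulated SELECT would run. Loading n items and touching every collection then yields n+1 statements.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of lazy collection loading; all names here are hypothetical.
final class NPlusOneSketch {

    static int selects; // counts simulated SQL SELECT statements

    // Stand-in for a persistent Item with a lazy bids collection.
    static final class Item {
        private List<Integer> bids; // null means "not initialized yet"

        List<Integer> getBids() {
            if (bids == null) {
                selects++;              // on-demand load of this one collection
                bids = new ArrayList<>();
            }
            return bids;
        }
    }

    static List<Item> loadAllItems(int n) {
        selects++;                      // the single "from Item" query
        List<Item> items = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            items.add(new Item());
        }
        return items;
    }

    static int selectsForIteration(int n) {
        selects = 0;
        for (Item item : loadAllItems(n)) {
            item.getBids().size();      // touching the collection initializes it
        }
        return selects;
    }

    public static void main(String[] args) {
        System.out.println(selectsForIteration(100)); // n + 1 statements
    }
}
```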
A first solution could be a change of your global mapping metadata for the collection, enabling prefetching in batches:

    <set name="bids"
         inverse="true"
         batch-size="10">
        <key column="ITEM_ID"/>
        <one-to-many class="Bid"/>
    </set>

Instead of n+1 selects, you now see n/10+1 selects to retrieve the required collections into memory. This optimization seems reasonable for an auction application: “Only load the bids for an item when they’re needed, on demand. But if one collection of bids must be loaded for a particular item, assume that other item objects in the persistence context also need their bids collections initialized. Do this in batches, because it’s somewhat likely that not all item objects need their bids.”

With a subselect-based prefetch, you can reduce the number of selects to exactly two:

    <set name="bids"
         inverse="true"
         fetch="subselect">
        <key column="ITEM_ID"/>
        <one-to-many class="Bid"/>
    </set>

The first query in the procedure now executes a single SQL SELECT to retrieve all Item instances. Hibernate remembers this statement and applies it again when you hit the first uninitialized collection. All collections are initialized with the second query. The reasoning for this optimization is slightly different: “Only load the bids for an item when they’re needed, on demand. But if one collection of bids must be loaded for a particular item, assume that all other item objects in the persistence context also need their bids collection initialized.”

Finally, you can effectively turn off lazy loading of the bids collection and switch to an eager fetching strategy that results in only a single SQL SELECT:

    <set name="bids"
         inverse="true"
         fetch="join">
        <key column="ITEM_ID"/>
        <one-to-many class="Bid"/>
    </set>

This seems to be an optimization you shouldn’t make. Can you really say that “whenever an item is needed, all its bids are needed as well”? Fetching strategies in mapping metadata work on a global level.
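The select counts quoted for the three mapping options can be put side by side. This sketch is an idealized tally under stated assumptions: every one of the n items has its bids collection touched, and batch fetching always fills each batch completely (Hibernate's actual batch loader may group the remainder differently). The method names are hypothetical.

```java
// Idealized statement counts for loading n items and touching every bids collection.
final class FetchStrategyCounts {

    // Default lazy loading: one query for the items, one per collection.
    static int lazy(int n) {
        return n + 1;
    }

    // batch-size="b": one query for the items, then ceil(n / b) batch loads.
    static int batch(int n, int batchSize) {
        return 1 + (n + batchSize - 1) / batchSize;
    }

    // fetch="subselect": one query for the items, one for all collections
    // (no second query is needed if there are no items at all).
    static int subselect(int n) {
        return n == 0 ? 1 : 2;
    }

    // fetch="join": everything arrives in the initial statement.
    static int join(int n) {
        return 1;
    }

    public static void main(String[] args) {
        int n = 100;
        System.out.println("lazy=" + lazy(n)
                + " batch(10)=" + batch(n, 10)
                + " subselect=" + subselect(n)
                + " join=" + join(n));
    }
}
```

Fewer statements is not automatically better: the single joined statement returns the widest and most repetitive resultset of the four.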
We don’t consider fetch="join" a common optimization for collection mappings; you rarely need a fully initialized collection all the time. In addition to resulting in higher memory consumption, every OUTER JOINed collection is a step toward a more serious Cartesian product problem, which we’ll explore in more detail soon.

In practice, you’ll most likely enable a batch or subselect strategy in your mapping metadata for the bids collection. If a particular procedure, such as this, requires all the bids for each Item in memory, you modify the initial HQL or Criteria query and apply a dynamic fetching strategy:

    List<Item> allItems =
        session.createQuery("from Item i left join fetch i.bids")
               .list();

    List<Item> allItems =
        session.createCriteria(Item.class)
               .setFetchMode("bids", FetchMode.JOIN)
               .list();

    // Iterate through the collections

Both queries result in a single SELECT that retrieves the bids for all Item instances with an OUTER JOIN (as it would if you had mapped the collection with fetch="join").

This is likely the first time you’ve seen how to define a fetching strategy that isn’t global. The global fetch plan and fetching strategy settings you put in your mapping metadata are just that: global defaults that always apply. Any optimization process also needs more fine-grained rules, fetching strategies and fetch plans that are applicable for only a particular procedure or use case. We’ll have much more to say about fetching with HQL and Criteria in the next chapter. All you need to know now is that these options exist.

The n+1 selects problem appears in more situations than just when you work with lazy collections. Uninitialized proxies expose the same behavior: You may need many SELECTs to initialize all the objects you’re working with in a particular procedure.
The optimization guidelines we’ve shown are the same, but there is one exception: The fetch="join" setting on <many-to-one> or <one-to-one> associations is a common optimization, as is a @ManyToOne(fetch = FetchType.EAGER) [...]

... Java Persistence

Table 13.2 Hibernate and JPA comparison chart for chapter 13

Hibernate Core: Hibernate supports fetch-plan definition with lazy loading through proxies or based on interception.
Java Persistence and EJB 3.0: Hibernate implements a Java Persistence provider with proxy or interception-based lazy loading.

Hibernate Core: Hibernate allows fine-grained control over fetch-plan and fetching strategies.
Java Persistence and EJB 3.0: Java Persistence...

... the persistence context, in detail. Let’s go straight to the optional second-level cache.

The Hibernate second-level cache

The Hibernate second-level cache has process or cluster scope: All persistence contexts that have been started from a particular SessionFactory (or are associated with EntityManagers of a particular persistence...

Figure 13.7 Hibernate’s two-level cache architecture

... optimization, so naturally it isn’t part of the Java Persistence or EJB 3.0 specification. Every vendor provides different solutions for optimization, in particular any second-level caching. All strategies and options we present in this section work for a native Hibernate application or an application that depends on Java Persistence interfaces and uses Hibernate as a persistence provider.

A cache keeps a representation...

... mapping files, you can centralize cache configuration in your hibernate.cfg.xml:

    </hibernate-configuration>

You enabled transactional caching for...
    tx.commit();
    session.close();

Setting CacheMode.IGNORE tells Hibernate not to interact with the second-level cache in this particular Session. The available options are as follows:

■ CacheMode.NORMAL: The default behavior.
■ CacheMode.IGNORE: Hibernate never interacts with the second-level cache except to invalidate cached items when updates occur.
■ CacheMode.GET: Hibernate may read items from the second-level cache,...

... close to the Hibernate caching system.

13.3.2 The Hibernate cache architecture

As we hinted earlier, Hibernate has a two-level cache architecture. The various elements of this system can be seen in figure 13.7:

■ The first-level cache is the persistence context cache. A Hibernate Session lifespan corresponds to either a single request (usually implemented with one database transaction) or a conversation...

... fetch plan declaration, Hibernate extensions are used for fine-grained fetching strategy optimization.

Hibernate Core: Hibernate provides an optional second-level class and collection data cache, configured in a Hibernate configuration file or XML mapping files.
Java Persistence and EJB 3.0: Use Hibernate annotations for declaration of the cache concurrency strategy on entities and collections.

The next chapters deal exclusively with querying and how...

... on the size of the collections you’re retrieving: 3 times 2, 1 times 1, plus 1, total 8 result rows.

Figure 13.6 A product is the result of two outer joins with many rows

Now imagine that you have 1,000 items in the database, and each item has 20 bids and 5 images: you’ll see a resultset with possibly 100,000 rows! The size of this result may well be several megabytes. Considerable...
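The row arithmetic quoted above can be checked with a short sketch. It assumes one reading of the fragment: three items, the first with 3 bids and 2 images, the second with 1 bid and 1 image, and the third with neither. It also assumes that an outer join still produces one NULL-filled row for an empty collection. The class and method names are hypothetical.

```java
// Row count of a resultset that outer-joins two collections (bids and images)
// to each Item in the same statement.
final class ProductRowsSketch {

    // Rows contributed by one Item: the two collections multiply, and an
    // empty collection still yields a single NULL-filled row.
    static long rowsForItem(int bids, int images) {
        return (long) Math.max(bids, 1) * Math.max(images, 1);
    }

    // Each entry of items is {bids, images} for one Item.
    static long rows(int[][] items) {
        long total = 0;
        for (int[] item : items) {
            total += rowsForItem(item[0], item[1]);
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed distribution: 3 x 2, 1 x 1, plus 1 row for an item with no elements.
        System.out.println(rows(new int[][] {{3, 2}, {1, 1}, {0, 0}}));
        // 1,000 items with 20 bids and 5 images each.
        System.out.println(1000 * rowsForItem(20, 5));
    }
}
```

Because the per-item factors multiply rather than add, each additional join-fetched collection scales the resultset by its average size.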
associations that are eagerly fetched with outer-join SELECTs don’t create a product, by nature.

Finally, although Hibernate lets you create Cartesian products with fetch="join" on two (or even more) parallel collections, it throws an exception if you try to enable fetch="join" on parallel <bag> collections. The resultset of a product can’t be converted into bag collections, because Hibernate can’t know which rows...

... supported, assuming that clocks are synchronized in the cluster.

It’s easy to write an adaptor for other products by implementing org.hibernate.cache.CacheProvider. Many commercial caching systems are pluggable into Hibernate with this interface.

Not every cache provider is compatible with every concurrency strategy! The compatibility matrix in table 13.1 will help you choose an appropriate combination.

Setting ...