CHAPTER 6 • ORACLE SCHEMA OBJECT MANAGEMENT

Indexed Clusters

An indexed cluster (or just cluster) contains two or more tables. The cluster key is a column or multiple columns that can be used to join the data in these tables. When the tables of the cluster are populated with data, the rows of the tables that share a cluster key value are stored in the same block, so the rows of the tables are related based on the cluster key columns. This has the basic effect of denormalizing the tables. Thus, if you are going to cluster tables, they should be tightly related based on the keys.

The cluster key of the cluster is defined when you create the cluster. It can consist of up to 16 columns, and the cluster key size is limited to roughly half the database block size. Also, the cluster key cannot consist of a LONG, LONG RAW, or LOB datatype (CLOB, BLOB, and so on).

One benefit of clustering tables is that the cluster key is stored only once. This results in a slight improvement in performance, because the overall size of the clustered tables is smaller than the size of two individual tables storing the same data, so less I/O is required. Another benefit of clusters is that, unlike indexes, they do not brown-out. Thus, performance of SELECT statements should not be negatively impacted by ongoing DML activity, as can happen with indexes.

Clusters can improve join performance, but this can be at the cost of slower performance on scans of the individual tables in the cluster and on any DML activity on the cluster. To avoid problems, you should cluster only tables that are commonly joined together and that have little DML activity.

Hash Clusters

A hash cluster is an alternative to storing data in a table and then creating an index on that table. A hash cluster is created when you include the HASHKEYS clause in the CREATE CLUSTER command; if you specify neither INDEX nor HASHKEYS, Oracle creates an indexed cluster. In a hash cluster, the cluster key values are converted into a hash value. This hash value is then stored along with the data associated with that key.
The hash value is calculated by using a hashing algorithm, which is simply a mathematical way of generating an identifier from a key. (The same keys would generate the same hash value, of course.) You control the number of hash keys through the HASHKEYS parameter of the CREATE CLUSTER command. Thus, the total number of possible hash values is defined when the cluster is created. Be careful to choose this number correctly, or you will find that different keys may end up being stored together.

There are two different kinds of hash clusters:

• With a normal hash cluster, Oracle will convert the values of the WHERE clause, if they contain the cluster key, to a hash value. It will then use the hash value as an offset value in the hash cluster, allowing Oracle to quickly go to the row being requested.

• With a single-table hash cluster, the hash keys relate directly to the key in the table stored in the hash cluster. This eliminates the scan that would be required on a normal hash cluster, since Oracle would scan the entire cluster for a single table lookup by default.

Copyright ©2002 SYBEX, Inc., Alameda, CA www.sybex.com

When choosing the cluster key, you should consider only columns with a high degree of cardinality. You should also be careful about using hash clusters when a table will be subject to a great many range searches, such as searches on date ranges. Hash clusters generally are used when there is a constant need to look up specific and unique values, such as those you might find in a primary key or a unique index.

For the hashing algorithm, you have three options. The first is to use Oracle’s internal algorithm to convert the cluster key values into the correct hash value. This works well in most cases. You can also choose to use the cluster key itself as the hash value if your cluster key is some uniform value, such as a series of numbers generated by a sequence.
The other option is a user-defined algorithm in the form of a PL/SQL function.

Creating Clusters

You use the CREATE CLUSTER command to create both indexed clusters and hash clusters. In order to create a cluster in your schema, you need the CREATE CLUSTER privilege. To create a cluster in another schema, you must have the CREATE ANY CLUSTER privilege. You also need an appropriate quota on the tablespace in which you wish to create the clusters.

Creating an Indexed Cluster

You use the CREATE CLUSTER command to create indexed clusters in your database. Listing 6.17 shows an example of creating an indexed cluster.

Listing 6.17: Creating an Indexed Cluster

    CREATE CLUSTER parent_child (parent_id NUMBER)
    INDEX
    SIZE 512
    STORAGE (INITIAL 100k NEXT 100k);

It is the keyword INDEX in the CREATE CLUSTER command that makes this an indexed cluster. If you omit both INDEX and the HASHKEYS clause, Oracle still creates an indexed cluster; it is the HASHKEYS clause that produces a hash cluster.

When you create the cluster, you define how much space is reserved for the rows identified with each cluster key by using the SIZE parameter. The value associated with the SIZE keyword tells Oracle how much space to reserve for all rows with the same cluster key value (or the same hash value, in a hash cluster), and this value should be a divisor of the database block size. If SIZE is not a divisor of the database block size, Oracle will use the next largest divisor. For example, if you have 1600 bytes left in the block, and SIZE is set at 512, then you will be able to store three cluster key sets within that block.

MANAGING CLUSTERS Oracle Database Administration PART II

WARNING It is important to set the SIZE parameter correctly for both indexed clusters and hash clusters. If you set SIZE too small, you can cause rows to be chained, which can cause additional I/O.
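The 1600-byte example above is plain integer division. A minimal sketch of that arithmetic follows; the function name key_sets_per_block is made up for illustration, and the model deliberately ignores block header and row overhead, which a real block also has to accommodate.

```python
def key_sets_per_block(free_bytes: int, size: int) -> int:
    """Return how many SIZE-byte reservations fit in the free space of one block.

    Simplified model (an assumption): block header and per-row overhead are ignored.
    """
    if free_bytes < 0 or size <= 0:
        raise ValueError("free_bytes must be >= 0 and size must be > 0")
    return free_bytes // size


if __name__ == "__main__":
    # Figures from the text: 1600 bytes free in the block, SIZE set to 512.
    fits = key_sets_per_block(1600, 512)
    unused = 1600 - fits * 512
    print(fits, unused)  # prints: 3 64
```

The 64 leftover bytes in this example are too small to hold another 512-byte reservation, which is why only three cluster key sets fit.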
After you have created the cluster, you can add tables to the cluster. Then you need to create a cluster index before you can use the cluster or the tables in the cluster.

Adding Tables to an Indexed Cluster

Listing 6.18 shows an example of creating two tables and adding them to the cluster created in Listing 6.17.

Listing 6.18: Adding Tables to a Cluster

    CREATE TABLE parent
    (parent_id       NUMBER PRIMARY KEY,
     last_name       VARCHAR2(30) NOT NULL,
     first_name      VARCHAR2(30) NOT NULL,
     middle_int      CHAR(1),
     sex             CHAR(1),
     married_status  CHAR(1)
    )
    CLUSTER parent_child (parent_id);

    CREATE TABLE children
    (child_id      NUMBER
                   CONSTRAINT pk_children PRIMARY KEY
                   USING INDEX TABLESPACE indexes
                   STORAGE (INITIAL 200k NEXT 200k),
     parent_id     NUMBER,
     last_name     VARCHAR2(30),
     first_name    VARCHAR2(30),
     middle_int    CHAR(1),
     medical_code  VARCHAR2(30)
                   CONSTRAINT children_check_upper
                   CHECK (medical_code = UPPER(medical_code) )
    )
    CLUSTER parent_child (parent_id);

Notice that you can still define primary keys on the tables, use the USING INDEX clause to define where the primary key index is to be stored, and include the STORAGE clause. You can create indexes on tables in clusters, just as you would for any other table.

Creating a Cluster Index

Once you have set up the cluster, you need to create a cluster index so that you can add rows to your cluster. Until this step is done, no data can be added to any of the tables that exist in the cluster. To create the cluster index, you use the CREATE INDEX command with the ON CLUSTER keywords, as shown in the following example:

    CREATE INDEX ic_parent_children ON CLUSTER parent_child;

After creating the cluster index, you can work with the tables in the cluster.

NOTE If you accidentally drop the cluster index, you will not lose the data in the cluster.
However, you will not be able to use the tables in the cluster until the cluster index is re-created.

Creating a Hash Cluster

Creating a hash cluster is similar to creating an indexed cluster. The differences are that you omit the INDEX keyword and add the HASHKEYS keyword. Listing 6.19 shows an example of using a CREATE CLUSTER statement to create a hash cluster.

Listing 6.19: Creating a Hash Cluster

    CREATE CLUSTER parent_child (parent_id NUMBER)
    SIZE 512
    HASHKEYS 1000
    STORAGE (INITIAL 100k NEXT 100k);

The SIZE and HASHKEYS keywords are used to calculate how much space to allocate to the cluster. The SIZE keyword defines how much of each block will be used to store a single set of cluster key rows. This value determines the number of cluster key values that will be stored in each block of the cluster. SIZE is defined in bytes, and you can also append a k or m to indicate that the number is in kilobytes or megabytes.

The HASHKEYS clause (which is not used when creating an indexed cluster) defines the number of cluster key values that are expected in the cluster. Oracle will round the number chosen up to the next prime number.

It is important to try to calculate the HASHKEYS and SIZE values as exactly as possible, but this is sometimes a difficult task. If you already have the data stored in another database, or perhaps in a table that you were thinking of moving to a hash cluster, it might be easier to determine the settings for these values. You could simply set HASHKEYS to the number of unique rows in the table, based on a select set of columns that make up either a primary or unique key, or on a pseudo unique key if one does not exist. The SIZE value is a bit more complicated.
SIZE could be calculated by first taking the overall size of the data in the object that you are thinking about moving to a hash cluster, and dividing that size by the number of unique key values. This will give you a starting point for the SIZE parameter. (I would add on a bit for overhead and a fudge factor.) By default, Oracle will allocate one block for every cluster key value (which is potentially very expensive). Also, the value given by SIZE cannot be greater than the block size of the database. If it is, Oracle will use the database block size instead.

TIP Like many other things in the database world, there is no exact science to calculating cluster sizes. It’s important to get as close as you can to an accurate figure, but this may not be possible until you have seen how the data is actually going to come in and load. In practice, sizing may require one or two reorganizations of the object. So, sizing typically involves doing your best to calculate the right numbers, and then adjusting those numbers based on ongoing operations.

Both SIZE and HASHKEYS can have a significant impact on performance. If you allocate too much space (by making either SIZE or HASHKEYS too large), you will be wasting space. Since fewer data rows will exist per block, full scans (not using the hash key) or range scans of the hash cluster will be degraded. Also, Oracle uses the values of SIZE and HASHKEYS, along with the values of INITIAL and NEXT, to determine the initial extent allocations for the hash cluster.

The total amount of space that will be allocated to a hash cluster when it is created is the greater of SIZE * HASHKEYS or INITIAL. Thus, when the hash cluster is created, all of the initial space for the expected number of hash keys is mapped out in a structure called a hash table. Subsequent extents are created as overflow space and will be generated based on the NEXT parameter of the STORAGE clause.
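The sizing rules described above can be worked through numerically. The following is only a sketch under stated assumptions: the function names are hypothetical, the 25 percent fudge factor and the 400,000-byte data volume are arbitrary illustrative figures, and next_prime treats a number that is already prime as its own result, which the text does not specify.

```python
def next_prime(n: int) -> int:
    """Return the smallest prime >= n (trial division; fine for HASHKEYS-sized inputs)."""
    def is_prime(k: int) -> bool:
        if k < 2:
            return False
        i = 2
        while i * i <= k:
            if k % i == 0:
                return False
            i += 1
        return True

    candidate = max(n, 2)
    while not is_prime(candidate):
        candidate += 1
    return candidate


def estimate_size(total_data_bytes: int, unique_keys: int, overhead_pct: int = 25) -> int:
    """Starting value for SIZE: average bytes per key plus a fudge factor, rounded up."""
    if unique_keys <= 0:
        raise ValueError("unique_keys must be positive")
    numerator = total_data_bytes * (100 + overhead_pct)
    denominator = unique_keys * 100
    return -(-numerator // denominator)  # ceiling division on integers


def initial_allocation(size: int, hashkeys: int, initial: int) -> int:
    """Space mapped out at creation: the greater of SIZE * HASHKEYS or INITIAL."""
    return max(size * hashkeys, initial)


if __name__ == "__main__":
    print(next_prime(1000))                          # prints: 1009
    print(estimate_size(400_000, 1000))              # prints: 500
    # Listing 6.19: SIZE 512, HASHKEYS 1000, INITIAL 100k
    print(initial_allocation(512, 1000, 100 * 1024)) # prints: 512000
```

With the Listing 6.19 values, SIZE * HASHKEYS (512,000 bytes) exceeds INITIAL (102,400 bytes), so the product is what determines the space mapped out for the hash table in this model.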
Any rows added to the cluster will first be added to the initial block in the hash table, based on the hash value of the cluster key column(s) in the row. If the block that is assigned to that cluster key is full, then the row will be stored in an overflow block, and a pointer will be stored in the block where the row should have gone. This can cause additional I/O operations, and it is why it is important to size your clusters correctly.

WARNING Incorrectly setting the SIZE value can cause chaining to occur. This is because Oracle does not guarantee that any hashed data will remain in a given block (and often, this may not even be possible). Carefully set the SIZE parameter to avoid chaining.

Once you have created the hash cluster, you add tables to the cluster, just as you would with an indexed cluster (see Listing 6.18). You do not need to create a cluster index, as you do for an indexed cluster. After you add the tables to the hash cluster, the cluster is ready for use.

Altering Clusters

You can use the ALTER CLUSTER command to allocate an additional extent to an indexed cluster (you cannot allocate an additional extent to a hash cluster) or to de-allocate an unused extent. You can also modify the STORAGE clause or the PCTFREE and PCTUSED settings for the cluster, as well as the other settings of the physical attributes clause. Here is an example of modifying the STORAGE parameters of a cluster:

    ALTER CLUSTER parent_child STORAGE (NEXT 200k);

Dropping Clusters

To drop an indexed cluster, you must first drop the underlying cluster index, and then you must drop the underlying tables of the cluster. There is no way to decluster a table. Thus, you may need to use the CREATE TABLE AS SELECT command to move the data from tables in the cluster to tables outside the cluster.
After you remove the cluster index and the tables of the cluster, you can use the DROP CLUSTER command. If you don’t care about the data in the tables of the cluster, you may use the optional parameter INCLUDING TABLES in the DROP CLUSTER command, and Oracle will remove the underlying tables for you. You may also need to include the CASCADE CONSTRAINTS clause if the tables have referential integrity constraints that will need to be dropped. Here is an example of dropping the PARENT_CHILD cluster created in Listing 6.19, including the tables:

    DROP CLUSTER parent_child INCLUDING TABLES CASCADE CONSTRAINTS;

Viewing Cluster Information

The DBA_CLUSTERS view provides information about the clusters in your database. The following is a description of the view.

    SQL> DESC dba_clusters
     Name                 Null?    Type
     -------------------- -------- ------------
     OWNER                NOT NULL VARCHAR2(30)
     CLUSTER_NAME         NOT NULL VARCHAR2(30)
     TABLESPACE_NAME      NOT NULL VARCHAR2(30)
     PCT_FREE                      NUMBER
     PCT_USED             NOT NULL NUMBER
     KEY_SIZE                      NUMBER
     INI_TRANS            NOT NULL NUMBER
     MAX_TRANS            NOT NULL NUMBER
     INITIAL_EXTENT                NUMBER
     NEXT_EXTENT                   NUMBER
     MIN_EXTENTS          NOT NULL NUMBER
     MAX_EXTENTS          NOT NULL NUMBER
     PCT_INCREASE                  NUMBER
     FREELISTS                     NUMBER
     FREELIST_GROUPS               NUMBER
     AVG_BLOCKS_PER_KEY            NUMBER
     CLUSTER_TYPE                  VARCHAR2(5)
     FUNCTION                      VARCHAR2(15)
     HASHKEYS                      NUMBER
     DEGREE                        VARCHAR2(10)
     INSTANCES                     VARCHAR2(10)
     CACHE                         VARCHAR2(5)
     BUFFER_POOL                   VARCHAR2(7)
     SINGLE_TABLE                  VARCHAR2(5)

Here is an example of a query against the DBA_CLUSTERS view and its results:

    SELECT owner, cluster_name, tablespace_name
    FROM dba_clusters
    WHERE owner NOT LIKE 'SYS';

    OWNER  CLUSTER_NAME                   TABLESPACE
    ------ ------------------------------ ----------
    SCOTT  TEST                           USERS
    SCOTT  TEST2                          SYSTEM

Managing Rollback Segments

When creating a database, the DBA needs to carefully consider the number, size, and extents to be allocated to rollback segments in the database. Rollback segments are used to provide rollback of incomplete transactions, read consistency, and database recovery.

Planning Rollback Segments

There are as many differing opinions about how to initially size a rollback segment as there are models of cars on the road. The answer to the question of how many rollback segments and what size to make them is the same as the answer to many such DBA questions: It depends.
There are a few considerations for planning rollback segments:

• The average size of a transaction

• The total number of concurrent transactions

• The type and frequency of transactions

The main concerns with rollback segments in terms of performance and operational success are the appropriate sizing of the rollback segment and contention for a given rollback segment. If you do not have enough rollback segments allocated, contention for rollback segments can occur.

You also need to have the appropriate sizing of the rollback segment tablespace. A single large job may cause a rollback segment to extend and take up the entire tablespace. If the rollback segment has OPTIMAL set (which should always be the case!), it will eventually shrink after the transaction using that rollback segment has completed or failed. In the meantime, however, any attempt to extend other rollback segments will lead to transaction failure.

Some DBAs create a separate large rollback segment in its own tablespace for large transactions. I personally don’t like this approach, because these large rollback segments are rarely used, so they are a waste of space. I prefer to lump that extra space into one rollback segment tablespace that will allow the rollback segments to grow with the large transaction. Then I set the OPTIMAL parameter so the rollback segment will shrink back to its correct size later on.

NOTE Keep in mind the concept of I/O distribution with regard to rollback segments. I’ve seen more than one DBA just plug the rollback segment tablespace datafiles on one disk, and then wonder why the system had such poor response time.
I generally will create rollback segments according to the following formula:

    total extents = 2 × (expected number of concurrent transactions / number of overall rollback segments)

For example, if I expect 20 concurrent transactions and create 5 rollback segments, this formula leads to 8 extents each. I generally round this up to 10 extents, for a total of 5 rollback segments of 10 extents each. You should probably not have any rollback segment with more than about 30 extents initially. If you do, you need to either resize the extents or add another rollback segment.

I normally will make sure that the total size of each rollback segment (depending on how much space is available) is about 1.3 times the size of the largest table in the database. This allows me to modify all of the rows of any table without any problems. (Of course, if you have some particularly large tables, this might not be possible.)

Finally, I always throw as much space as I can to the rollback segment tablespace (particularly in a production database). I am not fond of transaction failures due to lack of rollback segment space. For test or development systems where less disk space is available, I use smaller rollback segments. Again, it all depends on the environment.

TIP If you find that you have users who are constantly blowing your tablespaces up with poorly written ad-hoc queries, you should look into metering your users’ resource use with Oracle’s resource control facilities, rather than just allowing them to extend a tablespace to eternity.

The truth is that most DBAs guess at what they think is the right number and size of rollback segments, and then monitor the system for contention. Usually, when you first create a database, you have little idea of how many users will really be on it or how big the average transaction or largest table will be.
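The rules of thumb given earlier in this section (the extent formula, the ceiling of about 30 initial extents, and the 1.3 times largest-table guideline) can be worked through as a short calculation. This is only an illustrative sketch: the function and constant names are hypothetical, and the 100MB largest-table figure is an assumed input, not a value from the text.

```python
MAX_INITIAL_EXTENTS = 30  # "not more than about 30 extents initially"


def extents_per_segment(concurrent_transactions: int, rollback_segments: int) -> int:
    """total extents = 2 x (concurrent transactions / rollback segments), rounded up."""
    if rollback_segments <= 0:
        raise ValueError("rollback_segments must be positive")
    return -(-(2 * concurrent_transactions) // rollback_segments)  # ceiling division


def target_segment_bytes(largest_table_bytes: int) -> int:
    """About 1.3 times the size of the largest table, using integer arithmetic."""
    return largest_table_bytes * 13 // 10


if __name__ == "__main__":
    # Figures from the text: 20 concurrent transactions, 5 rollback segments.
    extents = extents_per_segment(20, 5)
    print(extents)                        # prints: 8
    print(extents > MAX_INITIAL_EXTENTS)  # prints: False

    # Assumed example: the largest table is 100MB.
    print(target_segment_bytes(100 * 1024 * 1024) // (1024 * 1024))  # prints: 130
```

The formula yields 8 extents for the example in the text; rounding that up to 10 is a judgment call rather than part of the formula, so it is left out of the function.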
After you create your initial set of rollback segments, you need to monitor their usage. Chapter 15 provides details on monitoring rollback segment use. If you find that rollback segments have a significant number of extends, shrinks, or wraps, as determined from the V$ROLLSTAT performance view, you will need to rework the extent size and perhaps the number of extents in the rollback segment. You may also want to review the V$WAITSTAT performance view for classes that include the word UNDO in them. If you see waits for these segment types, you probably need to add rollback segments to your database.

Finally, you should use a uniform extent management policy with rollback segments. This implies that the INITIAL and NEXT storage parameters for a rollback segment will always be the same size. Also, INITIAL and NEXT should be the same for all rollback segments in any given tablespace. This helps eliminate any fragmentation issues in the rollback segment tablespaces.

Creating Rollback Segments

To create a rollback segment, use the CREATE ROLLBACK SEGMENT command. Creating a rollback segment is like creating any other segment in most respects. You define a STORAGE clause for the object, and you can define to which tablespace the rollback segment is assigned. Listing 6.20 shows an example.

Listing 6.20: Creating a Rollback Segment

    CREATE ROLLBACK SEGMENT rbs01
    TABLESPACE rbs
    STORAGE (INITIAL 1m NEXT 1m OPTIMAL 10m MINEXTENTS 10);

The OPTIMAL option within the STORAGE clause allows you to define a size that you want Oracle to shrink the rollback segment back to. Thus, the rollback segment will expand as required, but Oracle will shrink it back down to the correct size later. As noted earlier, you should always use the OPTIMAL clause to prevent problems with a rollback segment taking too much tablespace.
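For the values in Listing 6.20, the space taken at creation and its relationship to OPTIMAL can be checked with a little arithmetic. The sketch below assumes that the first MINEXTENTS extents consist of one INITIAL-sized extent followed by NEXT-sized extents with no percentage growth, and that OPTIMAL should not be smaller than that total; the helper names are hypothetical.

```python
def parse_size(text: str) -> int:
    """Convert a size such as '1m' or '100k' to bytes (k = 1024, m = 1024 * 1024)."""
    units = {"k": 1024, "m": 1024 * 1024}
    text = text.strip().lower()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def allocated_at_creation(initial: int, next_: int, minextents: int) -> int:
    """Bytes in the first MINEXTENTS extents: one INITIAL extent plus NEXT-sized extents.

    Assumption: extents do not grow by a percentage between allocations.
    """
    if minextents < 1:
        raise ValueError("minextents must be at least 1")
    return initial + (minextents - 1) * next_


if __name__ == "__main__":
    # Listing 6.20: INITIAL 1m NEXT 1m OPTIMAL 10m MINEXTENTS 10
    initial = parse_size("1m")
    next_ = parse_size("1m")
    optimal = parse_size("10m")

    minimum = allocated_at_creation(initial, next_, 10)
    print(minimum)             # prints: 10485760
    print(optimal >= minimum)  # prints: True
```

Under these assumptions the listing is self-consistent: ten uniform 1MB extents give 10MB at creation, which is exactly the OPTIMAL size the segment shrinks back to.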
The OPTIMAL and MINEXTENTS clauses cross-check each other. Thus, you cannot have an OPTIMAL parameter that will cause the rollback segment to drop below the value defined by MINEXTENTS.

As noted in the previous section, it is strongly encouraged that you make all of your extents uniform in size (thus, make sure that INITIAL and NEXT are set to the same value). If you choose to make INITIAL or NEXT larger, make sure it is a multiple of the smaller value.

When you create a rollback segment, it is not initially available for use. You will need to use the ALTER ROLLBACK SEGMENT command (discussed next) to bring the rollback segment online. Also, you will want to add the rollback segment to the database parameter file so that it will be brought online immediately when the database is started. You use the ROLLBACK_SEGMENTS parameter in the init.ora file to accomplish this.

Altering Rollback Segments

You can use the ALTER ROLLBACK SEGMENT command to alter the storage characteristics of a rollback segment, bring it online after creation, or take it offline before dropping it. You can also use the ALTER ROLLBACK SEGMENT command to force the rollback segment to shrink back to the size defined by OPTIMAL or, optionally, to a