Database Performance | 327

    id: 2
    select_type: UNION
    table: film
    type: ref
    possible_keys: film_rating
    key: film_rating
    key_len: 2
    ref: const
    rows: 210
    Extra: Using where
    *************************** 3. row ***************************
    id: NULL
    select_type: UNION RESULT
    table: <union1,2>
    type: ALL
    possible_keys: NULL
    key: NULL
    key_len: NULL
    ref: NULL
    rows: NULL
    Extra:
    3 rows in set (0.00 sec)

Success! Now we can see we have a query plan that is using the index and processing far fewer rows. We can see from the result of the EXPLAIN command that the optimizer is running each query individually (steps execute from row 1 down to row n) and combines the result in the last step.

MySQL has a session status variable named last_query_cost that stores the cost of the last query executed. Use this variable to compare two query plans for the same query. For example, after each EXPLAIN, check the value of the variable. The query with the lowest cost value is considered to be the more efficient (less time-consuming) query. A value of 0 indicates no query has been submitted for compilation.

While this exercise may seem to be a lot of work for a little gain, consider that there are many such queries being executed in applications without anyone noticing the inefficiency. Normally we encounter these types of queries only when the row count gets large enough to notice. In the sakila database, there are only 1,000 rows, but what if there were a million or tens of millions of rows?

Aside from EXPLAIN, there is no single tool in a standard MySQL distribution that you can use to profile a query in MySQL. The “Optimization” chapter in the online MySQL Reference Manual has a host of tips and tricks to help an experienced DBA improve the performance of various query forms.
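As a rough sketch of the last_query_cost comparison described above, the statements below explain two plans for the same query and read the cost after each one. The SELECT statement is an illustrative assumption against the sakila film table, not the query from the original example; the film_rating index name is taken from the EXPLAIN output shown above. One caveat worth checking against your server version: the MySQL Reference Manual notes that the value is computed only for simple "flat" queries and is reported as 0 for queries that use subqueries or UNION, so it may not help when judging a UNION rewrite.

```sql
USE sakila;

-- Plan 1: let the optimizer use the film_rating index.
EXPLAIN SELECT film_id, title FROM film WHERE rating = 'R';
SHOW SESSION STATUS LIKE 'Last_query_cost';

-- Plan 2: same query, but with the index hidden from the optimizer
-- so that the two cost values can be compared.
EXPLAIN SELECT film_id, title FROM film IGNORE INDEX (film_rating) WHERE rating = 'R';
SHOW SESSION STATUS LIKE 'Last_query_cost';
```

The cost is a unitless optimizer estimate, so it is only meaningful when comparing plans for the same query on the same data.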
Using ANALYZE TABLE

The MySQL optimizer, like most traditional optimizers, uses statistical information about tables to perform its analysis of the optimal query execution plan. These statistics include information about indexes, distribution of values, and table structure, among many items.

The ANALYZE TABLE command recalculates the key distribution for one or more tables. This information determines the table order for a join operation. The syntax for the ANALYZE TABLE command is shown below:

    ANALYZE [LOCAL | NO_WRITE_TO_BINLOG] TABLE <table_list>

You can update the key distribution for MyISAM and InnoDB tables. This is very important to note because it is not a general tool that applies to all storage engines. However, all storage engines must report index cardinality statistics to the optimizer if they support indexes. Some storage engines, like third-party engines, have their own specific built-in statistics. A typical execution of the command is shown in Example 8-11. Running the command on a table with no indexes has no effect, but will not result in an error.

Example 8-11. Analyzing a table to update key distribution

    mysql> ANALYZE TABLE film;
    +-------------+---------+----------+----------+
    | Table       | Op      | Msg_type | Msg_text |
    +-------------+---------+----------+----------+
    | sakila.film | analyze | status   | OK       |
    +-------------+---------+----------+----------+
    1 row in set (0.00 sec)

In this example, we see the analysis is complete and there are no unusual conditions. Should there be any unusual events during the execution of the command, the Msg_type field can indicate “info,” “error,” or “warning.” In these cases, the Msg_text field will give you additional information about the event. You should always investigate the situation if you get any result other than “status” and “OK.”

You can see the status of your indexes using the SHOW INDEX command. A sample of the output of the film table is shown in Example 8-12.
In this case, we’re interested in the cardinality of each index, which is an estimate of the number of unique values in the index. We omit the other columns from the display for brevity. For more information about SHOW INDEX, see the online MySQL Reference Manual.

Example 8-12. The indexes for the film table

    mysql> SHOW INDEX FROM film \G
    *************************** 1. row ***************************
    Table: film
    Non_unique: 0
    Key_name: PRIMARY
    Seq_in_index: 1
    Column_name: film_id
    Collation: A
    Cardinality: 1028
    ...
    *************************** 2. row ***************************
    Table: film
    Non_unique: 1
    Key_name: idx_title
    Seq_in_index: 1
    Column_name: title
    Collation: A
    Cardinality: 1028
    ...
    *************************** 3. row ***************************
    Table: film
    Non_unique: 1
    Key_name: idx_fk_language_id
    Seq_in_index: 1
    Column_name: language_id
    Collation: A
    Cardinality: 2
    ...
    *************************** 4. row ***************************
    Table: film
    Non_unique: 1
    Key_name: idx_fk_original_language_id
    Seq_in_index: 1
    Column_name: original_language_id
    Collation: A
    Cardinality: 2
    ...
    *************************** 5. row ***************************
    Table: film
    Non_unique: 1
    Key_name: film_rating
    Seq_in_index: 1
    Column_name: rating
    Collation: A
    Cardinality: 11
    Sub_part: NULL
    Packed: NULL
    Null: YES
    Index_type: BTREE
    Comment:
    5 rows in set (0.00 sec)

The LOCAL or NO_WRITE_TO_BINLOG keyword prevents the command from being written to the binary log (and thereby from being replicated in a replication topology). This can be very useful if you want to experiment or tune while replicating data or if you want to omit this step from your binary log and not replay it during point-in-time recovery (PITR).

You should run this command whenever there have been significant updates to the table (e.g., bulk-loaded data). The system must have a read lock on the table for the duration of the operation.
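The ANALYZE TABLE workflow above can be sketched as follows. This is a minimal example, assuming the sakila sample database is installed; it refreshes the key distribution for one table without writing the statement to the binary log, then reads the same cardinality estimates that SHOW INDEX reports, this time from INFORMATION_SCHEMA so that only the columns of interest are returned.

```sql
USE sakila;

-- Refresh key distribution; LOCAL keeps the statement out of the binary log.
ANALYZE LOCAL TABLE film;

-- Read back the cardinality estimate for each index column of the table.
SELECT INDEX_NAME, SEQ_IN_INDEX, COLUMN_NAME, CARDINALITY
  FROM INFORMATION_SCHEMA.STATISTICS
 WHERE TABLE_SCHEMA = 'sakila'
   AND TABLE_NAME = 'film'
 ORDER BY INDEX_NAME, SEQ_IN_INDEX;
```

Running the SELECT before and after a bulk load is one way to see whether the estimates have drifted enough to justify another ANALYZE TABLE.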
Using OPTIMIZE TABLE

Tables that are updated frequently with new data and deletions can become fragmented quickly and, depending on the storage engine, can have gaps of unused space or suboptimal storage structures. A badly fragmented table can result in slower performance, especially during table scans.

The OPTIMIZE TABLE command restructures the data structures for one or more tables. This is especially beneficial for row formats with variable length fields (rows). The syntax for the OPTIMIZE TABLE command is shown below:

    OPTIMIZE [LOCAL | NO_WRITE_TO_BINLOG] TABLE <table_list>

You can use this command for MyISAM and InnoDB tables. This is very important to note because it is not a general tool that applies to all storage engines. If the table cannot be reorganized (e.g., there are no variable length records or there is no fragmentation), the command will revert to re-creating the table and updating the statistics. A sample output from this operation is shown in Example 8-13.

Example 8-13. The optimize table command

    mysql> OPTIMIZE TABLE film \G
    *************************** 1. row ***************************
    Table: sakila.film
    Op: optimize
    Msg_type: note
    Msg_text: Table does not support optimize, doing recreate + analyze instead
    *************************** 2. row ***************************
    Table: sakila.film
    Op: optimize
    Msg_type: status
    Msg_text: OK
    2 rows in set (0.44 sec)

Here we see two rows in the result set. The first row tells us the OPTIMIZE TABLE command could not be run and that the command will instead re-create the table and run the ANALYZE TABLE command. The second row is the result of the ANALYZE TABLE step.
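Before running OPTIMIZE TABLE it can help to see which tables report unused space. The query below is a hedged sketch, assuming the sakila sample database; it uses the DATA_FREE column of INFORMATION_SCHEMA.TABLES as a rough indicator. How DATA_FREE is calculated depends on the storage engine and tablespace configuration, so treat the numbers as a hint rather than an exact measure of fragmentation.

```sql
-- List base tables by reported free space (bytes), largest first.
SELECT TABLE_NAME, ENGINE, DATA_LENGTH, INDEX_LENGTH, DATA_FREE
  FROM INFORMATION_SCHEMA.TABLES
 WHERE TABLE_SCHEMA = 'sakila'
   AND TABLE_TYPE = 'BASE TABLE'
 ORDER BY DATA_FREE DESC;

-- Rebuild one candidate table; LOCAL keeps the statement out of the binary log.
OPTIMIZE LOCAL TABLE sakila.film;
```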
Like the ANALYZE TABLE command above, any unusual events during the execution of the command are indicated in the Msg_type field by “info,” “error,” or “warning.” In these cases, the Msg_text field will give you additional information about the event. You should always investigate the situation if you get any result other than “status” and “OK.”

The LOCAL or NO_WRITE_TO_BINLOG keyword prevents the command from being written to the binary log (it will therefore not be replicated in a replication topology). This can be very useful if you want to experiment or tune while replicating data or if you want to omit this step from your binary log and thereby not replay it during point-in-time recovery (PITR).

You should run this command whenever there have been significant updates to the table (e.g., a large number of deletes and inserts). This operation is designed to rearrange data elements into a more optimal structure and could run for longer than expected. This is one operation that is best run during times of lower loads.

When using InnoDB, especially when there are secondary indexes (which usually get fragmented), you may not see any improvement or you may encounter long processing times for the operation unless you use the InnoDB “fast index create” option.

Database Optimization Best Practices

As mentioned earlier, there are many great examples, techniques, and practices that come highly recommended by the world’s best database performance experts. Rather than passing judgment or suggesting any particular tool or technique, we will instead discuss the most common best practices for improving database performance. We encourage you to examine some of the texts referenced earlier for more detail on each of these practices.

Use indexes sparingly but effectively

Most database professionals understand the importance of indexes and how they improve performance.
Using the EXPLAIN command is often the best way to determine which indexes are needed. While the problem of not having enough indexes is understood, having too much of a good thing can cause a performance issue.

As you saw when exploring the EXPLAIN command, it is possible to create too many indexes or indexes that are of little or no use. Each index adds overhead for every insert and delete against the table. In some cases, having too many indexes with wide (as in many values) distributions can slow insert and delete performance considerably. It can also lead to slower replication and restore operations.

You should periodically check your indexes to ensure they are all meaningful and utilized. You should remove any indexes that are not used, have limited use, or have wide distributions. You can often use normalization to overcome some of the problems with wide distributions.

Use normalization, but don’t overdo it

Many database experts who studied computer science or a related discipline may have fond memories (or nightmares) of learning the normal forms as described by C.J. Date and others. We won’t revisit the material here; rather we will discuss the impacts of taking those lessons too far.

Normalization (at least to third normal form) is a well-understood and standard practice. However, there are situations in which you may want to violate these rules.

The use of lookup tables is often a by-product of normalization. That is, you create a special table that contains a list of related information that is used frequently in other tables. However, you can impede performance when you use lookup tables with limited distributions (only a few rows or a limited number of rows with small values) that are accessed frequently. In this case, every time your users query information, they must use a join to get the complete data.
Joins are expensive, and frequently accessed data can add up over time. To mitigate this potential performance problem, you can use enumerated fields to store the data rather than a lookup table. For example, rather than creating a table for hair color (despite what some subcultures may insist upon, there really are only a limited number of hair color types), you can use an enumerated field and avoid the join altogether.

Another potential issue concerns calculated fields. Typically, we do not store data that is formed from other data (such as sales tax or the sum of several columns). Rather, the calculation is performed either during data retrieval via a view or in the application. This may not be a real issue if the calculations are simple or are seldom performed, but what if the calculations are complex and they are performed many times? In this case, you are potentially wasting a lot of time performing these calculations. One way to mitigate this problem is to use a trigger to calculate the value and store it in the table. While this technically duplicates data (a big no-no for normalization theorists), it can improve performance in situations where there are a lot of calculations being performed.

Use the right storage engine for the task

One of the most powerful features of MySQL is its support for different storage engines. Storage engines govern how data is stored and retrieved. MySQL supports a number of them, each with unique features and uses. This allows database designers to tune their database performance by selecting the storage engine that best meets their application needs. For example, if you have an environment that requires transaction control for highly active databases, choose a storage engine best suited for this task (yes, Virginia, there are some storage engines in MySQL that do not provide transactional support). You may also have identified a view or table that is often queried but almost never updated (e.g., a lookup table).
In this case, you may want to use a storage engine that keeps the data in memory for faster access.

Recent changes to MySQL have permitted some storage engines to become plug-ins, and some distributions of MySQL have only certain storage engines enabled by default. To find out which storage engines are enabled, issue the SHOW ENGINES command. Example 8-14 shows the storage engines on a typical installation.

Example 8-14. Storage engines

    mysql> SHOW ENGINES \G
    *************************** 1. row ***************************
    Engine: InnoDB
    Support: YES
    Comment: Supports transactions, row-level locking, and foreign keys
    Transactions: YES
    XA: YES
    Savepoints: YES
    *************************** 2. row ***************************
    Engine: MyISAM
    Support: DEFAULT
    Comment: Default engine as of MySQL 3.23 with great performance
    Transactions: NO
    XA: NO
    Savepoints: NO
    *************************** 3. row ***************************
    Engine: BLACKHOLE
    Support: YES
    Comment: /dev/null storage engine (anything you write to it disappears)
    Transactions: NO
    XA: NO
    Savepoints: NO
    *************************** 4. row ***************************
    Engine: CSV
    Support: YES
    Comment: CSV storage engine
    Transactions: NO
    XA: NO
    Savepoints: NO
    *************************** 5. row ***************************
    Engine: MEMORY
    Support: YES
    Comment: Hash based, stored in memory, useful for temporary tables
    Transactions: NO
    XA: NO
    Savepoints: NO
    *************************** 6. row ***************************
    Engine: FEDERATED
    Support: NO
    Comment: Federated MySQL storage engine
    Transactions: NULL
    XA: NULL
    Savepoints: NULL
    *************************** 7. row ***************************
    Engine: ARCHIVE
    Support: YES
    Comment: Archive storage engine
    Transactions: NO
    XA: NO
    Savepoints: NO
    *************************** 8. row ***************************
    Engine: MRG_MYISAM
    Support: YES
    Comment: Collection of identical MyISAM tables
    Transactions: NO
    XA: NO
    Savepoints: NO
    8 rows in set (0.00 sec)

The result set includes all of the known storage engines; whether they are installed and configured (where Support = YES); a note about the engine’s features; and whether it supports transactions, distributed transactions (XA), or savepoints.

A savepoint is a named event that you can use like a transaction. You can establish a savepoint and either release (delete the savepoint) or roll back the changes since the savepoint. See the online MySQL Reference Manual for more details about savepoints.

With so many storage engines to choose from, it can be confusing when designing your database for performance. The following describes each of the storage engines briefly, including some of the uses for which they are best suited. You can choose the storage engine for a table using the ENGINE parameter on the CREATE statement, and you can change the storage engine by issuing an ALTER TABLE command:

    CREATE TABLE t1 (a int) ENGINE=InnoDB;
    ALTER TABLE t1 ENGINE=MEMORY;

The InnoDB storage engine is the premier transactional support storage engine. You should always choose this storage engine when requiring transactional support; it is currently the only transactional engine in MySQL. There are third-party storage engines in various states of production that can support transactions, but the only “out-of-the-box” option is InnoDB. Interestingly, all indexes in InnoDB are B-trees, in which the index records are stored in the leaf pages of the tree. InnoDB is the storage engine of choice for high reliability and transaction-processing environments.

The MyISAM storage engine is the default engine in the MySQL version shown here (InnoDB became the default in MySQL 5.5); this engine will be used if you omit the ENGINE option on the CREATE statement.
MyISAM is often used for data warehousing, e-commerce, and enterprise applications. MyISAM uses advanced caching and indexing mechanisms to improve data retrieval and indexing. MyISAM is an excellent choice when you need storage in a wide variety of applications requiring fast retrieval of data without the need for transactions.

The Blackhole storage engine is very interesting. It doesn’t store anything at all. In fact, it is what its name suggests: data goes in but never returns. All jocularity aside, the Blackhole storage engine fills a very special need. If binary logging is enabled, SQL statements are written to the logs, and Blackhole is used as a relay agent (or proxy) in a replication topology. In this case, the relay agent processes data from the master and passes it on to its slaves but does not actually store any data. The Blackhole storage engine can be handy in situations where you want to test an application to ensure it is writing data, but you don’t want to store anything on disk.

The CSV storage engine can create, read, and write comma-separated value (CSV) files as tables. The CSV storage engine is best used to rapidly export structured business data to spreadsheets. The CSV storage engine does not provide any indexing mechanisms and has certain issues in storing and converting date/time values (they do not obey locality during queries). The CSV storage engine is best used when you want to permit other applications to share or exchange data in a common format. Given that it is not as efficient for storing data, you should use the CSV storage engine sparingly.

The CSV storage engine is used for writing logfiles. For example, the backup logs are CSV files and can be opened by other applications that use the CSV protocol (but not while the server is running).
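The CSV export use case described above might look like the following sketch. The scratch schema and the film_export table are hypothetical names chosen for illustration; the source columns come from the sakila film table. Two restrictions of the CSV engine shape the table definition: it cannot have indexes, and every column must be declared NOT NULL.

```sql
-- Hypothetical scratch schema so the example does not touch sakila itself.
CREATE DATABASE IF NOT EXISTS scratch;
USE scratch;

-- CSV tables allow no indexes and no nullable columns.
CREATE TABLE film_export (
  film_id INT NOT NULL,
  title   VARCHAR(255) NOT NULL
) ENGINE=CSV;

-- Copy the rows to be shared; the server stores them as comma-separated text.
INSERT INTO film_export (film_id, title)
SELECT film_id, title FROM sakila.film;
```

The rows end up in a plain text file in the schema's data directory, which is what makes the table readable by spreadsheet tools.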
The Memory storage engine (sometimes called HEAP) is an in-memory store that uses a hashing mechanism to retrieve frequently used data. This allows for much faster retrieval. Data is accessed in the same manner as with the other storage engines, but the data is stored in memory and is valid only during the MySQL session; the data is flushed and deleted on shutdown. Memory storage engines are typically good for situations in which static data is accessed frequently and rarely ever altered (e.g., lookup tables). Examples include zip code listings, state and county names, category listings, and other data that is accessed frequently and seldom updated. You can also use the Memory storage engine for databases that utilize snapshot techniques for distributed or historical data access.

The Federated storage engine creates a single table reference from multiple database systems. The Federated storage engine allows you to link tables together across database servers. This mechanism is similar in purpose to the linked data tables available in other database systems. The Federated storage engine is best suited for distributed or data mart environments. The most interesting feature of the Federated storage engine is that it does not move data, nor does it require the remote tables to use the same storage engine.

The Federated storage engine is currently disabled in most distributions of MySQL. Consult the online MySQL Reference Manual for more details.

The Archive storage engine can store large amounts of data in a compressed format. The Archive storage engine is best suited for storing and retrieving large amounts of seldom-accessed archival or historical data. Indexes are not supported and the only access method is via a table scan. Thus, you should not use the Archive storage engine for normal database storage and retrieval.
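To make the Memory and Archive use cases above concrete, here is a minimal sketch. The scratch schema and the category_lookup and rental_history tables are hypothetical names for illustration; the source tables and column types are assumed to match the sakila sample database. The Memory table is loaded from a persistent table because its rows disappear when the server shuts down, so it needs to be repopulated after each restart.

```sql
-- Hypothetical scratch schema for the example tables.
CREATE DATABASE IF NOT EXISTS scratch;
USE scratch;

-- MEMORY: small, rarely changed lookup data kept in RAM.
CREATE TABLE category_lookup (
  category_id TINYINT UNSIGNED NOT NULL,
  name        VARCHAR(25) NOT NULL,
  PRIMARY KEY (category_id)
) ENGINE=MEMORY;

-- Reload from the persistent source table (repeat after every restart).
INSERT INTO category_lookup (category_id, name)
SELECT category_id, name FROM sakila.category;

-- ARCHIVE: compressed storage for seldom-read history, read by table scan.
CREATE TABLE rental_history (
  rental_id   INT NOT NULL,
  rental_date DATETIME NOT NULL,
  customer_id SMALLINT UNSIGNED NOT NULL
) ENGINE=ARCHIVE;

INSERT INTO rental_history (rental_id, rental_date, customer_id)
SELECT rental_id, rental_date, customer_id FROM sakila.rental;
```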
The Merge (MRG_MYISAM) storage engine can encapsulate a set of MyISAM tables with the same structure (table layout or schema) and is referenced as a single table. Thus, the tables are partitioned by the location of the individual tables, but no additional partitioning mechanisms are used. All tables must reside on the same server (but not necessarily the same database).

When a DROP command is issued on a merged table, only the Merge specification is removed. The original tables are not altered.

The best attribute of the Merge storage engine is speed. It permits you to split a large table into several smaller tables on different disks, combine them using a merge table specification, and access them simultaneously. Searches and sorts will execute more quickly, since there is less data in each table to manipulate. Also, repairs on tables are more efficient because it is faster and easier to repair several smaller individual tables than a single large table. Unfortunately, this configuration has several disadvantages:

• You must use identical MyISAM tables to form a single merge table.
• The replace operation is not allowed.
• Indexes are less efficient than for a single table.

The Merge storage engine is best suited for very large database (VLDB) applications, like data warehousing, where data resides in more than one table in one or more databases. You can also use it to help solve partitioning problems where you want to partition horizontally but do not want to add the complexity of the partition table options.

Clearly, with so many choices of storage engines, it is possible to choose engines that can hamper performance or, in some cases, prohibit certain solutions. For example, if you never specify a storage engine when the table is created, MySQL uses the default storage engine.
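One way to check which engine each table actually received, and which engine the server currently treats as the default, is to query INFORMATION_SCHEMA. The sketch below uses the sakila schema as an example; substitute your own schema name.

```sql
-- Engine in use for every base table in one schema.
SELECT TABLE_NAME, ENGINE
  FROM INFORMATION_SCHEMA.TABLES
 WHERE TABLE_SCHEMA = 'sakila'
   AND TABLE_TYPE = 'BASE TABLE'
 ORDER BY ENGINE, TABLE_NAME;

-- The engine the server applies when ENGINE is omitted from CREATE TABLE.
SELECT ENGINE
  FROM INFORMATION_SCHEMA.ENGINES
 WHERE SUPPORT = 'DEFAULT';
```

Tables that show an unexpected engine in the first result are candidates for an explicit ALTER TABLE ... ENGINE change after the kind of analysis described in this section.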
If not set manually, the default storage engine reverts to the platform-specific default, which may be MyISAM on some platforms. This may mean you are missing out on optimizing lookup tables or limiting features of your application by not having transactional support. It is well worth the extra time to include an analysis of storage engine choices when designing or tuning your databases.

Use views for faster results via the query cache

Views are a very handy way to encapsulate complex queries to make it easier to work with the data. You can use views to limit data either vertically (fewer columns) or horizontally (a WHERE clause on the underlying SELECT statement). Both are very handy and, of course, the more complex views use both practices to limit the result set returned to the user or to hide certain base tables or to ensure an efficient join is executed.

Using views to limit the columns returned can help you in ways you may not have considered. It not only reduces the amount of data processed, it can also help you avoid costly SELECT * operations that users tend to do without much thought. When many of these types of operations are run, your applications are processing far too much data and this can affect performance of not only the application, but also the server, and more importantly, can decrease available bandwidth on your network. It is always a good [...]

[...] to monitor on a MySQL server. We’ve discussed the basic SQL commands available for monitoring the server, the mysqladmin command-line utility, the benchmark suite, and the MySQL Administrator and MySQL Query Browser GUI tools. We have also examined some best practices for improving database performance. Now that you know the basics of operating system monitoring, database performance, MySQL monitoring, [...]
[...] is brief, it covers the most important aspects of using MyISAM effectively. For more information about the key cache and the MyISAM storage engine, see the online MySQL Reference Manual.

MySQL, Replication, and High Availability

There is a higher probability of corruption of MyISAM data than InnoDB data and, as a result, MyISAM requires longer recovery times. Also, since MyISAM does not support transactions, [...]

[...] secondary key caches

    mysql> SET GLOBAL emp_cache.key_buffer_size=128*1024;
    Query OK, 0 rows affected (0.00 sec)

    mysql> CACHE INDEX salaries IN emp_cache;
    +--------------------+--------------------+----------+----------+
    | Table              | Op                 | Msg_type | Msg_text |
    +--------------------+--------------------+----------+----------+
    | employees.salaries | assign_to_keycache | status   | OK       |
    +--------------------+--------------------+----------+----------+
    1 row in set (0.00 sec)

    mysql> SET GLOBAL emp_cache.key_buffer_size=0;
    [...]

[...] you can monitor each using the SHOW VARIABLES and SHOW STATUS commands or the MySQL Administrator. We discussed the built-in key cache monitor in MySQL Administrator in Chapter 8 (see Figure 8-5). The variables you can monitor with the SHOW commands are shown in Example 9-2.

Example 9-2. The key cache status and system variables

    mysql> SHOW STATUS LIKE 'Key%';
    | Variable_name | Value [...]

    [...] TRANSACTIONS FOR EACH SESSION:
    ---TRANSACTION 0, not started, OS thread id 4491317248
    MySQL thread id 4, query id 152 localhost root
    SHOW ENGINE INNODB STATUS
    ---TRANSACTION 2968, ACTIVE 0 sec, OS thread id 4548612096 inserting
    mysql tables in use 1, locked 1
    7 lock struct(s), heap size 1216, 1171 row lock(s), undo log entries 11375
    MySQL thread id 3, query id 151 localhost root update
    INSERT INTO `salaries` [...]

[...] their size to 0 or when the server is restarted. You can save the configuration of multiple key caches by storing the statements in a file and using the init-file= option in the [mysqld] section of the MySQL option file to execute the statements on startup.

Other Parameters to Consider

There are a number of other parameters to consider. Remember, change only one thing at a time and only if you [...]
[...] enumerated field type allows one and only one value. The use of sets in MySQL is similar to using enumerated values. However, a set field type allows storage of one or more values in the set. You can use sets to store information that represents attributes of the data rather than using a master/detail relationship. This [...]

[...] information about InnoDB. The list of statistical data displayed is long and very comprehensive. Example 9-5 shows an excerpt of the command run on a standard installation of MySQL.

Example 9-5. The SHOW ENGINE INNODB STATUS command

    mysql> SHOW ENGINE INNODB STATUS \G
    *************************** 1. row ***************************
    Type: InnoDB
    Name:
    Status:
    =====================================
    091205 18:31:10 [...]

[...] space. While there are methods to compress data in MySQL, the MyISAM storage engine allows you to compress (pack) read-only tables to save space. They must be read-only because MyISAM does not have the capability to decompress, reorder, or compress additions (or deletions). To compress a table, use the myisampack utility as follows:

    myisampack -b /usr/local/mysql/data/test/table1

Always use the backup (-b) [...]

[...] mutex information about InnoDB and can be very helpful in tuning threading in the storage engine. Example 9-6 shows an excerpt of the command run on a standard installation of MySQL.

Example 9-6. The SHOW ENGINE INNODB MUTEX command

    mysql> SHOW ENGINE INNODB MUTEX;
    +--------+--------------------+------------+
    | Type   | Name               | Status     |
    +--------+--------------------+------------+
    | InnoDB | trx/trx0rseg.c:167 | os_waits=1 |
    | InnoDB | trx/trx0sys.c:181  [...]