Query Analysis and Index Tuning 18
The reason this is a full table scan is that there are no suitable indexes to use. We can use the
INFORMATION_SCHEMA table STATISTICS to show all the indexes on the rental table:
mysql> SELECT COLUMN_NAME, INDEX_NAME, SEQ_IN_INDEX AS pos
-> FROM INFORMATION_SCHEMA.STATISTICS
-> WHERE TABLE_SCHEMA='sakila' AND TABLE_NAME='rental';
+--------------+---------------------+-----+
| COLUMN_NAME  | INDEX_NAME          | pos |
+--------------+---------------------+-----+
| rental_id    | PRIMARY             |   1 |
| rental_date  | rental_date         |   1 |
| inventory_id | rental_date         |   2 |
| customer_id  | rental_date         |   3 |
| inventory_id | idx_fk_inventory_id |   1 |
| customer_id  | idx_fk_customer_id  |   1 |
| staff_id     | idx_fk_staff_id     |   1 |
+--------------+---------------------+-----+
7 rows in set (0.11 sec)
There is no index that includes the return_date field, so add an index to optimize this query:
mysql> USE sakila;
Database changed
mysql> ALTER TABLE rental ADD INDEX (return_date);
Query OK, 16044 rows affected (12.08 sec)
Records: 16044 Duplicates: 0 Warnings: 0
mysql> EXPLAIN SELECT return_date FROM rental\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: rental
type: index
possible_keys: NULL
key: return_date
key_len: 9
ref: NULL
rows: 16249
Extra: Using index
1 row in set (0.00 sec)
Now the type is index, which means a full scan of an index is being done. The index being
scanned is the return_date index (key), which we just created, with a length (key_len) of 9.
Is there a way to further optimize this query?
Looking at Table 18-1, data access strategy types below index involve using only parts of an
index. The query we are analyzing returns every value of the return_date field. Therefore,
there is no way to avoid accessing every value in the return_date index. mysqld needs to
Part IV Extending Your Skills
access a value in order to return it, and every value is returned, so every value must be accessed.
This need to access every value is also shown by the lack of Using where in the Extra field.
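The key_len values that EXPLAIN reports can be cross-checked with simple arithmetic. The sketch below is an illustration in Python, not something mysqld runs. It assumes the sakila column definitions (return_date is a DATETIME that allows NULL; rental_date, inventory_id, and customer_id are DATETIME, MEDIUMINT, and SMALLINT columns declared NOT NULL) and the 8-byte DATETIME storage of the MySQL versions of this era. The names TYPE_BYTES and key_len are made up for the example.

```python
# Bytes used per index part for a few fixed-width types. DATETIME is taken
# as 8 bytes, which is an assumption about the server version in use.
TYPE_BYTES = {"DATETIME": 8, "MEDIUMINT": 3, "SMALLINT": 2}


def key_len(parts):
    """Sum the bytes of each (type, nullable) index part.

    A part that allows NULL costs one extra byte.
    """
    return sum(TYPE_BYTES[col_type] + (1 if nullable else 0)
               for col_type, nullable in parts)


# Index on return_date alone: a DATETIME that allows NULL.
return_date_idx = [("DATETIME", True)]

# The rental_date index: (rental_date, inventory_id, customer_id), all NOT NULL.
rental_date_idx = [("DATETIME", False), ("MEDIUMINT", False), ("SMALLINT", False)]

print(key_len(return_date_idx))      # 9
print(key_len(rental_date_idx[:1]))  # 8
print(key_len(rental_date_idx))      # 13
```

The first result matches the key_len of 9 shown above. The other two correspond to using only the leading part of the rental_date index (8) and all three parts (13).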
Index consequences
In Chapter 6, we explained how indexes work. Indexes can make data retrieval faster because
they are ordered subsets of data, and can be searched faster than the entire set of data, which
may be ordered differently than an index. There is a cost to maintaining indexes. Data changes are
slower because the data needs to be inserted into the table and any appropriate indexes need to be
updated. An index uses disk space, memory, and processing power to stay up to date.
When analyzing queries, remember that there are tradeoffs for actions. Many times, adding an index
will make an application run faster because the query runs faster. However, there are times when
adding an index makes an application run more slowly, because although the SELECT query runs
faster, the INSERT, UPDATE, and DELETE queries run more slowly.
It helps to be familiar with the nature of all the queries against the database. If you find that selecting
a field from a table that stores user session information is slow, adding an index may make the
application slower because there are many changes to user session information. From time to time,
you may want to reexamine indexes to ensure that they are being used. An index that is not being
used is a waste of resources.
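The write-side cost described in this sidebar can be pictured with a toy model. The following Python sketch is not how mysqld stores indexes; ToyTable, load_sessions, and the session columns are hypothetical names chosen for the illustration. It only counts how many index entries have to be maintained when rows are inserted.

```python
import bisect


class ToyTable:
    """Rows kept in insertion order, plus one sorted list per indexed column."""

    def __init__(self, indexed_columns=()):
        self.rows = []
        self.indexes = {column: [] for column in indexed_columns}
        self.index_writes = 0  # how many index entries had to be maintained

    def insert(self, row):
        self.rows.append(row)
        row_id = len(self.rows) - 1
        for column, entries in self.indexes.items():
            # Every index must be kept in sorted order on every data change.
            bisect.insort(entries, (row[column], row_id))
            self.index_writes += 1


def load_sessions(table, count):
    """Insert `count` made-up session rows into the table."""
    for n in range(count):
        table.insert({"session_id": n, "user_id": n % 50, "last_seen": count - n})
    return table


plain = load_sessions(ToyTable(), 1000)
indexed = load_sessions(ToyTable(["user_id", "last_seen"]), 1000)
print(plain.index_writes, indexed.index_writes)  # 0 2000
```

With two indexes, every insert does two extra ordered writes. Whether that cost is worth paying depends on how often the table is read through those indexes compared with how often it changes.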
Optimizing away Using filesort
The Extra value Using filesort is not desirable; it means that mysqld has to pass through
the data an extra time in order to sort it. If the Extra value Using filesort shows up in a
subquery, it is best to optimize the query by eliminating the subquery. In queries that do not
involve subqueries, the Extra value Using filesort may occur in the EXPLAIN plan for
queries that use ORDER BY, DISTINCT, and GROUP BY.
ON the WEBSITE
More information on how to create and use subqueries can be found on the
accompanying website for this book at www.wiley.com/go/mysqladminbible.
For example, the following EXPLAIN plan is for a query to find the customer name and active
status based on an e-mail lookup, sorted by last name:
mysql> EXPLAIN SELECT first_name, last_name, active
-> FROM customer WHERE email='barbara.jones@sakilacustomer.org'
-> ORDER BY last_name\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: customer
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 541
Extra: Using where; Using filesort
1 row in set (0.00 sec)
In order to optimize away the Using filesort, you need to have an index that
mysqld can use instead of sorting. In most cases, mysqld can use only one index per table, so
you will need to have an index that handles both the sorting and the filter of
WHERE email='barbara.jones@sakilacustomer.org':
mysql> ALTER TABLE customer ADD INDEX (email, last_name);
Query OK, 599 rows affected (0.56 sec)
Records: 599 Duplicates: 0 Warnings: 0
mysql> EXPLAIN SELECT first_name, last_name, active
-> FROM customer WHERE email='barbara.jones@sakilacustomer.org'
-> ORDER BY last_name\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: customer
type: ref
possible_keys: email
key: email
key_len: 203
ref: const
rows: 1
Extra: Using index condition; Using where
1 row in set (0.00 sec)
You have removed the undesirable Extra value Using filesort, and added the desirable
Using index condition. You have also gone from a data access strategy (type) of full table
scan (ALL) to one of looking up a nonunique index value (ref).
Often, first instincts may not fully optimize a query. For example, your first instinct in
optimizing this query might have been to add an index on only the email field. This would have
optimized the data access strategy, but the query would still have an Extra value of
Using filesort. Having one index for both fields allows mysqld to use that index to optimize
the data access strategy and the filesort. It is always a good idea to test as many optimization
solutions as possible; see the sidebar "Testing ideas."
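To see why one index on (email, last_name) can serve both the filter and the sort, it may help to picture the index as a list of entries ordered by email first and last_name second. The Python sketch below is a simplified model under that assumption, not mysqld's implementation. The rows are made up (real sakila e-mail addresses are unique, so duplicates are used here only to make the ordering visible), and lookup_sorted is a hypothetical helper.

```python
import bisect

# Made-up rows; duplicates of one e-mail value are deliberate.
rows = [
    {"email": "shared@example.org", "last_name": "Young", "first_name": "Ann"},
    {"email": "other@example.org", "last_name": "Adams", "first_name": "Bob"},
    {"email": "shared@example.org", "last_name": "Baker", "first_name": "Cy"},
    {"email": "shared@example.org", "last_name": "Jones", "first_name": "Di"},
]

# Composite index: entries sorted by email first, then by last_name.
index = sorted((r["email"], r["last_name"], i) for i, r in enumerate(rows))


def lookup_sorted(email):
    """Return rows matching email, already ordered by last_name, without sorting."""
    # (email,) sorts before every (email, last_name, row_id) entry.
    start = bisect.bisect_left(index, (email,))
    matches = []
    for key_email, _last_name, row_id in index[start:]:
        if key_email != email:
            break  # past the end of the matching range
        matches.append(rows[row_id])
    return matches


print([r["last_name"] for r in lookup_sorted("shared@example.org")])
# ['Baker', 'Jones', 'Young']
```

Because entries with the same email sit next to each other in last_name order, reading the matching range yields sorted output and no separate sort pass is needed.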
Testing ideas
In the example from the section "Optimizing away Using filesort," you might have tried to see if
mysqld would use an index on last_name only; if that was your first instinct, you can try out
the following commands to see if the index would work:
ALTER TABLE customer DROP KEY email;
ALTER TABLE customer ADD INDEX (last_name);
EXPLAIN SELECT first_name, last_name, active
FROM customer WHERE email='barbara.jones@sakilacustomer.org'
ORDER BY last_name\G
Sometimes, the first idea you have to optimize a query will not actually optimize the query. In this
case, the index on last_name does not help because mysqld needs to filter for the WHERE clause
first, before ordering. If mysqld were to use the index on last_name, it would have to go through
the entire index, and for each row in the index, look up the email field from the data to see if it
matched. If there were a match, the last_name would be put in the result set, and the first_name
and active fields would be looked up and also put in the result set. Those lookups are a lot of extra
work, and the query optimizer rightfully uses a full table scan, even with an index on last_name.
There will be other times when the best solution for optimization is not the best solution overall for
the application. In this example, an index was added on (email, last_name) and the EXPLAIN
plan showed a key length (key_len) of 203. That is a very large key to keep up to date, and if it
slows down the application, it may be more beneficial to use an index with a shorter length, even
if it means mysqld has to do a filesort.
Optimizing away Range checked for each record
As shown in Table 18-2, the Extra value Range checked for each record is faster than
a full table scan (type: ALL) but slower than a full index scan (type: index). To optimize
queries with this Extra value, create or modify an index so that the query optimizer has a good
index to use. Often, optimizing queries to get rid of Range checked for each record results
in a data access strategy (type) of range, ref, or eq_ref.
Optimizing away Using temporary
Unlike in previous discussions, optimizing away an Extra value of Using temporary cannot
be done by adding an index. Using temporary is undesirable, as it means that a temporary
table must be used to store intermediate results. There are several ways to optimize this,
depending on why a temporary table is used:
■ If ORDER BY and GROUP BY are both present, and use different fields and/or ordering,
the way to optimize this is to get rid of either the ORDER BY or the GROUP BY. This may
be done by splitting the query into two queries. It may be possible to combine the two
queries by using UNION so that intermediate results do not need to be stored in a
temporary table.
■ The presence of ORDER BY and DISTINCT may cause a temporary table to be
used. The way to optimize this is to get rid of either the ORDER BY or the DISTINCT. This
may be done by splitting the query into two queries. It may be possible to combine the
two queries by using UNION so that intermediate results do not need to be stored in a
temporary table.
■ If the SQL_CALC_FOUND_ROWS keyword is used, the number of rows is stored in a
temporary table, which can be retrieved by issuing SELECT FOUND_ROWS(). To optimize, get
rid of SQL_CALC_FOUND_ROWS. Depending on what you are counting, you might count
results periodically and have an estimate for a time period (for example, run a query every
10 minutes to put the number into a table and read the table, doing one count every 10 minutes
instead of one count every time the query is issued).
■ The SQL_SMALL_RESULT keyword is used in a SELECT statement with DISTINCT or
GROUP BY. The SQL_SMALL_RESULT keyword is a hint to the optimizer that the result
is small, and thus it should use a temporary table instead of a filesort. To optimize, get rid
of SQL_SMALL_RESULT. If you need the SQL_SMALL_RESULT keyword because a
temporary table is more desirable than a filesort, then you cannot optimize Using temporary
away.
If you use optimizer hints, be sure to run periodic testing. Only through periodic
testing can you determine whether a temporary table or a filesort is better for your particular
situation.
■ ORDER BY or GROUP BY is used on a field that is not in the first table in the join queue (the
first row returned in the EXPLAIN plan). One way to optimize this query is to change or
eliminate the ORDER BY clause. Another way would be to change the filter so that the table
order changes.
For example, the following query uses the customer table first in the join queue, but is
sorting based on rental_date, a field in the rental table:
mysql> EXPLAIN SELECT first_name, last_name FROM rental
-> INNER JOIN customer USING (customer_id)
-> ORDER BY rental_date\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: customer
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 591
Extra: Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: rental
type: ref
possible_keys: idx_fk_customer_id
key: idx_fk_customer_id
key_len: 2
ref: sakila.customer.customer_id
rows: 13
Extra:
2 rows in set (0.00 sec)
To optimize this query, we could change the ORDER BY to use a field in the customer table,
or we could change the query to use the rental table first in the join queue. Join table
order can be forced by using a join type of STRAIGHT_JOIN (which cannot use the USING
syntax):
mysql> EXPLAIN SELECT first_name, last_name FROM rental
-> STRAIGHT_JOIN customer ON rental.customer_id=customer.customer_id
-> ORDER BY rental_date\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: rental
type: index
possible_keys: idx_fk_customer_id
key: rental_date
key_len: 13
ref: NULL
rows: 16291
Extra: Using index
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: customer
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 2
ref: sakila.rental.customer_id
rows: 1
Extra:
2 rows in set (0.00 sec)
However, this may or may not actually make the query better: Using filesort is
gone, but the data access strategy for the rental table is much slower. In general,
techniques like index hints and STRAIGHT_JOIN are dangerous query optimization
strategies, because changes in the amount of data, the cardinality of data, and the schema
may change the optimal query plan. If you must use these techniques, reassess their
validity every few months and whenever complaints of database slowness arise.
A better way to change the order of the join queue is to limit the rows examined in the
desired table. For example, you can limit the rows examined in the rental table to a certain
range:
mysql> EXPLAIN SELECT first_name, last_name FROM rental
-> INNER JOIN customer USING (customer_id)
-> WHERE rental_date BETWEEN '2005-01-01 00:00:00' AND
-> '2005-01-31 00:00:00' ORDER BY rental_date\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: rental
type: range
possible_keys: rental_date,idx_fk_customer_id
key: rental_date
key_len: 8
ref: NULL
rows: 1
Extra: Using where; Using index
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: customer
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 2
ref: sakila.rental.customer_id
rows: 1
Extra:
2 rows in set (0.00 sec)
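One of the bullets above suggests replacing SQL_CALC_FOUND_ROWS with a count that is refreshed periodically. On the application side, that idea might look like the following Python sketch. It is only one possible shape for it: CachedCount, count_rows, and the fake clock are hypothetical names, and count_rows stands in for whatever query actually counts the rows.

```python
import time


class CachedCount:
    """Serve a row count from a cache that is refreshed at most once per interval."""

    def __init__(self, count_rows, max_age_seconds=600, clock=time.monotonic):
        self._count_rows = count_rows    # callable that runs the real count
        self._max_age = max_age_seconds  # 600 seconds = 10 minutes
        self._clock = clock
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self._max_age
        if stale:
            self._value = self._count_rows()
            self._fetched_at = now
        return self._value


# Usage with a stand-in for the expensive query and a fake clock.
calls = []
fake_now = [0.0]


def count_rows():
    calls.append(1)
    return 16044


cache = CachedCount(count_rows, clock=lambda: fake_now[0])
first = cache.get()    # runs the count
fake_now[0] = 599.0
second = cache.get()   # still fresh, no new count
fake_now[0] = 600.0
third = cache.get()    # stale, counts again
print(len(calls))      # 2
```

The trade-off is the one the bullet describes: readers get an estimate that can be up to 10 minutes old, in exchange for one count per interval instead of one per query.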
It is beneficial to optimize away Using temporary because in certain cases, temporary tables
will be written to disk. These situations include: when a temporary table exceeds the smaller
of tmp_table_size and max_heap_table_size, when a temporary table includes BLOB or
TEXT data types, when DISTINCT or GROUP BY clauses contain fields that use more than 512
bytes, and when any field is more than 512 bytes in a UNION or UNION ALL query.
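The conditions in the preceding paragraph can be restated as a small predicate. The Python sketch below only encodes the rules as listed in the text; it is not mysqld's actual decision logic, and may_go_to_disk and its parameter names are invented for the illustration.

```python
def may_go_to_disk(table_bytes, tmp_table_size, max_heap_table_size,
                   has_blob_or_text=False,
                   widest_distinct_or_group_by_field=0,
                   widest_union_field=0):
    """Return True if any condition listed in the text applies."""
    # The in-memory limit is the smaller of the two server variables.
    in_memory_limit = min(tmp_table_size, max_heap_table_size)
    return (table_bytes > in_memory_limit
            or has_blob_or_text
            or widest_distinct_or_group_by_field > 512
            or widest_union_field > 512)


MB = 1024 * 1024

# 20 MB of intermediate results against limits of 32 MB and 16 MB:
print(may_go_to_disk(20 * MB, 32 * MB, 16 * MB))                         # True
# A small result that includes a TEXT column:
print(may_go_to_disk(1 * MB, 32 * MB, 16 * MB, has_blob_or_text=True))   # True
# A small result with narrow fields:
print(may_go_to_disk(1 * MB, 32 * MB, 16 * MB))                          # False
```

Note that raising only one of tmp_table_size and max_heap_table_size has no effect in this model, because the smaller value is the one that applies.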
Using an index by eliminating functions
Sometimes, an index exists but is not being used. For example, the film table has the following
indexes:
mysql> SELECT COLUMN_NAME, INDEX_NAME, SEQ_IN_INDEX AS pos
-> FROM INFORMATION_SCHEMA.STATISTICS
-> WHERE TABLE_SCHEMA='sakila' AND TABLE_NAME='film';
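+----------------------+-----------------------------+-----+
| COLUMN_NAME          | INDEX_NAME                  | pos |
+----------------------+-----------------------------+-----+
| film_id              | PRIMARY                     |   1 |
| title                | idx_title                   |   1 |
| language_id          | idx_fk_language_id          |   1 |
| original_language_id | idx_fk_original_language_id |   1 |
+----------------------+-----------------------------+-----+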
4 rows in set (0.01 sec)
However, the following query does not use the index on title, as you might expect it would:
mysql> EXPLAIN SELECT title FROM film WHERE LEFT(title,2)='Tr'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: film
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 953
Extra: Using where
1 row in set (0.00 sec)
The reason for this is that there is an index on title, but the WHERE clause is filtering based
on a function of the title field. Values (such as 'Tr') cannot be compared to a function
(LEFT(title,2)) using an index in mysqld, unless the index is on the function itself.
Unfortunately, the mysqld versions covered here do not support an index on functions
(functional key parts were added later, in MySQL 8.0.13), and so it is not possible to
define an index on LEFT(title,2) even if you had the desire.
To optimize this type of query, see if you can take away the function. In this case, you can
replace LEFT(title,2)='Tr' with title LIKE 'Tr%' to get rid of the function on title.
Just by changing the query to get rid of the function, you can change your data access strategy
from a type of ALL to a type of range:
mysql> EXPLAIN SELECT title FROM film WHERE title LIKE 'Tr%'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: film
type: range
possible_keys: idx_title
key: idx_title
key_len: 766
ref: NULL
rows: 15
Extra: Using where
1 row in set (0.00 sec)
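The difference between the two plans can be modeled with a sorted list standing in for the index on title. The Python sketch below is a simplified picture, not mysqld's algorithm. The titles are a small sample list, the comparison is case-sensitive (unlike a typical case-insensitive MySQL collation), and scan_with_function and range_on_index are hypothetical helpers.

```python
import bisect

# Sample titles standing in for film.title, kept in sorted order like an index.
titles = sorted([
    "Academy Dinosaur", "Trading Pinocchio", "Traffic Hobbit", "Train Bunch",
    "Tramp Others", "Treasure Command", "Zorro Ark", "Tootsie Pilot",
    "Uncut Suicides",
])


def scan_with_function(prefix):
    """LEFT(title, n) = prefix: the function is evaluated for every row."""
    examined = len(titles)
    matches = [t for t in titles if t[:len(prefix)] == prefix]
    return matches, examined


def range_on_index(prefix):
    """title LIKE 'prefix%': jump to the first candidate, stop at the first miss."""
    start = bisect.bisect_left(titles, prefix)
    matches = []
    examined = 0
    for title in titles[start:]:
        examined += 1
        if not title.startswith(prefix):
            break
        matches.append(title)
    return matches, examined


print(scan_with_function("Tr")[1])  # 9
print(range_on_index("Tr")[1])      # 6
```

Both helpers return the same five titles, but the range version looks at 6 entries instead of all 9. On a table with many rows and few matches, that gap is what separates a type of ALL from a type of range.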
This type of optimization is done most frequently to queries involving date ranges.
Compare:
EXPLAIN SELECT inventory_id, customer_id FROM rental
WHERE DATE(return_date)='2005-05-30'\G
with:
EXPLAIN SELECT return_date FROM rental
WHERE return_date BETWEEN '2005-05-30 00:00:00' and '2005-05-30 23:59:59'
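When the query is built by an application, the BETWEEN bounds for a single day can be computed in code rather than typed by hand. A minimal Python sketch, assuming second-level precision in the datetime column; day_bounds and FORMAT are names made up for the example.

```python
from datetime import date, datetime, time

FORMAT = "%Y-%m-%d %H:%M:%S"


def day_bounds(day):
    """Return the first and last second of `day` as datetime strings."""
    start = datetime.combine(day, time(0, 0, 0))
    end = datetime.combine(day, time(23, 59, 59))
    return start.strftime(FORMAT), end.strftime(FORMAT)


low, high = day_bounds(date(2005, 5, 30))
print(low)   # 2005-05-30 00:00:00
print(high)  # 2005-05-30 23:59:59
```

The two strings can then be passed as parameters to the BETWEEN comparison, leaving return_date itself untouched by any function.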
However, there are other ways in which functions can be optimized out of a query. Table 18-4
shows some common optimizations:
TABLE 18-4
Common Ways to Optimize by Eliminating Functions

WHERE clause Function                          Optimization
---------------------------------------------  --------------------------------------
LEFT(stringfield, 2) = 'Tr'                    stringfield LIKE 'Tr%'

DATE(datefield) = '2005-05-30' or              field BETWEEN '2005-05-30 00:00:00'
LAST_DAY(field) = '2005-05-30' or                AND '2005-05-30 23:59:59'
LEFT(datefield, 10) = '2005-05-30' or
SUBSTRING_INDEX(datefield, ' ', 1) =
  '2005-05-30'

ABS(field) > 20                                field > 20 OR field < -20

field + 1 > 20                                 field > 19

FLOOR(field) = 1                               field >= 1 AND field < 2

CONCAT(field, 'day') = 'Saturday'              field = 'Satur'

FROM_UNIXTIME(field) = '2005-05-30 00:00:00'   field = 1117425600

LEFT(INET_NTOA(field), 10) = '192.168.1.'      field BETWEEN 3232235777 AND
                                                 3232236031
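Before replacing a function in a WHERE clause, it is worth checking that the rewritten condition really selects the same values. The Python sketch below does that by brute force for three of the numeric rows in Table 18-4, over a small range of sample values. It uses Python's abs and math.floor as stand-ins for the SQL functions; samples, rewrites, and mismatches are names made up for the example.

```python
import math

# Sample values from -25.0 to 25.0 in steps of 0.25 (exact in binary floats).
samples = [n / 4 for n in range(-100, 101)]

# Each entry pairs the condition with a function and the condition without it.
rewrites = {
    "ABS(field) > 20": (lambda f: abs(f) > 20, lambda f: f > 20 or f < -20),
    "field + 1 > 20": (lambda f: f + 1 > 20, lambda f: f > 19),
    "FLOOR(field) = 1": (lambda f: math.floor(f) == 1, lambda f: 1 <= f < 2),
}


def mismatches(name):
    """Return the sample values on which the two forms of a condition disagree."""
    with_function, without_function = rewrites[name]
    return [f for f in samples if with_function(f) != without_function(f)]


for name in rewrites:
    print(name, mismatches(name))  # each list is empty
```

A check like this does not prove equivalence for every possible value, but it catches boundary mistakes such as writing field >= 19 where field > 19 is needed.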
You may be wondering why anyone would ever create WHERE clauses like the ones in Table 18-4.
Most of the time it happens because of the way the developer is thinking. Developers write
queries to answer questions, so these types of WHERE clauses happen when the developer writes
a query to "find sales on May 30" or to "find distances greater than 20". In an ideal world, no
query would be saved to code unless it were optimized. In practice, developers write queries,
and DBAs optimize queries — if the developer writes a suboptimal query, in many organizations
the DBA will find it only when it slows down the application.
Optimizing the last two queries in Table 18-4 requires some work to retrieve the numerical
values. To optimize FROM_UNIXTIME(field)='2005-05-30 00:00:00', you have to find the
UNIX timestamp value for the datetime. There is a function to do that:
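mysql> SELECT UNIX_TIMESTAMP('2005-05-30 00:00:00');
+---------------------------------------+
| UNIX_TIMESTAMP('2005-05-30 00:00:00') |
+---------------------------------------+
|                            1117425600 |
+---------------------------------------+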
1 row in set (0.05 sec)
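The number returned for a datetime string depends on the time zone in which the string is interpreted. The Python sketch below reproduces the value above under the assumption that the session time zone was four hours behind UTC, and shows the value the same string gives in UTC. The helper name unix_timestamp is chosen to mirror the SQL function; it is not a MySQL API.

```python
from datetime import datetime, timedelta, timezone


def unix_timestamp(text, utc_offset_hours):
    """Seconds since 1970-01-01 00:00:00 UTC for a local datetime string."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.strptime(text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=tz)
    return int(local.timestamp())


print(unix_timestamp("2005-05-30 00:00:00", -4))  # 1117425600
print(unix_timestamp("2005-05-30 00:00:00", 0))   # 1117411200
```

If the constant is computed on one machine and used against another, it is safer to confirm that both interpret the datetime in the same time zone.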
To optimize LEFT(INET_NTOA(field),10)='192.168.1.', you first have to figure out what
the query is looking for. This filter finds rows that have field with the numerical equivalent
of an IP address whose left 10 characters are '192.168.1.'. Another way to look at the
filter is that it finds rows that have field with the numerical equivalent of an IP address between
192.168.1.1 and 192.168.1.255.
This new way to look at the data presents you with a way to eliminate the function from the
WHERE clause. If you find the numerical equivalent of the boundary IPs, you can use those in
the BETWEEN comparison shown in Table 18-4. Again, mysqld has a function that will let you
look those values up:
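mysql> select INET_ATON('192.168.1.1'), INET_ATON('192.168.1.255');
+--------------------------+----------------------------+
| INET_ATON('192.168.1.1') | INET_ATON('192.168.1.255') |
+--------------------------+----------------------------+
|               3232235777 |                 3232236031 |
+--------------------------+----------------------------+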
1 row in set (0.00 sec)
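The same boundary values can be computed, and the rewrite checked, outside the database. The Python sketch below uses the standard ipaddress module; inet_aton and inet_ntoa are helper names chosen to mirror the SQL functions and are not MySQL APIs.

```python
import ipaddress


def inet_aton(dotted):
    """Dotted-quad IPv4 string to its integer value."""
    return int(ipaddress.IPv4Address(dotted))


def inet_ntoa(number):
    """Integer value back to a dotted-quad IPv4 string."""
    return str(ipaddress.IPv4Address(number))


low = inet_aton("192.168.1.1")
high = inet_aton("192.168.1.255")
print(low, high)  # 3232235777 3232236031

# Every value in the range satisfies the original LEFT(...) filter.
in_range_ok = all(inet_ntoa(n)[:10] == "192.168.1." for n in range(low, high + 1))
print(in_range_ok)  # True

# The address just below the range also starts with '192.168.1.'.
print(inet_ntoa(low - 1))  # 192.168.1.0
```

The last line shows that 192.168.1.0 also passes the original string filter. If such addresses can occur in the data, the lower bound would need to be 3232235776 for the two filters to select exactly the same rows.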
There are functions that simply cannot be eliminated. For example, it is difficult to eliminate
WHERE clauses such as MOD(field,10)=2 and LENGTH(field)<5.
Non-index schema changes
Sometimes the best way to optimize a query is to change the data structure. Consider the
following query:
mysql> EXPLAIN SELECT first_name, last_name, email
-> FROM staff
-> WHERE email LIKE '%sakilastaff.com'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE