Chapter 11 Advanced MySQL
Getting More Information About Databases

+-------------+--------+---------------+---------+---------+------------------+------+-------------+
| table       | type   | possible_keys | key     | key_len | ref              | rows | Extra       |
+-------------+--------+---------------+---------+---------+------------------+------+-------------+
| orders      | ALL    | PRIMARY       | NULL    | NULL    | NULL             | 4    |             |
| order_items | ref    | PRIMARY       | PRIMARY | 4       | orders.orderid   | 1    | Using index |
| customers   | ALL    | PRIMARY       | NULL    | NULL    | NULL             | 3    | where used  |
| books       | eq_ref | PRIMARY       | PRIMARY | 13      | order_items.isbn | 1    | where used  |
+-------------+--------+---------------+---------+---------+------------------+------+-------------+

Figure 11.2 The output of the EXPLAIN statement.

This might look confusing at first, but it can be very useful. Let’s look at the columns in this table one by one.

The first column, table, just lists the tables used to answer the query. Each row in the result gives more information about how that particular table is used in this query. In this case, you can see that the tables used are orders, order_items, customers, and books. (We knew this already by looking at the query.)

The type column explains how the table is being used in joins in the query. The set of values this column can have is shown in Table 11.7. These values are listed in order from fastest to slowest in terms of query execution. The type gives you an idea of how many rows need to be read from each table in order to execute a query.

Table 11.7 Possible Join Types as Shown in Output from EXPLAIN

Type              Description
const or system   The table is read from only once. This happens when the table
                  has exactly one row. The type system is used when it is a
                  system table, and the type const otherwise.
eq_ref            For every set of rows from the other tables in the join, we
                  read one row from this table. This is used when the join uses
                  all the parts of the index on the table, and the index is
                  UNIQUE or is the primary key.
ref               For every set of rows from the other tables in the join, we
                  read a set of rows from this table which all match. This is
                  used when the join cannot choose a single row based on the
                  join condition, that is, when only part of the key is used in
                  the join, or if it is not UNIQUE or a primary key.
range             For every set of rows from the other tables in the join, we
                  read a set of rows from this table that fall into a
                  particular range.
index             The entire index is scanned.
ALL               Every row in the table is scanned.

In the previous example, you can see that one of the tables is joined using eq_ref (books), and one is joined using ref (order_items), but the other two (orders and customers) are joined by using ALL; that is, by looking at every single row in the table.

The rows column backs this up: it lists (roughly) the number of rows of each table that have to be scanned to perform the join. You can multiply these together to get the total number of rows examined when a query is performed. We multiply these numbers because a join is like a product of rows in different tables; check out Chapter 9, “Working with Your MySQL Database,” for details. Remember that this is the number of rows examined, not the number of rows returned, and that it is only an estimate: MySQL can’t know the exact number without performing the query.

Obviously, the smaller we can make this number, the better. At present we have a pretty negligible amount of data in the database, but when the database starts to increase in size, the execution time of this query would blow out. We’ll return to this in a minute.

The possible_keys column lists, as you might expect, the keys that MySQL might use to join the table. In this case, you can see that the possible keys are all PRIMARY keys.
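The two checks described for the type and rows columns (spotting the tables that are scanned in full, and multiplying the row estimates together) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the values are copied by hand from Figure 11.2, the names explain_rows, estimated_rows_examined, and full_scans are hypothetical, and nothing here talks to MySQL.

```python
from math import prod

# (table, join type, estimated rows), transcribed by hand from Figure 11.2.
# These names and this structure are illustrative, not part of MySQL.
explain_rows = [
    ("orders", "ALL", 4),
    ("order_items", "ref", 1),
    ("customers", "ALL", 3),
    ("books", "eq_ref", 1),
]


def estimated_rows_examined(rows):
    """Multiply the per-table estimates, because a join acts like a product."""
    return prod(count for _table, _join_type, count in rows)


def full_scans(rows):
    """Return the tables whose join type is ALL (every row is scanned)."""
    return [table for table, join_type, _count in rows if join_type == "ALL"]


if __name__ == "__main__":
    print(estimated_rows_examined(explain_rows))  # 4 * 1 * 3 * 1 = 12
    print(full_scans(explain_rows))               # ['orders', 'customers']
```

With so little data the estimate is tiny, but the same product grows quickly as the tables scanned with ALL gain rows, which is why those two tables are the ones worth looking at first.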
The key column is either the key from the table MySQL actually used, or NULL if no key was used. You’ll notice that, although there are possible PRIMARY keys for the orders and customers tables, they were not used in this query. We’ll look at how to fix this in a minute.

The key_len column indicates the length of the key used, in bytes. You can use this to tell whether only part of a key was used. This is relevant when you have keys that consist of more than one column. In this case, the full key was used for books, whereas for order_items only part of the key was used (a key_len of 4, compared with the 17 you will see in Figure 11.3 when the whole key is used).

The ref column shows the columns used with the key to select rows from the table.

Finally, the Extra column tells you any other information about how the join was performed. The possible values you might see in this column are shown in Table 11.8.

Table 11.8 Possible Values for Extra Column as Shown in Output from EXPLAIN

Value                           Meaning
Not exists                      The query has been optimized using a LEFT JOIN
                                optimization: once one matching row is found,
                                no further rows in this table are examined for
                                the current combination of rows.
Range checked for each record   For each row in the set of rows from the other
                                tables in the join, try to find the best index
                                to use, if any.
Using filesort                  An extra pass will be required to sort the
                                data, which adds to the execution time.
Using index                     All information from the table comes from the
                                index; that is, the rows are not actually
                                looked up.
Using temporary                 A temporary table will need to be created to
                                execute this query.
where used                      A WHERE clause is being used to select rows.

There are several ways you can fix problems you spot in the output from EXPLAIN.

First, check column types and make sure they are the same. This applies particularly to column width. Indexes might not be used to match columns if they have different widths. You can fix this by changing the types of columns to match, or by building this into your design to begin with.
Second, you can tell the join optimizer to examine key distributions, and therefore optimize joins more efficiently, by using the myisamchk utility. You can invoke this by typing

    myisamchk --analyze pathtomysqldatabase/table

You can check multiple tables by listing them all on the command line, or by using

    myisamchk --analyze pathtomysqldatabase/*.MYI

You can check all tables in all databases by running the following:

    myisamchk --analyze pathtomysqldatadirectory/*/*.MYI

Rerunning the EXPLAIN after this produces the output shown in Figure 11.3.

+-------------+--------+---------------+---------+---------+---------------------+------+-------------------------+
| table       | type   | possible_keys | key     | key_len | ref                 | rows | Extra                   |
+-------------+--------+---------------+---------+---------+---------------------+------+-------------------------+
| books       | ALL    | PRIMARY       | NULL    | NULL    | NULL                | 4    | where used              |
| order_items | index  | PRIMARY       | PRIMARY | 17      | NULL                | 5    | where used; Using index |
| orders      | eq_ref | PRIMARY       | PRIMARY | 4       | order_items.orderid | 1    |                         |
| customers   | eq_ref | PRIMARY       | PRIMARY | 4       | orders.customerid   | 1    |                         |
+-------------+--------+---------------+---------+---------+---------------------+------+-------------------------+

Figure 11.3 This is the output of the EXPLAIN after running myisamchk.

You’ll notice that the way the query is evaluated has changed quite a lot. We’re now scanning all the rows (type ALL) in only one of the tables (books), which is fine. We’re also now using eq_ref for two of the other tables and index for the remaining one. MySQL is also now using the whole key for order_items (17 bytes as opposed to 4 previously).

You’ll also notice the number of rows being used has actually gone up. This is probably caused by the fact that we have little data in the actual database at this point. Remember that the number of rows listed is only an estimate; try performing the actual query and checking this. If these numbers are way off, the MySQL manual suggests using a straight join (STRAIGHT_JOIN) and listing the tables in your FROM clause in a different order.

Third, you might want to consider adding a new index to the table. If this query is a) slow and b) common, you should seriously consider this. If it’s a one-off query that you’ll never use again, such as an obscure report requested once, it won’t be worth the effort, as it will slow other things down. We’ll look at how to do this in the next section.

Speeding Up Queries with Indexes

If you are in the situation mentioned previously, in which the possible_keys column from an EXPLAIN contains some NULL values, you might be able to improve the performance of your query by adding an index to the table in question. If the column you are using in your WHERE clause is suitable for indexing, you can create a new index for it using ALTER TABLE like this:

    ALTER TABLE table ADD INDEX (column);

General Optimization Tips

In addition to the previous query optimization tips, there are quite a few things you can do to generally increase the performance of your MySQL database.

Design Optimization

Basically you want everything in your database to be as small as possible. You can achieve this in part with a decent design that minimizes redundancy. You can also achieve it by using the smallest possible data type for columns. You should also minimize NULLs wherever possible, and make your primary key as short as possible.

Avoid variable-length columns (like VARCHAR, TEXT, and BLOB) if at all possible. If your tables have fixed-length fields they will be faster to use but might take up a little more space.

Permissions

In addition to using the suggestions mentioned in the previous section on EXPLAIN, you can improve the speed of queries by simplifying your permissions. We discussed earlier the way that queries are checked with the permission system before being executed. The simpler this process is, the faster your query will run.

Table Optimization

If a table has been in use for a period of time, data can become fragmented as updates and deletions are processed. This will increase the time taken to find things in this table.
You can fix this by using the statement

    OPTIMIZE TABLE tablename;

or by typing

    myisamchk -r table

at the command prompt.

You can also use the myisamchk utility to sort a table index and the data according to that index, like this:

    myisamchk --sort-index --sort-records=1 pathtomysqldatadirectory/*/*.MYI

Using Indexes

Use indexes where required to speed up your queries. Keep them simple, and don’t create indexes that are not being used by your queries. You can check which indexes are being used by running EXPLAIN as shown previously.

Use Default Values

Wherever possible, use default values for columns, and only insert data if it differs from the default. This reduces the time taken to execute the INSERT statement.

Use Persistent Connections

This optimization tip applies particularly to Web databases. We’ve already discussed it elsewhere, so this is just a reminder.

Other Tips

There are many other minor tweaks you can make to improve performance in particular situations and when you have particular needs. The MySQL Web site offers a good set of additional tips. You can find it at http://www.mysql.com

Different Table Types

One last useful thing to discuss before we leave MySQL for the time being is the existence of different types of tables. You can choose a table type when you create a table, using

    CREATE TABLE table TYPE=type

The possible table types are

• MyISAM. This is the default, and what we have used to date. This is based on ISAM, which stands for Indexed Sequential Access Method, a standard method for storing records and files.
• ISAM, as described above.
• HEAP. Tables of this type are stored in memory, and their indexes are hashed. This makes HEAP tables extremely fast, but, in the event of a crash, your data will be lost. These characteristics make HEAP tables ideal for storing temporary or derived data. You should specify MAX_ROWS in the CREATE TABLE statement, or these tables can hog all your memory. Also, they cannot have BLOB, TEXT, or AUTO_INCREMENT columns.
• BDB. These tables are transaction safe; that is, they provide COMMIT and ROLLBACK capabilities. They are slower to use than the MyISAM tables, but give all the advantages of using transactions. These tables are based on the Berkeley DB.
• InnoDB. These are also transaction safe, and the same caveats apply as for BDB.

These additional table types can be useful when you are striving for extra speed or transactional safety. If you want to use the BDB or InnoDB table types, you should use the MySQL-Max binary that came with your MySQL distribution, rather than the regular MySQL binary.