Chapter 10: Replication Monitoring

Administrator. We will look at the MySQL Administrator in “Replication Monitoring with MySQL Administrator” later in this chapter.

Monitoring Commands for the Slave

The SHOW SLAVE STATUS command displays information about the slave’s relay log, its connection to the master, and replication activity, including the name and offset position of the current binlog file on the master. This information is vital in diagnosing slave performance, as we have seen in previous chapters. Example 10-5 shows the result of a typical SHOW SLAVE STATUS command executed on a server running MySQL version 5.5.

Example 10-5. The SHOW SLAVE STATUS command

    mysql> SHOW SLAVE STATUS \G
    *************************** 1. row ***************************
    Slave_IO_State: Waiting for master to send event
    Master_Host: localhost
    Master_User: rpl
    Master_Port: 3306
    Connect_Retry: 60
    Master_Log_File: mysql-bin.000002
    Read_Master_Log_Pos: 39016226
    Relay_Log_File: relay-bin.000004
    Relay_Log_Pos: 9353715
    Relay_Master_Log_File: mysql-bin.000002
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    Replicate_Do_DB:
    Replicate_Ignore_DB:
    Replicate_Do_Table:
    Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table:
    Last_Errno: 0
    Last_Error:
    Skip_Counter: 0
    Exec_Master_Log_Pos: 25263417
    Relay_Log_Space: 39016668
    Until_Condition: None
    Until_Log_File:
    Until_Log_Pos: 0
    Master_SSL_Allowed: No
    Master_SSL_CA_File:
    Master_SSL_CA_Path:
    Master_SSL_Cert:
    Master_SSL_Cipher:
    Master_SSL_Key:
    Seconds_Behind_Master: 66
    Master_SSL_Verify_Server_Cert: No
    Last_IO_Errno: 0
    Last_IO_Error:
    Last_SQL_Errno: 0
    Last_SQL_Error:
    Replicate_Ignore_Server_Ids:
    Master_Server_Id: 1
    1 row in set (0.00 sec)

There is a lot of information here. This is the most important command for monitoring replication. It is a good idea to study the details of each item presented.
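When the output of SHOW SLAVE STATUS \G is captured as text (for example, by a monitoring script that calls the mysql client), it can be turned into a lookup table for the checks discussed in this chapter. The following Python sketch shows one possible way to do that. The function name parse_vertical_output and the shortened SAMPLE text are illustrative assumptions, not part of MySQL.

```python
def parse_vertical_output(text):
    """Parse one row of the mysql client's vertical output into a dict.

    Every value is kept as a string; empty columns become "".
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the "*** 1. row ***" separator, and lines
        # without a "name: value" shape such as "1 row in set (0.00 sec)".
        if not line or line.startswith("*") or ":" not in line:
            continue
        # Split on the first colon only, so error text containing colons survives.
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


# Shortened copy of the output in Example 10-5 (assumption: trimmed for brevity).
SAMPLE = """\
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: localhost
          Read_Master_Log_Pos: 39016226
          Exec_Master_Log_Pos: 25263417
        Seconds_Behind_Master: 66
                   Last_Error:
1 row in set (0.00 sec)
"""

status = parse_vertical_output(SAMPLE)
print(status["Slave_IO_State"])
print(status["Seconds_Behind_Master"])
```

The parser deliberately keeps values as strings, leaving it to each check to decide how a column should be interpreted.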
Rather than listing the information item by item, we present it from the perspective of an administrator. That is, the information is normally inspected with a specific goal in mind, so we group it into categories for easier reference. These categories include master connection information, slave performance, log information, filtering, log performance, and error conditions.

The most important piece of information is the first column, Slave_IO_State. This tells you the current status of the I/O thread. It presents one of several states: connecting to the master, waiting for events from the master, reconnecting to the master, etc.

The information displayed about the master connection includes the current hostname of the master, the user account used to connect, and the port the slave is connected to on the master. Toward the bottom of the listing is the SSL connection information (if you are using an SSL connection).

The next category includes information about the binary log on the master and the relay log on the slave. The filename and position of each are displayed. It is important to note these values whenever you diagnose replication problems. Of particular note is Relay_Master_Log_File, which shows the filename of the master binary log where the most recent event from the relay log has been executed.

The replication filtering configuration lists all of the slave-side replication filters. Check here if you are uncertain how your filters are set up.

Also included are the last error number and text for the slave and for the I/O and SQL threads. Beyond the state values for the slave threads, this information is most often examined when there is an error. It can be helpful to check this information first when encountering errors on the slave, before examining the error log, as this information is the most current and normally gives you the reason for the failure.
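The thread-state and error columns described above lend themselves to a simple automated check. The sketch below is a minimal illustration, assuming the status row is already available as a dict of column names to strings. The function name find_replication_problems and the sample rows, including the error text, are hypothetical.

```python
def find_replication_problems(status):
    """Return a list of human-readable problems found in one status row.

    `status` maps SHOW SLAVE STATUS column names to string values.
    """
    problems = []

    # Both replication threads should report "Yes" on a healthy slave.
    for thread in ("Slave_IO_Running", "Slave_SQL_Running"):
        if status.get(thread) != "Yes":
            problems.append(f"{thread} is {status.get(thread)!r}, expected 'Yes'")

    # The error numbers should be 0; report the error text when they are not.
    for errno_col, text_col in (
        ("Last_IO_Errno", "Last_IO_Error"),
        ("Last_SQL_Errno", "Last_SQL_Error"),
    ):
        errno = int(status.get(errno_col) or 0)
        if errno != 0:
            problems.append(f"{errno_col}={errno}: {status.get(text_col, '')}")

    return problems


# Hypothetical rows for illustration only.
healthy = {
    "Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes",
    "Last_IO_Errno": "0", "Last_IO_Error": "",
    "Last_SQL_Errno": "0", "Last_SQL_Error": "",
}
broken = dict(healthy, Slave_SQL_Running="No",
              Last_SQL_Errno="1062", Last_SQL_Error="Duplicate entry (illustrative)")

print(find_replication_problems(healthy))
for problem in find_replication_problems(broken):
    print(problem)
```

A check like this only summarizes what the status row already says; the error log remains the place to look for history.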
There is also information about the configuration of the slave, including the settings for the skip counter and the until conditions. See the online MySQL Reference Manual for more information about these fields.

Near the bottom of the list is the current error information. This includes errors for the slave’s I/O and SQL threads. The error numbers should always be 0 for a properly functioning slave.

Some of the more important performance columns are discussed in more detail here:

Connect_Retry
    The number of seconds between connection retry attempts. This value should always be low, but you may want to set it higher if you have a case where the slave is having issues connecting to the master.

Exec_Master_Log_Pos
    This shows the position of the last event executed from the master’s binary log.

Relay_Log_Space
    The total size of all of the relay logfiles. You can use this to determine if you need to purge the relay logs in the event you are running low on disk space.

Seconds_Behind_Master
    The number of seconds between the time an event was executed and the time the event was written in the master’s binary log. A high value here can indicate significant replication lag. We discuss replication lag in an upcoming section.

The value for Seconds_Behind_Master could become stale when replication stops due to network failures, loss of heartbeat from the master, etc. It is most meaningful when replication is running.

If your slave has binary logging enabled, the SHOW BINARY LOGS command displays the list of binlog files available on the slave and their sizes in bytes. Example 10-6 shows the results of a typical SHOW BINARY LOGS command.

Example 10-6. The SHOW BINARY LOGS command on the slave

    mysql> SHOW BINARY LOGS;
    +------------------+-----------+
    | Log_name         | File_size |
    +------------------+-----------+
    | slave-bin.000001 |   5151604 |
    | slave-bin.000002 |   1030108 |
    | slave-bin.000003 |   1030044 |
    +------------------+-----------+
    3 rows in set (0.00 sec)

You can rotate the relay log on the slave with the FLUSH LOGS command.

You can also use the SHOW BINLOG EVENTS command to show events in the binary log on the slave if the slave has binary logging enabled. The difference between showing events on the slave and showing them on the master is that on the slave you must specify the binlog filename as shown in the SHOW BINARY LOGS output. Example 10-7 shows the binlog events from a typical replication configuration.

Example 10-7. The SHOW BINLOG EVENTS command (statement-based)

    mysql> SHOW BINLOG EVENTS IN 'slave-bin.000001' FROM 2701 LIMIT 2 \G
    *************************** 1. row ***************************
    Log_name: slave-bin.000001
    Pos: 2701
    Event_type: Query
    Server_id: 1
    End_log_pos: 3098
    Info: use `employees`; CREATE TABLE salaries (
        emp_no INT NOT NULL,
        salary INT NOT NULL,
        from_date DATE NOT NULL,
        to_date DATE NOT NULL,
        KEY (emp_no),
        FOREIGN KEY (emp_no) REFERENCES employees (emp_no) ON DELETE CASCADE,
        PRIMARY KEY (emp_no, from_date)
    )
    *************************** 2. row ***************************
    Log_name: slave-bin.000001
    Pos: 3098
    Event_type: Query
    Server_id: 1
    End_log_pos: 3405
    Info: use `employees`; INSERT INTO `departments` VALUES
        ('d001','Marketing'),('d002','Finance'),
        ('d003','Human Resources'),('d004','Production'),
        ('d005','Development'),('d006','Quality Management'),
        ('d007','Sales'),('d008','Research'),
        ('d009','Customer Service')
    2 rows in set (0.01 sec)

In MySQL versions 5.5 and later, you can also inspect the slave’s relay log with SHOW RELAYLOG EVENTS.
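The log positions and the Relay_Log_Space value reported by SHOW SLAVE STATUS can be combined into two quick numeric checks: how much of the master’s binary log has been fetched but not yet executed, and whether the relay logs have grown past a disk budget. The Python sketch below uses the values from Example 10-5. The function names and the one-gigabyte limit are assumptions chosen for illustration.

```python
def unapplied_bytes(status):
    """Bytes of the master's binlog read by the I/O thread but not yet executed.

    Returns None when the I/O and SQL threads are working in different
    master binlog files, because the positions are then not comparable.
    """
    if status["Master_Log_File"] != status["Relay_Master_Log_File"]:
        return None
    return int(status["Read_Master_Log_Pos"]) - int(status["Exec_Master_Log_Pos"])


def relay_space_exceeds(status, limit_bytes):
    """True when the total size of the relay logs is above limit_bytes."""
    return int(status["Relay_Log_Space"]) > limit_bytes


# Values copied from Example 10-5.
example_10_5 = {
    "Master_Log_File": "mysql-bin.000002",
    "Relay_Master_Log_File": "mysql-bin.000002",
    "Read_Master_Log_Pos": "39016226",
    "Exec_Master_Log_Pos": "25263417",
    "Relay_Log_Space": "39016668",
}

print(unapplied_bytes(example_10_5))               # 39016226 - 25263417
print(relay_space_exceeds(example_10_5, 10**9))    # hypothetical 1 GB budget
```

A growing byte backlog together with a rising Seconds_Behind_Master suggests the SQL thread is the bottleneck rather than the network.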
Slave Status Variables

There are only a few status variables for monitoring the slave. These include counters that indicate how many times a slave-related command was issued on the master and statistics for key slave operations. The first four listed here are simply counters of the various slave-related commands. The values should correspond with the frequency of the maintenance of your slaves. If they do not, you may want to investigate the possibility that there are more slaves in your topology than you expected or that a particular slave is being restarted too frequently.

Com_show_slave_hosts
    The number of times the SHOW SLAVE HOSTS command was issued.

Com_show_slave_status
    The number of times the SHOW SLAVE STATUS command was issued.

Com_slave_start
    The number of times the START SLAVE command was issued.

Com_slave_stop
    The number of times the STOP SLAVE command was issued.

Slave_heartbeat_period
    The current configuration for the number of seconds that elapse between heartbeat checks of the master.

Slave_open_temp_tables
    The number of temporary tables the slave’s SQL thread is using. A high value can indicate the slave is overburdened.

Slave_received_heartbeats
    The count of heartbeat replies from the master. This value should correspond roughly to the elapsed time since the slave was restarted divided by the heartbeat interval.

Slave_retried_transactions
    The number of times the SQL thread has retried transactions since the slave was started.

Slave_running
    Simply displays ON if the slave is connected to the master and the I/O and SQL threads are executing without error.

Replication Monitoring with MySQL Administrator

You have seen how you can use the MySQL Administrator to monitor network traffic and storage engines. It also has a simple display for monitoring the master and slave in a replication topology.
You can view basic information about replication on the Replication Status tab. However, to get the most out of this information, you should start your slaves with the --report_host startup option, providing a unique name for each slave. Figure 10-1 shows the MySQL Administrator running on a master with one connected slave. If there were slaves connected without the --report_host option, they would be omitted from the list.

If you run the MySQL Administrator on a slave, you will only see the slave’s information. Figure 10-2 shows the MySQL Administrator running on the slave.

Figure 10-1. The MySQL Administrator running on the master

Figure 10-2. The MySQL Administrator running on the slave

In Figures 10-1 and 10-2, the information displayed includes the hostname, server ID, port, kind (master or slave), a general status, the logfile (binlog filename), and the current log position. Figure 10-1 shows the replication topology listing all of the connected slaves. This report can be handy when you want to get an at-a-glance status of your servers.

Other Items to Consider

This section discusses some additional considerations for monitoring replication. It includes special networking considerations and monitoring lag (delays in replication).

Networking

If you have limited networking bandwidth, high contention for the bandwidth, or simply a very slow connection, you can improve replication performance by using compression. You can configure compression using the slave_compressed_protocol variable.

In cases where network bandwidth is not a problem but you have data that you want to protect while in transit from the master to the slaves, you can use an SSL connection.
You can configure the SSL connection using the CHANGE MASTER command. See the section titled “Setting Up Replication Using SSL” in the online MySQL Reference Manual for details on using SSL connections in replication.

Another networking configuration you may want to consider is using master heartbeats. You have seen where this information is shown on the SHOW SLAVE STATUS command. A heartbeat is a mechanism to automatically check connection status between a master and a slave. It can detect levels of connectivity in milliseconds. Master heartbeat is used in replication scenarios where the slave must be kept in sync with the master with little or no delay. Having the capability to detect when a threshold expires ensures the delay is identified before replication is halted on the slave.

You can configure master heartbeat using a parameter in the CHANGE MASTER command with the master_heartbeat_period=<value> setting (added in MySQL version 5.4.4), where the value is the number of seconds at which you want the heartbeat to occur. You can monitor the status of the heartbeat with the following commands:

    SHOW STATUS LIKE 'slave_heartbeat_period';
    SHOW STATUS LIKE 'slave_received_heartbeats';

Monitor and Manage Slave Lag

Periods of massive updates, overburdened slaves, or other significant network performance events can cause your slaves to lag behind the master. When this happens, the slaves are not processing the events in their relay logs fast enough to keep up with the changes sent from the master.

As you saw with the SHOW SLAVE STATUS command, Seconds_Behind_Master can indicate that the slave is running behind the master. This field tells you by how many seconds the slave’s SQL thread is behind the slave’s I/O thread; that is, how far behind the slave is in processing the incoming events from the master.
The slave uses the timestamps of the events to calculate this value. When the SQL thread on the slave reads an event from the master, it calculates the difference in the timestamp. The following excerpt shows a condition in which the slave is 146 seconds behind the master. In this case, the slave is more than two minutes behind; this can be a problem if your application is relying on the slaves to provide timely information.

    mysql> SHOW SLAVE STATUS \G
    ...
    Seconds_Behind_Master: 146
    ...

The SHOW PROCESSLIST command (run on the slave) can also provide an indication of how far behind the slave is. Here, we see the number of seconds that the SQL thread is behind, measured using the difference between the timestamp of the last replicated event and the real time of the slave. For example, if your slaves have been offline for 30 minutes and have reconnected to the master, you would expect to see a value of approximately 1,800 seconds in the Time field of the SHOW PROCESSLIST results. The excerpt below shows this condition. Large values in this field are indicative of significant delays that can result in stale data on the slaves.

    mysql> SHOW PROCESSLIST \G
    ...
    Time: 1814
    ...

Depending on how your replication topology is designed, you may be replicating data for load balancing. In this case, you typically use multiple slaves, directing a portion of the application or users to the slaves for SELECT queries, thereby reducing the burden on the master.

Causes and Cures for Slave Lag

Slave lag can be a nuisance for some replication users. The main reason for lag is the single-threaded nature of the slave (actually, there are two threads, but only one executes events and this is the main culprit in slave lag). For example, a master with a multiple-core CPU can run multiple transactions in parallel and will be faster than a slave that is executing transactions (events from the binary log) in a single thread. We have already discussed some ways to detect slave lag.
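The lag indicators covered so far (Seconds_Behind_Master and the Time field from SHOW PROCESSLIST) can be turned into a simple alert level. The Python sketch below is one possible approach, not a recommended policy: the function name classify_lag and the 60-second and 300-second thresholds are assumptions, and a missing or NULL value is treated as unknown because the value is only meaningful while replication is running.

```python
def classify_lag(seconds_behind_master, warn_after=60, critical_after=300):
    """Map a lag value in seconds to 'ok', 'warning', 'critical', or 'unknown'.

    The value may be a string as printed by the mysql client or an int.
    The thresholds are illustrative defaults, not MySQL settings.
    """
    # No usable value: replication may be stopped, so do not guess.
    if seconds_behind_master in (None, "", "NULL"):
        return "unknown"
    seconds = int(seconds_behind_master)
    if seconds >= critical_after:
        return "critical"
    if seconds >= warn_after:
        return "warning"
    return "ok"


# The two excerpts shown earlier: 146 seconds and 1814 seconds.
print(classify_lag("146"))
print(classify_lag("1814"))
```

Suitable thresholds depend on how stale the data served by the slaves is allowed to be for your application.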
In this section, we discuss some common causes and solutions for reducing slave lag.

There are several causes for slave lag (e.g., network latency). It is possible the slave I/O thread is delayed in reading events from the logs. The most common reason for slave lag is simply that the slave has a single thread to execute all events, whereas the master has potentially many threads executing in parallel. Some other causes include long-running queries with inefficient joins, I/O-bound reads from disk, lock contention, and InnoDB thread concurrency issues.

Now that you know more about what causes slave lag, let us examine some things you can do to minimize it:

Organize your data
    You can see performance improvements by normalizing your data and by using sharding to distribute your data. This helps eliminate duplication of data, but as you saw in Chapter 8, duplication of some data (such as lookup text) can actually improve performance. The idea here is to use just enough normalization and sharding to improve performance without going too far. This is something only you, the owner of the data, can determine either through experience or experimentation.

Divide and conquer
    We know that adding more slaves to handle the queries (scale-out) is a good way to improve performance, but not scaling out enough could still result in slave lag if the slaves are processing a much greater number of queries. In extreme cases, you can see slave lag on all of the slaves. To combat this, consider segregating your data using replication filtering to replicate different databases among your slaves. You can still use scale-out, but in this case you use an intermediary slave for each group of databases you filter, then scale from there.
Identify long-running queries and refactor them
    If long-running queries are the source of slave lag, consider refactoring the query or the operation or application to issue shorter queries or more compact transactions. However, if you use this technique combined with replication filtering, you must use care when issuing transactions that span the replication filter groups. Once you divide a long-running query that should be an atomic operation (a transaction) across slaves, you run the risk of causing data integrity problems.

Load balancing
    You can also use load balancing to redirect your queries to different slaves. This may reduce the amount of time each slave is spending answering queries, thereby leaving more computational time to process replication events.

Ensure you are using the latest hardware
    Clearly, having the best hardware for the job normally equates to better performance. At the very least, you should ensure your slave servers are configured to their optimal hardware capabilities and are at least as powerful as the master.

Reduce lock contention
    Table locks for MyISAM and row-level locks for InnoDB can cause slave lag. If you have queries that result in a lot of locks on MyISAM or InnoDB tables, consider refactoring the queries to avoid as many locks as possible.

Conclusion

This chapter concludes our discussion of the many ways you can monitor MySQL, and provides a foundation for you to implement your own schedules for monitoring virtually every aspect of the MySQL server.

Now that you know the basics of operating system monitoring, database performance, and MySQL monitoring and benchmarking, you have the tools and knowledge to successfully tune your server for optimal performance.

Joel smiled as he compiled his report about the replication issue. He paused and glanced at his doorway. He could almost sense it coming.
“Joel!”

Joel jumped, unable to believe his prediction. “I’ve got the replication problem solved, sir,” he said quickly.

“Great! Send me the details when you get a moment.”

“I also discovered some interesting things about the order processing system.” He noticed Mr. Summerson’s eyebrow raise slightly in anticipation. Joel continued, “It seems we have sized the buffer pool incorrectly. I think I can make some improvements in that area as well.”

Mr. Summerson said, “Monitoring again?”

“Yes, sir. I’ve got some reports on the InnoDB storage engine. I’ll include that in my email, too.”

“Good work. Good work indeed.”

Joel knew that look. His boss was thinking again, and that always led to more work. Joel was surprised when his boss simply walked away slowly. “Well, it seems I finally stumped him.”

[...]... this level of integration (consuming data from a third party), you would need a second instance of a MySQL server on the production relay slave to replicate the data from the strategic partner (192.168.1.101) and use a script to conduct periodic transfers of the data from the second MySQL instance to the primary MySQL instance on the production relay slave. This would achieve the integration depicted in Figure...

... make it part of their creed to prevent failures and ensure reliable access and data to users. However, even properly managed systems can have issues. MySQL replication is no exception. In particular, the slave state is not crash-safe. This means that if the MySQL instance on the slave crashes, it is possible the slave will stop in an undefined state. In the worst case, the relay log or the master.info file...
... configuration change

What Can Go Wrong

There are many things that can go wrong to disrupt replication. MySQL replication is most susceptible to problems with data, be it data corruption or unintended interruptions in the replication stream. System crashes that result in an unsafe and uncontrolled termination of MySQL can also cause replication restarting issues. You should always prepare a backup of your data...

... by examining the server’s logfiles, looking for errors like the following:

    [ERROR] /usr/bin/mysqld: Table 'db1.t1' is marked as crashed and should be repaired

You can use the following command to perform optimization and repair in one step to repair all of the tables for a given database (in this case, db1).

    mysqlcheck -u <user> -p --check --optimize --auto-repair db1

...

... replication problems

Troubleshooting replication problems involving the MySQL Cluster follows the same procedures presented in this chapter. If you are having problems with MySQL Cluster, see Chapter 15 for troubleshooting cluster failures and startup issues.

Seasoned computer users understand that computing systems are prone...

... he originally thought. Joel reached for his handy MySQL book. “Money well spent,” he said as he opened the book to the chapter titled “Protecting Your Investment.”

This chapter focuses on protecting data and providing data recovery. Practical topics in the chapter include backup planning, data recovery, and the procedures for backing up and restoring in MySQL. Any discussion of these topics would be incomplete...

... Act, see http://www.soxlaw.com/

High Availability Versus Disaster Recovery

Most businesses recognize they must invest in technologies that allow their systems to recover quickly from minor to moderate events. Technologies such as replication, redundant array of inexpensive disks (RAID), redundant power supplies, etc., are all solutions to these needs. These are considered high availability options because...

For MyISAM tables, you can use the myisam-recover option to turn on automatic recovery. There are four modes of recovery. See the online MySQL Reference Manual for more details.

Once you have repaired the affected tables, you must also determine if the tables on the slave have been corrupted. This is necessary if the master and slave share...

... problems (e.g., the slave’s replication account was deleted) or corrupted tables on the master or slave(s). In these cases, you are likely to see connection errors in the console and logs for the slave MySQL server. When this occurs, always check the permissions of the replication user on the master. Ensure the proper privileges are granted to the user defined in either your configuration file or on your...

... topology. When this happens, the server designated as the originating server has failed to terminate the replication of the event. You can solve this problem by using the IGNORE_SERVER_IDS option (available in MySQL versions 5.5.2 and later) with the CHANGE MASTER command, supplying a list of server IDs to ignore for an event. When the missing servers are restored, you must adjust this setting so that events...