mysql> USE cluster_test;
Database changed
mysql> CREATE TABLE ctest (i INT) ENGINE=NDBCLUSTER;
Query OK, 0 rows affected (0.84 sec)
mysql> INSERT INTO ctest () VALUES (1);
Query OK, 1 row affected (0.04 sec)
mysql> SELECT * FROM ctest;
+------+
| i    |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

The next step is to select this row from the other SQL node in the cluster (node2), and then insert another row from the second node:

[root@node2 ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.1.34-ndb-7.0.6-cluster-gpl MySQL Cluster Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| cluster_test       |
| mysql              |
| test               |
+--------------------+
4 rows in set (0.03 sec)
mysql> use cluster_test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+------------------------+
| Tables_in_cluster_test |
+------------------------+
| ctest                  |
+------------------------+
1 row in set (0.04 sec)
mysql> SELECT * from ctest;
+------+
| i    |
+------+
|    1 |
+------+
1 row in set (0.00 sec)
mysql> INSERT INTO ctest () VALUES (2);
Query OK, 1 row affected (0.01 sec)

Finally, we return to the first node and check that the data inserted on node2 is visible:

mysql> SELECT * FROM ctest;
+------+
| i    |
+------+
|    2 |
|    1 |
+------+
2 rows in set (0.00 sec)
mysql>

Congratulations! The cluster works, and you have created a table in it. You can repeat these tests while powering off storage nodes and watch the cluster continue to work. With a single storage node powered off, everything will continue to work (except for any SQL node that was running on the powered-off node, of course!).

Restarting a MySQL Cluster without downtime

One of the key aspects of a highly available system is that routine maintenance can be carried out without any service interruption to users. MySQL Cluster achieves this through its shared-nothing architecture, and in this recipe we will show how to restart each of the three types of nodes online (without taking the cluster down as a whole).

Getting ready

For this recipe, we will be using the following cluster setup:

- Four storage nodes
- One management node
- Two SQL nodes

The output of the SHOW command on the management client for this cluster is as follows:

ndb_mgm> SHOW
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=3    @10.0.0.1  (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0)
id=4    @10.0.0.2  (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0)
id=5    @10.0.0.3  (mysql-5.1.34 ndb-7.0.6, Nodegroup: 1, Master)
id=6    @10.0.0.4  (mysql-5.1.34 ndb-7.0.6, Nodegroup: 1)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.0.0.5  (mysql-5.1.34 ndb-7.0.6)

[mysqld(API)]   4 node(s)
id=11   @10.0.0.1  (mysql-5.1.34 ndb-7.0.6)
id=12   @10.0.0.2  (mysql-5.1.34 ndb-7.0.6)
id=13 (not connected, accepting connect from any host)
id=14 (not connected, accepting connect from any host)

How to do it…

In this section, we will show how to restart each type of node using our example cluster.

Restarting a storage node

There are two ways to restart a storage node. For both methods, the first step is to check the output of the SHOW command in the management client to ensure that there is at least one other online (not starting or shut down) node in the same nodegroup. In our example cluster, we have storage nodes 3 and 4 in nodegroup 0, and storage nodes 5 and 6 in nodegroup 1.
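This check matters because restarting a node whose nodegroup partner is already down will take the whole cluster with it. If you restart nodes often, the check can be scripted rather than eyeballed; the following is a minimal sketch, not part of the original recipe. The node IDs match the example cluster, and the grep against ndb_mgm's human-readable status line is an assumption that may need adjusting for other versions:

#!/bin/sh
# Restart storage node 3 only if its nodegroup partner (node 4)
# reports "started" in the management client output.
NODE=3
PARTNER=4
if ndb_mgm -e "$PARTNER status" | grep -q "started"; then
    ndb_mgm -e "$NODE RESTART"
else
    echo "Node $PARTNER is not started; refusing to restart node $NODE" >&2
    exit 1
fi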
The two options for restarting a node are as follows:

Firstly, from the management client a node can be restarted with the <nodeid> RESTART command:

ndb_mgm> 3 status
Node 3: started (mysql-5.1.34 ndb-7.0.6)
ndb_mgm> 3 RESTART
Node 3: Node shutdown initiated
Node 3: Node shutdown completed, restarting, no start.
Node 3 is being restarted
ndb_mgm> 3 status
Node 3: starting (Last completed phase 4) (mysql-5.1.34 ndb-7.0.6)
Node 3: Started (version 7.0.6)
ndb_mgm> 3 status
Node 3: started (mysql-5.1.34 ndb-7.0.6)

Secondly, on the storage node itself the ndbd process can simply be killed and restarted.

Remember that ndbd runs as two processes: an angel process in addition to the main process. You must kill both of these processes at the same time.

[root@node4 ~]# ps aux | grep ndbd
root      4082  0.0  0.4  33480  2316 ?     Ss   Jul08  0:00 ndbd --initial
root      4134  0.1 17.4 426448 91416 ?     Sl   Jul08  0:02 ndbd --initial
root      4460  0.0  0.1  61152   720 pts/0 R+   00:11  0:00 grep ndbd
[root@node4 ~]# kill 4082 4134
[root@node4 ~]# ps aux | grep ndbd | grep -v grep | wc -l
0

Once we have killed the ndbd processes and ensured that no processes named ndbd remain, we can restart the ndbd process:

[root@node4 ~]# ndbd
2009-07-09 00:12:03 [ndbd] INFO -- Configuration fetched from '10.0.0.5:1186', generation: 1

If you leave a management client connected during this process, you can see the management node pick up on the dead node and then allow it to rejoin the cluster:

ndb_mgm> Node 6: Node shutdown completed. Initiated by signal 15.
ndb_mgm> Node 6: Started (version 7.0.6)
ndb_mgm> 6 status
Node 6: started (mysql-5.1.34 ndb-7.0.6)

Remember that you can restart more than one node at a time, but you must always have one node fully started in each nodegroup or your cluster will shut down.

Restarting a management node

Restarting a management node is best done by simply killing the ndb_mgmd process and restarting it. While there is no management node in the cluster, there is no central logging for the cluster, and storage and API nodes cannot start or restart (so if they fail, they will stay dead). In addition, processes that are initiated from the management client (such as hot backups) cannot be run.

Firstly, we pass the process ID of the ndb_mgmd process to the kill command:

[root@node5 mysql-cluster]# kill $(pidof ndb_mgmd)

This kills the management node, so now start it again:

[root@node5 mysql-cluster]# ndb_mgmd --config-file=config.ini
2009-07-09 00:30:00 [MgmSrvr] INFO -- NDB Cluster Management Server. mysql-5.1.34 ndb-7.0.6
2009-07-09 00:30:00 [MgmSrvr] INFO -- Loaded config from '//mysql-cluster/ndb_1_config.bin.1'

Finally, verify that the management node is working:

[root@node5 mysql-cluster]# ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> 1 status
Node 1: connected (Version 7.0.6)

Restarting a SQL node

Restarting a SQL node is trivial: just restart the mysqld process as normal, and carry out the checks mentioned earlier to ensure that the node rejoins correctly.

[root@node1 ~]# service mysql restart
Shutting down MySQL.                                       [  OK  ]
Starting MySQL.                                            [  OK  ]
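One quick way to confirm that the restarted SQL node has actually rejoined the cluster, rather than merely started, is to read from an NDB table and to check the node's API slot on the management node. This is a sketch of the idea rather than part of the recipe; it reuses the cluster_test.ctest table from the first recipe, and the exact error text on failure may vary between versions:

# From the SQL node: a SELECT against an NDB table only succeeds once
# mysqld has reconnected to the cluster; until then it fails with an NDB
# error (for example, "Got error 4009 'Cluster Failure'").
[root@node1 ~]# mysql -e "SELECT COUNT(*) FROM cluster_test.ctest;"

# From the management node: the SQL node's API slot (id=11 for node1 in
# the example cluster) should show a host address again rather than
# "not connected".
[root@node5 ~]# ndb_mgm -e SHOW | grep "id=11"
id=11   @10.0.0.1  (mysql-5.1.34 ndb-7.0.6)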
Recovering from a cluster shutdown

This recipe covers the procedure to follow in the case of a cluster shutdown. We discuss the split-brain problem and also explain how to start a cluster without all of its storage nodes.

How to do it…

This section covers the procedure to follow, both for a partial failure (some nodes fail, but the cluster remains operational) and a complete failure (all nodes fail!).

Partial cluster failure

In the event of a single node failing, you will notice the following:

- If the node that fails is a management node, no immediate problem occurs, but other nodes cannot restart and activities requiring a management node (online backups, centralized logging) will not take place
- If the node that fails is a storage node, then assuming one node remains in each nodegroup, there is no immediate effect (although a small number of transactions may be rolled back)
- If the node that fails is a SQL node, any clients connected to that SQL node will either have to use another SQL node or will fail, but there is no effect on the cluster itself

To recover from a partial shutdown, carry out the restart procedure in the previous recipe, Restarting a MySQL Cluster without downtime; however, it may not be necessary to kill the existing process. If you do find a zombied (crashed) process remaining, you should first kill that process and then restart the node.

Complete cluster failure

The following problems can cause a total cluster shutdown:

- A catastrophic software bug that causes multiple cluster nodes to fail
- Every node in the cluster losing power (an entire facility failing, for example)
- A split-brain condition (discussed shortly)
- Malicious or mistaken users gaining access to the management node or any storage node

A split-brain problem arises when communication failures split a cluster into groups of nodes that cannot see each other. If we have four nodes and they are split into two groups of two nodes each (perhaps through the failure of a switch), there is absolutely no way for either pair to tell whether the other pair is working or not. In this case, the only safe thing to do is to shut both pairs down, even though each pair could have all the data required to carry on working. Imagine what would happen if your four storage nodes continued working as two separate two-node clusters, and you then had to attempt to reconcile two completely different databases!

MySQL Cluster gets around this with the concept of an arbitrator: put simply, the cluster nodes elect a single node to act as the arbitrator while all nodes can still communicate. In the event of nodes losing contact with each other, each new group of nodes asks the following questions:

- Do we (the nodes I can now talk to) have enough nodes to remain viable (at least one storage node per nodegroup)?
- Can I see the previously agreed arbitrator?

Unless the answer to both questions is yes, the nodes will shut down, with an error similar to the following appearing in the log:

Forced node shutdown completed. Caused by error 2305: 'Node lost connection to other nodes and cannot form a unpartitioned cluster, please investigate if there are error(s) on other node(s)(Arbitration error). Temporary error, restart node'.

The arbitrator is typically the management node, but it can be a SQL node; you can specify ArbitrationRank=1 in the config.ini file to give a node high priority in becoming the cluster arbitrator.

Recovery from a full cluster shutdown is conceptually simple: we need to start all storage nodes again. It is likely that the storage nodes will have killed themselves or been killed by whatever caused the outage, so the procedure is identical to the rolling cluster restart, just without killing the existing processes. In other words, start the management node (ndb_mgmd), start all storage nodes (ndbd), and start all SQL nodes (start or restart mysqld).
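Expressed as commands against the example cluster, the recovery sequence looks like the following sketch (the hostnames and config file path are the ones used throughout this chapter; each command runs on the host indicated by its prompt):

# 1. Start the management node first, so the other nodes have somewhere
#    to fetch their configuration from.
[root@node5 mysql-cluster]# ndb_mgmd --config-file=config.ini

# 2. Start ndbd on every storage node (node1 through node4). Do not pass
#    --initial here: that flag wipes the node's local data, which is
#    exactly what we need for recovery.
[root@node1 ~]# ndbd

# 3. Once the storage nodes report "started" in the management client,
#    start or restart mysqld on each SQL node.
[root@node1 ~]# service mysql start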
How it works…

During a full cluster startup, the storage nodes start and must use the local copies of data that they stored on disk (so it is likely that there will be some data loss after a total cluster shutdown). By default, a running MySQL Cluster commits a local checkpoint (LCP), a copy of all the local data held on disk, every time 4 MB of changes have been made to the cluster since the previous LCP. A MySQL Cluster also writes a global checkpoint (all the transactions that have occurred since the last LCP) to disk every two seconds. A storage node starting from a full shutdown restores its local data up to the last LCP, and then applies the transactions recorded up to the latest global checkpoint, so the recovered data is at most around two seconds out of date. Because global checkpoints are made consistently across the cluster, this allows all nodes to recover to a consistent point.

There's more…

After a total cluster shutdown, it may happen that a storage node is damaged and cannot be repaired quickly. It is possible to start your cluster with only one storage node per nodegroup. To do this, pass the --nowait-nodes=<NODES> option to ndbd, where <NODES> is a comma-separated list of the nodes not to wait for. For example, take this cluster:

[ndbd(NDB)]     4 node(s)
id=3    @10.0.0.1  (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0)
id=4    @10.0.0.2  (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0, Master)
id=5    @10.0.0.3  (mysql-5.1.34 ndb-7.0.6, Nodegroup: 1)
id=6    @10.0.0.4  (mysql-5.1.34 ndb-7.0.6, Nodegroup: 1)

We could potentially start with nodes [ (3 or 4) and (5 or 6) ]. In this example, we will start without node3 (this assumes that your cluster is already shut down). Run the following command on each of the nodes that you want to start:

[root@node4 ~]# ndbd --nowait-nodes=3
2009-07-09 23:32:02 [ndbd] INFO -- Configuration fetched from '10.0.0.5:1186', generation: 1

The cluster should start without node3:

ndb_mgm> ALL STATUS
Node 3: not connected
Node 4: started (mysql-5.1.34 ndb-7.0.6)
Node 5: started (mysql-5.1.34 ndb-7.0.6)
Node 6: started (mysql-5.1.34 ndb-7.0.6)

Clearly, at this point the cluster has a single point of failure (node4 is now the only node in nodegroup 0), so node3 should be repaired and started again as quickly as possible.
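Once node3 has been repaired, bringing it back in is simply the storage node restart shown earlier in this chapter. A short sketch follows; node ID 3 runs on host node1 (10.0.0.1) in the example cluster, and the --initial flag is only needed if the repair destroyed the node's local data files (an assumption about your particular failure, not part of the example):

# Start ndbd on the repaired host; it fetches its configuration from the
# management node and resynchronizes its data from node 4 in nodegroup 0.
# Add --initial only if the node's local data files were lost.
[root@node1 ~]# ndbd

# Watch from the management client as the node moves from "starting"
# to "started".
ndb_mgm> 3 status
Node 3: started (mysql-5.1.34 ndb-7.0.6)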