MySQL Cluster Solution -- High Availability and Load Balancing

Contents:
1. MySQL's market share
2. Why MySQL is so popular
3. Strengths and weaknesses of MySQL
4. What network services require
5. What a MySQL cluster is
6. What load balancing is
7. Deploying and implementing a MySQL cluster
8. Configuring and testing load balancing
9. Testing the MySQL cluster system (test plan + test scripts + analysis of results)

1. MySQL's market share
MySQL is the world's most popular open-source database, with more than 11 million active installations and over 50,000 downloads every day. It gives developers, DBAs and IT managers a choice that balances reliability, performance and ease of use. According to the market research firm Evans Data Corporation, MySQL reached a 25% share of the databases used by developers over the past two years. Open source has become an indispensable part of today's IT landscape, and its market share will keep growing (figure not reproduced). Sun spent one billion US dollars this January to acquire MySQL, a move into open source and databases.

2. Why MySQL is so popular
The major database systems are Oracle, SQL Server, MySQL and DB2. Each of them has its own shortcomings and its own history of development, and MySQL is no exception.

3. Strengths and weaknesses of MySQL
Strengths:
1) The source code is open and the software is free
2) Cross-platform
3) APIs for many development languages and packages
4) Multithreaded
5) Small, flexible and fast
6) Supports a wide range of character sets
7) Ships with a variety of connection and tuning tools
Weaknesses:
1) Still incomplete -- many database features are not supported
2) Only suited to small and medium applications; for large applications it is best used to complement other databases
3) The practical data volume of a single database only reaches the tens of millions of rows

4. What network services require
With the rapid growth of the Internet and its deepening influence on our lives, more and more individuals shop, play, relax, communicate and look for information online, and more and more companies move their contact with customers and business partners onto the network, closing deals and maintaining customer relationships over it. The number of Internet users and the volume of traffic are growing geometrically, which places very high demands on the scalability of network services. A popular web site, for example, may see its traffic grow so sharply that it can no longer answer requests in time, forcing users to wait and degrading the quality of service. And as critical applications such as e-commerce run on the network, any unexpected outage causes incalculable losses, so high availability matters more and more. The demand for highly scalable, highly available network services built with hardware and software keeps growing, and it boils down to these points:
1) Scalability: when the load grows, the system can be expanded to meet it without degrading the quality of service.
2) High availability: even though individual pieces of hardware and software will fail, the service as a whole must be available 24 hours a day, 7 days a week.
3) Manageability: the system may be physically large, but it should remain easy to manage.
4) Cost-effectiveness: the whole system must be affordable to build and run.
A single server clearly cannot keep up with ever-growing load, and simply upgrading to a bigger machine has these drawbacks: first, the upgrade is cumbersome, switching machines interrupts the service and wastes the existing hardware; second, the higher-end the server, the more it costs; third, if that one server or its application fails, the whole service stops.
Server clusters interconnected by a high-performance network or LAN are becoming the structure of choice for highly scalable, highly available network services. This loosely coupled structure scales better and offers a better price/performance ratio than tightly coupled multiprocessor systems: the PC or RISC servers and standard network equipment that make up the cluster are mass-produced and cheap. The challenging part is how to provide a parallel network service on the cluster that is transparent to the outside and still scalable and available.
To meet these needs, the LVS project provides load-balancing schedulers based on IP-level and on content-based request dispatching, implemented in the Linux kernel, which turn a group of servers into a scalable, highly available cluster called the Linux Virtual Server. In an LVS cluster the structure of the cluster is transparent to the client: accessing the cluster's service feels like accessing one high-performance, highly available server, and client programs need no modification. Scalability is achieved by transparently adding and removing nodes in the cluster; high availability is achieved by detecting node or service failures and reconfiguring the system correctly.

5. What a MySQL cluster is
MySQL clusters come in two kinds: synchronous and asynchronous.
Synchronous cluster (MySQL Cluster) -- structure: data + SQL + management nodes. Characteristics:
1) Storage is in memory, so the hardware requirements are modest but the memory requirements are large; the data-to-memory ratio is roughly 1:1.1;
2) The data is held on several servers at once, so redundancy is good;
3) Speed is average;
4) Tables must be created with ENGINE=NDBCLUSTER;
5) It scales well;
6) It provides high availability and load balancing and can support large applications;
7) It requires a specific MySQL build, such as the precompiled MAX edition;
8) It is easy to configure and manage, and it does not lose data.
Asynchronous cluster (MySQL replication) -- structure: master + slave. Characteristics:
1) Master and slaves synchronise data asynchronously;
2) The data sits on several servers, but redundancy is only average;
3) It is fast;
4) It scales poorly;
5) It offers no built-in high availability or load balancing (read/write splitting can only be done at the application level to relieve the master);
6) It is harder to configure and manage, and data can be lost.

6. What load balancing is
A director distributes user requests across the real servers, and the responses are returned to the user. Load balancing is flexible to deploy and can satisfy all kinds of requirements.
Implementations:
Hardware: BIG/IP, Cisco, IBM (expensive)
Software: LVS (free)
LVS encapsulates and forwards the request packets at the data-link and network layers, and offers three forwarding modes to cover different needs:
1) DR: direct routing (see the sketch right after this list)
2) TUN: IP tunneling
3) NAT: network address translation
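To make DR mode concrete, the rules a director holds for a MySQL virtual service look roughly like the hand-built sketch below. The VIP and real-server addresses are the ones used in the deployment later in this document; in that deployment ldirectord maintains these rules automatically, so this is only an illustration of what ends up in the kernel's IPVS table.
# LVS-DR rules for a MySQL service, built by hand -- illustration only
ipvsadm -C                                    # clear the current IPVS table
ipvsadm -A -t 192.168.131.105:3306 -s wrr     # virtual service on the VIP, weighted round robin
ipvsadm -a -t 192.168.131.105:3306 -r 192.168.131.77:3306  -g -w 1   # real server 1, direct routing
ipvsadm -a -t 192.168.131.105:3306 -r 192.168.131.101:3306 -g -w 1   # real server 2, direct routing
ipvsadm -L -n                                 # verify the table
With -g (gateway/DR mode) the director only rewrites the destination MAC address, so the real servers must carry the VIP themselves and must not answer ARP for it -- exactly what the arptables and lo:0 steps later in this document take care of.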
7. Deploying and implementing a MySQL cluster
Requirements (free software only):
1) Two low-end directors (active and standby)
2) A heartbeat link connecting the two directors, used to detect whether the peer is alive
3) Two or more real servers
General structure: (figure not reproduced). Anyone interested can study each of the three LVS modes above in depth.

1) Assume four servers (the minimum configuration recommended by MySQL):
192.168.131.164 (ndb1): mysqld (MySQL API); ndb_mgmd (management node, master); heartbeat (director, master)
192.168.131.26  (ndb2): mysqld (MySQL API); ndb_mgmd (management node, backup); heartbeat (director, standby)
192.168.131.77  (sql1): mysqld (MySQL API, real server); ndbd (storage node); arptables (ARP handling)
192.168.131.101 (sql2): mysqld (MySQL API, real server); ndbd (storage node); arptables (ARP handling)

2) Installing and configuring the servers and the network
(Repeat the following on each of the four servers.)
Installation: install CentOS 5.2 and select:
Clustering
Storage Clustering
MySQL itself does not need to be installed, but all the perl-mysql-xxx packages do
Development tools and libraries
The sshd service
SELinux disabled
No extra language support; default US English
Set the hostname:
vi /etc/sysconfig/network
HOSTNAME=xxx
Check the hostname with uname -a; it must match the table above exactly, otherwise there will be problems.
vi /etc/hosts
192.168.131.164 ndb1
192.168.131.26  ndb2
192.168.131.77  sql1
192.168.131.101 sql2
Update the systems:
# rpm --import
# yum update -y && yum -y install lynx libawt xorg-x11-deprecated-libs nx freenx arptables_jf httpd-devel
Download the MySQL Cluster packages (I used the 5.0.67 community edition):
[root@ndb1 RHEL5]# ls -lh MySQL* | awk '{print $9}'
MySQL-client-community-5.0.67-0.rhel5.i386.rpm
MySQL-clusterextra-community-5.0.67-0.rhel5.i386.rpm
MySQL-clustermanagement-community-5.0.67-0.rhel5.i386.rpm
MySQL-clusterstorage-community-5.0.67-0.rhel5.i386.rpm
MySQL-clustertools-community-5.0.67-0.rhel5.i386.rpm
MySQL-devel-community-5.0.67-0.rhel5.i386.rpm
MySQL-server-community-5.0.67-0.rhel5.i386.rpm
MySQL-shared-community-5.0.67-0.rhel5.i386.rpm
MySQL-shared-compat-5.0.67-0.rhel4.i386.rpm
MySQL-shared-compat-5.0.67-0.rhel5.i386.rpm
MySQL-test-community-5.0.67-0.rhel5.i386.rpm
perl-HTML-Template-2.9-1.el5.rf.noarch.rpm
[root@ndb1 RHEL5]#
Install these packages on the servers; if a package or library is missing during installation, install it with yum install xxxx.
Create the data directory:
# mkdir -p /var/lib/mysql-cluster
Then install the cluster components on each machine:
# rpm -Uvh MySQL-xx-xx.rpm   (leave out the components a given machine does not need)
On 164 and 26 I installed:
[root@ndb1 RHEL5]# rpm -aq | grep MySQL
MySQL-clusterstorage-community-5.0.67-0.rhel5
MySQL-clustertools-community-5.0.67-0.rhel5
MySQL-clustermanagement-community-5.0.67-0.rhel5
MySQL-shared-community-5.0.67-0.rhel5
perl-DBD-MySQL-3.0007-1.fc6
MySQL-server-community-5.0.67-0.rhel5
[root@ndb1 RHEL5]#
On 101 and 77 I installed:
[root@sql1 ~]# rpm -aq | grep MySQL
MySQL-clusterstorage-community-5.0.67-0.rhel4
MySQL-devel-community-5.0.67-0.rhel4
MySQL-server-community-5.0.67-0.rhel4
MySQL-client-community-5.0.67-0.rhel4
MySQL-shared-community-5.0.67-0.rhel4
[root@sql1 ~]#
The following is done on ndb1 (164) and ndb2 (26):
[root@ndb1 ~]# vi /var/lib/mysql-cluster/config.ini
[NDBD DEFAULT]
NoOfReplicas=2
DataMemory=800M
IndexMemory=400M
[MYSQLD DEFAULT]
[NDB_MGMD DEFAULT]
[TCP DEFAULT]
# Section for the cluster management nodes
[NDB_MGMD]
# IP address of the first management node (this system)
Id=1
HostName=192.168.131.164
[NDB_MGMD]
# IP address of the second management node
Id=2
HostName=192.168.131.26
# Section for the storage nodes
[NDBD]
# IP address of the first storage node
HostName=192.168.131.77
DataDir=/var/lib/mysql-cluster
[NDBD]
# IP address of the second storage node
HostName=192.168.131.101
DataDir=/var/lib/mysql-cluster
# one [MYSQLD] per storage node
The following is done on the MySQL API nodes. (I set up seven API slots here so that more APIs can join at any time; the [MYSQLD] sections this needs in config.ini are sketched after the my.cnf below.) The mysqld API configuration file:
vi /etc/my.cnf
[root@ndb1 ~]# cat /etc/my.cnf
[mysqld]
ndbcluster
ndb-connectstring = "host=192.168.131.164,host=192.168.131.26"
[ndb_mgm]
connect-string = "host=192.168.131.164,host=192.168.131.26"
[ndbd]
connect-string = "host=192.168.131.164,host=192.168.131.26"
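The config.ini listing above stops at the comment about the [MYSQLD] sections without showing them. To get the seven API slots that ndb_mgm reports further down (ids 5 to 11), something like the following would be appended to config.ini on both management nodes -- a sketch, since the full original file is not shown:
# Sketch: append seven unbound API (mysqld) slots to config.ini on ndb1 and ndb2.
# Empty [MYSQLD] sections accept a connection from any host, which matches the
# "accepting connect from any host" entries in the ndb_mgm output below.
cat >> /var/lib/mysql-cluster/config.ini <<'EOF'
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
EOF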
Start ndb_mgmd, ndbd and mysqld in that order:
164/26:       ndb_mgmd -f /var/lib/mysql-cluster/config.ini
77/101:       ndbd --initial
164/26/77/101: /etc/rc.d/init.d/mysql start
On the management nodes ndb1 (164) and ndb2 (26), check the state of every node:
[root@ndb1 ~]# ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> show
Connected to Management Server at: 192.168.131.164:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=3 @192.168.131.77 (Version: 5.0.67, Nodegroup: 0, Master)
id=4 @192.168.131.101 (Version: 5.0.67, Nodegroup: 0)
[ndb_mgmd(MGM)] 2 node(s)
id=1 @192.168.131.164 (Version: 5.0.67)
id=2 @192.168.131.26 (Version: 5.0.67)
[mysqld(API)] 7 node(s)
id=5 @192.168.131.101 (Version: 5.0.67)
id=6 @192.168.131.26 (Version: 5.0.67)
id=7 @192.168.131.164 (Version: 5.0.67)
id=8 @192.168.131.77 (Version: 5.0.67)
id=9 (not connected, accepting connect from any host)
id=10 (not connected, accepting connect from any host)
id=11 (not connected, accepting connect from any host)
ndb_mgm>
This output means everything is working. Add the services to the boot sequence:
164/26:
echo 'ndb_mgmd -f /var/lib/mysql-cluster/config.ini' > /etc/rc.d/init.d/ndb_mgmd
chmod 755 /etc/rc.d/init.d/ndb_mgmd
77/101:
echo 'ndbd' > /etc/rc.d/init.d/ndbd
chmod 755 /etc/rc.d/init.d/ndbd
chkconfig --level 2345 ndbd on
At this point the MySQL Cluster configuration is complete. Two points to stress:
1) The data lives in memory, so the NDB nodes need plenty of RAM. At a ratio of 1:1.1, 3.6 GB of data needs about 4 GB of memory.
2) Both NDB and mysqld (the API) are memory-hungry, so it is recommended to keep the NDB management on 164 and 26. You may see warnings at start-up, but they are harmless.
Check the data and memory situation:
[root@sql2 ~]# top
top - 16:39:36 up 1:59, 1 user, load average: 1.37, 0.76, 0.60
Tasks: 80 total, 2 running, 78 sleeping, 0 stopped, 0 zombie
Cpu(s): 4.0%us, 4.0%sy, 0.0%ni, 87.3%id, 2.9%wa, 0.2%hi, 1.5%si, 0.0%st
Mem: 2075600k total, 2005868k used, 69732k free, 68256k buffers
Swap: 2031608k total, 0k used, 2031608k free, 1400812k cached
PID   USER  PR NI VIRT  RES  SHR  S %CPU %MEM TIME+    COMMAND
2306  mysql 25 0  119m  15m  3952 S 22   0.8  10:20.94 mysqld
23791 root  15 0  1587m 484m 31m  R 20   23.9 9:34.97  ndbd
This storage node has only 2 GB of RAM, and config.ini hands 1.2 GB of it to NDB, so together with what mysqld uses there is very little memory left.
Check the size of the data on the storage node:
[root@sql2 ~]# cd /var/lib/mysql-cluster/ndb_4_fs/
[root@sql2 ndb_4_fs]# du -lh
1.3GB
Connect to the APIs and create the database. Because all four machines act as mysqld APIs, the database must be created on each of them; run the following on all four APIs (the CREATE DATABASE and GRANT have to be repeated on every API, while the table itself only needs to be created once -- NDB makes it visible to the other APIs):
# mysql -uroot -pxxxxxxxxxxxx -A
mysql> create database testdatabase;
mysql> grant all on testdatabase.* to root@'192.168.131.%' identified by 'xxxxxxxxxxxxxxx';
mysql> flush privileges;
mysql> create table test (i int(1)) engine=ndbcluster;
mysql> insert into test values (1);
mysql> quit;
Once this is done you can create tables and write rows through any API, and the data is written into the cluster and visible from every other API. Connect to each server in turn to check:
# mysql -uroot -pxxxxxxxxxxxx -A
mysql> use testdatabase;
mysql> select * from test;
If every server returns exactly the same result, MySQL Cluster is working properly; a small script for this check follows.
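A minimal sketch of that check, run from one of the cluster machines so the root@'192.168.131.%' grant above applies; the password placeholder and the testdatabase.test table are the ones created in the steps above.
#!/bin/sh
# Compare the row count of testdatabase.test on every SQL node (API).
# Replace xxxxxxxxxxxx with the root password set earlier.
for h in 192.168.131.164 192.168.131.26 192.168.131.77 192.168.131.101
do
    c=`mysql -uroot -pxxxxxxxxxxxx -h$h -N -e "select count(*) from testdatabase.test"`
    echo "$h : $c row(s)"
done
If every line reports the same count, the cluster is replicating writes correctly.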
Setting up LVS on the two APIs. Once MySQL Cluster is running -- the same databases created on every API, the privileges granted, and a write on one node stored by all the NDB nodes -- a program could simply connect to any one API and write data. But if the program does not select between APIs and uses only one of them, nothing can be written once that API goes down until the program is changed; and even if the program does choose between APIs, there is no real load balancing, the servers' capacity is not fully used, and high availability is still not achieved. So we now put LVS in front of the two APIs.
For LVS we use Ultra Monkey. First download the heartbeat packages on NDB1 (164) and NDB2 (26):
cd /usr/local/src
mkdir heartbeat
cd heartbeat
# wget xxx.xxx.rpm
I downloaded the following packages:
[root@ndb1 heartbeat]# ls -lh *.rpm | awk '{print $9}';
arptables-noarp-addr-0.99.2-1.rh.el.um.1.noarch.rpm
heartbeat-1.2.3.cvs.20050927-1.rh.el.um.4.i386.rpm
heartbeat-ldirectord-1.2.3.cvs.20050927-1.rh.el.um.4.i386.rpm
heartbeat-pils-1.2.3.cvs.20050927-1.rh.el.um.4.i386.rpm
heartbeat-stonith-1.2.3.cvs.20050927-1.rh.el.um.4.i386.rpm
ipvsadm-1.21-1.rh.el.1.um.1.i386.rpm
libnet-1.1.2.1-1.rh.el.um.1.i386.rpm
perl-Authen-SASL-2.08-1.rh.el.um.1.noarch.rpm
perl-Convert-ASN1-0.18-1.rh.el.um.1.noarch.rpm
perl-IO-Socket-SSL-0.96-1.rh.el.um.1.noarch.rpm
perl-ldap-0.3202-1.rh.el.um.1.noarch.rpm
perl-Mail-IMAPClient-2.2.9-1.rh.el.um.1.noarch.rpm
perl-Net-SSLeay-1.25-1.rh.el.um.1.i386.rpm
perl-Parse-RecDescent-1.94-1.el5.rf.noarch.rpm
perl-Parse-RecDescent-1.94-1.rh.el.um.1.noarch.rpm
perl-XML-NamespaceSupport-1.08-1.rh.el.um.1.noarch.rpm
perl-XML-SAX-0.12-1.rh.el.um.1.noarch.rpm
[root@ndb1 heartbeat]#
The heartbeat setup involves three roles:
1) Master Director (MD) -- the active load balancer
2) Backup Director (BD) -- the standby load balancer
3) Real servers (RS) -- two or more
IP plan (confirm before proceeding):
MD:  eth0 192.168.131.164/24, gateway 192.168.131.1; eth1 10.9.30.1/24
BD:  eth0 192.168.131.26/24, gateway 192.168.131.1; eth1 10.9.30.2/24
VIP: 192.168.131.105/24, gateway 192.168.131.1 -- the single virtual IP that users access
RS1: 192.168.131.101/24, gateway 192.168.131.1
RS2: 192.168.131.77/24, gateway 192.168.131.1
(and so on for any further real servers)
Run the following on every server: confirm the hostname with uname -a; it must match the table above.
On the MD and BD, load the IPVS modules and enable IP forwarding:
# vi modprobe.sh
modprobe ip_vs_dh
modprobe ip_vs_ftp
modprobe ip_vs
modprobe ip_vs_lblc
modprobe ip_vs_lblcr
modprobe ip_vs_lc
modprobe ip_vs_nq
modprobe ip_vs_rr
modprobe ip_vs_sed
modprobe ip_vs_sh
modprobe ip_vs_wlc
modprobe ip_vs_wrr
:wq
# chmod 755 modprobe.sh
# sh modprobe.sh
# vi /etc/modules
ip_vs_dh
ip_vs_ftp
ip_vs
ip_vs_lblc
ip_vs_lblcr
ip_vs_lc
ip_vs_nq
ip_vs_rr
ip_vs_sed
ip_vs_sh
ip_vs_wlc
ip_vs_wrr
# vi /etc/sysctl.conf
change net.ipv4.ip_forward = 0 to net.ipv4.ip_forward = 1
Apply the change:
/sbin/sysctl -p
Install the heartbeat packages on the MD and BD:
# rpm -Uvh perl-xx-xx-xx.rpm
# yum install heartbeat
# rpm -Uvh arptables-noarp-addr-0.99.2-1.rh.el.um.1.noarch.rpm
# rpm -Uvh perl-Mail-POP3Client-2.17-1.el5.centos.noarch.rpm
If a perl package is missing, install it with yum install perl-xx-xx. (Packages installed through perl -MCPAN -e shell did not work well here, for reasons unknown.)
The VIP is actually bound to the two directors, so the directors need a heartbeat between them. The heartbeat uses the eth1 ports, connected with a crossover cable, which keeps it from affecting the other servers.
Configuring heartbeat. Heartbeat has three configuration files -- ha.cf, authkeys and haresources -- plus ldirectord.cf for the ldirectord process, so four files need to be configured in total (a sketch of ldirectord.cf follows after the other three).
# vi ha.cf
logfacility local0
bcast eth1
mcast eth1 225.0.0.1 694 1 0
auto_failback off
node ndb1
node ndb2
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
:wq
# vi authkeys
auth 3
3 md5 514a49f83820e34c877ff48770e48ea7
:wq
# vi haresources
ndb1 ldirectord::ldirectord.cf LVSSyncDaemonSwap::master IPaddr2::192.168.131.105/24/eth0/192.168.131.255
On ndb2 the hostname in this file needs to be changed accordingly.
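The fourth file, ldirectord.cf, is not reproduced in the original text. A sketch consistent with the ipvsadm output shown later (wrr scheduler, direct routing, the two real servers on port 3306, quiescent real servers) might look like the following; the ldirector login and the kingsoft database are taken from the test scripts at the end of this document, and the timeouts and check query are assumptions.
# Sketch of /etc/ha.d/ldirectord.cf -- values inferred from the ipvsadm output
# and test scripts below, not copied from the original setup.
cat > /etc/ha.d/ldirectord.cf <<'EOF'
checktimeout=10
checkinterval=12
autoreload=no
logfile="/var/log/ldirectord.log"
quiescent=yes

virtual=192.168.131.105:3306
        real=192.168.131.77:3306 gate
        real=192.168.131.101:3306 gate
        service=mysql
        checktype=negotiate
        login="ldirector"
        passwd="xxxxxxxxx"
        database="kingsoft"
        request="SELECT 1"
        scheduler=wrr
        protocol=tcp
EOF
With quiescent=yes a failed real server is kept in the table with weight 0 rather than removed, which matches the "Quiescent real server ... (Weight set to 0)" lines in the failover log further down.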
Set permissions and make heartbeat start at boot:
# chmod 600 /etc/ha.d/authkeys
# /sbin/chkconfig --level 2345 heartbeat on
# /sbin/chkconfig --del ldirectord
Start heartbeat:
/etc/init.d/ldirectord stop
/etc/init.d/heartbeat start
On the MD and BD, check whether the VIP has come up:
ip addr sh eth0
[root@ndb1 ha.d]# ip addr sh eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:30:48:28:c6:85 brd ff:ff:ff:ff:ff:ff
    inet 192.168.131.164/24 brd 192.168.131.255 scope global eth0
    inet 192.168.131.105/24 brd 192.168.131.255 scope global secondary eth0
    inet6 fe80::230:48ff:fe28:c685/64 scope link
       valid_lft forever preferred_lft forever
[root@ndb1 ha.d]#
[root@ndb2 ~]# ip addr sh eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:30:48:28:c4:af brd ff:ff:ff:ff:ff:ff
    inet 192.168.131.26/24 brd 192.168.131.255 scope global eth0
    inet6 fe80::230:48ff:fe28:c4af/64 scope link
       valid_lft forever preferred_lft forever
[root@ndb2 ~]#
The VIP is now live on the MD (164).
Check the ldirectord process:
[root@ndb1 ha.d]# /usr/sbin/ldirectord ldirectord.cf status
ldirectord for /etc/ha.d/ldirectord.cf is running with pid: 5596
[root@ndb1 ha.d]#
[root@ndb2 ~]# /usr/sbin/ldirectord ldirectord.cf status
ldirectord is stopped for /etc/ha.d/ldirectord.cf
[root@ndb2 ~]#
The director holding the VIP should be running; the standby should be stopped.
Use ipvsadm to check that packet forwarding is in place:
[root@ndb1 ha.d]# /sbin/ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.131.105:3306 wrr
  -> 192.168.131.77:3306          Route   1      3          3034
  -> 192.168.131.101:3306         Route   1      3          3038
[root@ndb1 ha.d]#
[root@ndb2 ~]# /sbin/ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
[root@ndb2 ~]#
Forwarding is active on the MD.
Check the state of LVSSyncDaemonSwap on the MD and BD:
[root@ndb1 ha.d]# /etc/ha.d/resource.d/LVSSyncDaemonSwap master status
master running (ipvs_syncmaster pid: 5689)
[root@ndb1 ha.d]#
[root@ndb2 ~]# /etc/ha.d/resource.d/LVSSyncDaemonSwap master status
master stopped (ipvs_syncbackup pid: 5493)
[root@ndb2 ~]#
Again, the standby is in the stopped state.
The following is done on the real servers.
ARP restrictions: in DR mode the director rewrites only the MAC address and passes requests straight to a real server, and every real server also carries the VIP itself; for this to work the real servers must be prevented from answering ARP requests for the VIP, which is what arptables does here:
# /etc/init.d/arptables_jf stop
# /usr/sbin/arptables-noarp-addr 192.168.131.105 start
# /etc/init.d/arptables_jf save
# /sbin/chkconfig --level 2345 arptables_jf on
# /etc/init.d/arptables_jf start
View the resulting chains:
[root@sql2 mysql-cluster]# /sbin/arptables -L -v -n
Chain IN (policy ACCEPT 29243 packets, 819K bytes)
pkts bytes target in out source-ip destination-ip   source-hw destination-hw hlen op hrd pro
54   1512  DROP   *  *   0.0.0.0/0 192.168.131.105  00/00     00/00          any  0000/0000 0000/0000 0000/0000
Chain OUT (policy ACCEPT 3931 packets, 110K bytes)
pkts bytes target in out  source-ip       destination-ip source-hw destination-hw hlen op hrd pro
0    0     mangle *  eth0 192.168.131.105 0.0.0.0/0      00/00     00/00          any  0000/0000 0000/0000 0000/0000 --mangle-ip-s 192.168.131.101
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target in out source-ip destination-ip source-hw destination-hw hlen op hrd pro
[root@sql2 mysql-cluster]#
[root@sql1 ~]# /sbin/arptables -L -v -n
Chain IN (policy ACCEPT 29375 packets, 823K bytes)
pkts bytes target in out source-ip destination-ip   source-hw destination-hw hlen op hrd pro
54   1512  DROP   *  *   0.0.0.0/0 192.168.131.105  00/00     00/00          any  0000/0000 0000/0000 0000/0000
Chain OUT (policy ACCEPT 3903 packets, 109K bytes)
pkts bytes target in out  source-ip       destination-ip source-hw destination-hw hlen op hrd pro
0    0     mangle *  eth0 192.168.131.105 0.0.0.0/0      00/00     00/00          any  0000/0000 0000/0000 0000/0000 --mangle-ip-s 192.168.131.77
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target in out source-ip destination-ip source-hw destination-hw hlen op hrd pro
[root@sql1 ~]#
ARP traffic for the VIP reaching the real servers is now controlled by these chains.
Next, configure how the real servers accept packets addressed to the VIP; run the following on every RS:
# cp /etc/sysconfig/network-scripts/ifcfg-lo /etc/sysconfig/network-scripts/ifcfg-lo:0
# vi /etc/sysconfig/network-scripts/ifcfg-lo:0
DEVICE=lo:0
IPADDR=192.168.131.105
NETMASK=255.255.255.255
NETWORK=192.168.131.0
BROADCAST=192.168.131.255
ONBOOT=yes
NAME=loopback
# /sbin/ifup lo
Check lo:0:
[root@sql1 ~]# ip addr sh lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet 192.168.131.105/32 brd 192.168.131.255 scope global lo:0
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
[root@sql1 ~]#
[root@sql2 mysql-cluster]# ip addr sh lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet 192.168.131.105/32 brd 192.168.131.255 scope global lo:0
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
[root@sql2 mysql-cluster]#
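Optionally, the ARP suppression can be double-checked from another machine on the 192.168.131.0/24 segment (not one of the real servers): only the active director should answer ARP requests for the VIP. A quick check, assuming the iputils arping tool is installed:
# Run from another host on the same segment, not from a real server.
# Every reply should carry the MAC address of the active director's eth0;
# if a real server's MAC shows up, its arptables/lo:0 setup is not effective.
arping -I eth0 -c 3 192.168.131.105
ip neigh show 192.168.131.105     # the cached ARP entry should match the director's MAC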
Reboot the servers. Do this on every server (confirm the IPs first and make sure nothing on the machines is serving live traffic):
reboot
Start MySQL Cluster in this order:
ndb_mgmd -- on 164/26
ndbd     -- on 101/77
mysqld   -- on all nodes
Check that the services are healthy; run the following on a management node:
# ndb_mgm
[root@ndb1 ha.d]# ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> show
Connected to Management Server at: 192.168.131.164:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=3 @192.168.131.77 (Version: 5.0.67, Nodegroup: 0, Master)
id=4 @192.168.131.101 (Version: 5.0.67, Nodegroup: 0)
[ndb_mgmd(MGM)] 2 node(s)
id=1 @192.168.131.164 (Version: 5.0.67)
id=2 @192.168.131.26 (Version: 5.0.67)
[mysqld(API)] 7 node(s)
id=5 @192.168.131.101 (Version: 5.0.67)
id=6 @192.168.131.26 (Version: 5.0.67)
id=7 @192.168.131.164 (Version: 5.0.67)
id=8 @192.168.131.77 (Version: 5.0.67)
id=9 (not connected, accepting connect from any host)
id=10 (not connected, accepting connect from any host)
id=11 (not connected, accepting connect from any host)
ndb_mgm>
Everything is normal.
Check that heartbeat behaves correctly: shut down the BD and watch the log on the MD:
[root@ndb1 ha.d]# tail -f /var/log/messages
Dec 17 19:42:21 ndb1 heartbeat: [5462]: info: Received shutdown notice from 'ndb2'.
Dec 17 19:42:21 ndb1 heartbeat: [5462]: info: Resources being acquired from ndb2.
Dec 17 19:42:21 ndb1 harc[7085]: info: Running /etc/ha.d/rc.d/status status
Dec 17 19:42:21 ndb1 mach_down[7118]: info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
Dec 17 19:42:21 ndb1 mach_down[7118]: info: mach_down takeover complete for node ndb2.
Dec 17 19:42:21 ndb1 heartbeat: [5462]: info: mach_down takeover complete.
Dec 17 19:42:21 ndb1 ldirectord[7153]: Invoking ldirectord invoked as: /etc/ha.d/resource.d/ldirectord ldirectord.cf status
Dec 17 19:42:21 ndb1 ldirectord[7153]: ldirectord for /etc/ha.d/ldirectord.cf is running with pid: 5596
Dec 17 19:42:21 ndb1 ldirectord[7153]: Exiting from ldirectord status
Dec 17 19:42:21 ndb1 heartbeat: [7086]: info: Local Resource acquisition completed.
Dec 17 19:42:21 ndb1 harc[7175]: info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
Dec 17 19:42:21 ndb1 ip-request-resp[7175]: received ip-request-resp ldirectord::ldirectord.cf OK yes
Dec 17 19:42:21 ndb1 ResourceManager[7196]: info: Acquiring resource group: ndb1 ldirectord::ldirectord.cf LVSSyncDaemonSwap::master IPaddr2::192.168.131.105/24/eth0/192.168.131.255
Dec 17 19:42:22 ndb1 ldirectord[7223]: Invoking ldirectord invoked as: /etc/ha.d/resource.d/ldirectord ldirectord.cf status
Dec 17 19:42:22 ndb1 ldirectord[7223]: ldirectord for /etc/ha.d/ldirectord.cf is running with pid: 5596
Dec 17 19:42:22 ndb1 ldirectord[7223]: Exiting from ldirectord status
Dec 17 19:42:22 ndb1 ResourceManager[7196]: info: Running /etc/ha.d/resource.d/ldirectord ldirectord.cf start
Dec 17 19:42:23 ndb1 ldirectord[7245]: Invoking ldirectord invoked as: /etc/ha.d/resource.d/ldirectord ldirectord.cf start
Dec 17 19:42:23 ndb1 IPaddr2[7291]: INFO: Running OK
If nothing abnormal shows up, heartbeat is working.
Destructive tests
1) Check ndbd: kill the ndbd process on either data node and confirm in ndb_mgm that the node shows as disconnected -- that means the failure was detected. While it is down, add rows to a table, then start the stopped ndbd again and check that the newly written data has been synchronised to it. If it has, everything is in order.
2) Check heartbeat: shut down the MD and watch how the BD reacts:
[root@ndb2 ~]# tail -f /var/log/messages
Dec 17 19:47:22 ndb2 harc[6862]: info: Running /etc/ha.d/rc.d/status status
Dec 17 19:47:23 ndb2 heartbeat: [6852]: info: Comm_now_up(): updating status to active
Dec 17 19:47:23 ndb2 heartbeat: [6852]: info: Local status now set to: 'active'
Dec 17 19:47:23 ndb2 heartbeat: [6852]: info: Starting child client "/usr/lib/heartbeat/ipfail" (498,496)
Dec 17 19:47:23 ndb2 heartbeat: [6879]: info: Starting "/usr/lib/heartbeat/ipfail" as uid 498 gid 496 (pid 6879)
Dec 17 19:47:23 ndb2 heartbeat: [6852]: info: remote resource transition completed.
Dec 17 19:47:23 ndb2 heartbeat: [6852]: info: remote resource transition completed.
Dec 17 19:47:23 ndb2 heartbeat: [6852]: info: Local Resource acquisition completed. (none)
Dec 17 19:47:23 ndb2 heartbeat: [6852]: info: Initial resource acquisition complete (T_RESOURCES(them))
Dec 17 19:47:29 ndb2 ipfail: [6879]: info: Ping node count is balanced.
Dec 17 19:47:43 ndb2 heartbeat: [6852]: info: Received shutdown notice from 'ndb1'.
Dec 17 19:47:43 ndb2 heartbeat: [6852]: info: Resources being acquired from ndb1.
Dec 17 19:47:43 ndb2 heartbeat: [6884]: info: acquire all HA resources (standby).
Dec 17 19:47:43 ndb2 ResourceManager[6911]: info: Acquiring resource group: ndb2 ldirectord::ldirectord.cf LVSSyncDaemonSwap::master IPaddr2::192.168.131.105/24/eth0/192.168.131.255
Dec 17 19:47:43 ndb2 ldirectord[6957]: ldirectord is stopped for /etc/ha.d/ldirectord.cf
Dec 17 19:47:43 ndb2 ldirectord[6957]: Exiting with exit_status 3: Exiting from ldirectord status
Dec 17 19:47:43 ndb2 heartbeat: [6885]: info: Local Resource acquisition completed.
Dec 17 19:47:43 ndb2 ldirectord[6961]: ldirectord is stopped for /etc/ha.d/ldirectord.cf
Dec 17 19:47:43 ndb2 ldirectord[6961]: Exiting with exit_status 3: Exiting from ldirectord status
Dec 17 19:47:43 ndb2 ResourceManager[6911]: info: Running /etc/ha.d/resource.d/ldirectord ldirectord.cf start
Dec 17 19:47:44 ndb2 ldirectord[6986]: Starting Linux Director v1.77.2.32 as daemon
Dec 17 19:47:44 ndb2 ldirectord[6988]: Added virtual server: 192.168.131.105:3306
Dec 17 19:47:44 ndb2 ldirectord[6988]: Quiescent real server: 192.168.131.101:3306 mapped from 192.168.131.101:3306 ( x 192.168.131.105:3306) (Weight set to 0)
Dec 17 19:47:44 ndb2 ldirectord[6988]: Quiescent real server: 192.168.131.77:3306 mapped from 192.168.131.77:3306 ( x 192.168.131.105:3306) (Weight set to 0)
Dec 17 19:47:44 ndb2 ResourceManager[6911]: info: Running /etc/ha.d/resource.d/LVSSyncDaemonSwap master start
Dec 17 19:47:44 ndb2 kernel: IPVS: stopping sync thread 5493 ...
Dec 17 19:47:45 ndb2 kernel: IPVS: sync thread stopped!
Dec 17 19:47:45 ndb2 LVSSyncDaemonSwap[7050]: info: ipvs_syncbackup down
Dec 17 19:47:45 ndb2 kernel: IPVS: sync thread started: state = MASTER, mcast_ifn = eth0, syncid = 0
Dec 17 19:47:45 ndb2 LVSSyncDaemonSwap[7050]: info: ipvs_syncmaster up
Dec 17 19:47:45 ndb2 LVSSyncDaemonSwap[7050]: info: ipvs_syncmaster obtained
Dec 17 19:47:45 ndb2 IPaddr2[7102]: INFO: Resource is stopped
Dec 17 19:47:45 ndb2 ResourceManager[6911]: info: Running /etc/ha.d/resource.d/IPaddr2 192.168.131.105/24/eth0/192.168.131.255 start
Dec 17 19:47:45 ndb2 IPaddr2[7214]: INFO: ip -f inet addr add 192.168.131.105/24 brd 192.168.131.255 dev eth0
Dec 17 19:47:45 ndb2 avahi-daemon[2776]: Registering new address record for 192.168.131.105 on eth0.
Dec 17 19:47:45 ndb2 IPaddr2[7214]: INFO: ip link set eth0 up
Dec 17 19:47:45 ndb2 IPaddr2[7214]: INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.131.105 eth0 192.168.131.105 auto not_used not_used
Dec 17 19:47:45 ndb2 kernel: IPVS: ip_vs_wrr_schedule(): no available servers
Dec 17 19:47:45 ndb2 kernel: IPVS: ip_vs_wrr_schedule(): no available servers
Dec 17 19:47:45 ndb2 IPaddr2[7185]: INFO: Success
Dec 17 19:47:45 ndb2 kernel: IPVS: ip_vs_wrr_schedule(): no available servers
Dec 17 19:47:45 ndb2 heartbeat: [6884]: info: all HA resource acquisition completed (standby).
Dec 17 19:47:45 ndb2 heartbeat: [6852]: info: Standby resource acquisition done [all].
Dec 17 19:47:45 ndb2 harc[7277]: info: Running /etc/ha.d/rc.d/status status
Dec 17 19:47:45 ndb2 kernel: IPVS: ip_vs_wrr_schedule(): no available servers
Dec 17 19:47:45 ndb2 last message repeated 14 times
Dec 17 19:47:45 ndb2 mach_down[7293]: info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
Dec 17 19:47:45 ndb2 kernel: IPVS: ip_vs_wrr_schedule(): no available servers
Dec 17 19:47:45 ndb2 mach_down[7293]: info: mach_down takeover complete for node ndb1.
Dec 17 19:47:45 ndb2 kernel: IPVS: ip_vs_wrr_schedule(): no available servers
Dec 17 19:47:45 ndb2 heartbeat: [6852]: info: mach_down takeover complete.
Dec 17 19:47:45 ndb2 harc[7327]: info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
Dec 17 19:47:45 ndb2 ip-request-resp[7327]: received ip-request-resp ldirectord::ldirectord.cf OK yes
Dec 17 19:47:45 ndb2 ResourceManager[7348]: info: Acquiring resource group: ndb2 ldirectord::ldirectord.cf LVSSyncDaemonSwap::master IPaddr2::192.168.131.105/24/eth0/192.168.131.255
Dec 17 19:47:45 ndb2 kernel: IPVS: ip_vs_wrr_schedule(): no available servers
Dec 17 19:47:46 ndb2 last message repeated 3 times
Dec 17 19:47:46 ndb2 ldirectord[7375]: ldirectord for /etc/ha.d/ldirectord.cf is running with pid: 6988
Dec 17 19:47:46 ndb2 ldirectord[7375]: Exiting from ldirectord status
Dec 17 19:47:46 ndb2 ResourceManager[7348]: info: Running /etc/ha.d/resource.d/ldirectord ldirectord.cf start
Dec 17 19:47:46 ndb2 kernel: IPVS: ip_vs_wrr_schedule(): no available servers
Dec 17 19:47:46 ndb2 last message repeated 6 times
Dec 17 19:47:46 ndb2 IPaddr2[7443]: INFO: Running OK
Dec 17 19:47:46 ndb2 kernel: IPVS: ip_vs_wrr_schedule(): no available servers
Dec 17 19:48:16 ndb2 last message repeated 289 times
Dec 17 19:48:16 ndb2 heartbeat: [6852]: WARN: node ndb1: is dead
Dec 17 19:48:16 ndb2 heartbeat: [6852]: info: Dead node ndb1 gave up resources.
Dec 17 19:48:16 ndb2 heartbeat: [6852]: info: Link ndb1:eth1 dead.
Dec 17 19:48:16 ndb2 ipfail: [6879]: info: Status update: Node ndb1 now has status dead
Dec 17 19:48:16 ndb2 kernel: IPVS: ip_vs_wrr_schedule(): no available servers
Dec 17 19:48:17 ndb2 last message repeated 8 times
Dec 17 19:48:17 ndb2 ipfail: [6879]: info: NS: We are dead. :<
Dec 17 19:48:17 ndb2 ipfail: [6879]: info: Link Status update: Link ndb1/eth1 now has status dead
Dec 17 19:48:17 ndb2 kernel: IPVS: ip_vs_wrr_schedule(): no available servers
Dec 17 19:48:17 ndb2 ipfail: [6879]: info: We are dead. :<
Dec 17 19:48:17 ndb2 ipfail: [6879]: info: Asking other side for ping node count.
Dec 17 19:48:18 ndb2 kernel: IPVS: ip_vs_wrr_schedule(): no available servers
If no errors appear, heartbeat has switched over to the BD. Insert data again at this point to verify; if writes still succeed, the whole configuration works. A quick way to run that check from a client is sketched below.
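A minimal sketch of that verification through the VIP, using the testdatabase.test table created earlier. It assumes the client machine is covered by the root@'192.168.131.%' grant set up above; run it from another account (or widen the grant) if the client sits on a different network.
# Write through the VIP after the failover and read the row count back;
# if both statements succeed, the surviving director is forwarding correctly.
mysql -uroot -pxxxxxxxxxxxx -h192.168.131.105 \
      -e "insert into testdatabase.test values (2); select count(*) from testdatabase.test"
Afterwards, bring ndb1 back up and confirm in ndb_mgm and ipvsadm that both directors and both real servers return to their normal state.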
Test report for MySQL Cluster
The test scripts are deployed on 192.168.8.48, which plays the role of a client reading from and writing to the database.
Test script 1:
[root@localhost mysql-cluster]# cat /data/pay.kingsoft.com/wwwroot/test.php
<?php
$link = mysql_connect('192.168.131.105', 'ldirector', 'xxxxxxxxx');
mysql_select_db('kingsoft', $link);
$sql = "insert into `preference`(`id`,`preferenceSerialNumber`,`username`,`preferenceTypeId`,`isExpired`,`isUsed`,`preferenceUsername`,`equalMoney`,`genDatetime`,`useDatetime`,`grantDatetime`,`expriedDatetime`) values ( NULL,'514a49f83820e34c877ff48770e48ea7','liujun','2','1','1','kingsoft','512.23','2008-12-03','2008-12-03','2008-12-03','2008-12-03')";
for($i = 0; $i < 100; $i++){
    mysql_query($sql);
}
mysql_close($link);
?>
Test script 2:
[root@localhost mysql-cluster]# cat test.sh
#!/bin/sh
i=0;
j=0;
while [ $i -lt 1000 ]
do
    wget -q ;
    i=`expr $i + 1`;
done
sleep 2;
find . -name "test.php.*" | xargs rm -rf ;
while [ $j -lt 1000 ]
do
    mysql -uldirector -pxxxxxxxxxxx -h192.168.131.105 -e "use kingsoft; insert into preference(preferenceSerialNumber,username,preferenceTypeId,preferenceUsername,equalMoney,genDatetime,useDatetime,grantDatetime,expriedDatetime) values('514a49f83820e34c877ff48770e48ea7','liujun2','3','liujun33333','33.8','2008-12-23 7:05:00','2008-12-23 7:15:00','2008-12-23 7:25:00','2008-12-23 7:35:00')";
    j=`expr $j + 1`;
done
sleep 3;
server=`mysql -uldirector -pxxxxxxxxxx -h192.168.131.105 -e "use kingsoft;select count(*) from preference"`;
datetime=`date +%T`;
echo $datetime"----------"$server >> /tmp/mysql-cluster/mysql.log;
[root@localhost mysql-cluster]#
Test schedule: add the script to cron on 192.168.8.48 and let it run for 24 hours:
[root@localhost mysql-cluster]# crontab -e
*/3 * * * * sh /tmp/mysql-cluster/test.sh > /dev/null 2>&1
[root@localhost mysql-cluster]#
Test results:
# cat mysql.log
14:31:54----------count(*) 21022
14:35:00----------count(*) 42634
14:37:57----------count(*) 63608
14:40:55----------count(*) 84708
14:43:55----------count(*) 105887
14:46:54----------count(*) 127045
14:49:58----------count(*) 148512
14:53:01----------count(*) 169795
14:56:27----------count(*) 190714
14:59:29----------count(*) 209921
15:02:03----------count(*) 231380
15:03:51----------count(*) 252231
15:05:12----------count(*) 269825
15:05:33----------count(*) 271824
15:08:05----------count(*) 291141
15:10:59----------count(*) 311836
15:14:00----------count(*) 332951
15:16:57----------count(*) 353841
15:19:59----------count(*) 374977
15:23:03----------count(*) 396181
15:26:01----------count(*) 417064
15:29:01----------count(*) 438098
15:32:03----------count(*) 459191
15:35:05----------count(*) 480229
15:38:05----------count(*) 501222
15:41:02----------count(*) 521868
15:43:59----------count(*) 542721
15:47:02----------count(*) 563841
16:00:32----------count(*) 698215
18:50:49----------count(*) 2105513
19:09:01----------count(*) 2105513
19:26:13----------count(*) 2105513
19:27:28----------count(*) 2105513
[root@localhost mysql-cluster]#
Analysis of the results:
1) As the load was ramped up, the database itself was not under much pressure: CPU usage stayed around 30%, while memory grew from 600 MB towards 2 GB and finally hit its limit.
2) The high write concurrency caused no database errors, which shows that load balancing removes the bottleneck of a single overloaded server.
3) Because memory is limited (2 GB), the table reports itself full once it holds a certain amount of data; this is solved by adding memory (a rough sizing helper appears at the end of this document).
4) MySQL Cluster delivers high availability and load balancing, and tuning its parameters can make the service more stable still.
5) MySQL Cluster 6.3 can be used to reduce the memory consumption of the NDBD nodes.
6) The raw performance of MySQL Cluster is moderate -- slower than MySQL replication.
Points to watch:
1) The first time ndbd starts, and whenever config.ini changes, it must be started with the --initial option to initialise it.
2) Avoid manual intervention in the running system as much as possible, and treat any problem that does appear with care.
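Following the 1:1.1 data-to-memory ratio quoted earlier, a rough sizing helper for point 3 of the analysis; the numbers are only as good as that ratio, and NDB 6.3 and later need less.
#!/bin/sh
# Rough NDB memory sizing using the 1:1.1 ratio mentioned above.
# Usage: sh ndb_mem.sh <expected data size in MB>   (defaults to 3600 MB)
DATA_MB=${1:-3600}
NEED_MB=`expr $DATA_MB \* 11 / 10`
echo "Expected data size : ${DATA_MB} MB"
echo "Memory to plan     : ${NEED_MB} MB per data node (NoOfReplicas=2 keeps a full copy on each replica)"
For the 3.6 GB example in the text this gives roughly 3960 MB, i.e. the 4 GB the document recommends; DataMemory and IndexMemory in config.ini then have to be raised accordingly.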