
Searched for a long time without finding an HA fix, so trying my luck here

火星人 @ 2014-03-04, reply: 0


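Environment: two RH3 (RHEL 3) machines set up for HA, using the cluster software bundled with the distribution.
Current situation: the cluster is built, and the HTTPD server that ships with the system, which I added as a cluster service, runs fine. But whenever I try to switch it from rh3-ha1 to rh3-ha2, the switch fails and the status shows disabled.
Enough talk; here is the output first.
Status query on rh3-ha2: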
Cluster Status - rh3-cluster                                           16:23:00
Cluster Quorum Incarnation #1
Shared State: Shared Raw Device Driver v1.2

  Member             Status   
  ------------------ ----------
  192.168.1.120      Active               
  192.168.1.121      Active     <-- You are here

  Service        Status   Owner (Last)     Last Transition Chk Restarts
  -------------- -------- ---------------- --------------- --- --------
  httpd          disabled (None)           04:13:30 Jan 04   4        0

Status query on rh3-ha1:
Cluster Status - rh3-cluster                                           05:31:22
Cluster Quorum Incarnation #1
Shared State: Shared Raw Device Driver v1.2

  Member             Status   
  ------------------ ----------
  192.168.1.120      Active     <-- You are here
  192.168.1.121      Active               

  Service        Status   Owner (Last)     Last Transition Chk Restarts
  -------------- -------- ---------------- --------------- --- --------
  httpd          started  192.168.1.120    05:18:07 Jan 04   4        0

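I expected that once I stopped httpd on rh3-ha1, the status on rh3-ha2 would change to started and the status on rh3-ha1 would change to stopped.
That is not what happened, as shown below: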
# clusvcadm -s httpd -m 192.168.1.120
Member 192.168.1.120 stopping httpd...success

Cluster Status - rh3-cluster                                           05:33:46
Cluster Quorum Incarnation #1
Shared State: Shared Raw Device Driver v1.2

  Member             Status   
  ------------------ ----------
  192.168.1.120      Active     <-- You are here
  192.168.1.121      Active               

  Service        Status   Owner (Last)     Last Transition Chk Restarts
  -------------- -------- ---------------- --------------- --- --------
  httpd          stopped  (192.168.1.120)  05:33:18 Jan 04   4        0



On rh3-ha2:
Cluster Status - rh3-cluster                                           16:26:22
Cluster Quorum Incarnation #1
Shared State: Shared Raw Device Driver v1.2

  Member             Status   
  ------------------ ----------
  192.168.1.120      Active               
  192.168.1.121      Active     <-- You are here

  Service        Status   Owner (Last)     Last Transition Chk Restarts
  -------------- -------- ---------------- --------------- --- --------
  httpd          disabled (None)           04:13:30 Jan 04   4        0

I then tried to start the service manually on rh3-ha2, with this result:
# clusvcadm -e httpd -m 192.168.1.121
Unknown service: httpd
#

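Below are two screenshots of the GUI on the two machines: number 30 is rh3-ha2 and number 31 is rh3-ha1 (it shows stopped because I stopped the httpd service manually).
So the question is: why can't rh3-ha2 take over the service?

[ Last edited by atianyu on 2010-1-4 16:32 ]
《Solution》

Why is nobody answering? Is it too easy to be worth a reply? I'll keep my own thread warm, then.

I have made some progress on this. After some experimenting, the two nodes can now switch over: if I stop rh3-ha1, the service does move to rh3-ha2, but extremely slowly. Testing with ping, I see a long run of timeouts after stopping ha1 before replies come back, which is when the service has moved to ha2. 192.168.1.122 is the virtual IP through which the two machines provide the service externally.
I don't know whether something is misconfigured. My server probe wait time is the default 15 seconds.
Could the experts here offer some advice?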
Reply from 192.168.1.122: bytes=32 time<1ms TTL=64
Reply from 192.168.1.122: bytes=32 time<1ms TTL=64
Request timed out.   [this line repeated 97 times consecutively]
Reply from 192.168.1.122: bytes=32 time<1ms TTL=64
Reply from 192.168.1.122: bytes=32 time<1ms TTL=64
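
To put a rough number on the outage in the ping log above, the longest run of consecutive timeouts can be counted and multiplied by the per-probe wait. The sketch below is only an estimate. The 4-second wait is an assumption: it is the usual default of the Windows ping client, but the thread does not say which client or options were used. The function names are made up for illustration.

```python
# Assumed wait per timed-out probe, in seconds. Not stated in the thread.
ASSUMED_SECONDS_PER_TIMEOUT = 4.0


def longest_timeout_run(lines):
    """Return the length of the longest run of consecutive timeout lines."""
    longest = current = 0
    for line in lines:
        if line.strip() == "Request timed out.":
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def estimated_outage_seconds(lines, seconds_per_timeout=ASSUMED_SECONDS_PER_TIMEOUT):
    """Estimate the outage as (longest timeout run) x (assumed wait per probe)."""
    return longest_timeout_run(lines) * seconds_per_timeout


# A sample with the same shape as the log in the post:
# 2 replies, 97 timeouts, 2 replies.
reply = "Reply from 192.168.1.122: bytes=32 time<1ms TTL=64"
sample = [reply] * 2 + ["Request timed out."] * 97 + [reply] * 2

print(longest_timeout_run(sample))
print(estimated_outage_seconds(sample))
```

With the 97 consecutive timeouts shown above, that assumption gives roughly 388 seconds, so more than six minutes passed before 192.168.1.122 answered again.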

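[ Last edited by atianyu on 2010-1-5 09:12 ]
《Solution》

Try adjusting the various time values.

I haven't worked with rhcs for a long time and no longer remember which parameters there are.

Just be careful not to set them too low, or the cluster will start flapping.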
《Solution》

Here are my configuration parameters:
# more cluster.xml
<?xml version="1.0"?>
<cluconfig version="3.0">
  <clumembd broadcast="yes" interval="500000" loglevel="5" multicast="no" multicast_ipaddress="" thread="yes" tko_count="20"/>
  <cluquorumd loglevel="5" pinginterval="1" tiebreaker_ip=""/>
  <clurmtabd loglevel="5" pollinterval="4"/>
  <clusvcmgrd loglevel="5"/>
  <clulockd loglevel="5"/>
  <cluster config_viewnumber="21" key="f22ec73099984401e3330bb7d713d49a" name="rh3-cluster"/>
  <sharedstate driver="libsharedraw.so" rawprimary="/dev/raw/raw1" rawshadow="/dev/raw/raw2" type="raw"/>
  <members>
    <member id="0" name="192.168.1.120" watchdog="yes">
    </member>
    <member id="1" name="192.168.1.121" watchdog="yes"/>
  </members>
  <services>
    <service checkinterval="4" failoverdomain="fd" id="0" name="httpd" userscript="/etc/rc.d/init.d/httpd">
      <service_ipaddresses>
        <service_ipaddress broadcast="192.168.1.255" id="0" ipaddress="192.168.1.122" netmask="255.255.255.0"/>
      </service_ipaddresses>
      <device id="0" name="/dev/sdc1" sharename="">
        <mount forceunmount="yes" fstype="ext3" mountpoint="/app/" options="rw"/>
      </device>
    </service>
  </services>
  <failoverdomains>
    <failoverdomain id="0" name="fd" ordered="no" restricted="no">
      <failoverdomainnode id="0" name="192.168.1.121"/>
      <failoverdomainnode id="1" name="192.168.1.120"/>
    </failoverdomain>
  </failoverdomains>
</cluconfig>
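《Solution》

Heh, I can't make sense of it anymore.
Have a look at the manual. I think this can be configured in the graphical manager.
《Solution》

OP, your problem looks solvable, but RHEL3 goes out of support this year, so I really have no interest in setting up an environment to work on RHEL3 HA again. You should consider moving to HA on a newer RHEL release. Red Hat abandoned the RHEL3 HA architecture long ago, and the newer RHCS is much easier to use.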
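《Solution》

Since it can fail over, the feature itself should be OK. To find out why it is so slow, you still need to trace the logs.
If the logs show nothing wrong, then are interval and tko perhaps set too long?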
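
As a quick check on whether interval and tko could explain the delay, the values can be read straight from the posted cluster.xml. This is a minimal sketch under two assumptions that the thread does not confirm: that the clumembd interval is in microseconds, and that a member is declared dead after tko_count consecutive missed heartbeats. The fragment below copies only the clumembd line from the poster's file, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Only the clumembd element, copied from the cluster.xml posted in this thread.
CLUSTER_XML_FRAGMENT = """
<cluconfig version="3.0">
  <clumembd broadcast="yes" interval="500000" loglevel="5" multicast="no" multicast_ipaddress="" thread="yes" tko_count="20"/>
</cluconfig>
"""


def heartbeat_detection_seconds(xml_text):
    """Return interval * tko_count in seconds.

    Assumes (not confirmed by the thread) that interval is in microseconds
    and that tko_count is the number of missed heartbeats tolerated.
    """
    root = ET.fromstring(xml_text.strip())
    clumembd = root.find("clumembd")
    interval_us = int(clumembd.get("interval"))
    tko_count = int(clumembd.get("tko_count"))
    return interval_us * tko_count / 1_000_000


print(heartbeat_detection_seconds(CLUSTER_XML_FRAGMENT))
```

If those assumptions hold, the posted values give about 10 seconds of heartbeat detection time, far shorter than the outage in the ping log, so the logs are probably still the better place to look.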

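That said, as the previous poster noted, RHCS on RHEL3 is outdated, and I am not very familiar with it either.
《Solution》

Reply to post #6 by zhang1980s

You put it very fairly and I fully agree. When I worked with Novell's SUSE HA, it was powerful and very user-friendly to configure, a real pleasure to use.
I mainly wanted to get exposure to more systems. My original plan was to try HA on Red Hat 5: the OS was installed and the environment was ready, but I don't have the cluster software packages. Do you happen to have them to share?
《Solution》

Reply to post #7 by jerrywjl

Thanks for the answer. I'll go back and try changing the time-related settings. Damn it, I refuse to believe I can't work this out.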
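《Solution》

Well, experts, it still doesn't work. I shortened some of the time values, and the switchover still takes just as long.
The odd thing is: why is there nowhere in the rh3 cluster configuration to set a heartbeat IP address?
According to the installation guide, /etc/hosts on both machines should contain entries like:
10.0.0.1     xintiao1
10.0.0.2     xintiao2

But in the actual configuration these two IPs don't seem to be used at all. Are they redundant?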


http://coctec.com/docs/service/show-post-5878.html