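Question: cman fails to start when setting up a cluster
Problem: the fence_ilo command reports success,
but the service will not start with cman.
Thanks!
Environment: linuxa5
Hardware connections:
Both servers connect to the switch through eth0, and the iLO ports connect directly to the switch.
The storage array is connected directly to the servers.
The hosts file is as follows: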
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
10.229.208.134 test-1
10.229.208.136 test-2
10.229.208.138 test-cluter
10.229.208.135 ilo1
10.229.208.137 ilo2
/etc/cluster/cluster.conf is as follows:
<?xml version="1.0" ?>
<cluster config_version="3" name="new_cluster">
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="test-1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="Fence-1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="test-2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="Fence-2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_ilo" hostname="10.229.208.135" login="Administrator" name="Fence-1" passwd="C7N7NS4B"/>
    <fencedevice agent="fence_ilo" hostname="10.229.208.137" login="Administrator" name="Fence-2" passwd="CFEFHGP3"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="Failover_sybase" ordered="1" restricted="1">
        <failoverdomainnode name="test-1" priority="1"/>
        <failoverdomainnode name="test-2" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <ip address="10.229.208.138" monitor_link="1"/>
    </resources>
    <service autostart="1" domain="Failover_sybase" name="Services_sybase">
      <ip ref="10.229.208.138"/>
    </service>
  </rm>
</cluster>
/var/log/messages output when running service cman start:
Jan 12 03:35:59 test-2 openais: Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
Jan 12 03:35:59 test-2 openais: Copyright (C) 2006 Red Hat, Inc.
Jan 12 03:35:59 test-2 openais: AIS Executive Service: started and ready to provide service.
Jan 12 03:35:59 test-2 openais: Using default multicast address of 239.192.187.112
Jan 12 03:35:59 test-2 openais: openais component openais_cpg loaded.
Jan 12 03:35:59 test-2 openais: Registering service handler 'openais cluster closed process group service v1.01'
Jan 12 03:35:59 test-2 openais: openais component openais_cfg loaded.
Jan 12 03:35:59 test-2 openais: Registering service handler 'openais configuration service'
Jan 12 03:35:59 test-2 openais: openais component openais_msg loaded.
Jan 12 03:35:59 test-2 openais: Registering service handler 'openais message service B.01.01'
Jan 12 03:35:59 test-2 openais: openais component openais_lck loaded.
Jan 12 03:35:59 test-2 openais: Registering service handler 'openais distributed locking service B.01.01'
Jan 12 03:35:59 test-2 openais: openais component openais_evt loaded.
Jan 12 03:35:59 test-2 openais: Registering service handler 'openais event service B.01.01'
Jan 12 03:35:59 test-2 openais: openais component openais_ckpt loaded.
Jan 12 03:35:59 test-2 openais: Registering service handler 'openais checkpoint service B.01.01'
Jan 12 03:35:59 test-2 openais: openais component openais_amf loaded.
Jan 12 03:35:59 test-2 openais: Registering service handler 'openais availability management framework B.01.01'
Jan 12 03:35:59 test-2 openais: openais component openais_clm loaded.
Jan 12 03:35:59 test-2 openais: Registering service handler 'openais cluster membership service B.01.01'
Jan 12 03:35:59 test-2 openais: openais component openais_evs loaded.
Jan 12 03:35:59 test-2 openais: Registering service handler 'openais extended virtual synchrony service'
Jan 12 03:35:59 test-2 openais: openais component openais_cman loaded.
Jan 12 03:35:59 test-2 openais: Registering service handler 'openais CMAN membership service 2.01'
Jan 12 03:35:59 test-2 openais: Token Timeout (10000 ms) retransmit timeout (495 ms)
Jan 12 03:35:59 test-2 openais: token hold (386 ms) retransmits before loss (20 retrans)
Jan 12 03:35:59 test-2 openais: join (60 ms) send_join (0 ms) consensus (4800 ms) merge (200 ms)
Jan 12 03:35:59 test-2 openais: downcheck (1000 ms) fail to recv const (50 msgs)
Jan 12 03:35:59 test-2 openais: seqno unchanged const (30 rotations) Maximum network MTU 1500
Jan 12 03:35:59 test-2 openais: window size per rotation (50 messages) maximum messages per rotation (17 messages)
Jan 12 03:35:59 test-2 openais: send threads (0 threads)
Jan 12 03:35:59 test-2 openais: RRP token expired timeout (495 ms)
Jan 12 03:35:59 test-2 openais: RRP token problem counter (2000 ms)
Jan 12 03:35:59 test-2 openais: RRP threshold (10 problem count)
Jan 12 03:35:59 test-2 openais: RRP mode set to none.
Jan 12 03:35:59 test-2 openais: heartbeat_failures_allowed (0)
Jan 12 03:35:59 test-2 openais: max_network_delay (50 ms)
Jan 12 03:35:59 test-2 openais: HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Jan 12 03:35:59 test-2 openais: Receive multicast socket recv buffer size (262142 bytes).
Jan 12 03:35:59 test-2 openais: Transmit multicast socket send buffer size (262142 bytes).
Jan 12 03:35:59 test-2 openais: The network interface is now up.
Jan 12 03:35:59 test-2 openais: Created or loaded sequence id 0.10.229.208.136 for this ring.
Jan 12 03:35:59 test-2 openais: entering GATHER state from 15.
Jan 12 03:35:59 test-2 openais: Initialising service handler 'openais extended virtual synchrony service'
Jan 12 03:35:59 test-2 openais: Initialising service handler 'openais cluster membership service B.01.01'
Jan 12 03:35:59 test-2 openais: Initialising service handler 'openais availability management framework B.01.01'
Jan 12 03:35:59 test-2 openais: Initialising service handler 'openais checkpoint service B.01.01'
Jan 12 03:35:59 test-2 openais: Initialising service handler 'openais event service B.01.01'
Jan 12 03:35:59 test-2 openais: Initialising service handler 'openais distributed locking service B.01.01'
Jan 12 03:35:59 test-2 openais: Initialising service handler 'openais message service B.01.01'
Jan 12 03:35:59 test-2 openais: Initialising service handler 'openais configuration service'
Jan 12 03:35:59 test-2 openais: Initialising service handler 'openais cluster closed process group service v1.01'
Jan 12 03:36:00 test-2 openais: Initialising service handler 'openais CMAN membership service 2.01'
Jan 12 03:36:00 test-2 openais: CMAN 2.0.60 (built Jan 23 2007 12:42:29) started
Jan 12 03:36:00 test-2 openais: Not using a virtual synchrony filter.
Jan 12 03:36:00 test-2 openais: Creating commit token because I am the rep.
Jan 12 03:36:00 test-2 openais: Saving state aru 0 high seq received 0
Jan 12 03:36:00 test-2 openais: entering COMMIT state.
Jan 12 03:36:00 test-2 openais: entering RECOVERY state.
Jan 12 03:36:00 test-2 openais: position member 10.229.208.136:
Jan 12 03:36:00 test-2 openais: previous ring seq 0 rep 10.229.208.136
Jan 12 03:36:00 test-2 openais: aru 0 high delivered 0 received flag 0
Jan 12 03:36:00 test-2 openais: Did not need to originate any messages in recovery.
Jan 12 03:36:00 test-2 openais: Couldn't store new ring id 4 to stable storage (Permission denied)
Jan 12 03:36:01 test-2 setroubleshoot: SELinux is preventing the /usr/sbin/aisexec from using potentially mislabeled files (tmp). For complete SELinux messages. run sealert -l 3ee1d4bd-50a6-4093-ab44-c17869fa8d36
Jan 12 03:36:07 test-2 ccsd: Unable to connect to cluster infrastructure after 30 seconds.
《Solution》
<fence_daemon post_fail_delay="0" post_join_delay="3"/>
Change this to:
<fence_daemon post_fail_delay="0" post_join_delay="30"/>
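If you would rather script the change than edit the file by hand, here is a minimal sketch. The helper name set_post_join_delay is made up for this example and is not part of the cluster tools; it only rewrites the post_join_delay attribute and keeps a backup copy next to the file.

```shell
#!/bin/sh
# set_post_join_delay FILE SECONDS (hypothetical helper, not a cluster-suite command):
# rewrite the post_join_delay attribute in FILE, keeping the original as FILE.bak.
set_post_join_delay() {
    file=$1
    delay=$2
    # Keep a backup so the change can be undone.
    cp "$file" "$file.bak" || return 1
    # Replace only the numeric value of post_join_delay; other attributes are untouched.
    sed "s/post_join_delay=\"[0-9]*\"/post_join_delay=\"$delay\"/" "$file.bak" > "$file"
}

# Example call, commented out so that sourcing this file changes nothing:
# set_post_join_delay /etc/cluster/cluster.conf 30
```

Make the same change on both nodes so that the two copies of cluster.conf stay identical.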
Make sure SELinux is turned off and that you are running a non-Xen kernel.
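The log in the question already shows setroubleshoot reporting that SELinux is blocking /usr/sbin/aisexec, together with a "Permission denied" line. A quick way to check both points is sketched below. The function names is_xen_kernel and selinux_mode are made up for this example, and the Xen check assumes the kernel release string contains "xen", as it does for the stock RHEL 5 Xen kernels.

```shell
#!/bin/sh
# is_xen_kernel RELEASE (hypothetical helper): succeeds if the kernel release
# string contains "xen", e.g. 2.6.18-8.el5xen.
is_xen_kernel() {
    case "$1" in
        *xen*) return 0 ;;
        *)     return 1 ;;
    esac
}

# selinux_mode (hypothetical helper): prints the output of getenforce
# (Enforcing, Permissive or Disabled), or "unknown" if getenforce is not installed.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo unknown
    fi
}

echo "SELinux mode: $(selinux_mode)"
if is_xen_kernel "$(uname -r)"; then
    echo "Kernel $(uname -r) looks like a Xen kernel"
else
    echo "Kernel $(uname -r) does not look like a Xen kernel"
fi
```

If the mode is Enforcing, setenforce 0 switches to permissive mode until the next reboot. To turn SELinux off permanently, set SELINUX=disabled in /etc/selinux/config and reboot.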
Also, there is a problem with this part of the configuration:
<fence>
  <method name="1">
    <device name="Fence-1"/>
  </method>
</fence>
《Solution》
Reply to post #2 by jerrywjl
Thank you so much!!!!
I just got home. I'll go in first thing tomorrow morning and try it. Thank you very much
《Solution》
I carefully compared my file with some other working cluster.conf files, and I can't see any difference from mine...