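Red Hat RHCS still has a problem, could everyone please take a look
I have now installed the openipmitool tool, but pulling the network cable still does not trigger a failover. The logs are below.
If I point both hosts, node1 and node2, at the same fence device, pulling the network cable does trigger a failover. I don't know why.
Log files:
-----Log from the host whose network cable was pulled-------------------------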
Aug 19 16:28:29 ehrdb1 kernel: bnx2: eth1 NIC Copper Link is Down
Aug 19 16:28:38 ehrdb1 openais: The token was lost in the OPERATIONAL state.
Aug 19 16:28:38 ehrdb1 openais: Receive multicast socket recv buffer size (288000 bytes).
Aug 19 16:28:38 ehrdb1 openais: Transmit multicast socket send buffer size (288000 bytes).
Aug 19 16:28:38 ehrdb1 openais: entering GATHER state from 2.
Aug 19 16:28:43 ehrdb1 openais: entering GATHER state from 0.
Aug 19 16:28:43 ehrdb1 openais: Creating commit token because I am the rep.
Aug 19 16:28:43 ehrdb1 openais: Saving state aru 2a high seq received 2a
Aug 19 16:28:43 ehrdb1 openais: Storing new sequence id for ring 348
Aug 19 16:28:43 ehrdb1 openais: entering COMMIT state.
Aug 19 16:28:43 ehrdb1 openais: entering RECOVERY state.
Aug 19 16:28:43 ehrdb1 openais: position member 10.0.133.60:
Aug 19 16:28:43 ehrdb1 openais: previous ring seq 836 rep 10.0.133.60
Aug 19 16:28:43 ehrdb1 openais: aru 2a high delivered 2a received flag 1
Aug 19 16:28:43 ehrdb1 openais: Did not need to originate any messages in recovery.
Aug 19 16:28:43 ehrdb1 openais: Sending initial ORF token
Aug 19 16:28:43 ehrdb1 openais: CLM CONFIGURATION CHANGE
Aug 19 16:28:43 ehrdb1 openais: New Configuration:
Aug 19 16:28:43 ehrdb1 kernel: dlm: closing connection to node 2
Aug 19 16:28:43 ehrdb1 fenced: ehrdb2 not a cluster member after 0 sec post_fail_delay
Aug 19 16:28:43 ehrdb1 openais: r(0) ip(10.0.133.60)
Aug 19 16:28:43 ehrdb1 fenced: fencing node "ehrdb2"
Aug 19 16:28:43 ehrdb1 openais: Members Left:
Aug 19 16:28:43 ehrdb1 openais: r(0) ip(10.0.133.61)
Aug 19 16:28:43 ehrdb1 openais: Members Joined:
Aug 19 16:28:43 ehrdb1 openais: CLM CONFIGURATION CHANGE
Aug 19 16:28:43 ehrdb1 openais: New Configuration:
Aug 19 16:28:43 ehrdb1 openais: r(0) ip(10.0.133.60)
Aug 19 16:28:44 ehrdb1 openais: Members Left:
Aug 19 16:28:44 ehrdb1 openais: Members Joined:
Aug 19 16:28:44 ehrdb1 openais: This node is within the primary component and will provide service.
Aug 19 16:28:44 ehrdb1 openais: entering OPERATIONAL state.
Aug 19 16:28:44 ehrdb1 openais: got nodejoin message 10.0.133.60
Aug 19 16:28:44 ehrdb1 openais: got joinlist message from node 1
Aug 19 16:28:44 ehrdb1 gnome-power-manager: (root) GNOME interactive logout because the power button has been pressed
Aug 19 16:42:54 ehrdb1 syslogd 1.4.1: restart.
Aug 19 16:42:54 ehrdb1 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Aug 19 16:42:54 ehrdb1 kernel: Linux version 2.6.18-92.el5 (brewbuilder@ls20-bc2-13.build.redhat.com) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-41)) #1 SMP Tue Apr 29 13:16:15 EDT 2008
Aug 19 16:42:54 ehrdb1 kernel: Command line: ro root=LABEL=/1 rhgb quiet
Aug 19 16:42:54 ehrdb1 kernel: BIOS-provided physical RAM map:
Aug 19 16:42:54 ehrdb1 kernel: BIOS-e820: 0000000000000000 - 000000000009b800 (usable)
Aug 19 16:42:54 ehrdb1 kernel: BIOS-e820: 000000000009b800 - 00000000000a0000 (reserved)
Aug 19 16:42:54 ehrdb1 kernel: BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
Aug 19 16:42:54 ehrdb1 kernel: BIOS-e820: 0000000000100000 - 00000000cff4b380 (usable)
Aug 19 16:42:54 ehrdb1 kernel: BIOS-e820: 00000000cff4b380 - 00000000cff57b40 (ACPI data)
Aug 19 16:42:54 ehrdb1 kernel: BIOS-e820: 00000000cff57b40 - 00000000e0000000 (reserved)
------Log from the standby node-------------------------
Aug 19 16:28:02 ehrdb2 clurgmgrd: <notice> Reconfiguring
Aug 19 16:29:31 ehrdb2 openais: The token was lost in the OPERATIONAL state.
Aug 19 16:29:31 ehrdb2 openais: Receive multicast socket recv buffer size (288000 bytes).
Aug 19 16:29:31 ehrdb2 openais: Transmit multicast socket send buffer size (262142 bytes).
Aug 19 16:29:31 ehrdb2 openais: entering GATHER state from 2.
Aug 19 16:29:35 ehrdb2 openais: entering GATHER state from 0.
Aug 19 16:29:35 ehrdb2 openais: Creating commit token because I am the rep.
Aug 19 16:29:35 ehrdb2 openais: Saving state aru 2a high seq received 2a
Aug 19 16:29:35 ehrdb2 openais: Storing new sequence id for ring 348
Aug 19 16:29:35 ehrdb2 openais: entering COMMIT state.
Aug 19 16:29:35 ehrdb2 openais: entering RECOVERY state.
Aug 19 16:29:35 ehrdb2 openais: position member 10.0.133.61:
Aug 19 16:29:35 ehrdb2 openais: previous ring seq 836 rep 10.0.133.60
Aug 19 16:29:35 ehrdb2 openais: aru 2a high delivered 2a received flag 1
Aug 19 16:29:35 ehrdb2 openais: Did not need to originate any messages in recovery.
Aug 19 16:29:35 ehrdb2 openais: Sending initial ORF token
Aug 19 16:29:35 ehrdb2 openais: CLM CONFIGURATION CHANGE
Aug 19 16:29:35 ehrdb2 openais: New Configuration:
Aug 19 16:29:35 ehrdb2 kernel: dlm: closing connection to node 1
Aug 19 16:29:35 ehrdb2 fenced: ehrdb1 not a cluster member after 0 sec post_fail_delay
Aug 19 16:29:36 ehrdb2 openais: r(0) ip(10.0.133.61)
Aug 19 16:29:36 ehrdb2 fenced: fencing node "ehrdb1"
Aug 19 16:29:36 ehrdb2 openais: Members Left:
Aug 19 16:29:36 ehrdb2 openais: r(0) ip(10.0.133.60)
Aug 19 16:29:36 ehrdb2 openais: Members Joined:
Aug 19 16:29:36 ehrdb2 openais: CLM CONFIGURATION CHANGE
Aug 19 16:29:36 ehrdb2 openais: New Configuration:
Aug 19 16:29:36 ehrdb2 openais: r(0) ip(10.0.133.61)
Aug 19 16:29:36 ehrdb2 openais: Members Left:
Aug 19 16:29:36 ehrdb2 openais: Members Joined:
Aug 19 16:29:36 ehrdb2 openais: This node is within the primary component and will provide service.
Aug 19 16:29:36 ehrdb2 openais: entering OPERATIONAL state.
Aug 19 16:29:36 ehrdb2 openais: got nodejoin message 10.0.133.61
Aug 19 16:29:36 ehrdb2 openais: got joinlist message from node 2
Aug 19 16:29:36 ehrdb2 gnome-power-manager: (root) GNOME interactive logout because the power button has been pressed
Aug 19 16:43:52 ehrdb2 syslogd 1.4.1: restart.
Aug 19 16:43:52 ehrdb2 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Aug 19 16:43:52 ehrdb2 kernel: Linux version 2.6.18-92.el5 (brewbuilder@ls20-bc2-13.build.redhat.com) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-41)) #1 SMP Tue Apr 29 13:16:15 EDT 2008
Configuration file:
----------------cluster.conf-----------------------
# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster alias="ehrdb" config_version="50" name="ehrdb">
    <fence_daemon post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
        <clusternode name="ehrdb1" nodeid="1" votes="1">
            <fence>
                <method name="1">
                    <device name="bmcdb1"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="ehrdb2" nodeid="2" votes="1">
            <fence>
                <method name="1">
                    <device name="bmcdb2"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman expected_votes="1" two_node="1"/>
    <fencedevices>
        <fencedevice agent="fence_ipmilan" ipaddr="192.168.133.70" login="admin" name="bmcdb1" passwd="123"/>
        <fencedevice agent="fence_ipmilan" ipaddr="192.168.133.71" login="admin" name="bmcdb2" passwd="123"/>
    </fencedevices>
    <rm>
        <failoverdomains>
            <failoverdomain name="ehrfd" ordered="0" restricted="0">
                <failoverdomainnode name="ehrdb1" priority="1"/>
                <failoverdomainnode name="ehrdb2" priority="1"/>
            </failoverdomain>
        </failoverdomains>
        <resources>
            <fs device="/dev/sdb5" force_fsck="0" force_unmount="1" fsid="60706" fstype="ext3" mountpoint="/db/sys" name="sys" options="" self_fence="0"/>
            <fs device="/dev/sdb6" force_fsck="0" force_unmount="1" fsid="62307" fstype="ext3" mountpoint="/db/data" name="data" options="" self_fence="0"/>
            <fs device="/dev/sdb7" force_fsck="0" force_unmount="1" fsid="2367" fstype="ext3" mountpoint="/app/sys" name="appsys" options="" self_fence="0"/>
            <fs device="/dev/sdb8" force_fsck="0" force_unmount="1" fsid="41738" fstype="ext3" mountpoint="/db/bk" name="bk" options="" self_fence="0"/>
            <ip address="10.0.133.69" monitor_link="1"/>
            <script file="/home/oracle/bin/oracledb.sh" name="ehr10g"/>
        </resources>
        <service autostart="1" domain="ehrfd" name="ehrservice">
            <fs ref="sys"/>
            <fs ref="data"/>
            <fs ref="appsys"/>
            <fs ref="bk"/>
            <ip ref="10.0.133.69"/>
            <script ref="ehr10g"/>
        </service>
    </rm>
</cluster>
《Solution》
Reply to post #1 by openpower710
Is your fence an internal fence or an external fence device?
《Solution》
fence_ipmilan, right? One look and you can tell it's internal!
《Solution》
While the cluster is running, run:
fence_node
to test it.
《Solution》
Of course, before that you should also run fence_ipmilan by hand to test it.
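
For anyone following along, the order suggested in the last two replies might look roughly like the sketch below: first query each BMC directly with fence_ipmilan, then let the cluster fence a node with fence_node. The BMC addresses, the login and the node name are taken from the cluster.conf posted above. The `-a`, `-l`, `-p` and `-o status` options are how I remember the fence_ipmilan man page, so please check `man fence_ipmilan` on your own release. `DRY_RUN`, `IPMI_PASS` and the helper function names are my own additions, not part of RHCS. The script only prints the commands unless you set `DRY_RUN=0`.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
# DRY_RUN and IPMI_PASS are hypothetical variables made up for this sketch.
# With DRY_RUN=0 the last step really fences (power-cycles) the peer node.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1: talk to one BMC directly through the fence agent.
# $1 is the BMC address; login "admin" comes from the posted cluster.conf.
# The password is read from IPMI_PASS instead of being hard-coded here.
check_bmc() {
    run fence_ipmilan -a "$1" -l admin -p "${IPMI_PASS:-CHANGEME}" -o status
}

# Step 2: ask the running cluster to fence a node via its configured device.
fence_peer() {
    run fence_node "$1"
}

check_bmc 192.168.133.70   # bmcdb1, the fence device of ehrdb1
check_bmc 192.168.133.71   # bmcdb2, the fence device of ehrdb2
fence_peer ehrdb2          # run this on ehrdb1 while the cluster is up
```

If the status query in step 1 already fails from one of the nodes, there is probably no point in going on to step 2 until that node can reach the other node's BMC.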