
HP-UX MC cluster keeps reporting errors; logs below, please help me analyze them

火星人 @ 2014-03-04 , reply:0

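Symptom: the database running on the HP-UX MC cluster frequently becomes unreachable without warning. When I check the cluster status, lan0 is always shown as down. Restarting the cluster with cmruncl -v brings everything back to normal, but the same fault then recurs at irregular intervals, and only a cluster restart clears it. Could anyone help me analyze this?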
#cmviewcl -v

Network_Parameters:    
INTERFACE    STATUS                     PATH                NAME            
PRIMARY      down (disabled) (IP only)  0/1/1/0             lan0            
PRIMARY      up                         0/2/2/0             lan1            
STANDBY      up                         0/2/2/1             lan3           
STANDBY      up                         0/1/1/1             lan2
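
For anyone who wants to check this table on both nodes without reading it by eye, here is a minimal Python sketch that pulls out the interfaces that are not "up". The row layout (role, status, hardware path, name) is only assumed from the four rows pasted above, and `parse_interfaces` / `down_interfaces` are hypothetical helper names, not part of Serviceguard.

```python
import re

# Rows copied from the cmviewcl -v output above.
SAMPLE = """\
PRIMARY      down (disabled) (IP only)  0/1/1/0             lan0
PRIMARY      up                         0/2/2/0             lan1
STANDBY      up                         0/2/2/1             lan3
STANDBY      up                         0/1/1/1             lan2
"""

# Assumed layout: ROLE, STATUS (may contain spaces), hardware PATH, NAME.
ROW = re.compile(r"^\s*(PRIMARY|STANDBY)\s+(.+?)\s+(\d+(?:/\d+)+)\s+(\S+)\s*$")


def parse_interfaces(text):
    """Return one dict per line that matches the assumed row layout."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            role, status, path, name = m.groups()
            rows.append({"role": role, "status": status,
                         "path": path, "name": name})
    return rows


def down_interfaces(rows):
    """Names of interfaces whose status does not start with 'up'."""
    return [r["name"] for r in rows if not r["status"].startswith("up")]


interfaces = parse_interfaces(SAMPLE)
print(down_interfaces(interfaces))
```

Running the same parser on the output from syczora2 would show whether the problem is limited to this node.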


syslog.log is as follows:

Oct 19 08:07:20 syczora1 cmnetd: 10.20.90.8 failed.
Oct 19 08:07:20 syczora1 cmnetd: lan2 is down at the IP layer.
Oct 19 08:07:20 syczora1 cmnetd: lan2 failed.
Oct 19 08:06:47 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 08:07:20 syczora1  above message repeats 8 times
Oct 19 08:07:20 syczora1 cmnetd: Subnet 10.20.90.0 down
Oct 19 08:07:20 syczora1 cmcld: Subnet 10.20.90.0 in package orapkg is down.
Oct 19 08:07:20 syczora1 cmcld: Failing package orapkg on node syczora1 due to subnet failure.
Oct 19 08:07:20 syczora1 cmcld: Request from node syczora1 to fail package orapkg on node syczora1.
Oct 19 08:07:20 syczora1 cmcld: Executing '/etc/cmcluster/orapkg/orapkg.cntl  stop' for package orapkg, as service PKG*107009.
Oct 19 08:07:20 syczora1 cmserviced: Request to perform run service PKG*107009
Oct 19 08:07:20 syczora1 su: + tty?? root-oracle
Oct 19 08:07:30 syczora1 cmnetd: 10.20.90.8 recovered.
Oct 19 08:07:30 syczora1 cmnetd: Subnet 10.20.90.0 up
Oct 19 08:07:30 syczora1 cmnetd: lan2 is up at the IP layer.
Oct 19 08:07:29 syczora1 su: + tty?? root-oracle
Oct 19 08:07:30 syczora1 cmnetd: lan2 recovered.
Oct 19 08:08:10 syczora1 syslog: cmmodnet -r -i 10.20.90.7 10.20.90.0
Oct 19 08:08:11 syczora1 LVM: vgchange -a n vgdata
Oct 19 08:08:11 syczora1 LVM: vgchange -a n vgarch
Oct 19 08:08:11 syczora1 cmserviced: Service PKG*107009 terminated due to an exit(0).
Oct 19 08:08:11 syczora1 cmcld: Halted package orapkg on node syczora1.
Oct 19 08:08:11 syczora1 cmcld: Request from node syczora1 to start package orapkg on node syczora1.
Oct 19 08:08:11 syczora1 cmcld: Executing '/etc/cmcluster/orapkg/orapkg.cntl  start' for package orapkg, as service PKG*107009.
Oct 19 08:08:11 syczora1 cmserviced: Request to perform run service PKG*107009
Oct 19 08:08:17 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 08:08:26 syczora1 LVM: vgchange -a e vgdata
Oct 19 08:08:41 syczora1 LVM: vgchange -a e vgarch
Oct 19 08:08:42 syczora1 syslog: cmmodnet -a -i 10.20.90.7 10.20.90.0
Oct 19 08:08:42 syczora1 su: + tty?? root-oracle
Oct 19 08:09:16 syczora1 cmserviced: Service PKG*107009 terminated due to an exit(0).
Oct 19 08:09:16 syczora1 cmcld: Started package orapkg on node syczora1.
Oct 19 08:09:47 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 08:08:55 syczora1 su: + tty?? root-oracle
Oct 19 08:13:00 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 08:13:47 syczora1  above message repeats 3 times
Oct 19 08:14:30 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 08:32:30 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 08:33:48 syczora1  above message repeats 12 times
Oct 19 08:34:00 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 08:53:30 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 08:53:48 syczora1  above message repeats 13 times
Oct 19 08:55:00 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 09:08:08 syczora1 sshd: SSH: Server;Ltype: Version;Remote: 10.20.90.127-1065;Protocol: 2.0;Client: SecureCRT_5.1.3 (build 281) SecureCRT
Oct 19 09:08:31 syczora1 sshd: Accepted password for root from 10.20.90.127 port 1065 ssh2
Oct 19 09:09:02 syczora1 syslog: cmruncl -v
Oct 19 09:09:07 syczora1 syslog: cmruncl: Failed to validate the network configuration but will try to start the cluster anyway.
Oct 19 09:10:04 syczora1 syslog: cmhaltcl -f -v
Oct 19 09:10:04 syczora1 cmcld: Request from root on node syczora1 to halt the cluster on this node
Oct 19 09:10:04 syczora1 cmcld: Request from node syczora1 to disable node switching for package orapkg on node syczora1.
Oct 19 09:10:00 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 09:10:04 syczora1  above message repeats 10 times
Oct 19 09:10:04 syczora1 cmcld: Disabled package orapkg on node syczora1.
Oct 19 09:10:04 syczora1 cmcld: Disabled package orapkg on node syczora2.
Oct 19 09:10:04 syczora1 cmcld: Request from node syczora1 to disable global switching for package orapkg.
Oct 19 09:10:04 syczora1 cmcld: Disabled switching for package orapkg.
Oct 19 09:10:04 syczora1 cmserviced: Request to perform run service PKG*107009
Oct 19 09:10:04 syczora1 cmcld: Request from root on node syczora1 to halt the cluster on this node
Oct 19 09:10:04 syczora1 su: + tty?? root-oracle
Oct 19 09:10:04 syczora1 cmcld: Request from root on node syczora1 to halt the cluster on this node
Oct 19 09:10:04 syczora1 cmcld: Request from node syczora1 to begin the halting process for package orapkg on node syczora1.
Oct 19 09:10:04 syczora1 cmcld: Halting package orapkg on node syczora1 as requested by user.
Oct 19 09:10:04 syczora1 cmcld: Request from node syczora1 to halt package orapkg on node syczora1.
Oct 19 09:10:04 syczora1 cmcld: Executing '/etc/cmcluster/orapkg/orapkg.cntl  stop' for package orapkg, as service PKG*107009.
Oct 19 09:10:11 syczora1 su: + tty?? root-oracle
Oct 19 09:10:43 syczora1 syslog: cmmodnet -r -i 10.20.90.7 10.20.90.0
Oct 19 09:10:44 syczora1 LVM: vgchange -a n vgdata
Oct 19 09:10:44 syczora1 LVM: vgchange -a n vgarch
Oct 19 09:10:44 syczora1 cmserviced: Service PKG*107009 terminated due to an exit(0).
Oct 19 09:10:44 syczora1 cmcld: Halted package orapkg on node syczora1.
Oct 19 09:10:44 syczora1 cmcld: Request from root on node syczora1 to halt the cluster on this node
Oct 19 09:10:44 syczora1 cmcld: Request from node syczora1 to enable global switching for package orapkg.
Oct 19 09:10:44 syczora1 cmcld: Enabled switching for package orapkg.
Oct 19 09:10:47 syczora1 cmcld: Member 2 is HALTING
Oct 19 09:10:47 syczora1 cmcld: Lost heartbeat to syczora2
Oct 19 09:10:47 syczora1 cmcld: Resolving quorum with members syczora1
Oct 19 09:10:47 syczora1 cmcld: Quorum satisfied
Oct 19 09:10:47 syczora1 cmserviced: Service cmlvmd terminated due to an exit(0).
Oct 19 09:10:47 syczora1 cmserviced: Service cmlockd terminated due to an exit(0).
Oct 19 09:10:47 syczora1 cmcld: Membership: membership at 1 is REFORMING (coordinator 1) includes: 1 excludes: 2
Oct 19 09:10:47 syczora1 cmcld: Membership: membership at 2 is FORMED (coordinator 1) includes: 1 excludes: 2
Oct 19 09:10:47 syczora1 cmcld: Closing route 192.168.100.2:5300 on fd 32 to syczora2: closing member
Oct 19 09:10:47 syczora1 cmcld: The following node(s) syczora2(id=2), left the cluster.
Oct 19 09:10:47 syczora1 cmcld: 1 nodes have formed a new cluster, sequence #2
Oct 19 09:10:47 syczora1 cmcld: The new active cluster membership is: syczora1(id=1)
Oct 19 09:10:47 syczora1 cmcld: Received clear reply in state clearing
Oct 19 09:10:47 syczora1 cmcld: Cluster CDB version 12 and node 1 CDB version 12
Oct 19 09:10:47 syczora1 cmcld: Package orapkg cannot run on this node because switching has been disabled for this node
Oct 19 09:10:50 syczora1 cmcld: Member syczora1 halting.
Oct 19 09:10:50 syczora1 cmcld: Membership: membership at 2 is HALTED (coordinator 1) includes: 1 excludes: 2
Oct 19 09:10:50 syczora1 cmnetd: Subnet 10.20.90.0 switching from lan2 to lan0
Oct 19 09:10:50 syczora1 cmnetd: Subnet 10.20.90.0 switched from lan2 to lan0
Oct 19 09:10:50 syczora1 cmnetd: lan2 switched to lan0
Oct 19 09:10:50 syczora1 cmserviced: Service cmnetd terminated due to an exit(0).
Oct 19 09:10:50 syczora1 cmserviced: Service cmfileassistd terminated due to an exit(0).
Oct 19 09:10:50 syczora1 cmserviced: Request to perform halt service cmlogd
Oct 19 09:10:55 syczora1 cmserviced: Service cmlogd terminated due to a signal(9).
Oct 19 09:10:55 syczora1 cmcld: This node (syczora1) has ceased cluster activities.
Oct 19 09:10:55 syczora1 cmcld: Daemon exiting
Oct 19 09:10:55 syczora1 cmdisklockd: cmdisklockd exiting
Oct 19 09:10:55 syczora1 cmproxyd: The cluster daemon aborted our connection (231).
Oct 19 09:10:55 syczora1 cmwbemd: The cluster daemon aborted our connection (231).
Oct 19 09:10:55 syczora1 cmclconfd: The cluster daemon aborted our connection (231).
Oct 19 09:10:55 syczora1 cmclconfd: The Serviceguard daemon, cmcld, exited normally.
Oct 19 09:10:56 syczora1 cmserviced: Service assistant daemon halted.
Oct 19 09:13:00 syczora1 sshd: SSH: Server;Ltype: Version;Remote: 10.20.90.127-1076;Protocol: 2.0;Client: SecureCRT_5.1.3 (build 281) SecureCRT
Oct 19 09:13:05 syczora1 sshd: Accepted password for root from 10.20.90.127 port 1076 ssh2
Oct 19 09:13:30 syczora1 syslog: cmhaltcl -f -v
Oct 19 09:13:48 syczora1 syslog: cmhaltcl -f -v
Oct 19 09:13:54 syczora1 syslog: cmruncl =v
Oct 19 09:14:11 syczora1 syslog: cmruncl -v
Oct 19 09:14:45 syczora1 cmclconfd: Request from root on node syczora1 to start the cluster on this node
Oct 19 09:14:46 syczora1 cmcld: Daemon Initialization - Maximum number of packages supported for this incarnation is 300.
Oct 19 09:14:46 syczora1 cmcld: Global Cluster Information:
Oct 19 09:14:46 syczora1 cmcld: Network Polling Interval is 2.00 seconds.
Oct 19 09:14:46 syczora1 cmcld: IO Timeout Extension is 0.00 seconds.
Oct 19 09:14:46 syczora1 cmcld: Auto Start Timeout is 600.00 seconds.
Oct 19 09:14:46 syczora1 cmcld: Failover Optimization is disabled.
Oct 19 09:14:46 syczora1 cmcld: Information Specific to node syczora1:
Oct 19 09:14:46 syczora1 cmcld: Cluster lock disk: /dev/dsk/c2t0d0.
Oct 19 09:14:46 syczora1 cmcld: lan3  0x002481773f9f  192.168.100.1  bridged net:1
Oct 19 09:14:46 syczora1 cmcld: lan0  0x0024817777c2  10.20.90.8  bridged net:2
Oct 19 09:14:46 syczora1 cmcld: lan1  0x0024817777c3  192.168.10.1  bridged net:3
Oct 19 09:14:46 syczora1 cmcld: lan2  0x002481773f9e    standby    bridged net:2
Oct 19 09:14:46 syczora1 cmcld: Heartbeat Subnet: 192.168.100.0
Oct 19 09:14:46 syczora1 cmcld: Configured quorum disk(s) /dev/dsk/c2t0d0
Oct 19 09:14:46 syczora1 cmcld: Member Timeout is 14.00 seconds.
Oct 19 09:14:46 syczora1 cmcld: Max reformation duration is 17.80 seconds.
Oct 19 09:14:46 syczora1 cmcld: The maximum # of concurrent local connections to the daemon that will be supported is 1024.
Oct 19 09:14:46 syczora1 cmdisklockd: Changed to working directory /var/adm/cmcluster/cmdisklockd.
Oct 19 09:14:46 syczora1 cmdisklockd: cmdisklockd started
Oct 19 09:14:46 syczora1 cmcld: Total allocated: 46085864 bytes, used: 3400688 bytes, unused 42685168 bytes
Oct 19 09:14:46 syczora1 cmserviced: Initializing
Oct 19 09:14:46 syczora1 cmserviced: Executing command: rm -f /var/adm/cmcluster/.cmserviced.*.socket
Oct 19 09:14:46 syczora1 cmserviced: Request to perform run service cmlogd
Oct 19 09:14:46 syczora1 cmserviced: Request to perform run service cmfileassistd
Oct 19 09:14:46 syczora1 cmserviced: Request to perform run service cmlockd
Oct 19 09:14:46 syczora1 cmfileassistd: Changed to working directory /var/adm/cmcluster/cmfileassistd.
Oct 19 09:14:46 syczora1 cmlockd: Changed to working directory /var/adm/cmcluster/cmlockd.
Oct 19 09:14:46 syczora1 cmlockd: Executing command: rm -f /var/adm/cmcluster/.cmlock.*.socket
Oct 19 09:14:46 syczora1 cmserviced: Request to perform run service cmnetd
Oct 19 09:14:46 syczora1 cmnetd: Changed to working directory /var/adm/cmcluster/cmnetd.
Oct 19 09:14:46 syczora1 cmnetd: Initializing
Oct 19 09:14:46 syczora1 cmnetd: Executing command: rm -f /var/adm/cmcluster/.cmnetd.*.socket
Oct 19 09:14:46 syczora1 cmnetd: Auto Failback is enabled.
Oct 19 09:14:46 syczora1 cmserviced: Request to perform run service cmlvmd
Oct 19 09:14:47 syczora1 cmcld: Membership: membership at 0 is REFORMING (coordinator 1) includes: 1 excludes: 2
Oct 19 09:14:47 syczora1 cmcld: Member syczora2 is joining the cluster.
Oct 19 09:14:47 syczora1 cmcld: Resolving quorum with members syczora1, syczora2
Oct 19 09:14:47 syczora1 cmcld: Quorum satisfied
Oct 19 09:14:47 syczora1 cmcld: Membership: membership at 1 is FORMED (coordinator 1) includes: 1 2 excludes:
Oct 19 09:14:47 syczora1 cmcld: 2 nodes have formed a new cluster, sequence #1
Oct 19 09:14:47 syczora1 cmcld: The new active cluster membership is: syczora1(id=1), syczora2(id=2)
Oct 19 09:14:47 syczora1 cmcld: Cluster CDB version 12 and node 1 CDB version 12
Oct 19 09:14:47 syczora1 cmcld: Cluster CDB version 12 and node 2 CDB version 12
Oct 19 09:14:47 syczora1 cmlvmd: Clvmd initialized successfully.
Oct 19 09:14:47 syczora1 cmcld: Request from node syczora1 to start package orapkg on node syczora1.
Oct 19 09:14:47 syczora1 cmcld: Executing '/etc/cmcluster/orapkg/orapkg.cntl  start' for package orapkg, as service PKG*107009.
Oct 19 09:14:47 syczora1 cmserviced: Request to perform run service PKG*107009
Oct 19 09:14:58 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 09:14:59 syczora1 cmdisklockd: added device: /dev/vglock:/dev/dsk/c2t0d0
Oct 19 09:14:59 syczora1 cmcld: Cluster lock disk /dev/vglock:/dev/dsk/c2t0d0 is good
Oct 19 09:14:59 syczora1 cmcld: Received clear reply in state clearing
Oct 19 09:15:02 syczora1 LVM: vgchange -a e vgdata
Oct 19 09:15:06 syczora1 sshd: SSH: Server;LType: Throughput;Remote: 10.20.90.127-1076;IN: 14672;OUT: 4532;Duration: 120.3;tPut_in: 121.9;tPut_out: 37.7
Oct 19 09:15:17 syczora1 LVM: vgchange -a e vgarch
Oct 19 09:15:18 syczora1 syslog: cmmodnet -a -i 10.20.90.7 10.20.90.0
Oct 19 09:15:18 syczora1 su: + tty?? root-oracle
Oct 19 09:15:52 syczora1 cmserviced: Service PKG*107009 terminated due to an exit(0).
Oct 19 09:15:52 syczora1 cmcld: Started package orapkg on node syczora1.
Oct 19 09:15:31 syczora1 su: + tty?? root-oracle
Oct 19 09:16:04 syczora1 telnetd: getpid: peer died: Error 0
Oct 19 09:16:26 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 09:17:44 syczora1 telnetd: getpid: peer died: Error 0
Oct 19 09:32:56 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 09:33:48 syczora1  above message repeats 11 times
Oct 19 09:34:26 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 09:52:32 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 09:53:48 syczora1  above message repeats 12 times
Oct 19 09:54:02 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 10:13:42 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 10:13:48 syczora1  above message repeats 14 times
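
The log above is long, so here is a rough Python sketch for summarizing it: it counts messages per daemon and lists the cmnetd events in order, which makes it easier to see when the subnet dropped and how often cmdisklockd complained about the lock disk. The line format is only assumed from the excerpt above, the "above message repeats N times" lines are assumed to refer to the line directly before them, and `summarize` is a hypothetical helper, not a Serviceguard tool.

```python
import re
from collections import Counter

# A few lines copied from the syslog.log excerpt above.
SAMPLE = """\
Oct 19 08:07:20 syczora1 cmnetd: 10.20.90.8 failed.
Oct 19 08:07:20 syczora1 cmnetd: lan2 is down at the IP layer.
Oct 19 08:06:47 syczora1 cmdisklockd: Still trying to inquire cluster lock disk /dev/dsk/c2t0d0
Oct 19 08:07:20 syczora1  above message repeats 8 times
Oct 19 08:07:20 syczora1 cmnetd: Subnet 10.20.90.0 down
Oct 19 08:07:30 syczora1 cmnetd: Subnet 10.20.90.0 up
Oct 19 09:10:50 syczora1 cmnetd: Subnet 10.20.90.0 switched from lan2 to lan0
"""

# Assumed format: "Mon DD HH:MM:SS host tag: message".
LINE = re.compile(r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (\S+) (\S+?): (.*)$")
REPEAT = re.compile(r"above message repeats (\d+) times")


def summarize(text):
    """Return (message count per tag, ordered list of cmnetd events)."""
    counts = Counter()
    net_events = []
    last_tag = None
    for line in text.splitlines():
        rep = REPEAT.search(line)
        if rep:
            # Assumption: the repeat notice belongs to the previous line's tag.
            if last_tag is not None:
                counts[last_tag] += int(rep.group(1))
            continue
        m = LINE.match(line)
        if not m:
            continue
        stamp, _host, tag, msg = m.groups()
        counts[tag] += 1
        last_tag = tag
        if tag == "cmnetd":
            net_events.append((stamp, msg))
    return counts, net_events


counts, net_events = summarize(SAMPLE)
for stamp, msg in net_events:
    print(stamp, msg)
print(dict(counts))
```

To run it on the real file, read the text of syslog.log and pass it to `summarize` instead of `SAMPLE`.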


http://coctec.com/docs/service/show-post-4324.html