
Installing RedHat RHCS on RHEL 5.1

Author: Fabio Silva

If anyone knows of an English or Chinese version of this article, please let me know. Thanks!

http://sources.redhat.com/cluster/wiki/QuickStart-Portuguese

I originally wanted to translate this article into Chinese. The original is in Portuguese and I only ran it through Google Translate; there are many parts I do not understand, so please keep that in mind. Sorry. (The text above was typed with the www.InputKing.com online Chinese input method.)

Contents

Installation of RHEL 5.1 Cluster Suite
Installing the Support Pack for the HP server DL380 G5
Setting up bonding of the network interfaces
Setup Multipath
Openais
Cluster Setup
Administration of Cluster
Manual actions to restart the services
Acknowledgments

This document describes how to set up a clustered environment using the RHEL Cluster Suite.

The environment is configured as a two-node cluster, one active and one standby, where the active node holds a VIP (virtual IP), mounts the partitions that live on the storage, and provides the Apache and MySQL services.

The machines used in the solution are two HP DL380 G5 servers. The Gnome Desktop is installed on the nodes so that you can install the HP Support Pack, which contains the appropriate drivers for this machine on RHEL 5.1. If you prefer not to install Gnome and to install the Support Pack from the command line, that is also possible.

Question 1: I have never used HP iLO before, and I do not know how the iLO communicates with the other node. If anyone knows, please give me a hint.

Question 2: bond1 is used for CMAN communication(?) - hello messages?

Question 3: Storage is not covered much in the article, so I took a wild guess; I will be grateful if anyone tells me where it is wrong. That is why I am posting here.

《Solution》

Reply to gl00ad's post #1

Definition of the environment settings

The partitions created on the nodes are:

/dev/cciss/c0d0p1       /boot   100M  
/dev/cciss/c0d0p2       SWAP    2G  
/dev/cciss/c0d0p3       /       10G  
/dev/cciss/c0d0p4       Extended  
/dev/cciss/c0d0p5       /var    57G

The machine hostnames should be defined as:

node1.local
node2.local

Machine: node1.local

eth0 192.168.2.153/24
eth2 10.10.10.1/30
iLO IP 192.168.2.149/24
Default GW 192.168.2.1 (Default GW's network)

Machine: node2.local

eth0 192.168.2.154/24
eth2 10.10.10.2/30
iLO IP 192.168.2.150/24
Default GW 192.168.2.1 (Default GW's network)

Here eth2 will be used to check the availability of the nodes. The right thing to do is to create a VLAN on the switch to isolate it and keep that communication network separate. After the Support Pack is installed, bonding will be set up so that two network cards are used in high availability.
The packages selected at the time of installation are:

Desktop Environment     --> Gnome Desktop
Applications --> Graphical Internet
Development --> Development Libraries
                                Development Tools
                                Legacy Software Development
Server -->      Legacy Network Server
                        Mysql Database
                        Printing Support
                        Web Server
                                Optional Packages
                                        + mod_auth_mysql
                                        + php-mysql
                                        + php-odbc
                                        + php-pear
                                        - squid
                                        - webalizer
Base System --> Administration Tools
                                Base
                                Java
                                Legacy Software Tools
                                System Tools
                                X Window System
After the installation completes, some extra packages must be installed; they are located on the RHEL CD. When you mount the CD you can see a folder named Cluster, which contains the packages needed to install the cluster, and a folder named Server, which contains the packages belonging to RHEL itself.

mount /dev/cdrom /mnt
cd /mnt
rpm -ivh Server/perl-Net-Telnet*
rpm -ivh Server/perl-Carp-Clan*
rpm -ivh Server/perl-Bit-Vector*
rpm -ivh Server/perl-Date-Calc*
rpm -ivh Server/perl-Crypt-SSLeay*
rpm -ivh Server/openais*
rpm -ivh Server/cman*
rpm -ivh Server/postfix*
rpm -ivh Cluster/rgmanager*

rpm -e sendmail

chkconfig cman on
chkconfig rgmanager on
chkconfig postfix on
chkconfig openais off
chkconfig iptables off
chkconfig ip6tables off
chkconfig bluetooth off
chkconfig avahi-daemon off
chkconfig portmap off
chkconfig acpid off
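
As a quick sanity check (this verification step is not in the original article), you can confirm which services are enabled at boot:

chkconfig --list | egrep 'cman|rgmanager|openais|postfix'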

《Solution》

Installing the Support Pack for the HP server DL380 G5


The version used was 7.92.
Download the package (about 97 MB), copy the file psp-7.92.rhel5.linux.en.tar.gz to the nodes, and then do the following:
tar xvzf psp-7.92.rhel5.linux.en.tar.gz
cd compaq/csp/linux
./install792.sh -nui

Some questions will be asked; the answers I changed were the following:

The hpsmh-2.1.10-186.linux.x86_64.rpm component requires input: TrustByAll

Do you want to activate the iLO driver at startup? : NO
Please enter the SNMP localhost read/write community string []: private
Please enter the SNMP localhost Read-Only community string []: public
Enter the SNMP Read/Write Authorized Mgmt Station community string []: private
Enter the SNMP Read-Only Authorized Mgmt Station community string []: public
Enter the Default SNMP trap community string []: public
Do you wish to reconfigure the HP Version Control Agent? N
with 'root only' access! : Y

chkconfig hpvca off

/etc/hosts

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1       localhost.localdomain   localhost
::1     localhost6.localdomain6 localhost6

10.10.10.1      node1.local     node1
10.10.10.2      node2.local     node2
192.168.2.149   ilo-node1.local ilo-node1
192.168.2.150   ilo-node2.local ilo-node2
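
Since the node names are listed against the 10.10.10.x addresses, cluster traffic should use bond1; as a quick check (not in the original article) you can confirm how the names resolve:

getent hosts node1.local node2.local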

《Solution》

Setting up bonding of the network interfaces


Because this server has four network cards, we use eth0 and eth1 to serve the internal network and eth2 and eth3 for the cluster communication. To do this, follow the steps below.

Open /etc/modprobe.conf and add the following lines to the end of the file

alias bond0 bonding
alias bond1 bonding
options bonding mode=0 miimon=100

Next comes the configuration of the network interface startup scripts in the directory /etc/sysconfig/network-scripts/.

The following files are the same on both machines; further below are the settings of the bond0 and bond1 interfaces for each machine.

File: /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
USERCTL=no
ETHTOOL_OPTS="autoneg off speed 100 duplex full"

/etc/sysconfig/network-scripts/ifcfg-eth1

DEVICE=eth1
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
USERCTL=no
ETHTOOL_OPTS="autoneg off speed 100 duplex full"
/etc/sysconfig/network-scripts/ifcfg-eth2

DEVICE=eth2
BOOTPROTO=none
ONBOOT=yes
MASTER=bond1
SLAVE=yes
USERCTL=no
ETHTOOL_OPTS="autoneg off speed 100 duplex full"

/etc/sysconfig/network-scripts/ifcfg-eth3
DEVICE=eth3
BOOTPROTO=none
ONBOOT=yes
MASTER=bond1
SLAVE=yes
USERCTL=no
ETHTOOL_OPTS="autoneg off speed 100 duplex full"
Setup node1.local

File: /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
USERCTL=no
ONBOOT=yes
IPADDR=192.168.2.153
NETMASK=255.255.255.0
BROADCAST=192.168.2.255
NETWORK=192.168.2.0
GATEWAY=192.168.2.1

/etc/sysconfig/network-scripts/ifcfg-bond1
DEVICE=bond1
USERCTL=no
ONBOOT=yes
IPADDR=10.10.10.1
NETMASK=255.255.255.252
BROADCAST=10.10.10.3
NETWORK=10.10.10.0
Setup node2.local

File: /etc/sysconfig/network-scripts/ifcfg-bond0

DEVICE=bond0
USERCTL=no
ONBOOT=yes
IPADDR=192.168.2.154
NETMASK=255.255.255.0
BROADCAST=192.168.2.255
NETWORK=192.168.2.0
GATEWAY=192.168.2.1
/etc/sysconfig/network-scripts/ifcfg-bond1

DEVICE=bond1
USERCTL=no
ONBOOT=yes
IPADDR=10.10.10.2
NETMASK=255.255.255.252
BROADCAST=10.10.10.3
NETWORK=10.10.10.0
Reboot the machine so that everything comes up properly:
reboot

Note: if you have a problem with the order in which the network cards come up after boot, set the HWADDR parameter to the MAC address of each card in its configuration file.
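
For example, a hypothetical ifcfg-eth0 pinned to its card's MAC address might look like the sketch below (the MAC shown is a placeholder; use the real address of the card):

DEVICE=eth0
# placeholder MAC - replace with the card's real address
HWADDR=00:1B:78:AA:BB:CC
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
USERCTL=no
ETHTOOL_OPTS="autoneg off speed 100 duplex full"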

After the boot, run ping tests between the machines and to the default network gateway. On node1.local:

ping 192.168.2.1
ping 192.168.2.149
ping 192.168.2.150
ping 10.10.10.2
ping node2
ping ilo-node1
ping ilo-node2

On node2.local:
ping 192.168.2.1
ping 192.168.2.149
ping 192.168.2.150
ping 10.10.10.1
ping node1
ping ilo-node1
ping ilo-node2
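
Besides the ping tests, a quick way to confirm that both bonds and their slaves are healthy (not part of the original article) is to look at the bonding status files:

cat /proc/net/bonding/bond0
cat /proc/net/bonding/bond1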

《Solution》

Setup Multipath

For Linux to have access to the data on the storage, you must configure multipathing. The package responsible for the communication between Linux and the storage is device-mapper-multipath.

After presenting the storage disks to Linux (how to do that is not discussed here), we should check whether Linux is seeing the storage disks.

Edit /etc/multipath.conf and comment out the block below so that the devices are not blacklisted, leaving it as shown, and also make sure user_friendly_names is set to yes; if not, set it.
#blacklist {
#        devnode "*"
#}
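
For reference, a minimal sketch of the defaults stanza with user_friendly_names enabled (the stock file shipped with RHEL 5.1 may carry more options) is:

defaults {
        user_friendly_names yes
}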

After this, make sure that the dm_multipath module is loaded:
# lsmod | grep dm_multipath
dm_multipath           52433  2 dm_round_robin
dm_mod                 96017  7 dm_mirror,dm_multipath

Start the multipathd service, then list the multipath devices:
# service multipathd start
# multipath -l
mpath0 (3600508b40010870a0001e000000d0000) dm-0 HP,HSV200

\_ round-robin 0
\_ 0:0:1:1 sdb 8:16  
\_ 1:0:0:1 sdc 8:32  
\_ round-robin 0
\_ 0:0:0:1 sda 8:0   
\_ 1:0:1:1 sdd 8:48  
You can also see in /dev/mpath/ something like the following:

lrwxrwxrwx  1 root root    7 Mar  7 09:49 3600508b40010870a0001e000000d0000 -> ../dm-0

And since we left the user_friendly_names option set to yes, we can use a friendly name to locate the device created; it can be seen in /dev/mapper/:

# ls -la /dev/mapper/
total 0
drwxr-xr-x  2 root root     100 Mar  7 09:49 .
drwxr-xr-x 14 root root    4280 Mar  7 09:49 ..
crw-------  1 root root  10, 63 Mar  7 09:49 control
brw-rw----  1 root disk 253,  0 Mar  7 09:49 mpath0
All right so far; all that remains is to create a partition on the mpath0 disk and create the file system on it.

Create a partition with the command fdisk /dev/mapper/mpath0.

Then make the kernel re-read the partition table:
# partprobe
Then create the file system on the newly created partition:
# mkfs.ext3 /dev/mapper/mpath0p1
Okay; we can test the partition by mounting it somewhere:
# mount /dev/mapper/mpath0p1 /mnt
And you can see the disk mounted:
# df -h | grep mpath0p1
/dev/mapper/mpath0p1  9.9G  999M  8.4G  11% /mnt
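
Since rgmanager will later mount this filesystem itself (on /dados, according to the cluster.conf further below), it is a good idea, although the original does not spell it out, to unmount the test mount before configuring the cluster:

# umount /mnt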
The file /var/lib/multipath/bindings is created automatically by the multipathd daemon and contains the friendly names for the multipath devices; an example of its content is below.

mpath0 3600508b40010870a0001e000000d0000

Note: it is very important that this file is identical on both nodes, otherwise we will have problems mounting the partition. If this occurs, check the documentation and see how to create aliases for the WWID; that may be the solution, as sketched below.
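
A hedged example of such an alias stanza in /etc/multipath.conf, using the WWID seen above (the alias name "dados" is only an illustration), would be:

multipaths {
        multipath {
                wwid  3600508b40010870a0001e000000d0000
                alias dados
        }
}

If you use an alias like this, the device shows up as /dev/mapper/dados (partitions as /dev/mapper/dadosp1), and the device path used in cluster.conf would have to match.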

《Solution》

Openais

You should check and change the file /etc/ais/openais.conf so that it uses the 10.10.10.0 network and not the 192.168.2.0 network; this should be done on both machines.

Change the line
bindnetaddr: 192.168.2.0
to
bindnetaddr: 10.10.10.0
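
For orientation, bindnetaddr lives inside the totem/interface block of openais.conf; a trimmed sketch of that block is shown below (the multicast address and port are the usual example defaults and may differ on your system):

totem {
        version: 2
        interface {
                ringnumber: 0
                bindnetaddr: 10.10.10.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}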

《Solution》

Cluster Setup

The configuration of the cluster is done in /etc/cluster/cluster.conf, and the configuration for this environment is shown below.
<?xml version="1.0"?>
<cluster config_version="1" name="teste">
<cman two_node="1" expected_votes="1">
</cman>
<clusternodes>
        <clusternode name="node1.local" nodeid="1" votes="1">
                <fence>
                                        <method name="1">
                                                <device name="ilo-node1"/>
                                        </method>
                                        <method name="2">
                                                <device name="manual" nodename="node1.local"/>
                                        </method>
                </fence>
        </clusternode>
        <clusternode name="node2.local" nodeid="2" votes="1">
                <fence>
                                        <method name="1">
                                                <device name="ilo-node2"/>
                                        </method>
                                        <method name="2">
                                                <device name="manual" nodename="node2.local"/>
                                        </method>
                </fence>
        </clusternode>
</clusternodes>
<fencedevices>
        <fencedevice agent="fence_ilo" hostname="ilo-node1.local" login="Administrator" name="ilo-node1" passwd="12345"/>
        <fencedevice agent="fence_ilo" hostname="ilo-node2.local" login="Administrator" name="ilo-node2" passwd="12345"/>
        <fencedevice agent="fence_manual" name="manual" />
</fencedevices>

<rm>
<failoverdomains/>
       <service name="cluster-services" autostart="1">
                <fs name="dados" device="/dev/mapper/mpath0p1" fstype="ext3" mountpoint="/dados" force_unmount="1"/>
                <ip address="192.168.2.23"/>
                <script name="apache" file="/etc/init.d/httpd"/>
                <script name="mysql" file="/etc/init.d/mysqld"/>
       </service>
</rm>
</cluster>

More information about the parameters can be found on this link

The fencedevice parameter is used to restart the nodes when a problem occurs.

E.g.: if node1 is active and can no longer communicate with node2 over eth2, node1 sends a command to node2's iLO (over eth0) so that node2 is restarted, since node2 may have had a network problem or an operating system freeze.
If node2 has lost its power cable we have a problem, because node1 will not be able to send the reset to node2's iLO; the cluster will become unstable and some manual steps must be taken, as will be seen later in this documentation.

The fencedevice parameter is very important to ensure data integrity, especially when the nodes access shared storage for reading and writing data. Imagine if node1 could not reset node2 (which still has the services active) and decided to mount the storage partitions and start the services anyway: this would corrupt the file system, because both nodes would have the storage partition mounted. So in some situations we must intervene manually so as not to corrupt the data.
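
One operational detail the article only implies: cluster.conf must be identical on both nodes, and whenever you change it later you should increment config_version and propagate the new file. On RHEL 5, with the cluster already running, this is typically done with ccs_tool (the version number below is a placeholder):

ccs_tool update /etc/cluster/cluster.conf
cman_tool version -r 2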

Start the cman and rgmanager services:
service cman start
service rgmanager start
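
Once cman is running on both nodes, you can verify (a check that is not in the original text) that the cluster has formed and is quorate:

cman_tool status
cman_tool nodes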

《Solution》

Administration of Cluster

Who "take care" of services that the wheel is the cluster service rgmanager

Check the status of the cluster with the clustat command; with it you can also see whether the members are Online or Offline:
# clustat
Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  node1.local                           1 Online, Local, rgmanager
  node2.local                           2 Online, rgmanager

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  service:cluster-serv node1.local                    started
# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M     36   2008-02-29 13:07:42  node1.local
   2   M     40   2008-02-29 13:07:42  node2.local
Another interesting command is clusvcadm; with it you can stop the cluster services or even relocate a service from one member to another.

E.g. the services are running on node1.local and we need to move them to node2.local; you should then run the command clusvcadm -r cluster-services

Some other interesting parameters are:
clusvcadm -e   <-- enable (start)
clusvcadm -d   <-- disable (stop)
clusvcadm -r   <-- relocate (move to another machine)
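
For example, to relocate the service defined earlier to a specific member, clusvcadm's -m option names the target node; a sketch using the names from this setup:

clusvcadm -r cluster-services -m node2.local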
If a restart of a node is required, you can do it with the commands below, which trigger the iLO agent and restart the requested machine:
# fence_node node2.local    <-- node2.local should reboot
# fence_node node1.local    <-- node1.local should reboot
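
It can also be useful to test the iLO fence agent by hand, outside the cluster. A hedged example using the credentials from the cluster.conf above (check the fence_ilo man page for the exact options shipped with your version):

fence_ilo -a ilo-node2.local -l Administrator -p 12345 -o status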

《Solution》

Manual actions to restart the services

If node2 goes down and the services do not come up on node1, you have to bring the services up manually, first making sure that node2 does not have the services active and/or the storage partitions mounted, in order to maintain data integrity and not corrupt it.

First check the status of the members with the clustat command:

# clustat
Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  node1.local                           1 Online, Local, rgmanager
  node2.local                           2 Offline

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  service:cluster-serv node2.local                    started
#
You can see that node2 is offline, yet from node1's point of view the services are still active on node2. This happens because node1 cannot verify whether node2 had a network problem or whether the machine crashed and may still have the storage partition mounted; if node1 simply started the services, the data could be corrupted.

Since node1 cannot obtain any information from node2, you must check the logs to see whether the fencing process is stuck:

tail -f /var/log/messages
Feb 29 15:24:45 node1 fenced: fencing node "node2.local"
Feb 29 15:24:45 node1 ccsd: process_get: Invalid connection descriptor received.
Feb 29 15:24:45 node1 ccsd: Error while processing get: Invalid request descriptor
Feb 29 15:24:45 node1 fenced: fence "node2.local" failed
To resolve this, run the command below:
echo node2.local > /var/run/cluster/fenced_override
And you can see the result of the command in the logs:
tail -f /var/log/messages
Feb 29 15:29:27 node1 fenced: fence "node2.local" overridden by administrator intervention
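
As an aside not found in the original text, RHEL 5 also provides fence_ack_manual, a small helper for acknowledging a manual fence that does essentially the same thing as the echo above; assuming the fence_manual device from cluster.conf, and only after you have physically confirmed that node2 is really down, it would be used roughly like this (check the man page for the exact options on your version):

fence_ack_manual -n node2.local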

As a result, you should stop the services and start them again on node1:
clusvcadm -d cluster-services
clusvcadm -e cluster-services
After that, with the clustat command you can see the services running on node1:
# clustat
Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  node1.local                           1 Online, Local, rgmanager
  node2.local                           2 Offline

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  service:cluster-serv node1.local                    started
#

《Solution》

Acknowledgments

I thank Lon Hohberg, who helped me in setting up the environment.

And also the people on the #linux-cluster channel on the irc.freenode.net server.

My name is Fabio Silva, fssilva@gmail.com - http://www.fabiosilva.eti.br.

QuickStart-Portuguese (last edited 2008-04-17 19:26:24 by fabiosilva)
