Infiniband cluster Part 2

Configuring the Network

Intro:

Welcome to part 2 of the series. We will cover setting up three VLANs for the TCP/IP network, and then we will ensure all the packages for InfiniBand are installed.

TCP/IP

I will use the following VLANs:

  • 1 – Management LAN
  • 2 – Main LAN
  • 3 – Private Cluster LAN

These should be changed to suit your environment.

Switch:

As my cluster is using 8 out of 16 slots on the blade I have set the ports as follows:

  • VLAN1 – N/A (this is management and only used to access the CMC / DRAC cards)
  • VLAN2 – 1/g1 -> 1/g20 (all ports should be able to talk on the Main LAN)
  • VLAN3 – 1/g1 -> 1/g8 (I only have 8 nodes at the moment, though you could configure the remaining 8 so there is less to configure for additional nodes)

Port settings:

  • Ports 1-16
    • VLAN Mode = General
    • PVID = 2
    • Frame Type = admit all
    • Ingress Filtering = enable
    • Port Priority = 0
  • Ports 17-20
    • VLAN Mode = Trunk

To configure from the command line run the following for ports 1-16:

enable
configure
interface ethernet 1/gXX
switchport mode general
switchport general pvid 2
exit

For ports 17-20:

enable
configure
interface ethernet 1/gXX
switchport mode trunk
exit
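
Typing the same stanza for every port gets tedious. As a rough sketch, the loop below prints the per-port commands shown above for a range of ports so they can be pasted into the switch console after "configure". The function name gen_port_config and its arguments are made up for this example; it only prints text and does not talk to the switch.

```shell
#!/bin/sh
# Hypothetical helper: print the per-port commands shown above.
# Usage: gen_port_config FIRST LAST MODE   (MODE is "general" or "trunk")
gen_port_config() {
    first=$1
    last=$2
    mode=$3
    port=$first
    while [ "$port" -le "$last" ]; do
        echo "interface ethernet 1/g$port"
        echo "switchport mode $mode"
        # Only general-mode ports get a PVID in this setup (VLAN 2 = Main LAN).
        if [ "$mode" = "general" ]; then
            echo "switchport general pvid 2"
        fi
        echo "exit"
        port=$((port + 1))
    done
}

# Ports 1-16 in general mode, ports 17-20 as trunks.
gen_port_config 1 16 general
gen_port_config 17 20 trunk
```

Review the printed commands before pasting them, and adjust the port ranges and PVID to match your own VLAN layout.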

System

Enable VLAN Support:

modprobe 8021q

Ensure that eth0 is up and add VLANs:

ifconfig eth0 up
vconfig add eth0 2
vconfig add eth0 3
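
To confirm the VLAN interfaces were actually created, one option is to look for them under /sys/class/net. The sketch below assumes the default vconfig naming (eth0.2, eth0.3); the function name check_vlans and the SYSFS_NET override are hypothetical and exist only to keep the example easy to try out.

```shell
#!/bin/sh
# Hypothetical helper: report whether each VLAN interface exists.
# Usage: check_vlans DEVICE VLANID...
# SYSFS_NET may be set to another directory for testing; it defaults to
# /sys/class/net, where the kernel lists network interfaces.
check_vlans() {
    dev=$1
    shift
    missing=0
    for id in "$@"; do
        if [ -e "${SYSFS_NET:-/sys/class/net}/$dev.$id" ]; then
            echo "$dev.$id present"
        else
            echo "$dev.$id missing"
            missing=1
        fi
    done
    # Non-zero exit status if any interface was not found.
    return $missing
}

check_vlans eth0 2 3 || echo "one or more VLAN interfaces are missing"
```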

Edit /etc/sysconfig/network-scripts/ifcfg-eth0:

  • Change “ONBOOT=yes” to “ONBOOT=no”
  • Remove the DHCP_HOSTNAME line

For MainLAN access:

vi /etc/sysconfig/network-scripts/ifcfg-eth0.2

DEVICE=eth0.2
BOOTPROTO=dhcp
ONBOOT=yes
VLAN=yes

For the private VLAN (replace 3 with the VLANID):

vi /etc/sysconfig/network-scripts/ifcfg-eth0.3

  • Delete the contents then add:
DEVICE=eth0.3
BOOTPROTO=none
ONBOOT=yes
IPADDR=
NETMASK=255.255.255.0
VLAN=yes
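
With several nodes it can be easier to generate this file than to edit it by hand on each one. The following is a minimal sketch, not a definitive script: gen_vlan_ifcfg is a hypothetical helper, and the address 192.168.3.11 is a made-up example, so substitute the private address you have assigned to each node.

```shell
#!/bin/sh
# Hypothetical helper: print an ifcfg file for a static VLAN interface.
# Usage: gen_vlan_ifcfg DEVICE VLANID IPADDR [NETMASK]
gen_vlan_ifcfg() {
    dev=$1
    vlan=$2
    ip=$3
    mask=${4:-255.255.255.0}   # same default netmask as the file above
    cat <<EOF
DEVICE=$dev.$vlan
BOOTPROTO=none
ONBOOT=yes
IPADDR=$ip
NETMASK=$mask
VLAN=yes
EOF
}

# Example only: 192.168.3.11 is an assumed address, not from this setup.
gen_vlan_ifcfg eth0 3 192.168.3.11
```

Redirect the output to /etc/sysconfig/network-scripts/ifcfg-eth0.3 (as root) once it looks right.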

Note: All nodes should have eth0.2 even if it is not to be used. On the nodes that are to be private set “eth0.2” ONBOOT to no.

To make a class 2 cluster:
Follow instructions above

To make a class 1 cluster:
In “/etc/sysconfig/network-scripts/ifcfg-eth0.2” (from the MainLAN section above) change “ONBOOT=yes” to “ONBOOT=no”, then remove the running interface:

vconfig rem eth0.2

Leave the ifcfg script in place so it can be reused if you switch back to a class 2 cluster.
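
Switching between the two classes comes down to flipping the ONBOOT line in the eth0.2 script. A small sketch of that edit is below. The set_onboot helper is hypothetical; it rewrites only the ONBOOT line and leaves the rest of the file untouched.

```shell
#!/bin/sh
# Hypothetical helper: set ONBOOT to yes or no in an ifcfg file.
# Usage: set_onboot FILE yes|no
set_onboot() {
    file=$1
    value=$2
    case $value in
        yes|no) ;;
        *) echo "value must be yes or no" >&2; return 1 ;;
    esac
    tmp=$(mktemp) || return 1
    # Write the edited copy to a temp file, then copy it back over the
    # original so the file keeps its ownership and permissions.
    sed "s/^ONBOOT=.*/ONBOOT=$value/" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Class 1: set_onboot /etc/sysconfig/network-scripts/ifcfg-eth0.2 no
# Class 2: set_onboot /etc/sysconfig/network-scripts/ifcfg-eth0.2 yes
```

The two calls are left as comments because they need root and change the live configuration.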

InfiniBand

Note: Currently one of the nodes has to act as a master for the InfiniBand network. In theory it should be possible to have all nodes as equal peers, but I don’t know how to configure that.

Note 2: Starting openibd seems to kill eth0 in some instances, so you may need to bring it up again (ifconfig eth0 up).

Ensure the following packages are installed:

yum install openib libibverbs libmthca libmlx4 libipathverbs libcxgb3 libnes libibcm librdmacm librdmacm-utils opensm libibcommon libibumad libibmad dapl dapl-utils ibutils infiniband-diags ibsim libsdp mstflint tvflash perftest qperf ofed-docs qlvnictools srptools mpi-selector openmpi mvapich mvapich2 libibverbs-utils

On the node that is to act as the “head node”:

/etc/init.d/opensmd start
/etc/init.d/openibd start

chkconfig opensmd on
chkconfig openibd on

On the other nodes:

/etc/init.d/openibd start
chkconfig openibd on
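
Once the services are running, a quick sanity check is to see whether a port has reached the Active state. The sketch below assumes the ibstat tool from the infiniband-diags package installed earlier, and that its output contains a "State: Active" line for a working port. The ib_port_active function is a hypothetical helper that just scans that text.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the ibstat-style text on stdin reports
# at least one port in the Active state. The "Physical state:" line is
# ignored because it does not start with "State:".
ib_port_active() {
    grep -q '^[[:space:]]*State:[[:space:]]*Active'
}

# Only run the real check when ibstat is installed.
if command -v ibstat >/dev/null 2>&1; then
    if ibstat | ib_port_active; then
        echo "InfiniBand port is Active"
    else
        echo "No Active InfiniBand port (is opensmd running on the head node?)"
    fi
fi
```

A port that stays in a state other than Active usually means no subnet manager is reachable, which is why opensmd is started on the head node.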
