Deploying Your Solution
3.1. Installing and configuring OpenHPC 1.3
3.1.1. Pre-Requisites
The first step after the OS installation on the head node is to disable the firewall:
[root@HPCHN ~]# systemctl disable firewalld
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
Removed symlink /etc/systemd/system/basic.target.wants/firewalld.service.
[root@HPCHN ~]# systemctl stop firewalld
Then update the distro with:
[root@HPCHN ~]# yum update
Next, SELinux must be disabled.
From the command line, you can edit the /etc/sysconfig/selinux file. This file is a link to /etc/selinux/config. The configuration file is self-explanatory. Changing the value of SELINUX or SELINUXTYPE changes the state of SELinux and the name of the policy to be used the next time the system boots.
[root@HPCHN ~]# vi /etc/sysconfig/selinux
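If you prefer a non-interactive edit, the following is a minimal sketch using sed on the stock CentOS 7 configuration file; note that the change only takes effect after a reboot:
[root@HPCHN ~]# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
[root@HPCHN ~]# grep ^SELINUX= /etc/selinux/config
SELINUX=disabled
[root@HPCHN ~]# getenforce      # still reports the previous mode until the system is rebooted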
The installation recipe herein assumes that the master host name is resolvable locally. Depending on the manner in which you installed the Base OS, there may be an adequate entry already defined in /etc/hosts. If not, the following addition can be used to identify your master host.
[root@HPCHN ~]# echo 10.0.2.1 HPCHN >> /etc/hosts
[root@HPCHN ~]#
3.1.2. Enable OpenHPC Repository for local use
The OpenHPC repository must be enabled in order to find the necessary packages for HPC Configuration.
[root@HPCHN ~]# yum install http://build.openhpc.community/OpenHPC:/1.3/CentOS_7/x86_64/ohpc-release-1.3-1.el7.x86_64.rpm
This installs the package ohpc-release.x86_64 0:1.3-1.el7 together with the dependency epel-release.noarch 0:7-9.
Note: this URL can change over time, so verify the current release RPM location before installing.
3.1.3. Add provisioning services on master node
The head node needs the OpenHPC base package group and Warewulf installed in order to provision the system.
[root@HPCHN ~]# yum -y groupinstall ohpc-base
[root@HPCHN ~]# yum -y groupinstall ohpc-warewulf
3.1.4. Enable Provisioning network interface
The provisioning interface used here is eth1, which appears as enp5s0f1 on this server. This interface is used for provisioning, cluster administration, and workload traffic. Start by enabling the ntp service.
[root@HPCHN ~]# systemctl enable ntpd
Created symlink from /etc/systemd/system/multi-user.target.wants/ntpd.service to /usr/lib/systemd/system/ntpd.service.
[root@HPCHN ~]# echo "server pool.ntp.org" >> /etc/ntp.conf
[root@HPCHN ~]# systemctl restart ntpd
[root@HPCHN ~]#
Then, use perl to set the provisioning interface in the Warewulf provision.conf configuration file and to enable the tftp service, and bring up the interface with the provisioning address:
[root@HPCHN ~]# perl -pi -e "s/device = eth1/device = enp5s0f1/" /etc/warewulf/provision.conf
[root@HPCHN ~]# perl -pi -e "s/^\s+disable\s+= yes/ disable = no/" /etc/xinetd.d/tftp
[root@HPCHN ~]# ifconfig enp5s0f1 10.0.2.1 netmask 255.255.255.0 up
[root@HPCHN ~]#
Note: the interface may have a different name on your server or virtual machine, so check the naming with the ifconfig command.
3.1.5. Install SLURM Server on master node
The Slurm server packages must be installed on the head node:
[root@HPCHN ~]# yum -y groupinstall ohpc-slurm-server
This is a list of the packages installed:
- munge-devel-ohpc.x86_64 0:0.5.12-21.1
- munge-libs-ohpc.x86_64 0:0.5.12-21.1
- munge-ohpc.x86_64 0:0.5.12-21.1
- slurm-devel-ohpc.x86_64 0:16.05.10-36.1
- slurm-munge-ohpc.x86_64 0:16.05.10-36.1
- slurm-ohpc.x86_64 0:16.05.10-36.1
- slurm-perlapi-ohpc.x86_64 0:16.05.10-36.1
- slurm-plugins-ohpc.x86_64 0:16.05.10-36.1
- slurm-slurmdb-direct-ohpc.x86_64 0:16.05.10-36.1
- slurm-slurmdbd-ohpc.x86_64 0:16.05.10-36.1
- slurm-sql-ohpc.x86_64 0:16.05.10-36.1
Note: Client-side components will be added to the compute image in a subsequent step.
3.1.6. Restart and enable relevant services
This step is not strictly mandatory, but restarting and enabling these services now facilitates the remaining configuration.
[root@HPCHN ~]# systemctl restart xinetd
[root@HPCHN ~]# systemctl enable mariadb.service
Created symlink from /etc/systemd/system/multi-user.target.wants/mariadb.service to /usr/lib/systemd/system/mariadb.service.
[root@HPCHN ~]# systemctl restart mariadb.service
[root@HPCHN ~]# systemctl enable httpd.service
[root@HPCHN ~]# systemctl restart httpd.service
[root@HPCHN ~]#
3.2. Define compute image for provisioning
The next step is to define a system image that can be used to provision one or more compute nodes.
3.2.1. Define CHROOT Location and Build Initial Image
Chroot is an operation that changes the apparent root directory for the current running process and its children, so it can be considered a lightweight form of isolation. This new directory tree is the one that is going to be used as the compute image.
Define the chroot location as:
[root@HPCHN ~]# export CHROOT=/opt/ohpc/admin/images/centos7.3
Build an initial chroot image; this image contains only a minimal CentOS 7.3 configuration.
[root@HPCHN ~]# wwmkchroot centos-7 $CHROOT
Loaded plugins: fastestmirror, langpacks
os-base | 3.6 kB 00:00
(1/2): os-base/x86_64/group_gz | 155 kB 00:00
(2/2): os-base/x86_64/primary_db | 5.6 MB 00:03
Determining fastest mirrors
Resolving Dependencies
--> Running transaction check
---> Package basesystem.noarch 0:10.0-7.el7.centos will be installed
---> Package bash.x86_64 0:4.2.46-20.el7_2 will be installed
…
shared-mime-info.x86_64 0:1.1-9.el7
sqlite.x86_64 0:3.7.17-8.el7
systemd.x86_64 0:219-30.el7
systemd-libs.x86_64 0:219-30.el7
systemd-sysv.x86_64 0:219-30.el7
sysvinit-tools.x86_64 0:2.88-14.dsf.el7
tcp_wrappers-libs.x86_64 0:7.6-77.el7
ustr.x86_64 0:1.0.4-16.el7
xz.x86_64 0:5.2.2-1.el7
xz-libs.x86_64 0:5.2.2-1.el7
Complete!
Note: it is important to remember the path where the chroot is defined.
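Because the chroot is just a directory tree on the head node, you can inspect it at any time; the following is an optional sketch (it assumes rpm and bash are present in the minimal image, which is the case for the centos-7 template):
[root@HPCHN ~]# chroot $CHROOT /bin/bash         # open a shell whose root is the compute image; type exit to leave
[root@HPCHN ~]# chroot $CHROOT rpm -qa | wc -l   # count the packages currently installed in the image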
Next, you need to add additional components, such as resource management client services, drivers, and other packages, to support the default OpenHPC environment.
[root@HPCHN ~]# yum -y --installroot=$CHROOT groupinstall ohpc-base-compute
These are the packages installed:
- libicu.x86_64 0:50.1.2-15.el7
- numactl.x86_64 0:2.0.9-6.el7_2
The chroot environment needs to be updated to enable DNS resolution. It is important to note that the master node must have a working DNS configuration.
[root@HPCHN ~]# cp -p /etc/resolv.conf $CHROOT/etc/resolv.conf
3.2.2. Add Slurm client support, NTP support, kernel drivers and other modules
In this step, the Slurm client support is added so that the compute nodes can communicate with the master and execute work.
Note: only the master node needs the Slurm server component; if a compute node has the Slurm server installed, the cluster will not work, and you will need to rebuild the image and restart the node.
[root@HPCHN ~]# yum -y --installroot=$CHROOT groupinstall ohpc-slurm-client
Packages Installed:
munge-ohpc.x86_64 0:0.5.12-21.1 slurm-munge-ohpc.x86_64 0:16.05.10-36.1
slurm-ohpc.x86_64 0:16.05.10-36.1 slurm-pam_slurm-ohpc.x86_64 0:16.05.10-36.1
slurm-plugins-ohpc.x86_64 0:16.05.10-36.1 slurm-sjobexit-ohpc.x86_64 0:16.05.10-36.1
Dependencies installed:
munge-libs-ohpc.x86_64 0:0.5.12-21.1 perl.x86_64 4:5.16.3-291.el7
perl-Carp.noarch 0:1.26-244.el7 perl-Encode.x86_64 0:2.51-7.el7
perl-Exporter.noarch 0:5.68-3.el7 perl-File-Path.noarch 0:2.09-2.el7
perl-File-Temp.noarch 0:0.23.01-3.el7 perl-Filter.x86_64 0:1.49-3.el7
perl-Getopt-Long.noarch 0:2.40-2.el7 perl-HTTP-Tiny.noarch 0:0.033-3.el7
perl-PathTools.x86_64 0:3.40-5.el7 perl-Pod-Escapes.noarch 1:1.04-291.el7
perl-Pod-Perldoc.noarch 0:3.20-4.el7 perl-Pod-Simple.noarch 1:3.28-4.el7
perl-Pod-Usage.noarch 0:1.63-3.el7 perl-Scalar-List-Utils.x86_64 0:1.27-248.el7
perl-Socket.x86_64 0:2.010-4.el7 perl-Storable.x86_64 0:2.45-3.el7
perl-Text-ParseWords.noarch 0:3.29-4.el7 perl-Time-HiRes.x86_64 4:1.9725-3.el7
perl-Time-Local.noarch 0:1.2300-2.el7 perl-constant.noarch 0:1.27-2.el7
perl-libs.x86_64 4:5.16.3-291.el7 perl-macros.x86_64 4:5.16.3-291.el7
perl-parent.noarch 1:0.225-244.el7 perl-podlators.noarch 0:2.5.1-3.el7
perl-threads.x86_64 0:1.87-4.el7 perl-threads-shared.x86_64 0:1.43-6.el7
slurm-devel-ohpc.x86_64 0:16.05.10-36.1 slurm-perlapi-ohpc.x86_64 0:16.05.10-36.1
As you can see, there are several munge packages. Munge is an authentication service for creating and validating credentials. It is designed to be highly scalable and fast for an HPC cluster, and it allows a process to authenticate the UID and GID of another process within a group of hosts that share a common user configuration.
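Later, once the munge daemon is running on the head node (see Section 3.6) and the compute nodes are provisioned, a quick sanity check for the shared key looks like this sketch:
[root@HPCHN ~]# munge -n | unmunge              # encode and immediately decode a credential locally
[root@HPCHN ~]# munge -n | ssh node02 unmunge   # succeeds only if node02 shares the same munge.key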
The next step is to add Network Time Protocol (NTP) support to the image:
[root@HPCHN ~]# yum -y --installroot=$CHROOT install ntp
The last command will install the ntp.x86_64 0:4.2.6p5-25.el7.centos.2 package.
Now it is time to add the kernel drivers using:
[root@HPCHN ~]# yum -y --installroot=$CHROOT install kernel
For this version, this installs kernel.x86_64 0:3.10.0-514.21.1.el7 with the following dependencies: grubby.x86_64 0:8.28-21.el7_3 and linux-firmware.noarch 0:20160830-49.git7534e19.el7.
Finally, user environment modules must be included:
[root@HPCHN ~]# yum -y --installroot=$CHROOT install lmod-ohpc
Packages Installed:
lmod-ohpc.x86_64 0:6.5.11-6.1
Dependency Installed:
lua-bit-ohpc.x86_64 0:1.0.2-1.1 lua-filesystem-ohpc.x86_64 0:1.6.3-4.1
lua-posix-ohpc.x86_64 0:33.2.1-4.1 tcl.x86_64 1:8.5.13-8.el7
tcsh.x86_64 0:6.18.01-13.el7_3.1
This completes the required installations in the image, but it is recommended to perform any additional configuration before assembling the image.
3.2.3. Customize system configuration
The following steps document the process to add a local ssh key created by Warewulf to support remote access, identify the resource manager server, configure NTP for compute, and enable NFS mounting of a $HOME file system and the public OpenHPC install path (/opt/ohpc/pub) that will be hosted by the master host.
First of all, we need to initialize the Warewulf database:
[root@HPCHN ~]# wwinit database
database: Checking to see if RPM 'mysql-server' is installed NO
database: Checking to see if RPM 'mariadb-server' is installed OK
database: Activating Systemd service: mariadb
database: + /bin/systemctl -q enable mariadb.service OK
database: + /bin/systemctl -q restart mariadb.service OK
database: + mysqladmin create warewulf OK
database: Database version: UNDEF (need to create database)
database: Creating database schema SUCCESS
database: Setting the DB SCHEMA version to 1 SUCCESS
database: Updating database permissions for base users SUCCESS
database: Updating database permissions for root user SUCCESS
Done.
[root@HPCHN ~]#
Then, initialize the SSH keys:
[root@HPCHN ~]# wwinit ssh_keys
ssh_keys: Checking ssh keys for root OK
ssh_keys: Checking root's ssh config OK
ssh_keys: Checking for default RSA1 host key for nodes NO
ssh_keys: Creating default node ssh_host_key:
ssh_keys: + ssh-keygen -q -t rsa1 -f /etc/warewulf/vnfs/ssh/ssh_host_ke OK
ssh_keys: Checking for default RSA host key for nodes NO
ssh_keys: Creating default node ssh_host_rsa_key:
ssh_keys: + ssh-keygen -q -t rsa -f /etc/warewulf/vnfs/ssh/ssh_host_rsa OK
ssh_keys: Checking for default DSA host key for nodes NO
ssh_keys: Creating default node ssh_host_dsa_key:
ssh_keys: + ssh-keygen -q -t dsa -f /etc/warewulf/vnfs/ssh/ssh_host_dsa OK
ssh_keys: Checking for default ECDSA host key for nodes NO
ssh_keys: Creating default node ssh_host_ecdsa_key:
ssh_keys: + ssh-keygen -q -t ecdsa -f /etc/warewulf/vnfs/ssh/ssh_host_e OK
Done.
[root@HPCHN ~]#
After the keys are initialized, append the cluster.pub key to the image's authorized_keys file so the image contains the SSH key needed for remote access:
[root@HPCHN ~]# cat ~/.ssh/cluster.pub >> $CHROOT/root/.ssh/authorized_keys
Next, add the NFS client mounts for /home and /opt/ohpc/pub to the base image. Note that the IP used is the one assigned to the provisioning interface.
[root@HPCHN ~]# echo "10.0.2.1:/home /home nfs nfsvers=3,rsize=1024,wsize=1024,cto 0 0" >> $CHROOT/etc/fstab
[root@HPCHN ~]# echo "10.0.2.1:/opt/ohpc/pub /opt/ohpc/pub nfs nfsvers=3 0 0" >> $CHROOT/etc/fstab
[root@HPCHN ~]#
Now, add the head node's hostname to the slurm.conf file. Compute nodes use this hostname to identify the master, which is why DNS must be properly configured.
[root@HPCHN ~]# hostname
HPCHN
[root@HPCHN ~]# perl -pi -e "s/ControlMachine=\S+/ControlMachine=HPCHN/" /etc/slurm/slurm.conf
[root@HPCHN ~]#
Export /home and the OpenHPC public packages from the master server to the cluster compute nodes:
[root@HPCHN ~]# echo "/home *(rw,no_subtree_check,fsid=10,no_root_squash)" >> /etc/exports
[root@HPCHN ~]# echo "/opt/ohpc/pub *(ro,no_subtree_check,fsid=11)" >> /etc/exports
[root@HPCHN ~]# exportfs -a
[root@HPCHN ~]# systemctl restart nfs
[root@HPCHN ~]# systemctl enable nfs-server
Created symlink from /etc/systemd/system/multi-user.target.wants/nfs-server.service to /usr/lib/systemd/system/nfs-server.service.
[root@HPCHN ~]#
Note: as you can see, the /home directory is exported with read/write permission, while /opt/ohpc/pub is exported read-only.
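To confirm what the head node is actually exporting, a quick check (showmount is part of nfs-utils):
[root@HPCHN ~]# exportfs -v             # list the active exports and their options
[root@HPCHN ~]# showmount -e 10.0.2.1   # query the export list as an NFS client would see it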
Enable NTP services on the compute image and identify the master node as the local NTP server. The head node can use an external NTP server, but for the compute nodes, the head node is the server. Time synchronization is essential in a cluster.
[root@HPCHN ~]# chroot $CHROOT systemctl enable ntpd
Created symlink /etc/systemd/system/multi-user.target.wants/ntpd.service, pointing to /usr/lib/systemd/system/ntpd.service.
[root@HPCHN ~]#
[root@HPCHN ~]# echo "server 10.0.2.1" >> $CHROOT/etc/ntp.conf
[root@HPCHN ~]#
Note: SLURM requires enumeration of the physical hardware characteristics for compute nodes under its control. The default configuration file provided by OpenHPC assumes dual socket, 8 cores per socket, and two threads per core for this 4-node example. If this does not reflect your local hardware, please update the configuration file at /etc/slurm/slurm.conf accordingly to match your particular hardware. See Appendix B.1 for the slurm.conf file used on this cluster.
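For illustration only, the node and partition definitions in slurm.conf look similar to the lines below (the counts mirror the dual-socket, 8-core, 2-thread example and must be adjusted to your hardware; the actual file is reproduced in Appendix B.1):
# compute node hardware description and default partition (example values)
NodeName=node[02-04] Sockets=2 CoresPerSocket=8 ThreadsPerCore=2 State=UNKNOWN
PartitionName=normal Nodes=node[02-04] Default=YES MaxTime=24:00:00 State=UP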
3.3. Additional Customization (optional)
There are several configurations that can optionally be applied to the cluster environment; some examples are:
- Increase memlock limits
- Restrict ssh access to compute resources
- Add Lustre client
- Add Nagios Core monitoring
- Add genders
- Add conman
- Add Ganglia monitoring
In this case, the cluster is for training or proof of concept, so the only feature that is going to be enabled is Ganglia monitoring.
3.3.1. Add Ganglia Monitoring
The following commands can be used to enable Ganglia on the master and compute nodes.
[root@HPCHN ~]# yum -y groupinstall ohpc-ganglia
Packages Installed:
ganglia-gmetad-ohpc.x86_64 0:3.7.2-4.1 ganglia-gmond-ohpc.x86_64 0:3.7.2-4.1
ganglia-gmond-python-ohpc.x86_64 0:3.7.2-4.1 ganglia-ohpc.x86_64 0:3.7.2-4.1
ganglia-web-ohpc.x86_64 0:3.7.1-4.1
Dependency Installed:
libXpm.x86_64 0:3.5.11-3.el7 libconfuse.x86_64 0:2.7-7.el7
libmemcached.x86_64 0:1.0.16-5.el7 libxslt.x86_64 0:1.1.28-5.el7
libzip.x86_64 0:0.10.1-8.el7 php.x86_64 0:5.4.16-42.el7
php-ZendFramework.noarch 0:1.12.20-1.el7 php-bcmath.x86_64 0:5.4.16-42.el7
php-cli.x86_64 0:5.4.16-42.el7 php-common.x86_64 0:5.4.16-42.el7
php-gd.x86_64 0:5.4.16-42.el7 php-process.x86_64 0:5.4.16-42.el7
php-xml.x86_64 0:5.4.16-42.el7 t1lib.x86_64 0:5.1.2-14.el7
On the cluster nodes (base image), you only need to install the gmond daemon:
[root@HPCHN ~]# yum -y --installroot=$CHROOT install ganglia-gmond-ohpc
Packages installed:
ganglia-gmond-ohpc.x86_64 0:3.7.2-4.1
Dependency Installed:
apr.x86_64 0:1.4.8-3.el7 ganglia-ohpc.x86_64 0:3.7.2-4.1 libconfuse.x86_64 0:2.7-7.el7
Enable unicast receiver on master host:
[root@HPCHN ~]# cp /opt/ohpc/pub/examples/ganglia/gmond.conf /etc/ganglia/gmond.conf
cp: overwrite ‘/etc/ganglia/gmond.conf’? yes
[root@HPCHN ~]#
[root@HPCHN ~]# perl -pi -e "s/<sms>/HPCHN/" /etc/ganglia/gmond.conf
[root@HPCHN ~]#
Add the configuration to the compute image and provide a grid name for the server daemon:
[root@HPCHN ~]# cp /etc/ganglia/gmond.conf $CHROOT/etc/ganglia/gmond.conf
cp: overwrite ‘/opt/ohpc/admin/images/centos7.3/etc/ganglia/gmond.conf’? yes
[root@HPCHN ~]# echo "gridname pymelab" >> /etc/ganglia/gmetad.conf
[root@HPCHN ~]#
Then enable and start the Ganglia services:
[root@HPCHN ~]# systemctl enable gmond
Created symlink from /etc/systemd/system/multi-user.target.wants/gmond.service to /usr/lib/systemd/system/gmond.service.
[root@HPCHN ~]# systemctl enable gmetad
Created symlink from /etc/systemd/system/multi-user.target.wants/gmetad.service to /usr/lib/systemd/system/gmetad.service.
[root@HPCHN ~]# systemctl start gmond
[root@HPCHN ~]# systemctl start gmetad
[root@HPCHN ~]# chroot $CHROOT systemctl enable gmond
Created symlink /etc/systemd/system/multi-user.target.wants/gmond.service, pointing to /usr/lib/systemd/system/gmond.service.
Note: if there is a problem enabling the services, check the gmond configuration files.
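A quick way to check that the daemons are up and listening is sketched below; 8649 and 8651 are the Ganglia default ports, so adjust the check if you changed them in gmond.conf or gmetad.conf:
[root@HPCHN ~]# systemctl status gmond gmetad      # both should be active (running)
[root@HPCHN ~]# ss -ltn | grep -E ':(8649|8651)'   # gmond (8649) and gmetad (8651) TCP listeners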
Ganglia serves its web interface through the HTTP server, so any change to the configuration files requires a web server restart:
[root@HPCHN ~]# systemctl try-restart httpd
[root@HPCHN ~]#
Note: The Ganglia top-level overview is available at http://<sms-public-ip>/ganglia
3.4. Finalizing provisioning configuration
3.4.1. Create new users
You will need to create some users for training purposes so that you don't need to give the root account to anyone. These users should also be able to use sudo.
[root@HPCHN ~]# useradd hpcuser01 -p HPCTest00
[root@HPCHN ~]# useradd hpcuser02 -p HPCTest00
[root@HPCHN ~]# useradd hpcuser04 -p HPCTest00
[root@HPCHN ~]# useradd hpcuser03 -p HPCTest00
[root@HPCHN ~]# useradd hpcuser05 -p HPCTest00
[root@HPCHN ~]#
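Note that useradd -p expects an already-encrypted password string, so the plain-text values shown above will not produce usable login passwords. A safer sketch, using hpcuser01 as an example and the wheel group to grant sudo on CentOS 7:
[root@HPCHN ~]# useradd hpcuser01
[root@HPCHN ~]# passwd hpcuser01              # set the password interactively
[root@HPCHN ~]# usermod -aG wheel hpcuser01   # members of wheel may use sudo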
3.4.2. Import Files
Warewulf can import arbitrary files from the provisioning server for distribution to managed hosts. This is especially helpful for distributing user credentials or updated files to the compute nodes.
[root@HPCHN ~]# wwsh file import /etc/group
[root@HPCHN ~]# wwsh file import /etc/shadow
[root@HPCHN ~]#
Import the global Slurm configuration file and the cryptographic key required by the munge authentication library, both of which must be available on every host:
[root@HPCHN ~]# wwsh file import /etc/slurm/slurm.conf
[root@HPCHN ~]# wwsh file import /etc/munge/munge.key
[root@HPCHN ~]#
3.4.3. Image preparation
Build the bootstrap image, which includes the kernel and its associated modules:
[root@HPCHN ~]# wwbootstrap `uname -r`
Number of drivers included in bootstrap: 489
Number of firmware images included in bootstrap: 96
Building and compressing bootstrap
Integrating the Warewulf bootstrap: 3.10.0-514.21.1.el7.x86_64
Including capability: provision-adhoc
Including capability: provision-files
Including capability: provision-selinux
Including capability: provision-vnfs
Including capability: setup-filesystems
Including capability: transport-http
Compressing the initramfs
Locating the kernel object
Bootstrap image '3.10.0-514.21.1.el7.x86_64' is ready
Done.
[root@HPCHN ~]#
Note: for this command you can use the form shown above, or write the kernel version explicitly, as in:
# wwbootstrap 3.10.0-514.21.1.el7.x86_64
Now assemble the Virtual Node File System (VNFS) image:
[root@HPCHN ~]# wwvnfs -y --chroot=$CHROOT
Unknown option: y
Using 'centos7.3' as the VNFS name
Creating VNFS image from centos7.3
Compiling hybridization link tree : 0.08 s
Building file list : 0.22 s
Compiling and compressing VNFS : 5.78 s
Adding image to datastore : 49.13 s
Wrote a new configuration file at: /etc/warewulf/vnfs/centos7.3.conf
Total elapsed time : 55.21 s
[root@HPCHN ~]#
The key points about NFS in a cluster are the following:
- Network File System (NFS) is a distributed file system protocol.
- NFS is an open standard defined in Requests for Comments (RFCs), allowing anyone to implement the protocol.
- In clusters, it is a common way to provide shared storage without the need for a separate storage infrastructure.
- The administrator decides which directories to share.
- Both the NFS server and client are implemented in the kernel, and the distribution provides the tools for configuring and monitoring NFS.
3.4.4. Set provisioning interface as the default networking device for the cluster
[root@HPCHN ~]# echo "GATEWAYDEV=enp5s0f1" > /tmp/network.$$
[root@HPCHN ~]# wwsh -y file import /tmp/network.$$ --name network
[root@HPCHN ~]# wwsh -y file set network --path /etc/sysconfig/network --mode=0644 --uid=0
About to apply 3 action(s) to 1 file(s):
SET: PATH = /etc/sysconfig/network
SET: MODE = 0644
SET: UID = 0
Proceed?
[root@HPCHN ~]#
After finishing this step, make sure that warewulf-httpd.conf grants all the permissions needed; the file should look like the one in Appendix B.2.
3.4.5. Register nodes for provisioning
The naming convention is the one defined in the General Recommendations:
Node02
Node03
Node04
At this point it is important to know the MAC address of the eth0 interface of each node.
[root@HPCHN ~]# wwsh node new node02 --netdev=eth0 --hwaddr=xx:xx:xx:xx:xx:xx
Are you sure you want to make the following 2 change(s) to 1 node(s):
NEW: NODE = node02
SET: eth0.HWADDR = xx:xx:xx:xx:xx:xx
Yes/No [no]> yes
[root@HPCHN ~]# wwsh node new node03 --netdev=eth0 --hwaddr=xx:xx:xx:xx:xx:xx
Are you sure you want to make the following 2 change(s) to 1 node(s):
NEW: NODE = node03
SET: eth0.HWADDR = xx:xx:xx:xx:xx:xx
Yes/No [no]> yes
[root@HPCHN ~]# wwsh node new node04 --netdev=eth0 --hwaddr=xx:xx:xx:xx:xx:xx
Are you sure you want to make the following 2 change(s) to 1 node(s):
NEW: NODE = node04
SET: eth0.HWADDR = xx:xx:xx:xx:xx:xx
Yes/No [no]> yes
[root@HPCHN ~]#
Note: Warewulf uses network interface names of the eth# variety and adds kernel boot arguments to maintain this scheme on newer kernels.
The next step is to configure the IP Address for each node:
[root@HPCHN ~]# wwsh node set node02 --netdev=eth0 --ip=10.0.2.2 --netmask=255.255.255.0 --gateway=10.0.2.1
Are you sure you want to make the following 3 change(s) to 1 node(s):
SET: eth0.IPADDR = 10.0.2.2
SET: eth0.NETMASK = 255.255.255.0
SET: eth0.GATEWAY = 10.0.2.1
Yes/No [no]> yes
[root@HPCHN ~]# wwsh node set node03 --netdev=eth0 --ip=10.0.2.3 --netmask=255.255.255.0 --gateway=10.0.2.1
Are you sure you want to make the following 3 change(s) to 1 node(s):
SET: eth0.IPADDR = 10.0.2.3
SET: eth0.NETMASK = 255.255.255.0
SET: eth0.GATEWAY = 10.0.2.1
Yes/No [no]> yes
[root@HPCHN ~]# wwsh node set node04 --netdev=eth0 --ip=10.0.2.4 --netmask=255.255.255.0 --gateway=10.0.2.1
Are you sure you want to make the following 3 change(s) to 1 node(s):
SET: eth0.IPADDR = 10.0.2.4
SET: eth0.NETMASK = 255.255.255.0
SET: eth0.GATEWAY = 10.0.2.1
Yes/No [no]> yes
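As an alternative to registering every MAC address and IP by hand, Warewulf also ships the wwnodescan utility, which registers unknown nodes automatically as they PXE-boot on the provisioning network. The following is only a sketch; check wwnodescan --help for the exact option names in your Warewulf version:
[root@HPCHN ~]# wwnodescan --netdev=eth0 --ipaddr=10.0.2.2 --netmask=255.255.255.0 --vnfs=centos7.3 --bootstrap=`uname -r` node02
# wwnodescan waits for the next unknown node to PXE-boot and records its MAC address under the given name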
Define the provisioning image for the hosts:
[root@HPCHN ~]# wwsh provision set node[02-04] --bootstrap=`uname -r` --vnfs=centos7.3 --files=dynamic_hosts,passwd,group,shadow,slurm.conf,munge.key,network
[root@HPCHN ~]# wwvnfs --chroot $CHROOT
Using 'centos7.3' as the VNFS name
Creating VNFS image from centos7.3
Compiling hybridization link tree : 0.06 s
Building file list : 0.24 s
Compiling and compressing VNFS : 5.44 s
Adding image to datastore : 68.71 s
Total elapsed time : 74.46 s
[root@HPCHN ~]# wwsh provision set node[02-04] --bootstrap=`uname -r` --vnfs=centos7.3 --files=dynamic_hosts,passwd,group,shadow,slurm.conf,munge.key,network
Are you sure you want to make the following changes to 3 node(s):
SET: BOOTSTRAP = 3.10.0-514.21.1.el7.x86_64
SET: VNFS = centos7.3
SET: FILES = dynamic_hosts,passwd,group,shadow,slurm.conf,munge.key,network
Yes/No> yes
[root@HPCHN ~]#
You can add any additional imported files to the --files list of the provision command.
To verify that the provisioning configuration is correct:
[root@HPCHN ~]# wwsh provision print node02
#### node02 ###################################################################
node02: BOOTSTRAP = 3.10.0-514.21.1.el7.x86_64
node02: VNFS = centos7.3
node02: FILES = dynamic_hosts,group,munge.key,network,passwd,shadow,slurm.conf
node02: PRESHELL = FALSE
node02: POSTSHELL = FALSE
node02: CONSOLE = UNDEF
node02: PXELINUX = UNDEF
node02: SELINUX = DISABLED
node02: KARGS = "net.ifnames=0 biosdevname=0 quiet"
node02: BOOTLOCAL = FALSE
[root@HPCHN ~]#
[root@HPCHN ~]# wwsh file list
dynamic_hosts : rw-r--r-- 0 root root 668 /etc/hosts
group : rw-r--r-- 1 root root 920 /etc/group
munge.key : r-------- 1 munge munge 1024 /etc/munge/munge.key
network : rw-r--r-- 1 root root 20 /etc/sysconfig/network
passwd : rw-r--r-- 1 root root 2222 /etc/passwd
shadow : rw-r----- 1 root root 1317 /etc/shadow
slurm.conf : rw-r--r-- 1 root root 2169 /etc/slurm/slurm.conf
[root@HPCHN ~]#
3.4.6. Restart DHCP and other services
[root@HPCHN ~]# wwsh dhcp update
Rebuilding the DHCP configuration
Done.
[root@HPCHN ~]#
[root@HPCHN ~]# systemctl restart dhcpd
[root@HPCHN ~]# wwsh pxe update
[root@HPCHN ~]#
Let's restart all the services:
[root@HPCHN ~]# systemctl restart mariadb
[root@HPCHN ~]# systemctl restart xinetd
[root@HPCHN ~]# systemctl restart httpd
3.4.7. Test the configuration during boot
Before booting the servers from the image, make sure that the BIOS is configured for PXE boot, then power cycle each of the desired hosts. The boot process should not take long. You can test connectivity through ssh or by pinging the nodes.
Node02 Status:
[root@HPCHN ~]# ssh 10.0.2.2
[root@node02 ~]#
Node03 Status:
[root@HPCHN ~]# ssh 10.0.2.3
[root@node03 ~]#
Node04 Status:
[root@HPCHN ~]# ssh 10.0.2.4
[root@node04 ~]#
You can also test the solution using the pdsh command to see which nodes are up:
[root@HPCHN ~]# pdsh -w node[02-04] uptime
node03: Warning: Permanently added 'node03,10.0.2.3' (ECDSA) to the list of known hosts.
node04: Warning: Permanently added 'node04,10.0.2.4' (ECDSA) to the list of known hosts.
node02: Warning: Permanently added 'node02,10.0.2.2' (ECDSA) to the list of known hosts.
node03: 23:10:33 up 4 min, 0 users, load average: 0.00, 0.01, 0.01
node02: 23:15:58 up 5 min, 0 users, load average: 0.00, 0.01, 0.01
node04: 23:13:38 up 2 min, 0 users, load average: 0.01, 0.01, 0.01
Another option is to verify the configuration using the Ganglia dashboard, which shows which nodes are up and working.
Figure 8 Ganglia Dashboard for the current Cluster
As you can see, the default metrics are network usage, memory usage, and CPU usage. It is important to keep an eye on these graphs. If virtual memory usage starts to rise, you can lower the default thread stack size used by Ganglia.
3.4.8. Important notes when restarting nodes
After restarting the head node and the compute nodes, make sure that all the services are up and running with:
[root@HPCHN ~]# systemctl status <service name>
Start the Slurm controller daemon on the head node and the client daemon on the compute nodes. By default, all nodes are initialized as down, so it is necessary to manually change the state to idle. You can use the scontrol command to change node states:
[root@HPCHN ~]# scontrol update nodename="node[02-04]" state=idle
Verify the status using:
[root@HPCHN ~]# sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
normal* up 1-00:00:00 3 down* node[02-04]
normal* up 1-00:00:00 1 idle HPCHN
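If a node remains in the down state after the update, a short sketch for finding out why:
[root@HPCHN ~]# scontrol show node node02 | grep -i reason       # Slurm records why it marked the node down
[root@HPCHN ~]# pdsh -w node[02-04] systemctl is-active slurmd   # confirm slurmd is running on every node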
3.5. Installing OpenHPC Development components
After installing and configuring the base operating system image for the compute nodes, you need to install several components that provide a flexible development environment, ranging from development tools to third-party libraries.
3.5.1. Development tools
OpenHPC provides recent versions of the GNU autotools collection, the Valgrind memory debugger, EasyBuild, Spack, and R. These can be installed as follows:
[root@HPCHN ~]# yum -y groupinstall ohpc-autotools
Packages installed:
autoconf-ohpc.x86_64 0:2.69-11.1 automake-ohpc.x86_64 0:1.15-3.1
libtool-ohpc.x86_64 0:2.4.6-3.1
[root@HPCHN ~]# yum -y install valgrind-ohpc
Packages installed:
valgrind-ohpc.x86_64 0:3.11.0-9.1
[root@HPCHN ~]# yum -y install EasyBuild-ohpc
Packages installed:
EasyBuild-ohpc.x86_64 0:3.1.2-47.1
[root@HPCHN ~]# yum -y install spack-ohpc
Installed:
spack-ohpc.noarch 0:0.8.17-28.1
Dependency Installed:
mercurial.x86_64 0:2.6.2-6.el7_2
[root@HPCHN ~]# yum -y install R_base-ohpc
Installed:
R_base-ohpc.x86_64 0:3.3.2-17.1
Dependency Installed:
gnu-compilers-ohpc.x86_64 0:5.4.0-13.1 libicu.x86_64 0:50.1.2-15.el7
openblas-gnu-ohpc.x86_64 0:0.2.19-16.2 tk.x86_64 1:8.5.13-6.el7
3.5.2. Compilers
OpenHPC packages the GNU compilers integrated with the underlying modules environment (Lmod):
[root@HPCHN ~]# yum -y install gnu-compilers-ohpc
3.5.3. MPI Stacks
The Message Passing Interface (MPI) is used to write applications in which processes explicitly exchange messages, allowing one processor to coordinate with others before it can continue its work. OpenHPC offers pre-packaged builds for a variety of MPI families and transport layers.
[root@HPCHN ~]# yum -y install openmpi-gnu-ohpc mvapich2-gnu-ohpc mpich-gnu-ohpc
Installed:
mpich-gnu-ohpc.x86_64 0:3.2-5.3 mvapich2-gnu-ohpc.x86_64 0:2.2-19.1
openmpi-gnu-ohpc.x86_64 0:1.10.6-23.1
Dependency Installed:
prun-ohpc.noarch 0:1.1-21.1
3.5.4. Performance tools
OpenHPC provides a variety of open source tools for application performance analysis. Some examples are:
- Score-P: Scalable Performance Measurement Infrastructure for Parallel Codes
- Scalasca: toolset for performance analysis of large-scale parallel applications
- mpiP: a lightweight profiling library for MPI applications
- PAPI: Performance Application Programming Interface
In the current example, the following group is installed:
[root@HPCHN ~]# yum -y groupinstall ohpc-perf-tools-gnu
Packages installed:
imb-gnu-mpich-ohpc.x86_64 0:4.1-9.1
imb-gnu-mvapich2-ohpc.x86_64 0:4.1-6.5
imb-gnu-openmpi-ohpc.x86_64 0:4.1-4.2
mpiP-gnu-mpich-ohpc.x86_64 0:3.4.1-30.1
mpiP-gnu-mvapich2-ohpc.x86_64 0:3.4.1-30.1
mpiP-gnu-openmpi-ohpc.x86_64 0:3.4.1-16.1
papi-ohpc.x86_64 0:5.4.3-11.1
scalasca-gnu-mpich-ohpc.x86_64 0:2.3.1-22.1
scalasca-gnu-mvapich2-ohpc.x86_64 0:2.3.1-22.1
scalasca-gnu-openmpi-ohpc.x86_64 0:2.3.1-11.1
tau-gnu-mpich-ohpc.x86_64 0:2.26-147.1
tau-gnu-mvapich2-ohpc.x86_64 0:2.26-147.1
tau-gnu-openmpi-ohpc.x86_64 0:2.26-75.1
Dependency Installed:
binutils-devel.x86_64 0:2.25.1-22.base.el7
pdtoolkit-gnu-ohpc.x86_64 0:3.23-36.1
scorep-gnu-mpich-ohpc.x86_64 0:3.0-11.1
scorep-gnu-mvapich2-ohpc.x86_64 0:3.0-11.1
scorep-gnu-openmpi-ohpc.x86_64 0:3.0-10.1
sionlib-gnu-mpich-ohpc.x86_64 0:1.7.0-19.1
sionlib-gnu-mvapich2-ohpc.x86_64 0:1.7.0-19.1
sionlib-gnu-openmpi-ohpc.x86_64 0:1.7.0-18.1
3.5.5. Setup default development environment
Setting a default development environment allows compilation to be performed directly for parallel programs requiring MPI.
[root@HPCHN ~]# yum -y install lmod-defaults-gnu-mvapich2-ohpc
Installed:
lmod-defaults-gnu-mvapich2-ohpc.noarch 0:1.2-6.2
According to the official OpenHPC installation guide, if you want to change the default environment from the suggestion above, OpenHPC also provides the GNU compiler toolchain with the OpenMPI and MPICH stacks:
- lmod-defaults-gnu-openmpi-ohpc
- lmod-defaults-gnu-mpich-ohpc
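To confirm which defaults are active, open a new login shell and use the module command provided by the Lmod package installed earlier; a quick check:
[elopez@HPCHN ~]$ module list     # the gnu and mvapich2 modules should be loaded automatically
[elopez@HPCHN ~]$ module avail    # shows the other compiler and MPI toolchains that can be swapped in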
Install the third-party libraries built for the GNU compiler family and the parallel libraries:
[root@HPCHN ~]# yum -y groupinstall ohpc-serial-libs-gnu
[root@HPCHN ~]# yum -y groupinstall ohpc-io-libs-gnu
[root@HPCHN ~]# yum -y groupinstall ohpc-python-libs-gnu
[root@HPCHN ~]# yum -y groupinstall ohpc-runtimes-gnu
[root@HPCHN ~]# yum -y groupinstall ohpc-parallel-libs-gnu-mpich
[root@HPCHN ~]# yum -y groupinstall ohpc-parallel-libs-gnu-mvapich2
[root@HPCHN ~]# yum -y groupinstall ohpc-parallel-libs-gnu-openmpi
Note: you can also install development tools from Intel, but this requires a license and is outside the scope of this document.
3.6. Resource Manager StartUp
After the installation of the development tools and the corresponding libraries, it is time to start the resource manager on the head node and the compute nodes.
[root@HPCHN ~]# systemctl enable munge
[root@HPCHN ~]# systemctl enable slurmctld
Created symlink from /etc/systemd/system/multi-user.target.wants/slurmctld.service to /usr/lib/systemd/system/slurmctld.service.
[root@HPCHN ~]# systemctl start munge
[root@HPCHN ~]# systemctl start slurmctld
[root@HPCHN ~]#
[root@HPCHN ~]# pdsh -w node[02-04] systemctl start slurmd
[root@HPCHN ~]#
3.6.1. Post Deployment activities
Perform the following steps after successful deployment.
- Verify that all the compute nodes are up and running; you can use SSH, ping, or another tool.
- Make sure slurmd is running on every compute node:
[root@HPCHN ~]# pdsh -w node[02-04] systemctl start slurmd
- Correct any failed service. The following log entries show a failed slurmctld startup:
Jun 19 17:55:26 HPCHN slurmctld[27238]: fatal: slurm_init_msg_engine_addrname_port error ...use
Jun 19 17:55:26 HPCHN systemd[1]: slurmctld.service: main process exited, code=exited, s...LURE
Jun 19 17:55:26 HPCHN systemd[1]: Unit slurmctld.service entered failed state.
Jun 19 17:55:26 HPCHN systemd[1]: slurmctld.service failed.
- To recover, first correct the configuration in the slurm.conf file, then push the updated file to the nodes; after that, clear the Slurm logs, kill any leftover processes, stop the Slurm services, and restart them.
The Slurm controller log location is defined in slurm.conf:
SlurmctldLogFile=/var/log/slurmctld.log
[root@HPCHN ~]# wwsh file sync
[root@HPCHN ~]# ssh 10.0.2.2
[root@node02 ~]# /warewulf/bin/wwgetfiles
[root@node02 ~]# exit
logout
Connection to 10.0.2.2 closed.
[root@HPCHN ~]# ssh 10.0.2.3
[root@node03 ~]# /warewulf/bin/wwgetfiles
[root@node03 ~]# exit
logout
Connection to 10.0.2.3 closed.
[root@HPCHN ~]# ssh 10.0.2.4
[root@node04 ~]# /warewulf/bin/wwgetfiles
[root@HPCHN ~]# systemctl start slurmctld
[root@HPCHN ~]# systemctl start slurmd
[root@HPCHN ~]# pidof slurmctld
29082
[root@HPCHN ~]# pidof slurmd
29115
[root@HPCHN ~]# kill 29082 29115
[root@HPCHN ~]# systemctl stop slurmctld
[root@HPCHN ~]# systemctl stop slurmd
[root@HPCHN ~]#
- Validate the status of your services using the command systemctl status <service>.
Note: at this point the cluster is ready to be used by developers. Keep in mind that you can start an interactive session on the compute nodes using this command:
[elopez@HPCHN ~]$ srun -n 2 -N 2 --pty /bin/bash
- OpenHPC includes a simple "hello world" MPI application in the /opt/ohpc/pub/examples directory that can be used for a quick compilation and execution test. At present, OpenHPC is unable to include the PMI process management server normally included within Slurm, which implies that srun cannot be used for MPI job launch. You can test the configuration as follows:
[elopez@HPCHN ~]$ mpicc -O3 /opt/ohpc/pub/examples/mpi/hello.c
[elopez@HPCHN ~]$ srun -n 8 -N 2 --pty /bin/bash
[test@n1 ~]$ prun ./a.out
[prun] Master compute host = c1
[prun] Resource manager = slurm
[prun] Launch cmd = mpiexec.hydra -bootstrap slurm ./a.out
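For non-interactive work, the same binary can be submitted as a batch job; the following is a minimal sketch built on the a.out and prun wrapper shown above (the script name hello.sbatch is arbitrary):
[elopez@HPCHN ~]$ cat hello.sbatch
#!/bin/bash
#SBATCH -J hello        # job name
#SBATCH -N 2            # number of nodes
#SBATCH -n 8            # total number of MPI tasks
prun ./a.out
[elopez@HPCHN ~]$ sbatch hello.sbatch
[elopez@HPCHN ~]$ squeue          # watch the job in the queue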