9.1. Past Experience using clusters in a workshop

This chapter describes how we used a previous Raspberry Pi image for a remote workshop.

CSinParallel distributes a Raspberry Pi image that enables a group of Raspberry Pis to self-organize into a compute cluster. Participants of the 2021 CSinParallel Virtual Summer Workshop received Raspberry Pi Cluster Kits in the mail. The following videos illustrate how to assemble the cluster kit:

The video below gives a quick summary of what is mailed to participants:

These next two videos illustrate how to assemble the Cluster Kit and verify that it is operational. Please note that we assume that viewers have already set up the Raspberry Pi Kit “head node” that is used on day 1 of the workshop, and verified that they can connect to the head node.

.. youtube:: EanA4Ash28Q
    :height: 315
    :width: 560
    :align: left


.. youtube:: ACTw9QHsDic
    :height: 315
    :width: 560
    :align: left

9.1.1. A summary of important commands

After initial setup, here is a summary of the important commands you need to run to get your cluster up and running each day:

Starting up the cluster:

  1. First, log in to the head node using VNC Viewer or equivalent, and open a terminal window.

  2. Run the following command to configure this Raspberry Pi as the head node of the cluster:

    sudo head-node
    
  3. Log into the hd-cluster account (notice the use of sudo before the su command):

    sudo su - hd-cluster
    
  4. Type pwd to ensure that you are the hd-cluster user (you should see output like /home/hd-cluster):

    pwd
    
  5. Direct the head node to auto-discover all the worker nodes:

    soc-mpisetup
    

The worker nodes that are discovered will be placed in a hostfile named hostfile, located in the home directory of the hd-cluster account (~/hostfile).
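
A hostfile is just a list of node hostnames, one per line; mpirun starts processes on each listed node. For example, a hostfile's contents might look like this (the node names below are hypothetical):

    worker001
    worker002
    worker003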

  6. Lastly, to test out your setup, run the following series of commands (a sketch of what this SPMD program does follows this step):

    cd CSinParallel/Patternlets/MPI/00.spmd/
    make
    mpirun -hostfile ~/hostfile -np 4 ./spmd
    
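If everything is working, each of the 4 processes prints a greeting. For reference, here is a minimal sketch in C of what an SPMD (single program, multiple data) program like this looks like; the actual 00.spmd source in the patternlets repository may differ:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char** argv) {
        int rank, size, len;
        char host[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);                 /* start MPI */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
        MPI_Get_processor_name(host, &len);     /* node this process runs on */

        /* every process runs this same line: single program, multiple data */
        printf("Greetings from process %d of %d on %s\n", rank, size, host);

        MPI_Finalize();
        return 0;
    }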

Note

PLEASE BE SURE TO SHUT DOWN YOUR CLUSTER USING THE COMMANDS BELOW. DO NOT JUST POWER OFF THE CLUSTER! If you power off the cluster without following the shutdown procedure outlined below, your cluster will likely enter an inconsistent state. The procedure below resets the cluster so that you can easily start it up again whenever you want.

Shutting down the cluster:

In the terminal window that you have open:

  1. Type exit to log out of the hd-cluster account:

    exit
    
  2. Type pwd to confirm that you are now the pi user (you should see output like /home/pi):

    pwd
    
  3. Shut down the worker nodes using the following command. Enter the default password if prompted (you should only need to do this the first time):

    sudo shutdown-workers
    
  4. Reset the head node to be a regular worker node:

    sudo worker-node
    
  5. Shut down the head node by using the following command:

    sudo shutdown -h now
    

Following these startup and shutdown steps in full each time you use the self-organizing Raspberry Pi cluster will ensure that subsequent uses are error-free!

9.1.2. Using this guide with your own cluster image

You can use this guide if you have your own Raspberry Pi cluster, or any system that has multiple cores and/or nodes. If you are planning on using your own cluster, make sure you have the following software installed (a quick way to check follows this list):

  • the MPICH or OpenMPI library

  • (optional, for more advanced examples) Python 3.6 or higher and the NumPy package
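
To verify that an MPI implementation is installed and on your path, you can try the following commands; both MPICH and OpenMPI typically accept these flags:

    mpicc --version
    mpirun --version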

The Raspberry Pi cluster image provided by the CSinParallel group already has everything you need installed, including the code examples that we describe in detail in the following chapters.

9.1.3. What are Patterns?

Patterns in software are common implementations that have been used over and over by practitioners to accomplish tasks. As practitioners use them repeatedly, the community begins to give them names and catalog them, often turning them into reusable library functions. The examples you will see in this book are based on documented patterns that have been used to solve different problems using message passing between processes. Message passing is one form of distributed computing using processes, which can be used on clusters of computers or multicore machines.
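
As a concrete illustration, here is a minimal sketch of message passing in C with MPI: process 0 sends an integer to process 1. The file name and values here are hypothetical; compile with mpicc and run with mpirun -np 2:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char** argv) {
        int rank, value;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            value = 42;                          /* the message payload */
            /* send one int to process 1, using message tag 0 */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* receive one int from process 0, matching tag 0 */
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("process 1 received %d from process 0\n", value);
        }

        MPI_Finalize();
        return 0;
    }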

In many of these examples, the pattern’s name is part of the Python code file’s name. You will also see that MPI library functions often take on the name of the pattern, and that the implementations of those functions themselves contain the patterns that practitioners found themselves using again and again. These pattern code examples we show you here, dubbed patternlets, are based on original work by Joel Adams (4).
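
For example, the broadcast pattern appears directly in the MPI library as MPI_Bcast. A minimal sketch of its use in C (the value 99 is a hypothetical example):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char** argv) {
        int rank, data = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) data = 99;   /* only the root holds the value at first */

        /* the broadcast pattern, named in the function itself:
           rank 0 (the root) sends data to every other process */
        MPI_Bcast(&data, 1, MPI_INT, 0, MPI_COMM_WORLD);

        printf("process %d now has data = %d\n", rank, data);

        MPI_Finalize();
        return 0;
    }

After the call, every process holds the broadcast value; the pattern lives inside the library function rather than in hand-written send/receive loops.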

9.1.4. References

1
  L. Dalcin, P. Kler, R. Paz, and A. Cosimo, “Parallel Distributed Computing using Python,” Advances in Water Resources, 34(9):1124-1139, 2011. http://dx.doi.org/10.1016/j.advwatres.2011.04.013

2
  L. Dalcin, R. Paz, M. Storti, and J. D’Elia, “MPI for Python: performance improvements and MPI-2 extensions,” Journal of Parallel and Distributed Computing, 68(5):655-662, 2008. http://dx.doi.org/10.1016/j.jpdc.2007.09.005

3
  L. Dalcin, R. Paz, and M. Storti, “MPI for Python,” Journal of Parallel and Distributed Computing, 65(9):1108-1115, 2005. http://dx.doi.org/10.1016/j.jpdc.2005.03.010

4
  J. C. Adams, “Patternlets: A Teaching Tool for Introducing Students to Parallel Design Patterns,” 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, IEEE, 2015.
