Coordination Patterns in MPI: Basic message passing

The examples in this section illustrate the fundamental pattern in software for distributed systems, and for any system in which individual processes, rather than threads, coordinate by passing messages to each other. Now we get to see the message passing part of the name MPI (Message Passing Interface)! The portion of the patterns diagram pertaining to these examples is shown on the left.

04. Message passing deadlock, using Send-Receive of a single value

This example shows the pattern of sending and receiving messages between processes. The code performs two-way communication within pairs: each even-id process exchanges an integer with the odd-id process next to it.

Note

The id of a process is sometimes called a rank, so we use these terms interchangeably here.

An integer value will be sent and received by the following sender-receiver pairs:

(rank 0, rank 1), (rank 2, rank 3), (rank 4, rank 5), …

On lines 18-20 in the code below, each odd process (1, 3, 5, …) both receives a message from and sends a message to the process whose id is one less than its own (its neighbor to the left in the pairs shown above). The message being passed is the rank of the current process (set on line 16). On lines 23-25, each even process (0, 2, 4, …) receives from and sends to the process whose id is one greater than its own (its neighbor to the right).

To do before you go on:

Find documentation for the MPI functions MPI_Recv and MPI_Send. Make sure that you know what each parameter is for. You could also look up MPI_Datatype, the type of the argument that tells MPI what kind of data is being sent. MPI_INT is one of its predefined values.

Conceptually, the running code is executing like this for 2 processes, where time is moving from top to bottom:

Look at the code below for the send and receive calls executed by the odd and even process ids.

#include <stdio.h>
#include <mpi.h>

int odd(int number) { return number % 2; }

int main(int argc, char** argv) {
    int id = -1, numProcesses = -1;
    int sendValue = -1, receivedValue = -1;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcesses);

    if (numProcesses > 1) {
        sendValue = id;
        if ( odd(id) ) {  // odd processors receive from their 'left neighbor', then send
            MPI_Recv(&receivedValue, 1, MPI_INT, id-1, 2,
                      MPI_COMM_WORLD, &status);
            MPI_Send(&sendValue, 1, MPI_INT, id-1, 1, MPI_COMM_WORLD);

        } else {          // even processors receive from their 'right neighbor', then send
            MPI_Recv(&receivedValue, 1, MPI_INT, id+1, 1,
                      MPI_COMM_WORLD, &status);
            MPI_Send(&sendValue, 1, MPI_INT, id+1, 2, MPI_COMM_WORLD);
        }

        printf("Process %d of %d computed %d and received %d\n",
                id, numProcesses, sendValue, receivedValue);
    } else if ( !id) {  // only process 0 does this part
        printf("\nPlease run this program using -np N where N is positive and even.\n\n");
    }

    MPI_Finalize();
    return 0;
}

To do:

What happens when you run this code with 2 or more processes?

Can you explain why this program deadlocks and how we might avoid this situation?

Note

We introduce this problematic example because it is a bug that we can accidentally introduce into our code. The next example provides the solution to this problem.

05. Message passing 1, using Send-Receive of a single value

The previous example highlights how a deadlock might occur from message passing. Next we will show one possible solution for fixing this problem. We can avoid a deadlock by reversing the order of the receive and send calls in one process of each pair. Now one process in each pair sends and then receives, while its partner receives and then sends. Note this in the code example further below. As shown in the following diagram, where time is moving from top to bottom, even processes receive and then send, and odd processes send and then receive.

#include <stdio.h>
#include <mpi.h>

int odd(int number) { return number % 2; }

int main(int argc, char** argv) {
    int id = -1, numProcesses = -1;
    int sendValue = -1, receivedValue = -1;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcesses);

    if (numProcesses > 1) {
        sendValue = id;
        if ( odd(id) ) {  // odd processors send, then receive
            MPI_Send(&sendValue, 1, MPI_INT, id-1, 1, MPI_COMM_WORLD);
            MPI_Recv(&receivedValue, 1, MPI_INT, id-1, 2,
                      MPI_COMM_WORLD, &status);
        } else {          // even processors receive, then send
            MPI_Recv(&receivedValue, 1, MPI_INT, id+1, 1,
                      MPI_COMM_WORLD, &status);
            MPI_Send(&sendValue, 1, MPI_INT, id+1, 2, MPI_COMM_WORLD);
        }

        printf("Process %d of %d computed %d and received %d\n",
                id, numProcesses, sendValue, receivedValue);
    } else if ( !id) {  // only process 0 does this part
        printf("\nPlease run this program using -np N where N is positive and even.\n\n");
    }

    MPI_Finalize();
    return 0;
}

To do:

Run using 4, 6, 8 and 10 processes by changing the -np flag. Note that the program now completes without deadlocking. Why does reversing one of the receive/send pairs allow us to avoid the deadlock situation altogether?

Run using 5 processes. Which process threw an error, and why was the error thrown? Hint: See diagram below.

Note

This solution avoids the deadlock by ensuring that, when an even number of processes is used, one process in each pair sends its value first while its neighbor is blocked waiting to receive that message. Once the neighbor has received the value, it sends its own message back to its partner. This is a common pattern that occurs in MPI programs when pairs of processes need to exchange information.

Yet we still can have errors when code designed for an even number of processes is run with an odd number. Can you determine how you might fix this in the code above?
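One possible direction for that question, offered as a sketch rather than the intended answer: compute the partner's rank in one place and detect when it does not exist. The names partner_of and NO_PARTNER are hypothetical and do not appear in the tutorial's code.

```c
#define NO_PARTNER (-1)   /* hypothetical sentinel, not an MPI constant */

/* Rank paired with `id` under the pairing (0,1), (2,3), (4,5), ...
 * Returns NO_PARTNER when that rank falls outside 0..numProcesses-1,
 * which happens for the last rank when numProcesses is odd. */
int partner_of(int id, int numProcesses) {
    int partner = (id % 2) ? id - 1 : id + 1;
    if (partner < 0 || partner >= numProcesses) {
        return NO_PARTNER;
    }
    return partner;
}
```

A process whose result is NO_PARTNER would skip its MPI_Send and MPI_Recv calls. MPI also defines MPI_PROC_NULL, a rank value for which sends and receives return immediately without transferring data, and it could play the role of the sentinel. Example 06 takes a different route and refuses to run at all when the process count is odd.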

06. Message passing 2, using Send-Receive of an array of values

The messages sent and received by processes can be of types other than integers. Here the message that is being passed is a string (array of chars). This example follows the previous message passing examples in that it passes strings between pairs of odd and even rank processes.

We use dynamic memory allocation for sendString and receivedString. Dynamic memory allocation lets a program request memory while it is running and release that memory once it is no longer needed, so the program manages the space itself. The calls to malloc on lines 28 and 29 of messagePassing2.c each allocate a block of SIZE bytes, one for sendString and one for receivedString.
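Separately from the full program that follows, here is a minimal sketch of the allocate, fill, and release cycle without any MPI calls. The helper names make_message and chars_to_send are hypothetical, and the NULL check is an addition that the program below does not perform.

```c
#include <stdio.h>   /* snprintf() */
#include <stdlib.h>  /* malloc(), free() */
#include <string.h>  /* strlen() */

/* Hypothetical helper: allocate `capacity` bytes on the heap and write a
 * message into them. Returns NULL if the allocation fails. The caller
 * owns the buffer and must release it with free(). */
char *make_message(size_t capacity, int id, const char *hostName) {
    char *buffer = malloc(capacity);
    if (buffer == NULL) {
        return NULL;
    }
    /* snprintf writes at most `capacity` bytes, terminator included */
    snprintf(buffer, capacity, "Process %d is on host \"%s\"", id, hostName);
    return buffer;
}

/* Number of chars to transmit: the visible characters plus the '\0'. */
size_t chars_to_send(const char *message) {
    return strlen(message) + 1;
}
```

The sender passes chars_to_send(message) as its count, while the receiver passes the full capacity of its buffer, because it cannot know in advance how long the incoming string will be.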

/* messagePassing2.c
 * ... illustrates using MPI_Send() and MPI_Recv() commands on arrays...
 * While this example sends and receives char arrays (strings),
 *  the same approach works on arrays of numbers or other types.
 */

#include <stdio.h>   // printf()
#include <mpi.h>     // MPI
#include <stdlib.h>  // malloc()
#include <string.h>  // strlen()

int odd(int number) { return number % 2; }

int main(int argc, char** argv) {
    int id = -1, numProcesses = -1, length = -1;
    char * sendString = NULL;
    char * receivedString = NULL;
    char hostName[MPI_MAX_PROCESSOR_NAME];
    MPI_Status status;
    size_t SIZE = (32+MPI_MAX_PROCESSOR_NAME) * sizeof(char);

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcesses);
    MPI_Get_processor_name (hostName, &length);

    if (numProcesses > 1 && !odd(numProcesses) ) {
        sendString = (char*) malloc( SIZE );
        receivedString = (char*) malloc( SIZE );
        // write to string
        snprintf(sendString, SIZE, "Process %d is on host \"%s\"", id, hostName);

        if ( odd(id) ) {  // odd processes send, then receive
            MPI_Send(sendString, strlen(sendString)+1,
                      MPI_CHAR, id-1, 1, MPI_COMM_WORLD);
            MPI_Recv(receivedString, SIZE, MPI_CHAR, id-1, 2,
                      MPI_COMM_WORLD, &status);
        } else {          // even processes receive, then send
            MPI_Recv(receivedString, SIZE, MPI_CHAR, id+1, 1,
                      MPI_COMM_WORLD, &status);
            MPI_Send(sendString, strlen(sendString)+1,
                      MPI_CHAR, id+1, 2, MPI_COMM_WORLD);
        }

        printf("\nProcess %d of %d received the message:\n\t'%s'\n",
                id, numProcesses, receivedString);

        free(sendString);
        free(receivedString);
    } else if ( !id) {  // only process 0 does this part
        printf("\nPlease run this program using -np N where N is positive and even.\n\n");
    }

    MPI_Finalize();
    return 0;
}

To do:

Review documentation for the MPI functions MPI_Recv and MPI_Send; make sure you understand what the second parameter is for in each case and why the code above is written as it is.

What is the free function doing in this code? Why must we apply the free function to both sendString and receivedString?

Run with 2, 4, and 8 processes. Trace the code and match with the results returned.

07. Message passing 3, using Send-Receive with conductor-worker pattern

Sending and receiving often occur in pairs. We will now investigate a scenario in which this is not the case. Suppose we have four processes, 0 through 3, arranged in a “ring”. We want each process to pass a string of sequential ranks to the next process, adding its own rank along the way. Process 0 begins by sending its rank to process 1. Process 1 receives a string containing a 0. Next, process 1 adds its rank to the string and sends the string to process 2. Then, process 2 receives the string containing 0 and 1, and so on. This continues until process 0 receives the final string from the last process (the process with the largest rank). Thus, process 0 is both the beginning and the end of the “ring”. This type of circular dependency can be thought of like this:
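The hand-off described above can also be traced without MPI. The following sketch is a sequential simulation only, with a loop standing in for the processes; the function names simulate_ring and next_in_ring are hypothetical and do not appear in the program that follows.

```c
#include <stdio.h>   /* snprintf() */

#define MAX 256

/* Sequential stand-in for the ring (no MPI involved): `out` ends up
 * holding the string that process 0 would receive back after every rank
 * from 0 to numProcesses-1 has appended its own rank. */
void simulate_ring(int numProcesses, char out[MAX]) {
    char received[MAX] = {'\0'};
    snprintf(out, MAX, "%d", 0);                   /* process 0 starts with "0" */
    for (int id = 1; id < numProcesses; id++) {
        snprintf(received, MAX, "%s", out);        /* rank id "receives" the string */
        snprintf(out, MAX, "%s %d", received, id); /* appends its rank, passes it on */
    }
}

/* Rank that `id` forwards to; the modulo wraps the last rank back to 0. */
int next_in_ring(int id, int numProcesses) {
    return (id + 1) % numProcesses;
}
```

For four processes, simulate_ring leaves "0 1 2 3" in out, and next_in_ring(3, 4) is 0, which is what closes the ring.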

/* messagePassing3.c
 * ... illustrates the use of MPI_Send() and MPI_Recv(),
 *      in combination with the conductor-worker pattern.
 */

#include <stdio.h>    // printf()
#include <string.h>   // strlen()
#include <mpi.h>      // MPI

#define MAX 256

int main(int argc, char** argv) {
    int id = -1, numProcesses = -1;
    char sendBuffer[MAX] = {'\0'};
    char recvBuffer[MAX] = {'\0'};
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcesses);

    size_t buf_size = MAX;

    if (numProcesses > 1) {
        if ( id == 0 ) {                              // conductor:
            snprintf(sendBuffer, buf_size, "%d", id); //  create msg

            MPI_Send(sendBuffer,                      //  msg sent
                      strlen(sendBuffer) + 1,         //  num chars + NULL
                      MPI_CHAR,                       //  type
                      id+1,                           //  destination
                      1,                              //  tag
                      MPI_COMM_WORLD);                //  communicator

            MPI_Recv(recvBuffer,                      //  msg received
                      MAX,                            //  buffer size
                      MPI_CHAR,                       //  type
                      numProcesses-1,                 //  sender
                      1,                              //  tag
                      MPI_COMM_WORLD,                 //  communicator
                      &status);                       //  recv status

            printf("Process #%d of %d received %s\n", // show msg
                    id, numProcesses, recvBuffer);
        } else {                                      // workers:
            MPI_Recv(recvBuffer,                      //  msg received
                      MAX,                            //  buffer size
                      MPI_CHAR,                       //  type
                      MPI_ANY_SOURCE,                 //  sender (anyone)
                      1,                              //  tag
                      MPI_COMM_WORLD,                 //  communicator
                      &status);                       //  recv status

            printf("Process #%d of %d received %s\n", // show msg
                    id, numProcesses, recvBuffer);

            // build msg to send by appending id to msg received
            snprintf(sendBuffer, buf_size, "%s %d", recvBuffer, id);

            MPI_Send(sendBuffer,                      //  msg to send
                      strlen(sendBuffer) + 1,         //  num chars + NULL
                      MPI_CHAR,                       //  type
                      (id+1) % numProcesses,          //  destination
                      1,                              //  tag
                      MPI_COMM_WORLD);                //  communicator
        }
    } else {
        printf("\nPlease run this program with at least 2 processes\n\n");
    }

    MPI_Finalize();
    return 0;
}

To do:

Run the program, varying the value of the -np argument from 1 to 8.

Explain the behavior you observe. Is there an ordering preserved?

Note which if statement indicates the conductor-worker pattern.

In our example, when will this communication pattern fail to execute properly and finish? What was added to check whether we can guarantee completion?

Note

Examine the output of this code, noticing how each process sends the results of its ‘work’ to the next process. This code forms the basis of what is called the pipeline pattern. You might want to do some research about when this can be useful. Though they don’t in this example, imagine that the processes could overlap by having each process ‘stream’ partially completed parts of its work to the next process in line for further processing.

Continue on to see more interesting patterns of collective communication using message passing.
