7.1. Collective Communication: scatter and gather arrays

When several processes need to work on portions of a data structure, such as a 1-D or 2-D array, at various points in a program, a common approach is to have one process, usually the conductor, divide the data structure and send a portion to each of the other processes, often keeping one portion for itself. Each process then works on its portion, and the conductor collects the completed portions back afterward. This type of coordination is so common that MPI provides special collective operations for it, called scatter and gather.
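Both patterns are single collective calls in MPI. For reference, here are the C prototypes of the two functions used in this section (as specified in MPI-3; earlier MPI versions omit the const qualifiers):

int MPI_Scatter(const void *sendbuf, int sendcount, MPI_Datatype sendtype,
                void *recvbuf, int recvcount, MPI_Datatype recvtype,
                int root, MPI_Comm comm);

int MPI_Gather(const void *sendbuf, int sendcount, MPI_Datatype sendtype,
               void *recvbuf, int recvcount, MPI_Datatype recvtype,
               int root, MPI_Comm comm);

For MPI_Scatter, sendcount is the number of elements delivered to each process, not the total; likewise, for MPI_Gather, recvcount is the number of elements received from each process. The root argument is the rank of the conductor process.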

7.1.1. Scatter Arrays

The following diagrams illustrate how a scatter works, shown here with a list of lists. The conductor holds the complete list of lists, and all processes participate in the scatter:

[Figure: scatter_list_of_lists.png]

After the scatter is completed, each process has one of the smaller lists to work on, like this:

[Figure: after_scatter_list_of_lists.png]

In the diagrams, the conductor's data is a list containing one small list per process. In the C code below, the equivalent is a single array that the scatter divides into equal-sized chunks, one per process.

Navigate to: ../14.scatter/

Make and run the code:

make

mpirun -hostfile ~/hostfile -np N ./scatter

Here N is the number of processes MPI will start up.

Exercises:

  • Run with N = 2 through 8 processes, tracing the execution through the source code.

  • Explain the behavior and effect of MPI_Scatter().

7.1.1.1. Explore the code

Note

In the code below, note how all processes must call the scatter function.

#include <mpi.h>      // MPI
#include <stdio.h>    // printf(), etc.
#include <stdlib.h>   // malloc()

void print(int id, char* arrName, int* arr, int arrSize);

int main(int argc, char** argv) {
    const int MAX = 8;
    int* arrSend = NULL;
    int* arrRcv = NULL;
    int numProcs = -1, myRank = -1, numSent = -1;

    MPI_Init(&argc, &argv);                            // initialize
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);

    if (myRank == 0) {                                 // conductor process:
        arrSend = (int*) malloc( MAX * sizeof(int) );  //  allocate array1
        for (int i = 0; i < MAX; i++) {                //  load with values
            arrSend[i] = (i+1) * 11;
        }
        print(myRank, "arrSend", arrSend, MAX);        //  display array1
    }
     
    numSent = MAX / numProcs;                          // all processes:
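                                                       //  note: assumes MAX is evenly divisible by numProcs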
    arrRcv = (int*) malloc( numSent * sizeof(int) );   //  allocate array2

    MPI_Scatter(arrSend, numSent, MPI_INT, arrRcv,     //  scatter array1 
                 numSent, MPI_INT, 0, MPI_COMM_WORLD); //   into array2

    print(myRank, "arrRcv", arrRcv, numSent);          // display array2

    free(arrSend);                                     // clean up
    free(arrRcv);
    MPI_Finalize();
    return 0;
}

void print(int id, char* arrName, int* arr, int arrSize) {
    printf("Process %d, %s: ", id, arrName);
    for (int i = 0; i < arrSize; i++) {
        printf(" %d", arr[i]);
    }
    printf("\n");
}
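As a concrete trace, running the program with four processes, e.g. mpirun -hostfile ~/hostfile -np 4 ./scatter, should print output like this (lines from different processes can appear in any order, since they print concurrently):

Process 0, arrSend:  11 22 33 44 55 66 77 88
Process 0, arrRcv:  11 22
Process 1, arrRcv:  33 44
Process 2, arrRcv:  55 66
Process 3, arrRcv:  77 88

Note that numSent = MAX / numProcs uses integer division, so this code assumes MAX is evenly divisible by the number of processes. With N = 3, for example, each process would receive only 2 values and the last two elements of arrSend would never be sent; MPI's MPI_Scatterv variant exists for distributing unevenly sized pieces.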

7.1.2. Gather Arrays

Once several processes each have their own array of data, those arrays can be gathered back together into one larger array, usually on the conductor process. All processes participate in the gather, like this:

[Figure: gather_lists.png]

In the diagram, the gather assembles a list of lists on the conductor, like this:

[Figure: after_gather_List_of_lists.png]

In the example code below, each process fills its own very small array. Then a gather is used to assemble one larger array on the conductor process.

Navigate to: ../15.gather/

Make and run the code:

make

mpirun -hostfile ~/hostfile -np N ./gather

Here N is the number of processes MPI will start up.

In this example, every process has a small array of values whose length is given by SIZE, and the conductor additionally allocates a larger array to hold all of the gathered values.

Exercises:

  • Run with N = 2 through 8 processes.

  • Trace the execution through the source code.

  • Explain the behavior of MPI_Gather().

  • Try different values of SIZE, perhaps changing how the result is printed for readability.

7.1.2.1. Explore the code

Note

In the code below, note how all processes must call the gather function.

#include <mpi.h>       // MPI
#include <stdio.h>     // printf()
#include <stdlib.h>    // malloc()

void print(int id, char* arrName, int* arr, int arrSize);

#define SIZE 3

int main(int argc, char** argv) {
   int  computeArray[SIZE];                          // array1
   int* gatherArray = NULL;                          // array2
   int  numProcs = -1, myRank = -1,
        totalGatheredVals = -1;

   MPI_Init(&argc, &argv);                           // initialize
   MPI_Comm_size(MPI_COMM_WORLD, &numProcs);
   MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
                                                     // all processes:
   for (int i = 0; i < SIZE; i++) {                  //  load array1 with
      computeArray[i] = myRank * 10 + i;             //   3 distinct values
   }

   print(myRank, "computeArray", computeArray,       //  show array1
           SIZE);

   if (myRank == 0) {                                // conductor:
      totalGatheredVals = SIZE * numProcs;           //  allocate array2
      gatherArray = (int*) malloc( totalGatheredVals * sizeof(int) );
   }

   MPI_Gather(computeArray, SIZE, MPI_INT,           //  gather array1 vals
               gatherArray, SIZE, MPI_INT,           //   into array2
               0, MPI_COMM_WORLD);                   //   at conductor process

   if (myRank == 0) {                                // conductor process:
      print(myRank, "gatherArray",                   //  show array2
             gatherArray, totalGatheredVals);
      free(gatherArray);                             // clean up
   }

   MPI_Finalize();
   return 0;
}

void print(int id, char* arrName, int* arr, int arrSize) {
    printf("Process %d, %s: ", id, arrName);
    for (int i = 0; i < arrSize; i++) {
        printf(" %d", arr[i]);
    }
    printf("\n");
}
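As a concrete trace, running with three processes, e.g. mpirun -hostfile ~/hostfile -np 3 ./gather, should print output like this (the computeArray lines can appear in any order):

Process 0, computeArray:  0 1 2
Process 1, computeArray:  10 11 12
Process 2, computeArray:  20 21 22
Process 0, gatherArray:  0 1 2 10 11 12 20 21 22

The two patterns are frequently used together: scatter the data, let every process compute on its own chunk, and gather the results back. The following is a minimal sketch of that round trip, not part of this chapter's example code; it assumes, as in the scatter example above, that the array length is evenly divisible by the number of processes, and its doubling loop merely stands in for real work:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX 8

int main(int argc, char** argv) {
    int numProcs = -1, myRank = -1;
    int* fullArray = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);

    int chunkSize = MAX / numProcs;                  // assumes MAX % numProcs == 0
    int* chunk = (int*) malloc( chunkSize * sizeof(int) );

    if (myRank == 0) {                               // conductor: load the data
        fullArray = (int*) malloc( MAX * sizeof(int) );
        for (int i = 0; i < MAX; i++) {
            fullArray[i] = i + 1;
        }
    }

    MPI_Scatter(fullArray, chunkSize, MPI_INT,       // distribute one chunk
                chunk, chunkSize, MPI_INT,           //  to each process
                0, MPI_COMM_WORLD);

    for (int i = 0; i < chunkSize; i++) {            // every process computes
        chunk[i] *= 2;                               //  on its own chunk
    }

    MPI_Gather(chunk, chunkSize, MPI_INT,            // collect the results
               fullArray, chunkSize, MPI_INT,        //  back at the conductor
               0, MPI_COMM_WORLD);

    if (myRank == 0) {                               // conductor: show result
        printf("Result:");
        for (int i = 0; i < MAX; i++) {
            printf(" %d", fullArray[i]);
        }
        printf("\n");
        free(fullArray);
    }
    free(chunk);
    MPI_Finalize();
    return 0;
}

Run with 4 processes, this prints Result: 2 4 6 8 10 12 14 16.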