CHAPTER 4: Parallel Sum

Very often, loops are used with an accumulator variable to compute a a single value from a set of values, such as the sum of integers in an array or list. Fortunately, OpenMP implements a special parallel pattern known as reduction, which will aid us in this process. The reduction pattern is one of a group of patterns called collective communication patterns because the threads must somehow work together to create the final desired single value.

4.1 An Initial Attempt

From the ParallelLoopChunksOf1 directory, let’s enter the reduction directory by typing in the following command:

cd ../reduction

Running ls in this new directory should reveal the following two files:

  • Makefile

  • reduction.c

The code in reduction.c is fairly long, so we will discuss it in parts:

17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
#include <stdio.h>   // printf()
#include <omp.h>     // OpenMP
#include <stdlib.h>  // rand()

void initialize(int* a, int n);
int sequentialSum(int* a, int n);
int parallelSum(int* a, int n);

#define SIZE 1000000

int main(int argc, char** argv) {
   int array[SIZE];

   if (argc > 1) {
        omp_set_num_threads( atoi(argv[1]) );
   }

   initialize(array, SIZE);
   printf("\nSeq. sum: \t%d\nPar. sum: \t%d\n\n",
            sequentialSum(array, SIZE),
            parallelSum(array, SIZE) );

   return 0;
} 

The main() function in reduction.c first initializes an array of 1 million random integers. The main() function then executes two separate functions: sequentialSum() (which sums up an array sequentially), and parallelSum() which attempts to sum up the array in parallel.

For simpliicty, we next show only the parallelSum() function:

60
61
62
63
64
65
66
67
68
69
/* sum the array using multiple threads */
int parallelSum(int* a, int n) {
   int sum = 0;
   int i;
//   #pragma omp parallel for // reduction(+:sum)
   for (i = 0; i < n; i++) {
      sum += a[i];
   }
   return sum;
}

In its current form, the function is identical to how the sequentialSum() function works. However, line 64 contains some commented out code that should parallelize this function. For now, let’s keep line 64 commented out.

Try It Out

First, let’s compile and run the program using the following command:

make
./reduction 4

4.2 Edit, Build and Run in Parallel

Let’s next edit the parallelSum() function so that it executes in parallel by removing // comment at the front of line 64. Be sure to ONLY remove the first // (keep the reduction clause commented out for now).

Recompile and rerun the program using the following command:

make
./reduction 4

Line 64 of the parallelSum function currently contains a (commented out) special clause that can be used with the omp parallel for pragma called reduction. The notion of a reduction comes from the mathematical operation reduce, in which a collection of values are combined into a single value via a common mathematical function. Summing up a collection of values is therefore a natural example of reduction.

In this program, the reduction(+:sum) clause indicates that a summation is being performed (the + identifier) on the the variable sum, which acts as the accumulator.

4.3 Something to think about

Do you have any ideas about why the omp parallel for pragma without the reduction clause did not produce the correct result?

The sum example that we presented in this chapter was a fairly small example. In the next two chapters, we will explore much larger examples.

You have attempted of activities on this page