6.2 OpenMP code with gcc and pgcc
Now let’s parallelize this code for a shared-memory multicore CPU, compiling it with each compiler: gcc and pgcc. This illustrates that the newer pgcc compiler will also compile the same OpenMP code into a threaded version, provided we pass it the compiler arguments that request that kind of code.
Note that each of these examples includes the same command line argument functions and helper functions as the code in the previous section.
The gcc version
The new addition to this code is the OpenMP pragma added to the CPUAdd function. We can also set the number of threads to use on the command line. If you run it as is, you will see output that tells you which thread worked on which loop iteration. Then try removing the ‘-n’, ‘10’ elements from the command line arguments and note that it uses a much larger array size and reports a time.
Try varying the number of threads
Use [‘-t’, ‘4’] and [‘-t’, ‘8’] to see how you gain some improvement in the running time.
The pgcc version
Try varying the number of threads
Remove the ‘-n’, ‘10’ from the command line arguments and try each of these: [‘-t’, ‘2’] and [‘-t’, ‘4’] and [‘-t’, ‘8’] to see how you gain some improvement in the running time.
What do you notice about the running times of these two versions of code using different compilers?