3.2.1 Equal chunks for loops with 2 methods of random number streams

Now let’s explore the following parallel version using OpenMP.

As a reference, here is the set of numbers when using one thread with a given constant seed:

t 0( 0):83
t 0( 1):63
t 0( 2):97
t 0( 3):21
t 0( 4):62
t 0( 5):54
t 0( 6):14
t 0( 7):46

This is an OpenMP version with the following command line arguments:

-t indicates number of threads to use.
-n indicates the number of repetitions of the loop (default is 8).
-c indicates that a fixed seed will be used, resulting in the same
   stream of numbers each time this is run.
-d indicates whether the trng generator will dole out numbers in
   block or leapfrog fashion (default is leapfrog).

Study the code by reading the comments. Note in particular:

Note

An important new aspect of this code is that some lines must run once on each thread, after the threads fork but before the loop is divided among them. This requires two pragmas: one that forks the threads and a second that decomposes the loop across them.

This syntax is tricky for newcomers to OpenMP. The inner pragma is simply ‘#pragma omp for’, without the keyword parallel, because parallel already appears in the preceding pragma, which marks where the threads fork.
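The two-pragma structure described in this note can be sketched as follows. This is a minimal illustration, not the code from this page: the function name owner_of_each_index is hypothetical, and it only records which thread handled each loop index. The comment inside the parallel region marks where per-thread work, such as creating and positioning a random number generator, would go.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: records which thread handled each loop index.
std::vector<int> owner_of_each_index(int n, int num_threads) {
    (void)num_threads;  // unused when compiled without OpenMP
    std::vector<int> owner(n, -1);

    // First pragma: fork the threads.
    #pragma omp parallel num_threads(num_threads)
    {
        // Everything here runs once on every thread, before the loop
        // is divided up. A per-thread generator would be set up here.
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif

        // Second pragma: no 'parallel' keyword, because the threads
        // already exist. It only splits the iterations among them.
        #pragma omp for
        for (int i = 0; i < n; ++i) {
            owner[i] = tid;  // each index is written by exactly one thread
        }
    }
    return owner;
}
```

When compiled with OpenMP enabled (for example with -fopenmp on GCC), calling owner_of_each_index(8, 2) shows how the eight iterations were shared between the two threads. Without OpenMP the pragmas are ignored and every index is owned by thread 0.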

Try running the code above with these arguments (which are the original ones on this page if you reload it):

['-c', '-n 8', '-d block', '-t 1']

You should see that we are able to replicate the same set of numbers as the sequential version using this OpenMP version with one thread.

Now try changing the command line arguments in the above code block to use two threads, like this:

['-c', '-n 8', '-d block', '-t 2']

You should still get the same set of numbers. Now try using 4 threads, like this:

['-c', '-n 8', '-d block', '-t 4']

These examples have been using block splitting as the method of doling out portions of the stream of numbers to the threads. One important rule for this particular method is that the count of numbers to generate must be evenly divisible by the number of threads that you use. To see how this method can fail, try this:

['-c', '-n 8', '-d block', '-t 3']

What do you observe?

Warning

Be very careful when using this version of block splitting: the code runs without reporting any error, but the results are incorrect whenever the number of random numbers in the stream is not evenly divisible by the number of threads. We will see how to fix this later.

Leapfrog to get random values per thread

Next try leapfrog by trying these command line arguments:

['-c', '-n 8', '-d leapfrog', '-t 1']
['-c', '-n 8', '-d leapfrog', '-t 2']

Notice a couple of things about using 2 threads with leapfrogging:

  • Thread 0 should get every other number starting from loop index 0: 83, 97, 62, 14

  • Thread 1 should get every other number starting from loop index 1: 63, 21, 54, 46

  • Since the OpenMP pragma and the for loop use data decomposition with equal chunks of the loop per thread, thread 0 gets loop indices 0, 1, 2, and 3, and thread 1 gets loop indices 4, 5, 6, and 7.

  • The order in which the threads complete is not guaranteed. Luckily, many applications do not rely on this (it’s random numbers, after all).
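The points above can be checked with a small simulation. This sketch does not use trng or OpenMP; it indexes directly into the reference stream from the single-thread run. The names kStream, chunk_size, and leapfrog_draws are hypothetical, and the chunk sizes are an assumption: each thread gets n/T loop iterations, with the first n%T threads taking one extra, which is what common OpenMP implementations do for a static schedule.

```cpp
#include <vector>

// Reference stream from the single-thread run with the constant seed.
const std::vector<int> kStream = {83, 63, 97, 21, 62, 54, 14, 46};

// Assumed loop decomposition: n/T iterations per thread, and the
// first n%T threads take one extra iteration.
int chunk_size(int n, int num_threads, int tid) {
    return n / num_threads + (tid < n % num_threads ? 1 : 0);
}

// Values thread `tid` draws when leapfrogging: it starts at stream
// position tid and then skips ahead by the number of threads.
std::vector<int> leapfrog_draws(int num_threads, int tid) {
    const int n = static_cast<int>(kStream.size());
    const int count = chunk_size(n, num_threads, tid);
    std::vector<int> out;
    for (int k = 0; k < count; ++k) {
        out.push_back(kStream[tid + k * num_threads]);
    }
    return out;
}
```

With two threads, leapfrog_draws(2, 0) yields 83, 97, 62, 14 and leapfrog_draws(2, 1) yields 63, 21, 54, 46, even though thread 0 is working on loop indices 0 through 3 and thread 1 on indices 4 through 7.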

Now try different numbers of threads, like this:

['-c', '-n 8', '-d leapfrog', '-t 4']
['-c', '-n 8', '-d leapfrog', '-t 3']

Study the results of each to see that the leapfrogging is working as you would expect.

Notice that although the count of numbers in the stream (8) is not evenly divisible by 3 threads, the set of numbers generated is the same.

Note

These last few examples show that leapfrogging can be a safer alternative when you may vary how many numbers you generate and how many threads you use. The issue with block splitting is due to the way it is implemented above on line 120: the number of repetitions must be evenly divisible by the number of threads, otherwise the value sent to the trng jump function is incorrect. In later examples we will see how to fix this.
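To make the failure concrete, here is a small simulation of block splitting over the reference stream from the single-thread run. It rests on two assumptions that are not confirmed by the code on this page: that each thread jumps ahead by tid * (n / T) using integer division, and that the loop is decomposed so each thread gets n/T iterations with the first n%T threads taking one extra. The names kStream, chunk_size, and block_draws are hypothetical.

```cpp
#include <vector>

// Reference stream from the single-thread run with the constant seed.
const std::vector<int> kStream = {83, 63, 97, 21, 62, 54, 14, 46};

// Assumed loop decomposition: n/T iterations per thread, and the
// first n%T threads take one extra iteration.
int chunk_size(int n, int num_threads, int tid) {
    return n / num_threads + (tid < n % num_threads ? 1 : 0);
}

// Values thread `tid` draws under the assumed block-splitting scheme.
std::vector<int> block_draws(int num_threads, int tid) {
    const int n = static_cast<int>(kStream.size());
    // Assumed jump distance: integer division drops the remainder,
    // so this is too small whenever n is not a multiple of T.
    const int start = tid * (n / num_threads);
    const int count = chunk_size(n, num_threads, tid);
    std::vector<int> out;
    for (int k = 0; k < count; ++k) {
        out.push_back(kStream[start + k]);
    }
    return out;
}
```

Under these assumptions, two threads split the stream cleanly: block_draws(2, 0) gives 83, 63, 97, 21 and block_draws(2, 1) gives 62, 54, 14, 46. With three threads the jump distance is 8 / 3 = 2, but threads 0 and 1 each draw three numbers, so the blocks overlap: 83, 63, 97 for thread 0, then 97, 21, 62 for thread 1, then 62, 54 for thread 2. The values 97 and 62 are used twice, and 14 and 46 are never used.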

Command line code for reference

This code block below has code for handling the command line arguments for the parallel versions in this subsection. You can study it if you want to see how it was done.
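For readers without access to that code block, the following is a rough sketch of how the four flags described earlier could be parsed with POSIX getopt. It is not the code from this page. The Options struct and parse_options function are hypothetical names, and the default of one thread is an assumption, since only the defaults for -n and -d are stated above.

```cpp
#include <cstdlib>
#include <string>
#include <unistd.h>  // POSIX getopt, optarg, optind

// Hypothetical container for the parsed settings.
struct Options {
    int threads = 1;                        // -t (default is an assumption)
    int reps = 8;                           // -n (default is 8)
    bool constant_seed = false;             // -c
    std::string distribution = "leapfrog";  // -d (default is leapfrog)
};

Options parse_options(int argc, char* argv[]) {
    Options opts;
    optind = 1;  // start scanning at the first argument
    int c;
    // A colon after a letter means that flag takes a value.
    while ((c = getopt(argc, argv, "t:n:cd:")) != -1) {
        switch (c) {
            case 't': opts.threads = std::atoi(optarg); break;
            case 'n': opts.reps = std::atoi(optarg); break;
            case 'c': opts.constant_seed = true; break;
            case 'd': opts.distribution = optarg; break;
            default: break;  // unrecognized flag: keep the defaults
        }
    }
    return opts;
}
```

A main function would call parse_options(argc, argv) once at startup and then pass the resulting values to the code that sets the thread count and seeds the generator.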
