# 6.1 A simple Monte Carlo simulation

This first MPI example is a classic: you will find many, many examples of it on the web, and it is used to describe Monte Carlo methods on Wikipedia, which includes an image illustrating the method.

The method is straightforward:

1. Consider one quarter of a unit circle (a quadrant) inscribed inside a square of width 1.0.

2. Randomly choose a point by picking x and y values between 0.0 and 1.0, then determine whether the point lies inside or outside the quadrant.

3. Keep choosing points.

4. When done, compute the ratio of the points inside the quadrant to the total number of points. This is an estimate of the ratio of the area of the circle quadrant to the area of the square, which we know to be $$\pi/4$$.

5. We can then estimate the value of $$\pi$$ by multiplying this ratio by 4.

From steps 2 and 3 above, you can infer that there is a loop where a given number of random points are generated and counted as in or out. This type of loop makes this algorithm a candidate for decomposition using a parallel for loop pattern. The following code demonstrates this. In addition, we show how we can use the parallel random number generator library introduced in Chapter 3 in an MPI program; it is very similar to how we used it for the basic OpenMP examples that we showed there.

## Primary patterns used

The important patterns to envision in this code are:

1. SPMD, since we have one program that multiple processes all run.

2. Data decomposition, since each process will compute an equal share of the total number of points requested.

3. Parallel for loop split with equal chunks computed by each process (inside the function called Toss()).

4. Reduction communication pattern to combine the results from each process.

5. There is also a broadcast from process 0. See if you can find it and determine what it was used for.

We start with this relatively straightforward example to illustrate that most MPI applications contain several patterns like this.

• The function called Toss is named that because choosing the numbers is like tossing a dart at the square, though our use of random numbers makes the tosses far more uniform than a human throwing at a real dartboard.

• The command line argument is the number of random ‘tosses’ to generate.

• We illustrate the block-splitting approach for doling out random numbers to the processes. The default value of 200000 tosses is divisible by 2, 4, and 8 processes, so this will work. We include a check to make sure that the number of tosses is divisible by the number of processes chosen.

• We use the -Ofast compiler flag because, in our experience, the trng random number generator functions perform better with it.

## Exercises

1. Try increasing the number of tosses from 200000 to 2000000, then 20000000, then 200000000, and finally 2000000000 by adding a 0 each time and re-running with 4 or 8 processes. Does the estimated value of pi get more accurate as you increase the number of tosses? NOTE: random number generators have a limit on how many numbers they can generate; if you try too many, the program will fail mysteriously.

2. For a challenge: Try changing to the leapfrog method for generating the random values, using the method called ‘split’. Revisit how this was done in Chapter 3.

3. For a challenge: Try using a random number generation class from the trng library other than yarn2.