Assignment #4: Cartesian Topology, Splitting Communicators, fully decentralized distributed computing example - Parallel Matrix Multiplication


Description:

I wanted to explore a problem where I could split the existing communicators & I could create my own. I also wanted to design a fully decentralized program where there would be no manager-worker kind of processor farm.  I wanted to explore the MPI_Cart_create() call & using row & column communicators. I took the idea of this program from Alan's lecture slides & then took ideas from Chapter 7 of the book Parallel Programming with MPI by Peter S Patcheco to implement this program. More specifically, two 9x9 dimensional matrix has been used, multiplied them together & a third 9x9 dimensional matrix generated. We split each of 9x9 matrices into nine 3x3 matrices and assign these sub matrices to nine parallel processors. Though in this program uses a 9x9 matrix, care has been taken so that this program can easily be generalized to a NxN matrix. Also, in this program, we initialize the input matrices deterministically so that all processes initialize them in the same manner & the program can be made fully decentralized. In a more practical problem, the matrices will be arbitrary & a central manager processor will distribute the workers their share of the matrices.

However, we must be careful so that the following conditions are met when we generalize the problem into a NxN matrix:-

  1. If p is the number of processors, sqrt(p) must evenly divide N and 1<= sqrt(p)<=N.
  2. p is a perfect square.
  3. We use square matrices only.

Subject to the above conditions, this program can be easily be generalized to any N & any p. The dimensions of the sub matrices will be N/sqrt(p) x N/sqrt(p). In the extreme case, each processor gets one element of the original array when sqrt(p)=N. Also when sqrt(p)=1, that is when p=1, we proceed just like a normal non-parallel matrix multiplication program.

Main areas explored in this assignment:

The main areas we explore & learn are:-

  1. Grid topology creation.
  2. Creating new row & column communicators & use of MPI_Broadcast among them.
  3. How to split existing MPI_COMM_WORLD
  4. A fully decentralized distributed computing example

Assignment Resources:

  1. The Parallel Source Code

  2. The sequential code to check the correctness of the solution

  3. Correct Output from Sequential Code

  4. Output from Parallel Code
    Description:
    The program proceeds in parallel with 9 processors & each processor outputs its share of the final output matrix (the matrix obtained by multiplying the input matrices). Each of the sub matrices has dimension 3 x3 & the processor (0,0) outputs elements c(0,0),c(0,1),c(0,2); c(1,0), c(1,1) & c(1,2); c(2,0),c(2,1) & c(2,2) of the final matrix. Similarly another processor, say (1,1) will output elements c(3,3),c(3,4),c(3,5); c(4,3), c(4,4) & c(4,5); c(5,3),c(5,4) & c(5,5) of the final matrix. Proceeding similarly, each of the other processors output their share of the final matrix. I deliberately did not combine their sub matrices into one because I wanted to keep the processes absolutely independent of each other with no communication going in between.


First Posted: Feb 18 2006. Last Updated: Feb 18 2006 Credits: 10

Page Last Revised:Monday March 31, 2008

Valid HTML 4.01 Transitional