Due 10 pm, Tuesday, February 12, 2013. Submit by email.

Problems taken from P. Pacheco, Parallel Programming with MPI, Chapter 6

1. We can use derived datatypes to write functions for (dense) matrix I/O when the matrix is stored by blocks.

a. Write a function that prints a square matrix distributed by block columns among the processes. Suppose that the order of the matrix is n and the number of processes is p, and assume that n is evenly divisible by p. The function should successively gather blocks of n/p rows onto process 0, and process 0 should print each block of n/p rows immediately after it has been received. For each gather of n/p rows, each process should send (using MPI_Send) a block of order n/p x n/p to process 0. Process 0 should carry out the gather with a sequence of calls to MPI_Recv. The datatype argument should be a derived datatype created with MPI_Type_vector.

b. Write a function that reads in a square matrix stored in row-major order in a single file. Process 0 should read in the number of rows and broadcast this information to the other processes. Assume that n, the number of rows, is evenly divisible by p, the number of processes. Process 0 should then read in a block of n/p rows and distribute blocks of n/p columns to each of the processes: the first n/p columns go to process 0, the next n/p columns to process 1, and so on. Process 0 should then repeat this for each remaining block of n/p rows. Use a derived datatype created with MPI_Type_vector so that the data sent to each process can be sent with a single call to MPI_Send.

2. Write a dense matrix transpose function. Suppose a dense n x n matrix A is stored on process 0. Create a derived datatype representing a single column of A. Send each column of A to process 1, but have process 1 receive each column into a row. When the function returns, A should still be stored on process 0 and the transpose of A on process 1.