
Tips & Tricks: Task Parallel Computing with PPEVAL


Task parallelism (also called "coarse-grained parallelism") is a way to carry out many independent calculations at the same time, for example Monte Carlo simulations or the iterations of an "unrolled" serial FOR loop. Such calculations are sometimes called "embarrassingly parallel." Star-P's "ppeval" construct is a function call that executes your code in parallel on the server.
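As a minimal illustration of what "independent calculations" means, the plain MATLAB® sketch below runs several Monte Carlo trials in a serial FOR loop. It makes no Star-P calls, and the names ntrials, npoints and estimates are illustrative assumptions, not taken from any Star-P example. No iteration reads anything written by another iteration, which is the property that makes a loop like this a candidate for task parallelism.

```matlab
% Serial Monte Carlo sketch: every trial estimates pi on its own.
ntrials = 8;        % number of independent trials (illustrative)
npoints = 1e5;      % random points per trial (illustrative)

estimates = zeros(1, ntrials);
for k = 1:ntrials
    % Draw points in the unit square and count those inside the quarter circle.
    xy = rand(npoints, 2);
    inside = sum(xy(:,1).^2 + xy(:,2).^2 <= 1);
    estimates(k) = 4 * inside / npoints;   % depends only on this trial's points
end

% Combine the independent results once all trials are done.
pi_hat = mean(estimates);
```

Because each pass through the loop touches only its own random points and its own output slot, the trials could in principle run in any order or at the same time.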


[Figure: serial processing on one processor compared with parallel processing on four processors]

With Star-P, there is no need to worry about the details of parallelization: Star-P distributes the data across the processors, executes the computations, and gathers the results at the end. "ppeval" is reminiscent of MATLAB®'s "feval", except that the specified function and all required data are operated on by the parallel server.

Consider the example of processing a stack of MRI brain scans, and assume we need to run an SVD computation on each image. In serial fashion, we might write a FOR loop that processes one image at a time (the last three lines of the code below).

% Load 12 MRI images, each 256x256 pixels, from file.
% The resulting MATLAB® variable MRIdat will be 256-by-256-by-12 in size.

load MRIdata


% get size of the image cube
[xpixel,ypixel,nimage] = size(MRIdat);


% pre-allocate the output variables to improve MATLAB® performance
% (for an m-by-n image, svd returns U as m-by-m, S as m-by-n, and V as n-by-n)
MRI_U = zeros(xpixel,xpixel,nimage);
MRI_S = zeros(xpixel,ypixel,nimage);
MRI_V = zeros(ypixel,ypixel,nimage);


% Loop over the individual slices
for i = 1:nimage
    [MRI_U(:,:,i),MRI_S(:,:,i),MRI_V(:,:,i)] = svd(MRIdat(:,:,i));
end

Because the operations are independent, we can carry them out in parallel with the ppeval construct, shown in the last line of the code below:

% Load 12 MRI images, each 256x256 pixels, from file.
% The resulting variable MRIdat will be 256-by-256-by-12p in size;
% the trailing "p" marks the dimension that is distributed on the server.

ppload MRIdata


% With ppeval it is NOT necessary to
% 1) know the size of the image cube
% 2) pre-allocate the memory for the output
% for the algorithm to complete successfully.



% Process the individual slices in task-parallel fashion


[MRI_U,MRI_S,MRI_V] = ppeval('svd',MRIdat);
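To make the "feval" analogy concrete, here is a serial sketch in plain MATLAB® of the per-slice work that the single ppeval call stands in for. It is only an illustration of the calling pattern, not Star-P's implementation. The random array is a stand-in for the MRI file (an assumption), and the names data, fname and err are hypothetical.

```matlab
% Stand-in data: 4 small random "images" instead of the MRI file (assumption).
data = rand(16, 16, 4);
[m, n, nslices] = size(data);

fname = 'svd';                 % the function is named by a string, as with feval

% Output sizes follow svd: U is m-by-m, S is m-by-n, V is n-by-n.
U = zeros(m, m, nslices);
S = zeros(m, n, nslices);
V = zeros(n, n, nslices);

for k = 1:nslices
    % Apply the named function to one slice; slices never interact.
    [U(:,:,k), S(:,:,k), V(:,:,k)] = feval(fname, data(:,:,k));
end

% Sanity check: every slice is recovered from its own factors.
err = 0;
for k = 1:nslices
    err = max(err, norm(U(:,:,k) * S(:,:,k) * V(:,:,k)' - data(:,:,k)));
end
```

Each slice is handled by the same function with no shared state, so the per-slice calls are the natural unit of work to hand to separate processors.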

Now, let's consider a data set with 240 images, each 256x256 pixels. On a desktop PC with a 2.8 GHz AMD Opteron CPU and 3 GB of RAM, the serial computation takes 59 seconds. Carried out in parallel with Star-P on a server with eight 2.4 GHz Opteron processors and 32 GB of RAM, the same computation takes less than 7 seconds.
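The timings quoted above can be turned into a rough speedup figure. The short calculation below uses only those two numbers and the processor count; the variable names are illustrative. Because the parallel time is given as an upper bound, the result is a lower bound on the speedup.

```matlab
t_serial   = 59;   % seconds, serial run on the desktop PC
t_parallel = 7;    % seconds, upper bound for the parallel run ("less than 7")
nproc      = 8;    % processors on the server

speedup   = t_serial / t_parallel;   % lower bound on the speedup
per_proc  = speedup / nproc;         % speedup divided by processor count

fprintf('speedup > %.1fx, per-processor ratio > %.2f\n', speedup, per_proc);
% prints: speedup > 8.4x, per-processor ratio > 1.05
```

The two runs used different machines (different clock speeds and memory sizes), so a per-processor ratio slightly above 1 reflects the hardware difference as well as the parallelism and should not be read as a like-for-like scaling measurement.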

  

Finally, Star-P's abstraction does not require you to know how many processors are available. Star-P handles the data distribution, and the same MATLAB® code will run successfully whether 8, 32, or 512 processors are available.

 

   


©Copyright 2007 Interactive Supercomputing, Inc. and its licensors. All rights reserved.
Interactive Supercomputing, Inc. | 135 Beaver St. | Waltham, MA 02452
Phone: +1.781.419.5050 | Fax: +1.781.419.6050 www.interactivesupercomputing.com
STAR-P™ and the "star" logo are trademarks of Interactive Supercomputing. MATLAB® is a registered trademark of The MathWorks, Inc.