

Tips & Tricks: Parallel File I/O


Let’s say you have a large file in raw binary format that contains a large number of images, each of which needs to be processed individually using the two-dimensional FFT.

In serial, you would follow these steps:

  1. Open the file
  2. Loop over the number of images, reading each one and performing the FFT
  3. Close the file

Serial code:

filename = '/home/user/client/myfile.bin';
NumberOfImages = 10000;
fid = fopen(filename,'r');
for i = 1:NumberOfImages
    img = fread(fid,[256,256],'double');   % read the next 256x256 image
    img = fft2(img);                       % take its 2D FFT
end
status = fclose(fid);

This computation lends itself very well to parallelization, and Star-P offers several methods to get the data to the server. One approach is to use the ppeval command, following the same three steps as in the serial case above:

  1. Open the file ON EACH PROCESSOR
  2. Loop over the number of images, read each one and perform the FFT (EACH PROCESSOR PROCESSES A SUBSET OF THE TOTAL NUMBER OF IMAGES)
  3. Close the file ON EACH PROCESSOR
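
To make "a subset of the total number of images" concrete, here is a minimal sketch in plain MATLAB of one way the image numbers could be divided. It assumes a made-up processor count of 4 and a simple contiguous block partition; how Star-P actually assigns the slices of split(1:NumberOfImages) to processors is not specified here and may differ.

```matlab
% Illustration only: contiguous block partition of the image numbers.
% nproc is a hypothetical processor count, not a Star-P setting.
NumberOfImages = 10000;
nproc = 4;
edges  = round(linspace(0, NumberOfImages, nproc + 1));  % block boundaries
counts = diff(edges);                                     % images per processor
for p = 1:nproc
    fprintf('processor %d: images %d to %d (%d images)\n', ...
            p, edges(p) + 1, edges(p + 1), counts(p));
end
```

Whatever the assignment, every image number is handled exactly once, so the per-processor counts add up to NumberOfImages.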

This approach of letting each processor directly load the data into its memory is fast, memory-efficient, and particularly useful for very large data sets that may be difficult to load otherwise, even on SMP servers with large memory.
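
To put "very large" in numbers, the size of the example data set can be worked out directly. The sketch below is plain MATLAB arithmetic and assumes only the figures used in this article: 10000 images of 256x256 double-precision values.

```matlab
% Size of the example data set (figures taken from this article).
NumberOfImages = 10000;
bytesPerPixel  = 8;                          % one double-precision value
bytesPerImage  = 256 * 256 * bytesPerPixel;  % 524288 bytes (512 KiB)
totalBytes     = NumberOfImages * bytesPerImage;
fprintf('Per image:  %d bytes\n', bytesPerImage);
fprintf('Whole file: %.2f GiB\n', totalBytes / 2^30);
```

The whole file comes to roughly 4.88 GiB, while any single processor only ever holds one 512 KiB image at a time.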

The Star-P code to accomplish these steps:

filename = '/home/user/server/myfile.bin';
NumberOfImages = 10000;
fid = ppeval('open_file',split(1:np),filename);
img = ppeval('process_img',split(1:NumberOfImages));
status = ppeval('fclose',split(fid));

Now, let’s take a look at the functions used here:

Our open_file function opens the file named filename, stores the file descriptor fid in a global variable, and also returns it as an output value.

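function fid_ = open_file(process,filename)
% process receives one element of split(1:np); it is not used in the body.
global fid
fid_ = fopen(filename,'r');
fid = fid_;   % keep the descriptor for later calls to process_img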
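
Note that fopen does not raise an error when a file cannot be opened (for example, when the path is not visible from the server); it returns -1 instead. A minimal sketch of a check, run here as a plain script against a deliberately missing file. The file name is made up for the demonstration, and this check is not part of the open_file function above.

```matlab
% fopen returns -1 (plus a message) when a file cannot be opened.
% The file name below is hypothetical and is expected not to exist.
filename = fullfile(tempdir, 'starp_demo_missing_file.bin');
[fid_, msg] = fopen(filename, 'r');
if fid_ == -1
    fprintf('Could not open %s: %s\n', filename, msg);
else
    fclose(fid_);
end
```

Inside a function such as open_file, the same test could call error instead of fprintf, so that a bad path is reported before any image is read.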

The process_img function reads the relevant part of the data from the file opened by open_file and processes the image. In this simple example, we just take the 2D FFT. The processed image is then returned as the output value.

function processed_img = process_img(number)
global fid

% Seek to the correct location in the file
% (each image is 256x256 pixels of 8 bytes each).
offset = 256*256*8*(number-1);
fseek(fid,offset,-1);

% Read the current image.
img = fread(fid,[256,256],'double');

% Take the FFT2 of the image.
processed_img = fft2(img);
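
The offset arithmetic can be checked on a desktop without Star-P. The sketch below is a plain MATLAB script, not part of the original example: it writes a small test file (the file name and the four numbered images are made up for the test), then uses the same seek-and-read logic as process_img to fetch image number 3.

```matlab
% Write a small test file in which image k is filled with the value k.
rows = 256; cols = 256; nImages = 4;
filename = fullfile(tempdir, 'starp_demo_images.bin');   % hypothetical name
fid = fopen(filename, 'w');
for k = 1:nImages
    fwrite(fid, k * ones(rows, cols), 'double');
end
fclose(fid);

% Same seek-and-read logic as process_img, for image number 3.
number = 3;
fid = fopen(filename, 'r');
offset = rows * cols * 8 * (number - 1);
status = fseek(fid, offset, 'bof');    % 'bof' is equivalent to -1
img = fread(fid, [rows, cols], 'double');
fclose(fid);
delete(filename);

% The seek should succeed and every pixel of image 3 should equal 3.
ok = (status == 0) && all(img(:) == number);
```

If ok is true, the offset formula lands on the start of the requested image, which is what allows each processor to read its images independently of the others.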

 


©Copyright 2007 Interactive Supercomputing, Inc. and its licensors. All rights reserved.
Interactive Supercomputing, Inc. | 135 Beaver St. | Waltham, MA 02452
Phone: +1.781.419.5050 | Fax: +1.781.419.6050 www.interactivesupercomputing.com
STAR-P™ and the "star" logo are trademarks of Interactive Supercomputing. MATLAB® is a registered trademark of The MathWorks, Inc.