Parallel Processing Issues

jstanleyx edited this page Feb 16, 2019 · 1 revision

Some users may have access to the MathWorks Parallel Computing Toolbox (formerly the Distributed Computing Toolbox), which is sometimes referred to as the Parallel Processing Toolbox. This toolbox provides multi-process MATLAB computing and can increase performance by a large factor, or destroy it. Here are a few hints about using it.

Creating a Parallel "Pool"

A "pool" is what MathWorks calls the group of MATLAB worker processes that handle the parallel processing. The default pool size for older MATLAB versions is 8; more recent versions default to 12. In those versions the default is also the maximum.

In older MATLAB versions, the command to create a pool was 'matlabpool' (it was removed in 2015a). This command had several options to manage pools and query their status, but the most common form was:

matlabpool open local 8

which would open a local pool (all on the same computer) of 8 "workers".

The replacement command is 'parpool'. It was introduced in 2013b, and from 2015a on it is the only choice. The equivalent pool creation is:

parpool(8)

In more recent MATLAB versions, you can also open a parallel pool simply by running a 'parfor' loop: MATLAB opens a default pool automatically if you have not already opened one. The option that controls this automatic pool creation is found under 'preferences'->'parallel'.
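
As a rough sketch of the pool lifecycle on a version that has 'parpool', you can check for an existing pool, open one only if needed, and close it when you are done. The worker count of 2 is an arbitrary example value and the variable names are only illustrative:

```matlab
% Check for an existing pool without triggering automatic creation.
pool = gcp('nocreate');          % returns [] when no pool is open

if isempty(pool)
    pool = parpool(2);           % 2 workers: arbitrary example value
end

nWorkers = pool.NumWorkers;      % how many workers the pool actually has
fprintf('Pool is running with %d workers\n', nWorkers);

% Shut the pool down; this ends the worker processes and frees their memory.
delete(pool);
```

Closing the pool explicitly matters on a shared or memory-constrained machine, because idle workers keep holding their memory until the pool times out.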

Why not use a parallel pool all the time?

If your computer had infinite physical memory, or if your data and processing never created large matrices, there would be no reason not to. Older versions handled interprocess communication through files written to and read from disk, which could be slow enough to make parallel processing slower than single-threaded processing. More recent versions communicate through pipes, which bypasses the disk.
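
One way to find out whether a pool actually helps is to time the same loop both ways. The following is a minimal sketch that assumes a toy workload (the loop body and the sizes are made up for illustration), so substitute your own processing before drawing conclusions:

```matlab
n = 200;                              % number of iterations (example value)
x = linspace(0, 1, 1e5);              % toy input data

% Single-threaded version.
serialOut = zeros(1, n);
tic;
for k = 1:n
    serialOut(k) = sum(sin(k * x));
end
serialTime = toc;

% Parallel version of the same loop. If automatic pool creation is enabled,
% the first parfor also pays the cost of starting the pool.
parallelOut = zeros(1, n);
tic;
parfor k = 1:n
    parallelOut(k) = sum(sin(k * x));
end
parallelTime = toc;

fprintf('for: %.3f s, parfor: %.3f s\n', serialTime, parallelTime);
```

To measure only the loop itself, open the pool before the second 'tic' so that startup time is not counted.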

Your computer probably doesn't have infinite memory and your data probably isn't small, so you may run into a serious problem when parallel processing. Once enough MATLAB processes are using enough memory, physical memory runs out. This triggers a condition called "paging" or "swapping": unused or least recently used memory is written to disk so that the executing processes can get more. This is AMAZINGLY slow, and your system may actually hang while it goes on. Running fewer workers (8 instead of 12, say) can result in much faster execution, because 8 processes use less memory and paging may not happen at all.
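
A back-of-the-envelope way to pick a pool size is to divide the memory you can spare by the peak memory you expect each worker to need. The sketch below assumes you supply both figures yourself; 'chooseWorkerCount' and its inputs are hypothetical names, not part of MATLAB:

```matlab
% Hypothetical helper: the largest worker count that fits in memory,
% clamped to the range 1..maxWorkers.
chooseWorkerCount = @(availableGB, perWorkerGB, maxWorkers) ...
    max(1, min(floor(availableGB / perWorkerGB), maxWorkers));

% Example: 32 GB to spare, about 6 GB per worker, never more than 12 workers.
nWorkers = chooseWorkerCount(32, 6, 12);   % floor(32/6) = 5 workers
```

You would then pass 'nWorkers' to the pool-creation command. The per-worker figure is a guess until you watch a real run, so treat the result as a starting point.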

Dealing with the different commands

If you run code on both older and newer versions of MATLAB, you will need to deal with the change in the command that creates a pool. You could put your own try/catch around every pool open, but here's a simple function based on an idea from MATLAB Central.

function makeParallelPool(number)
% makeParallelPool(number) -- create a parallel pool if none is open
%
%  A stupid function because Matlab RENAMED the function to create a parallel
%  computing pool. Instead of matlabpool (removed in 2015a), it's now parpool.
%
%  number -- number of workers to request (default 8)

if nargin == 0
    number = 8;
end

try
    % Older versions: open a local pool only if none is running.
    if matlabpool('size') == 0
        matlabpool('local', num2str(number));
    end
catch
    % Newer versions: matlabpool is gone, so fall back to parpool.
    if isempty(gcp('nocreate'))
        parpool(number);
    end
end