MATLAB is a great environment for quickly prototyping scientific algorithms, thanks to interactive debugging, decent data plotting facilities, dynamic typing, and all the built-in toolboxes. However, it’s pretty slow when it comes to code-execution performance compared to most compiled languages. This is largely because MATLAB is an interpreted language: that handy ability to set a breakpoint and then run code and plot data comes at the cost of not being able to compile or optimize the code in advance, so everything has to be done on the fly. As a result, MATLAB code can run up to hundreds of times slower than equivalent compiled code!
Vectorize
In particular, loops are pretty bad because MATLAB has to interpret every line of code individually, and a loop means interpreting the same lines over and over. Basically, in MATLAB, the fewer interpreted statements you need to get something done, the better. Fortunately, because MATLAB was designed to work with matrices, most of the built-in functions can process entire matrices at a time, rather than having to process them one element at a time as you would in most languages. This is called vectorization. For instance, in C, to take the square root of every element of a 2-D array and multiply each result by 2, you’d need a nested set of FOR loops to iterate over every element. This would look something like the following (using MATLAB syntax):
for i = 1:size(matrix, 1)        % Loops BAD!
    for j = 1:size(matrix, 2)
        out(i,j) = sqrt(matrix(i,j)) * 2;
    end
end
In MATLAB, this takes forever. The proper, vectorized way is simply:
out = sqrt(matrix) * 2; % Vectorized!
Not only is the vectorized version much faster (0.022 vs 1.1882 seconds, roughly 54X by those timings, on a 1M element array in MATLAB 2009b on an Intel Core i5), it’s much shorter to type. Basically, whenever you want to write a FOR loop to do something on an array, you should instead be vectorizing it.
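If you want to check the difference on your own machine, here is a minimal timing sketch using tic and toc. The random test matrix and the variable names (outLoop, outVec, tLoop, tVec) are my own assumptions, and the timings you get will depend on your MATLAB version and hardware.

```matlab
% Timing sketch: the same computation as a nested loop and as one vectorized call.
% The test data and variable names are illustrative assumptions.
matrix = rand(1000, 1000);            % 1M non-negative elements

tic;
clear outLoop;                        % make sure the output really starts empty
for i = 1:size(matrix, 1)
    for j = 1:size(matrix, 2)
        outLoop(i,j) = sqrt(matrix(i,j)) * 2;   % element-by-element, array grows as it goes
    end
end
tLoop = toc;

tic;
outVec = sqrt(matrix) * 2;            % whole matrix in one call
tVec = toc;

fprintf('loop: %.4f s, vectorized: %.4f s\n', tLoop, tVec);
```

Both versions should produce the same result; only the time taken differs.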
The MathWorks guide to vectorization is here.
Pre-allocate Memory
This is important in most languages, but especially so in MATLAB. If MATLAB has to keep reallocating memory for the array as it grows, it slows down a lot. In the above FOR loop example, adding one line cuts the execution time in half:
out = zeros(size(matrix));       % Preallocate output matrix
for i = 1:size(matrix, 1)
    for j = 1:size(matrix, 2)
        out(i,j) = sqrt(matrix(i,j)) * 2;
    end
end
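The same idea applies to any array you fill inside a loop. As a rough sketch (the vector length and variable names here are assumptions for illustration), compare growing a vector one element at a time with filling one that was allocated up front:

```matlab
% Preallocation sketch: growing a vector vs. filling a preallocated one.
% n and the variable names are illustrative assumptions.
n = 1e5;

tic;
grown = [];
for k = 1:n
    grown(k) = k^2;        % MATLAB may have to reallocate as the array grows
end
tGrow = toc;

tic;
pre = zeros(1, n);         % allocate the full vector once
for k = 1:n
    pre(k) = k^2;          % writes into existing memory
end
tPre = toc;

fprintf('growing: %.4f s, preallocated: %.4f s\n', tGrow, tPre);
```

How big the gap is will vary with the MATLAB version, so treat the printed numbers as something to measure rather than assume.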
Suppress Output (aka Semicolons are your friend)
In MATLAB, if you don’t terminate a line with a semicolon, that’s OK! Your code will still run, but without the semicolon MATLAB will print the result of that line to the Command Window, just as if you’d thrown a cout/printf after that line of code. In the previous FOR loop example, leaving off the semicolon on the line:
out(i,j) = sqrt(matrix(i,j)) * 2
means that MATLAB will print the entire million element array for EACH of the million iterations! This takes a small eternity, so hit CTRL-C to kill it and put that semicolon in. In fact, it’s typically best to make a habit of terminating every line of code with a semicolon and using the ‘disp’ command (or ‘fprintf’ when you need formatting) for anything you actually want printed.
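A small sketch of what that habit looks like in practice. The variable names are my own, and the format string is just one reasonable choice:

```matlab
% Output sketch: suppress by default, print only what you ask for.
x = sqrt(2) * 2;                 % semicolon, so nothing is printed here

disp(x);                         % prints the value using the default display format
fprintf('x = %.4f\n', x);        % prints with an explicit format

msg = sprintf('x = %.4f', x);    % same formatting, but returned as a string instead of printed
```

Here disp is handy for a quick look at a value, while fprintf and sprintf are the ones to reach for when you care about the exact layout.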
Aside from displaying text in the Command Window, drawing figures is also very time-consuming. If you’re trying to update a plot frequently, consider deleting or updating only the specific elements that changed, rather than redrawing the entire plot. Also, if you’re generating a bunch of plots that you won’t be looking at, say because you’re writing them to disk, don’t keep spawning new figures. Just create one and keep erasing it. The more figures you have open, the slower MATLAB’s graphics engine becomes.
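One way to do this, sketched below, is to keep the handle that plot returns and change only the line’s data on each update. The sine-wave data, the invisible figure, and the variable names are assumptions for the sake of the example.

```matlab
% Plot-update sketch: one figure, one line object, only the data changes.
% The data and variable names are illustrative assumptions.
fig = figure('Visible', 'off');        % a single figure, reused throughout
x = linspace(0, 2*pi, 200);
h = plot(x, sin(x));                   % keep the handle to the line

for k = 1:50
    set(h, 'YData', sin(x + 0.1*k));   % update the existing line instead of replotting
    drawnow;                           % flush pending graphics updates
end

lastY = get(h, 'YData');               % data currently held by the line

% For batch output, the same figure can be erased and reused:
clf(fig);                              % clear the figure rather than opening a new one
plot(x, cos(x));
% (a print or saveas call would go here if each plot is written to disk)

close(fig);
```

Setting YData on an existing line avoids recreating the axes and line objects on every update, which is usually where the time goes.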
Use the Profiler!
MATLAB has a pretty nice profiler which will tell you how many times each line is executed and how much processing time each takes. That makes it pretty easy to figure out what’s really bogging down your program so you can speed it up, whether by refactoring, vectorizing, or simply converting it into a MEX function.
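A minimal sketch of a profiling session from the command line, using the loop from earlier as the code under test. The matrix size and variable names are assumptions, and the viewer line is commented out so the snippet also runs without a display.

```matlab
% Profiler sketch: wrap the code you want to measure between profile on/off.
profile on;

matrix = rand(500);                    % illustrative test data
out = zeros(size(matrix));
for i = 1:size(matrix, 1)
    for j = 1:size(matrix, 2)
        out(i,j) = sqrt(matrix(i,j)) * 2;
    end
end

profile off;

stats = profile('info');               % raw results as a struct
% profile viewer;                      % uncomment to open the interactive report
```

The interactive report is usually the easier way to read the results; the stats struct is there if you want to inspect them in code.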
Install Lightspeed
There are a good number of useful functions in MATLAB which are, sadly, unoptimized. One of the biggest is repmat, which is really handy when it comes to vectorizing code, as it creates matrices by tiling a smaller matrix. This function, along with a lot of other useful functions, has been rewritten as a blazingly fast MEX-function (see the next section for details) by Tom Minka over at Microsoft Research. I’ve seen many-fold to orders-of-magnitude speedups with this toolbox! Also, some vectorization guides will tell you not to use repmat because it’s slow, and to use the ones indexing trick to simulate what repmat does. Don’t bother. Just install Lightspeed and use repmat to your heart’s content.
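For reference, here is what the two approaches look like side by side. This sketch only uses built-in MATLAB; I’m assuming that with Lightspeed on the path its repmat is simply the one that gets called, with no change to your code.

```matlab
% Tiling sketch: repmat vs. the "ones indexing trick".
v = [1 2 3];

tiledRepmat = repmat(v, 4, 1);     % stack 4 copies of v as rows -> 4x3 matrix
tiledIndex  = v(ones(4, 1), :);    % index row 1 four times -> same 4x3 matrix
```

Both give the same matrix; repmat just says what it means.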
Download Lightspeed here.
Write MEX-functions
MEX-functions are C, C++, or FORTRAN programs that have been compiled with the MEX interface so that MATLAB can call the program. Because the MEX-functions are compiled, they typically run much faster than normal MATLAB code. Just absolutely have to do something in a FOR loop? Write a MEX function. This is also great for pulling in external functions from a library like OpenCV. Basically, all a MEX-function is is a program with an interface wrapper to define how data is passed into and out of the program from MATLAB.
MathWorks MEX Guide is here.
Install the latest version of MATLAB
Over the years, MathWorks has put some effort into improving the performance of MATLAB. repmat, for instance, has been sped up many-fold, though the Lightspeed version is still faster. MATLAB now also has a Just-In-Time accelerator which allows MATLAB to compile certain things on-the-fly, including simple FOR loops. So you need not fear loops as much! There are certain rules, though, for what can and cannot be accelerated. This document lists these rules. The profiler also helps to identify what is and is not being accelerated, as described here. Finally, if you’re willing to pony up for the Parallel Computing Toolbox, it is a pretty easy way to take advantage of multiple processors, cores, or machines.
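As a rough idea of what the Parallel Computing Toolbox buys you, here is a minimal parfor sketch. The workload and variable names are made up for illustration; as far as I know, without the toolbox installed the same loop simply runs serially.

```matlab
% parfor sketch: independent iterations spread across workers.
% The workload is an illustrative assumption.
n = 8;
results = zeros(1, n);             % preallocated, sliced output

parfor k = 1:n
    % each iteration depends only on k, so iterations can run in any order
    results(k) = sum(sqrt(1:k*1000));
end
```

The catch is that iterations must not depend on each other, which is also roughly the condition under which a loop can be vectorized in the first place.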
So go out there and write some code! And if you come up with any tips of your own, drop us a line!
Great post, thank you for all the good tips!