Robert’s Rules of Matlab

Disclaimer:

My usual processing flow is to use C or Fortran codes linked together via tcsh scripts. I find Matlab to be really great for easy-to-code routines that require user interaction. Instances include SplitLab and FuncLab, as you will learn later this week, quality control of ambient noise correlations, and determination of discontinuities from 1D shear velocity and Receiver Functions.

Poker

Matlab is not only for data processing.

From the Matlab command window, type “poker” to launch a very simple draw poker game I wrote in 2004.

What is this code?

Input user parameters
Set the random number generator
While (player has chips and has not cashed out)
    Generate a hand
    Display hand
    Allow user to exchange cards
    Display new hand
    Analyze results and pay out
End while

Getting help

The help text is the first block of comments in the .m file. It is displayed when the user types: >> help command
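A minimal sketch of such a help block follows. The function name arc2km_sketch and the 6371 km Earth radius are assumptions chosen for illustration; they are not from the original slides. Saved as arc2km_sketch.m, the leading comment block is what help arc2km_sketch would print.

```matlab
function km = arc2km_sketch(deg)
% ARC2KM_SKETCH  Convert great-circle arc length in degrees to kilometers.
%   KM = ARC2KM_SKETCH(DEG) assumes a spherical Earth of radius 6371 km.
%   DEG may be a scalar, vector, or matrix.
%
%   Example:
%       km = arc2km_sketch(90);   % a quarter of the way around the Earth

% This comment follows a blank line, so it is not part of the help text.
earthRadiusKm = 6371;                       % assumed mean radius
km = deg .* (pi / 180) .* earthRadiusKm;    % arc length = angle * radius
end
```

The first comment line (the "H1 line") is also what lookfor searches, so it is worth making it a one-line summary.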

Pre-allocate your arrays

30x speedup!
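The comparison below is a rough sketch of the idea, not the example from the original slide. The array size n is an arbitrary choice, and the speedup you measure will depend on your Matlab version and machine, so it may differ from the 30x quoted above.

```matlab
n = 2e4;   % arbitrary size chosen for this sketch

% Slow: the array grows by one element per iteration, which forces
% repeated reallocation and copying.
tic;
grown = [];
for k = 1:n
    grown(k) = k^2;
end
tGrown = toc;

% Fast: reserve the memory once, then fill it in place.
tic;
prealloc = zeros(1, n);
for k = 1:n
    prealloc(k) = k^2;
end
tPre = toc;

fprintf('growing: %.4f s, pre-allocated: %.4f s\n', tGrown, tPre);
```

Both loops produce the same values; only the memory handling differs.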

Matrices and array notation

Matlab is built on the idea that everything is a matrix (or vector).

Array operations are much faster than nested for loops.
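As an illustration (the function being evaluated and the grid sizes are made up for this sketch), the same 2D array can be filled with nested loops or with a single outer product of a column vector and a row vector:

```matlab
x = linspace(0, 1, 200);      % row vector, 200 samples
y = linspace(0, 2, 300)';     % column vector, 300 samples

% Nested loops: fill z one element at a time.
zLoop = zeros(numel(y), numel(x));
for i = 1:numel(y)
    for j = 1:numel(x)
        zLoop(i, j) = exp(-y(i)) * sin(2*pi*x(j));
    end
end

% Array notation: (300x1 column) * (1x200 row) gives the 300x200 result.
zVec = exp(-y) * sin(2*pi*x);

% Largest difference between the two results (round-off level).
maxDiff = max(abs(zLoop(:) - zVec(:)));
```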

The cell array

A cell array can be seen as an array of matrices. Each matrix can be its own size and type. Whereas matrix/vector elements are accessed by enclosing the indices in parentheses ( ), the contents of a cell array are accessed with curly braces { }.
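A short sketch of the two indexing styles, with arbitrary contents chosen only for illustration:

```matlab
c = cell(1, 3);       % pre-allocate a 1x3 cell array
c{1} = magic(3);      % a 3x3 numeric matrix
c{2} = 'BHZ';         % a character array
c{3} = [1.5 2.5];     % a 1x2 row vector

m   = c{1};           % braces return the contents (a 3x3 double)
sub = c(1);           % parentheses return a 1x1 cell holding that matrix
val = c{1}(2, 3);     % index straight into the matrix stored in cell 1
```

Mixing up c(1) and c{1} is a common source of "cell" versus "double" type errors, so it helps to check with class() when in doubt.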

Structures

variable.element

Think of structures as a handy way to keep track of, pass, and handle ‘objects’ with common properties. In the example here, I use a ‘station’ with some common parameters. I can then operate on a structure variable as long as I know which properties it contains.
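The slide's own station example is not reproduced in this transcript, so the sketch below uses hypothetical field names and placeholder values (STA1, STA2, and their coordinates are invented for illustration):

```matlab
% Build a 1x2 struct array; each field takes a cell array of per-element values.
stations = struct('name',      {'STA1', 'STA2'}, ...
                  'latitude',  {35.0, 36.5}, ...
                  'longitude', {-118.0, -117.25});

% Access a property with variable.element notation.
firstName = stations(1).name;

% Collect one field across the whole array.
lats = [stations.latitude];

% Check that a property exists before relying on it.
hasElev = isfield(stations, 'elevation');

for k = 1:numel(stations)
    fprintf('%s: %.2f, %.2f\n', stations(k).name, ...
            stations(k).latitude, stations(k).longitude);
end
```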

Avoid the parfor temptation

To parallelize a Matlab for loop, all you have to do is:
1. Change “for” to “parfor”
2. Open a pool of Matlab workers to run the parallel loop
BUT I suggest avoiding this.

Why?

Matlab relies on built-in algorithms that are already optimized and often are already parallel. Therefore, if you override that optimized parallel code with a non-optimized block, you’re likely to see drops in performance because the resources are not available for the optimized function.

Furthermore, there is a cost of parallelization – namely message passing.

Each time a parallel block is entered, the processor has to work out what information each node needs and send it before the parallel job even begins.

Avoid toolboxes in distributed code

Toolboxes are great extensions to Matlab, but for a single-user license they cost ~$45 each in addition to the ~$150 base cost. Institutional costs are on the order of $500 for the base and ~$100 per toolbox. Therefore you can’t assume other users of your code will have access to the same set of toolboxes.

Need a toolbox function?

Is it simple?

Write it yourself.

Is it not simple?

Google the usage; sometimes you will find a drop-in replacement. For instance, some of the SRF codes we use at USC rely on reckon.m from the Mapping Toolbox. We recently found geodreckon.m, a nearly drop-in replacement that is freely available online from its author (Charles Karney).
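For the "simple, write it yourself" case, here is a hedged sketch of a forward great-circle calculation on a sphere, written with plain trigonometry and no toolbox calls. The names lat2fun and lon2fun are hypothetical. This is a spherical approximation only; it is not the ellipsoidal calculation that geodreckon.m performs.

```matlab
% All angles are in degrees; arclen is the great-circle arc length in degrees.
d2r = pi / 180;

% Latitude of the point reached from (lat, lon) along azimuth az.
lat2fun = @(lat, arclen, az) asin( ...
    sin(lat*d2r) .* cos(arclen*d2r) + ...
    cos(lat*d2r) .* sin(arclen*d2r) .* cos(az*d2r) ) / d2r;

% Longitude of that same point.
lon2fun = @(lat, lon, arclen, az) lon + atan2( ...
    sin(az*d2r) .* sin(arclen*d2r) .* cos(lat*d2r), ...
    cos(arclen*d2r) - sin(lat*d2r) .* sin(lat2fun(lat, arclen, az)*d2r) ) / d2r;

% Usage: start at (0N, 0E) and travel 90 degrees of arc due east.
lat2 = lat2fun(0, 90, 90);
lon2 = lon2fun(0, 0, 90, 90);
```

Starting on the equator and heading due east should stay on the equator and advance the longitude by the arc length, which makes a convenient sanity check.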

Pandora’s box: UI control

uimenu sets menu items such as file->open or edit->select all.
uipanel creates a grouped background useful for arranging user controls.
uicontrol creates an instance of an interactive widget such as a button, popup menu, editable text box, or radio button.

When one of these items is activated (e.g. a button is pressed), it invokes what is called a ‘callback function’.

Making a GUI is all about designing the button layout and then coding the set of callback functions to process the data as input by the user.

Search for existing code!

The real value of Matlab is its extensive code/user base.

Developers from across the sciences use Matlab and contribute new code that may help you make a new discovery. Many of these codes can be found on:
http://www.mathworks.com/matlabcentral/fileexchange
http://github.com

Think “Processing Flow” rather than just processing

Who wants to click once per seismogram?

When you’re dealing with big data, it’s only practical to have the data (not just waveforms) flow through filters and processes to develop the final product.

You will see this in FuncLab and SplitLab. Half of the work is just setting up the workflow and then hitting a button to actually run the processing flow.

Things like SAC or the Matlab command line are great for interactively experimenting with small data. But once you want to scale up, you need to think through as many possibilities as you can.
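One way to picture a processing flow is as an ordered list of steps applied to every trace without user clicks. The sketch below is an illustration only: the traces are synthetic stand-ins for seismograms, and the two steps (mean removal and peak normalization) are assumptions chosen for simplicity, not the FuncLab or SplitLab workflow.

```matlab
% Synthetic stand-ins for seismograms (a real flow would read files here).
nTraces = 5;
t = (0:999) / 100;                      % 10 s at 100 samples per second
traces = cell(1, nTraces);
for k = 1:nTraces
    traces{k} = k + sin(2*pi*k*t);      % constant offset plus a sinusoid
end

% The flow: an ordered list of processing steps, each a function handle.
demean    = @(x) x - mean(x);           % remove the mean
normalize = @(x) x ./ max(abs(x));      % scale to unit peak amplitude
steps = {demean, normalize};

% Run every trace through every step, in order.
processed = cell(size(traces));
for k = 1:nTraces
    x = traces{k};
    for s = 1:numel(steps)
        x = steps{s}(x);
    end
    processed{k} = x;
end
```

Adding, removing, or reordering a step then means editing one line of the steps list rather than touching the loop.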

Some useful packages:
m_map
seizmo
the waveform suite
coral
processRFmatlab
FuncLab
SplitLab
irisFetch.m + IRIS-WS jar