Data formats. Loading other data: Last time we loaded Matlab binary files (with the suffix .mat) using:

load sample_data.mat

or

load('sample_data.mat')

But not all data is in Matlab binary format.

For plain text (or ASCII) files, Matlab also loads data fairly automatically:

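load sample_data.ascii

To store the result in a variable of your choice, use the function form:

DATA=load('sample_data.ascii');

and

DATA2=load('lecture3.timeseries');

If the file name does not make the format obvious, you can ask for ASCII explicitly:

DATA=load('sample_data.ascii','-ascii');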
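
As a minimal sketch of the round trip, we can write a small array to a plain text file and load it back. The temporary file name and the numbers are made up for illustration; they are not part of the class data.

```matlab
% Illustrative only: the file name and values are made up for this sketch.
fname = [tempname '.txt'];                 % temporary plain text file
M = [10 20 35.1; 11 21 35.2; 12 22 35.3];  % 3 rows, 3 columns
save(fname, 'M', '-ascii');                % write M as whitespace-separated text
DATA = load(fname);                        % not a .mat file, so load reads it as ASCII
DATA_forced = load(fname, '-ascii');       % same result, with the format stated explicitly
disp(size(DATA))                           % number of rows and columns read
delete(fname);                             % remove the temporary file
```

Every row of a text file must have the same number of columns for load to accept it.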

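What about Microsoft Word files or Excel spreadsheets? These use binary formatting that load is not designed to read. The simplest route is to save them as plain text, and load them as ASCII files. (For spreadsheets, Matlab also provides xlsread and, in newer releases, readtable and readmatrix.)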

Other types of binary data files that you might encounter include:

- netcdf: This is a self-documenting file format that is commonly used for geophysical data. The files are easy to read into Matlab if you install the netcdf libraries on your system and the netcdf toolbox. We'll steer clear of these for this class.
- hdf: This is another self-documenting file format, though it comes in different flavors. NASA decided on one version of hdf as a standard for satellite data collected by the Earth Observation Satellites. Matlab has an hdf interface that might now work for NASA hdf files.
- raw binary: These might include output files from computer simulations, or data files archived in a compact format. Binary files are often classed as "big-endian" or "little-endian". The terminology is an allusion to Jonathan Swift's Gulliver's Travels, and refers to the arbitrary standards used to write binary data on different computer systems. In general, PCs use one format; high end unix workstations often use the other. To read these files in Matlab, you can do the following:

- fid=fopen('file.bin','r'); or one of the following if the file was written on a different type of computer: fid=fopen('file.bin','r','l'); or fid=fopen('file.bin','r','b');
- a=fread(fid,'float'); to read the entire file. Check the fread documentation to find out how to read part of a file, or to read files that have some floating point numbers and some integers.
- fclose(fid); to clean up and avoid hogging all the input/output channels.
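
To see why the byte order matters, here is a minimal sketch that writes three numbers as big-endian floats and reads them back twice, once with the matching format and once with the wrong one. The file name and values are made up for illustration.

```matlab
% Illustrative only: the file name and values are made up for this sketch.
fname = [tempname '.bin'];
x = single([1.5 -2.25 1000]);

fid = fopen(fname, 'w', 'b');       % open for writing, big-endian byte order
fwrite(fid, x, 'float');            % three 4-byte floating point numbers
fclose(fid);

fid = fopen(fname, 'r', 'b');       % matching byte order
good = fread(fid, 'float');         % column vector of the original values
fclose(fid);

fid = fopen(fname, 'r', 'l');       % wrong byte order on purpose
bad = fread(fid, 'float');          % same bytes, read in the opposite order
fclose(fid);

disp([good bad])                    % the second column is nonsense
delete(fname);
```

If a file you were given reads in as absurdly large or tiny numbers, trying the other byte order is a reasonable first check.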

To clear all variables from the workspace:

clear

To clear one or two variables:

clear x y

Representing a matrix: Matlab is really designed to work with matrices, so it very naturally stores data in two dimensional arrays, and it can also use arrays with any number of dimensions. If you read in your data as an ASCII file, you might end up with a big data array containing latitude in the first column, longitude in the next, salinity in the third, temperature in the fourth, and so forth, sort of like this:

DATA = [latitudes longitudes salinity temperature velocity(east/west) velocity(north/south) velocity(vertical)];

That's perfect for some purposes, but if your original data came from a two dimensional grid (perhaps they represent sea surface temperature in the tropical Pacific), then we might want to format the data some other way. Here's one way to handle it:

SST=reshape(DATA(:,4),nlat,nlon);

will create an nlat by nlon array containing the temperature data from column 4 of DATA. reshape fills the new array one column at a time, so this assumes that latitude is the index that varies fastest in the file.

Looking at the data:

To look at the contents of a variable, you can always type its name. You can also specify specific elements of an array. For example:

>> T(1:5,1:5)

ans =

27.2218 27.2415 27.2729 27.3757 27.4669
27.4352 27.4828 27.4918 27.5867 27.6677
27.6014 27.6939 27.6883 27.7699 27.8425
27.7381 27.8820 27.8640 27.9348 27.9986
27.8452 28.0494 28.0191 28.0774 28.1343

This is useful for some purposes, but we can't tell much about our temperature data by looking at it this way.

In Matlab, the first index corresponds to the row, and the second to the column. So the second row is:

>> T(2,1:5)

ans =

27.4352 27.4828 27.4918 27.5867 27.6677

And the second column is:

>> T(1:5,2)

ans =

27.2415
27.4828
27.6939
27.8820
28.0494

On the other hand, we can treat a two dimensional array as a vector by using a single index. Thus:

>> T(1:5)

ans =

27.2218 27.4352 27.6014 27.7381 27.8452

This provides the first 5 elements in column 1, i.e. T(1:5,1).
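
The single-index behavior and reshape follow the same rule: Matlab runs down each column before moving to the next. A minimal sketch with a tiny made-up grid (the sizes and values are for illustration only):

```matlab
% Illustrative only: a tiny 3-by-4 grid stands in for real data.
nlat = 3; nlon = 4;
column = (1:nlat*nlon)';            % data as one long column, like DATA(:,4)
G = reshape(column, nlat, nlon);    % fills down each column first

first_col = G(1:3);                 % a single index runs down column 1 ...
same_col  = G(1:3, 1)';             % ... so this holds the same values
row2      = G(2, :);                % second row, all columns
back      = G(:);                   % unrolls G into the original column

disp(G)
```

If your file is ordered with longitude varying fastest instead, one option is to reshape to nlon by nlat and then transpose.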

More on plotting:

The Matlab image functions order arrays like mathematical matrices, with coordinate (1,1) in the upper left corner. Data tend to start with the smallest latitude and longitude values, which should be mapped in the lower left corner. To make your Matlab image plot look correct, you can use axis xy, as above, or you can flip the matrix top-to-bottom: imagesc(lon_t,lat_t,flipud(T)); Note that flipud turns the picture over but leaves the y axis labels increasing downward, so axis xy is usually the safer choice.
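
Here is a minimal sketch comparing the three options side by side. The field is synthetic (it simply increases to the north), and the coordinate vectors are made up for illustration.

```matlab
% Illustrative only: synthetic field, coordinate vectors are made up.
lon = 200:2:220;                        % degrees east
lat = -10:2:10;                         % degrees north, smallest value first
F = repmat(lat(:), 1, numel(lon));      % rows follow latitude; values grow northward

figure
ax1 = subplot(1,3,1); imagesc(lon, lat, F);         title('default')
ax2 = subplot(1,3,2); imagesc(lon, lat, F); axis xy; title('axis xy')
ax3 = subplot(1,3,3); imagesc(lon, lat, flipud(F)); title('flipud')
```

In the first panel the largest values sit at the bottom. The second and third panels both put them at the top, but only the axis xy panel has y axis labels that increase upward.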

controlling details of your plots: As you work with more data, you may want to control the types of lines you use for plots, line color, or even the range of colors in an image plot. Lots of information is available from the help files. Here are a few pointers.

For simple line plots, lines are ordered blue-green-red-cyan-magenta-yellow-black, but you can override this by specifying a single letter color code (e.g. plot(DATA2(:,1),DATA2(:,2),'r') for a red line.) You can also specify a line type as solid ('-'), dashed ('--'), dotted (':'), or dash-dotted ('-.').

To plot data as points (particularly useful when your data aren't ordered sequentially), make a scatter plot by specifying a symbol only:

plot(DATA2(:,1),DATA2(:,2),'o')

to plot circles. If you start creating a plot and want to keep adding to it, use the hold on command to prevent Matlab from erasing the first plot when you add a new line and the hold off command to revert to normal mode. The command clf will clear everything out of a figure window to let you start again.
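
A minimal sketch that combines these options in one figure. The sine and cosine curves are stand-ins for real data such as DATA2.

```matlab
% Illustrative only: synthetic time series in place of DATA2.
t = (0:0.1:10)';
y1 = sin(t);
y2 = cos(t);

clf                                  % start from an empty figure
plot(t, y1, 'r--')                   % red dashed line
hold on                              % keep the first line when adding more
plot(t, y2, 'k:')                    % black dotted line
plot(t(1:10:end), y1(1:10:end), 'o') % circles only, no connecting line
hold off                             % back to normal mode
xlabel('t')
```

Color, line type, and symbol can be combined in one string, so 'ro-' would give red circles joined by a solid line.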

colormaps: In Matlab, the default colormap for contour and image plots is a blue to red spectrum, but you can override this. To change the colormap used for contour or image plots, you can specify a different basic color map by typing, for example, colormap(cool). Other colormaps include hsv, prism, gray, hot, copper, flag, pink, bone, and jet (the blue to red default described here; newer Matlab releases default to parula instead).

Sometimes, you want to make sure that NaNs don't end up shaded the same color as useful data points, so you can force values at the end of your range to be white or black, for example.

cmap2=[[1 1 1]; colormap; [1 1 1]];

colormap(cmap2);

You might have to fix the limits of your colors to keep real data from also being whited out. To get black where you had no data, you'd use [0 0 0].
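
One way to fix the color limits is sketched below, under the assumption that NaNs in an image plot are drawn with the first color in the map (which is the usual behavior for imagesc). The field comes from the built-in peaks demo function, the missing block is made up, and caxis is the color-limit command in the releases this was written for (newer releases also call it clim).

```matlab
% Illustrative only: synthetic field with a block of missing values.
F = peaks(40);                       % built-in demo surface
F(10:15, 20:30) = NaN;               % pretend these points have no data

base = jet(64);                      % any built-in colormap would do
cmap2 = [[1 1 1]; base];             % white in the first slot

lo = min(F(:)); hi = max(F(:));      % min and max ignore NaN
step = (hi - lo) / size(base, 1);    % width of one color slot

imagesc(F); axis xy
colormap(cmap2)
caxis([lo - 1.01*step, hi])          % lower limit just over one slot below the data
colorbar
```

Pushing the lower color limit a little more than one slot below the smallest real value keeps the white slot for the NaNs alone.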

Plotting vector quantities: Velocity vectors can be a little subtle. The Matlab function quiver will plot vector quantities. So:

quiver(lon_t,lat_t,U,V)

plots a map of vector velocities. The results are a little deceptive. If you change the aspect ratio of the plot, the vectors appear to change direction, so the on-screen vectors aren't telling you the precise direction of the current. We can also compute the angular direction of the current and plot that as an image:

theta=atan2(V,U)*180/(pi);

Here we use atan2 rather than atan, because atan2 uses the signs of both V and U to resolve the full circle of directions. It returns angles from -180 to 180 degrees; use mod(theta,360) if you want them to go from 0 to 360 degrees. Since 0 and 360 degrees are equivalent, it's good to choose a colorscale where the colors at 0 and 360 are nearly the same. The clrscl.m function (written by an SIO researcher) provides one way to fix a colorscale appropriately. Compare:

imagesc(lon_t,lat_t,theta); axis xy; colorbar

with

colormap(clrscl('rmbcgyr',36))

imagesc(lon_t,lat_t,theta); axis xy; colorbar
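
A minimal sketch of the difference between atan and atan2, using four made-up unit vectors, one per quadrant. The mod step is one way to shift the atan2 output into the range 0 to 360.

```matlab
% Illustrative only: four vectors, one per quadrant.
U = [ 1 -1 -1  1];                   % eastward component
V = [ 1  1 -1 -1];                   % northward component

theta_atan  = atan(V./U)*180/pi;     % about 45 -45 45 -45: quadrants are confused
theta_atan2 = atan2(V,U)*180/pi;     % about 45 135 -135 -45: correct, but signed
theta_360   = mod(theta_atan2, 360); % about 45 135 225 315

disp([theta_atan; theta_atan2; theta_360])
```

Note that these angles are measured counterclockwise from east, which is the mathematical convention rather than a compass bearing.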

Probability density functions and histograms: We looked at the hist function last week. How do you convert a histogram into a probability density function?

x=rand(20000,1);

[a,b]=hist(x,0.05:.1:.95);

Now we can make a bar plot, typing

bar(b,a)

We can also modify the bar plot to give us what we need for a pdf:

bar(b,a/sum(a)/.1); % where 0.1 is the interval dx

xlabel('x'); ylabel('probability density')

You can get rid of the space between the bars by supplying a width parameter to bar:

bar(b,a/sum(a)/.1,1);
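
A quick check on the normalization is that the area under the pdf should be 1. The sketch below repeats the steps with a different bin width (dx = 0.05, chosen only for illustration) and computes that area.

```matlab
% Illustrative only: same kind of uniform sample, different bin width.
x = rand(20000,1);
dx = 0.05;                           % bin width
centers = dx/2 : dx : 1-dx/2;        % bin centers spanning 0 to 1
[a, b] = hist(x, centers);           % a = counts, b = bin centers

p = a / sum(a) / dx;                 % counts -> probability density
area = sum(p) * dx;                  % should be 1 (up to round-off)

bar(b, p, 1);
xlabel('x'); ylabel('probability density')
fprintf('area under the pdf = %g\n', area)
```

For uniformly distributed data on 0 to 1 the bars should all sit near a density of 1, whatever bin width you pick.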

In contrast to this procedure, the Matlab function pdf.m returns the functional form of a specified probability density function.

Logic: Sometimes we want to test for a particular condition. In Matlab, conditional statements can use if, while, or switch. These test whether an expression is true or false. Expressions test the relationship between two quantities. The help page for relop explains everything. Here are some examples.

To test whether T is less than 999, use (T<999).

To test whether T is equal to 999, use (T==999).

To test whether T is not equal to 999, use (T~=999).

To test whether T and S are both not equal to 999, use (T~=999 & S~=999).

To test whether T or S is equal to 999, use (T==999 | S == 999).

To test whether T is NaN, use (isnan(T)).
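
These tests also work element by element on arrays, and the result can be used as an index. A minimal sketch, assuming 999 is being used as a missing-data flag (the values are made up):

```matlab
% Illustrative only: 999 plays the role of a missing-data flag.
T = [27.2 999 27.6 27.7];
S = [35.1 35.2 999 35.0];

good = (T ~= 999 & S ~= 999);        % logical array: 1 0 0 1
T(T == 999) = NaN;                   % replace the flag with NaN
n_missing = sum(isnan(T));           % count of NaNs in T
T_mean = mean(T(good));              % average of the trusted values only

if any(isnan(T))
    disp('T contains missing values')
end
```

We need isnan here because a comparison such as T==NaN is always false, even for elements that are NaN.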

Automating a procedure: Last week we looked at for loops. Here are some other ways to make a calculation loop:

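i=3;
while i<=size(A,2)   % <= so that the last column is plotted too
    array=reshape(A(:,i),nlat,nlon);   % column i as an nlat by nlon grid
    figure(i-2)
    imagesc(lon,lat,array)   % imagesc expects x, y, and then the data
    i=i+1;
end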

use Matlab's vectorization capabilities: For many purposes, Matlab will naturally loop through a whole set of variables without requiring a loop. For example, if you wanted to compute the mean and standard deviation of each column in A, you might be tempted to create a loop:

for i=3:8
    m(i-2)=mean(A(:,i));
    s(i-2)=std(A(:,i));
end

But instead you could say:

m=mean(A(:,3:8));
s=std(A(:,3:8));

This second option not only requires less code, but it is also
computationally faster.
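
A minimal sketch confirming that the two approaches give the same numbers. Random values stand in for the data array A; its size is made up for illustration.

```matlab
% Illustrative only: random numbers stand in for the data array A.
A = randn(500, 8);

m_loop = zeros(1, 6);                % preallocate before looping
s_loop = zeros(1, 6);
for i = 3:8
    m_loop(i-2) = mean(A(:,i));
    s_loop(i-2) = std(A(:,i));
end

m_vec = mean(A(:,3:8));              % one call, column by column
s_vec = std(A(:,3:8));

max_diff = max(abs([m_loop - m_vec, s_loop - s_vec]));
fprintf('largest difference = %g\n', max_diff)
```

If you do need a loop, preallocating the output arrays as above avoids growing them on every pass.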