MAE 127: Lecture 1
Introduction to the course
Handouts: syllabus and reading reference list (pdf format); survey (pdf format)
Introductions:
Professor Sarah Gille:
I'm a physical oceanographer. At
UCSD I'm split between Scripps Institution of Oceanography and the
department of Mechanical and Aerospace Engineering. Since I can't
be two places at once, I usually sit at Scripps, though I'll be in my
MAE office before class. If you need to reach me any other
time, call or send e-mail.
Among other things, I also teach data analysis to grad students at
SIO. When I'm not teaching, my research focuses on the ocean's
role in climate, and I work mostly on the Southern Ocean, the
part of the global ocean that encircles Antarctica. My work
depends on analyzing data, and the methods that we'll cover in this
class are things that come up in climate research all of the
time. That's not to say that they only matter for climate studies
or even that they're only relevant for research in a university setting.
The class: nominally aimed at
ME, EnvE, Earth Science, and ESYS majors.
The formal prerequisite is Math 20C (or equivalent); Math 20F is
recommended.
Why study statistical methods?
Statistical methods are the tools to help us interpret
observations of the natural environment or measurements from the
lab. Thus statistics is fundamental to science and
engineering. Environmental sciences differ from pure physics or
laboratory engineering work, because we often observe the world as it
exists, rather than performing controlled experiments. As a
result, our data can be noisy and imperfect, and we have to analyze our
data carefully.
Course objectives:
(1) To teach you fundamental statistics and basic techniques for
analyzing data.
(2) To make sure you learn not only how to treat data but also how to
assess uncertainties.
(3) To emphasize problem solving skills.
Course schedule:
Roughly 3 segments: basic statistics, least-squares
fitting, and spectral analysis, with empirical orthogonal functions at
the end. Details are subject to revision, so check the course web site
for updates. This is a new course, and feedback is welcome.
Expectations:
See the syllabus for course requirements. I try to start
class promptly and to finish on time, so please plan to arrive on time.
Comments on texts:
This year the course has no assigned text. That's because
the class
is new, and nothing that I checked out in advance seemed a perfect
choice in terms of both content and cost. In exchange, I'll post
detailed notes on the web. I've also put a
dozen or so books on reserve, and I'll have key chapters made available
through electronic reserves. Your feedback will be valuable in
deciding whether we can assign a single textbook next year.
Taylor's book is quite introductory but clearly written and
comparatively inexpensive, so I've made it available at the bookstore
as an optional text. We won't cover all of it.
We will make use of some basics of linear algebra. I've put a few
basic textbooks (by Strang and by Noble) on reserve. There are
also some good Matlab tutorials on linear algebra.
The other books are mostly upper level undergraduate or graduate texts
on data analysis. I'll point out appropriate references as they
come up.
Comments on software:
We will use Matlab for
this course. Matlab is good for statistical methods, because it's
really built around a linear algebra package. (The name Matlab
comes from Matrix Laboratory.) It does the mathematical
operations that we need, and it lets you make plots easily. Some
of you are probably really familiar with Matlab, but some of you
probably haven't seen it, so I'll spend a couple of class sessions
going over the basics that you need for this course.
What is data?
Data can be any measurements, whether from field observations,
laboratory experiments, or computer simulations. Some
examples include temperatures at the Scripps Pier or output from a
computer simulation of ocean circulation in the tropical Pacific.
What do you do with it? (see lecture1.pdf)
Start by looking at it and making plots:
time-series: graphs of a variable versus time
maps: variable versus latitude and longitude
sections: variable versus depth and, for example, longitude
Hovmöller diagrams: variable versus time and, for example, longitude
What do we learn through data analysis?
(see lecture1.pdf)
Example 1:
Long-term climate trends are tracked by testing whether conditions at
present differ from conditions in the past. But what does
"different" mean? We'll have to define a statistical standard
for determining when one measurement differs from another. In
some cases, when many data are averaged together, error bars are
clearly small. On the other hand, measurements from single points
can be useful, if we can figure out how to interpret their
uncertainties.
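
As a rough illustration of why averaging shrinks error bars, here is a minimal sketch. The course itself uses Matlab; Python is used here only for illustration. The sketch assumes independent measurements with roughly Gaussian noise. The temperature values are made up, the function names standard_error and means_differ are hypothetical, and the two-standard-error cutoff is a convenient rule of thumb rather than the formal standard we'll develop later.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


def means_differ(a, b, n_sigma=2.0):
    """Return True if the two sample means differ by more than n_sigma
    combined standard errors (a rough criterion, not a formal test)."""
    diff = abs(statistics.mean(a) - statistics.mean(b))
    combined = math.sqrt(standard_error(a) ** 2 + standard_error(b) ** 2)
    return diff > n_sigma * combined


# Made-up temperatures (deg C), for illustration only.
past = [15.1, 14.8, 15.3, 15.0, 14.9, 15.2]
present = [15.9, 16.2, 15.8, 16.1, 16.0, 15.7]

print("standard error, past:   ", standard_error(past))
print("standard error, present:", standard_error(present))
print("means differ:", means_differ(past, present))
```

Because the standard error scales as 1/sqrt(N), quadrupling the number of independent measurements roughly halves the error bar. A single measurement has no such reduction, which is why its uncertainty has to be estimated some other way.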
Example 2: Evaluating
Southern California air quality: Federal and state air quality
standards require monitoring for levels of ozone, carbon monoxide, NOx,
and particulate matter that exceed a threshold. The law is
strict: one day of violation is cause for concern. These
threshold standards are implemented because they're easy to set up, and
because the thresholds are considered minimum requirements for human
health. However, natural variability can lead to occasional
extreme events, so a threshold standard can be a very stringent
requirement.
(On the other hand, if you design a satellite to monitor pollution, for
example, any failure in the rocket that launches your satellite or in the
measurement equipment would prevent you from collecting any data, so
you might want a stringent engineering design standard.)
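
To see why a one-day threshold can be so stringent, consider a back-of-the-envelope sketch in Python (used here only for illustration, since the course uses Matlab). It assumes that each day exceeds the threshold independently with the same small probability. That is a simplification, since real pollution events cluster in time. The daily probabilities below are hypothetical, not measured values, and the function name is made up for this example.

```python
def prob_at_least_one_exceedance(p_daily, n_days=365):
    """Probability of one or more exceedances in n_days, assuming each day
    exceeds the threshold independently with probability p_daily."""
    return 1.0 - (1.0 - p_daily) ** n_days


# Hypothetical daily exceedance probabilities, for illustration only.
for p in (0.001, 0.01):
    print(p, prob_at_least_one_exceedance(p))
```

Under these assumptions, a daily exceedance probability of 0.1% gives roughly a 31% chance of at least one violation in a year, and a daily probability of 1% makes a violation nearly certain (about 97%).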
Example 3: Identifying the annual cycle: Because of the seasons, almost
everything on Earth undergoes a natural annual cycle. Thus,
before we do any analysis we often want to remove the annual
cycle. For the moment, we'll look at examples for ozone in the
troposphere (which varies because it undergoes a photochemical
reaction, and that of course depends on the available sunlight),
temperature in the upper ocean at the BATS site, and the Keeling curve.
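
One simple way to remove an annual cycle from monthly data is to subtract the long-term mean for each calendar month (a monthly climatology). The sketch below is in Python for illustration only, since the course uses Matlab. It is one possible approach, not necessarily the one used for the examples above. The data are synthetic: a made-up mean of 20, an annual cosine of amplitude 5, and a weak trend. The function name remove_annual_cycle is hypothetical.

```python
import math


def remove_annual_cycle(values, period=12):
    """Subtract the mean of each calendar month (the climatology).

    Assumes evenly spaced samples, `period` samples per year, starting
    at the first month of a year. Returns (anomalies, climatology).
    """
    climatology = []
    for m in range(period):
        month_vals = values[m::period]  # every sample for calendar month m
        climatology.append(sum(month_vals) / len(month_vals))
    anomalies = [v - climatology[i % period] for i, v in enumerate(values)]
    return anomalies, climatology


# Synthetic 10-year monthly series (illustration only, not real data).
n_months = 120
series = [20.0 + 5.0 * math.cos(2.0 * math.pi * i / 12.0) + 0.01 * i
          for i in range(n_months)]

anomalies, climatology = remove_annual_cycle(series)
print("range of original: ", max(series) - min(series))
print("range of anomalies:", max(anomalies) - min(anomalies))
```

The original series varies by more than 10 units over the year, while the anomalies retain only the slow trend and span about 1 unit. Whatever signal remains after the annual cycle is removed is usually what we want to analyze.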