sadovski-unk.ppt


Review of Mathematical and Statistical Models to Predict (Extrapolate) Water Levels and to Fill Gaps (Interpolate) of the TCOON Data
Extending and Strengthening the Pipeline in Computer Science
Texas A&M University - Corpus Christi
6300 Ocean Dr. Corpus Christi, Texas 78412, USA
This year's team:
Alexey Sadovski
Scott Duff
Garry Jeffress
Carl Steidley
Philippe Tissot
Beate Zimmer
Zack Bowles
David Beck
Jeremy Flores
Aimee Mostella
Kelly Knott (Torres)
Project Goals
• Develop effective & reliable prediction tools
• Developed methods:
– Harmonic analysis
– Numerical methods based on the equations of hydrodynamics
– Statistical models
– Neural networks
Typical TCOON
station web page
Primary Water Level
Water Temperature
Wind Speed
Wind Gust
Wind Direction
http://dnr.cbi.tamucc.edu/
Texas Coastal Ocean Observation Network
Monitors water levels and other coastal
parameters along the Texas coast
Tide Charts
• In general this is the first choice
• Astronomical forcing
– Earth, Sun, Moon motions
• Limitations
– Areas such as the Gulf of Mexico where the
dominant forcing is meteorological in nature
Harmonic Analysis
• Standard method for tide predictions
• Represented by constituent cosine waves with
known frequencies based on gravitational
(periodic) forces
• Elevation of water is modeled as
$$h(t) = H_0 + \sum_c H_c \, f_{y,c} \cos(a_c t + e_{y,c} - k_c)$$
h(t) = elevation of water at time t
H_0 = datum offset
H_c = amplitude of constituent c
a_c = frequency (speed) of constituent c
f_{y,c}, e_{y,c} = node factors / equilibrium arguments
k_c = phase offset for constituent c
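The sum above can be evaluated directly. The following is a minimal sketch, assuming angles in degrees and time in hours. The function name `harmonic_level`, the dictionary keys, and the table `EXAMPLE_CONSTITUENTS` are hypothetical. The M2 and K1 speeds are the standard values in degrees per hour; every amplitude, phase, and datum value is made up for illustration and is not TCOON data.

```python
import math

def harmonic_level(t_hours, datum, constituents):
    """Evaluate h(t) = H0 + sum_c Hc * f_c * cos(a_c*t + e_c - k_c), angles in degrees."""
    level = datum
    for c in constituents:
        angle_deg = c["speed"] * t_hours + c["equilibrium"] - c["phase"]
        level += c["amplitude"] * c["node_factor"] * math.cos(math.radians(angle_deg))
    return level

# Hypothetical constituent table: only the speeds are real constants,
# all other numbers are illustrative assumptions.
EXAMPLE_CONSTITUENTS = [
    {"name": "M2", "speed": 28.9841042, "amplitude": 0.08,
     "node_factor": 1.0, "equilibrium": 0.0, "phase": 110.0},
    {"name": "K1", "speed": 15.0410686, "amplitude": 0.15,
     "node_factor": 1.0, "equilibrium": 0.0, "phase": 300.0},
]

if __name__ == "__main__":
    # Print the modeled elevation every 6 hours over one day.
    for hour in range(0, 25, 6):
        print(hour, round(harmonic_level(hour, 0.30, EXAMPLE_CONSTITUENTS), 3))
```

A real tide prediction would use the full constituent set fitted for the station, with node factors and equilibrium arguments for the year in question.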
Prediction vs. Observation
It’s nice when it works…
Prediction vs. Observation
…but it often doesn’t work in Texas
Water Levels
Tides
In Texas, meteorological factors have a
significant effect on water elevations
Two Reliable Statistical Models
– Both are linear multiple-regression models
– Both deal with combinations of previous
water levels only
– Difference in models
• Between 4 and 8 variables in one kind of model,
which takes into account first and second
differences of water levels
• Between 12 and 48 variables in the other kind of model,
in which only previous water levels are used
Statistical Models
$$x_1 = a_0 x_0 + a_1 x_{-1} + \dots + a_n x_{-n}$$
and step by step
$$x_k = a_0 x_{k-1} + a_1 x_{k-2} + \dots + a_n x_{k-1-n}$$
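A minimal sketch of how such coefficients could be estimated and then applied step by step, assuming hourly water levels held in a plain Python list. The names `solve_linear_system`, `fit_coefficients`, and `forecast` are hypothetical, and `order` is the number of previous levels used. Ordinary least squares through the normal equations is an assumption made here for illustration; the slides do not say which fitting procedure was used.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_coefficients(series, order):
    """Least-squares fit of x_k = a_0*x_{k-1} + ... + a_{order-1}*x_{k-order}."""
    # Each row holds the `order` previous values, most recent first.
    rows = [series[k - order:k][::-1] for k in range(order, len(series))]
    targets = series[order:]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(order)]
           for i in range(order)]
    aty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(order)]
    return solve_linear_system(ata, aty)

def forecast(history, coeffs, steps):
    """Predict `steps` values ahead, feeding each prediction back as an input."""
    window = list(history)
    predictions = []
    for _ in range(steps):
        recent = window[-len(coeffs):][::-1]
        nxt = sum(a * x for a, x in zip(coeffs, recent))
        predictions.append(nxt)
        window.append(nxt)
    return predictions
```

Typical use would be `coeffs = fit_coefficients(levels, 12)` followed by `forecast(levels, coeffs, 24)`. Because each predicted value is fed back as an input, errors can accumulate as the forecast horizon grows.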
Statistical Models
• One possible future application
– Occasional losses of data
• Regression models, using forward and backward
regression, evaluate lost data as a linear combination of
forward and backward predictions with weights
proportional to the distances from the edges of the gap
Factor Analysis
Question:
Why do models with only previous water levels work
better than models with all data provided by TCOON
stations?
• No more than 5 factors explain over 90% of variance
for water levels
Bob Hall Pier (014)
Flower Garden (028)
Factor Analysis
– In off-shore deep waters, the first two or three
components are periodic
– In coastal shallow waters and estuaries the
major or the first component is not periodic
– Our conclusion is that the prime factor is
“weather”
– Linear regression models for different locations
have different coefficients for the same
variables
– This difference may be explained by the
geography where the data is collected
Improved Predictions
• Model differences between the observed
water levels and the harmonic predictions by
using multiple regression (so-called marriage
of harmonic and regression analysis)
• Build a model based on past observations;
use that to make a model to predict
differences in future observations
Statistical Models
$$x_n = w_n - h_n$$
x_n is the difference between water
level w_n and harmonic water level h_n
at the moment n
$$x_1 = a_0 x_0 + a_1 x_{-1} + \dots + a_n x_{-n}$$
and step by step
$$x_k = a_0 x_{k-1} + a_1 x_{k-2} + \dots + a_n x_{k-1-n}$$
Predicted Levels
Now we can predict
water levels at the moment t
$$pw_t = h_t + x_t$$
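The arithmetic behind this "marriage" of harmonic and regression analysis is short enough to sketch. The function names `residuals` and `predicted_levels` are hypothetical, and all numbers in the usage note are made up. The residual forecast itself is assumed to come from a regression model on past residuals.

```python
def residuals(observed, harmonic):
    """x_n = w_n - h_n: observed water level minus harmonic prediction."""
    return [w - h for w, h in zip(observed, harmonic)]

def predicted_levels(harmonic_forecast, residual_forecast):
    """pw_t = h_t + x_t: harmonic prediction plus forecast residual."""
    return [h + x for h, x in zip(harmonic_forecast, residual_forecast)]
```

For example, `residuals([1.0, 1.5], [0.75, 1.0])` gives `[0.25, 0.5]`, and adding a forecast residual of 0.25 to harmonic predictions of 0.5 and 0.25 gives predicted levels of 0.75 and 0.5.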
Station 005: Packery Channel
Evaluation Criteria
• Criteria for the evaluation of water level forecasts
– Different criteria were developed, mostly by the U.S. National
Oceanic and Atmospheric Administration (NOAA), to address the
different priorities of coastal users
Evaluation Criteria
• Average error will address the possible bias
of a model
• The absolute error will give information on
the overall accuracy of the model
• Standard deviation will give information on
the variability of the forecasts
Evaluation Criteria
• Specialized criteria, e.g., positive and negative
outlier frequencies, will be useful to
characterize model performance for unusually
high or low water level situations
• Some forecasting methodologies will be better
suited for some criteria and worse for others,
e.g., predictions based on harmonic analysis are
very good when evaluated by the standard
deviation criterion and not as good when
evaluated by the absolute error criterion.
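The criteria listed on these slides can be computed from paired forecasts and observations. This is a minimal sketch, assuming the error is defined as forecast minus observation and that an outlier is any error beyond a user-chosen threshold. The function name `evaluate`, the dictionary keys, and the threshold convention are assumptions and may not match the exact NOAA definitions.

```python
import statistics

def evaluate(predicted, observed, outlier_threshold):
    """Summarize forecast errors (predicted - observed) with several criteria."""
    errors = [p - o for p, o in zip(predicted, observed)]
    n = len(errors)
    return {
        # Bias of the model: positive means forecasts run high.
        "average_error": statistics.fmean(errors),
        # Overall accuracy, ignoring the sign of each error.
        "mean_absolute_error": statistics.fmean(abs(e) for e in errors),
        # Variability of the errors around their mean.
        "standard_deviation": statistics.pstdev(errors),
        # Fraction of forecasts too high / too low by more than the threshold.
        "positive_outlier_frequency": sum(e > outlier_threshold for e in errors) / n,
        "negative_outlier_frequency": sum(e < -outlier_threshold for e in errors) / n,
    }
```

Reporting the criteria together makes the trade-off visible: a model can have a small standard deviation and still carry a large bias.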
Training Set - March 2003
Rockport (015)
Prediction for 48 hours
Training Set - March 2003
Prediction for 24 hours
Bob Hall Pier (014)
Basic Algorithm
• Retrieve data according to user-provided parameters
• Search data for missing values
• Perform linear regression to obtain two sets of coefficients
• Calculate missing values with coefficients
• Combine two sets into one
• Insert new values in place of missing data
Water Level
• RWL = AWL - HWL
– RWL => Residual Water Level
– AWL => Actual Water Level
– HWL => Harmonic Water Level
• Record the location of gaps in the AWL
• Record the difference between AWL and
HWL as the RWL
Linear Regression
• For each gap in the data
– Perform forward and backward linear
regression (FLR & BLR, respectively) using
hourly data to obtain coefficients
– Calculate the missing data points with these
coefficients
Methods of Combination
• Combine the results of FLR & BLR using
one of the following methods:
– Convex linear combination
• Based on weighted proportion
– Convex trigonometric combination
• Based on trigonometrically weighted proportion
– Combination at intersection
• Fuse together at the intersection
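The first two combination methods can be sketched as weighted averages of the forward (FLR) and backward (BLR) predictions across a gap. This is a minimal sketch. The function names are hypothetical, the weights are taken from each point's fractional position between the gap edges, and the cos²/sin² weighting is only one plausible reading of "trigonometrically weighted", not a confirmed formula from this work. Combination at the intersection is not covered.

```python
import math

def _fractions(gap_length):
    """Fractional position t of each missing point inside the gap, 0 < t < 1."""
    return [(i + 1) / (gap_length + 1) for i in range(gap_length)]

def combine_linear(forward, backward):
    """Convex linear combination: forward weight falls linearly across the gap."""
    ts = _fractions(len(forward))
    return [(1 - t) * f + t * b for t, f, b in zip(ts, forward, backward)]

def combine_trigonometric(forward, backward):
    """Convex trigonometric combination (assumed form): cos^2 and sin^2 weights."""
    ts = _fractions(len(forward))
    return [
        math.cos(math.pi * t / 2) ** 2 * f + math.sin(math.pi * t / 2) ** 2 * b
        for t, f, b in zip(ts, forward, backward)
    ]
```

Both schemes keep the weights non-negative and summing to one, so each filled value lies between the two predictions. The trigonometric version stays closer to the forward prediction near the start of the gap and closer to the backward prediction near the end.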
Results
Effect of Number of Coefficients
• Timing was negligible
• Accuracy peaked and then declined depending upon weather conditions
• RMSE was used to determine the optimal number of coefficients
Coefficients
Figure 1 displays our chosen coefficients according to weather condition
Although these coefficients are optimal, the accuracy of interpolation still
declines as weather becomes more extreme
Artificial Neural Network Modeling
• Started in the 1960s
• Key innovation in the late 1980s: backpropagation
learning algorithms
• Number of applications grew rapidly in the
1990s, especially financial applications
• Growing number of publications presenting
environmental applications
Why ANNs?
• Modeled after the human brain
• Neurons compute outputs (forecasts) based on inputs,
weights and biases
• Able to model non-linear systems
Neural Network Features
• Non-linear modeling capability as well as generic
modeling capability
• Robustness to noisy data
• Ability for dynamic learning
• Limitation: Requires availability of high density
of data
Artificial Neural Network Setup
• ANN models developed within the Matlab
and Matlab NN Toolbox environment
• Found simple ANNs are optimum
• Use of ‘tansig’ and ‘purelin’ functions
• Use of Levenberg-Marquardt training
algorithm
• ANN trained over 1 year of hourly data
(8760 observations)
Transform Functions
[Figure: plots of the two transfer functions for x between -3 and 3]
Tansig: $y = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$
Purelin: $y = x$
Optimum ANN Structure
• Simple ANNs work best: 1 hidden neuron and 1
output neuron
• Optimum number of previous water level inputs
varies between 3 and 24 hours
• Optimum number of previous wind measurement
inputs varies between 1 and 12 hours
• Actual number of inputs chosen does not
significantly change model performance
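The structure described here, one tansig hidden neuron feeding one purelin output neuron, has a very small forward pass. This is a minimal sketch of that computation only, in Python rather than the Matlab NN Toolbox used in this work. The name `ann_forecast` and every weight and bias are hypothetical; training with Levenberg-Marquardt is not shown.

```python
import math

def tansig(x):
    """Hyperbolic tangent sigmoid: (e^x - e^-x) / (e^x + e^-x)."""
    return math.tanh(x)

def purelin(x):
    """Linear transfer function: y = x."""
    return x

def ann_forecast(inputs, hidden_weights, hidden_bias, output_weight, output_bias):
    """Forward pass of a network with 1 hidden (tansig) and 1 output (purelin) neuron."""
    weighted_sum = sum(w * x for w, x in zip(hidden_weights, inputs))
    hidden = tansig(weighted_sum + hidden_bias)
    return purelin(output_weight * hidden + output_bias)
```

The `inputs` list would hold the chosen previous water levels, wind measurements, and tidal forecasts, scaled as the training procedure requires.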
ANN schematic
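[Schematic: three input groups (Water Level History, Wind Squared History, Tidal Forecasts) enter the Input Layer. Each neuron j forms a weighted sum X_j = Σ_i (a_{j,i} x_i), adds a bias b_j, and applies its transfer function to (X_j + b_j). The Hidden Layer feeds the Output Layer, which produces the Water Level Forecast H(t+i).]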
Philippe Tissot - 2000
Model Assessment
• Based on five 1-year data sets (’97, ’98, ’99, ’00, ’01),
including observed water levels and winds, and tide forecasts
• Train the ANN model using one data set, e.g. ’97, for
each hourly forecast target, e.g. 12 hours
• Apply the ANN model to the other four data sets
• Repeat the performance analysis for each training
year and forecast target, and compute the model
performance and variability
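The train-on-one-year, test-on-the-others procedure can be sketched as a simple loop. The names `assess_across_years`, `train`, and `score` are hypothetical placeholders. In practice `train` would fit the ANN for one forecast target and `score` would return one of the evaluation criteria on another year's data.

```python
import statistics

def assess_across_years(datasets, train, score):
    """Train on each year in turn and score the model on all remaining years.

    datasets: dict mapping year -> data set
    train:    callable(data) -> model
    score:    callable(model, data) -> float (e.g. mean absolute error)
    """
    results = {}
    for train_year, train_data in datasets.items():
        model = train(train_data)
        scores = [score(model, data)
                  for year, data in datasets.items() if year != train_year]
        results[train_year] = {
            "mean": statistics.fmean(scores),     # average performance
            "spread": statistics.pstdev(scores),  # variability across test years
        }
    return results
```

Running the loop once per forecast target (e.g. 12 hours, 24 hours) would give the full assessment described above.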
Artificial Neural Network forecasting of
water levels
• Use historical time series of previous water
levels, winds, barometric pressure as input
• Train neural network to associate changes
in inputs and future water level changes
• Create water level forecasts using a static
neural network model
ANN Inputs
• Tested model with input from different locations:
– Rockport only
– Rockport with Port Aransas (Ship channel)
– Rockport with Bob Hall Pier (Coastal station)
• Tested model with different meteorological time
series:
– Water Level only
– Water Level and Previous Wind measurements
– Water Level, Previous Wind, and Wind Forecasts
Training with one set (X = 15cm)
Morgan’s Point
Tropical Storms
• Tropical storms are a challenge for any
predictive model
• They are relatively infrequent and unique
• As storms are often destructive, improved
predictions are very useful to emergency
management
Performance applied to 1998
[Figure axes: water level (m) vs. hours (1998)]
Tropical Storm Frances September 7-17, 1998
Frances Trajectory
Landfall on Sept. 11
Forecasts in storm events
Rockport ANN 24-hour Forecasts During 1998 Tropical Storm Frances
(ANN trained over 1997 Data Set)
New Directions and Problems
• Statistics on ANN structures.
• New feedback functions and optimal criteria for ANN to
improve prediction quality.
• Evaluating quality and quantity of information produced
by ANN during training and predictions using ergodic and
entropy analysis.
• Spatial-temporal analysis (statistics) of data from stations
in and around Corpus Christi Bay and Galveston Bay.
• Dynamic models of the water levels (integro-differential
equations with boundary values).
New Directions and Problems
• Visualization and simulation of water levels.
• Predictions during tropical storms using as inputs the first,
the second, and the third differences of water levels.
• Optimal control in multi-species models for fisheries and
fish harvesting.
• Spectral analysis of stochastic processes based on TCOON
data.
• Development of the theory of quasi-periodic functions.
• Development of mathematics for an autonomous boat.
Acknowledgments
• The presented work is funded in part by the
following federal and state agencies of the
USA:
– National Aeronautics and Space Administration
(NASA Grant #NCC5-517)
– National Oceanic and Atmospheric
Administration (NOAA)
– Texas General Land Office
– Coastal Management Program (CMP)
Resources
• Division of Nearshore Research Website
http://dnr.cbi.tamucc.edu
• TCOON Data Query Page
http://dnr.cbi.tamucc.edu/pquery
http://dnr.cbi.tamucc.edu/wiki/Modeling/WLModelComparisons
• Web-based Predictions Development Page
http://wip.cbi.tamucc.edu/~jessica/cbidb/cgi-bin/excel/sdiffcoeff.cgi