Time Series Analysis Tutorial 2 - GeoWeb

Large-scale cGPS processing and prototyping solutions
• Generating large GAMIT solutions (>50 sites)
– Regional networks: All sites to be processed
– Global networks: Make global networks of certain size
given list of available sites.
• Strategies for large network processing in GLOBK
– Solution concatenations
• Prototyping tools: Run globk command setup on time
series files.
GPS Processing and Analysis with GAMIT/GLOBK/TRACK
T. Herring, R. King, M. Floyd – MIT
UNAVCO, Boulder - July 8-12, 2013
Strategies for Large-network Processing
• Since GAMIT is limited by parameter definitions to 99 sites, we
divide the processing of large networks into sub-nets of 30-50 sites
each (processing time is proportional to the cube of the number of
parameters, so it is better to have many smaller sub-nets than a few
large ones)
• sh_gamit can use the -netext parameter to define multiple day
directories (e.g. [DDD]n1, [DDD]n2, ….)
• GLOBK is used to combine the networks for each day
– You can run htoglb to generate binary h-files (.glx) for each
subnet, then use sh_glred with the LB and -net options to select
the h-files to be combined
• Prototyping programs (tscon, tssum, tsfit) can be used to identify
breaks and outliers before running a (time-consuming) velocity
solution
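The cubic scaling mentioned above can be illustrated with a rough sketch. It assumes, for illustration only, that the number of parameters is proportional to the number of sites, and it ignores tie sites and fixed overheads. The function name is hypothetical.

```python
def subnet_cost_ratio(total_sites, subnet_size):
    """Cost of processing in sub-nets relative to one single network.

    Assumes cost is proportional to (number of sites) ** 3, i.e. that the
    parameter count scales linearly with the site count (an assumption).
    """
    if total_sites <= 0 or subnet_size <= 0:
        raise ValueError("site counts must be positive")
    n_full, remainder = divmod(total_sites, subnet_size)
    # Equal-size sub-nets plus one smaller sub-net for any leftover sites.
    sizes = [subnet_size] * n_full + ([remainder] if remainder else [])
    return sum(n ** 3 for n in sizes) / total_sites ** 3


# Three 30-site sub-nets versus one hypothetical 90-site network.
print(subnet_cost_ratio(90, 30))
```

With k equal sub-nets the ratio is 1/k², so under this assumption three 30-site sub-nets cost about one ninth of a single 90-site run.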
07/10/2013
Large cGPS+
2
Large regional networks
• Program netsel: Subnetting program for regional GPS networks
Scans the RINEX list and generates a sites.default
Usage:
netsel <options>
Options are
-f <file> -- List of rinex files generated with ls -s <rinex files>
-v <file> -- Globk velocity file with site coordinates
-n <number> -- number of sites per network (additional sites added for ties)
-t <number> -- Number of tie sites per network
-s <file> -- Name of station.info file to use (default ../tables/station.info)
-c <code> -- Specifies network code (2-characters). Default ne so that
networks will be ne01, ne02 .... neNN
Output is nominally written to the screen but is usually redirected to a file.
netsel output
NETSEL:
FTPLOG: PBO_2011026.rx
VELFILE: PBO_all.pos
Number of sites per net: 40
NETSEL: PBO_all.pos contains 1358 sites
NETSEL: PBO_2011026.rx contains 1234 sites
Site Range Long 122.1406 310.1850 Latitude 10.2680 82.4940 deg
NETSEL: For 1234 sites, with nominal 40 sites per network, final selection is:
NETSEL: Fin 39 sites in 32 networks with 25 sites in one network
NETSEL: Number of tie sites 1
#NETWORK Number 001 with 39 sites
# NN # Long Lat Name RK
# 001 1 242.10350 34.12600 AZU1 13
…. List of networks
netsel output and tie
• Algorithm selects sites from highest density regions progressively
working to lower density regions.
• A final network ties the “centroid” sites of each network together
(for the case here, only one tie site).
• Output is a sites.default.yyyy.ddd file to be used in gamit processing.
• -expt code and -netext are normally set to neXX, where XX is the
network number.
• A script file with sh_gamit calls is then passed to sh_PBS_gamit when
running on a cluster.
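The neXX naming convention can be sketched as follows. This is a minimal illustration, not part of GAMIT: the helper names are hypothetical, and the option list contains only the two sh_gamit options named on this slide. All other arguments a real sh_gamit call needs (dates and so on) are deliberately left out.

```python
from typing import List


def network_codes(n_networks: int, code: str = "ne") -> List[str]:
    """Return network names such as ne01, ne02, ... for a 2-character code."""
    if len(code) != 2:
        raise ValueError("network code must be 2 characters")
    if not 1 <= n_networks <= 99:
        raise ValueError("two-digit numbering supports 1 to 99 networks")
    return [f"{code}{i:02d}" for i in range(1, n_networks + 1)]


def sh_gamit_options(net_code: str) -> List[str]:
    """Only the -expt and -netext settings; other arguments are omitted."""
    return ["-expt", net_code, "-netext", net_code]


for name in network_codes(3):
    print(" ".join(sh_gamit_options(name)))
```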
Global Network Selection
• Script sh_network_sel is used with program global_sel to
make sites.defaults.yyyy.ddd files
• This script ftp's lists of available data on a given day and
builds global networks from these lists.
• The core list contains 4-character codes of sites to be included if
they are available
• The reference list gives the initial sites in each network (next
slide).
• Each network shares tie sites with each other network.
The algorithm is based on keeping sites widely separated.
Reference sites
# Reference site lists set initial sites in each network and the number of networks to use.
REF_NET NET1 ONSA|ALGO|KOUR|S071|WDC1|WDC3
REF_NET NET2 AMC2|MATE|KHAJ|KOKB
REF_NET NET3 NYAL|CHUR|CRO1|TWTF
REF_NET NET4 GOL2|NIST|PIE1|WSRT
REF_NET NET5 BREW|STJO|IENG|NOT1
REF_NET NET6 WAB2|BRUS|NLIB|HOB2
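To make the layout of these entries concrete, the sketch below reads REF_NET lines into a dictionary keyed by network name. It is an illustrative parser written for this tutorial (the function name is hypothetical), not the code used by sh_network_sel or global_sel. It assumes that lines starting with # are comments and that site codes are separated by |.

```python
def parse_ref_nets(text):
    """Map network name -> list of 4-character reference site codes."""
    nets = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 3 and fields[0].upper() == "REF_NET":
            nets[fields[1]] = fields[2].split("|")
    return nets


example = """# Reference site lists
REF_NET NET1 ONSA|ALGO|KOUR|S071|WDC1|WDC3
REF_NET NET2 AMC2|MATE|KHAJ|KOKB
"""
ref_nets = parse_ref_nets(example)
print(len(ref_nets), "networks;", "NET1 starts with", ref_nets["NET1"][0])
```

The number of keys in the result is the number of networks, matching the comment in the listing.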
Prototyping tools
• Two new programs are used for prototyping solutions:
– tscon, which converts a variety of data formats into the PBO .pos
format while allowing a new reference frame realization using
techniques similar to GLORG stabilization. Stabilization can be used to
test the selection of reference sites.
– tsfit, which fits time series with a variety of models, some of which can
be specified in the GLOBK .eq file format. tsfit also outputs a globk apriori
coordinate file. Using the realistic sigma option here together with
sh_gen_stats allows process noise to be set for globk (site-dependent
random walk variances).
• An additional program, xyzsave, can be used to
generate XYZ files for use in tscon when the pbo output option was
not specified in the original globk runs. It is highly recommended that
the pbo option be used in all output from globk and glred. The
somewhat new program tssum can be used to extract and append
pbo time series files from globk and glred output files (normally
.org files).
Prototyping concept
• The general idea of the solution prototyping is to generate an
earthquake file and a list of stabilization sites that can be used
in both velocity and time series analysis in GLOBK and GLRED
runs. tsfit can also be used to generate apriori coordinate
files for use in tscon and globk/glred.
• Both tscon and tsfit can read standard globk earthquake and
apriori coordinate files (including EXTENDED entries). The
programs do not manipulate covariance matrices, so it is
assumed that an initial time-series solution exists with
stabilized coordinates (i.e., the output of a glred run with
stabilization).
Process
• Basic processing ordering:
– First run glred to generate time series with the pbo output option set.
This solution might for example use ITRF05 sites for stabilization, or for
more regionally focused networks, globk might be used for a velocity
solution and the good sites from this analysis used as the stabilization
sites in the glred run.
– (There is a "catch-22" here in that knowing which sites are well
behaved requires generating time series first and so these approaches
tend to be iterative with the list of good sites being determined from
their behavior in different analyses.)
– Once the initial time-series are generated, tscon can be used to
generate new time-series with different stabilization sites and with
different apriori coordinate models than those used in the original run.
– Analyses of these time series can be carried out using tsfit to estimate
new apriori coordinate models and additional parameters associated
with seasonal variations, earthquake post-seismic deformation, and
jumps in the time series due to antenna and other instrument changes
and to earthquakes.
Basic Processing (cont.)
– The statistics of the fits to the time series are generated by
tsfit and these can be used to judge the quality of the
analyses. The summary file output by tsfit can be used in
the version of sh_gen_stats with the –ts option.
– Removal of outlier data using an n-sigma condition can
also be performed by tsfit, with the output in standard eq-file format.
– The new coordinate apriori files from tsfit can be used in a
new reference frame realization using tscon. The newly
generated time series can then be used to refine the analysis
further using tsfit. Iterating the reference frame in this
manner could lead to some systematic behaviors, so ideally
the reference frame should be generated with a globk
solution.
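The n-sigma condition mentioned above can be sketched in a few lines. This shows the general idea only. The exact criterion tsfit applies (for example, whether sigmas are rescaled by the scatter of the fit, or whether editing is iterated) is not stated here, so the rule below is an assumption and the function name is hypothetical.

```python
def nsigma_edit(residuals, sigmas, nsigma=3.0):
    """Return indices of points whose |residual| exceeds nsigma * sigma.

    residuals and sigmas are in the same units (e.g. meters).
    """
    if len(residuals) != len(sigmas):
        raise ValueError("residuals and sigmas must have the same length")
    return [
        i
        for i, (res, sig) in enumerate(zip(residuals, sigmas))
        if abs(res) > nsigma * sig
    ]


# Post-fit residuals (m) with 2 mm sigmas; the last point is a 30 mm outlier.
flagged = nsigma_edit([0.001, -0.002, 0.030], [0.002, 0.002, 0.002])
print(flagged)
```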
Prototyping output
• At the completion of the tscon/tsfit process, the following should be
available: an earthquake file that contains earthquakes and
renames for offsets and for time series editing (renames to
_XPS names), and an apriori coordinate file with optional
EXTENDED entries that should provide a good match to the
behavior of the time series.
• A refined list of reference frame sites and process noise
models may also have been generated (sh_gen_stats).
• The earthquake and apriori files and other information can be
used in an updated globk velocity solution or in a glred
repeatability time series run. These final globk and glred
analyses should run with no major problems and would be
used to generate final results.
tsfit
• tsfit is a program to fit PBO-formatted time series using a
globk earthquake file input and other optional parameters
(such as periodic signals). PBO format time series are
generated using the pbo output option in glorg and
program tssum to extract the time series. tssum allows
incremental updates of time series rather than the full
regeneration used by ensum and multibase.
• For the prototyping role, the most important commands
are eq_file (input) and out_aprf and rep_edits (outputs).
• The command line for tsfit is:
– tsfit <command file> <summary file> <list of files/file containing
list>
tsfit commands
• EQ_FILE <File Name>
– Name of standard globk earthquake file. The command may be
used multiple times, as in the latest version of globk.
• OUT_APRF <file name>
– Specifies name of a globk apriori coordinate file to be
generated from the fits. This file contains EXTENDED entries
if needed and can be used directly in globk or tscon.
• REP_EDITS <rename file>
– Set to report edits to file <rename file>. Edit lines start with
R. The rename file, if given, will contain globk rename to _XPS
lines.
• REAL_SIGMA
– Apply the tsview/ensum realistic sigma algorithm to generate
sigmas that account for temporal correlations in the data.
This option is needed to use sh_gen_stats.
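As an illustration of how these four commands fit together, the sketch below assembles the lines of a small tsfit command file. The file names are placeholders and the helper is hypothetical. The leading blank on each line follows the globk convention that a non-blank first column marks a comment. Treating tsfit the same way is an assumption here, so check the tsfit help before relying on it.

```python
def tsfit_command_lines(eq_files, out_aprf, rep_edits=None, real_sigma=True):
    """Build command-file lines using only commands listed on this slide."""
    lines = [f" eq_file {name}" for name in eq_files]  # may repeat eq_file
    lines.append(f" out_aprf {out_aprf}")
    if rep_edits is not None:
        lines.append(f" rep_edits {rep_edits}")
    if real_sigma:
        lines.append(" real_sigma")
    return lines


# Placeholder file names, for illustration only.
cmd_lines = tsfit_command_lines(["network.eq"], "tsfit_out.apr", "tsfit_edits.eq")
print("\n".join(cmd_lines))
```

The resulting text would be saved to a file and passed as the <command file> argument of tsfit.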
Other tsfit commands
• PERIODIC <Period (days)>
– Estimates cosine and sine terms with the given period. This
command may be issued multiple times to estimate signals
with different periods.
• DETROOT <det_root>
– String to be used at the start of the site-dependent
parameter estimate files. Each site generates its own file.
Default is ts_. NONE generates no files.
• VELFILE <vel file name>
– Name of the output file containing velocity estimates in
the standard globk velocity file format.
• NSIGMA <nsigma limit>
– Edit time series based on an n-sigma condition.
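The cosine and sine terms estimated by PERIODIC describe a signal of the form c·cos(2πt/P) + s·sin(2πt/P). The sketch below evaluates such a term and converts the two coefficients to an amplitude and phase. The reference epoch for t and the function names are assumptions made for illustration. tsfit's own conventions are not given here.

```python
import math


def periodic_term(t_days, period_days, cos_coeff, sin_coeff, t_ref_days=0.0):
    """Evaluate c*cos(2*pi*(t - t_ref)/P) + s*sin(2*pi*(t - t_ref)/P)."""
    phase = 2.0 * math.pi * (t_days - t_ref_days) / period_days
    return cos_coeff * math.cos(phase) + sin_coeff * math.sin(phase)


def amplitude_phase(cos_coeff, sin_coeff):
    """Amplitude A and phase phi (radians) such that the term is A*cos(arg - phi)."""
    return math.hypot(cos_coeff, sin_coeff), math.atan2(sin_coeff, cos_coeff)


# Annual signal with 3 mm cosine and 4 mm sine coefficients (values in meters).
amp, phi = amplitude_phase(0.003, 0.004)
print(amp, phi)
```

Issuing PERIODIC twice, for example with annual and semi-annual periods, corresponds to summing two such terms with separate coefficients.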
Other tsfit commands
• MAX_SIGMA <Sig N> <Sig E> <Sig U> meters
– Allows a limit to be set on the sigma of data included in the solutions.
– Default values are 0.1 meters in all three coordinates.
• TIME_RANGE <Start Date> <End Date>
– Allows the time range of data to be processed to be specified. Dates are Year Mon Day Hr Min.
End date is optional.
• OUT_EQROOT <root for Earthquake files> <out days>
– Specifies the root part of the name for earthquake estimate outputs. The outputs are in globk
.vel file format and so can be used with sh_plotvel and velview. The outputs are coseismic
offset and log and exponential coefficient estimates. If the <out days> argument is included,
the total post-seismic motion is computed that many days after each of the earthquakes. If
exponential and log terms are estimated for the same event (same eq_def code), they are
summed and correlations are accounted for in computing the sigmas of the total motion.
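The summation of log and exponential terms, with correlations carried into the sigma, can be sketched as below. The functional forms a·ln(1 + Δt/τ) and b·(1 − exp(−Δt/τ)) are assumed here as the usual post-seismic parameterization. The time constants, the correlation value, and the function name are illustrative, and this is not tsfit's own code.

```python
import math


def total_postseismic(dt_days, log_amp, exp_amp, tau_log, tau_exp,
                      sig_log, sig_exp, corr):
    """Total post-seismic motion dt_days after an event, with its sigma.

    corr is the correlation (-1 to 1) between the two amplitude estimates.
    """
    f_log = math.log(1.0 + dt_days / tau_log)    # partial wrt log amplitude
    f_exp = 1.0 - math.exp(-dt_days / tau_exp)   # partial wrt exp amplitude
    total = log_amp * f_log + exp_amp * f_exp
    # Linear error propagation including the cross-covariance term.
    var = ((f_log * sig_log) ** 2 + (f_exp * sig_exp) ** 2
           + 2.0 * corr * f_log * f_exp * sig_log * sig_exp)
    return total, math.sqrt(max(var, 0.0))


# Same amplitudes and sigmas, with and without a strong negative correlation.
print(total_postseismic(365.0, 0.010, 0.005, 10.0, 100.0, 0.002, 0.003, -0.9))
print(total_postseismic(365.0, 0.010, 0.005, 10.0, 100.0, 0.002, 0.003, 0.0))
```

Log and exponential amplitudes for the same event are often strongly anti-correlated, which is why ignoring the correlation would overstate the sigma of the summed motion.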
tscon
• The program tscon converts time series from Reason/JPL/SIO
XYZ files and SCEC CSV format to PBO time series format and
optionally re-realizes the reference frame used to generate
the time series, both for the formats above and for standard PBO
time series files generated with tssum.
• The program assumes that the position time series are
reported at a regular 1-day interval. This is the normal timing
used in gamit for 24-hr sessions of data.
• The command line for tscon is:
– tscon <dir> <prod_id> <cmd file> <XYZ/PBO files/file with
list>
tscon commands
• Summary of commands:
– eq_file <file name> (may be issued multiple times)
– apr_file <apriori coordinate file> (may be issued multiple times)
– stab_site <list of stabilization sites> (multiple times)
– pos_org <xtran> <ytran> <ztran> <xrot> <yrot> <zrot> <scale>
– stab_ite [# iterations] [Site Relative weight] [n-sigma]
– stab_min [dHsig min pos] [dNEsig min pos]
– cnd_hgtv [Height variance] [Sigma ratio]
– time_range [Start YY,MM,DD,HR,MIN] [End YY,MM,DD,HR,MIN]
• These commands mimic the equivalent glorg commands
and operate in a very similar way. There are some small
differences because tscon starts with frame-realized time
series.
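The translation, rotation, and scale terms named in pos_org are the parameters of a seven-parameter (Helmert) frame transformation. The sketch below applies such a transformation to one XYZ position using the small-angle form x' = x + T + s·x + r × x. Sign and unit conventions differ between packages, so the convention chosen here is an assumption and is not necessarily the one tscon or glorg uses. The function name is hypothetical.

```python
def helmert_apply(xyz, trans=(0.0, 0.0, 0.0), rot=(0.0, 0.0, 0.0), scale=0.0):
    """Apply a small-angle 7-parameter transformation to one position.

    xyz, trans in meters; rot in radians (small angles); scale unitless.
    """
    x, y, z = xyz
    tx, ty, tz = trans
    rx, ry, rz = rot
    # Cross product r x X gives the first-order effect of a small rotation.
    cx = ry * z - rz * y
    cy = rz * x - rx * z
    cz = rx * y - ry * x
    return (x + tx + scale * x + cx,
            y + ty + scale * y + cy,
            z + tz + scale * z + cz)


# A 1 mm X translation and 1 ppb scale change at a point on the equator.
print(helmert_apply((6378137.0, 0.0, 0.0), trans=(0.001, 0.0, 0.0), scale=1e-9))
```

In a stabilization, parameters of this kind are estimated so that the stabilization sites best match their apriori coordinates, and the resulting transformation is then applied to all sites.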
GLOBK Velocity Solutions
• The aim of these solutions is to combine many years of data to generate
position, velocity, offset, and postseismic parameter estimates. It is not
uncommon to have 10000 parameters in these solutions.
• Input requirements for these solutions:
– Apriori coordinate and velocity file. Used as a check on positions in daily
solutions (for editing of bad solutions); adjustments are to the apriori values
(apriori sigmas apply to these values).
– Earthquake file, which specifies when earthquakes, discontinuities, and
misnamed stations affect the solution. It is critical that this file correctly
describes the data.
– Process noise parameters for each station. Critical for generating realistic
standard deviations for the velocity estimates.
Velocity Solution Strategies
• In general, careful setup (i.e., correct apriori coordinate, earthquake, and
process noise files) is needed since each run that corrects a problem can take
several days. Incorrect solutions may not complete correctly.
• Previous methods for constructing these solutions:
– Define a core set of sites (usually 20-200 sites) where the solution runs quickly. Test files on
this solution and use the coordinate/velocity estimates to form the reference frame for time
series generation.
– Generate time series using these reference frame sites and then test them (RMS scatter,
discontinuity tests) to form more complete earthquake and apriori coordinate/velocity files.
– The steps above are repeated, usually increasing the number of stations, until the solution is
complete. As new stations are added, missed discontinuities and bad process noise models can
cause problems.
Velocity strategies
• Other methods that are used to increase speed are:
– Pre-combine daily solutions into weekly to monthly solutions and use these combined
solutions in the velocity solutions. There are many advantages to this approach:
• Runs are much faster. Each processing step takes about the same time with a monthly file as
with a daily file, but there are 30 times fewer files, so runs are about 30 times faster.
• Numerical rounding errors are much smaller when monthlies are used.
• The new MIDP output option refers the solutions to the middle of the month. (Earlier versions
used the last day of the month as reference time, the natural time for a sequential Kalman filter.)
• Random walk process noise models are correct when velocity is NOT estimated in the combinations.
– Run decimated solutions (e.g., one day per week). This works fine, and changing the start day does
not have a large effect due to correlated noise models. Care is needed when results from different
start days are combined to avoid white noise sigma reduction.
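The decimation idea and the caution about combining different start days can be sketched as follows. The helper names are hypothetical. The second function shows the reduction that a plain white-noise average would imply, which is the effect the slide warns against because solutions from adjacent start days share correlated noise.

```python
import math


def decimate(days, step=7, start=0):
    """Keep one day out of every `step`, beginning at index `start`."""
    return list(days)[start::step]


def naive_combined_sigma(sigma, n_solutions):
    """Sigma implied by averaging n solutions as if they were independent."""
    return sigma / math.sqrt(n_solutions)


days = list(range(1, 29))          # day-of-year 1..28
print(decimate(days))              # start on the first day
print(decimate(days, start=3))     # same spacing, different start day
# Treating 7 start-day solutions as independent shrinks sigma by sqrt(7),
# which is too optimistic when the daily noise is temporally correlated.
print(naive_combined_sigma(1.0, 7))
```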