Applications of change point detection in Gravitational Wave Data Analysis

Soumya D. Mohanty AEI

Plan of the talk

• Brief introduction to change point detection and its relevance to GW data analysis
• Contrast with prevalent methods
• Three applications in different areas

26/3/03 UT Brownsville 2

What is a change point?

Signals and Change points

• The most elementary signature of a signal is to introduce a change in the distribution of data.
• Isolating a subset of given data that is significantly different from the rest is the most general signal detection method.
• This division is subject to statistical uncertainty.

Mathematical Statement

• Data described by a joint probability density p(x).
• CP detection: Can the data be divided into disjoint sets y, z (x = y ∪ z), such that p(y) is different from p(z)? Not required to know p(y) or p(z) themselves.
• Adaptive detection: Somehow deduce or estimate a noise distribution p(x). Then, given new data y, test if it could have come from p(x).
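The two statements above can be contrasted in a small sketch. This is a toy illustration, not the method used in the talk: the Kolmogorov-Smirnov statistic, the Gaussian noise model, the sample sizes and all function names are assumptions chosen for the example. CP detection compares two disjoint subsets directly; adaptive detection first estimates a noise model from reference data and then tests new data against it.

```python
import math
import random
import statistics


def ks_two_sample(y, z):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ys, zs = sorted(y), sorted(z)
    i = j = 0
    d = 0.0
    while i < len(ys) and j < len(zs):
        v = min(ys[i], zs[j])
        while i < len(ys) and ys[i] == v:
            i += 1
        while j < len(zs) and zs[j] == v:
            j += 1
        d = max(d, abs(i / len(ys) - j / len(zs)))
    return d


def ks_one_sample(y, cdf):
    """One-sample KS statistic of data y against a model CDF."""
    ys = sorted(y)
    n = len(ys)
    d = 0.0
    for k, v in enumerate(ys):
        f = cdf(v)
        d = max(d, abs((k + 1) / n - f), abs(k / n - f))
    return d


def gaussian_cdf(mu, sigma):
    """CDF of an assumed Gaussian noise model with the given mean and sigma."""
    def cdf(v):
        return 0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0))))
    return cdf


rng = random.Random(0)
quiet = [rng.gauss(0.0, 1.0) for _ in range(400)]  # toy noise
loud = [rng.gauss(0.0, 5.0) for _ in range(200)]   # toy stretch with a changed distribution

# CP detection: compare two disjoint subsets; neither p(y) nor p(z) is modelled.
d_cp_null = ks_two_sample(quiet[:200], quiet[200:])
d_cp_change = ks_two_sample(quiet[:200], loud)

# Adaptive detection: estimate a noise model first, then test new data against it.
model = gaussian_cdf(statistics.mean(quiet[:200]), statistics.stdev(quiet[:200]))
d_ad_null = ks_one_sample(quiet[200:], model)
d_ad_change = ks_one_sample(loud, model)
```

In both cases the statistic is larger when the distribution has changed. The difference is that the adaptive version depends on the assumed Gaussian form of the noise model being right.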

Pros and Cons

Change point detection
• Can go from full prior information to no prior
• Less sensitive
• Possible to tune away response to different types of inhomogeneity
• Post analysis definition required of what is a signal and what is noise

Adaptive detection
• Needs prior information and assumption of stationarity
• More sensitive provided prior information is correct
• Tuning is a complicated process if at all possible
• Signal & noise pre-defined

Applications

• Change point detection in the time frequency plane – burst detection
• Change point detection in a multivariate time series – Data/Detector Characterization Robot
• Two sample comparison – GRB-GW association

Bursts in time-frequency plane

• Time frequency plane – arena for burst detection
  • Example: split time series into segments and FFT each one.
• Basic signature of a burst: changes the distribution of samples in some region of the time-frequency plane.
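The "split into segments and FFT each one" example can be sketched as follows. This is a minimal, noise-free toy: the function names, the segment length and the injected sinusoid are assumptions for illustration, and a direct (slow) DFT stands in for an FFT so that the block needs only the standard library.

```python
import cmath
import math


def dft_power(segment):
    """Power |X_k|^2 for frequency bins k = 0..N/2, computed with a direct DFT."""
    n = len(segment)
    power = []
    for k in range(n // 2 + 1):
        xk = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                 for t, s in enumerate(segment))
        power.append(abs(xk) ** 2)
    return power


def tf_map(series, seg_len):
    """Split the series into non-overlapping segments; one power spectrum per segment."""
    n_seg = len(series) // seg_len
    return [dft_power(series[i * seg_len:(i + 1) * seg_len]) for i in range(n_seg)]


seg_len = 32
series = [0.0] * (4 * seg_len)
# Toy burst: a sinusoid in frequency bin 4, present only in the third segment.
for t in range(2 * seg_len, 3 * seg_len):
    series[t] = math.sin(2 * math.pi * 4 * t / seg_len)

tf = tf_map(series, seg_len)  # tf[time_index][frequency_index]
loudest_bin = max(range(len(tf[2])), key=lambda k: tf[2][k])
```

Here the burst changes the samples only in row 2, bin 4 of the map. With real data every cell also contains noise, so the question becomes whether the distribution of samples in some region differs from the rest.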

• Most burst detection algorithms try to look for this effect in different ways
  • Excess power: thresholds the average (= band limited rms)
  • TFClusters: thresholds cluster size
  • PSDCD (Mohanty, PRD, '99): tests for difference in sample distributions of blocks in TF plane.
• PSDCD is a change point detector, others are adaptive detectors.

Non-parametric CP detection

• Non-parametric detection: the false alarm rate is independent of noise distribution by construction. Sets it apart from other burst detectors.
• A non-stationary time series can be thought of as a sequence of transitions from one noise model to another (e.g. 1 → 10 → ...). A non-parametric detector should maintain a constant false alarm rate even for non-stationary noise.
• CP detection can be tuned to prevent triggering on known technical features.
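The claim that the false alarm rate does not depend on the noise distribution can be checked with a small Monte Carlo. This is a sketch under assumptions: it uses the two-sample Kolmogorov-Smirnov statistic as the non-parametric test, the standard asymptotic 5% threshold, and two toy noise distributions (Gaussian and exponential). None of these choices, nor the function names, come from the slides.

```python
import math
import random


def ks_two_sample(y, z):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ys, zs = sorted(y), sorted(z)
    i = j = 0
    d = 0.0
    while i < len(ys) and j < len(zs):
        v = min(ys[i], zs[j])
        while i < len(ys) and ys[i] == v:
            i += 1
        while j < len(zs) and zs[j] == v:
            j += 1
        d = max(d, abs(i / len(ys) - j / len(zs)))
    return d


def false_alarm_rate(draw, n, trials, threshold, rng):
    """Fraction of change-free trials in which the KS statistic crosses the threshold."""
    alarms = 0
    for _ in range(trials):
        y = [draw(rng) for _ in range(n)]
        z = [draw(rng) for _ in range(n)]  # same distribution: no change point
        if ks_two_sample(y, z) > threshold:
            alarms += 1
    return alarms / trials


n = 50
# Asymptotic 5% threshold for two samples of equal size n: 1.358 * sqrt(2 / n).
threshold = 1.358 * math.sqrt(2.0 / n)
rng = random.Random(1)
rate_gauss = false_alarm_rate(lambda r: r.gauss(0.0, 1.0), n, 1000, threshold, rng)
rate_expo = false_alarm_rate(lambda r: r.expovariate(1.0), n, 1000, threshold, rng)
```

The same threshold is used for both noise types without any retuning, and the two rates should agree to within Monte Carlo error. A threshold on, say, the sample variance would need to be recalibrated for each noise distribution.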

KSCD

• Power Spectral Density Change Detector (PSDCD) [DMT Monitor]
• Kolmogorov-Smirnov test based Change Detector (KSCD)
• KSCD: improvement in detection efficiency and implementation

Trial run on GEO S1 data

• Uncalibrated h(t). 3.47 days (some breaks).
• Plagued by fast non-stationarity in the <1.5 kHz band.
  • 90% - 95% of MTFC triggers could be attributed to this fast non-stationarity.
  • These false triggers skew the interpretation of histograms such as the time interval between triggers.
• KSCD can be tuned to be insensitive to these features but still catch “genuine” glitches.

Rejection of features

Analysis goals

• Disentangle fast low frequency non-stationarity from “genuine” triggers.
• Study time dependent behavior of the triggers.
  • Study trigger rate vis-à-vis band limited rms trend.
    • Does KSCD trigger rate track band limited rms?
  • Tune KSCD to reject triggers but catch fast non-stationarity
    • Analyze the dependence of “genuine” trigger channel on fast non-stationarity channel.

Trigger rate

Future of KSCD

• Test various aspects of non-parametric change point detection using real data (S1 GEO/LIGO, S2 LIGO)
• Understand efficiency (very preliminary: 40% of matched filtering)
• Build LDAS DSO
• KSCD: Main engine of DCR

Data/Detector Characterization Robot

[Flow diagram: DCR]
• All channels
• View data as a single multivariate time series
• Transform the multivariate data
  • Example: construct cross correlation of two channels
• Detect change points
• Design Database
• Data Mining
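The "cross correlation of two channels" transform can be sketched as a correlation coefficient computed window by window, so that a change in coupling between channels shows up as a change in the transformed series. The choice of the Pearson coefficient, the window length, the toy channels and the function names are assumptions for illustration, not the DCR implementation.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def windowed_correlation(ch1, ch2, win):
    """Correlation of the two channels in consecutive non-overlapping windows."""
    return [pearson(ch1[i:i + win], ch2[i:i + win])
            for i in range(0, len(ch1) - win + 1, win)]


rng = random.Random(2)
n, win = 1000, 100
ch1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Toy second channel: independent of ch1 in the first half, coupled to it in the second.
ch2 = [rng.gauss(0.0, 1.0) for _ in range(n // 2)]
ch2 += [x + 0.3 * rng.gauss(0.0, 1.0) for x in ch1[n // 2:]]

corr = windowed_correlation(ch1, ch2, win)  # one value per window
```

Neither channel looks different on its own when the coupling switches on (apart from a small variance change in `ch2`), but `corr` jumps from near 0 to near 1. A change point detector run on `corr` would flag the transition.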

Data Characterization

What is the best analysis strategy given some data?

• Quantify
  • non-stationarity of noise floor
  • Types and rates of transients
  • Drifting carrier frequencies
• Simulate real data and do Monte Carlo studies
• Hopefully, lead to more believable detection of GW signals.

Detector Characterization

• Hunt down sources of deviations from expected ideal behavior and fix them
• To help, interferometers blindly record data from several other sensors
  • control system
  • environment monitors (e.g., temperature)
  • Seismometers, magnetometers

Change Points

Mathematical abstraction of the problem

• Main interest in both data and detector characterization: change points
  • Example: transients, change in rate of transients, non-stationarity, change in coupling between two channels
• Natural conclusion: build database of change points using automated algorithms and analyse the database

Analysis of databases

• Exploratory
  • Limited to small databases of high confidence detections
• Data mining
  • Emerging field of synthesis between statistics and computing – aim is to detect new, informative patterns in huge databases
  • Requires reliable database quality

DCR project

• Overall Aim: enable data mining of multi channel interferometric data
• Elements:
  • Algorithms – few, well understood and complementary (not an arbitrary set of independent simple monitors)
  • Software/Hardware
  • Data mining

Algorithms in DCR

• Change point detector – KSCD
  • generalized to the case of cross-spectral density of two channels
• Line removal – MBLT
  • no modeling required of line behavior
  • transient resistant
• Robust noise floor tracking – MNFT
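What "robust" and "transient resistant" mean for noise floor tracking can be illustrated with a running median, which is one common robust estimator. The slides do not describe how MBLT or MNFT work internally, so the running median, the window size and the function names below are assumptions for illustration only.

```python
import statistics


def running_median(x, half_width):
    """Median over a sliding window of up to 2 * half_width + 1 samples."""
    out = []
    for i in range(len(x)):
        lo = max(0, i - half_width)
        hi = min(len(x), i + half_width + 1)
        out.append(statistics.median(x[lo:hi]))
    return out


def running_mean(x, half_width):
    """Mean over the same sliding window, shown for comparison (not robust)."""
    out = []
    for i in range(len(x)):
        lo = max(0, i - half_width)
        hi = min(len(x), i + half_width + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


# Toy band power series: a flat noise floor with a short, loud transient.
power = [1.0] * 100
power[48:53] = [100.0] * 5

median_floor = running_median(power, 10)
mean_floor = running_mean(power, 10)
```

The running median ignores the five-sample transient because it occupies less than half of the 21-sample window, while the running mean is pulled far above the true floor. A floor estimate that transients cannot bias is what makes a later threshold or change point test meaningful.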

Sample Power Spectral Density

DCR implementation

• Core Digital Signal Processing library in C++
  • Template based Statistics and Signal Processing library (TSSP). Uses STL.
  • FFT, Filtering, Filter Design, Windows, PSD, Modulation, Demodulation, ...
• Stand alone C++ main function for a given pipeline

Stand alone code

• Frame reading class
  • Multiple ADC channels
• Database IO class (uses MySQL)
  • Database to be used for both job description and storing job outputs
• Multiple jobs launched using Condor
• At present: dedicated 10 node cluster (Linux-alpha)

GRB-GW association

• Finn, Mohanty, Romano, PRD, 1999
• Based on two sample comparison
  • on-source sample
  • off-source sample
• Two sample tests also used in CP detection

Introduction to Gamma-Ray Bursts

• High-energy, short-duration electromagnetic radiation from extra-galactic sources
• Favored models point to exploding fireball
  • Involve large amounts of matter,
  • ejected at relativistic speeds,
  • producing a series of high energy E/M shockwaves:
    • initially gamma-rays (some redshift to lower-energy gamma-rays or X-rays, others are absorbed),
    • then X-rays (red-shifted to optical wavelengths),
    • then visible light (red-shifted to radio wavelengths)

http://online.itp.ucsb.edu/online/gamma_c99/piran/oh/06.html

GRBs and Gravitational Waves

• GRB progenitors thought to be newly formed Black Holes
• Black Hole formed as a result of massive stellar collapse or binary NS mergers
• BH accretes debris rapidly
• Leads to beams of ultra-relativistic ejecta
• This violent scenario is also a natural candidate for strong GW emission

Motivation for an FMR type search

• GRBs occur at cosmological distances. Hence chance of detecting GWs from an individual GRB is small
• However, GRB astronomy is very active
  • Relatively large number of events were detected (~O(1/day)) by BATSE
  • Several more missions coming up soon (e.g., SWIFT and GLAST)
• FMR: Combine information from several triggers to build up signal to noise ratio

Algorithm

• Cross-correlate time series between two interferometers for each GRB trigger
  • time shift segments to align GW signal
• Compare cross-correlation to times not associated with GRBs
• Build an on-source and an off-source sample of cross-correlations
• Test if the mean values of the two samples are significantly different
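The steps above can be sketched on simulated data. Everything specific here is an assumption made for the toy: white Gaussian noise in both detectors, a weak common sinusoid as the "GW signal", segments that are already time aligned, and Welch's t statistic as the test for a difference in means (the slide does not say which test is used).

```python
import math
import random
import statistics


def cross_correlation(h1, h2):
    """Zero-lag cross-correlation of two equal-length, already aligned segments."""
    return sum(a * b for a, b in zip(h1, h2))


def welch_t(on, off):
    """Welch's t statistic for the difference between the two sample means."""
    se = math.sqrt(statistics.variance(on) / len(on)
                   + statistics.variance(off) / len(off))
    return (statistics.mean(on) - statistics.mean(off)) / se


rng = random.Random(3)
seg_len = 64
# Hypothetical weak signal common to both detectors (well below the noise level).
signal = [0.5 * math.sin(2 * math.pi * 5 * t / seg_len) for t in range(seg_len)]


def simulate_cc(with_signal):
    """Cross-correlation of one simulated pair of detector segments."""
    h1 = [rng.gauss(0.0, 1.0) for _ in range(seg_len)]
    h2 = [rng.gauss(0.0, 1.0) for _ in range(seg_len)]
    if with_signal:
        h1 = [n + s for n, s in zip(h1, signal)]
        h2 = [n + s for n, s in zip(h2, signal)]
    return cross_correlation(h1, h2)


on_source = [simulate_cc(True) for _ in range(100)]    # one value per GRB trigger
off_source = [simulate_cc(False) for _ in range(400)]  # times not associated with GRBs
t_stat = welch_t(on_source, off_source)
```

In this toy a single on-source cross-correlation is comparable to its own noise scatter, so one trigger alone would not be convincing. The shift in the mean over 100 triggers is what makes `t_stat` large, which is the sense in which combining triggers builds up signal to noise ratio.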

Implementation

• External Triggers subgroup of Bursts Upper Limit group
  • S. Marka, R. Rahkola, S. Mohanty, S. Mukherjee, R. Frey
• Could not apply FMR in toto for S1 because only one trigger received during double lock (LIGO tech note)
• Already have 15 triggers for S2!

Issues

• Non-stationarity of data
  • Data conditioning – line removal
  • Noise floor tracking – MNFT
• Lack of directional accuracy
  • Use H1+H2 – but strong (non-stationary?) correlations
• How to best use multiple interferometers
• Systematic uncertainties
  • Rely on signal injection and Monte Carlo simulations
  • DCR – simulate real data?

Summary

• Applications of change point detection in GW data analysis
• Exploration of such techniques has only just started
• Offers better control on data analysis with real, complicated data
• Improvements in efficiency possible. Can be combined with adaptive methods.
