Global FDR analysis - Stanford University

Differential Analysis & FDR Correction
Differential Analysis Steps
Step 1: Construction of input data table in EXCEL
Step 2: Save EXCEL file into tab delimited txt file
Step 3: Upload data - tab delimited txt file
Step 4: Choose T or U Test
Step 5: Enter your email and submit
Step 6: Result interpretation: global FDR
Step 7: Result interpretation: local FDR
Step 1: Construction of input data table in EXCEL
Step 1: Input data format
• Cell A1: “CLASS”
• 1st column: feature names
• 1st row: sample categories
  • Must be binary, either 1 or 0
  • e.g. 1 is disease, 0 is control
• All other cells hold data, one sample per column
  • e.g. array intensity or protein quantities
EXCEL file example:
CLASS          1    1    0    0
Gene.1.name    …    …    …    …
Gene.2.name    …    …    …    …
…              …    …    …    …
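The format rules above can be checked before uploading. The following is a minimal sketch, not part of the tool itself: the function name `validate_input_table` and the numeric values in the example are made up for illustration.

```python
def validate_input_table(rows):
    """Check a table (a list of rows, each a list of cells) against the input format.

    Returns a list of problem descriptions; an empty list means no problem was found.
    """
    if not rows or not rows[0]:
        return ["table is empty"]
    problems = []
    header = rows[0]
    if str(header[0]).strip() != "CLASS":
        problems.append("cell A1 must be 'CLASS'")
    labels = [str(cell).strip() for cell in header[1:]]
    if any(label not in ("0", "1") for label in labels):
        problems.append("sample categories in the 1st row must be 1 or 0")
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(f"row {number} has {len(row)} cells, expected {len(header)}")
            continue
        if not str(row[0]).strip():
            problems.append(f"row {number} has no feature name")
        for cell in row[1:]:
            try:
                float(cell)
            except (TypeError, ValueError):
                problems.append(f"row {number} contains a non-numeric value: {cell!r}")
                break
    return problems


# Made-up example: two disease samples (1) and two control samples (0).
example = [
    ["CLASS", 1, 1, 0, 0],
    ["Gene.1.name", 5.1, 4.8, 2.0, 2.3],
    ["Gene.2.name", 1.2, 1.1, 1.3, 1.0],
]
print(validate_input_table(example))
```

An empty list printed here means the example table passed every check.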
Step 2: Save EXCEL file into tab delimited txt file
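For readers who build the table in a script rather than in EXCEL, an equivalent tab delimited file can be written with Python's standard `csv` module. This is a sketch under that assumption; the helper names and the data values are hypothetical.

```python
import csv
import os
import tempfile


def write_tab_delimited(rows, path):
    """Write rows (lists of cell values) as a tab delimited text file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle, delimiter="\t").writerows(rows)


def read_tab_delimited(path):
    """Read a tab delimited text file back into a list of rows (cells are strings)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle, delimiter="\t"))


# Made-up values; the layout follows the input format of Step 1.
table = [
    ["CLASS", 1, 1, 0, 0],
    ["Gene.1.name", 5.1, 4.8, 2.0, 2.3],
]
path = os.path.join(tempfile.mkdtemp(), "input.txt")
write_tab_delimited(table, path)
round_trip = read_tab_delimited(path)
print(round_trip[0])
```

Reading the file back is a quick way to confirm that each cell ended up in its own tab separated column.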
Step 3: Upload data - tab delimited txt file
Input data “input.txt” selected
Step 4: Choose T test or U test
Choose either T or U test for analysis
Step 4: T test or U test, which one to choose?
• The U test is useful in the same situations as the t test
• The U test should be used if the data are ordinal
• The U test is more robust to outliers
• The U test is more efficient than the t test
  • for distributions far from normal and for sufficiently large samples
To Discover Differential Features:
Student’s T test or Mann-Whitney U test?
Student’s T test:
Student’s T test is a parametric test of the null hypothesis that the means of 2 normally distributed populations are equal. It is used when you have a nominal variable with exactly 2 values, such as “male” and “female,” and a measurement variable, and you want to compare the mean values of the measurement variable between the 2 groups.
Mann-Whitney U Test:
The Mann-Whitney U test is a non-parametric test that examines whether 2 sets of data could have come from the same population. The 2 data sets do not need to be paired, normally distributed, or of equal size.
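To make the two tests concrete, the sketch below computes the pooled-variance Student's t statistic and the Mann-Whitney U statistic for one feature using only the standard library. It stops at the test statistics; P values would normally come from a statistics package (for example `scipy.stats.ttest_ind` and `scipy.stats.mannwhitneyu`). The function names and sample values are illustrative, and the slides do not say which implementation the tool uses.

```python
import math
import statistics


def student_t_statistic(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance (equal variances assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs where a > b count 1, ties count 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Made-up intensities for one feature: class 1 (disease) versus class 0 (control).
disease = [5.1, 4.8, 5.5]
control = [2.0, 2.3, 2.1]
t_value = student_t_statistic(disease, control)
u_value = mann_whitney_u(disease, control)
print(t_value, u_value)
```

Because the U statistic depends only on the ordering of the values, a single extreme measurement changes it far less than it changes the t statistic.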
Step 5: Enter your email and submit
Enter your email
Submit
Step 6: Result interpretation
Global FDR
FDR plot legend:
• Red line: Total Discoveries (TD), or total discovery rate TD/TD = 1
• Green line: False Discoveries (MEAN), or False Discovery Rate FDR (MEAN)
• Black bar line: False Discoveries (MEDIAN), or False Discovery Rate FDR (MEDIAN)
• Blue line: False Discoveries (95%), or False Discovery Rate FDR (95%)
• Dotted black line: FDR = 0.05
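The mean, median, and 95% curves summarize a set of false-discovery counts at each P-value threshold, divided by the total discoveries. The slides do not state how the tool obtains those counts; the sketch below simply assumes a list of counts is available (for instance from repeated null runs such as label permutations) and uses a nearest-rank 95th percentile. All names and numbers are hypothetical.

```python
import math
import statistics


def fd_over_td_summary(false_discovery_counts, total_discoveries):
    """Summarize FD/TD at one P-value threshold as mean, median, and 95th percentile."""
    if total_discoveries <= 0:
        raise ValueError("total_discoveries must be positive")
    ordered = sorted(false_discovery_counts)
    rank_95 = math.ceil(0.95 * len(ordered))  # nearest-rank percentile (an assumption)
    return {
        "mean": statistics.mean(ordered) / total_discoveries,
        "median": statistics.median(ordered) / total_discoveries,
        "95%": ordered[rank_95 - 1] / total_discoveries,
    }


# Made-up counts from 10 hypothetical null runs, with 20 total discoveries.
summary = fd_over_td_summary([0, 1, 1, 2, 2, 2, 3, 3, 4, 6], total_discoveries=20)
print(summary)
```

Repeating this at every P-value threshold would give one point per curve per threshold, which is the shape of the plot described in the legend.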
[Figure: global FDR plot, panel A. Y-axis: Global FDR (0.0 to 1.0). X-axis: single hypothesis test P-value thresholds. Curves: 1 = TD/TD, 95% FD/TD, Mean FD/TD, Median FD/TD, and the .05 FDR line.]
Step 6: How to read the gFDR plots
• Commonly used global FDR cutoff: 0.05
• If there are no significant features, no data points will show up below the 0.05 dotted horizontal line
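The reading rule above amounts to keeping the plotted points whose global FDR lies below the cutoff. A minimal sketch, assuming the curve is available as two parallel lists (hypothetical names and made-up values):

```python
def points_below_cutoff(thresholds, gfdr_values, cutoff=0.05):
    """Return (P-value threshold, global FDR) pairs that fall below the gFDR cutoff."""
    return [(t, g) for t, g in zip(thresholds, gfdr_values) if g < cutoff]


# Made-up curve: global FDR estimated at five P-value thresholds.
thresholds = [0.01, 0.02, 0.03, 0.04, 0.05]
gfdr_values = [0.01, 0.03, 0.06, 0.10, 0.20]
significant = points_below_cutoff(thresholds, gfdr_values)
print(significant)
```

An empty result corresponds to the case in which no data points appear below the dotted line.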
[Figure: global FDR plot, panel A, annotated with the commonly used gFDR cutoff of 0.05 and the features which satisfy global FDR < 0.05. Y-axis: Global FDR. X-axis: single hypothesis test P-value thresholds.]
Step 7: Result interpretation
local FDR
[Figure: local FDR plot. Y-axis: Local FDR. X-axis: single hypothesis test P-value.]
Step 7: How to read the lFDR plots
Aubert et al. (2004) suggested that the first abrupt change in the local FDR can indicate a good threshold for selecting genuinely statistically significant features.
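One way to make "first abrupt change" operational is to scan the local FDR values, ordered by ascending P value, for the first jump larger than some step size. The sketch below does that; the `min_jump` parameter and its default are assumptions made for illustration, not a rule taken from Aubert et al. or from the tool.

```python
def first_abrupt_change(lfdr_values, min_jump=0.1):
    """Return the index of the first value that rises by more than min_jump
    over its predecessor, or None if there is no such jump.

    lfdr_values must be ordered by ascending P value.
    """
    for i in range(1, len(lfdr_values)):
        if lfdr_values[i] - lfdr_values[i - 1] > min_jump:
            return i
    return None


# Made-up local FDR values for seven features sorted by P value.
lfdr = [0.01, 0.012, 0.015, 0.02, 0.30, 0.35, 0.60]
change_at = first_abrupt_change(lfdr)
print(change_at)
```

Features at positions before the returned index would be the candidates to keep; in practice the jump is usually judged by eye on the plot.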
[Figure: local FDR plot with the 1st abrupt change of lFDR marked. Y-axis: Local FDR. X-axis: single hypothesis test P-value.]
Click to download result file
Local FDR results:
• 1st column: feature name
• 2nd column: t or U test P value
• 3rd column: local FDR results
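The three-column result can be loaded for further filtering with a few lines of Python. This sketch assumes the downloaded file is tab delimited, which the slides do not confirm; rows whose 2nd or 3rd column is not numeric (such as a possible header) are skipped. The function name and sample values are hypothetical.

```python
import csv
import io


def read_lfdr_results(handle):
    """Parse rows of (feature name, t or U test P value, local FDR) from a text stream."""
    results = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 3:
            continue
        try:
            results.append((row[0], float(row[1]), float(row[2])))
        except ValueError:
            continue  # skip rows with non-numeric columns, e.g. a header
    return results


# Made-up file content standing in for a downloaded result file.
sample = "feature\tP\tlFDR\nGene.1.name\t0.0004\t0.02\nGene.2.name\t0.31\t0.85\n"
results = read_lfdr_results(io.StringIO(sample))
by_lfdr = sorted(results, key=lambda item: item[2])
print(by_lfdr)
```

Sorting by the 3rd column puts the features with the lowest local FDR first.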