ISTA Workshop on Statistical Aspects of GMO Detection


Generalities & Qualitative Testing Plans

May 8-10, 2006, Iowa State University, Ames, USA
Jean-Louis Laffont, Kirk Remund

Objectives

• Introduce Acceptance Sampling
  – review assumptions
  – definitions
  – understand strengths & limitations
• Use with a qualitative assay
  – zero tolerance plans
  – plans that allow deviants
  – purity testing

ISTA Statistics Committee

Challenges: random sampling variability

[Figure: repeated samples drawn from one seed lot give different impurity estimates: 0.09%, 0.07%, 0.12%, 0.11%, 0.05%.]

Challenges: Sampling & Assay Variability

[Figure: seed lot, sample, sample prep, and assay (PCR) stages, annotated with the values 0.15%, 0.12%, < 0.10%, and 0.09%.]

Benefits of acceptance sampling approach

1. Manage sampling variability & assay errors
2. Maintain flexibility: seed pooling schemes, single or double stage testing
3. Maintain confidence in decisions – “We are 95% confident that the GMO presence in this lot is < 0.1%”

Assumption: Representative Sample

• Definition 1
  – “Obtain sample so that each seed has an equal and independent chance of being selected [called a simple random sample (SRS)]”
  – Index every seed, pick random numbers, obtain indexed seeds ...
  – Good idea?
  [Figure: seeds indexed 1, 2, 3, 4, 5, ..., 1,000,000,000]
• Definition 2: mimic SRS sample
  – bag sampling (ISTA rules)
  – probe sampling (uniform grid)
  – systematic sampling

Probe sampling

• Sampling bulk containers (e.g., trucks or bins)
• Often a reasonable approach if heterogeneity occurs as horizontal or inverted cone layers
• Sampling collection point: probe the depth of the container

Systematic sampling

• Sample a flow of seed at regular time intervals
  – flow from hopper bottom truck
  – flow from a silo
• More samples as heterogeneity increases
• Collect each sample by cutting through the entire stream of flowing seed
• Caution: make sure there is no cyclic behavior in the flow that correlates with the sampling interval

Obtaining Pools to Evaluate Bulk Characteristics

[Figure: obtaining a sample. Flow: seed lot, primary samples, …, composite sample, submitted sample, seed pools (bulks) for testing.]

Assumption: Seed lot is large

• Sample size should be no larger than 10% of population
• This condition must hold to use Seedcalc or Qalstat
• If this assumption is not met, we must use methods based on the hypergeometric distribution
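The effect of the large-lot assumption can be sketched numerically. The comparison below contrasts the binomial acceptance probability (lot treated as effectively infinite) with the exact hypergeometric one (sampling without replacement from a finite lot). The plan, impurity, and lot sizes are hypothetical values chosen for illustration, not figures from the workshop.

```python
from math import comb

def binom_cdf(c, n, p):
    """P(X <= c) for X ~ Binomial(n, p): lot treated as effectively infinite."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

def hypergeom_cdf(c, lot_size, deviants, n):
    """P(X <= c) when n seeds are drawn without replacement from a finite lot."""
    total = comb(lot_size, n)
    favourable = sum(comb(deviants, k) * comb(lot_size - deviants, n - k)
                     for k in range(c + 1))
    return favourable / total

# Hypothetical plan: test 400 seeds, accept with at most 1 deviant, 1% impurity.
n, c, p = 400, 1, 0.01
for lot_size in (1_000, 4_000, 1_000_000):
    deviants = round(lot_size * p)
    exact = hypergeom_cdf(c, lot_size, deviants, n)
    approx = binom_cdf(c, n, p)
    print(f"lot={lot_size:>9,}  sample fraction={n / lot_size:.2%}  "
          f"hypergeometric={exact:.4f}  binomial={approx:.4f}")
```

When the sample is a large fraction of the lot the two probabilities differ noticeably; for a very large lot they are practically identical.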

Acceptance sampling for qualitative assays

[Figure: seed lot, then sample of seeds, then X deviant seeds found. If X > C, reject lot; if X ≤ C, accept lot.]
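The accept/reject rule can be illustrated with a small simulation. This is a minimal sketch that assumes each sampled seed is independently deviant with probability equal to the lot impurity (the large-lot, representative-sample assumptions) and that the assay makes no errors. The function names and the plan parameters are hypothetical.

```python
import random

def accept_lot(sample, c):
    """Apply the rule: accept when the number of deviant seeds X is <= c."""
    x = sum(sample)          # sample is a list of booleans, True = deviant seed
    return x <= c

def simulate_acceptance_rate(p, n, c, trials=1_000, seed=1):
    """Fraction of simulated samples of n seeds that lead to acceptance."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(trials):
        sample = [rng.random() < p for _ in range(n)]
        accepted += accept_lot(sample, c)
    return accepted / trials

# Hypothetical plan: n = 400 seeds, acceptance number c = 1.
for p in (0.001, 0.005, 0.01):
    rate = simulate_acceptance_rate(p, n=400, c=1)
    print(f"true impurity {p:.2%}: accepted in {rate:.1%} of simulated samples")
```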

Definitions

• LQL = lower quality limit
  – highest level of impurity that is acceptable to consumer
  – “95% confident that seed impurity is below 1%” (LQL=1%)
• AQL = acceptable quality level
  – level of impurity that is acceptable to producer and consumer
  – Some definitions
    • Conservative: producer can produce seed at this impurity level or below
    • Practical: process average
    • Set in relation to threshold

Definitions, cont.

[Figure: % impurity axis (ticks at 0%, 0.15%, 0.2%, 0.5%) with AQL and LQL marked.]

Definitions, cont.

• Consumer Risk = chance of accepting “bad” lot (lot impurity = LQL)
  – also called beta (β)
• Producer Risk = chance of rejecting “good” lot (lot impurity = AQL)
  – also called alpha (α)

Operating characteristic (OC) curve

[Figure: Ideal OC Curve. x-axis: True Impurity in Lot, with AQL and LQL marked; y-axis: 0% to 100%. Annotations: “High chance of accepting lot at AQL (alpha)”, “High chance of rejecting lot at LQL (beta)”. Region labels along the axis: “want these”, “whatever”, “don’t want these”.]

OC curves, cont.

[Figure: OC curves for AQL=0.5%, LQL=1.0%. x-axis: True Impurity Level (%), 0.00% to 2.00%; y-axis: 0% to 100%. Legend: n=400, c=1; Large n; n=400, c=4. Annotations: “Poor Testing Plan: low producer risk, high consumer risk”; “Good Testing Plan: low producer risk, low consumer risk”; “Poor Testing Plan: high producer risk, low consumer risk”.]
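Producer and consumer risk for the two n=400 plans named in the legend can be computed directly. This is a sketch assuming a binomial model (large lot, individual seeds tested, no assay errors), with AQL=0.5% and LQL=1.0% as on the slide; the function name is hypothetical.

```python
from math import comb

def prob_accept(p, n, c):
    """P(X <= c) for X ~ Binomial(n, p): one point on the OC curve."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

AQL, LQL = 0.005, 0.010

for n, c in [(400, 1), (400, 4)]:
    producer_risk = 1 - prob_accept(AQL, n, c)   # reject a lot that is at AQL
    consumer_risk = prob_accept(LQL, n, c)       # accept a lot that is at LQL
    print(f"n={n}, c={c}: producer risk={producer_risk:.1%}, "
          f"consumer risk={consumer_risk:.1%}")
```

Under this model the c=1 plan has the higher producer risk and the c=4 plan has the higher consumer risk, which is the trade-off the two "poor plan" annotations describe.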

LQL & AQL in relation to threshold

• LQL = threshold, AQL = what producer can deliver
• LQL = 2 x threshold, AQL = ½ x threshold (similar to tolerance approach)

[Figure: two panels, each with regions labeled “Retest” and “Acceptance”. x-axis: Actual % Impurity in Lot, 0 to 2.5.]

Reducing Costs: Testing Seed Pools Rather than Individuals

[Figure: 5 seed pools, 300 seeds per pool]

• Works well in testing for adventitious presence
• Assay must be able to detect one GM seed in pool of all conventional seed with high confidence
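Pooling works because a pool tests positive when it contains at least one deviant seed. A minimal sketch of that relationship follows, assuming seeds are independent and the assay detects a single GM seed in the pool without error; the function name is hypothetical.

```python
def prob_pool_positive(p, pool_size):
    """Chance that a pool of pool_size seeds holds at least one deviant seed."""
    return 1 - (1 - p) ** pool_size

for p in (0.0001, 0.001, 0.005, 0.01):
    q = prob_pool_positive(p, 300)
    print(f"lot impurity {p:.2%}: a pool of 300 is positive with probability {q:.3f}")
```

At low impurity most pools are negative, so a handful of pooled assays carries nearly as much information as hundreds of single-seed assays.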

Challenge: setting the threshold

Option 1: require true zero threshold
  result: test all seed in entire lot...

Option 2: “zero tolerance” in sample
  result 1: hidden non-zero threshold
    Example: USDA recommendation for StarLink (Cry9C): test 2400 seeds and allow zero positives. This yields a 0.19% threshold rather than zero.
  result 2: high cost to producer
    Throw away a lot of good seed due to false positives and sampling variability
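The hidden threshold behind a zero-tolerance sample can be worked out by solving (1 - p)^n = 1 - confidence for p. The sketch below assumes a binomial model with no assay errors. The slide does not state the confidence level behind the 0.19% figure; this calculation gives roughly 0.19% at 99% confidence, so that level is an inference rather than a stated fact.

```python
def hidden_threshold(n_seeds, confidence):
    """Impurity p at which zero positives among n_seeds still happens with
    probability 1 - confidence, i.e. the solution of (1 - p)**n_seeds = 1 - confidence."""
    return 1 - (1 - confidence) ** (1 / n_seeds)

for confidence in (0.95, 0.99):
    p = hidden_threshold(2400, confidence)
    print(f"2400 seeds, zero positives, {confidence:.0%} confidence: "
          f"impurity below about {p:.3%}")
```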

Challenge: setting the threshold, cont.

Option 3: set reasonable non-zero threshold, allow for some positives
  result 1: manage consumer and producer risks to acceptable levels
  result 2: better manage impact of assay errors on results
  result 3: most seed approved for sale will be much lower than threshold (e.g., 3 or 10 times lower)

Zero Tolerance Plans

[Figure: OC curve for zero tolerance plans with AQL=0.5% and LQL=1.0% marked. x-axis: True Impurity Level (%), 0.00% to 2.00%; y-axis: 0% to 100%.]

The Perfect Plan

• Reject 0% of “Good” Lots
• Accept 0% of “Bad” Lots

[Figure: bar chart over True Lot Impurity (0.01%, 0.10%, 0.25%, 0.50%, 1.50%, 2.00%); y-axis: 0% to 50%; regions labeled Accept and Reject.]

Zero Tolerance Plan - Test one pool of 300

• Reject ~20% of “Good” Lots
• Accept <1% of “Bad” Lots

[Figure: bar chart over True Lot Impurity (0.01%, 0.10%, 0.25%, 0.50%, 1.50%, 2.00%); y-axis: 0% to 50%; regions labeled Accept and Reject.]

Almost Perfect Plan: Test 6 pools of 300, accept 4 deviant pools or less

• Reject 5% of “Good” Lots
• Accept <1% of “Bad” Lots

[Figure: bar chart over True Lot Impurity (0.01%, 0.10%, 0.25%, 0.50%, 1.50%, 2.00%); y-axis: 0% to 50%; regions labeled Accept and Reject.]

OC curves for two testing plans

[Figure: OC curves for “1 pool of 300” and “6 pools of 300”. x-axis: Actual % Impurity in Lot, 0 to 2.5; y-axis: 0% to 100%.]
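The two pooled plans can be compared point by point on their OC curves. This sketch assumes a binomial model on pools with an error-free assay, and takes the plans as stated on the earlier slides: one pool of 300 with zero positives allowed, and six pools of 300 with up to four positive pools allowed. The function name is hypothetical.

```python
from math import comb

def prob_accept_pooled(p, n_pools, pool_size, max_positive_pools):
    """P(accept) when a lot is accepted with at most max_positive_pools positive pools."""
    q = 1 - (1 - p) ** pool_size            # chance that one pool tests positive
    return sum(comb(n_pools, k) * q**k * (1 - q)**(n_pools - k)
               for k in range(max_positive_pools + 1))

plans = {
    "1 pool of 300, accept 0 positive": (1, 300, 0),
    "6 pools of 300, accept <= 4 positive": (6, 300, 4),
}
for p in (0.0001, 0.001, 0.0025, 0.005, 0.015, 0.02):
    row = ", ".join(f"{name}: {prob_accept_pooled(p, *plan):.3f}"
                    for name, plan in plans.items())
    print(f"impurity {p:.2%} -> {row}")
```

The six-pool plan accepts low-impurity lots far more often while still rejecting almost all lots at 1.5% impurity or above.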

Hypothetical situation: “Ten seed pools of 300 seeds each are tested from a conventional seed lot and 5 pools test positive for adventitious presence. The lot is labeled as having less than 1% adventitious presence and it is shipped.”

Should they have shipped the lot?

Yes.

• 10 pools of 300 seeds each: can see up to 7 positive pools and still have 95% confidence that the true lot impurity is below the 1% threshold
• 60 pools of 50 seeds each: can see up to 17 positive pools and still have 95% confidence that the true lot impurity is below the 1% threshold
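The maximum number of positive pools can be found by searching for the largest acceptance number whose consumer risk at the threshold stays at or below 5%. This is a sketch under a binomial model on pools with no assay errors; it is not the Seedcalc implementation, and the function names are hypothetical.

```python
from math import comb

def prob_accept_pooled(p, n_pools, pool_size, c):
    """P(at most c positive pools) when the true lot impurity is p."""
    q = 1 - (1 - p) ** pool_size
    return sum(comb(n_pools, k) * q**k * (1 - q)**(n_pools - k) for k in range(c + 1))

def max_positive_pools(threshold, n_pools, pool_size, confidence=0.95):
    """Largest c with P(accept | impurity = threshold) <= 1 - confidence.
    Returns None if even c = 0 does not give the requested confidence."""
    best = None
    for c in range(n_pools + 1):
        if prob_accept_pooled(threshold, n_pools, pool_size, c) <= 1 - confidence:
            best = c
        else:
            break
    return best

print(max_positive_pools(0.01, n_pools=10, pool_size=300))
print(max_positive_pools(0.01, n_pools=60, pool_size=50))
```

Under these assumptions the search gives 7 for ten pools of 300 and 17 for sixty pools of 50, matching the figures quoted above, so 5 positive pools out of 10 is within the plan.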

OC Curves for two testing plans

[Figure: OC curves for “60 pools of 50 seeds” and “10 pools of 300 seeds”. x-axis: Actual % Impurity in Lot, 0 to 2.5; y-axis: 0% to 100%.]

More definitions

• False negative rate (FNR)
  – probability that a positive sample tests negative
  – PCR failures, DNA problems, …
• False positive rate (FPR)
  – probability that a negative sample tests positive
  – DNA contamination, …

Assay Error Impact (pool size = 1)

[Figure: curves for No Errors, 10% false negative rate, 20% false negative rate, 1% false positive rate, and 2% false positive rate. x-axis: % Deviants in Lot, 0 to 8; y-axis: 0 to 100.]
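One way to see how assay errors shift an acceptance probability is to fold FNR and FPR into the chance that a single tested seed reads positive. The sketch below assumes pool size 1 and independent errors. The plan (n=100, c=2) is hypothetical, since the plan behind the figure is not given in the transcript.

```python
from math import comb

def prob_test_positive(p, fnr, fpr):
    """Chance that one tested seed reads positive: a deviant seed that is not
    missed, or a non-deviant seed that is wrongly flagged."""
    return p * (1 - fnr) + (1 - p) * fpr

def prob_accept(p, n, c, fnr=0.0, fpr=0.0):
    """P(at most c positive results among n individually tested seeds)."""
    q = prob_test_positive(p, fnr, fpr)
    return sum(comb(n, k) * q**k * (1 - q)**(n - k) for k in range(c + 1))

n, c = 100, 2   # hypothetical plan, for illustration only
for label, fnr, fpr in [("no errors", 0.0, 0.0),
                        ("10% FNR", 0.10, 0.0),
                        ("1% FPR", 0.0, 0.01)]:
    probs = [prob_accept(p, n, c, fnr, fpr) for p in (0.01, 0.02, 0.04)]
    print(label, [f"{x:.3f}" for x in probs])
```

False negatives raise the acceptance probability (more consumer risk), while false positives lower it (more producer risk).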

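Double Stage Testing Plan

[Figure: flowchart]
• Stage 1: test N1 seeds and count X1 deviants
  – X1 ≤ a: accept lot
  – X1 ≥ b: reject lot
  – a < X1 < b: go to stage 2
• Stage 2: test N2 more seeds and count X2 deviants
  – X1 + X2 ≤ c: accept lot
  – X1 + X2 > c: reject lot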
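The acceptance probability of a double stage plan can be sketched as follows. The rule assumed here is the usual double sampling scheme: accept after stage 1 if X1 ≤ a, reject if X1 ≥ b, otherwise test N2 more seeds and accept if X1 + X2 ≤ c. The comparison operators are an assumption, because they are not legible in the flowchart, and the plan parameters are hypothetical. The model is binomial with individual seeds and no assay errors.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_accept_double_stage(p, n1, a, b, n2, c):
    """Accept at stage 1 if X1 <= a; reject if X1 >= b; otherwise test n2 more
    seeds and accept if X1 + X2 <= c."""
    accept = sum(binom_pmf(x1, n1, p) for x1 in range(a + 1))
    for x1 in range(a + 1, b):               # inconclusive first stage
        second = sum(binom_pmf(x2, n2, p) for x2 in range(c - x1 + 1))
        accept += binom_pmf(x1, n1, p) * second
    return accept

# Hypothetical plan: N1 = 200, a = 0, b = 3, N2 = 200, c = 3.
for p in (0.0025, 0.005, 0.01, 0.02):
    prob = prob_accept_double_stage(p, 200, 0, 3, 200, 3)
    print(f"impurity {p:.2%}: P(accept) = {prob:.3f}")
```

Setting b = a + 1 leaves no inconclusive outcomes, so the plan reduces to a single stage plan with acceptance number a.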

Trait Purity Testing

No Pooling Allowed!!

• Example: Testing that RR Soybeans are above 98% trait purity
• Must test individual seeds
• DNA or protein assay detects intended trait rather than unintended trait in AP testing
• FNR has larger effect on testing plan than FPR
• Roles of FNR & FPR reverse in Seedcalc6 and Qalstat programs
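The reason FNR matters more than FPR in purity testing can be sketched with a simple calculation. Here the deviant is a seed that tests negative for the intended trait, so a false negative turns one of the many trait-positive seeds into an apparent deviant. This is an illustrative reading of the slide; the exact model used in Seedcalc6 and Qalstat is not given here, and the function name is hypothetical.

```python
def prob_seed_tests_negative(trait_purity, fnr, fpr):
    """Chance that a tested seed is counted as a deviant (tests negative):
    a trait seed missed by the assay, or an off-type seed correctly read as negative."""
    return trait_purity * fnr + (1 - trait_purity) * (1 - fpr)

purity = 0.98
baseline = prob_seed_tests_negative(purity, fnr=0.0, fpr=0.0)
with_fnr = prob_seed_tests_negative(purity, fnr=0.01, fpr=0.0)
with_fpr = prob_seed_tests_negative(purity, fnr=0.0, fpr=0.01)

print(f"no errors: apparent deviants = {baseline:.2%}")
print(f"1% FNR   : apparent deviants = {with_fnr:.2%}")
print(f"1% FPR   : apparent deviants = {with_fpr:.2%}")
```

At 98% purity a 1% FNR acts on 98% of the seeds, while a 1% FPR acts on only 2% of them.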

Introduction to Seedcalc
